Writing Linked Data Event Streams in LDP Basic Containers

Living Document,

Previous Versions:
Editors:
Pieter Colpaert
Wout Slabbinck
Issue Tracking:
Issues
Github

Abstract

Writing to a time-based fragmented Linked Data Event Stream that is stored on a Linked Data Platform.

1. Introduction

We coin the term LDES in LDP to describe a time-based fragmented [LDES] that is stored on a Linked Data Platform [LDP]. This allows interacting with LDESs using the LDP API. Furthermore, when a [Solid] server is used as the [LDP], authorisation over the LDES is provided.

2. LDES in LDP Protocol

Architecture of a Linked Data Event Stream in an LDP Container

The LDES in LDP Protocol consists of the structure of a continuously growing, time-based fragmented [LDES] that is stored in an [LDP], together with the rules to initialise it, to append members to it and to create new fragments for it.

The structure is visualized in the figure above and can be summarized in the following components:

An LDES in LDP is initialised in a data pod as an ldp:BasicContainer which contains the root of the LDES, a first fragment and extra metadata about this container. More information can be found in § 2.1 Initialising.

The information about the Event Stream and its view (which uses the [TREE] hypermedia and [LDES] specifications) can be found in the metadata resource of the root {container} of the LDES. The view consists of several tree:relations which contain information about the fragments of the Event Stream.

Each Fragment of an LDES in LDP is an ldp:BasicContainer. The LDP Resources present in a fragment (indicated by ldp:contains), are the members of the Event Stream.

Below is an example of a root. It consists of two tree:relations. The first one indicates that all members which were created on or after 15 December 2021 and before 16 December 2021 can be found by traversing to node http://example.org/{container}/1639526400000/. The second one points to the fragment for members created on or after 16 December 2021.

@prefix : <http://example.org/{container}/#> .
<http://example.org/{container}/> rdf:type tree:Node ;
    tree:relation [ 
        a tree:GreaterThanOrEqualToRelation ;
        tree:node <http://example.org/{container}/1639526400000/> ;
        tree:path dct:modified ;
        tree:value "2021-12-15T00:00:00.000Z"^^xsd:dateTime
        ], [
        a tree:GreaterThanOrEqualToRelation ;
        tree:node <http://example.org/{container}/1639612800000/> ;
        tree:path dct:modified ;
        tree:value "2021-12-16T00:00:00.000Z"^^xsd:dateTime
        ] .

:EventStream a <https://w3id.org/ldes#EventStream> ;
    tree:shape <http://example.org/{container}/shape> ;
    tree:view <http://example.org/{container}/root.ttl>.

Note: It is also allowed to have a root resource called root in the root container which contains the information about the Event Stream and its view.

2.1. Initialising

To initialise an LDES in LDP, the following steps must be performed.

  1. Create the root container

  2. Create the Event Stream + view information

  3. Create the first fragment (more information can be found in its section)

  4. Add the inbox triple to the root container to make sure the write location is discoverable

2.2. Appending a member

A member is appended to the LDES by sending an HTTP POST request, with the member as body, to the write location. To indicate the write location, a property already defined in the LDP specification is reused: the LDP Inbox (ldp:inbox, which originates from the [LDN] specification).

A triple of the form <baseContainer> ldp:inbox <locationURL>. is added to the metadata of the base container. The location URL can then be retrieved from the Link header of the response to a GET or HEAD HTTP request to the base container, as the target of the link with relation ldp:inbox (as defined in [LDN]).

An example of the inbox triple.
<http://example.org/{container}/> ldp:inbox <http://example.org/{container}/1639612800000/>.
A HEAD request to the base of an LDES in LDP.
HEAD /{container}/ HTTP/1.1
Host: example.org

The Link header in the response to the HEAD request to the LDES in LDP base, indicating that http://example.org/{container}/1639612800000/ is the write location of the LDES in LDP.

HTTP/1.1 200
link: <http://example.org/{container}/1639612800000/>; rel="http://www.w3.org/ns/ldp#inbox"

Note: In the body of the POST request, there must be a <LILBase> tree:member <resource> triple.
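
The discovery and append steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference client: find_inbox and build_append_request are hypothetical helper names, the Link header parsing is deliberately simplified (it does not handle commas inside quoted parameters), and the member body is expected to be supplied by the caller as Turtle. No request is actually sent.

```python
import re
import urllib.request
from typing import Optional

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox(link_header: str) -> Optional[str]:
    """Return the target of the first link with relation ldp:inbox, or None.

    Simplified parsing: link-values are assumed to look like
    <target>; param="value" and to be separated by commas.
    """
    for match in re.finditer(r"<([^>]*)>([^,]*)", link_header):
        target, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        # A rel parameter may hold several space-separated relation types.
        if rel and LDP_INBOX in rel.group(1).split():
            return target
    return None


def build_append_request(inbox_url: str, member_turtle: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that appends one member."""
    return urllib.request.Request(
        inbox_url,
        data=member_turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )
```

A client would pass the value of the Link header from the HEAD response to find_inbox and hand the resulting URL to build_append_request; sending the request (for example with urllib.request.urlopen) and handling authorisation are left out of this sketch.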

2.3. Creating a new fragment

The following steps need to be executed to create a new fragment:

  1. Create a new ldp:Container

  2. Add relation triples to the view of the LDES in LDP

  3. Update the inbox to link to the newly created container from step 1

§ 5.2 View Description provides extra information/instructions about when to create a new fragment.
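
The three steps above can be sketched in Python as follows. The sketch assumes, based on the examples in this document, that a fragment container is named after the Unix timestamp in milliseconds of the start of its interval and that relations use tree:GreaterThanOrEqualToRelation on dct:modified. The function names are hypothetical, the prefixes are those of § 6 Namespaces, and removing the previous inbox triple is left to the caller.

```python
from datetime import datetime, timezone


def fragment_name(start: datetime) -> str:
    """Name a fragment after the epoch milliseconds of its start time."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    return str(round(start.timestamp() * 1000))


def xsd_datetime(moment: datetime) -> str:
    """Format a datetime as an xsd:dateTime lexical value in UTC."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def new_fragment_triples(container: str, start: datetime) -> str:
    """Return the relation (step 2) and the new inbox triple (step 3) as Turtle."""
    node = f"{container}{fragment_name(start)}/"
    return (
        f"<{container}> tree:relation [\n"
        f"    a tree:GreaterThanOrEqualToRelation ;\n"
        f"    tree:node <{node}> ;\n"
        f"    tree:path dct:modified ;\n"
        f'    tree:value "{xsd_datetime(start)}"^^xsd:dateTime\n'
        f"] .\n"
        f"<{container}> ldp:inbox <{node}> .\n"
    )
```

Step 1, creating the container itself, is an ordinary LDP container creation at the URL that new_fragment_triples uses as tree:node.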

2.4. Improving Interoperability

The tree:shape property of a [TREE] Collection indicates the data model that all of its members conform to. This data model is called a shape and is expressed in a shape language like [SHACL] or [SHEX].

When it is known a priori that the LDES will only have members with a certain predefined data model, it is possible to initialise the LDES in LDP with a shape.

To enforce shape validation by the LDP, the validator needs to know which shape resource to use. Therefore, the constrained by property of LDP (ldp:constrainedBy) is used to encode a URI to the shape resource in the metadata of each fragment container.

Since all requests to add data that does not conform to the shape will be rejected, the resulting Event Stream consists of members that all conform to the shape.

Example of the metadata of a fragment of an LDES in LDP that is constrained by a shape.
<http://example.org/{container}/1639526400000/> ldp:constrainedBy <http://example.org/{container}/shape>.

3. Versioned LDES in LDP

As stated in the [LDES] specification, members are immutable. However, through version-objects, it is possible to indicate changes. More information can be found in the [LDES] specification in the section Version Materializations.

Two properties MUST be added to the Event Stream description for a versioned [LDES]: ldes:versionOfPath and ldes:timestampPath.

In the examples below, dct:isVersionOf is used to define that a tree:member is a version of another member and dct:issued is used to denote the DateTime at which this version was added to the Event Stream.

An Event Stream with one member which supports versioning (both ldes:versionOfPath and ldes:timestampPath are defined).
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
Here, ex:resource1v0 is the first version of ex:resource1.
An Event Stream where a member has been updated with a newer version.
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0, ex:resource1v1.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
    
ex:resource1v1 
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T12:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".
Here, a newer version of ex:resource1 has been created (ex:resource1v1), where the title has been changed.

As can be seen in the example, each member has two version-object triples. Thus in a versioned LDES in LDP, each member MUST have these version-object triples as well.

With versioning it SHOULD be possible to mark that a member of an Event Stream has become obsolete.

To mark a member of an Event Stream as obsolete, this specification introduces the type ldes:DeletedLDPResource. When an LDES is used in the context of LDES in LDP, a tree:member with this type is marked as deleted from the [LDES].

An Event Stream where the most recent version of ex:resource1 is marked as deleted.
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0, ex:resource1v1, ex:resource1v2.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
    
ex:resource1v1 
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T12:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".

ex:resource1v2
    a ldes:DeletedLDPResource;
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T14:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".

Note: It is preferred to mark members as deleted with a custom domain-specific type. ldes:DeletedLDPResource is used to mark a member as deleted in the case of LDES in LDP. Clients MUST also copy the last contents of the member when such a type is added (which is done in this example, as the title is copied from ex:resource1v1).
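
How a consumer could derive the current state from such version-objects can be sketched in Python as follows. This is an illustration only: members are assumed to be available as plain dictionaries with hypothetical keys id, isVersionOf, issued and types, standing in for the values found via ldes:versionOfPath, ldes:timestampPath and rdf:type.

```python
from datetime import datetime, timezone

DELETED = "https://w3id.org/ldes#DeletedLDPResource"


def materialize(members):
    """Map each versioned resource to its most recent, non-deleted version."""
    latest = {}
    for member in members:
        key = member["isVersionOf"]
        # Keep the version with the highest timestamp per resource.
        if key not in latest or member["issued"] > latest[key]["issued"]:
            latest[key] = member
    # Resources whose latest version is a deletion marker are dropped.
    return {
        key: version
        for key, version in latest.items()
        if DELETED not in version.get("types", ())
    }
```

Applied to the last example above, materialize would return an empty mapping, because ex:resource1v2 is the latest version of ex:resource1 and carries the type ldes:DeletedLDPResource.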

This specification introduces two ways to interact with a versioned LDES in LDP.

3.1. Client managed versioned LDES in LDP

A client can interact with a versioned LDES in LDP by following the protocol, either by manually sending the requests or by using libraries that implement it.

The basic requirements for the client and server are stated in the following subsections.

3.1.1. Client Implications

In this approach, clients are aware of the LDES in LDP Protocol. They know:

3.1.2. Server Implications

As a result, any LDP Server Implementation MUST be able to handle the operations executed by the client.

For completeness: the minimum requirements for the LDP Server to comply with the LDES in LDP Protocol:

3.2. Server managed versioned LDES in LDP

The versioned LDES in LDP protocol is abstracted away by the LDP API, which enables clients to perform CRUD operations on a versioned LDES.

The client is unaware that version-objects are used to realise the resources. The use of those version-objects does, however, allow the client to query the history of a given ldp:Resource, which can be achieved by incorporating the Memento specification [RFC7089] on top of the LDP (§ 5 Extensions). Furthermore, it is possible to engineer additional views on top of the LDES through an LDES server.

An LDP server on top of a versioned LDES in LDP must translate the LDP operations to append and query operations to a versioned LDES. To illustrate how the operations can be translated to a versioned LDES, the complete behaviour of interactions with an ldp:Container is explained in the following subsection: § 3.2.1 LDP Container.

With this explanation, the requirements for such an LDP server (that abstracts away a versioned LDES in LDP) are elaborated, such that it can be built.

3.2.1. LDP Container

A bi-directional mapping from an LDP Container to a versioned LDES in LDP is provided.

This means that when such an ldp:Container is created, a versioned LDES in LDP is initialised, whereas the container that is requested is just a view of the versioned LDES.

An architectural overview of a view of an ldp:BasicContainer with two resources which uses a versioned LDES in LDP as backend can be seen in the following figure.

Architecture where a versioned LDES in LDP is used as a backend for LDP.

Applying the CRUD methods to the LDP results in corresponding interactions with the versioned LDES, which are further elaborated per operation in the following paragraphs.

3.2.1.1. Creating a Resource

An HTTP POST request is used to create a resource in an LDP. When a POST request is sent to an LDP Container three things happen: an identifier is created by the server for the created LDP Resource, the body of the request becomes the LDP Resource and metadata is added in the parent ldp:Container to indicate that it contains the new resource.

For the server, the LDP behaviour for a POST is just the first step. The second step consists of combining the body of the request with two extra triples to indicate the version-specific representation. The newly composed body is then added to the feed LDES.

The first triple indicates the time of creation of the resource; the second triple is the reference to the identifier of the resource. The triples that result when the server chooses http://example.org/{container}/resource1 as identifier are shown in the example below.

An HTTP POST request to create an ldp:Resource
POST /{container}/ HTTP/1.1
Host: example.org
Slug: resource1
Content-Type: text/turtle

<resource1> dct:title "First version of the title.".
The effect of the POST request that now persists in the versioned LDES in LDP.
@prefix : <http://example.org/{container}/> .

:feed/1639612800000/{uuid} tree:member.
:feed/1639612800000/{uuid} dct:title "First version of the title." .
:feed/1639612800000/{uuid} dct:issued "2021-12-16T10:00:00.000Z"^^xsd:dateTime .
:feed/1639612800000/{uuid} dct:isVersionOf :resource1 .
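
The second step of § 3.2.1.1 can be sketched in Python as follows. This is a minimal sketch under assumptions: triples are modelled as plain (subject, predicate, object) tuples of strings, to_version_object is a hypothetical helper name, and the member identifier is minted as a UUID under the current feed fragment, mirroring the {uuid} placeholder in the example above.

```python
import uuid
from datetime import datetime, timezone

DCT = "http://purl.org/dc/terms/"


def to_version_object(resource_iri, triples, feed_fragment_iri, now, member_id=None):
    """Turn the body of a POST into a version-object for the feed.

    Triples about the resource are re-attached to a freshly minted member
    IRI, and the two version-specific triples are added.
    """
    member_iri = feed_fragment_iri + (member_id or str(uuid.uuid4()))
    versioned = [
        (member_iri if s == resource_iri else s, p, o) for s, p, o in triples
    ]
    versioned.append((member_iri, DCT + "issued", now.isoformat()))
    versioned.append((member_iri, DCT + "isVersionOf", resource_iri))
    return member_iri, versioned
```

The same helper could serve for PUT and PATCH requests, since those also append the resulting resource together with the version-specific triples.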
3.2.1.2. Reading a Resource

Reading the above created resource with a GET request using the LDP API will result in the following triple: <http://example.org/{container}/resource1> dct:title "First version of the title." .

Note: When the Memento specification ([RFC7089]) is used as an extension, the Accept-Datetime request header can be used to read previous versions of the resource. Additional information can be found in § 5 Extensions.

3.2.1.3. Updating a Resource

Updating LDP Resources can be done in two ways. First, there is an HTTP PUT request which replaces the resource with the body that accompanies the request. The second option is using an HTTP PATCH request that uses a [SPARQL-UPDATE] or N3 Patch query, where first the server applies the changes and then the result is stored as the updated resource.

An LDP with a versioned LDES in LDP backend stores those updates in the feed as the newest version. Thus, when using PUT, the whole body of the request together with the version-specific triples is appended to the feed. When using PATCH, the changes are applied first, after which the resulting resource is accompanied by the version-specific triples and appended to the feed.

3.2.1.4. Deleting a Resource

An HTTP DELETE request to the identifier of a resource results in the removal of that resource and its corresponding metadata in the parent container. In the feed, however, the existing versions are not removed, for two reasons. The first is that an LDES is immutable, meaning that members cannot be edited once they are in an LDES. The second is that the history of the resource would be removed as well.

Thus, next to the LDP behaviour, an LDP Resource consisting of three triples is added to the feed to indicate that the resource has been removed.

An example of these three triples can be seen in the last example of § 3 Versioned LDES in LDP. They are the triples with subject ex:resource1v2 that state its type, dct:isVersionOf and dct:issued.
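
Composing these three triples can be sketched in Python as follows. The helper name deletion_marker and the tuple representation of triples are assumptions made for illustration; how the member IRI is minted is left to the server.

```python
from datetime import datetime, timezone

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LDES = "https://w3id.org/ldes#"
DCT = "http://purl.org/dc/terms/"


def deletion_marker(resource_iri, member_iri, now):
    """Return the three triples that mark resource_iri as deleted in the feed."""
    return [
        (member_iri, RDF_TYPE, LDES + "DeletedLDPResource"),
        (member_iri, DCT + "isVersionOf", resource_iri),
        (member_iri, DCT + "issued", now.isoformat()),
    ]
```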

4. Examples

4.1. Metadata notifications

At https://tree.linkeddatafragments.org/announcements/, a public, shape constrained LDES in LDP can be found which is used for publishing metadata of DCAT Application Profiles [VOCAB-DCAT-3] about datasets or data services, or metadata of a [TREE] View.

As LDP, a Community Solid Server [CSS] instance is used with shape support. On the server where the CSS resides, the Basic LDES Orchestrator runs. The trigger for creating new containers is when the current fragment container contains 100 resources or more.

Note: The CSS with shape support can be found at https://github.com/woutslabbinck/community-server on branch feat/shape-support

4.2. Solid Event Sourcing project

A repository to publish raw GPX data to a Solid pod. Here, each location point is encapsulated as a version-object in a versioned LDES in LDP.

Furthermore, it provides functions with documentation to store any kind of streaming data to a Solid pod.

5. Extensions

5.1. Memento

The Fedora API Specification and Trellis Linked Data Server implement the Memento specification ([RFC7089]) on top of [LDP]. This allows datetime negotiation over ldp:Resources, which provides the historical values of such a resource.

In a server managed versioned LDES in LDP, the Memento specification can also be provided on top of that abstract LDP implementation. This provides an alternative (complementary to [LDES]) way to retrieve a historical version of a given ldp:Resource. Implementation-wise, when an Accept-Datetime header is provided, the correct version-object is queried in the versioned LDES in LDP and provided as the response.
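
Selecting the version-object for a given Accept-Datetime value can be sketched in Python as follows. The selection rule used here (the most recent version issued at or before the requested datetime) is one plausible choice and an assumption of this sketch; [RFC7089] leaves room for other strategies. Versions are modelled as dictionaries with a hypothetical issued key holding a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def select_memento(versions, accept_datetime):
    """Pick the latest version issued at or before the Accept-Datetime value.

    accept_datetime is the raw header value, for example
    "Wed, 15 Dec 2021 11:00:00 GMT". Returns None when no version qualifies.
    """
    wanted = parsedate_to_datetime(accept_datetime)
    candidates = [v for v in versions if v["issued"] <= wanted]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v["issued"])
```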

5.2. View Description

The § 2 LDES in LDP Protocol provides instructions on how to create a new fragment. However, it does not state when to create one.

The View Description solves this problem by encapsulating the explicit strategy used to create an [LDES] or how it was created.

Using the following example, the interpretation of the View Description is explained. The view of an Event Stream has a property that links to the View Description (tree:viewDescription).

A View Description consists of three core properties, which in the example below are dcat:endpointURL, dcat:servesDataset and ldes:managedBy.

The entity that maintains the structure is a client that conforms to the § 2 LDES in LDP Protocol and MAY have a Bucketize Strategy.

In this example, there is an ldes:BucketizeStrategy, which states that the view is a timestamp fragmentation that uses the value of the tree:path property. Furthermore, it states that each fragment contains up to 100 members.

This means that if there are 100 members in a given fragment, a new one MUST be created (following the instructions in § 2.3 Creating a new fragment) by an LDES in LDP client.

@prefix : <http://example.org/{container}/#> .

<http://example.org/{container}/> rdf:type tree:Node ;
    tree:viewDescription :Fragmentation ;
    tree:relation [ 
        a tree:GreaterThanOrEqualToRelation ;
        tree:node <http://example.org/{container}/1639526400000/> ;
        tree:path dct:modified ;
        tree:value "2021-12-15T00:00:00.000Z"^^xsd:dateTime
        ], [
        a tree:GreaterThanOrEqualToRelation ;
        tree:node <http://example.org/{container}/1639612800000/> ;
        tree:path dct:modified ;
        tree:value "2021-12-16T00:00:00.000Z"^^xsd:dateTime
        ] .

:EventStream a <https://w3id.org/ldes#EventStream> ;
    tree:shape <http://example.org/{container}/shape> ;
    tree:view <http://example.org/{container}/>.
  
:Fragmentation a tree:ViewDescription ; 
    dcat:endpointURL <http://example.org/{container}/> ;
    dcat:servesDataset :EventStream;
    ldes:managedBy <client>. 

<client> a ldes:LDESinLDPClient;
    ldes:bucketizeStrategy :BucketizeStrategy.

:BucketizeStrategy a ldes:BucketizeStrategy;
    ldes:bucketType ldes:timestampFragmentation;
    tree:path dct:created; 
    ldes:pageSize 100.

To summarize, an LDES in LDP client that appends members to the LDES is thus responsible for creating new fragments as described in the View Description.
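
The decision a client makes before each append can be sketched in Python as follows. The function name next_write_location is hypothetical, the page size corresponds to ldes:pageSize in the example above, and naming the new fragment after the current time in epoch milliseconds is an assumption based on the fragment names used throughout this document.

```python
from datetime import datetime, timezone


def next_write_location(container, inbox, member_count, now, page_size=100):
    """Return (write_location, must_create) for the next append.

    The current inbox is kept while it holds fewer than page_size members;
    otherwise a new fragment URL is proposed and must_create is True.
    """
    if member_count < page_size:
        return inbox, False
    return f"{container}{round(now.timestamp() * 1000)}/", True
```

When must_create is True, the client would follow § 2.3 Creating a new fragment before sending the POST request.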

5.3. B+-TREE implementation

Currently, new Fragments are made as a new container under the base container. This might become a bottleneck if many Fragments are present. A possibility is to have nested fragments, which results in the LDES in LDP having a B+ tree structure.

5.4. Multiple members per LDP Resource

Due to a high velocity of incoming data, the number of HTTP POST requests might add too much overhead.

To overcome this, multiple version-objects can be grouped together into one ldp:Resource.
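
Such grouping can be sketched in Python as follows. This is an illustration only: the batch size is a hypothetical tuning parameter, each member is assumed to be available as a serialized Turtle string, and how a consumer later separates the members within one ldp:Resource is not covered here.

```python
from typing import Iterator, Sequence


def batch_members(members: Sequence[str], size: int) -> Iterator[str]:
    """Yield request bodies that each hold up to size serialized members."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(members), size):
        # One body per HTTP POST request, members separated by newlines.
        yield "\n".join(members[start:start + size])
```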

6. Namespaces

Namespace prefixes commonly used in this specification:

@prefix acl: 	<http://www.w3.org/ns/auth/acl#> .
@prefix dct: 	<http://purl.org/dc/terms/> .
@prefix ldes: 	<https://w3id.org/ldes#> .
@prefix ldp: 	<http://www.w3.org/ns/ldp#> .
@prefix rdf: 	<http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tree: 	<https://w3id.org/tree#> .
@prefix xsd: 	<http://www.w3.org/2001/XMLSchema#> .

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[LDES]
Pieter Colpaert. Linked Data Event Streams. LS. URL: https://w3id.org/ldes/specification
[LDN]
Sarven Capadisli; Amy Guy. Linked Data Notifications. URL: https://linkedresearch.org/ldn/
[LDP]
Steve Speicher; John Arwe; Ashok Malhotra. Linked Data Platform 1.0. URL: https://www.w3.org/2012/ldp/hg/ldp.html
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[RFC7089]
H. Van de Sompel; M. Nelson; R. Sanderson. HTTP Framework for Time-Based Access to Resource States -- Memento. December 2013. Informational. URL: https://www.rfc-editor.org/rfc/rfc7089
[SHACL]
Holger Knublauch; Dimitris Kontokostas. Shapes Constraint Language (SHACL). URL: https://w3c.github.io/data-shapes/shacl/
[SHEX]
Eric Prud'hommeaux; et al. Shape Expressions Language 2.1. URL: http://shex.io/shex-semantics/index.html
[Solid]
Sarven Capadisli; et al. Solid Protocol. URL: https://solidproject.org/TR/protocol
[SPARQL-UPDATE]
Paula Gearon; Alexandre Passant; Axel Polleres. SPARQL 1.1 Update. 21 March 2013. REC. URL: https://www.w3.org/TR/sparql11-update/
[TREE]
Pieter Colpaert. The TREE hypermedia specification. LS. URL: https://w3id.org/tree/specification
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[VOCAB-DCAT-3]
Riccardo Albertoni; et al. Data Catalog Vocabulary (DCAT) - Version 3. URL: https://w3c.github.io/dxwg/dcat/