Writing Linked Data Event Streams in LDP Basic Containers

Living Document,

Editors:
Pieter Colpaert
Wout Slabbinck
Issue Tracking:
Issues
Github

Abstract

Writing to a time-based fragmented Linked Data Event Stream that is stored on a Linked Data Platform.

1. Introduction

We coin the term LDES in LDP to describe a time-based fragmented [LDES] that is stored on a Linked Data Platform [LDP]. This allows clients to interact with LDESs using the LDP API.

2. LDES in LDP Protocol

Architecture of a Linked Data Event Stream in an LDP Container
The LDES in LDP Protocol consists of the structure and the rules to create, update and interpret a continuously growing time-based fragmented [LDES] that is stored in an [LDP].

The structure is visualized in the figure above and can be summarized in the following components:

An LDES in LDP is initialized in a data pod as an ldp:BasicContainer which contains the root of the LDES, a first fragment and extra metadata about this container. In the subsections, more details about this metadata will be given.

The root resource contains metadata about the Event Stream and its view using the [TREE] hypermedia and [LDES] vocabulary. The view consists of several tree:relations which contain information about the fragments of the Event Stream.

Each Fragment of an LDES in LDP is an ldp:BasicContainer. The LDP Resources present in a fragment (indicated by ldp:contains), are the members of the Event Stream.

Below is an example of a root. It consists of one tree:relation, whose class and properties indicate that all members with a dct:modified value on or after 15 December 2021 (UTC) can be found by traversing to the node http://example.org/{container}/1639526400000/.

@prefix : <http://example.org/{container}/root.ttl#> .
<http://example.org/{container}/root.ttl> rdf:type tree:Node ;
    tree:relation [ 
        a tree:GreaterThanOrEqualToRelation ;
        tree:node <http://example.org/{container}/1639526400000/> ;
        tree:path dct:modified ;
        tree:value "2021-12-15T00:00:00.000Z"^^xsd:dateTime
        ] .

:Collection a <https://w3id.org/ldes#EventStream> ;
    tree:shape <http://example.org/{container}/shape> ;
    tree:view <http://example.org/{container}/root.ttl> .
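
As a rough illustration of how a client might interpret the relation in the root above, the following Python sketch checks whether a member's timestamp satisfies a tree:GreaterThanOrEqualToRelation. The function names are hypothetical, and the sketch assumes that all timestamps use the UTC format of the examples in this document.

```python
from datetime import datetime, timezone

def parse_utc(value):
    """Parse an xsd:dateTime such as 2021-12-15T00:00:00.000Z (UTC only)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

def relation_may_contain(relation_value, member_timestamp):
    """True when the member timestamp is greater than or equal to tree:value."""
    return parse_utc(member_timestamp) >= parse_utc(relation_value)
```

A client that looks for a member modified on 16 December 2021 would follow the relation in the example, since that timestamp is not earlier than the tree:value.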

2.1. Adding Resources

The method for adding a resource remains the same as for a normal LDP Resource creation: an HTTP POST request. However, an application that adds a member to the Event Stream must know where to write to. To indicate the write location, an existing property from the LDP namespace is reused: the LDP Inbox (ldp:inbox, which originates from the [LDN] specification).

Thus a triple of the form <baseContainer> ldp:inbox <locationURI>. is added to the metadata of the base container. A client retrieves this location URI by sending an HTTP GET or HEAD request to the base container and reading the Link header whose relation type is ldp:inbox.

Finally, a member can be added to the LDES with an HTTP POST request to the obtained location URI.

The following example shows the inbox triple. When performing a HEAD request to the base URL (http://example.org/{container}/), the Link Header with the write location (http://example.org/{container}/1639526400000/) is present in the corresponding response.

<http://example.org/{container}/> ldp:inbox <http://example.org/{container}/1639526400000/> .

HEAD /{container}/ HTTP/1.1
Host: example.org

HTTP/1.1 200 OK
Link: <http://example.org/{container}/1639526400000/>; rel="http://www.w3.org/ns/ldp#inbox"
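
The discovery and write steps can be sketched in Python as follows. This is a minimal sketch, not part of the protocol: the function names are hypothetical, the Link header parsing is deliberately simplified (it only handles entries written as <target>; rel="relation"), and the text/turtle content type is an assumption.

```python
import re
import urllib.request

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"

def find_inbox(link_header):
    """Return the target of the link whose relation is ldp:inbox, or None."""
    # Simplified parsing: only handles entries of the form <target>; rel="relation".
    for target, rel in re.findall(r'<([^>]*)>\s*;\s*rel="([^"]*)"', link_header):
        if LDP_INBOX in rel.split():
            return target
    return None

def build_member_request(inbox, turtle_body):
    """Prepare (but do not send) the POST request that adds a member."""
    return urllib.request.Request(
        inbox,
        data=turtle_body.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen would send it to the write location.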

2.2. Improving Interoperability

The tree:shape property of a [TREE] Collection indicates the data model that all of its members conform to. This data model is called a shape and is expressed in a shape language like [SHACL] or [SHEX].

When it is known a priori that the LDES will only have members with a certain predefined data model, it is possible to initialise the LDES in LDP with a shape.

For the LDP to enforce shape validation, its validator needs to know which shape resource to use. Therefore, the ldp:constrainedBy property of LDP is used to encode a URI to the shape resource in the metadata of each fragment container.

Since all requests to add data that does not conform to the shape will be rejected, the resulting Event Stream consists of members that all conform to the shape.

Example of the metadata of a fragment of an LDES in LDP that is constrained by a shape.
<http://example.org/{container}/1639526400000/> ldp:constrainedBy <http://example.org/{container}/shape> .

3. Basic LDES Orchestrator

The Basic LDES Orchestrator is introduced to reduce overhead for the client and to perform the operations that not every client is allowed to perform.

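The Basic LDES Orchestrator has four roles: creating containers, indicating the writable container, maintaining the view and updating ACL files.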

3.1. Create Containers

The time needed to download a document depends on the network distance between the server and the client, the available bandwidth and the size of the document. LDES in LDP minimizes that time by controlling the one factor it can influence: the size of the container document. When a container contains a large number of resources, its serialization is large as well, so loading the container page takes longer and becomes a performance bottleneck.

To overcome this bottleneck, every time the current container page is deemed full, a new, empty container is created. Furthermore, when the LDES in LDP is initialised with a shape, metadata must be added to this container to further impose this constraint, see § 2.2 Improving Interoperability.
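
The fragment identifiers in the examples of this document (such as 1639526400000) correspond to a time in milliseconds since the Unix epoch. Assuming that naming scheme, which the text does not mandate, a hedged Python sketch of the container creation decision could look as follows. The function names are hypothetical, and the default threshold of 100 is only the value used in the deployment of § 5.1.

```python
from datetime import datetime, timezone

def fragment_name(created_at):
    """Container name as milliseconds since the Unix epoch, with a trailing slash.

    Assumes created_at is a timezone-aware datetime.
    """
    millis = int(created_at.timestamp()) * 1000 + created_at.microsecond // 1000
    return f"{millis}/"

def is_full(contained_resources, max_resources=100):
    """True when the current fragment container should be followed by a new one."""
    return contained_resources >= max_resources
```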

3.2. Writable Container Indication

When a new container is created, the Inbox must be updated as well. Clients that want to add a member to the LDES can then find the container where they can write new resources, see § 2.1 Adding Resources.

It is the responsibility of the Orchestrator to update that triple in the metadata.

3.3. Maintain the View

The [TREE] hypermedia specification states that a view of a collection must reach all its members. Therefore on each creation of a new container, which is a new fragment of the collection, the view must be updated. Thus a relation is added in the root by the Orchestrator for each new fragment.
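
A minimal sketch of the relation that could be added to the root for a new fragment is given below in Python. The function name is hypothetical. The relation type and tree:path are copied from the root example in § 2, and the prefixes are assumed to be declared as in § 6.

```python
def relation_turtle(root, fragment, value):
    """Serialize one tree:relation from the root to a new fragment as Turtle."""
    return (
        f"<{root}> tree:relation [\n"
        f"    a tree:GreaterThanOrEqualToRelation ;\n"
        f"    tree:node <{fragment}> ;\n"
        f"    tree:path dct:modified ;\n"
        f'    tree:value "{value}"^^xsd:dateTime\n'
        f"] .\n"
    )
```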

3.4. Update ACL Files

In case a [Solid] pod is used as a back-end, ACL resources (defined by the Web Access Control [WAC] specification) are responsible for making sure that it is impossible to add new resources to containers that are not indicated as writeable. With an ACL resource in place in the current fragment container, clients can only add new resources there; they cannot modify or delete existing ones. This is done by providing read (acl:Read) and append (acl:Append) rights in the ACL resource of that container.

Note: The orchestrator must have acl:Control for the base container and each fragment container to be able to update the ACL resources.
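
For a public LDES in LDP, the ACL resource of a fragment container might look like the output of the following Python sketch. This is an illustration under assumptions: the function name, the authorization names (#public, #orchestrator) and the orchestrator WebID are hypothetical, and giving older fragments read-only access is one possible way to make them non-writable.

```python
def fragment_acl(orchestrator_webid, writable=True):
    """Return WAC Turtle for a public fragment container.

    The public may always read; appending is allowed only when writable is True.
    """
    public_modes = "acl:Read, acl:Append" if writable else "acl:Read"
    return f"""@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Public access to the fragment container and the resources in it.
<#public>
    a acl:Authorization ;
    acl:agentClass foaf:Agent ;
    acl:accessTo <./> ;
    acl:default <./> ;
    acl:mode {public_modes} .

# The orchestrator keeps control so that it can update this ACL later.
<#orchestrator>
    a acl:Authorization ;
    acl:agent <{orchestrator_webid}> ;
    acl:accessTo <./> ;
    acl:default <./> ;
    acl:mode acl:Read, acl:Write, acl:Control .
"""
```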

3.5. Sequence Diagram

The figure below shows the operations that the Orchestrator performs each time a new fragment is created for the case of a public LDES in LDP.

Sequence diagram of the Basic LDES Orchestrator for a public LDES in LDP

Note: An implementation of the Basic LDES Orchestrator can be found on npm: LDES Orchestrator

4. Versioning Approaches

There are two approaches to use LDES in LDP: the Version-Aware Approach (§ 4.2) and the Version-Agnostic Approach (§ 4.3).

4.1. Versioning

[LDES] supports versioning of resources through Version Materializations. To support versioning, an ldes:EventStream MUST define two properties: ldes:versionOfPath and ldes:timestampPath.

ldes:versionOfPath declares the property that is used to define that a tree:member of an ldes:EventStream is a version.

ldes:timestampPath declares the property that is used to define the DateTime of a tree:member.

In the examples below, dct:isVersionOf is used to define that a tree:member is a version of another resource, and dct:issued is used to denote the DateTime at which this version was added to the Event Stream.

An Event Stream with one member which supports versioning (both ldes:versionOfPath and ldes:timestampPath are defined).
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
Here, ex:resource1v0 is the first version of ex:resource1.
An Event Stream where a member has been updated with a newer version.
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0, ex:resource1v1.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
    
ex:resource1v1 
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T12:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".
Here, a newer version of ex:resource1 has been created (ex:resource1v1), where the title has been changed.
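
To illustrate how a client could use these two properties, the following Python sketch selects the most recent member for each versioned resource. It is a simplified sketch, not the materialization algorithm of [LDES]: members are modelled as plain dictionaries (a hypothetical representation), and timestamps are compared as strings, which only works when they all share the UTC format of the examples above.

```python
def latest_versions(members):
    """Map each versioned resource to its member with the highest dct:issued."""
    latest = {}
    for member in members:
        resource = member["dct:isVersionOf"]
        current = latest.get(resource)
        # Same-format UTC timestamps sort chronologically as strings.
        if current is None or member["dct:issued"] > current["dct:issued"]:
            latest[resource] = member
    return latest
```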

4.1.1. Deleting a member

The [LDES] specification states that all members in an ldes:EventStream are immutable. This indicates that a member MUST NOT be changed and implies that it MUST NOT be deleted.

With versioning, however, it SHOULD be possible to mark that a member of an Event Stream has become obsolete.

Therefore, this specification introduces the type ldes:DeletedLDPResource. In the context of LDES in LDP, a tree:member with this type is marked as deleted from the [LDES].

An Event Stream where the most recent version of ex:resource1 is marked as deleted.
ex:ES a ldes:EventStream;
    ldes:versionOfPath dct:isVersionOf;
    ldes:timestampPath dct:issued;
    tree:member ex:resource1v0, ex:resource1v1, ex:resource1v2.

ex:resource1v0
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T10:00:00.000Z"^^xsd:dateTime;
    dct:title "First version of the title".
    
ex:resource1v1 
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T12:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".

ex:resource1v2
    a ldes:DeletedLDPResource;
    dct:isVersionOf ex:resource1;
    dct:issued "2021-12-15T14:00:00.000Z"^^xsd:dateTime;
    dct:title "Title has been updated once".

Note: It is preferred to mark members as deleted with a custom, domain-specific type. ldes:DeletedLDPResource is the type used to mark a member as deleted in the case of LDES in LDP. When such a type is added, clients MUST also copy the last contents of the member into the new version (as done in this example, where the title is copied from ex:resource1v1).
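
A hedged Python sketch of building such a deletion member is shown below. Members are modelled as plain dictionaries and the function name is hypothetical; the point is that the last contents are copied and only the identifier, the timestamp and the type change.

```python
DELETED_TYPE = "ldes:DeletedLDPResource"

def deletion_member(last_version, new_id, issued):
    """Build a new member that marks the resource of last_version as deleted."""
    member = dict(last_version)  # copy the last contents, e.g. dct:title
    member["id"] = new_id
    member["dct:issued"] = issued
    # Keep any existing types and add the deletion marker.
    member["types"] = list(last_version.get("types", [])) + [DELETED_TYPE]
    return member
```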

4.2. Version-Aware Approach

4.2.1. Client Implications

In this approach, clients are aware of the LDES in LDP Protocol. They know:

4.2.2. Server Implications

As a result, any LDP Server Implementation SHOULD be able to handle the operations executed by the client and the Basic LDES Orchestrator without modifications.

For completeness: the minimum requirements for the LDP Server to comply with the LDES in LDP Protocol and the Basic LDES Orchestrator are listed below:

Note: Currently the Basic LDES Orchestrator only works with the Community Solid Server

4.3. Version-Agnostic Approach

In contrast to the Version-Aware Approach, where a client is required to know everything, with a Version-Agnostic approach only knowledge about the LDP API is required.

4.3.1. Architecture

The architecture figure shows the structure when the LDP is combined with the LDES in LDP. The resources that are present in the {container} are a view derived from the LDES in feed, which is stored as an LDES in LDP.

More specifically, {container} is a view, represented as an ldp:BasicContainer, that contains links to the original members of the Event Stream via ldp:contains. Dereferencing such a link leads to an LDP Resource that has the latest version of that member as content.

When a resource has multiple versions (e.g. due to it being edited), only the latest version will be shown as the LDP Resource.

Note: The whole history of those resources can be retrieved from the feed.

Architecture when LDES is used as base in a version-agnostic approach

The abstraction of the LDES in LDP through the LDP API results in several modifications to Creating, Reading, Updating and Deleting a Resource.

How to read the resources is already explained in the first paragraphs of the architecture. The other operations are explained in the subsections below.

4.3.2. Creating

An HTTP POST request is used to create a resource in an LDP. When a POST request is sent to an LDP Container three things happen: an identifier is created by the server for the created LDP Resource, the body of the request becomes the LDP Resource and metadata is added in the parent ldp:Container to indicate that it contains the new resource.

For Version-Agnostic implementations, the LDP behaviour for a POST is just the first step. The second step consists of combining the body of the request with two extra triples to indicate the version-specific representation. The newly composed body is then added to the feed LDES.

The first triple indicates the time of the creation of the Resource; the second triple is the reference to the identifier of the resource. The example below shows those triples when the server chose http://example.org/{container}/resource2 as identifier.

<http://example.org/{container}/feed/1639612800000/{uuid}>
    dct:issued "2021-12-16T10:00:00.000Z"^^xsd:dateTime ;
    dct:isVersionOf <http://example.org/{container}/resource2> .
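
The second step can be sketched in Python as follows. This is an illustrative sketch only: the function name and parameters are hypothetical, and the use of a random UUID for the member identifier is an assumption based on the {uuid} placeholder in the example.

```python
import uuid
from datetime import datetime, timezone

def version_triples(feed_fragment, resource, now, member_id=None):
    """Return the two version-specific triples for a new member in the feed.

    Assumes now is a timezone-aware datetime.
    """
    member_id = member_id or str(uuid.uuid4())
    utc = now.astimezone(timezone.utc)
    # xsd:dateTime with millisecond precision, as in the examples.
    issued = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    member = f"{feed_fragment}{member_id}"
    return (
        f'<{member}> dct:issued "{issued}"^^xsd:dateTime .\n'
        f"<{member}> dct:isVersionOf <{resource}> .\n"
    )
```

The returned triples would be combined with the body of the request before the result is added to the feed.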

Note: When a Slug header is provided in a POST request, a server can choose to use that slug as the identifier.

4.3.3. Updating

Updating LDP Resources can be done in two ways. First, there is an HTTP PUT request which replaces the resource with the body that accompanies the request. The second option is using an HTTP PATCH request that uses a [SPARQL-UPDATE] query, where first the server applies the changes and then the result is stored as the updated resource.

An LDP with Version-Agnostic LDES in LDP support stores those updates in the feed as the newest version. Thus when using PUT, the whole body of the request together with the version-specific triples is added to the feed. After applying the changes of a PATCH request, the resulting resource is accompanied by the version-specific triples and appended to the feed.

4.3.4. Deleting

An HTTP DELETE request to the identifier of a resource results in the removal of that resource and of its corresponding metadata in the parent container. In the feed, however, none of the versions are removed, for two reasons. First, an LDES is immutable, meaning that members cannot be edited once they are in an LDES. Second, removing the versions would also remove the history of the resource.

Thus, in addition to the LDP behaviour, an LDP Resource consisting of three triples is added to the feed to indicate that the resource has been removed.

An example of such triples can be seen in § 4.1.1 Deleting a member: they are the triples with ex:resource1v2 as subject.

5. Examples

5.1. Metadata notifications

At https://tree.linkeddatafragments.org/announcements/, a public, shape constrained LDES in LDP can be found which is used for publishing metadata of DCAT Application Profiles [VOCAB-DCAT-3] about datasets or data services, or metadata of a [TREE] View.

As LDP, a Community Solid Server [CSS] instance is used with shape support. On the server where the CSS resides, the Basic LDES Orchestrator runs. The trigger for creating new containers is when the current fragment container contains 100 resources or more.

Note: The CSS with shape support can be found at https://github.com/woutslabbinck/community-server on branch feat/shape-support

Note: The Orchestrator uses the following package: LDES Orchestrator

6. Namespaces

Namespace prefixes commonly used in this specification:

@prefix acl: 	<http://www.w3.org/ns/auth/acl#> .
@prefix dct: 	<http://purl.org/dc/terms/> .
@prefix ldes: 	<https://w3id.org/ldes#> .
@prefix ldp: 	<http://www.w3.org/ns/ldp#> .
@prefix rdf: 	<http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tree: 	<https://w3id.org/tree#> .
@prefix xsd: 	<http://www.w3.org/2001/XMLSchema#> .

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[LDES]
Pieter Colpaert. Linked Data Event Streams. LS. URL: https://w3id.org/ldes/specification
[LDN]
Sarven Capadisli; Amy Guy. Linked Data Notifications. 2 May 2017. REC. URL: https://www.w3.org/TR/ldn/
[LDP]
Steve Speicher; John Arwe; Ashok Malhotra. Linked Data Platform 1.0. 26 February 2015. REC. URL: https://www.w3.org/TR/ldp/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SHACL]
Holger Knublauch; Dimitris Kontokostas. Shapes Constraint Language (SHACL). 20 July 2017. REC. URL: https://www.w3.org/TR/shacl/
[SHEX]
Eric Prud'hommeaux; et al. Shape Expressions Language 2.1. URL: http://shex.io/shex-semantics/index.html
[Solid]
Sarven Capadisli; et al. Solid Protocol. URL: https://solidproject.org/TR/protocol
[SPARQL-UPDATE]
Paula Gearon; Alexandre Passant; Axel Polleres. SPARQL 1.1 Update. 21 March 2013. REC. URL: https://www.w3.org/TR/sparql11-update/
[TREE]
Pieter Colpaert. The TREE hypermedia specification. LS. URL: https://w3id.org/tree/specification
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[VOCAB-DCAT-3]
Riccardo Albertoni; et al. Data Catalog Vocabulary (DCAT) - Version 3. 11 January 2022. WD. URL: https://www.w3.org/TR/vocab-dcat-3/
[WAC]
Sarven Capadisli. Web Access Control. Draft. URL: https://solidproject.org/TR/wac