By Mads Holten Rasmussen
v0
This tutorial is part of a series in the Linked Building Data (LBD) School
The slides are quite detailed and text-heavy as they are supposed to be read rather than presented.
As I have been using the technologies for a while, there might be topics that I take for granted and therefore skip over too fast. Please let me know if this is the case!
Also, since this is something I do in my free time, please support me by buying me a coffee
Linked Building Data (LBD) encapsulates all the ontologies that are being developed and maintained by the W3C Linked Building Data Community Group
They are an alternative to ifcOWL, the Web Ontology Language (OWL) version of IFC, and are easier to understand, extend and query
...but before we can dig into the content of these we will need to learn the basics. So jump to the next slide and we will get to it!
The concept of Linked Data was coined in 2006 by Sir Tim Berners-Lee, the inventor of the web, and is therefore not entirely new
It basically refers to a set of best practices for publishing structured data on the Web
A network of resources that are interlinked and identified by web addresses.
Beware that you will often see resources that are described in a namespace that doesn't really exist (e.g. https://example.com/my-resource). This is not preferred, but it is perfectly okay when learning and getting started.
Also note that not all linked data needs to be open data. You can restrict the access to it
As the name reveals, RDF (Resource Description Framework) is a framework and not a format. It includes the terminology you need to describe Linked Data resources (e.g. the type of resource, its attributes and its relationships to other resources).
This is all done by creating statements: so-called "triples".
Multiple triples form a graph if the object of one triple matches the subject of another
This is often referred to as a "Knowledge Graph"
Predicates that connect resources are called "object properties"
"has_window" and "adjacent_element" from the previous graphs are examples of object properties
RDF also describes simple properties that point to an object which is simply a literal value. Those are called "datatype properties"
As a last thing, RDF allows you to classify a thing by assigning one or more types to it.
Adding classes to things provides a better way for others to understand your dataset. And for computers!
You and I could probably deduce that Wall_002 is a Wall, but now it is explicitly stated.
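If you like to think in code, here is a minimal Python sketch of the ideas so far. It is not how an RDF library stores data; it simply assumes that a triple is a (subject, predicate, object) tuple and that a graph is a list of them. The helper names objects and follow are made up for this illustration.

```python
# Each triple is a (subject, predicate, object) tuple.
# The names mirror the slide examples; the structure is an assumption.
triples = [
    ("Space_013", "adjacent_element", "Wall_002"),  # object property
    ("Wall_002", "has_window", "Window_022"),       # object property
    ("Wall_002", "thickness", "240 mm"),            # datatype property (literal)
    ("Wall_002", "type", "Wall"),                   # class assignment
    ("Wall_002", "type", "Element"),                # a thing can have several types
]

def objects(graph, subject, predicate):
    """Return every object stated for a subject-predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]

def follow(graph, start, *path):
    """Walk a chain of predicates, starting from one resource."""
    current = [start]
    for predicate in path:
        current = [o for node in current for o in objects(graph, node, predicate)]
    return current

print(objects(triples, "Wall_002", "type"))
print(follow(triples, "Space_013", "adjacent_element", "has_window"))
```

The second call works only because the object of one triple (Wall_002) is the subject of another, which is exactly what turns a pile of triples into a graph.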
That's it! Now you know the basic principles of RDF!
Oh, not too fast! The Linked Data principles state that we should use HTTP URIs as names for things, so it actually looks more like what is shown on the next slide...
But don't worry! With the use of prefixes, it will look more like this
The prefixes are abbreviations for the full URI so:
inst:Wall_002 = https://my-company.com/Wall_002
inst:Space_013 = https://my-company.com/Space_013
ont:has_window = https://my-ontology.com#has_window
In the example we use two prefixes (inst: and ont:), which refer to our instance namespace and our ontology namespace respectively.
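The prefix mechanism is nothing more than string replacement. A small Python sketch, using the two namespaces from the example (the function names expand and shorten are hypothetical, not part of any RDF library):

```python
PREFIXES = {
    "inst": "https://my-company.com/",
    "ont": "https://my-ontology.com#",
}

def expand(prefixed_name, prefixes=PREFIXES):
    """Replace the prefix of a name such as 'inst:Wall_002' with its namespace."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"Unknown prefix: {prefix!r}")
    return prefixes[prefix] + local_name

def shorten(uri, prefixes=PREFIXES):
    """Do the reverse: abbreviate a full URI when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no matching namespace, keep the full URI

print(expand("inst:Wall_002"))
print(shorten("https://my-ontology.com#has_window"))
```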
...and this brings us to the concepts ABox and TBox.
TBox is the "terminology layer" (i.e. the dictionaries) where predicates and classes are defined. These can later be used by various people to describe something.
ABox is our "assertions layer" (i.e. the instance data) where we make actual statements about the world by using the TBox.
It is common to use multiple ontologies to describe one dataset. This is one of the absolute strengths of Linked Data: the same thing can be an ifc:Building as defined by buildingSMART, a citygml:Building as described by OGC and a dbpedia:Building referring to Wikipedia's description of a building.
In LBD we would typically at least describe it as a bot:Building, but more on this later.
Also the instance data can be distributed so part of the data can reside on one server and other parts on another.
This gives the technologies a huge decentralization potential, which we will hopefully cover in a later class!
Most of us are used to storing information in spreadsheets (or relational databases (RDBs) for the more advanced), but in RDF we use a graph structure. So how do those compare?
Let's take a look at a very simple table example to find out!
From an RDF perspective we see two subjects, one for each data row. And they are even identified with an ID.
Each column contains a statement about those subjects and the predicate value is given by the column header.
The value of the subject-predicate pair can then be found in the cell.
And with RDF we can combine table data as long as we can find a common identifier for the rows.
For example, we could have another table with middle names.
It's not a problem that there is no middle name for Alexander. RDF builds on an open world assumption, meaning that if we have the information that's great, and if we don't that's also alright. Then Alexander's middle name is simply unknown to us.
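The table-to-triples mapping can be sketched in a few lines of Python. Note that the table contents below are invented for the illustration: only the name Alexander comes from the slides, while the column names, the surnames, the second person and the person_ identifier scheme are assumptions.

```python
# Hypothetical table contents; only the name "Alexander" comes from the slides.
people = [
    {"id": 1, "first_name": "Alexander", "last_name": "Smith"},
    {"id": 2, "first_name": "Maria", "last_name": "Jones"},
]
middle_names = [
    {"id": 2, "middle_name": "Louise"},  # there is no row for id 1
]

def table_to_triples(rows, id_column="id"):
    """One subject per row, one predicate per column, one object per cell."""
    triples = []
    for row in rows:
        subject = f"person_{row[id_column]}"
        for column, value in row.items():
            if column != id_column and value is not None:
                triples.append((subject, column, value))
    return triples

# Combining the tables is just concatenation, because the IDs are shared.
graph = table_to_triples(people) + table_to_triples(middle_names)

def lookup(graph, subject, predicate):
    """Return the known values; an empty list means 'unknown', not 'has none'."""
    return [o for s, p, o in graph if s == subject and p == predicate]

print(lookup(graph, "person_2", "middle_name"))
print(lookup(graph, "person_1", "middle_name"))
```

No null value is ever stored for Alexander's middle name. The statement is simply absent, which is how the open world assumption shows up in practice.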
In RDBs it's common to have dedicated tables that deal with many-to-many relationships. These are also easily translated.
So now we also understand the differences and similarities between RDF and tables. And you can probably see how this data structure will save us from a ton of null values.
And compared to RDBs there is the benefit that the data model makes sense from a logical perspective and not only from a data management perspective. This benefit will be more obvious when we start to query the datasets later!
This is a more advanced topic for those that are familiar with XML or JSON.
You can press the right arrow to skip this section, but if you're up for it, hit the down arrow!
Let's first take a look at what RDF looks like in the simplest of all serializations, called N-Triples.
<https://my-company.com/Wall_002> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Wall> .
<https://my-company.com/Wall_002> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Element> .
<https://my-company.com/Wall_002> <https://my-ontology.com#has_window> <https://my-company.com/Window_022> .
<https://my-company.com/Wall_002> <https://my-ontology.com#U_value> "0.21 W/m2K" .
<https://my-company.com/Wall_002> <https://my-ontology.com#thickness> "240 mm" .
<https://my-company.com/Space_013> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Space> .
<https://my-company.com/Space_013> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Zone> .
<https://my-company.com/Space_013> <https://my-ontology.com#adjacent_element> <https://my-company.com/Wall_002> .
<https://my-company.com/Space_013> <https://my-ontology.com#area> "12.4 m2" .
<https://my-company.com/Space_013> <https://my-ontology.com#number_of_occupants> "2"^^<http://www.w3.org/2001/XMLSchema#integer> .
<https://my-company.com/Space_013> <https://my-ontology.com#heated> "true"^^<http://www.w3.org/2001/XMLSchema#boolean> .
<https://my-company.com/Window_022> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Window> .
<https://my-company.com/Window_022> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Element> .
<https://my-company.com/Window_022> <https://my-ontology.com#height> "1200 mm" .
<https://my-company.com/Window_022> <https://my-ontology.com#width> "900 mm" .
This syntax is incredibly simple and super fast for computers to read.
Triples are stated in their full length and a dot indicates the end of a triple.
However, for humans it is impossible to read and it takes up a lot of space!
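Because the syntax is so regular, writing N-Triples from code takes very little. The following Python sketch is an illustration under some assumptions: the URI wrapper class and the function names are hypothetical, and only strings, integers and booleans are handled. Numbers and booleans are written as quoted literals with an XSD datatype, since N-Triples has no shorthand for them.

```python
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"

class URI(NamedTuple):
    """Hypothetical wrapper that marks a string as a URI rather than a literal."""
    value: str

def term_to_nt(term):
    """Serialize one term: a URI in angle brackets or a quoted literal."""
    if isinstance(term, URI):
        return f"<{term.value}>"
    if isinstance(term, bool):  # check bool before int, since bool is an int subclass
        return f'"{str(term).lower()}"^^<{XSD}boolean>'
    if isinstance(term, int):
        return f'"{term}"^^<{XSD}integer>'
    # Plain string literal: escape backslashes, quotes and newlines.
    escaped = str(term).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'

def to_ntriples(triples):
    """One full-length triple per line, each terminated by a dot."""
    return "\n".join(
        f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} ." for s, p, o in triples
    )

space = URI("https://my-company.com/Space_013")
print(to_ntriples([
    (space, URI("https://my-ontology.com#area"), "12.4 m2"),
    (space, URI("https://my-ontology.com#number_of_occupants"), 2),
    (space, URI("https://my-ontology.com#heated"), True),
]))
```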
The Terse RDF Triple Language (Turtle) simplifies this a whole lot by introducing some syntactic sugar.
Not all of this sugar is covered here, but part of it is, since the same syntax is used in queries. Let's start by defining the prefixes as described in the previous section.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix inst: <https://my-company.com/> .
@prefix ont: <https://my-ontology.com#> .
inst:Wall_002 rdf:type ont:Wall .
inst:Wall_002 rdf:type ont:Element .
inst:Wall_002 ont:has_window inst:Window_022 .
inst:Wall_002 ont:U_value "0.21 W/m2K" .
inst:Wall_002 ont:thickness "240 mm" .
inst:Space_013 rdf:type ont:Space .
inst:Space_013 rdf:type ont:Zone .
inst:Space_013 ont:adjacent_element inst:Wall_002 .
inst:Space_013 ont:area "12.4 m2" .
inst:Space_013 ont:number_of_occupants 2 .
inst:Space_013 ont:heated true .
inst:Window_022 rdf:type ont:Window .
inst:Window_022 rdf:type ont:Element .
inst:Window_022 ont:height "1200 mm" .
inst:Window_022 ont:width "900 mm" .
In Turtle, "a" can be used as an abbreviation for rdf:type, so we can simplify it further:
@prefix inst: <https://my-company.com/> .
@prefix ont: <https://my-ontology.com#> .
inst:Wall_002 a ont:Wall .
inst:Wall_002 a ont:Element .
inst:Wall_002 ont:has_window inst:Window_022 .
inst:Wall_002 ont:U_value "0.21 W/m2K" .
inst:Wall_002 ont:thickness "240 mm" .
inst:Space_013 a ont:Space .
inst:Space_013 a ont:Zone .
inst:Space_013 ont:adjacent_element inst:Wall_002 .
inst:Space_013 ont:area "12.4 m2" .
inst:Space_013 ont:number_of_occupants 2 .
inst:Space_013 ont:heated true .
inst:Window_022 a ont:Window .
inst:Window_022 a ont:Element .
inst:Window_022 ont:height "1200 mm" .
inst:Window_022 ont:width "900 mm" .
We can further use ";" instead of "." to state the end of a triple where the same subject is to be reused in the next triple.
@prefix inst: <https://my-company.com/> .
@prefix ont: <https://my-ontology.com#> .
inst:Wall_002 a ont:Wall ;
a ont:Element ;
ont:has_window inst:Window_022 ;
ont:U_value "0.21 W/m2K" ;
ont:thickness "240 mm" .
inst:Space_013 a ont:Space ;
a ont:Zone ;
ont:adjacent_element inst:Wall_002 ;
ont:area "12.4 m2" ;
ont:number_of_occupants 2 ;
ont:heated true .
inst:Window_022 a ont:Window ;
a ont:Element ;
ont:height "1200 mm" ;
ont:width "900 mm" .
Lastly, we can use "," to state the end of a triple where the same subject-predicate pair is to be reused in the next triple.
@prefix inst: <https://my-company.com/> .
@prefix ont: <https://my-ontology.com#> .
inst:Wall_002 a ont:Wall ,
ont:Element ;
ont:has_window inst:Window_022 ;
ont:U_value "0.21 W/m2K" ;
ont:thickness "240 mm" .
inst:Space_013 a ont:Space ,
ont:Zone ;
ont:adjacent_element inst:Wall_002 ;
ont:area "12.4 m2" ;
ont:number_of_occupants 2 ;
ont:heated true .
inst:Window_022 a ont:Window ,
ont:Element ;
ont:height "1200 mm" ;
ont:width "900 mm" .
Since line breaks are ignored, we can also write it more compactly, as shown on the next slide. Furthermore, the order of the statements is of no importance.
@prefix inst: <https://my-company.com/> .
@prefix ont: <https://my-ontology.com#> .
inst:Wall_002
a ont:Wall , ont:Element ;
ont:has_window inst:Window_022 ;
ont:U_value "0.21 W/m2K" ;
ont:thickness "240 mm" .
inst:Space_013
a ont:Space , ont:Zone ;
ont:adjacent_element inst:Wall_002 ;
ont:area "12.4 m2" ;
ont:number_of_occupants 2 ;
ont:heated true .
inst:Window_022
a ont:Window , ont:Element ;
ont:height "1200 mm" ;
ont:width "900 mm" .
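The ";" and "," shortcuts boil down to grouping: first by subject, then by subject-predicate pair. A rough Python sketch of that grouping is shown here. It assumes the terms are already written in Turtle syntax (prefixed names, quoted literals) and leaves out the @prefix lines; the function name to_turtle is made up for this example.

```python
def to_turtle(triples):
    """Group triples by subject (';') and by subject-predicate pair (',')."""
    grouped = {}  # subject -> predicate -> list of objects, in insertion order
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    blocks = []
    for subject, predicates in grouped.items():
        lines = [
            f"    {predicate} {' , '.join(objects)}"
            for predicate, objects in predicates.items()
        ]
        # ';' separates predicates of the same subject, '.' closes the block.
        blocks.append(subject + "\n" + " ;\n".join(lines) + " .")
    return "\n".join(blocks)

# Terms are assumed to be already written in Turtle syntax.
triples = [
    ("inst:Wall_002", "a", "ont:Wall"),
    ("inst:Wall_002", "a", "ont:Element"),
    ("inst:Wall_002", "ont:has_window", "inst:Window_022"),
    ("inst:Wall_002", "ont:thickness", '"240 mm"'),
]
print(to_turtle(triples))
```

Feeding the triples in a different order changes the order of the output lines but not the graph they describe.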
RDF can also be serialized as XML and JSON. RDF/XML will not be covered (who uses XML anyways?), but the new cool kid on the block, JSON-LD, will for sure.
Like Turtle, JSON-LD can take many shapes and forms. The simplest form is the expanded version.
JSON-LD is just JSON but it adds some special keys like @id to define the subject and @type to assign rdf:type.
Similarly to N-Triples, the expanded JSON-LD uses the full URIs. Since RDF allows for multiple assignments of classes or any other value for that matter, the value of @type and of every property key is always an array.
[
{
"@id": "https://my-company.com/Wall_002",
"@type": [
"https://my-ontology.com#Wall",
"https://my-ontology.com#Element"
],
"https://my-ontology.com#U_value": [
{
"@value": "0.21 W/m2K"
}
],
"https://my-ontology.com#has_window": [
{
"@id": "https://my-company.com/Window_022"
}
],
"https://my-ontology.com#thickness": [
{
"@value": "240 mm"
}
]
},
{
"@id": "https://my-company.com/Space_013",
"@type": [
"https://my-ontology.com#Space",
"https://my-ontology.com#Zone"
],
"https://my-ontology.com#adjacent_element": [
{
"@id": "https://my-company.com/Wall_002"
}
],
"https://my-ontology.com#area": [
{
"@value": "12.4 m2"
}
],
"https://my-ontology.com#heated": [
{
"@value": true
}
],
"https://my-ontology.com#number_of_occupants": [
{
"@value": 2
}
]
},
{
"@id": "https://my-company.com/Window_022",
"@type": [
"https://my-ontology.com#Window",
"https://my-ontology.com#Element"
],
"https://my-ontology.com#height": [
{
"@value": "1200 mm"
}
],
"https://my-ontology.com#width": [
{
"@value": "900 mm"
}
]
}
]
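To see how mechanical the expanded form is, here is a Python sketch that builds it from a list of triples. It is a simplification, not the JSON-LD algorithm from the specification: each triple is assumed to carry a fourth, hypothetical flag telling whether the object is a resource (True) or a literal (False).

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def to_expanded_jsonld(triples):
    """Build one node object per subject; every property value is a list."""
    nodes = {}
    for s, p, o, object_is_resource in triples:
        node = nodes.setdefault(s, {"@id": s})
        if p == RDF_TYPE:
            node.setdefault("@type", []).append(o)       # rdf:type becomes @type
        elif object_is_resource:
            node.setdefault(p, []).append({"@id": o})    # link to another resource
        else:
            node.setdefault(p, []).append({"@value": o}) # literal value
    return list(nodes.values())

wall = "https://my-company.com/Wall_002"
doc = to_expanded_jsonld([
    (wall, RDF_TYPE, "https://my-ontology.com#Wall", True),
    (wall, RDF_TYPE, "https://my-ontology.com#Element", True),
    (wall, "https://my-ontology.com#has_window", "https://my-company.com/Window_022", True),
    (wall, "https://my-ontology.com#thickness", "240 mm", False),
])
print(json.dumps(doc, indent=2))
```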
By stating a @context and using the concept of compaction, it is possible to simplify this tremendously.
The @graph component of the result presents all our resources in a flat list.
JSON-LD comes with tools for switching between these representations on the fly, which is really powerful. In a later tutorial we might take a look at framing!
{
"@context": {
"inst": "https://my-company.com/",
"ont": "https://my-ontology.com#"
},
"@graph": [
{
"@id": "inst:Wall_002",
"@type": ["ont:Wall", "ont:Element"],
"ont:has_window": {"@id": "inst:Window_022"},
"ont:U_value": "0.21 W/m2K",
"ont:thickness": "240 mm"
},
{
"@id": "inst:Space_013",
"@type": ["ont:Space", "ont:Zone"],
"ont:adjacent_element": {"@id": "inst:Wall_002"},
"ont:area": "12.4 m2",
"ont:number_of_occupants": 2,
"ont:heated": true
},
{
"@id": "inst:Window_022",
"@type": ["ont:Window", "ont:Element"],
"ont:height": "1200 mm",
"ont:width": "900 mm"
}
]
}
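The gist of compaction can be approximated in Python as below. Be aware that this is a hand-rolled simplification for this particular example and not the compaction algorithm from the JSON-LD specification: it only understands prefix entries in the context, keeps @type as a list, and unwraps single-item property lists. In real projects you would rather use an existing JSON-LD library or the playground. All function names are hypothetical.

```python
def compact_iri(iri, context):
    """Abbreviate an IRI with the first prefix whose namespace matches."""
    for prefix, namespace in context.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri

def compact_value(value, context):
    """Turn {'@value': x} into x and shorten the IRI inside {'@id': ...}."""
    if "@value" in value:
        return value["@value"]
    return {"@id": compact_iri(value["@id"], context)}

def compact(expanded, context):
    """Simplified compaction: prefixes only, single-item lists are unwrapped."""
    graph = []
    for node in expanded:
        out = {}
        for key, values in node.items():
            if key == "@id":
                out[key] = compact_iri(values, context)
            elif key == "@type":
                out[key] = [compact_iri(v, context) for v in values]
            else:
                items = [compact_value(v, context) for v in values]
                out[compact_iri(key, context)] = items[0] if len(items) == 1 else items
        graph.append(out)
    return {"@context": context, "@graph": graph}

context = {"inst": "https://my-company.com/", "ont": "https://my-ontology.com#"}
expanded = [{
    "@id": "https://my-company.com/Window_022",
    "@type": ["https://my-ontology.com#Window", "https://my-ontology.com#Element"],
    "https://my-ontology.com#height": [{"@value": "1200 mm"}],
}]
result = compact(expanded, context)
print(result["@graph"][0])
```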
At the JSON-LD playground you can get a more detailed introduction to the possibilities. The next slide, for example, demonstrates a compaction that maps to Danish keys in the JSON object.
To be continued...