LBD School lecture 1

Introduction to Linked Building Data

By Mads Holten Rasmussen

v0

This tutorial is part of a series in the Linked Building Data (LBD) School 🏫

The slides are quite detailed and text-heavy as they are supposed to be read rather than presented.

As I have been using these technologies for a while, there might be topics that I take for granted and therefore skip over too quickly. Please let me know if this is the case! 🙏

Also, since this is something I do in my free time, please support me by buying me a coffee 🤗☕💙

What you will learn

  • the Linked Data principles
  • intro to Resource Description Framework (RDF)
  • different serializations of RDF
  • the Building Topology Ontology (BOT) and other LBD ontologies

Linked Building Data? 🤔

Linked Building Data (LBD) encapsulates all the ontologies that are being developed and maintained by the W3C Linked Building Data Community Group

They are an alternative to ifcOWL, the Web Ontology Language (OWL) version of IFC, and are easier to understand, extend and query 👌

...but before we can dig into the content of these we will need to learn the basics. So jump to the next slide and we will get to it! 💪🏃💨

Linked Data

The concept of Linked Data was coined in 2006 by Sir Tim Berners-Lee, the inventor of the web, and is therefore not entirely new 👴👵

It basically refers to a set of best practices for publishing structured data on the Web

Linked Data Principles 🤓

  1. Use URIs (Uniform Resource Identifiers) as names for things
  2. Use HTTP URIs (web addresses) so that people can look up those names 👀
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) (more on this later) 💡
  4. Include links to other URIs so that they can discover more things 🔭🧭🗺️

Summary

A network of resources that are interlinked and identified by web addresses.

Beware that you will often see resources described in a namespace that doesn't really exist (e.g. https://example.com/my-resource). This is not ideal, but it is perfectly okay for learning and getting started. 🩹

Also note that not all linked data needs to be open data. You can restrict the access to it 🚦🦺

The Resource Description Framework

As the name reveals, RDF is a framework and not a format. It includes the terminology you need to describe Linked Data resources (e.g. the type of resource, its attributes and its relationships to other resources).

This is all done by creating statements: so-called "triples".

Multiple triples form a graph if the object of one triple matches the subject of another

This is often referred to as a "Knowledge Graph"
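
As a rough sketch of this idea, triples can be held as plain Python tuples and followed from the object of one triple to the subject of the next. The resource and predicate names are the short ones used in the example graphs; the helper functions "neighbours" and "reachable" are made up for this illustration.

```python
# Each triple is a (subject, predicate, object) tuple.
triples = [
    ("Space_013", "adjacent_element", "Wall_002"),
    ("Wall_002", "has_window", "Window_022"),
]

def neighbours(graph, subject):
    """Return the (predicate, object) pairs stated about a subject."""
    return [(p, o) for s, p, o in graph if s == subject]

def reachable(graph, start):
    """Follow triples from object to subject and collect every node reached."""
    seen, todo = [], [start]
    while todo:
        node = todo.pop()
        for _, obj in neighbours(graph, node):
            if obj not in seen:
                seen.append(obj)
                todo.append(obj)
    return seen

print(reachable(triples, "Space_013"))  # ['Wall_002', 'Window_022']
```

The window is reached from the space only because Wall_002 is the object of the first triple and the subject of the second.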

Predicates that connect resources are called "object properties"

"has_window" and "adjacent_element" from the previous graphs are examples of object properties

RDF also describes simple properties that point to an object which is simply a literal value. Those are called "datatype properties"
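
The difference between the two kinds of properties can be sketched in a few lines of Python. This is only an illustration: the "Literal" wrapper class and the "property_kind" function are assumptions made for this example, and the short names stand in for full URIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A plain value such as a number or a text string."""
    value: object

triples = [
    ("Wall_002", "has_window", "Window_022"),
    ("Wall_002", "thickness", Literal("240 mm")),
    ("Space_013", "adjacent_element", "Wall_002"),
    ("Space_013", "number_of_occupants", Literal(2)),
]

def property_kind(triple):
    """Classify a triple by what its object is: another resource or a literal."""
    _, _, obj = triple
    return "datatype property" if isinstance(obj, Literal) else "object property"

for t in triples:
    print(t[1], "->", property_kind(t))
```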

As a last thing, RDF allows you to classify a thing by assigning one or more types to it.

Adding classes to things provides a better way for others to understand your dataset. And for computers!

You and I could probably deduce that Wall_002 is a Wall, but now it is explicitly stated.

That's it! Now you know the basic principles of RDF! 🥳🎉

Oh, not too fast! The Linked Data principles state that we should use HTTP URIs as names for things, so actually it looks more like what is shown on the next slide... 😱🤯

But don't worry! With the use of prefixes, it will look more like this 😌

The prefixes are abbreviations for the full URI so:

inst:Wall_002  =  https://my-company.com/Wall_002

inst:Space_013  =  https://my-company.com/Space_013

ont:has_window  =  https://my-ontology.com#has_window

The example uses two prefixes (inst: and ont:), which refer to our instance namespace and our ontology namespace respectively.
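
Expanding a prefixed name is nothing more than string concatenation. A minimal sketch in Python, using the two namespaces from the example (the function name "expand" is just an assumption for this illustration):

```python
PREFIXES = {
    "inst": "https://my-company.com/",
    "ont": "https://my-ontology.com#",
}

def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'inst:Wall_002' into its full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return prefixes[prefix] + local

print(expand("inst:Wall_002"))   # https://my-company.com/Wall_002
print(expand("ont:has_window"))  # https://my-ontology.com#has_window
```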

...and this brings us to the concepts ABox and TBox.

TBox is the "terminology layer" (i.e. the dictionaries) where predicates and classes are defined. These can later be used by various people to describe something. 📙

ABox is our "assertions layer" (i.e. the instance data) where we make actual statements about the world by using the TBox.

It is common to see multiple ontologies used to describe one dataset. This is one of the absolute strengths of Linked Data: the same thing can be an ifc:Building as defined by buildingSMART, a citygml:Building as described by OGC and a dbpedia:Building referring to Wikipedia's description of a building, all at once.

In LBD we would typically at least describe it as a bot:Building, but more on this later.

Also the instance data can be distributed so part of the data can reside on one server and other parts on another.

This gives the technologies a huge decentralization potential, which we will hopefully cover in a later class! 🚀🚀🚀

RDF vs. tables

Most of us are used to storing information in spreadsheets (or relational databases (RDBs) for the more advanced), but in RDF we use a graph structure. So how do those compare?

Let's take a look at a very simple table example to find out!

From an RDF perspective we see two subjects. One for each data row. And they are even identified with an ID.

Each column contains a statement about those subjects and the predicate value is given by the column header.

The value of the subject-predicate pair can then be found in the cell.

And with RDF we can combine table data as long as we can find a common identifier for the rows.

For example, we could have another table with middle names.

It's not a problem that there is no middle name for Alexander. RDF builds on an open world assumption, meaning that if we have the information that's great, and if we don't that's also alright. Then Alexander's middle name is simply unknown to us.
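
To make the mapping concrete, here is a small Python sketch that turns table rows into triples and combines two tables. The table contents are hypothetical stand-ins (only the name Alexander comes from the example), and the "person_" naming of subjects is an assumption made for this illustration.

```python
# Hypothetical stand-ins for the two tables; only "Alexander" is from the example.
people = [
    {"id": "1", "first_name": "Alexander", "last_name": "Hansen"},
    {"id": "2", "first_name": "Maria", "last_name": "Jensen"},
]
middle_names = [
    {"id": "2", "middle_name": "Louise"},
]

def rows_to_triples(rows, id_column="id"):
    """One subject per row, one predicate per column, one object per cell."""
    triples = set()
    for row in rows:
        subject = "person_" + row[id_column]
        for column, cell in row.items():
            if column != id_column:
                triples.add((subject, column, cell))
    return triples

# Combining the tables is just a union of the two sets of triples.
graph = rows_to_triples(people) | rows_to_triples(middle_names)

for triple in sorted(graph):
    print(triple)
```

Note that nothing is stated about a middle name for person_1. There is no empty cell to fill in; the triple is simply not there.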

In RDBs it's common to have dedicated tables that deal with many-to-many relationships. These are also easily translated.
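
A join table translates particularly directly: each row becomes exactly one triple between two resources. The sketch below assumes a hypothetical table linking people to projects; the column names and the "works_on" predicate are made up for this illustration.

```python
# Hypothetical join table linking people to projects (many-to-many).
person_project = [
    {"person_id": "1", "project_id": "A"},
    {"person_id": "1", "project_id": "B"},
    {"person_id": "2", "project_id": "A"},
]

def join_rows_to_triples(rows, predicate="works_on"):
    """Each row of the join table becomes one triple between two resources."""
    return [
        ("person_" + row["person_id"], predicate, "project_" + row["project_id"])
        for row in rows
    ]

for triple in join_rows_to_triples(person_project):
    print(triple)
```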

So now we also understand the differences and similarities between RDF and tables. And you can probably see how this data structure will save us from a ton of null values. 😃

And compared to RDBs there is the benefit that the data model makes sense from a logical perspective and not only from a data management perspective. This benefit will be more obvious when we start to query the datasets later!

RDF Serializations

This is a more advanced topic for those that are familiar with XML or JSON.

You can press the right arrow to skip this section, but if you're up for it, hit the down arrow! 💪

Let's first take a look at what RDF looks like in the simplest of all serializations, called N-Triples.


								<https://my-company.com/Wall_002> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Wall> .
								<https://my-company.com/Wall_002> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Element> .
								<https://my-company.com/Wall_002> <https://my-ontology.com#has_window> <https://my-company.com/Window_022> .
								<https://my-company.com/Wall_002> <https://my-ontology.com#U_value> "0.21 W/m2K" .
								<https://my-company.com/Wall_002> <https://my-ontology.com#thickness> "240 mm" .
								<https://my-company.com/Space_013> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Space> .
								<https://my-company.com/Space_013> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Zone> .
								<https://my-company.com/Space_013> <https://my-ontology.com#adjacent_element> <https://my-company.com/Wall_002> .
								<https://my-company.com/Space_013> <https://my-ontology.com#area> "12.4 m2" .
								<https://my-company.com/Space_013> <https://my-ontology.com#number_of_occupants> "2"^^<http://www.w3.org/2001/XMLSchema#integer> .
								<https://my-company.com/Space_013> <https://my-ontology.com#heated> "true"^^<http://www.w3.org/2001/XMLSchema#boolean> .
								<https://my-company.com/Window_022> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Window> .
								<https://my-company.com/Window_022> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://my-ontology.com#Element> .
								<https://my-company.com/Window_022> <https://my-ontology.com#height> "1200 mm" .
								<https://my-company.com/Window_022> <https://my-ontology.com#width> "900 mm" .
							  

This syntax is incredibly simple and super fast for computers to read.

Triples are stated in their full length and a dot indicates the end of a triple.

However, for humans it is impossible to read and it takes up a lot of space!
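
Because the syntax is so regular, writing it takes only a few lines of code. The Python sketch below is a simplification and not a complete serializer: it assumes that any string starting with "http://" or "https://" is a URI, and it only escapes backslashes and quotes in text values. N-Triples has no shorthand for numbers or booleans, so these are written as typed literals. The function names are made up for this illustration.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

def term(value):
    """Write one term: URIs in angle brackets, everything else as a literal."""
    if isinstance(value, bool):  # check bool before int, since True is also an int
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    if value.startswith(("http://", "https://")):  # assumption: such strings are URIs
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriples(triples):
    """One full triple per line, each terminated by a dot."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)

triples = [
    ("https://my-company.com/Space_013", "https://my-ontology.com#area", "12.4 m2"),
    ("https://my-company.com/Space_013", "https://my-ontology.com#number_of_occupants", 2),
    ("https://my-company.com/Space_013", "https://my-ontology.com#heated", True),
]
print(to_ntriples(triples))
```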

The Terse RDF Triple Language (Turtle) simplifies this a whole lot by introducing some syntactic sugar.

Not all of this sugar is covered here, but part of it is, since the same syntax is used in queries. Let's start by defining the prefixes as described in the previous section.


								@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
								@prefix inst: <https://my-company.com/> .
								@prefix ont:  <https://my-ontology.com#> .

								inst:Wall_002      rdf:type                 ont:Wall .
								inst:Wall_002      rdf:type                 ont:Element .
								inst:Wall_002      ont:has_window           inst:Window_022 .
								inst:Wall_002      ont:U_value              "0.21 W/m2K" .
								inst:Wall_002      ont:thickness            "240 mm" .
								inst:Space_013     rdf:type                 ont:Space .
								inst:Space_013     rdf:type                 ont:Zone .
								inst:Space_013     ont:adjacent_element     inst:Wall_002 .
								inst:Space_013     ont:area                 "12.4 m2" .
								inst:Space_013     ont:number_of_occupants  2 .
								inst:Space_013     ont:heated               true .
								inst:Window_022    rdf:type                 ont:Window .
								inst:Window_022    rdf:type                 ont:Element .
								inst:Window_022    ont:height               "1200 mm" .
								inst:Window_022    ont:width                "900 mm" .
							  

In Turtle, "a" can be used as an abbreviation for rdf:type, so we can simplify it further:


								@prefix inst: <https://my-company.com/> .
								@prefix ont:  <https://my-ontology.com#> .

								inst:Wall_002      a                        ont:Wall .
								inst:Wall_002      a                        ont:Element .
								inst:Wall_002      ont:has_window           inst:Window_022 .
								inst:Wall_002      ont:U_value              "0.21 W/m2K" .
								inst:Wall_002      ont:thickness            "240 mm" .
								inst:Space_013     a                        ont:Space .
								inst:Space_013     a                        ont:Zone .
								inst:Space_013     ont:adjacent_element     inst:Wall_002 .
								inst:Space_013     ont:area                 "12.4 m2" .
								inst:Space_013     ont:number_of_occupants  2 .
								inst:Space_013     ont:heated               true .
								inst:Window_022    a                        ont:Window .
								inst:Window_022    a                        ont:Element .
								inst:Window_022    ont:height               "1200 mm" .
								inst:Window_022    ont:width                "900 mm" .
							  

We can further use ";" instead of "." to end a triple when the same subject is reused in the next triple.


								@prefix inst: <https://my-company.com/> .
								@prefix ont:  <https://my-ontology.com#> .

								inst:Wall_002      a                        ont:Wall ;
								                   a                        ont:Element ;
								                   ont:has_window           inst:Window_022 ;
								                   ont:U_value              "0.21 W/m2K" ;
								                   ont:thickness            "240 mm" .
								inst:Space_013     a                        ont:Space ;
								                   a                        ont:Zone ;
								                   ont:adjacent_element     inst:Wall_002 ;
								                   ont:area                 "12.4 m2" ;
								                   ont:number_of_occupants  2 ;
								                   ont:heated               true .
								inst:Window_022    a                        ont:Window ;
								                   a                        ont:Element ;
								                   ont:height               "1200 mm" ;
								                   ont:width                "900 mm" .
							  

Lastly, we can use "," to end a triple when the same subject-predicate pair is reused in the next triple.


								@prefix inst: <https://my-company.com/> .
								@prefix ont:  <https://my-ontology.com#> .

								inst:Wall_002      a                        ont:Wall , 
								                                            ont:Element ;
								                   ont:has_window           inst:Window_022 ;
								                   ont:U_value              "0.21 W/m2K" ;
								                   ont:thickness            "240 mm" .
								inst:Space_013     a                        ont:Space ,
								                                            ont:Zone ;
								                   ont:adjacent_element     inst:Wall_002 ;
								                   ont:area                 "12.4 m2" ;
								                   ont:number_of_occupants  2 ;
								                   ont:heated               true .
								inst:Window_022    a                        ont:Window ,
								                                            ont:Element ;
								                   ont:height               "1200 mm" ;
								                   ont:width                "900 mm" .
							  

Since line breaks are ignored, we can also write it more compactly, as shown on the next slide. Furthermore, the order of the triples is of no importance.


								@prefix inst: <https://my-company.com/> .
								@prefix ont:  <https://my-ontology.com#> .

								inst:Wall_002
								    a ont:Wall , ont:Element ;
								    ont:has_window  inst:Window_022 ;
								    ont:U_value  "0.21 W/m2K" ;
								    ont:thickness  "240 mm" .
								inst:Space_013
								    a ont:Space , ont:Zone ;
								    ont:adjacent_element  inst:Wall_002 ;
								    ont:area  "12.4 m2" ;
								    ont:number_of_occupants  2 ;
								    ont:heated  true .
								inst:Window_022
								    a ont:Window , ont:Element ;
								    ont:height  "1200 mm" ;
								    ont:width  "900 mm" .
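
The ";" and "," shortcuts boil down to grouping triples, first by subject and then by predicate. The Python sketch below illustrates that grouping under some assumptions: the terms are already written in Turtle syntax (prefixed names and quoted literals), the prefix declarations are left out, and the function name "to_turtle" is made up for this example.

```python
from collections import defaultdict

def to_turtle(triples):
    """Group triples by subject (';') and by subject-predicate pair (',')."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        grouped[s][p].append(o)

    blocks = []
    for subject, predicates in grouped.items():
        lines = [f"    {p} {' , '.join(objs)}" for p, objs in predicates.items()]
        blocks.append(subject + "\n" + " ;\n".join(lines) + " .")
    return "\n".join(blocks)

triples = [
    ("inst:Wall_002", "a", "ont:Wall"),
    ("inst:Wall_002", "a", "ont:Element"),
    ("inst:Wall_002", "ont:has_window", "inst:Window_022"),
    ("inst:Wall_002", "ont:thickness", '"240 mm"'),
]
print(to_turtle(triples))
```

Four full triples go in and one compact block about inst:Wall_002 comes out, with both types listed after a single "a".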
							  

RDF can also be serialized as XML and JSON. RDF/XML will not be covered (who uses XML anyways?), but the new cool kid on the block, JSON-LD, will for sure.

Like Turtle, JSON-LD can have many shapes and forms. The simplest form is the expanded version.

JSON-LD is just JSON but it adds some special keys like @id to define the subject and @type to assign rdf:type.

Similarly to N-Triples, the expanded JSON-LD uses the full URIs. Since RDF allows for multiple assignments of classes or any other value for that matter, the value of @type and of every property key is always an array.


						[
							{
								"@id": "https://my-company.com/Wall_002",
								"@type": [
									"https://my-ontology.com#Wall",
									"https://my-ontology.com#Element"
								],
								"https://my-ontology.com#U_value": [
									{
										"@value": "0.21 W/m2K"
									}
								],
								"https://my-ontology.com#has_window": [
									{
										"@id": "https://my-company.com/Window_022"
									}
								],
								"https://my-ontology.com#thickness": [
									{
										"@value": "240 mm"
									}
								]
							},
							{
								"@id": "https://my-company.com/Space_013",
								"@type": [
									"https://my-ontology.com#Space",
									"https://my-ontology.com#Zone"
								],
								"https://my-ontology.com#adjacent_element": [
									{
										"@id": "https://my-company.com/Wall_002"
									}
								],
								"https://my-ontology.com#area": [
									{
										"@value": "12.4 m2"
									}
								],
								"https://my-ontology.com#heated": [
									{
										"@value": true
									}
								],
								"https://my-ontology.com#number_of_occupants": [
									{
										"@value": 2
									}
								]
							},
							{
								"@id": "https://my-company.com/Window_022",
								"@type": [
									"https://my-ontology.com#Window",
									"https://my-ontology.com#Element"
								],
								"https://my-ontology.com#height": [
									{
										"@value": "1200 mm"
									}
								],
								"https://my-ontology.com#width": [
									{
										"@value": "900 mm"
									}
								]
							}
							]
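
To show how closely the expanded form follows the triples, here is a Python sketch that builds it using only the standard json module. It is an illustration under assumptions, not a full JSON-LD implementation: strings starting with "http://" or "https://" are treated as URIs, everything else as a literal value, and the function name is made up.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def to_expanded_jsonld(triples):
    """Build one node object per subject; every property value is a list."""
    nodes = {}
    for s, p, o in triples:
        node = nodes.setdefault(s, {"@id": s})
        if p == RDF_TYPE:
            node.setdefault("@type", []).append(o)
        elif isinstance(o, str) and o.startswith(("http://", "https://")):
            node.setdefault(p, []).append({"@id": o})      # link to another resource
        else:
            node.setdefault(p, []).append({"@value": o})   # literal value
    return list(nodes.values())

triples = [
    ("https://my-company.com/Wall_002", RDF_TYPE, "https://my-ontology.com#Wall"),
    ("https://my-company.com/Wall_002", "https://my-ontology.com#has_window",
     "https://my-company.com/Window_022"),
    ("https://my-company.com/Wall_002", "https://my-ontology.com#thickness", "240 mm"),
]
print(json.dumps(to_expanded_jsonld(triples), indent=2))
```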
							  

By stating a @context and using the concept of compaction, it is possible to simplify this tremendously.

The @graph component of the result presents all our resources in a flat list.

JSON-LD comes with tools to switch between these representations on the fly, which is really powerful. In a later tutorial we might take a look at framing!


						{
							"@context": {
								"inst": "https://my-company.com/",
								"ont": "https://my-ontology.com#"
							},
							"@graph": [
								{
									"@id": "inst:Wall_002",
									"@type": ["ont:Wall", "ont:Element"],
									"ont:has_window": {"@id": "inst:Window_022"},
									"ont:U_value": "0.21 W/m2K",
									"ont:thickness": "240 mm"
								},
								{
									"@id": "inst:Space_013",
									"@type": ["ont:Space", "ont:Zone"],
									"ont:adjacent_element": {"@id": "inst:Wall_002"},
									"ont:area": "12.4 m2",
									"ont:number_of_occupants": 2,
									"ont:heated": true
								},
								{
									"@id": "inst:Window_022",
									"@type": ["ont:Window", "ont:Element"],
									"ont:height": "1200 mm",
									"ont:width": "900 mm"
								}
							]
						}
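
The gist of compaction can be sketched in plain Python. Be aware that this is a heavily simplified illustration and not the actual JSON-LD compaction algorithm, which handles many more cases: it only shortens URIs using the prefixes in the context and unwraps single values. All function names are made up for this example.

```python
import json

def shorten(iri, context):
    """Replace a namespace at the start of a URI with its prefix."""
    for prefix, namespace in context.items():
        if iri.startswith(namespace):
            return prefix + ":" + iri[len(namespace):]
    return iri

def compact_value(value, context):
    """Unwrap {'@value': x} and shorten {'@id': uri}."""
    if "@value" in value:
        return value["@value"]
    return {"@id": shorten(value["@id"], context)}

def compact_node(node, context):
    compacted = {}
    for key, values in node.items():
        if key == "@id":
            compacted[key] = shorten(values, context)
        elif key == "@type":
            compacted[key] = [shorten(t, context) for t in values]
        else:
            items = [compact_value(v, context) for v in values]
            # A single value no longer needs to be wrapped in a list.
            compacted[shorten(key, context)] = items[0] if len(items) == 1 else items
    return compacted

def compact(expanded, context):
    return {"@context": context, "@graph": [compact_node(n, context) for n in expanded]}

context = {"inst": "https://my-company.com/", "ont": "https://my-ontology.com#"}
expanded = [{
    "@id": "https://my-company.com/Wall_002",
    "@type": ["https://my-ontology.com#Wall", "https://my-ontology.com#Element"],
    "https://my-ontology.com#has_window": [{"@id": "https://my-company.com/Window_022"}],
    "https://my-ontology.com#thickness": [{"@value": "240 mm"}],
}]
print(json.dumps(compact(expanded, context), indent=2))
```

In practice you would leave this to a JSON-LD library, but the sketch shows that the expanded and the compacted document carry the same triples.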
							  

At the JSON-LD playground you can get a more detailed introduction to the possibilities. The next slide, for example, demonstrates a compaction that maps to Danish keys in the JSON object.

The Building Topology Ontology (BOT)

To be continued...
