Planet RDF

It's triples all the way down

October 12

AKSW Group - University of Leipzig: AKSW Colloquium

Andre Valdestilhas will present the current state of his work “A High Performance Approach to String Distance using Most Frequent K Characters”.


The idea is to reduce the computation needed in Link Discovery by using and improving a string distance function called Most Frequent K Characters, in order to compute the Euclidean distance between two strings with high performance. To this end, we ran experiments with a threshold and filters that allow skipping cases in which two strings are so far apart that the full computation is unnecessary. We also ran experiments with parallel and GPU implementations, which yielded additional performance gains.
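
As a rough illustration of the underlying idea (my own reading of the Most Frequent K Characters scheme, not the authors' implementation), the sketch below reduces each string to its k most frequent characters and compares those; it uses the plain max-distance-minus-similarity form rather than the Euclidean variant the abstract mentions, and the length-difference filter stands in for the kind of threshold/filter described above. All names are made up.

from collections import Counter

def most_freq_k_hashing(s, k):
    """Reduce a string to its k most frequent characters and their counts."""
    return dict(Counter(s).most_common(k))

def most_freq_k_similarity(h1, h2):
    """Sum the counts of the characters the two reduced strings share."""
    return sum(h1[c] + h2[c] for c in set(h1) & set(h2))

def mfk_distance(s1, s2, k=2, max_distance=10, length_threshold=None):
    """Distance = max_distance - similarity. The optional length filter skips
    the full computation when the strings obviously cannot be close."""
    if length_threshold is not None and abs(len(s1) - len(s2)) > length_threshold:
        return max_distance
    return max_distance - most_freq_k_similarity(
        most_freq_k_hashing(s1, k), most_freq_k_hashing(s2, k))

print(mfk_distance("research", "seeking", k=2))  # 6 with this sketch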

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 12:07

October 11

John Goodwin: On Beyond OWL: challenges for ontologies on the Web by James Hendler

Posted at 09:08

October 09

Semantic Web Company (Austria): Ensure data consistency in PoolParty

Semantic Web Company and its PoolParty team are participating in the H2020-funded project ALIGNED. This project evaluates software engineering and data engineering processes with a focus on how these two worlds can be aligned in an efficient way. All project partners are working on several use cases, which shall result in a set of detailed requirements for combined software and data engineering. The ALIGNED project framework also includes work and research on data consistency in PoolParty Thesaurus Server (PPT).

ALIGNED: Describing, finding and repairing inconsistencies in RDF data sets

When using RDF to represent the data model of applications, inconsistencies can occur. Compared with the schema approach of relational databases, a data model using RDF offers much more flexibility. Usually, the application’s business logic produces and modifies the model data and, therefore, can guarantee the consistency needed for its operations. However, information may not only be created and modified by the application itself but may also originate from external sources like RDF imports into the data model’s triple store. This may result in inconsistent model data causing the application to fail. Therefore, constraints have to be specified and enforced to ensure data consistency for the application. In Phase 1 of the ALIGNED project, we outline the problem domain and requirements for the PoolParty Thesaurus Server use case, with the goal of establishing a solution for describing, finding and repairing inconsistencies in RDF data sets. We propose a framework as a basis for integrating RDF consistency management into PoolParty Thesaurus Server software components. The approach is a work in progress that aims to adopt technologies developed by the ALIGNED project partners and refine them for use in an industrial-strength application.

Technical View

Users of PoolParty often wish to import arbitrary datasets, vocabularies, or ontologies. But these datasets do not always meet the constraints that PoolParty imposes. Currently, when users attempt to import data which violates the constraints, the data will simply fail to display or, in the worst case, cause unexpected behaviour and lead to (or reflect) errors in the application. An enhanced PoolParty will give the user feedback on why an import has failed, suggest ways in which the user can fix the problem, and identify potential new constraints that could be applied to the data structure. Apart from the import functionality, various other software components, such as the taxonomy editor or the reasoning engine, drive RDF data constraints and vice versa. The following figure outlines the utilization and importance of data consistency constraints in the PoolParty application:



Approaches and solutions for many of these components already exist. However, the exercise within ALIGNED is to integrate them in an easy-to-use way that fits the PoolParty environment. Consistency constraints, for example, can be formulated using RDF Data Shapes or by interpreting RDFS/OWL constructs with constraint-based semantics; RDFUnit already partly supports these techniques. Repair strategies and curation interfaces are covered by the Seshat Global History Databank project. Automated repair of large datasets can be managed by the UnifiedViews ETL tool, whereas immediate notification on data inconsistencies can be disseminated via the rsine semantic notification framework.
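
To make the kind of constraint concrete, here is a minimal sketch (my own illustration, not PoolParty or RDFUnit code) that checks one typical SKOS rule — every skos:Concept must carry at least one skos:prefLabel — using rdflib; the example data is made up.

from rdflib import Graph, RDF
from rdflib.namespace import SKOS

DATA = """
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .

ex:apple  a skos:Concept ; skos:prefLabel "Apple"@en .
ex:orange a skos:Concept .   # no prefLabel: violates the constraint
"""

g = Graph()
g.parse(data=DATA, format="turtle")

# Constraint: every skos:Concept must have at least one skos:prefLabel.
violations = [c for c in g.subjects(RDF.type, SKOS.Concept)
              if (c, SKOS.prefLabel, None) not in g]

for concept in violations:
    print("Inconsistent:", concept, "has no skos:prefLabel")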


Within the ALIGNED project, all project partners demand simple (i.e. maintainable and usable) data quality and consistency management and are working on solutions to meet their requirements. Our next steps will encompass research on how to apply these technologies to the PoolParty problem domain, and participation in unifying and integrating the different existing tools and approaches. The immediate challenge to address will be to build an interoperable catalog of formalized PoolParty data consistency constraints and repair strategies so that they are machine-processable in a (semi-)automatic way.

Posted at 15:12

October 08

David Robillard: Sord 0.14.0

Sord 0.14.0 is out. Sord is a lightweight C library for storing RDF statements in memory.

Developer note: this release does not break the ABI, but the semantics of iterators has changed: any modification to a model invalidates iterators on that model. Applications that keep iterators while modifying a model will need to be fixed. This is a consequence of a new underlying data structure, which significantly reduces memory overhead and improves performance.


  • Reduce memory usage and increase performance with a better data structure
  • Add sord_erase() for erasing statements via an iterator
  • Fix bugs with stores that contain both graphs and default graph statements
  • Fix crash caused by multiple deletion of datatype nodes
  • Fix compilation on compilers that do not support -pthread flag
  • Fix minor memory leak in sordi
  • Fix using sordi with stdin
  • Show sordi errors in standard format
  • sord_validate: More extensive validation, including cardinality, PlainLiteral, and someValuesFrom restrictions.
  • Improve test coverage
  • Upgrade to waf 1.8.14

Posted at 21:27

David Robillard: Serd 0.22.0

Serd 0.22.0 is out. Serd is a lightweight, high-performance, dependency-free C library for RDF syntax which supports reading and writing Turtle and NTriples.


  • Remove dependence on fmax() to avoid portability issues
  • Fix serd_reader_read_file() for URIs with escaped characters (spaces)
  • Add serd_reader_set_strict() and -l (lax) option to serdi to tolerate parsing URIs with escaped characters
  • Fix reading statements ending with a blank then dot with no space
  • Fix clash resolution when a blank node ID prefix is set
  • Fix serializing fractional decimals that would round up
  • Add support for Turtle named inline nodes extension
  • Report errors for invalid IRI characters and missing terminators
  • Show serdi errors in standard format
  • Fix warnings when building with ISO C++ compilers
  • Upgrade to waf 1.8.14


Posted at 21:19

Leigh Dodds: “The woodcutter”, an open data parable

In a time long past, in a land far away, there was once a great forest. It was a huge sprawling forest containing every known species of tree. And perhaps a few more.

The forest was part of a kingdom that had been ruled over by an old mad king for many years. The old king had refused anyone access to the forest. Only he was allowed to hunt amongst its trees. And the wood from the trees was used only to craft things that the king desired.

But there was now a new king. Where the old king was miserly, the new king was generous. Where the old king was cruel, the new king was wise.

As his first decree, the king announced that the trails that meandered through the great forest might be used by anyone who needed passage. And that the wood from his forest could be used by anyone who needed it, provided that they first ask the king’s woodcutter.

Several months after his decree, whilst riding on the edge of the forest, the king happened upon a surprising scene.

Gone was the woodcutter’s small cottage and workshop. In its place had grown up a collection of massive workshops and storage sheds. Surrounding the buildings was a large wooden palisade in which was set some heavily barred gates. From inside the palisade came the sounds of furious activity: sawing, chopping and men shouting orders.

All around the compound, filling the nearby fields, was a bustling encampment. Looking at the array of liveries, flags and clothing on display, the king judged that there were people gathered here from all across his lands. From farms, cities, and towns. From the coast and the mountains. There were also many from neighbouring kingdoms.

It was also clear that many of these people had been living here for some time.

Perplexed, the king rode to the compound, making his way through the crowds waiting outside the gates. Once he had been granted entry, he immediately sought out the woodcutter, finding him directing activities from a high vantage point.

Climbing to stand beside the woodcutter the king asked, “Woodcutter, why are all these people waiting outside of your compound? Where is the wood that they seek?”

Flustered, the woodcutter mopped his brow and bowed to his king. “Sire, these people shall have their wood as soon as we are ready. But first we must make preparations.”

“What preparations are needed?”, asked the king. “Your people have provided wood from this forest for many, many years. While the old king took little, is it not the same wood?”

“Ah, but sire, we must now provide the wood to so many different peoples”. Gesturing to a small group of tents close to the compound, the woodcutter continued: “Those are the ship builders. They need the longest, straightest planks to build their ships. And great trees to make their keels”.

“Over there are the house builders”, the woodcutter gestured, “they too need planks. But of a different size and from a different type of tree. This small group here represents the carpenters guild. They seek only the finest hard woods to craft clever jewellery boxes and similar fine goods.”

The king nodded. “So you have many more people to serve and many more trees to fell.”

“That is not all”, said the woodcutter pointing to another group. “Here are the river people who seek only logs to craft their dugout boats. Here are the toy makers who need fine pieces. Here are the fishermen seeking green wood for their smokers. And there the farmers and gardeners looking for bark and sawdust for bedding and mulch”.

The king nodded. “I see. But why are they still waiting for their wood? Why have you recruited men to build this compound and these workshops, instead of fetching the wood that they need?”

“How else are we to serve their needs sire? In the beginning I tried to handle each new request as it came in. But every day a new type and shape of wood. If I created planks, then the river people needed logs. If I created chippings, the house builders needed cladding.

Everyone saw only their own needs. Only I saw all of them. To fulfil your decree, I need to be ready to provide whatever the people needed.

And so unfortunately they must wait until we are better able to do so. Soon we will be, once the last dozen workshops are completed. Then we will be able to begin providing wood once more.”

The king frowned in thought. “Can the people not fetch their own wood from the forest?”

Sadly, the woodcutter said, “No sire. Outside of the known trails the woods are too dangerous. Only the woodcutters know the safe paths. And only the woodcutters know the art of finding the good wood and felling it safely. It is an art that is learnt over many years”.

“But don’t you see?” said the King, “You need only do this and then let others do the rest. Fell the trees and bring the logs here. Let others do the making of planks and cladding. Let others worry about running the workshops. There is a host of people here outside your walls who can help. Let them help serve each other’s needs. You need only provide the raw materials”.

And with this the king ordered the gates to the compound to be opened, sending the relieved woodcutter back to the forest.

Returning to the compound many months later, the king once again found it to be a hive of activity. Except now the house builders and ship makers were crafting many sizes and shapes of planks. The toy makers took offcuts to shape the small pieces they needed, and the gardeners swept the leavings from all into sacks to carry to their gardens.

Happy that his decree had at last been fulfilled, the king continued on his way.

Read the first open data parable, “

Posted at 20:23

Dydra: NXP Wins the EU Linked Data Award

We are pleased to be able to report that the Product Marketing group at NXP Semiconductors has been awarded first prize in both the Dutch Best Linked Data Application of 2015 contest and the 2015 1st European Linked Data Award contest. Both awards were given in the category Linked Enterprise Data, for demonstrating how, by applying the linked data paradigm to their marketing and product data, NXP increased its value, facilitating and accelerating a variety of sales, publication, and reporting processes.

NXP Enterprise Data Hub

The international European Linked Data Contest (ELDC) award jury from over 15 European countries elected the NXP Enterprise Data Hub project as the winner from 53 submissions in 22 countries:

The NXP Enterprise Data Hub integrates data and metadata from several enterprises systems, to provide a single, up-to-date ‘canonical’ source of information that is easy to use and that data consumers—be they human or machine—can trust. NXP Semiconductors have taken the Linked Data principles to heart and successfully rolled out a Linked Data solution across the whole company. The jury finds that this is worth to [be] honored by the ELDC 2015.

Posted at 17:09

October 01

Tetherless World Constellation group RPI: Historic launch of the Global Partnership for Sustainable Development Data

An informational email in early September from Simon Hodson, the CODATA Executive Director, attracted my deep interest. His email was about the high-level political launch of the Global Partnership for Sustainable Development Data. I was interested because I have worked on Open Data in the past few years, and that experience shows that Open Data is much more than a purely technical issue. I was excited to see such an event initiated by political partners and focused on social impact. Thanks to the support of the CODATA Early Career Data Professionals Working Group, I was able to head to New York City to attend the forum in person on September 28th.

The forum was held in the Jade Room of the Waldorf Astoria hotel and lasted for three hours, from 2 to 5 PM, with a tight but well-organized schedule of about 10 lightning talks, four panels and about 30 commitment introductions from the partners. The panels and lightning talks focused on why open data is needed, how to make data open and, especially, the value of open data for the 17 Global Goals for Sustainable Development and the social impact that the data can generate. I was happy to see that success stories of open geospatial data were mentioned several times in the lightning talks and the panels. For example, delegates from the World Resources Institute presented Global Forest Watch-Fires (GFW-Fires), which provides near-real-time information from various sources and enables people to respond promptly before a fire gets out of control. During the partner introductions, I heard more exciting news about the actions that stakeholders in governments, academia, industry and non-profit organizations are going to take to support the joint efforts of the Global Partnership for Sustainable Development Data. For example, the Children’s Investment Fund Foundation will invest $20m to improve data on coverage of nutrition interventions and other key indicators by 2020 in several countries; DigitalGlobe commits to provide three countries with evaluation licenses to their BaseMap service as well as training sessions for human resources; Planet Labs commits $60 million in geospatial imagery to support the global community; and the William and Flora Hewlett Foundation is proposing to commit about $3m to the start-up support of the secretariat for a Global Partnership for Sustainable Development Data. A list of the current partners is accessible on the partnership’s website.

The Global Partnership for Sustainable Development Data has a long-term vision for the year 2030: a world in which everyone is able to engage in solving the world’s greatest problems by (1) Effectively Using Data and (2) Fostering Trust and Accountability in the Sharing of Data. The pioneering partners in this effort have already committed to deliver more than 100 data-driven projects worldwide to pave the way towards the 2030 vision. For the first year, the partnership will work together to achieve these goals: (1) Improve the Effective Use of Data, (2) Fill Key Data Gaps, (3) Expand Data Literacy and Capacity, (4) Increase Openness and Leverage of Existing Data, and (5) Mobilize Political Will and Resources.

The forum was chaired by Prof. Sanjeev Khagram, with over 200 attendees from various backgrounds. During the reception after the forum, I had a brief chat with Prof. Khagram about CODATA and the Early Career Data Professionals Working Group, as well as potential collaborations. He informed me that the partnership is open and invites broad participation to address the sustainable development goals. Prof. Khagram also mentioned that a bigger event, the World Data Forum, will take place in 2016. I also had the opportunity to catch up with Dr. Bob Chen from CIESIN, Columbia University, about recent activities. It seems that ‘climate change’ is the topic of focus for several conferences in 2015, such as the International Scientific Conference, the Research Data Alliance Sixth Plenary Meeting and the United Nations Climate Change Conference, and Paris is the host city for all three events.

The report A World That Counts: Mobilising The Data Revolution for Sustainable Development, prepared by the United Nations Secretary-General’s Independent Expert Advisory Group on a Data Revolution for Sustainable Development, provides more background information about the Global Partnership for Sustainable Development Data.

Posted at 00:54

September 30

Dublin Core Metadata Initiative: Joseph Tennis begins role as Chair of the DCMI Governing Board

2015-09-30, Joseph Tennis became Chair of the DCMI Governing Board and Paul Walk its Chair-Elect during the closing ceremony of DC-2015 in São Paulo, Brazil. Eric Childress became Immediate Past Chair, and Michael Crandall retired from the Governing Board and his role as Immediate Past Chair. Tennis will be Chair of the Governing Board and its Executive Committee through DCMI's Annual Meeting and International Conference in Copenhagen in October 2016. Tennis is also President of the International Society for Knowledge Organization (ISKO), serving a four-year term from 2014 to 2018. Information about the DCMI Governing Board and its members can be found at

Posted at 23:59

W3C Read Write Web Community Group: Read Write Web — Q3 Summary — 2015


A relatively quiet quarter in the world of (read-write) web standards.  Generally my impression is that the emphasis is shifting more towards implementation, with most of the hard parts of design now out of the way.  Andrei gave a great presentation at the re-decentralize web conference in Brazil.

There was quite a bit of discussion regarding decentralized identity, the same-origin policy and public/private key provisioning in the browser.  In particular, the KEYGEN element was looked at, with some wanting to deprecate it.  However, it was established that it is still in use and would not be deprecated just yet; hopefully something better will soon be available.

More work this month on the SoLiD platform and apps (see below), and some discussion on PKI, trust and reputation systems on the web.  MIT / Crosscloud are also going to continue their great work and have announced they are hiring two developers.  Please take a look if you are interested.

Communications and Outreach

There were some more discussions with the Social Web Working Group.  There was also great news as it was announced that Sarven Capadisli and Amy Guy would be joining MIT to do some more good work in this area.

Community Group

The SoLiD framework now has its own logo and area on GitHub.  There was a short discussion regarding creating a decentralized web of trust.  Based on this, I put together some tools to convert public keys between formats and use them to encrypt, decrypt and sign messages.  We also started a wiki page for getting started with the SoLiD framework.



Some steady work this month on applications.  To illustrate different aspects of development with the SoLiD platform I put together a short series of apps, each created in under 24 hours.  The first was a hello world app, then a clipboard, a video display app, a simple chess game and a data explorer.

Lots of work was completed on ldnode, gold and rdflib.  And also the start of an excellent realtime collaboration pad by timbl, which has turned out to be very useful already!


Last but not Least…

More great work from openlink as they prepare to release to production an excellent generic linked data document reader and editor.  Feel free to give it a try, preferably using your own data space!

Posted at 23:19

September 29

Semantic Web Company (Austria): SPARQL analytics proves boxers live dangerously

You have always thought that SPARQL is only a query language for RDF data? Then think again, because SPARQL can also be used to implement some cool analytics. I show here two queries that demonstrate that principle.

For simplicity we use a publicly available dataset of DBpedia on an open SPARQL endpoint: (execute with default graph =

Mean life expectancy for different sports

The query shown here starts from the class dbp:Athlete and retrieves its sub classes, which cover the different sports. From these, the athletes of each area are obtained together with their birth and death dates (i.e. we only take deceased individuals into account). The years are then extracted from the dates; a regular expression is used here because the SPARQL function for extracting years from a literal of a date type returned errors and could not be used. From the birth and death years the age is calculated (we filter for a range of 20 to 100 years because erroneous entries always have to be accounted for in data sources like this). Then the data is simply grouped, and for each sport we count the number of athletes that were selected and compute the average age they reached.

prefix dbp:<http://dbpedia.org/ontology/>
select ?athleteGroupEN (count(?athlete) as ?count) (avg(?age) as ?ageAvg)
where {
  filter(?age >= 20 && ?age <= 100) .
  {
    select distinct ?athleteGroupEN ?athlete (?deathYear - ?birthYear as ?age)
    where {
      ?subOfAthlete rdfs:subClassOf dbp:Athlete .
      ?subOfAthlete rdfs:label ?athleteGroup filter(lang(?athleteGroup) = "en") .
      bind(str(?athleteGroup) as ?athleteGroupEN)
      ?athlete a ?subOfAthlete .
      ?athlete dbp:birthDate ?birth filter(datatype(?birth) = xsd:date) .
      ?athlete dbp:deathDate ?death filter(datatype(?death) = xsd:date) .
      bind (strdt(replace(?birth,"^(\\d+)-.*","$1"),xsd:integer) as ?birthYear) .
      bind (strdt(replace(?death,"^(\\d+)-.*","$1"),xsd:integer) as ?deathYear) .
    }
  }
} group by ?athleteGroupEN having (count(?athlete) >= 25) order by ?ageAvg

The results are not unexpected and show that athletes in the areas of motor sports, wrestling and boxing die at a younger age. On the other hand, horse riders, but also tennis and golf players, clearly live longer on average.

?athleteGroupEN ?count ?ageAvg
wrestler 693 58.962481962481962
winter sport Player 1775 66.60169014084507
tennis player 577 71.483535528596187
table tennis player 45 68.733333333333333
swimmer 402 68.674129353233831
soccer player 6572 63.992391965916007
snooker player 25 70.12
rugby player 1452 67.272038567493113
rower 69 63.057971014492754
poker player 30 66.866666666666667
national collegiate athletic association athlete 44 68.090909090909091
motorsport racer 1237 58.117219078415521
martial artist 197 67.157360406091371
jockey (horse racer) 139 65.992805755395683
horse rider 181 74.651933701657459
gymnast 175 65.805714285714286
gridiron football player 4247 67.713680244878738
golf player 400 71.13
Gaelic games player 95 70.589473684210526
cyclist 1370 67.469343065693431
cricketer 4998 68.420368147258904
chess player 45 70.244444444444444
boxer 869 60.352128883774453
bodybuilder 27 52
basketball player 822 66.165450121654501
baseball player 9207 68.611382643640708
Australian rules football player 2790 69.52831541218638

Doing the counting and calculations directly in the triple store is especially relevant when the data is large and one would otherwise have to extract it from the database and import it into another tool.

Simple statistical measures over life expectancy

Another standard statistical measure is the standard deviation. A good description of how to calculate it can be found, for example, here. We start again with the class dbp:Athlete and calculate the ages the athletes reached (this time for the entire class dbp:Athlete, not its sub classes). Another thing we need is the square of each age, which we calculate with “(?age * ?age as ?ageSquare)”. At the next stage we count the number of athletes in the result and calculate the average age, the square of the sum of the ages, and the sum of the squared ages. With those values we can, in the next step, calculate the standard deviation of the ages in our data set. Note that SPARQL does not specify a function for calculating square roots, but RDF stores like Virtuoso (which hosts the DBpedia data) provide additional functions like bif:sqrt for calculating the square root of a value.
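
In symbols, the query collects exactly the pieces needed for the usual one-pass formula for the sample standard deviation, with n = ?count, \sum_i x_i^2 = ?ageSquareSum and (\sum_i x_i)^2 = ?ageSumSquare:

s = \sqrt{\dfrac{\sum_i x_i^2 - \left(\sum_i x_i\right)^2 / n}{n - 1}}

These are the terms the bif:sqrt expression in the outer select combines.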

prefix dbp:<http://dbpedia.org/ontology/>
select ?count ?ageAvg (bif:sqrt((?ageSquareSum - (strdt(?ageSumSquare,xsd:double) / ?count)) / (?count - 1)) as ?standDev)
where {
  {
    select (count(?athlete) as ?count) (avg(?age) as ?ageAvg) (sum(?age) * sum(?age) as ?ageSumSquare) (sum(?ageSquare) as ?ageSquareSum)
    where {
      {
        select ?subOfAthlete ?athlete ?age (?age * ?age as ?ageSquare)
        where {
          filter(?age >= 20 && ?age <= 100) .
          {
            select distinct ?subOfAthlete ?athlete (?deathYear - ?birthYear as ?age)
            where {
              ?subOfAthlete rdfs:subClassOf dbp:Athlete .
              ?athlete a ?subOfAthlete .
              ?athlete dbp:birthDate ?birth filter(datatype(?birth) = xsd:date) .
              ?athlete dbp:deathDate ?death filter(datatype(?death) = xsd:date) .
              bind (strdt(replace(?birth,"^(\\d+)-.*","$1"),xsd:integer) as ?birthYear) .
              bind (strdt(replace(?death,"^(\\d+)-.*","$1"),xsd:integer) as ?deathYear) .
            }
          }
        }
      }
    }
  }
}


?count ?ageAvg ?standDev
38542 66.876290799647138 17.6479

These examples show that SPARQL is quite powerful and a lot more than “just” a query language for RDF data: basic statistical methods can be implemented directly at the level of the triple store, without the need to extract the data and import it into another tool.

Posted at 14:30

Ebiquity research group UMBC: Beyond NER: Towards Semantics in Clinical Text

Clare Grasso, Anupam Joshi and Eliot Siegel, Beyond NER: Towards Semantics in Clinical Text, Biomedical Data Mining, Modeling, and Semantic Integration (BDM2I); co-located with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA.

While clinical text NLP systems have become very effective in recognizing named entities in clinical text and mapping them to standardized terminologies in the normalization process, there remains a gap in the ability of extractors to combine entities together into a complete semantic representation of medical concepts that contain multiple attributes each of which has its own set of allowed named entities or values. Furthermore, additional domain knowledge may be required to determine the semantics of particular tokens in the text that take on special meanings in relation to this concept. This research proposes an approach that provides ontological mappings of the surface forms of medical concepts that are of the UMLS semantic class signs/symptoms. The mappings are used to extract and encode the constituent set of named entities into interoperable semantic structures that can be linked to other structured and unstructured data for reuse in research and analysis.

Posted at 14:14

September 28

AKSW Group - University of Leipzig: AKSW Colloquium, 28 September, 3pm, Overcoming Challenges of Question Answering in the Semantic Web

Konrad Höffner will present the current state of the survey “Overcoming Challenges of Question Answering in the Semantic Web”, which will be submitted to the Semantic Web Journal. It is joint work with the QA experts in the group and a researcher from Bielefeld University. The survey will form the related work chapter of his thesis.


Semantic Question Answering (SQA) removes two major access requirements to the Semantic Web: the mastery of a formal query language like SPARQL and knowledge of a specific vocabulary. Because of the complexity of natural language, SQA presents difficult challenges and many research opportunities. Instead of a shared effort, however, many essential components are redeveloped, which is an inefficient use of researchers’ time. This survey analyzes 62 different SQA systems, which are systematically and manually selected using predefined inclusion and exclusion criteria, leading to 72 selected publications out of 1960 candidates. We identify common challenges, structure solutions, and provide recommendations for future systems. This work, based on publications from the end of 2010 to July 2015, is also compared to older, similar surveys.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 10:56

September 27

Egon Willighagen: Coding an OWL ontology in HTML5 and RDFa

There are many fancy tools to edit ontologies. I like simple editors, like nano. And like any hacker, I can hack OWL ontologies in nano. Calling it hacking implies OWL was never meant to be edited in a simple text editor; I am not sure that is really true. Anyway, HTML5 and RDFa will do fine, and here is a brief write-up. This post will not cover the basics of RDFa and assumes you already know how triples work. If not, read this RDFa primer first.

The BridgeDb DataSource Ontology
This example uses the BridgeDb DataSource Ontology, created by BridgeDb developers from Manchester University (Christian, Stian, and Alasdair). The ontology covers the description of data sources of identifiers, a technology outlined in the BridgeDb paper by Martijn (see below), as well as terms from the Open PHACTS Dataset Descriptions for the Open Pharmacological Space by Alasdair et al.

I did this because I needed to put the ontology online for Open PHACTS (BTW, the project won a big award!) and our previous solution did not work well enough anymore. You may want to see the HTML of the result first. You may also want to verify it really is HTML: here is the HTML5 validation report. And you may be interested in what the ontology looks like in RDF: here is the extracted RDF for the ontology. Now follow the HTML+RDFa snippets. First, the ontology details (actually, I have it split up):

<div about=""
     typeof="owl:Ontology">
<h1>The <span property="rdfs:label">BridgeDb DataSource Ontology</span>
(version <span property="owl:versionInfo">2.1.0</span>)</h1>
This page describes the BridgeDb ontology. Make sure to visit our
<a property="rdfs:seeAlso" href="">homepage</a> too!
<p about="">
The OWL ontology can be extracted
<a property="owl:versionIRI" href="">here</a>.
The Open PHACTS specification on
<a property="rdfs:seeAlso" href="">Dataset Descriptions</a> is also useful.
</p>

This is the last time I show the color coding, but for a first time it is useful. In red are basically the predicates, where @about indicates a new resource is started, @typeof defines the rdf:type, and @property indicates all other predicates. The blue and green blobs are literals and object resources, respectively. If you work this out, you get this OWL code (more or less):

bridgedb: a owl:Ontology;
rdfs:label "BridgeDb DataSource Ontology"@en;
rdfs:seeAlso <>;
owl:versionInfo "2.1.0"@en .

An OWL class
Defining OWL classes uses the same approach: define the resource it is @about, define the @typeof, and give it properties. BTW, note that I added an @id so that ontology terms can be looked up using the HTML # functionality. For example:

<div id="DataSource"
<h3 property="rdfs:label">Data Source</h3>
<p property="dc:description">A resource that defines
identifiers for some biological entity, like a gene,
protein, or metabolite.</p>

An OWL object property
Defining an OWL object property is pretty much the same, but note that we can arbitrarily add additional things, making use of <span>, <div>, and <p> elements. The following example also defines the rdfs:domain and rdfs:range:

<div id="aboutOrganism"
<h3 property="rdfs:label">About Organism</h3>
<p><span property="dc:description">Organism for all entities
with identifiers from this datasource.</span>
This property has
<a property="rdfs:domain"
as domain and
<a property="rdfs:range"
as range.</p>

So, now anyone can host an OWL ontology with dereferenceable terms. To remove confusion, I have used the full URLs of the terms in @about attributes.

 Van Iersel, M. P., Pico, A. R., Kelder, T., Gao, J., Ho, I., Hanspers, K., Conklin, B. R., Evelo, C. T., Jan. 2010. The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services. BMC Bioinformatics 11 (1), 5+.

Posted at 18:04

September 20

AKSW Group - University of Leipzig: AKSW Colloquium, 21 September, 3pm, An Open Question Answering Framework

This Colloquium will be presented by Edgard Marx and Diego Moussallem.

Edgard Marx will present the progress of his PhD titled “An Open Question Answering Framework”. Billions of facts pertaining to a multitude of domains are now available on the Web as RDF data. However, accessing this data is still a difficult endeavour for non-expert users. The goal of this PhD is the development of a framework for Question Answering over Linked Data.


Diego Moussallem will present a survey about machine translation using Semantic Web technologies, as well as his future work. Although Machine Translation systems have achieved significant results in the last decade, there is still room for improvement. One of the main challenges in Machine Translation is ambiguity. However, the use of Semantic Web technologies can help overcome this obstacle.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 20:19

September 19

Bob DuCharme: My data science glossary

Complete with a dot org domain name.

Posted at 15:23

September 16

Dydra: Looking for an RDF Patch Format

As we move towards “Linked Data Platform”[0] support, we need to provide a mechanism to selectively modify repository content which represents containers and collections. Yes, in principle, the SPARQL and Graph Store protocols provide such facilities, but there is a widespread reluctance to admit their suitability due to one common issue: the restrictions placed on blank node designators. Any sort of model which involves blank nodes requires special processing in order to specify the statements to modify. We note that there is no effective standard method for this, and we have considered whether the “Linked Data Patch Format” could serve this purpose.
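
As a small, self-contained illustration of the issue (my own sketch using rdflib, not code from the Dydra post): the label a parser assigns to a blank node is local to each parse, so a client that re-reads a document has no stable designator it could put into an update or patch request to address the statement it wants to change.

from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")

DOC = """
@prefix ex: <http://example.org/> .
ex:doc ex:member [ ex:value 42 ] .
"""

# Parse the very same document twice.
g1 = Graph().parse(data=DOC, format="turtle")
g2 = Graph().parse(data=DOC, format="turtle")

# Each parse assigns the blank node a fresh, parser-local label, so the
# two graphs refer to "the same" node by different identifiers.
b1 = next(g1.objects(EX.doc, EX.member))
b2 = next(g2.objects(EX.doc, EX.member))
print(b1, b2, b1 == b2)   # two different labels; the comparison is False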

Based on the information at hand, despite both the extensive analysis which has been devoted to the abstract issues and the purposeful deliberation leading up to the LD Patch proposal itself, we conclude that the suggested mechanism is inappropriate for inclusion in an RDF data management service. First, it does not appear possible for it to fulfill its relative performance guarantees; second, it requires additional state and process control management from the client; and finally, it encumbers the server implementation and access protocol with elements which, given the other factors, serve no useful purpose.

In order to decide how to proceed, we consider the deliberations which led to the proposal and whether alternatives exist. To begin, there are several perspectives:

  • the original essay from Tim Berners-Lee and Dan Connoly [1],
  • the notes from Jeremy Carroll [2], concerning graph matching and isomorphism,
  • a much longer exploration of the complexities introduced by blank nodes from Axel Polleres [3],
  • a talk by Patrick Hayes which includes alternative notions of blank node semantics and, in particular, handles the salient issue neglected by Polleres: scope.
  • a note about a “Linked Data Patch Format” from the Linked Data Platform Working Group to cover a proposal which failed to achieve recommendation status [4], and
  • a shorter note from one of that note’s authors, Alexandre Bertails, which seeks to justify the “Patch Format” approach [5].

Despite the repeated analyses, none yields a standard approach to the problem. All rely on a misapprehension of the nature of “blank nodes” in a “physical symbol system” and fabricate a problem for which they then fail to find a solution, when neither need exist.

Posted at 02:06

September 14

Libby Miller: Biscuit projects

This is a quote by David Mitchell

Posted at 08:52

September 13

AKSW Group - University of Leipzig: AKSW Colloquium, 14 September, 3pm, Learning Metrics for Link Discovery

In this Colloquium, Tommaso Soru will present the progress of his PhD titled “Learning Metrics for Link Discovery”. The discovery of new links is essential for the construction of the Linked Data cloud. The use of links to other URIs was initially suggested by Tim Berners-Lee and is known as the 4th Linked Data Principle. The goal of this PhD is to research new approaches based on machine- and statistical-learning techniques for Link Discovery.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 21:23

September 12

Ebiquity research group UMBC: talk: Attribute-based Fine Grained Access Control for Triple Stores


In the 14-09-2015 ebiquity meeting, Ankur Padia will talk about his recent work aimed at providing access control for an RDF triple store.

Attribute-based Fine Grained Access Control for Triple Stores

Ankur Padia, UMBC

The maturation of Semantic Web standards and associated web-based data representations have made RDF a popular model for representing graph data and semi-structured knowledge. However, most existing SPARQL endpoints support only simple access control mechanisms, preventing their use for many applications. To protect the data stored in RDF stores, we describe a framework to support attribute-based fine-grained access control and explore its feasibility. We implemented a prototype of the system and used it to carry out an initial analysis of the relation between access control policies, query execution time, and the size of the RDF dataset.

For more information, see: Ankur Padia, Tim Finin and Anupam Joshi, Attribute-based Fine Grained Access Control for Triple Stores, 3rd Society, Privacy and the Semantic Web – Policy and Technology workshop (PrivOn 2015), 14th Int. Semantic Web Conf., Oct. 2015.

Posted at 13:01

September 11

Leigh Dodds: Basic questions about data

Over the past couple of years I’ve written several posts that each focus on trying to answer a simple question relating to data and/or open data.

I’ve collected them together into a list here for easier reference. I’ll update the list as I write more related posts:

Posted at 17:28

September 09

Dublin Core Metadata Initiative: DC-2016 to take place in Copenhagen, 13-16 October 2016

2015-09-09, DC-2016 will be collocated with the ASIS&T 2016 Annual Meeting in Copenhagen. The four days of DC-2016 will comprise: (1) pre- and post-conference full- and half-day Workshops and Tutorials; (2) a peer-reviewed Technical Program of Papers, Project Reports and Posters; (3) a Professional Program of Special Sessions and Panels and Best Practice Posters and Demonstrations addressing innovation in metadata design, implementation, management, and use; and (4) the DCMI Annual Meeting. The DC-2016 conference will take place 13-16 October 2016 and will overlap with the ASIS&T 2016 Annual Meeting running 14-18 October 2016. Both conferences will be held at the Crowne Plaza, Copenhagen Towers. Mark your calendars!

Posted at 23:59

Copyright of the postings is owned by the original blog authors. Contact us.