Planet RDF

It's triples all the way down

June 27

AKSW Group - University of Leipzig: Accepted Papers of AKSW Members @ Semantics 2016

This year's SEMANTiCS conference, taking place September 12–15, 2016 in Leipzig, recently invited submissions of research papers on semantic technologies. Several AKSW members seized the opportunity and got their submitted papers accepted for presentation at the conference.

These are listed below:

  • Executing SPARQL queries over Mapped Document Stores with SparqlMap-M (Jörg Unbehauen, Michael Martin)
  • Distributed Collaboration on RDF Datasets Using Git: Towards the Quit Store (Natanael Arndt, Norman Radtke and Michael Martin)
  • Towards Versioning of Arbitrary RDF Data (Marvin Frommhold, Ruben Navarro Piris, Natanael Arndt, Sebastian Tramp, Niklas Petersen and Michael Martin)
  • DBtrends: Exploring query logs for ranking RDF data (Edgard Marx, Amrapali Zaveri, Diego Moussallem and Sandro Rautenberg)
  • MEX Framework: Automating Machine Learning Metadata Generation (Diego Esteves, Pablo N. Mendes, Diego Moussallem, Julio Cesar Duarte, Maria Claudia Cavalcanti, Jens Lehmann, Ciro Baron Neto and Igor Costa)

Another AKSW-driven event at SEMANTiCS 2016 will be the Linked Enterprise Data Services (LEDS) track, taking place September 13–14, 2016. This track is organized by the BMBF-funded LEDS project, which is part of the Entrepreneurial Regions program, a BMBF innovation initiative for the new German Länder. Its focus is on discussing new approaches for discovering and integrating background knowledge into business and governmental environments with academic and industrial partners.

SEMANTiCS 2016 will also host the 7th edition of the DBpedia Community Meeting on the last day of the conference (September 15, 'DBpedia Day'). DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.

So come and join SEMANTiCS 2016, talk and discuss with us!

More information on the program can be found here.

LEDS is funded by the BMBF as part of the Wachstumskern Region programme.

Posted at 10:50

June 26

AKSW Group - University of Leipzig: AKSW Colloquium, 27.06.2016, When owl:sameAs isn’t the Same + Towards Versioning for Arbitrary RDF Data

In the next Colloquium, June the 27th at 3 PM, two papers will be presented:

When owl:sameAs isn’t the Same: An Analysis of Identity in Linked Data

André Valdestilhas will present the paper “When owl:sameAs isn’t the Same: An Analysis of Identity in Linked Data” by Halpin et al. [PDF]:

Abstract: In Linked Data, the use of owl:sameAs is ubiquitous in interlinking data-sets. There is, however, ongoing discussion about its use, and potential misuse, particularly with regards to interactions with inference. In fact, owl:sameAs can be viewed as encoding only one point on a scale of similarity, one that is often too strong for many of its current uses. We describe how referentially opaque contexts that do not allow inference exist, and then outline some varieties of referentially-opaque alternatives to owl:sameAs. Finally, we report on an empirical experiment over randomly selected owl:sameAs statements from the Web of data. This theoretical apparatus and experiment shed light upon how owl:sameAs is being used (and misused) on the Web of data.

Towards Versioning for Arbitrary RDF Data

Afterwards, Marvin Frommhold will practice the presentation of his paper “Towards Versioning for Arbitrary RDF Data” (Marvin Frommhold, Rubén Navarro Piris, Natanael Arndt, Sebastian Tramp, Niklas Petersen, and Michael Martin) [PDF], which has been accepted at the main conference of SEMANTiCS 2016 in Leipzig.

Abstract: Coherent and consistent tracking of provenance data and in particular update history information is a crucial building block for any serious information system architecture. Version Control Systems can be a part of such an architecture enabling users to query and manipulate versioning information as well as content revisions. In this paper, we introduce an RDF versioning approach as a foundation for a full featured RDF Version Control System. We argue that such a system needs support for all concepts of the RDF specification including support for RDF datasets and blank nodes. Furthermore, we placed special emphasis on the protection against unperceived history manipulation by hashing the resulting patches. In addition to the conceptual analysis and an RDF vocabulary for representing versioning information, we present a mature implementation which captures versioning information for changes to arbitrary RDF datasets.
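As a rough illustration of the hash-chaining idea, here is a minimal sketch with hypothetical names (this is not the vocabulary or scheme from the paper): each patch is serialized canonically and hashed together with the hash of its predecessor, so any unperceived manipulation of an earlier patch breaks verification of every later one.

```python
import hashlib

def patch_hash(added, removed, prev_hash):
    """Hash a patch (sets of added/removed triples) chained to its predecessor.

    Triples are sorted to obtain a canonical serialization, so the hash
    does not depend on insertion order.
    """
    canon = "\n".join(sorted("+ " + t for t in added) +
                      sorted("- " + t for t in removed))
    return hashlib.sha256((prev_hash + "\n" + canon).encode("utf-8")).hexdigest()

# Two consecutive patches form a chain: tampering with the first patch
# changes h1, which in turn invalidates h2 and everything after it.
h1 = patch_hash({"<s> <p> <o> ."}, set(), prev_hash="")
h2 = patch_hash({"<s> <p> <o2> ."}, {"<s> <p> <o> ."}, prev_hash=h1)
```

The chain gives the history-manipulation protection described in the abstract: a verifier can recompute the hashes from the patches and detect any rewrite of the past.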

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Further information about previous and future events can be found on the AKSW website. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 13:46

June 25

Egon Willighagen: New Paper: "Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources"

Andra Waagmeester published a paper on his work on a semantic web version of WikiPathways (doi:10.1371/journal.pcbi.1004989). The paper outlines the design decisions, shows the SPARQL endpoint, and gives several example SPARQL queries. These include federated queries, like a mashup with DisGeNET (doi:10.1093/database/bav028) and EMBL-EBI's Expression Atlas. That results in nice visualisations like this:

If the relevant information is in the pathway, these pathways can help a lot in understanding what is going on biologically. And, of course, they are used for exactly that a lot.

Press release
Because press releases have become an interesting tool in knowledge dissemination, I wanted to learn what it involves to get one out. This involved the people at PLOS Computational Biology and the press offices of the Gladstone Institutes and our Maastricht University (press release 1, press release 2 EN/NL). There is one thing I learned in retrospect, and I am pissed with myself that I did not think of this: you should always have a graphic supporting your story. I have been doing this for a long time in my blog now (sometimes I still forget), but did not think of it for the press release. The press release was picked up by three outlets, though all basically ran it as we presented it to them (thanks to

But what makes me appreciate this piece of work, and WikiPathways itself, is how it creates a central hub of biological knowledge. Pathway databases capture knowledge that is not easily embedded in generally structured (relational) databases. As such, expressing this in the RDF format seems simple enough. The thing I really love about this approach is that your queries become machine-readable stories, particularly when you start using human-readable variants of SPARQL for this. And you can share these queries with the online scientific community with, for example, myExperiment.

There are two ways in which I have used SPARQL on WikiPathways data for metabolomics: 1. curation; 2. statistics. Data analysis is harder, because in the RDF world scientific lenses are needed to accommodate the chemical structural-temporal complexity of metabolites. For curation, we have long used SPARQL for unit tests to support the curation of WikiPathways. Moreover, I have manually used the SPARQL endpoint to find curation tasks. But now that the paper is out, I can blog about this more. For now, many example SPARQL queries can be found in the WikiPathways wiki. It features several queries showing statistics, but also some for curation. This is an example query I use to improve the interoperability of WikiPathways with Wikidata (also for BridgeDb):

  PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
  SELECT ?metabolite WHERE {
    ?metabolite a wp:Metabolite .
    OPTIONAL { ?metabolite wp:bdbWikidata ?wikidata . }
    FILTER (!BOUND(?wikidata))
  }

Feel free to give this query a go!

This paper completes a nice triptych of WikiPathways papers in the past six months. Thanks to the whole community and the very many contributors! All three papers are linked below.

Waagmeester, A., Kutmon, M., Riutta, A., Miller, R., Willighagen, E. L., Evelo, C. T., Pico, A. R., Jun. 2016. Using the semantic web for rapid integration of WikiPathways with other biological online data resources. PLoS Comput Biol 12 (6), e1004989+.
Bohler, A., Wu, G., Kutmon, M., Pradhana, L. A., Coort, S. L., Hanspers, K., Haw, R., Pico, A. R., Evelo, C. T., May 2016. Reactome from a WikiPathways perspective. PLoS Comput Biol 12 (5), e1004941+.
Kutmon, M., Riutta, A., Nunes, N., Hanspers, K., Willighagen, E. L., Bohler, A., Mélius, J., Waagmeester, A., Sinha, S. R., Miller, R., Coort, S. L., Cirillo, E., Smeets, B., Evelo, C. T., Pico, A. R., Jan. 2016. WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Research 44 (D1), D488-D494.

Posted at 09:44

June 23

Leigh Dodds: The state of open licensing

I spend a lot of time reading through licences and terms & conditions. Much more so than I thought I would when I first started getting involved with open data. After all, I largely just like making things with data.

But there’s still so much data that is

Posted at 19:03

June 22

AKSW Group - University of Leipzig: Should I publish my dataset under an open license?

Undecided? Stand back, we know flowcharts:

Did you ever try to apply the halting problem to a malformed flowchart?


Taken from my slides for my keynote at TKE:

Linguistic Linked Open Data, Challenges, Approaches, Future Work from Sebastian Hellmann

Posted at 09:41

June 21

Leigh Dodds: From services to products

Over the course of my career I’ve done a variety of consulting projects as both an employee and freelancer. I’ve helped found and run a small consulting team. And, through my experience leading engineering teams, I’ve gained some experience of designing products and platforms. I’ve been involved in a few discussions, particularly over the last 12 months or so, around how to generate repeatable products off the back of consulting engagements.

I wanted to jot down a few thoughts here based on my own experience and a bit of background reading. I don’t claim to have any special insight or expertise, but the topic is one that I’ve encountered time and again. And as I’m trying to write things down more frequently, I thought I’d share my perspective in the hope that it may be useful to someone wrestling with the same issues.

Please comment if you disagree with anything. I’m learning too.

What are Products and Services?

Let’s start with some definitions.

A service is a bespoke offering that typically involves a high level of expertise. In a consulting business you’re usually selling people or a team who have a particular set of skills that are useful to another organisation. While the expertise and skills being offered are common across projects, the delivery is usually highly bespoke and tailored for the needs of the specific client.

The outcomes of an engagement are also likely to be highly bespoke as you’re delivering to a custom specification. Custom software development, specially designed training packages, and research projects are all examples of services.

A product is a packaged solution to a known problem. A product will be designed to meet a particular need and will usually be designed for a specific audience. Products are often, but not always, software. I’m ignoring manufacturing here.

Products can typically be rapidly delivered as they can be installed or delivered via a well-defined process. While a product may be tailored for a specific client, they’re usually very well-defined. Product customisation is usually a service in its own right. As is product support.

The Service-Product Spectrum

I think it’s useful to think of services and products as being at opposite ends of a spectrum.

At the service end of the spectrum your offerings are:

  • highly manual, because you’re reliant on expert delivery
  • difficult to scale, because you need to find people with skills and expertise that are otherwise in short supply
  • low in repeatability, because you’re inevitably dealing with bespoke engagements

At the product end of the spectrum your offerings are:

  • highly automated, because you’re delivering a software product or following a well defined delivery process
  • scalable, because you need fewer (or at least different) skills to deliver the product
  • highly repeatable, because each engagement is well defined, has a clear life-cycle, etc.

Products are a distillation of expertise and skills.

Actually, there’s arguably a stage before service. Let’s call those “capabilities” to

Posted at 17:15

June 18

Leigh Dodds: “The Wizard of the Wash”, an open data parable

The fourth

Posted at 15:31

June 15

Dublin Core Metadata Initiative: Deadline of 15 July for DC-2016 Presentations and Best Practice Poster tracks

2016-06-15, The deadline of 15 July is approaching for abstract submissions for the Presentations on Metadata track and the Best Practice Poster track for DC-2016 in Copenhagen. Both tracks provide metadata practitioners and researchers the opportunity to present their work in Copenhagen. Neither track requires a paper submission. Submit your proposal abstract for either track; selections for presentation in Copenhagen will be made by the DC-2016 Organizing Team.

Posted at 23:59

Dublin Core Metadata Initiative: DCMI announces Workshop series for DC-2016

2016-06-15, DCMI is proud to announce a series of four workshops as part of the Professional program at DC-2016. Both half-day and full-day workshops are available, and abstracts of the workshops are available online. Delegates to DC-2016 may register for both the International Conference and the Workshops individually; day and half-day rates for individual workshops are also available.

Posted at 23:59

Dublin Core Metadata Initiative: DCMI opens registration for DC-2016

2016-06-15, Registration for DC-2016 is now open. The International Conference takes place on 13-14 October and the Workshop Series on 15-16 October. Separate registrations are available for the Conference and Workshops. DC-2016 in Copenhagen is collocated in the same venue with the ASIST Annual Meeting, which takes place from 14-18 October. Special rates for the ASIST meeting are available to DCMI members. The program for DC-2016 will include papers, project reports, posters (research and best practice) and presentations on metadata. In addition, there will be a series of topical special sessions and two days of workshops. For more information and to register, visit the DC-2016 conference website.

Posted at 23:59

June 14

Leigh Dodds: Discussion document: archiving open data

This is a brief post to highlight a short discussion document that I recently published about

Posted at 16:50

AKSW Group - University of Leipzig: TKE 2016 has announced their invited speakers

The 12th International Conference on Terminology and Knowledge Engineering (TKE 2016) has announced its invited speakers, including Dr. Sebastian Hellmann, head of the AKSW/KILT research group at Leipzig University and executive director of the DBpedia Association at the Institute for Applied Informatics (InfAI) e.V. Sebastian Hellmann will give a talk on Challenges, Approaches and Future Work for Linguistic Linked Open Data (LLOD).

The theme of the 12th International Conference on Terminology and Knowledge Engineering will be ‘Term Bases and Linguistic Linked Open Data’. So the main aims of TKE 2016 will be to bring together researchers from these related fields, provide an overview of the state-of-the-art, discuss problems and opportunities, and exchange information. TKE 2016 will also cover applications, ongoing and planned activities, industrial uses and needs, as well as requirements coming from the new e-society.

The TKE 2016 conference will take place in Copenhagen, Denmark, from 22-24 June 2016. Further information about the program and the speakers confirmed so far can be found on the conference website.


Posted at 10:58

Leigh Dodds: What 3 Words? Jog on mate!


Posted at 07:11

June 12

AKSW Group - University of Leipzig: Two Papers accepted at ECAI 2016

Hello Community! We are very pleased to announce that two of our papers were accepted for presentation at the biennial European Conference on Artificial Intelligence (ECAI). ECAI is Europe’s premier venue for presenting scientific results in AI and will be held from August 29th to September 2nd in The Hague, Netherlands.


In more detail, we will present the following papers:

An Efficient Approach for the Generation of Allen Relations (Kleanthi Georgala, Mohamed Sherif, Axel-Cyrille Ngonga Ngomo)

Abstract: Event data is increasingly being represented according to the Linked Data principles. The need for large-scale machine learning on data represented in this format has thus led to the need for efficient approaches to compute RDF links between resources based on their temporal properties. Time-efficient approaches for computing links between RDF resources have been developed over the last years. However, dedicated approaches for linking resources based on temporal relations have received little attention so far. In this paper, we address this research gap by presenting AEGLE, a novel approach for the efficient computation of links between events according to Allen’s interval algebra. We study Allen’s relations and show that we can reduce all thirteen relations to eight simpler relations. We then present an efficient algorithm with a complexity of O(n log n) for computing these eight relations. Our evaluation of the runtime of our algorithms shows that we outperform the state of the art by up to 4 orders of magnitude while maintaining a precision and a recall of 100%.
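To make the thirteen relations concrete, here is a naive pairwise classifier for interval pairs (an illustrative sketch only; AEGLE's contribution is precisely to avoid this kind of quadratic pairwise checking):

```python
def allen_relation(a, b):
    """Return the Allen interval relation between intervals a and b,
    each a (start, end) pair with start < end.

    Six base relations plus 'equal' are named directly; the remaining
    cases (after, met-by, overlapped-by) fall through to the inverse
    of the relation between b and a.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if (s1, e1) == (s2, e2):
        return "equal"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    if s1 < s2 < e1 < e2:
        return "overlaps"
    return "inverse-of-" + allen_relation(b, a)

# e.g. allen_relation((1, 3), (3, 5)) is "meets"
```

Linking two event sets this way costs O(n²) comparisons; the paper's reduction to eight relations and its O(n log n) algorithm are what make the computation scale.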

Towards SPARQL-Based Induction for Large-Scale RDF Data Sets (Simon Bin, Lorenz Bühmann, Jens Lehmann, Axel-Cyrille Ngonga Ngomo)

Abstract: We show how to convert OWL Class Expressions to SPARQL queries where the instances of that concept are — with restrictions sensible in the considered concept induction scenario — equal to the SPARQL query result. Furthermore, we implement and integrate our converter into the CELOE algorithm (Class Expression Learning for Ontology Engineering). Therein, it replaces the traditional OWL reasoner into which most structured machine learning approaches assume knowledge to be loaded. This will foster the application of structured machine learning to the Semantic Web, since most data is readily available in triple stores. We provide experimental evidence for the usefulness of the bridge. In particular, we show that we can improve the runtime of machine learning approaches by several orders of magnitude. With these results, we show that machine learning algorithms can now be executed on data on which the use of in-memory reasoners was previously not possible.
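The conversion idea can be sketched for a toy fragment of class expressions (the tuple syntax and function name here are hypothetical; CELOE's real converter covers full OWL):

```python
def class_expression_to_sparql(ce):
    """Convert a tiny subset of class expressions to a SPARQL query.

    Supported forms (illustrative syntax, not OWL/Manchester):
      ("Class", iri)          -> ?x a <iri>
      ("Some", prop, inner)   -> ?x <prop> ?y, plus inner applied to ?y
      ("And", ce1, ce2)       -> conjunction of both patterns on ?x
    """
    counter = [0]  # fresh-variable counter for existential restrictions

    def pattern(ce, var):
        kind = ce[0]
        if kind == "Class":
            return f"{var} a <{ce[1]}> ."
        if kind == "And":
            return pattern(ce[1], var) + "\n" + pattern(ce[2], var)
        if kind == "Some":
            counter[0] += 1
            fresh = f"?v{counter[0]}"
            return f"{var} <{ce[1]}> {fresh} .\n" + pattern(ce[2], fresh)
        raise ValueError(f"unsupported expression: {kind}")

    return "SELECT DISTINCT ?x WHERE {\n" + pattern(ce, "?x") + "\n}"

# "Person and (knows some Person)" becomes a query whose answers
# are exactly the people who know a person.
query = class_expression_to_sparql(
    ("And", ("Class", "http://ex.org/Person"),
            ("Some", "http://ex.org/knows", ("Class", "http://ex.org/Person"))))
```

Running the generated query against a triple store then plays the role the in-memory reasoner played before: it retrieves the instances of the candidate class expression.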

Come over to ECAI and enjoy the talks. For more information on the conference program and other papers please see here.

Sandra on behalf of AKSW

Posted at 20:22

Bob DuCharme: Emoji SPARQL😝!

If emojis have Unicode code points, then we can...

Posted at 16:46

AKSW Group - University of Leipzig: AKSW Colloquium, 13.06.2016, SPARQL query processing with Apache Spark

In the upcoming Colloquium, Simon Bin will discuss the paper “SPARQL query processing with Apache Spark” by H. Naacke, which has been submitted to ISWC 2016.

Abstract:

The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are more and more confronted with various big data problems. Query processing is one of them and needs to be efficiently addressed with executions over scalable, highly available and fault-tolerant frameworks. Data management systems requiring these properties are rarely built from scratch but are rather designed on top of an existing cluster computing engine. In this work, we consider the processing of SPARQL queries with Apache Spark.
We propose and compare five different query processing approaches based on different join execution models and Spark components. A detailed experimentation, on real-world and synthetic data sets, emphasizes that two approaches tailored for the RDF data model outperform the other ones on all major query shapes, i.e. star, snowflake, chain and hybrid.
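In miniature, a star-shaped SPARQL query is a sequence of joins on the same subject variable, which is why the join execution model matters so much. A plain-Python sketch of such a star join (hypothetical data, no Spark involved):

```python
def match(triples, pattern):
    """Return the subjects matching one (?s, p, o) pattern.

    o may be None to leave the object unconstrained.
    """
    _, p, o = pattern
    return {s for (s, pp, oo) in triples if pp == p and (o is None or oo == o)}

def star_query(triples, patterns):
    """Intersect subject bindings across all patterns (a star-shaped BGP).

    Each pattern shares the subject variable, so the query is a chain of
    joins on that variable — exactly the shape the paper benchmarks.
    """
    results = None
    for pat in patterns:
        bound = match(triples, pat)
        results = bound if results is None else results & bound
    return results

triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:worksAt", "ex:AKSW"),
    ("ex:bob", "rdf:type", "ex:Person"),
]
# people who work at AKSW:
aksw_people = star_query(triples, [("?s", "rdf:type", "ex:Person"),
                                   ("?s", "ex:worksAt", "ex:AKSW")])
```

On a cluster, each `match` becomes a distributed filter and each intersection a join; choosing how Spark executes those joins is the design space the paper explores.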

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Further information about previous and future events can be found on the AKSW website. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 06:53

June 08

AKSW Group - University of Leipzig: AKSW at ESWC 2016


We are very pleased to report that four of our papers were accepted for presentation as full papers at ESWC 2016.

In addition, we organised the first HOBBIT community meeting. Many thanks to all who participated. Get involved in the project by going here. Our survey pertaining to benchmarking is still open and we’d love to have your feedback on what you would want benchmarking Linked Data to look like.

We also presented three research projects, i.e., HOBBIT, QAMEL and DIESEL, during the EU networking sessions. Many thanks for the fruitful discussions and ideas.

Finally, we thank all the systems which participated in QALD-6 and OKE and made these challenges so interesting. A little perk: we have yet to find a system that beats CETUS at the OKE challenge :)

FYI, a full list of accepted conference papers can be found here.


In addition to the main conference, we were active during the workshops. Axel gave the keynote at the Profiles workshop (many thanks to the organizers for the invite). The following papers were accepted as full papers.

  • DBtrends : Publishing and Benchmarking RDF Ranking Functions by Edgard Marx, Amrapali J. Zaveri, Mofeed Mohammed, Sandro Rautenberg, Jens Lehmann, Axel-Cyrille Ngonga Ngomo and Gong Cheng, SumPre2016 Workshop at ESWC 2016
  • Towards Sustainable view-based Extract-Transform-Load (ETL) Fusion of Open Data by Kay Mueller, Claus Stadler, Ritesh Kumar Singh and Sebastian Hellmann, LDQ2016 [pdf]
  • UPSP: Unique Predicate-based Source Selection for SPARQL Endpoint Federation by Ethem Cem Ozkan, Muhammad Saleem, Erdogan Dogdu and Axel-Cyrille Ngonga Ngomo  PROFILES Workshop at ESWC 2016 [pdf]
  • Federated Query Processing: Challenges and Opportunities by Axel-Cyrille Ngonga Ngomo and Muhammad Saleem Keynote at PROFILES Workshop at ESWC 2016 [pdf]

Quo Vadis?

We are now looking forward to EDF 2016, where we will present HOBBIT as a poster as well as organise a post-conference event. Thereafter, you can meet us at ISWC 2016, where we will present two tutorials (Link Discovery and Federated SPARQL Queries) and organise the BLINK workshop. Your submissions are welcome.


Posted at 14:38

May 31

AKSW Group - University of Leipzig: AKSW@LREC2016

Since the first edition held in Granada in 1998, LREC has become one of the major events on Language Resources (LRs) and Language Technologies (LT). At the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), held from 23-28 May 2016 in Portorož (Slovenia), the AKSW/KILT members Bettina Klimek, Milan Dojchinovski and Sebastian Hellmann took part actively. At the conference they presented their most recent research results and project outcomes in the areas of Linked Data and Language Technologies. With over 1250 paper submissions and 744 accepted papers, we are pleased to have contributed to the research field with the following contributions:

  • DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus, by Brümmer, Martin; Dojchinovski, Milan and Hellmann, Sebastian [PDF]
  • FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies, by Dojchinovski, Milan; Sasaki, Felix; Gornostaja,Tatjana;  Hellmann, Sebastian; Mannens, Erik; Salliau, Frank; Osella, Michele; Ritchie, Phil; Stoitsis, Giannis; Koidl, Kevin; Ackermann, Markus and Chakraborty, Nilesh [PDF]
  • Creating Linked Data Morphological Language Resources with MMoOn – The Hebrew Morpheme Inventory, by Klimek, Bettina and Arndt, Natanael and Krause, Sebastian and Arndt, Timotheus [PDF]
  • The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud, by McCrae, John P.; Chiarcos, Christian; Bond, Francis; Cimiano, Philipp; Declerck, Thierry; de Melo, Gerard; Gracia, Jorge; Hellmann, Sebastian; Klimek, Bettina; Moran, Steven; Osenova, Petya; Pareja-Lora, Antonio and Pool, Jonathan [PDF]

At the main conference, Bettina Klimek gave an oral presentation of the Hebrew Morpheme Inventory, which is based on the MMoOn project. The audience showed high interest in the data and in the underlying MMoOn ontology, including questions about possible applications such as creating MMoOn-based lemmatizers.

Bettina Klimek presenting @LREC2016

Milan Dojchinovski @LREC 2016


Further, Milan Dojchinovski gave two poster presentations summarizing the latest results from the FREME project. He presented “DBpedia Abstracts”, a large-scale, open, multilingual NLP training corpus. The presentation attracted huge interest from the audience, with particular interest in its use. Several requests for the availability of the corpora in other languages (e.g. Welsh) were also received.

Milan also presented the latest developments within the FREME project and the framework itself. The presentation focused primarily on the technical aspects of the framework, its availability, its active use in real-world scenarios, and future plans.

Also, being active members of the Open Knowledge Foundation’s Working Group on Open Data in Linguistics (OWLG), Sebastian Hellmann and Bettina Klimek helped organize the 5th Workshop on Linked Data in Linguistics (LDL-2016), which was one of the LREC conference workshops. Around 50 participants attended the workshop, discussing topics related to managing, building and using linked language resources. In the workshop’s poster session, Bettina Klimek introduced the MMoOn model for representing morphological language data to the various interested workshop attendees. In addition, Milan Dojchinovski presented results from the FREME project which relate to the research presented at the LDL workshop and to the Linked Data and Language Technologies community.

The LDL Workshop participants.

In continuation of OWLG-organized events, the First Workshop on Knowledge Extraction and Knowledge Integration (KEKI 2016) will take place on 17-18 October in conjunction with the 15th International Semantic Web Conference in Kobe (Japan). The topics of linguistic Linked Data creation and integration will be taken up in order to move the LLOD cloud to its next phase, in which innovative applications will be developed that overcome language barriers on the Web. Paper submission is open until 1 July!

During the main conference days, 25-27 May 2016, Milan Dojchinovski and Felix Sasaki (FREME project coordinator) ran a booth in the exhibition area dedicated to the FREME project. The ultimate goal of this participation was to meet people interested in understanding how the open framework deployed within the project may help narrow the gap between actual business needs and language and Linked Data technologies. For more on the FREME presence at LREC 2016, you can read here.

LREC has been a great event to meet the community, make new connections, discuss current research challenges, share ideas, and establish new collaborations. Having said that, we look forward to the next LREC conference, two years from now!

Posted at 17:42

May 30

Dublin Core Metadata Initiative: Call for Participation and Demos: NKOS Dublin Core workshop

2016-05-30, The 16th European Networked Knowledge Organization Systems (NKOS) Workshop will take place at the DC-2016 conference in Copenhagen. Proposals are invited for the following: (a) Presentations (typically 20 minutes plus discussion time, potentially longer if warranted) on work related to the themes of the workshop (see below). An option for a short 5-minute project report presentation is also possible; and (b) Demos on work related to the themes of the workshop. The submission deadline is Friday, 1 July 2016, with notification of acceptance by Tuesday, 16 August 2016. The Call for Participation can be found on the conference website and on the NKOS website.

Posted at 23:59

Dublin Core Metadata Initiative: Dublin Core at 21 (A celebration in Dublin, Ohio)

2016-05-30, IFLA Satellite Event. Dublin Core at 21 celebrates DC's amazing 21-year history and anticipates its future. The Dublin Core originated in 1995 at a meeting at OCLC (in the very room where this IFLA Satellite event will also take place). This special event will bring a historical view from key people who were there when the Web was young and Dublin Core was new and evolving rapidly. But the Web does not stand still. Presentations will also provide information on the latest metadata standards-related work underway by DCMI and OCLC's current work with metadata models, standards, and technologies advancing the state of the art for libraries and archives. Presenters will include metadata experts with long ties to Dublin Core, including several who were at the original invitational meeting in 1995. A panel discussion will permit speakers to reflect on activities and trends past and present. Attendees are invited to attend a complimentary reception and special unveiling following the presentation portion of the day. For more information and to register for this IFLA Satellite event, visit the event website.

Posted at 23:59

Dublin Core Metadata Initiative: DCMI opens Presentation Track for DC-2016

2016-05-30, The Conference Committee for DC-2016 is pleased to announce that it has opened a Presentation Track for DC-2016 in Copenhagen to provide developers, practitioners, and researchers with the opportunity to present on interesting metadata topics in the areas of metadata services, development and deployment projects, metadata explorations underway, and other innovative developments in the metadata ecosystem. Selections will be made by the Professional Program Committee. The presentations will become part of the permanent record of the DC-2016 conference and will be openly available. Proposal deadline: Friday, 8 July 2016. For more information on the Presentation Track and to submit a proposal, see the conference website.

Posted at 23:59

AKSW Group - University of Leipzig: AKSW Publishes Survey on Challenges of Question Answering in the Semantic Web

Semantic Web Journal Logo
We are happy to announce that our Survey on Challenges of Question Answering in the Semantic Web (Konrad Höffner, Sebastian Walter, Edgard Marx, Ricardo Usbeck, Jens Lehmann and Axel Ngonga) has been accepted.


Abstract: Semantic Question Answering (SQA) removes two major access requirements to the Semantic Web: the mastery of a formal query language like SPARQL and knowledge of a specific vocabulary. Because of the complexity of natural language, SQA presents difficult challenges and many research opportunities. Instead of a shared effort, however, many essential components are redeveloped, which is an inefficient use of researchers’ time and resources. This survey analyzes 62 different SQA systems, which are systematically and manually selected using predefined inclusion and exclusion criteria, leading to 72 selected publications out of 1960 candidates. We identify common challenges, structure solutions, and provide recommendations for future systems. This work is based on publications from the end of 2010 to July 2015 and is also compared to older but similar surveys.

Posted at 12:08

May 27

Leigh Dodds: Beyond Publishers and Consumers: Some Example Ecosystems

Yesterday I wrote a post suggesting that we should move

Posted at 14:37

AKSW Group - University of Leipzig: AKSW Colloquium, 30.05.2016, PARIS: Probabilistic Alignment of Relations, Instances, and Schema


In the upcoming colloquium, Mohamed Ahmed Sherif will present the paper “PARIS: Probabilistic Alignment of Relations, Instances, and Schema” from Suchanek et al., published in the proceedings of VLDB 2012 [PDF].


One of the main challenges that the Semantic Web faces is the integration of a growing number of independently designed ontologies. In this work, we present PARIS, an approach for the automatic alignment of ontologies. PARIS aligns not only instances, but also relations and classes. Alignments at the instance level cross-fertilize with alignments at the schema level. Thereby, our system provides a truly holistic solution to the problem of ontology alignment. The heart of the approach is probabilistic, i.e., we measure degrees of matching based on probability estimates. This allows PARIS to run without any parameter tuning. We demonstrate the efficiency of the algorithm and its precision through extensive experiments. In particular, we obtain a precision of around 90% in experiments with some of the world’s largest ontologies.
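The core intuition behind the probability estimates can be sketched in a few lines. This is a simplified toy, not the PARIS implementation: `functionality` approximates how close a relation is to being functional (one object per subject), and `match_probability` combines per-relation evidence with a noisy-OR rule that stands in for the paper's actual formulas.

```python
# Toy sketch of PARIS's core idea: two instances are likely the same entity
# if they share object values for relations that are highly functional.
def functionality(pairs):
    """#subjects / #pairs: 1.0 means each subject has exactly one object."""
    subjects = {}
    for subj, obj in pairs:
        subjects.setdefault(subj, set()).add(obj)
    return len(subjects) / sum(len(objs) for objs in subjects.values())

def match_probability(shared_relation_functionalities):
    """Noisy-OR combination: P(match) = 1 - prod(1 - functionality_r)."""
    p_no_match = 1.0
    for fun in shared_relation_functionalities:
        p_no_match *= (1.0 - fun)
    return 1.0 - p_no_match
```

For example, two instances sharing a value of a near-functional relation such as a birth date (functionality close to 1) yield a match probability close to 1, while sharing only values of a non-functional relation contributes little evidence.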

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 09:33

May 26

Leigh Dodds: Beyond publishers and consumers

In the open data community we tend to focus a lot on Publishers and Consumers.

Publishers have the data we want. We must lobby or convince them that publishing the data would be beneficial. And we need to educate them about licensing and how best to publish data. And we get frustrated when they don’t do those things

Consumers are doing the work to extract value from data. Publishers want to encourage Consumers to do things with their data. But are often

Posted at 16:47

Leigh Dodds: Designing for the open digital commons

I wanted to share some thinking I’ve been doing around how to create products and services that embrace and support the digital commons. The digital commons is

Posted at 12:00

May 24

Ebiquity research group UMBC: Managing Cloud Storage Obliviously

Vaishali Narkhede, Karuna Pande Joshi, Tim Finin, Seung Geol Choi, Adam Aviv and Daniel S. Roche, Managing Cloud Storage Obliviously, International Conference on Cloud Computing, IEEE Computer Society, June 2016.

Consumers want to ensure that their enterprise data is stored securely and obliviously on the cloud, such that neither the data objects nor their access patterns are revealed to anyone, including the cloud provider, in the public cloud environment. We have created a detailed ontology describing the oblivious cloud storage models and role-based access controls that should be in place to manage this risk. We have developed an algorithm to store cloud data using the oblivious data structure defined in this paper. We have also implemented the ObliviCloudManager application, which allows users to manage their cloud data by validating it before storing it in an oblivious data structure. Our application uses a role-based access control model and collection-based document management to store and retrieve data efficiently. Cloud consumers can use our system to define policies for storing data obliviously and to manage storage on untrusted cloud platforms even if they are unfamiliar with the underlying technology and concepts of oblivious data structures.
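As a toy illustration of what hiding access patterns means (far simpler than the oblivious data structures in the paper), a client can fetch every block and keep only the target, so the storage server observes an identical access pattern regardless of which block was wanted. Real schemes such as ORAM achieve the same property with far lower overhead; this linear-scan baseline is only meant to show the goal.

```python
# Toy baseline for access-pattern hiding: the server sees every block read
# on every request, so the read pattern leaks nothing about wanted_index.
# Cost is O(n) per read; practical oblivious schemes do much better.
def oblivious_read(server_blocks, wanted_index):
    result = None
    for i, block in enumerate(server_blocks):
        data = block  # simulated fetch of every block from the server
        if i == wanted_index:
            result = data
    return result

value = oblivious_read(["alpha", "beta", "gamma"], 1)
```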

Posted at 18:29

May 23

AKSW Group - University of Leipzig: AKSW Colloquium, 23.05.2016, Instance Matching and RDF Dataset Similarity

In the upcoming colloquium, Mofeed Hassan will present the paper “Semi-supervised Instance Matching Using Boosted Classifiers” from Kejriwal et al., published in the proceedings of ESWC 2015 [PDF].


Instance matching concerns identifying pairs of instances that refer to the same underlying entity. Current state-of-the-art instance matchers use machine learning methods. Supervised learning systems achieve good performance by training on significant amounts of manually labeled samples. To alleviate the labeling effort, this paper presents a minimally supervised instance matching approach that is able to deliver competitive performance using only 2% training data and little parameter tuning. As a first step, the classifier is trained in an ensemble setting using boosting. Iterative semi-supervised learning is used to improve the performance of the boosted classifier even further, by re-training it on the most confident samples labeled in the current iteration. Empirical evaluations on a suite of six publicly available benchmarks show that the proposed system outcompetes optimization-based minimally supervised approaches in 1-7 iterations. The system’s average F-Measure is shown to be within 2.5% of that of recent supervised systems that require more training samples for effective performance.
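The iterative self-labelling loop described above can be sketched as follows. This is a deliberately simple stand-in, not the paper's boosted ensemble: instance pairs are reduced to toy similarity scores in [0, 1], and "re-training" is just recomputing a threshold, but the shape of the loop — label only the most confident unlabeled examples, then repeat — is the same.

```python
# Minimal self-training sketch: start from a tiny labeled seed, then
# iteratively move the most confident predictions on unlabeled pairs
# into the training set. Scores stand in for match-similarity features.
def self_train(seed, unlabeled, iterations=3, confidence=0.2):
    labeled = dict(seed)  # score -> label (1 = match, 0 = non-match)
    for _ in range(iterations):
        matches = [s for s, lab in labeled.items() if lab == 1]
        nonmatches = [s for s, lab in labeled.items() if lab == 0]
        # "Re-train": recompute the decision boundary from current labels.
        threshold = (min(matches) + max(nonmatches)) / 2
        for score in list(unlabeled):
            # Only label examples far from the decision boundary.
            if abs(score - threshold) >= confidence:
                labeled[score] = 1 if score > threshold else 0
                unlabeled.remove(score)
    return labeled

seed = {0.9: 1, 0.1: 0}
result = self_train(seed, [0.95, 0.05, 0.5])
```

The ambiguous score 0.5 stays unlabeled because it never clears the confidence margin, mirroring how the paper's system only re-trains on its most confident predictions.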

After that, Michael Röder will present his paper “Detecting Similar Linked Datasets Using Topic Modelling” that has been accepted by the upcoming ESWC 2016 [PDF].


The Web of data is growing continuously with respect to both the size and the number of the datasets published. Porting a dataset to five-star Linked Data, however, requires the publisher of this dataset to link it with the already available linked datasets. Given the size and growth of the Linked Data Cloud, the current, mostly manual approach to detecting relevant datasets for linking is obsolete. We study the use of topic modelling for dataset search experimentally and present TAPIOCA, a linked dataset search engine that automatically provides data publishers with similar existing datasets. Our search engine uses a novel approach for determining the topical similarity of datasets. This approach relies on probabilistic topic modelling to determine related datasets based solely on the metadata of the datasets. We evaluate our approach on a manually created gold standard and with a user study. Our evaluation shows that our algorithm significantly outperforms a set of comparable baseline algorithms, including standard search engines, by 6% F1-score. Moreover, we show that it can be used on a large real-world dataset with comparable performance.
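The ranking step can be sketched as follows, assuming each dataset's metadata has already been reduced to a topic distribution (e.g. by a topic model such as LDA). The catalogue, the topic vectors, and the use of cosine similarity here are illustrative assumptions, not TAPIOCA's actual model or distance measure.

```python
import math

# Toy ranking of candidate datasets by topical similarity: each dataset is
# represented by a (here invented) topic distribution over three topics.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_topics, catalogue):
    """Return dataset names sorted by descending topical similarity."""
    return sorted(catalogue,
                  key=lambda name: cosine(query_topics, catalogue[name]),
                  reverse=True)

catalogue = {
    "geo-dataset":   [0.7, 0.2, 0.1],
    "media-dataset": [0.1, 0.1, 0.8],
}
ranking = most_similar([0.6, 0.3, 0.1], catalogue)
```

A publisher's new dataset, represented by its own topic vector, would thus surface the topically closest existing datasets as linking candidates.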

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 12:24

Copyright of the postings is owned by the original blog authors. Contact us.