Planet RDF

It's triples all the way down

September 22

Leigh Dodds: The Lego Analogy

I think Lego is a great analogy for understanding the importance of data standards and registers.

Lego have been making plastic toys and bricks

Posted at 12:35

September 19

Leigh Dodds: Mapping wheelchair accessibility, how Google could help

This month Google announced

Posted at 07:15

September 18

Leigh Dodds: Under construction

It’s been a while since I posted a more personal update here. But, as I announced this morning, I’ve

Posted at 19:27

September 17

Bob DuCharme: Understanding activation functions better

And making neural networks look a little less magic.
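The post is only excerpted here. As a quick illustration of its territory (not code from the post itself), here is a minimal NumPy sketch of three common activation functions:

```python
import numpy as np

def sigmoid(x):
    """Squashes any real input into (0, 1)."""
    return 1 / (1 + np.exp(-x))

def relu(x):
    """Zero for negative inputs, identity for positive ones."""
    return np.maximum(0, x)

x = np.linspace(-4, 4, 9)
print(sigmoid(x))   # smooth S-curve
print(np.tanh(x))   # like sigmoid, but squashes into (-1, 1)
print(relu(x))      # piecewise linear
```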

Posted at 18:11

September 16

Libby Miller: Leaving Flickr

I’m very sad about this, especially because of all the friends I have made on Flickr, but with Verizon’s acquisition of Yahoo (and so Flickr) and the

Posted at 21:21

September 12

Ebiquity research group UMBC: 2018 Ontology Summit: Ontologies in Context

2018 Ontology Summit: Ontologies in Context

The Ontology Summit is an annual series of online and in-person events that involves the ontology community and communities related to each year’s topic. The topic chosen for the 2018 Ontology Summit will be Ontologies in Context, which the summit describes as follows.

“In general, a context is defined to be the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed. Some examples of synonyms include circumstances, conditions, factors, state of affairs, situation, background, scene, setting, and frame of reference. There are many meanings of “context” in general, and also for ontologies in particular. The summit this year will survey these meanings and identify the research problems that must be solved so that contexts can succeed in achieving the full understanding and assessment of an ontology.”

Each year’s Summit comprises a series of both online and face-to-face events that span about three months. These include a vigorous three-month online discourse on the theme, online panel discussions, and research activities, which culminate in a two-day face-to-face workshop and symposium.

Over the next two months, there will be a sequence of weekly online meetings to discuss, plan and develop the 2018 topic. The summit itself will start in January with weekly online sessions of invited speakers. Visit the 2018 Ontology Summit site for more information and to see how you can participate in the planning sessions.

Posted at 22:55

September 10

Ebiquity research group UMBC: Dissertation: Context-Dependent Privacy and Security Management on Mobile Devices

Context-Dependent Privacy and Security Management on Mobile Devices

Prajit Kumar Das, Context-Dependent Privacy and Security Management on Mobile Devices, Ph.D. Dissertation, University of Maryland, Baltimore County, September 2017.

There are ongoing security and privacy concerns regarding mobile platforms that are being used by a growing number of citizens. Security and privacy models typically used by mobile platforms rely on one-time permission acquisition mechanisms. However, modifying access rights after initial authorization in mobile systems is often too tedious and complicated for users. User studies show that a typical user either does not understand the permissions requested by applications or is too eager to use the applications to care about the permission implications. For example, the Brightest Flashlight application was reported to have logged precise locations and unique user identifiers, which have nothing to do with a flashlight application’s intended functionality, yet more than 50 million users used a version of this application, which would have forced them to allow this permission. Given the penetration of mobile devices into our lives, a fine-grained, context-dependent security and privacy control approach needs to be created.

We have created Mithril, an end-to-end mobile access control framework that allows us to capture access control needs for specific users by observing violations of known policies. The framework studies mobile application executables to better inform users of the risks associated with using certain applications. The policy capture process involves an iterative user feedback process that captures the policy modifications required to mediate observed violations. The precision of the policy is used to determine convergence of the policy capture process. Policy rules in the system are written using Semantic Web technologies and the Platys ontology to define a hierarchical notion of context. Policy rule antecedents are composed of context elements derived using the Platys ontology, employing a query engine, an inference mechanism and mobile sensors. We performed a user study that demonstrates the feasibility of using our violation-driven policy capture process to gather user-specific policy modifications.

We contribute to the static and dynamic study of mobile applications by defining “application behavior” as a possible way of understanding mobile applications and creating access control policies for them. Our user study also shows that, unlike our behavior-based policy, a “deny by default” mechanism hampers the usability of access control systems. We also show that the inclusion of crowd-sourced policies leads to a further reduction in user burden and need for engagement while capturing context-based access control policy. We enrich knowledge about mobile “application behavior” and expose this knowledge through the Mobipedia knowledge base. We also extend context synthesis for semantic presence detection on mobile devices by combining Bluetooth Low Energy beacons and Nearby Messaging services from Google.
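To make the violation-driven capture process more concrete, here is a deliberately simplified Python sketch. It is a toy illustration only, not the Mithril implementation; the dictionary keys and the set-of-allow-rules policy shape are invented:

```python
def violates(policy, access):
    """Deny by default: an access is a violation unless some rule allows it."""
    return (access["app"], access["resource"], access["context"]) not in policy

def capture_policy(policy, access_log, ask_user):
    """Iteratively mediate observed violations via user feedback
    until a full pass adds no new rule (convergence)."""
    changed = True
    while changed:                  # terminates: rules are only ever added
        changed = False
        for access in access_log:
            if violates(policy, access) and ask_user(access):
                # User allows this access in this context: learn an allow rule.
                policy.add((access["app"], access["resource"], access["context"]))
                changed = True
    return policy

log = [{"app": "flashlight", "resource": "location", "context": "home"}]
print(capture_policy(set(), log, ask_user=lambda a: False))  # stays deny-by-default
```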

Posted at 15:20

September 09

Egon Willighagen: New paper: "RDFIO: extending Semantic MediaWiki for interoperable biomedical data management"

Figure 10 from the article, showing what the DrugMet wiki with the pKa data looked like. CC-BY.
When I was still doing research at Uppsala University, I had an internship student, Samuel Lampa, who did wonderful work on knowledge representation and logic (check his thesis). In that same period he started RDFIO, a Semantic MediaWiki extension that provides a SPARQL endpoint and some clever features to import and export RDF. As I was already using RDF in my research, and wikis are a great way to explore how to model domain data, particularly when extracted from diverse literature, I was quite interested. Together we worked on capturing pKa data, and Samuel put DrugMet online. Extracting pKa values from primary literature is a lot of laborious work, and crowdsourcing did not pick up. This data was migrated to Wikidata about a year ago.

I also used the RDFIO extension when I started capturing nanosafety data from literature when I worked at Karolinska Institutet. I will soon write up this work, as the NanoWiki (check out these FigShare data releases) was a seminal data set in eNanoMapper, during which I continued adding data to test new AMBIT features.

Earlier this week Samuel's write-up of his RDFIO project was published, to which I contributed the pKa use case (doi:10.1186/s13326-017-0136-y). There are various ways to install the software, as described on the RDFIO project site. The DrugMet data, as well as the OrphaNet data from the other example use case, can also be downloaded from that site.
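Since RDFIO's headline feature is the SPARQL endpoint it adds to a wiki, a minimal Python sketch of querying such an endpoint may be useful. The endpoint URL below is hypothetical; substitute your own wiki's endpoint:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint URL for illustration only.
sparql = SPARQLWrapper("http://example.org/wiki/Special:SparqlEndpoint")
sparql.setQuery("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
sparql.setReturnFormat(JSON)

# Print the first few triples stored in the wiki.
for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```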

Lampa, S., Willighagen, E., Kohonen, P., King, A., Vrandečić, D., Grafström, R., & Spjuth, O. (2017). RDFIO: extending semantic MediaWiki for interoperable biomedical data management. Journal of Biomedical Semantics, 8 (1). http://dx.doi.org/10.1186/s13326-017-0136-y

Posted at 14:00

September 04

AKSW Group - University of Leipzig: DBpedia @ SEMANTiCS 2017

We are happy to invite you to the 10th DBpedia Community Meeting, which will be held in Amsterdam. During SEMANTiCS 2017, Sep 11-14, the DBpedia Community will get together on the 14th of September for DBpedia Day.

What cool things do you do with DBpedia? Present your tools and datasets at the DBpedia Community Meeting. Please submit your proposal in our form.

Highlights/Sessions

  • Keynote by Chris Welty (Google Research)
  • Keynote by Victor de Boer (VU University)
  • DBpedia Association Hour & Dutch DBpedia Hour
  • session on DBpedia ontology by members of the DBpedia ontology committee
  • DBpedia Tutorial Session (for people who want to learn about DBpedia)
  • We will talk with Mike Tung, CEO and founder of Diffbot, about the DBpedia NLP department via video stream.

 Tickets

Attending the DBpedia Community Meeting costs €40 (excl. registration fee and VAT). DBpedia members get free admission, please contact your nearest DBpedia chapter or the DBpedia Association for a promotion code.  

Please check all details here.

Workshop

If you can't wait until the end of SEMANTiCS, you can already participate in the workshop “Two worlds, one goal: A Reliable Linked Data ecosystem for media”, held by DBpedia and Wolters Kluwer on the 11th of September. This half-day workshop aims to explore major topics for publishers and libraries from DBpedia's and Wolters Kluwer's perspectives. Both communities will dive into core areas like interlinking, metadata and data quality, and address challenges such as the fundamental requirements for publishing data on the web. Did we spark your interest? Check our detailed program here and get your ticket today.

We are looking forward to meeting you in Amsterdam!

Posted at 13:25

AKSW Group - University of Leipzig: PRESS RELEASE: Amsterdam - this year's hotspot on Linked Data Strategies & Practices

September 11-14, 2017: international experts from science and industry demonstrate the business value of smart data services at SEMANTiCS 2017

Experts from science and industry meet at Europe's biggest Linked Data and Semantic Web event to present and discuss the latest achievements, challenges and future perspectives of new data management practices. The conference for Semantic Systems is now in its 13th edition and is run by a mixed industry and research consortium comprising Semantic Web Company (Austria), the Institute for Applied Informatics (Germany), the University of Applied Sciences St. Pölten (Austria) and the Dutch partners VU, TNO and Kadaster, together with Wolters Kluwer as major industry sponsor.

Most companies and public administrations nowadays are struggling to catch up with new data management practices, either by initializing a data strategy from scratch or by adjusting their old strategy to the affordances of new technological environments, legal frameworks or business models. The Semantics conference gives insights into data management strategies, discusses cases of data-driven business models and gives advice on how to catch up with latest developments at the dawn of smart, networked data.

The exchange between industry and research is facilitated by a rich program consisting of six keynotes from companies like EA Games, Wolters Kluwer, and OTTO, followed by a total of 36 industry and 25 scientific presentations, 17 workshops, a poster and demo area and numerous social side events.

Programme Overview

September 11, 2017: Pre-Conference Workshops
September 12, 2017: Main Conference Day 1: Keynotes by Wolters Kluwer, EA Games and Toulouse Institute of Computer Science Research
September 13, 2017: Main Conference Day 2: Keynotes by OTTO & Ghent University
September 14, 2017: Post-Conference Workshops & DBpedia Day: Keynote by Chris Welty

This year’s conference focuses on the business value of Linked Data technologies and services as an enabling technology for a cost-efficient, flexible and sustainable enterprise data strategy.

This is addressed in the opening keynote by Sandeep Sacheti, Executive Vice President, Customer Information Management & Operational Excellence at Wolters Kluwer, and in the management panel on Wednesday with Frank Tierolff (board member of Kadaster), Henk Jan Vink (director Networked Innovation of TNO), Kor Brandts (director of DUO) and Michiel Borgers (Dutch Ministry of Finance).

The full and rich programme, with talks and presentations by leading researchers in the field and leading industry adopters, can be found at the conference page at http://2017.semantics.cc

SEMANTiCS 2017 Key Data

Date: 11-14 September 2017
Venue: Meervaart Theatre, Amsterdam, The Netherlands
Website: http://2017.semantics.cc
Programme: http://2017.semantics.cc/programme
Twitter: @semanticsconf
Contact: Dissemination Chair Arjen Santema
arjen.santema@kadaster.nl |+31 (0)652481774

View the full release here: PDF

Looking forward to seeing you at SEMANTiCS 2017.

Posted at 09:58

September 03

Dublin Core Metadata Initiative: Early Registration for DC 2017 Ends 15 September 2017

Save now! Registration fees for DC-2017 go up on 16 September. You don't want to miss out on the savings with early registration. The program schedule for the three full days of the conference in Washington, DC is filled with sessions that you will not want to miss. Check out the schedule at http://dcevents.dublincore.org/IntConf/index/pages/view/schedule17 where links from titles will take you to descriptions of events and the abstracts of Papers, Project Reports, Presentations, Posters, Workshops and Tutorials.

Posted at 16:10

Dublin Core Metadata Initiative: DC 2017 Preliminary Program Announced

DCMI is pleased to announce the publication of the Preliminary Program for DC-2017 at http://dcevents.dublincore.org/IntConf/index/pages/view/schedule17. The program includes an array of Papers, Project Reports, Presentations and Posters. Special Sessions on significant metadata topics as well as half- and full-day Workshops round out the program. The keynote will be delivered by G. Sayeed Choudhury, Associate Dean for Research Data Management and Hodson Director of the Digital Research and Curation Center at the Sheridan Libraries of Johns Hopkins University.

Posted at 16:09

Dublin Core Metadata Initiative: DCMI Website Migrated to New Platform

The DCMI website (this website) has been migrated to a new platform, as the first stage of a comprehensive overhaul. DCMI's Infrastructure Advisory Committee is overseeing a project to modernise the site and to improve the management of a significant amount of content, some of which goes back to 1995! We are taking advantage of two technologies in particular: GitHub, and Hugo, a static site generator. The migration of content to these new technologies has involved transforming content from HTML into a combination of Markdown and YAML, the formats used by Hugo.
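As a rough sketch of the kind of transformation involved (assuming the html2text package; the page content, file path and front matter fields are invented for illustration):

```python
import html2text  # pip install html2text

legacy_html = "<h1>About DCMI</h1><p>Founded in 1995 ...</p>"  # invented sample page

# Hugo pages pair YAML front matter with a Markdown body.
front_matter = "---\ntitle: About DCMI\ndate: 1995-01-01\n---\n\n"
markdown_body = html2text.html2text(legacy_html)

with open("content/about.md", "w") as f:
    f.write(front_matter + markdown_body)
```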

Posted at 16:08

August 29

schema.org: Schema.org 3.3: News, fact checking, legislation, finance, schedules, howtos, tourism and toilets!

Schema.org 3.3 has been released. As always, the release was prepared, debated and finalized by the schema.org community group, and features a range of additions, adjustments, bugfixes and clarifications to improve the expressiveness and usability of our schemas.


See the release notes for full details, but of particular note are some changes made around the NewsArticle type (in collaboration with the Trust Project, on whose work this is largely based). For many years, our definition of NewsArticle was simply "a news article". With this release we add (via our "pending" mechanism) some more subtlety around news, making it possible to mark up categories of news including opinion pieces, background articles and reportage, as well as introducing types for satirical and advertiser content. We also add properties that encourage greater transparency around news creation and publication. These are flagged as "pending" to emphasize that early adopter feedback on the new vocabulary is particularly welcomed, via GitHub, the W3C group, or the site's feedback form. These developments complement our earlier work to support interoperability amongst fact-checking sites via the ClaimReview type. Following discussion at the GlobalFact4 conference, we have also amended the definition of the "expires" property to highlight its applicability to fact-checking content.
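For readers new to ClaimReview, here is a minimal, invented markup sketch, emitted as JSON-LD from Python (the claim, rating scale and organization are made up; see schema.org/ClaimReview for the authoritative definitions):

```python
import json

# Invented example values for illustration only.
claim_review = {
    "@context": "http://schema.org",
    "@type": "ClaimReview",
    "datePublished": "2017-08-29",
    "claimReviewed": "The moon is made of green cheese.",
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 5,
        "alternateName": "False",   # human-readable verdict
    },
    "author": {"@type": "Organization", "name": "Example Fact Checks"},
}
print(json.dumps(claim_review, indent=2))
```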

Other highlights of 3.3 include new terminology (also pending implementor feedback) for describing legislation, based on the European Legislation Identifier (ELI) ontology and the work of the ELI taskforce. We have also added an overview page giving more details on our finance-related terminology, contributed by the FIBO community, alongside a proposed design for describing schedules, new subtypes distinguishing user from critic reviews, and a generalization of our recipes schema called "HowTo" for recipe-like tasks that don't result in food. We've also added types for TouristAttraction and for PublicToilet...

Posted at 21:51

August 28

AKSW Group - University of Leipzig: AKSW Colloquium, 01.09.2017, IDOL: Comprehensive & Complete LOD Insights

At the AKSW Colloquium on Friday, 1st of September, at 10:40 AM, there will be a paper presentation by Gustavo Publio. He will present the paper IDOL: Comprehensive & Complete LOD Insights, by Ciro Baron Neto, Dimitris Kontokostas, Amit Kirschenbaum, Gustavo Publio, Diego Esteves, and Sebastian Hellmann, which will be presented at the upcoming SEMANTiCS'17 conference in Amsterdam, Netherlands.

Abstract

“Over the last decade, we observed a steadily increasing amount of RDF datasets made available on the web of data. The decentralized nature of the web, however, makes it hard to identify all these datasets. Even more so, when downloadable data distributions are discovered, only insufficient metadata is available to describe the datasets properly, thus posing barriers on its usefulness and reuse. In this paper, we describe an attempt to exhaustively identify the whole linked open data cloud by harvesting metadata from multiple sources, providing insights about duplicated data and the general quality of the available metadata. This was only possible by using a probabilistic data structure called Bloom Filter. Finally, we published a dump file containing metadata which can further be used to enrich existent datasets.”
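The Bloom filter mentioned in the abstract is what makes exhaustive duplicate detection tractable: it answers set-membership queries in constant memory, at the cost of occasional false positives. A toy Python sketch of the data structure (not the authors' implementation):

```python
import hashlib

class BloomFilter:
    """Constant-memory set membership with a tunable false-positive rate."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from k salted hashes of the item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter()
for url in ["http://example.org/d1.nt", "http://example.org/d1.nt"]:
    if url in seen:
        print("probable duplicate:", url)  # may rarely be a false positive
    seen.add(url)
```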

 

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/public/colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there are complimentary coffee and cake after the session.

Posted at 15:24

August 25

Tetherless World Constellation group RPI: Get Off Your Twitter

Web Science, more so than many other disciplines of Computer Science, has a special focus on its humanist qualities – no surprise, given that the Web is ultimately an instrument for human expression and cooperation. Naturally, lots of current research in Web Science centers on people and their patterns of behavior, making social media a potent source of data for this line of work.

 

Accordingly, much time has been devoted to analyzing social networks – perhaps to a fault. Much of the ACM’s Web Science ‘17 conference centered on social media; more specifically, Twitter. While it may sound harsh, the reality is that many of the papers presented at WebSci’17 could be reduced to the following pattern:

  1. There’s Lots of Political Polarization
  2. We Want to Explore the Political Landscape
  3. We Scraped Twitter
  4. We Ran (Sentiment Analysis/Mention Extraction/etc.)
  5. and We Found Out Something Interesting About the Political Landscape

Of the 57 submissions included in the WebSci’17 proceedings, 17 mention ‘Twitter’ or ‘tweet’ in the abstract or title; that’s about 3 out of every 10 submissions, including posters. By comparison, only seven mention Facebook, with some submissions mentioning both.

 

This isn't to demean the quality or importance of such work; there's a lot to be gained from using Twitter to understand the current political climate, as well as loosely quantifying cultural dynamics and understanding social networks. However, this isn't the only topic in Web Science worth exploring, and Twitter certainly shouldn't be the ultimate arbiter of that discussion. While Twitter provides a potent means for understanding popular sentiment via a well-controlled dataset, it is still only a single service that attracts a certain type of user and is better suited to pithy sloganeering than to deep critical analysis, or any other form of expression that can't be captured in 140 characters.

 

One of my fellow conference-goers also noticed this trend. During a talk on his submission to WebSci'17, Helge Holzmann, a researcher from Germany working with Web archives, offered a truism that succinctly captures what I'm saying here: Twitter ought not to be the only data source researchers use when doing Web Science.

 

In fact, I would argue that Mr. Holzmann's focus, Web archives, could provide a much richer basis for testing our cultural hypotheses. While more old-school, Web archives capture a much, much larger and more representative span of the Web, from its inception to the dawn of social media, than Twitter could ever hope to.

 

The winner of the Best Paper award speaks directly to the new possibilities offered by working with more diverse datasets. Applying a deep learning approach to Web archives, the authors examined the evolution of front-end Web design over the past two decades. Admittedly, I wasn't blown away by their results; they claimed that their model had generated new Web pages in the style of different eras, but didn't show an example, which was underwhelming. But that's beside the point; the point is that this is a unique task which couldn't be accomplished by leaning exclusively on Twitter or any other social media platform.

 

While I remain critical of the hyper-focus of the Web Science community on social media sites – and especially Twitter – as a seed for its work, I do admire the willingness to wade into cultural and other human-centric issues. This is a rare trait in technological disciplines in general, but especially in fields of Computer Science; you're far more likely to read about gains in deep reinforcement learning than about accommodating cultural differences in Web use (though these don't necessarily exclude each other). To that point, the need to provide greater accessibility to the Web for disadvantaged groups and the need to preserve rapidly disappearing Web content were both widely noted, leaving me optimistic about the future of the field as a way of empowering everyone on the Web.

 

Now time to just wean ourselves off Twitter a bit…

Posted at 15:02

August 20

Bob DuCharme: Validating RDF data with SHACL

Setting some constraints--then violating them!
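The teaser sums up the SHACL workflow nicely: declare constraints, then watch data violate them. As a minimal illustration (not taken from the post), here is how that cycle looks with the pySHACL library:

```python
from pyshacl import validate

shapes_ttl = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [ sh:path ex:age ;
                  sh:datatype xsd:integer ;
                  sh:minInclusive 0 ] .
"""

data_ttl = """
@prefix ex: <http://example.org/> .
ex:alice a ex:Person ; ex:age -3 .
"""

conforms, _, report = validate(data_ttl, shacl_graph=shapes_ttl,
                               data_graph_format="turtle",
                               shacl_graph_format="turtle")
print(conforms)  # False: the negative age violates the constraint
print(report)    # human-readable validation report
```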

Posted at 15:54

August 17

Ebiquity research group UMBC: PhD defense: Prajit Das, Context-dependent privacy and security management on mobile devices

Ph.D. Dissertation Defense

Context-dependent privacy and security management on mobile devices

Prajit Kumar Das

8:00-11:00am Tuesday, 22 August 2017, ITE325b, UMBC

There are ongoing security and privacy concerns regarding mobile platforms which are being used by a growing number of citizens. Security and privacy models typically used by mobile platforms rely on one-time permission acquisition mechanisms. However, modifying access rights after initial authorization in mobile systems is often too tedious and complicated for users. User studies show that a typical user either does not understand the permissions requested by applications or is too eager to use the applications to care about the permission implications. For example, the Brightest Flashlight application was reported to have logged precise locations and unique user identifiers, which have nothing to do with a flashlight application’s intended functionality, yet more than 50 million users used a version of this application, which would have forced them to allow this permission. Given the penetration of mobile devices into our lives, a fine-grained, context-dependent security and privacy control approach needs to be created.

We have created Mithril, an end-to-end mobile access control framework that allows us to capture access control needs for specific users by observing violations of known policies. The framework studies mobile application executables to better inform users of the risks associated with using certain applications. The policy capture process involves an iterative user feedback process that captures the policy modifications required to mediate observed violations. The precision of the policy is used to determine convergence of the policy capture process. Policy rules in the system are written using Semantic Web technologies and the Platys ontology to define a hierarchical notion of context. Policy rule antecedents are composed of context elements derived using the Platys ontology, employing a query engine, an inference mechanism and mobile sensors. We performed a user study that demonstrates the feasibility of using our violation-driven policy capture process to gather user-specific policy modifications.

We contribute to the static and dynamic study of mobile applications by defining “application behavior” as a possible way of understanding mobile applications and creating access control policies for them. Our user study also shows that, unlike our behavior-based policy, a “deny by default” mechanism hampers the usability of access control systems. We also show that the inclusion of crowd-sourced policies leads to a further reduction in user burden and need for engagement while capturing context-based access control policy. We enrich knowledge about mobile “application behavior” and expose this knowledge through the Mobipedia knowledge base. We also extend context synthesis for semantic presence detection on mobile devices by combining Bluetooth Low Energy beacons and Nearby Messaging services from Google.

Committee: Drs. Anupam Joshi (chair), Tim Finin (co-chair), Tim Oates, Nilanjan Banerjee, Arkady Zaslavsky (CSIRO), Dipanjan Chakraborty (Shopperts)

Posted at 23:22

August 14

Tetherless World Constellation group RPI: WebSci ’17

The Web Science Conference was hosted by Rensselaer Polytechnic Institute this year. The Tetherless World Constellation was heavily involved in organizing the event and ensuring the conference ran smoothly. The venue for the conference was the Franklin Plaza in downtown Troy. It was a great venue, with a beautiful rooftop.

On 25th June, a set of workshops was organized for the attendees. I was a student volunteer at the “Algorithm Mediated Online Information Access (AMOIA)” workshop. We started the day off with a set of talks. The common theme of these talks was reducing the bias in services we use online. We then spent the next few hours in a discussion on the “Role of recommendation algorithms in online hoaxes and fake news.”

Prof. Peter Fox and Prof. Deborah McGuinness, who were the Main Conference Chairs, kicked off the conference on 26th June. Steffen Staab gave his keynote talk on “The Web We Want”. After the keynote talk, we jumped right into a series of talks. A few topics caught my attention during each session. Venkata Rama Kiran Garimella’s talk on “The Effect of Collective Attention on Controversial Debates on Social Media” was very interesting, as was the talk on “Recommendations for groups in location-based social networks” by Fred Ayala. We ended the talks with a panel discussion on “The ethics of doing Web Science”. After the panel discussion, we headed to the roof for some dinner and the Web Science Poster Session. There were plenty of posters at the session. Congrui Li and Spencer Norris from TWC presented their work at the poster session.

 

27th of June was the day of the conference I was most looking forward to, since it had a session on “Networks: Structure, Identifiers, Search”. I found all the talks presented here fascinating and useful, particularly “Hierarchical Change Point Detection” and “Adaptive Edge Probing” by Yu Wang and Sucheta Soundarajan respectively. I plan to use the work they presented in one of my current research projects. At the end of the day on 27th June, the awards for papers and posters were presented. Helena Webb won the best paper award. She presented her work on “The ethical challenges of publishing Twitter data for research dissemination”. Venkata Garimella won the best student paper award. Tetherless’ own Spencer Norris won the best poster award.

On 28th June, we started the day off with a set of talks on the topic chosen for the hackathon, “Network Analysis for Non-Social Data”. Here I presented my work on how network analysis techniques can be leveraged and applied in the field of Earth Science. After these talks, the hackathon presentations were made by the participants. At lunch, Ahmed Eliesh from TWC won first place in the hackathon. After lunch, we had the last two sessions at WebSci ’17. In these talks, Shawn Jones’ presentation of Yasmin AlNoamany’s work on “Generating Stories from Archived Collections” and Helena Webb’s best-paper-winning talk on “The ethical challenges of publishing Twitter data for research dissemination” piqued my interest.

Overall, attending the Web Science conference was a very valuable experience for me. There was plenty to learn, lots of networking opportunities and a generally jovial atmosphere around the conference. Looking forward to next year's conference in Amsterdam.

 

 

Posted at 21:01

August 07

Leigh Dodds: Bath Playbills 1812-1851

This weekend I published scans of over 2000 historical playbills for the Theatre Royal in Bath. Here are some notes on where they come from and how they might be useful.

The scans are

Posted at 06:39

August 01

Leigh Dodds: We can strengthen data infrastructure by analysing open data

Posted at 10:13

July 31

Leigh Dodds: Experiences with the Freestyle Libre

Posted at 20:01

Leigh Dodds: Thank you for the data

Here are three anecdotes that show ways in which I’ve shared data with different types of organisation, and how they’ve shared data with me.

Last year we donated some old children’s toys and books to Julian House. When we dropped them off, I signed a

Posted at 16:11

Libby Miller: Libbybot eleven – webrtc / pi3 / presence robot

The libbybot posable presence robot’s latest instructions are 

Posted at 09:32

July 30

Bob DuCharme: The W3C standard constraint language for RDF: SHACL

A brief history of the new standard and some toys to play with it.

Posted at 15:46

AKSW Group - University of Leipzig: AKSW at ISWC2017

We are very pleased to announce that AKSW will be presenting 2 papers at ISWC 2017, which will be held on 21-24 October in Vienna, Austria. The demo and workshop papers are still to be announced.
The International Semantic Web Conference (ISWC) is the premier international forum where Semantic Web / Linked Data researchers, practitioners, and industry specialists come together to discuss, advance, and shape the future of semantic technologies on the web, within enterprises and in the context of public institutions.

Here is the list of the accepted papers with their abstracts:

“Distributed Semantic Analytics using the SANSA Stack” by Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Ivan Ermilov, Simon Bin, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo and Hajira Jabeen.

Abstract: A major research challenge is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and reasoning. Analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases, and most analytics approaches which do scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input. This software framework paper describes the ongoing Semantic Analytics Stack (SANSA) project, which supports expressive and scalable semantic analytics by providing functionality for distributed computing on RDF data.
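SANSA itself is built on Scala and Apache Spark; as a rough, language-shifted illustration of the distributed-RDF-analytics idea (not SANSA's API), here is a PySpark sketch that counts predicate usage across an N-Triples file:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("triple-stats").getOrCreate()

# Naive N-Triples parsing: split each line into subject, predicate, object.
# Good enough for illustration, not for the full N-Triples grammar.
triples = (spark.sparkContext.textFile("data.nt")
           .filter(lambda line: line.strip() and not line.startswith("#"))
           .map(lambda line: line.rstrip(" .\n").split(" ", 2)))

# Predicate usage counts, computed in parallel across the cluster.
counts = triples.map(lambda spo: (spo[1], 1)).reduceByKey(lambda a, b: a + b)
for predicate, count in counts.takeOrdered(10, key=lambda kv: -kv[1]):
    print(predicate, count)
```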

“Iguana: A Generic Framework for Benchmarking the Read-Write Performance of Triple Stores” by Felix Conrads, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Muhammad Saleem, and Mohamed Morsey.

Abstract: The performance of triple stores is crucial for applications which rely on RDF data. Several benchmarks have been proposed that assess the performance of triple stores. However, no integrated benchmark-independent execution framework for these benchmarks has been provided so far. We propose a novel SPARQL benchmark execution framework called IGUANA. Our framework complements benchmarks by providing an execution environment which can measure the performance of triple stores during data loading, data updates as well as under different loads. Moreover, it allows a uniform comparison of results on different benchmarks. We execute the FEASIBLE and DBPSB benchmarks using the IGUANA framework and measure the performance of popular triple stores under updates and parallel user requests. We compare our results with state-of-the-art benchmarking results and show that our benchmark execution framework can unveil new insights pertaining to the performance of triple stores.
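In the spirit of IGUANA's parallel-load measurements (though nothing like a full benchmark framework), a toy sketch of timing a SPARQL query under concurrent clients might look like this; the endpoint URL is a placeholder:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://localhost:3030/ds/sparql"  # placeholder triple store URL
QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"

def timed_query(_):
    client = SPARQLWrapper(ENDPOINT)
    client.setQuery(QUERY)
    client.setReturnFormat(JSON)
    start = time.perf_counter()
    client.query().convert()
    return time.perf_counter() - start

# Simulate 8 parallel clients issuing 40 queries in total.
with ThreadPoolExecutor(max_workers=8) as pool:
    latencies = list(pool.map(timed_query, range(40)))

print(f"mean latency: {sum(latencies) / len(latencies):.3f}s")
```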

Thank you, and we look forward to seeing you at ISWC 2017.

Acknowledgments
This work was supported by the European Union's H2020 research and innovation action HOBBIT (GA no. 688227), the European Union's H2020 research and innovation program BigDataEurope (GA no. 644564), the German Ministry BMWi under the SAKE project (Grant No. 01MD15006E), the WDAqua Marie Skłodowska-Curie Innovative Training Network, and Industrial Data Space.

Posted at 03:57

July 26

Leigh Dodds: “The Rock Thane”, an open data parable

In a time long past, in a land far away, there was once a troubled kingdom. Despite the efforts of the King to offer justice freely to all, many of his subjects were troubled by unscrupulous merchants and greedy landowners. Time and again, the King heard claims of goods not being delivered, or disputes over land.

While the merchants and landowners were able to produce documents and affidavits to their defence, the King grew increasingly troubled. He felt that his subjects were being wronged, and he grew distrustful of the scribes that thronged the hallways of his courts and marketplaces.

One day, three wizards visited the kingdom. The wizards had travelled from the Far East, where as Masters of the Satoshi School, they had developed many curious spells. The three wizards were brothers. Na was the youngest, and was made to work hardest by his elder brothers, Ka and Mo. Mo, the eldest, was versed in many arts still unknown to his brothers.

Their offer to the King was simple: through the use of their magic they would remove all corruption from his lands. In return they would expect to be well paid for their efforts. Keen to be a just and respected ruler, the King agreed to the wizards’ plan. But while their offer was simple, the plan itself was complex.

The wizards explained that, through an obscure art, they could cause words and images to appear within a certain type of rock, or crystal which could be found commonly throughout the land. Once imbued with words, a crystal could no longer be changed even by a powerful wizard. In a masterful show of power, Ka and Mo embedded the King’s favourite poem and then a painting of his mother in a pair of crystals of the highest quality.

The wizards explained that rather than relying on parchment which could be faked or changed through the cunning application of pumice stones, they could use inscribed crystals to create indelible records of trading bills, property sales and other important documents.

The wizards also demonstrated to the King how, by channelling the power of their masters, groups of their acolytes could simultaneously record the same words in crystals all across the land. This meant that not only would there be an indisputable record of a given trade, but that there would immediately be dozens of copies available across the land, for anyone to check. Readily available and verifiable copies of any bill of trade would mean that no merchant could ever falsify a transaction.

In payment, the wizards would receive a gold piece for every crystal inscribed by their acolytes. Each crystal providing a clear proof of their works.

Impressed, the King decreed that henceforth, all across his lands, trading would now be carried out in trading posts staffed by teams of the wizards’ acolytes.

And, for a time, everything was fine.

But the King began to again receive troubling reports about trading disputes. Trust was failing once again. Speaking to his advisers and visiting some of the new trading posts, the King learned the source of the concerns.

When trading bills had been written on parchment, they could be read by anyone. This made them accessible to all. But only the wizards and their acolytes could read the words inscribed in the crystals. And the King’s subjects didn’t trust them.

Demanding an explanation, the King learnt that Na, the youngest wizard, had been tasked with providing the power necessary to inscribe the crystals. Not as versed in the art as his elder brothers, he was only able to inscribe the crystals with a limited number of words and only the haziest of images. Rather than inscribing easily readable bills of trade, Na and the acolytes were making inscriptions in a cryptic language known only to wizards.

Anyone wanting to read a bill had to request an acolyte to interpret it for them. Rumours had been spreading that the acolytes could be paid to interpret the runes in ways that were advantageous to those with sufficient coin.

The middle brother, Ka, attempted to placate the enraged King, proposing an alternative arrangement. He would oversee the inscribing of the crystals in the place of his brother. Skilled in additional spells, Ka’s proposal was that the crystals would no longer be inscribed with runes describing the bills of sale. Instead each crystal would simply hold the number of a page in a magical book. Each Book of Bills, would hold an infinite number of pages. And, when a sale was made one acolyte would write the bill into a fresh page of a Book, whilst another would inscribe the page number into a crystal. As before, across the land, other acolytes would simultaneously inscribe copies of the bills into other crystals and other copies of the Book.

In this way, anyone wanting to read a bill of sale could simply ask a Book of Bills to turn to the page they needed. Anyone could then read from the book. But the crystals themselves would remain the ultimate proof of the trade. While someone might have been able to fake a copy of a Book, no-one could fake one of the crystals.

Grudgingly accepting this even more complex arrangement, the King was briefly satisfied. Until the accident.

One day, the wizard Ka visited the Craggy Valley, to forage for the rare Ipoh herb, which was known to grow in that part of the Kingdom. However, in a sudden fog, the wizard slipped and fell to his doom. And at the moment of his death, all of the wizard’s spells were undone. In a blink of an eye, all of the magical Books of Bills disappeared. Along with every proof of trade.

Enraged once more, the King gave the eldest wizard one more opportunity to deliver. Mo reassured the King that his power was far greater and that he was uniquely able to deliver on his late brother’s promise. Mo explained that through various dark arts he was able to resist death. He demonstrated his skill to the King, recklessly drinking terrible poisons, and throwing himself from a high tower only to land unharmed. Stunned at this show of power, the King agreed that Mo could take up his brother’s task.

For a few months, the turmoil was resolved, until fresh reports of corruption began to spread.

A dismayed King granted an audience to a retinue of merchants who had travelled from all across his kingdom. The merchants claimed to have evidence that discrepancies had begun to appear in the Books of Bills. In different towns and cities the Books showed slightly different numbers. There was also talk of a strange, shadowy figure who had been present at many of the trading posts in which discrepancies had been found.

Troubled, the King sent out soldiers to set watch on the trading posts, giving orders that they should attempt to capture and bring this stranger to the court.

Many weeks of waiting and watching passed. More evidence of corrupted Books of Bills continued to appear. Challenged to explain the allegations, Mo scoffed at the evidence. The wizard suggested that the problem was illiterate merchants, asserting that his acolytes were above suspicion.

But finally the king’s soldiers captured the shadowy stranger, and his identity was revealed.

While Mo was the oldest of the three wizards, he was not the eldest. There was a fourth brother, named To. Much older than his brothers, To had been stripped of his riches and banished for studying certain forbidden arts. It was from their brother that Na, Ka and Mo had learned many of their spells, including the arts of inscribing crystals and books, and the means of channelling their powers through acolytes.

Except To had not taught them everything. He had kept many secrets for himself and was able to corrupt the spells used to inscribe the crystals and Books. He was able to change page numbers to refer to other pages which he had inscribed with different words. He had been selling his skills to unscrupulous merchants in an attempt to grow rich once again.

Sickened of wizards and their complicated schemes, the King banished them from his kingdom, never to return.

The King then turned to the task of once more building trust in commerce across his land. He did this not by trusting in magics and complex schemes, but by addressing the problems with which he was originally faced. He decreed the founding of a guild, to create a cadre of trusted, reliable scribes. He appointed new ombudsmen and magistrates across the land, to help oversee and administer all forms of trade. He founded libraries and reading rooms to increase literacy amongst his subjects, so that more of them could read and write their own bills of trade. And he offered free use of the courts to all, so that none were denied an opportunity to seek justice.

Many years passed before the King and his kingdom worked through their troubles. But in the history books, the King was forever known as “The Rock Thane”.


Read the previous open data parables:

Posted at 20:20

July 17

Ebiquity research group UMBC: PhD defense: Deep Representation of Lyrical Style and Semantics for Music Recommendation

Dissertation Defense

Deep Representation of Lyrical Style and Semantics for Music Recommendation

Abhay L. Kashyap

11:00-1:00 Thursday, 20 July 2017, ITE 346

In the age of music streaming, the need for effective recommendations is important for music discovery and a personalized user experience. Collaborative filtering based recommenders suffer from popularity bias and cold-start, which are commonly mitigated by content features. For music, research in content-based methods has mainly focused on the acoustic domain, while lyrical content has received little attention. Lyrics contain information about a song's topic and sentiment that cannot be easily extracted from the audio. This is especially important for lyrics-centric genres like Rap, which was the most streamed genre in 2016. The goal of this dissertation is to explore and evaluate different lyrical content features that could be useful for content, context and emotion based models for music recommendation systems.

With Rap as the primary use case, this dissertation focuses on featurizing two main aspects of lyrics: the artistic style of composition and the semantic content. For lyrical style, a suite of high-level rhyme density features is extracted, in addition to literary features like the use of figurative language, profanity and vocabulary strength. In contrast to these engineered features, Convolutional Neural Networks (CNNs) are used to automatically learn rhyme patterns and other relevant features. For semantics, lyrics are represented using both traditional IR techniques and the more recent neural embedding methods.

These lyrical features are evaluated for artist identification and compared with artist and song similarity measures from a real-world collaborative filtering based recommendation system from Last.fm. It is shown that both rhyme and literary features serve as strong indicators to characterize artists, with feature learning methods like CNNs achieving comparable results. For artist and song similarity, a strong relationship was observed between these features and the way users consume music, while neural embedding methods significantly outperformed LSA. Finally, this work is accompanied by a web application, Rapalytics.com, that is dedicated to visualizing all these lyrical features and has been featured on a number of media outlets, most notably Vox, attn: and Metro.
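As a tiny illustration of the "traditional IR" baseline mentioned above (a scikit-learn sketch with an invented three-song corpus, not the dissertation's pipeline):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Invented mini-corpus: one document of lyrics per song.
lyrics = [
    "started from the bottom now the whole team here",
    "money on my mind money money on my mind",
    "we gonna have a good time and celebrate tonight",
]

tfidf = TfidfVectorizer().fit_transform(lyrics)          # term weighting
lsa = TruncatedSVD(n_components=2).fit_transform(tfidf)  # low-rank "semantic" space
print(cosine_similarity(lsa))                            # 3x3 song-similarity matrix
```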

Committee: Drs. Tim Finin (chair), Anupam Joshi, Tim Oates, Cynthia Matuszek and Pranam Kolari (Walmart Labs)

Posted at 01:38

July 13

Ebiquity research group UMBC: PhD Proposal: Analysis of Irregular Event Sequences using Deep Learning, Reinforcement Learning, and Visualization

Analysis of Irregular Event Sequences using Deep Learning, Reinforcement Learning, and Visualization

Filip Dabek

11:00-1:00 Thursday 13 July 2017, ITE 346, UMBC

History is nothing but a catalogued series of events organized into data. Amazon, the largest online retailer in the world, processes over 2,000 orders per minute. Orders come from customers on a recurring basis through subscriptions or as one-off spontaneous purchases, resulting in each customer exhibiting their own behavioral pattern in the way they place orders throughout the year. For a company such as Amazon, which generates over $130 billion of revenue each year, understanding and uncovering the hidden patterns and trends within this data is paramount to improving the efficiency of its infrastructure, ranging from the management of inventory within its warehouses to the distribution of its labor force and the preparation of its online systems for user load. With the ever-increasing availability of big data, problems such as these are no longer limited to large corporations but are experienced across a wide range of domains and faced by analysts and researchers each and every day.

While many event analysis and time series tools have been developed for the purpose of analyzing such datasets, most approaches tend to target clean and evenly spaced data. When faced with noisy or irregular data, the usual recommendation is a pre-processing step that converts and transforms the data into a regular form. This transformation arguably interferes at a fundamental level with how the data is represented, and may irrevocably bias the way in which results are obtained. Therefore, operating on raw data, in its noisy natural form, is necessary to ensure that the insights gathered through analysis are accurate and valid.
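A small NumPy example makes the bias argument concrete: regularizing a bursty event sequence into fixed bins discards exactly the structure an analyst may care about (the timestamps are invented):

```python
import numpy as np

# Invented irregular event timestamps, in days: two bursts and a long gap.
events = np.array([0.0, 0.4, 0.5, 3.2, 3.3, 3.4, 9.9])

# Working on the raw sequence keeps the bursty structure visible.
gaps = np.diff(events)
print("inter-event gaps:", gaps)   # mixes sub-day gaps with a 6.5-day gap

# Regularizing first (daily bins) hides everything inside each day.
counts, _ = np.histogram(events, bins=np.arange(0, 11))
print("events per day:", counts)
```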

In this dissertation novel approaches are presented for analyzing irregular event sequences using a variety of techniques ranging from deep learning, reinforcement learning, and visualization. We show how common tasks in event analysis can be performed directly on an irregular event dataset without requiring a transformation that alters the natural representation of the process that the data was captured from. The three tasks that we showcase include: (i) summarization of large event datasets, (ii) modeling the processes that create events, and (iii) predicting future events that will occur.

Committee: Drs. Tim Oates (Chair), Jesus Caban, Penny Rheingans, Jian Chen, Tim Finin

 

Posted at 02:55

July 12

Leigh Dodds: Data is infrastructure, so it needs a design manual

Posted at 12:07

Copyright of the postings is owned by the original blog authors. Contact us.