Planet RDF

It's triples all the way down

February 27

AKSW Group - University of Leipzig: AKSW Colloquium: Tommaso Soru and Martin Brümmer on Monday, March 2 at 3.00 p.m.

On Monday, 2nd of March 2015, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery. Martin Brümmer will then present NIF annotation and provenance – A comparison of approaches.

Tommaso Soru – ROCKER – Abstract

In the typical entity-relationship model, unique and composite keys are of central importance; the same holds when the concept is applied to the Linked Data paradigm. Keys can help in manifold areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art lacks approaches that scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.
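The core task can be sketched in a few lines: a set of properties is a key for a class if no two instances share the same values on all of those properties, and candidates are explored smallest-first, much as a refinement operator would. This is a toy illustration under invented data, not the ROCKER algorithm itself.

```python
from itertools import combinations

# Hypothetical toy dataset: subject -> {property: value}
data = {
    "ex:a1": {"name": "Alice", "zip": "04109"},
    "ex:a2": {"name": "Bob",   "zip": "04109"},
    "ex:a3": {"name": "Alice", "zip": "04229"},
}

def is_key(props):
    """A property set is a key if every subject has a unique value tuple."""
    seen = set()
    for vals in data.values():
        fingerprint = tuple(vals.get(p) for p in props)
        if fingerprint in seen:
            return False
        seen.add(fingerprint)
    return True

def discover_keys(props):
    """Explore candidate property sets smallest-first; skip supersets of keys."""
    keys = []
    for size in range(1, len(props) + 1):
        for cand in combinations(props, size):
            if any(set(k) <= set(cand) for k in keys):
                continue  # any superset of a key is trivially a key
            if is_key(cand):
                keys.append(cand)
    return keys
```

Here neither `name` nor `zip` alone is a key (two subjects share each), but the pair is, so `discover_keys(("name", "zip"))` yields only the composite key.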

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]

Martin Brümmer – NIF annotation and provenance – A comparison of approaches – Abstract

The increasing use of the NLP Interchange Format (NIF) reveals its shortcomings on a number of levels. One of these is tracking the metadata of annotations represented in NIF – which NLP tool added which annotation, with what confidence, at which point in time, and so on.

A number of solutions to this task of annotating annotations expressed as RDF statements have been proposed over the years. The talk will weigh these solutions – namely annotation resources, reification, Open Annotation, quads and singleton properties – with regard to their granularity, ease of implementation and query complexity.
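To make the trade-off concrete, here is a small sketch (not from the talk) of three of these approaches, modeling triples and quads as plain Python tuples; all URIs and the `nif:confidence` property are hypothetical placeholders.

```python
# One NIF-style annotation: a text span is linked to a DBpedia entity.
statement = ("ex:doc#char=0,5", "itsrdf:taIdentRef", "dbpedia:Berlin")

def reify(stmt, stmt_id, confidence):
    """RDF reification: the statement becomes a resource, costing four
    bookkeeping triples before any metadata is attached."""
    s, p, o = stmt
    return [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
        (stmt_id, "nif:confidence", confidence),
    ]

def singleton(stmt, n, confidence):
    """Singleton property: a unique predicate instance carries the metadata."""
    s, p, o = stmt
    p1 = f"{p}#{n}"
    return [
        (s, p1, o),
        (p1, "rdf:singletonPropertyOf", p),
        (p1, "nif:confidence", confidence),
    ]

def quad(stmt, graph, confidence):
    """Quads / named graphs: a fourth element names a graph that the
    metadata is then attached to."""
    s, p, o = stmt
    return [(s, p, o, graph),
            (graph, "nif:confidence", confidence, "ex:meta")]
```

The statement counts alone (five for reification, three for the singleton property, two quads) hint at why granularity and query complexity differ so much between the options.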

The goal of the talk is to present and compare viable alternatives for solving the problem at hand and to collect feedback on how to proceed.

Posted at 12:57

February 26

W3C Data Activity: Open Data Standards

Data on the Web Best Practices WG co-chair Steve Adler writes: Yesterday, Open Data reached a new milestone with the publication of the W3C’s first public working draft of its Data on the Web Best Practices. Open Data is spreading …

Posted at 16:34

February 23

Dublin Core Metadata Initiative: DC-2015 Professional Program Session in Portuguese & English

2015-02-23, DCMI and the DC-2015 host, São Paulo State University, are pleased to announce that session proposals, as well as the presentation language of sessions in the Professional Program at DC-2015, may be in either Portuguese or English. Depending on the language of the session presenters, simultaneous English/Portuguese or Portuguese/English translation will be provided. Tracks in the Professional Program include special topic sessions and panels, half- and full-day tutorials, workshops, and best practice posters and demonstrations. The call for participation in both the Professional and Technical Programs remains open until 28 March 2015. The call for participation can be found at http://purl.org/dcevents/dc-2015/cfp.

Posted at 23:59

Dublin Core Metadata Initiative: DCMI Webinar: "VocBench 2.0: A Web Application for Collaborative Development of Multilingual Thesauri"

2015-02-23, On 4 March 2015, Caterina Caracciolo of the United Nations Food and Agriculture Organization (FAO) and Armando Stellato of the University of Rome Tor Vergata will present a webinar on VocBench, a web-based platform for the collaborative maintenance of multilingual thesauri. VocBench is an open source project developed through a collaboration between FAO and the University of Rome Tor Vergata. VocBench is currently used for the maintenance of AGROVOC, EUROVOC, GEMET, the thesaurus of the Italian Senate, the Unified Astronomy Thesaurus of Harvard University, as well as other thesauri. VocBench has a strong focus on collaboration, supported by workflow management for content validation and publication. Dedicated user roles provide a clean separation of competencies, ranging from management aspects to vertical competencies in content editing such as conceptualization versus terminology editing. Extensive support for scheme management allows editors to fully exploit the possibilities of the SKOS model, including fulfillment of its integrity constraints. VocBench has been open source software since the publication of version 2, opening it to a large community of users and institutions supporting its development with their feedback and ideas. During the webinar Dr. Caracciolo and Dr. Stellato will demonstrate the main features of VocBench from the point of view of users and system administrators, and explain in what ways you may join the project. Additional information about the webinar and access to registration is available at http://dublincore.org/resources/training/#2015stellato.

Posted at 23:59

February 20

Semantic Web Company (Austria): SEMANTiCS2015: Calls for Research & Innovation Papers, Industry Presentations and Poster/Demos are now open!

The SEMANTiCS2015 conference returns this year, in its 11th edition, to Vienna, Austria – where it all started in 2005!

The conference takes place from 15-17 September 2015 (the main conference on 16-17 September, with several back-to-back workshops & events on the 15th) at the University of Economics – see all information at: http://semantics.cc/.


We are happy to announce the SEMANTiCS Open Calls as follows. All information on the calls can also be found on the SEMANTiCS2015 website: http://semantics.cc/open-calls

Call for Research & Innovation Papers

The Research & Innovation track at SEMANTiCS welcomes the submission of papers on novel scientific research and/or innovations relevant to the topics of the conference. Submissions must be original and must not have been submitted for publication elsewhere. Papers should follow the ACM ICPS guidelines for formatting (http://www.acm.org/sigs/publications/proceedings-templates) and must not exceed 8 pages in length for full papers and 4 pages for short papers, including references and optional appendices.

Abstract Submission Deadline: May 22, 2015
Paper Submission Deadline: May 29, 2015
Notification of Acceptance: July 10, 2015
Camera-Ready Paper: July 24, 2015
Details: http://bit.ly/semantics15-research

Call for Industry & Use Case Presentations

To address the needs and interests of industry, SEMANTiCS presents enterprise solutions that deal with semantic processing of data and/or information in areas like Linked Data, Data Publishing, Semantic Search, Recommendation Services, Sentiment Detection, Search Engine Add-Ons, Thesaurus and/or Ontology Management, Text Mining, Data Mining and any related fields. All submissions should have a strong focus on real-world applications beyond prototypical status and demonstrate the power of semantic systems!

Submission Deadline: July 1, 2015
Notification of Acceptance: July 20, 2015
Presentation Ready: August 15, 2015
Details: http://bit.ly/semantics15-industry

Call for Posters and Demos

The Posters & Demonstrations Track invites innovative work in progress, late-breaking research and innovation results, and smaller contributions (including pieces of code) in all fields related to the broadly understood Semantic Web. The informal setting of the Posters & Demonstrations Track encourages participants to present innovations to business users and find new partners or clients.  In addition to the business stream, SEMANTiCS 2015 welcomes developer-oriented posters and demos to the new technical stream.

Submission Deadline: June 17, 2015
Notification of Acceptance: July 10, 2015
Camera-Ready Paper: August 01, 2015
Details: http://bit.ly/semantics15-poster

We are looking forward to receiving your submissions for SEMANTiCS2015 and to seeing you in Vienna in autumn!

Posted at 10:00

February 19

AKSW Group - University of Leipzig: AKSW Colloquium: Edgard Marx and Tommaso Soru on Monday, February 23, 3.00 p.m.

On Monday, 23rd of February 2015, Edgard Marx will introduce Smart, a search engine designed over the Semantic Search paradigm; subsequently, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery.

EDIT: Tommaso Soru’s presentation was moved to March 2nd.

Abstract – Smart

Since the conception of the Web, search engines have played a key role in making content available. However, retrieving the desired information is still significantly challenging. Semantic Search systems are a natural evolution of traditional search engines: they promise more accurate interpretation by understanding the contextual meaning of the user query. In this talk, we will introduce our audience to Smart, a search engine designed over the Semantic Search paradigm. Smart incorporates two of our currently designed approaches to the problem of Information Retrieval, as well as a novel interface paradigm. Moreover, we will present some of the earlier, as well as more recent, state-of-the-art approaches used by industry – for instance by Yahoo!, Google and Facebook.

Abstract – ROCKER

In the typical entity-relationship model, unique and composite keys are of central importance; the same holds when the concept is applied to the Linked Data paradigm. Keys can help in manifold areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art lacks approaches that scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]

Posted at 21:53

February 18

Semantic Web Company (Austria): Data to Value & Semantic Web Company agree partnership to bring cutting edge Semantic Management to Financial Services clients

The partnership aims to change the way organisations, particularly within Financial Services, manage the semantics embedded in their data landscapes. This will offer several core benefits to existing and prospective clients including locating, contextualising and understanding the meaning and content of Information faster and at a considerably lower cost. The partnership will achieve this through combining the latest Information Management and Semantic techniques including:

  • Text Mining, Tagging, Entity Definition & Extraction.
  • Business Glossary, Data Dictionary & Data Governance techniques.
  • Taxonomy, Data Model and Ontology development.
  • Linked Data & Semantic Web analyses.
  • Data Profiling, Mining & Discovery.

This includes improving regulatory compliance in areas such as BCBS, enabling new investment research and client reporting techniques as well as general efficiency drivers such as faster integration of mergers and acquisitions. As part of the partnership, Data to Value Ltd. will offer solution services and training in PoolParty product offerings, including ontology development and data modeling services.

Nigel Higgs, Managing Director of Data to Value notes; “this is an exciting collaboration between two firms which are pushing the boundaries in the way Data, Information and Semantics are managed by business stakeholders. We spend a great deal of time helping organisations at a grass roots level pragmatically adopt the latest Information Management techniques. We see this partnership as an excellent way for us to help organisations take realistic steps to adopting the latest semantic techniques.”

Andreas Blumauer, CEO of Semantic Web Company adds, “The consortium of our two companies offers a unique bundle, which consists of a world-class semantic platform and a team of experts who know exactly how Semantics can help to increase the efficiency and reliability of knowledge intensive business processes in the financial industry.”

Posted at 13:04

February 17

Libby Miller: Catwigs, printing and boxes

Catwigs are a set of cards that help you interrogate your project and are described

Posted at 20:10

AKSW Group - University of Leipzig: Call for Feedback on LIDER Roadmap

The LIDER project is gathering feedback on a roadmap for the use of Linguistic Linked Data for content analytics.  We invite you to give feedback in the following ways:

Excerpt from the roadmap

Full document: available here
Summary slides: available here

Content is growing at an impressive, exponential rate. Exabytes of new data are created every single day. In fact, data has been recently referred to as the “oil” of the new economy, where the new economy is understood as “a new way of organizing and managing economic activity based on the new opportunities that the Internet provided for businesses”.

Content analytics, i.e. the ability to process and generate insights from existing content, plays and will continue to play a crucial role for enterprises and organizations that seek to generate value from data, e.g. in order to inform decision and policy making.

As corroborated by many analysts, substantial investments in technology, partnerships and research are required to reach an ecosystem consisting of many players and technological solutions that provide the necessary infrastructure, expertise and human resources required to make sure that organizations can effectively deploy content analytics solutions at large scale in order to generate relevant insights that support policy and decision making, or even to define completely new business models in a data-driven economy.

Assuming that such investments need to be and will be made, this roadmap explores the role that linked data and semantic technologies can and will play in the field of content analytics. It generates a set of recommendations for organizations, funders and researchers on which technologies to invest in, as a basis for prioritizing their investment in R&D and for optimizing their mid- and long-term strategies and roadmaps.

Conference Call on 19th of February 3 p.m. CET

Connection details: https://www.w3.org/community/ld4lt/wiki/Main_Page#LD4LT_calls
Summary slides: available here

Agenda

  1. Introduction to the LIDER Roadmap (Philipp Cimiano, 10 minutes)
  2. Discussion of Global Customer Engagement Use Cases (All, 10 minutes)
  3. Discussion of Public Sector and Civil Society Use Cases (All, 10 minutes)
  4. Discussion of Linked Data Life Cycle and Linguistic Linked Data Value Chain (All, 10 minutes)
  5. General Discussion on further use cases, items in the roadmap etc. (20 minutes)

In addition, the call will briefly discuss progress on the META-SHARE linked data metadata model.

The call is open to the public; no LD4LT group participation is required. Dial-in information is available. Please spread this information widely. No knowledge about linguistic linked data is required. We are especially interested in feedback from potential users of linguistic linked data.

About the LIDER Project

Website: http://lider-project.eu

The project’s mission is to provide the basis for the creation of a Linguistic Linked Data cloud that can support content analytics tasks on unstructured multilingual cross-media content. By achieving this goal, LIDER will impact the ease and efficiency with which Linguistic Linked Data can be exploited in content analytics processes.

We aim to provide the basis for a new Linked Open Data (LOD) based ecosystem of free, interlinked, and semantically interoperable language resources (corpora, dictionaries, lexical and syntactic metadata, etc.) and media resources (image, video, etc. metadata) that will allow for the free and open exploitation of such resources in multilingual, cross-media content analytics across the EU and beyond, with specific use cases in industries related to social media, financial services, localization, and other multimedia content providers and consumers.

Give a personal interview to include your voice in the roadmap

Contact: http://lider-project.eu/?q=content/contact-us

The EU project LIDER has been tasked by the European Commission to put together a roadmap for future R&D funding in multilingual industries such as content and knowledge localization, multilingual terminology and taxonomy management, cross-border business intelligence, etc. As a leading supplier of solutions in one or more of these industries, you can provide valuable input for this roadmap. We would like to conduct a short interview with you to establish your views on current and developing R&D efforts in multilingual and semantic technologies that will likely play an increasing role in these industries, such as Linked Data and related standards for web-based, multilingual data processing. The interview will cover five questions and will not take more than 30 minutes. Please let us know a suitable time and date.

Posted at 14:38

February 16

AKSW Group - University of Leipzig: AKSW Colloquium: Konrad Höffner and Michael Röder on Monday, February 16, 3.00 p.m.

CubeQA—Question Answering on Statistical Linked Data by Konrad Höffner

Abstract

Question answering systems provide intuitive access to data by translating natural language queries into SPARQL, which is the native query language of RDF knowledge bases. Statistical data, however, is structurally very different from other data and cannot be queried using existing approaches. Building upon a question corpus established in previous work, we created a benchmark for evaluating questions on statistical Linked Data in order to evaluate statistical question answering algorithms and to stimulate further research. Furthermore, we designed a question answering algorithm for statistical data, which covers a wide range of question types. To our knowledge, this is the first question answering approach for statistical RDF data and could open up a new research area.
See also the paper (preprint, under review) and the slides.
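The gist of why statistical data needs different treatment can be sketched without SPARQL: an answer is typically an aggregate over observations matching a dimension filter, not a single resource. The observations, URIs and question below are invented for illustration and are not from the CubeQA benchmark.

```python
# RDF Data Cube-style observations flattened to dicts:
# each observation fixes some dimensions and carries a measure value.
observations = [
    {"ex:country": "Germany", "ex:year": 2013, "ex:rate": 5.2},
    {"ex:country": "Germany", "ex:year": 2014, "ex:rate": 5.0},
    {"ex:country": "France",  "ex:year": 2014, "ex:rate": 10.3},
]

def answer(dimension_filter, measure, aggregate=min):
    """Select observations matching every dimension constraint,
    then aggregate the chosen measure over them."""
    matching = [o[measure] for o in observations
                if all(o.get(d) == v for d, v in dimension_filter.items())]
    return aggregate(matching)

# "What was Germany's lowest rate?" maps to:
# filter {ex:country: Germany}, measure ex:rate, aggregate min.
```

A question answering system for such data must therefore recognize three things in the question: which dimensions are constrained, which measure is asked for, and which aggregate applies, before any query can be generated.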

News from the WSDM 2015 by Michael Röder

Abstract

The WSDM conference is one of the major conferences for web search and data mining. Michael Röder attended this year's WSDM conference in Shanghai and will present a short overview of the conference topics. After that, he will take a closer look at FEL – an entity linking approach for search queries presented at the conference.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 12:45

AKSW Group - University of Leipzig: Kick-off of the FREME project

Hi all!

A new InfAI project, FREME, kicked off in Berlin. FREME – Open Framework of E-Services for Multilingual and Semantic Enrichment of Digital Content is an H2020 funded project with the objective of building an open, innovative, commercial-grade framework of e-services for multilingual and semantic enrichment of digital content.

InfAI will play an important role in FREME by driving two of the six central FREME services, e-Link and e-Entity. NIF will be used as a mediator between language services and data sources, serving as the foundation for e-Link, while DBpedia Spotlight will be a prototype for e-Entity services, linking named entities in natural-language texts to Linked Open Data sets like DBpedia.

InfAI will also help to identify and publish new Linked Data sets that can contribute to data value chains. Our partners in this open content enrichment effort will be DFKI, Tilde, Iminds, Agro-Know, Wripl, VistaTEC and ISBM.

Stay tuned for more info! In the meantime, join the conversation on Twitter with #FREMEH2020.

- Amrapali Zaveri on behalf of the NLP2RDF group

Posted at 11:50

February 13

Bob DuCharme: Driving Hadoop data integration with standards-based models instead of code

RDFS models!

Posted at 18:43

AKSW Group - University of Leipzig: DL-Learner 1.0 (Supervised Structured Machine Learning Framework) Released

Dear all,

we are happy to announce DL-Learner 1.0.

DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis.

Website: http://dl-learner.org
GitHub page: https://github.com/AKSW/DL-Learner
Download: https://github.com/AKSW/DL-Learner/releases
ChangeLog: http://dl-learner.org/development/changelog/

DL-Learner is used for data analysis in other tools such as ORE and RDFUnit. Technically, it uses refinement operator based, pattern based and evolutionary techniques for learning on structured data. For a practical example, see http://dl-learner.org/community/carcinogenesis/. It also offers a plugin for Protege, which can give suggestions for axioms to add. DL-Learner is part of the Linked Data Stack – a repository for Linked Data management tools.

We want to thank everyone who helped to create this release, in particular (alphabetically) An Tran, Chris Shellenbarger, Christoph Haase, Daniel Fleischhacker, Didier Cherix, Johanna Völker, Konrad Höffner, Robert Höhndorf, Sebastian Hellmann and Simon Bin. We also acknowledge support by the recently started SAKE project, in which DL-Learner will be applied to event analysis in manufacturing use cases, as well as the GeoKnow and Big Data Europe projects where it is part of the respective platforms.

Kind regards,

Lorenz Bühmann, Jens Lehmann and Patrick Westphal

Posted at 09:38

AKSW Group - University of Leipzig: Writing a Survey – Steps, Advantages, Limitations and Examples

What is a Survey?

A survey or systematic literature review is a scholarly text that presents the current knowledge on a particular topic, including substantive findings as well as theoretical and methodological contributions. Literature reviews use secondary sources and do not report new or original experimental work [1].

A systematic review is a literature review focused on a research question, trying to identify, appraise, select and synthesize all high-quality research evidence and arguments relevant to that question. Moreover, a literature review is comprehensive, exhaustive and repeatable – that is, readers can replicate or verify the review.

Steps to perform a survey

  • Select two independent reviewers

  • Look for related/existing surveys

If one exists, check how long ago it was conducted. If it was around 10 years ago, you can go ahead and update it.

  • Formulate research questions

  • Devise eligibility criteria

  • Define search strategy – keywords, journals, conferences, workshops to search in

• Retrieve further potential articles using the search strategy and also by directly contacting top researchers in the field

  • Compare chosen articles among reviewers and decide a core set of papers to be included in the survey

• Perform qualitative and quantitative analyses on the selected set of papers

  • Report on the results

Advantages of writing a survey

There are several benefits/advantages of conducting a survey, such as:

  • A survey is the best way to get an idea of the state-of-the-art technologies, algorithms, tools etc. in a particular field

  • One can get a clear birds-eye overview of the current state of that field

  • It can serve as a great starting point for a student or any researcher thinking of venturing into that particular field/area of research

  • One can easily acquire updated information of a subject by referring to a review

  • It gives researchers the opportunity to formalize different concepts of a particular field

  • It allows one to identify challenges and gaps that are unanswered and crucial for that subject

Limitations of a survey

However, there are a few limitations that must be considered before undertaking a survey such as:

• Surveys can tend to be biased; thus it is necessary to have two researchers who perform the systematic search for the articles independently

  • It is quite challenging to unify concepts, especially when there are different ideas referring to the same concepts developed over several years

  • Indeed, conducting a survey and getting the article published is a long process

Surveys conducted by members of the AKSW group

In our group, three students conducted comprehensive literature reviews on three different topics:

  • Linked Data Quality: The survey covers 30 core papers, which focus on providing quality assessment methodologies for Linked Data specifically. A total of 18 data quality dimensions along with their definitions and 69 metrics are provided. Additionally, the survey contributes a comparison of 12 tools, which perform quality assessment of Linked Data [2].

  • Ubiquitous Semantic Applications: The survey presents a thorough analysis of 48 primary studies out of 172 initially retrieved papers.  The results consist of a comprehensive set of quality attributes for Ubiquitous Semantic Applications together with corresponding application features suggested for their realization. The quality attributes include aspects such as mobility, usability, heterogeneity, collaboration, customizability and evolvability. The proposed quality attributes facilitate the evaluation of existing approaches and the development of novel, more effective and intuitive Ubiquitous Semantic Applications [3].

  • User interfaces for semantic authoring of textual content: The survey covers a thorough analysis of 31 primary studies out of 175 initially retrieved papers. The results consist of a comprehensive set of quality attributes for SCA systems together with corresponding user interface features suggested for their realization. The quality attributes include aspects such as usability, automation, generalizability, collaboration, customizability and evolvability. The proposed quality attributes and UI features facilitate the evaluation of existing approaches and the development of novel more effective and intuitive semantic authoring interfaces [4].

Also, here is a presentation on “Systematic Literature Reviews”: http://slidewiki.org/deck/57_systematic-literature-review.

References

[1] Lisa A. Baglione (2012) Writing a Research Paper in Political Science. Thousand Oaks: CQ Press.

[2] Amrapali Zaveri, Anisa Rula, Andrea Maurino, Ricardo Pietrobon, Jens Lehmann and Sören Auer (2015), ‘Quality Assessment for Linked Data: A Survey’, Semantic Web Journal. http://www.semantic-web-journal.net/content/quality-assessment-linked-data-survey

[3] Timofey Ermilov, Ali Khalili, and Sören Auer (2014). ‘Ubiquitous Semantic Applications: A Systematic Literature Review’. Int. J. Semant. Web Inf. Syst. 10, 1 (January 2014), 66-99. DOI=10.4018/ijswis.2014010103 http://dx.doi.org/10.4018/ijswis.2014010103

[4] Ali Khalili and Sören Auer (2013). ‘User interfaces for semantic authoring of textual content: A systematic literature review’, Web Semantics: Science, Services and Agents on the World Wide Web, Volume 22, October 2013, Pages 1-18 http://www.sciencedirect.com/science/article/pii/S1570826813000498

Posted at 09:10

February 09

Dublin Core Metadata Initiative: Join us in Brazil for DC-2015

2015-02-09, Each of the past 20 years, the metadata community has gathered for DCMI's conference and annual meeting. This year, the annual meeting and conference are being hosted by the Universidade Estadual Paulista--São Paulo State University (UNESP) and held in São Paulo, Brazil. The work agenda of the DCMI community is broad and inclusive of all aspects of innovation in metadata design, implementation and best practices. While the work of the Initiative progresses throughout the year, the annual meeting and conference provide the opportunity for DCMI "citizens" as well as newcomers, students, apprentices, and early career professionals to gather face-to-face to share experiences and knowledge. In addition, the gathering provides public- and private-sector initiatives beyond DCMI that are doing significant metadata work to come together to compare notes and cast a broader light into their particular metadata work silos. Through such a gathering of the metadata communities, DCMI advances its "first goal" of promoting metadata interoperability and harmonization. For general conference information, visit http://purl.org/dcevents/dc-2015. For the Call for Participation, visit http://purl.org/dcevents/dc-2015/cfp.

Posted at 23:59

February 06

Libby Miller: Notes on a TV-to-Radio prototype

In the last couple of weeks at work we’ve been making “radios” in order to test the

Posted at 19:44

February 05

schema.org: Schema.org v1.93: VisualArtwork, Invoices, plus lots of fixes and improvements.

Version v1.93 of schema.org has just been released. As we mentioned in the previous update, we are working towards a stable "version 2" release. This isn't yet v2.0, but it serves as a foundation, fixing a variety of small issues across many schemas and examples.

This release also introduces new vocabulary for describing visual artworks: a new VisualArtwork type alongside supporting properties – artEdition, artform, material and surface. Many thanks to Paul Watson for leading that work. See also Paul's blog posts about the schema, its mapping to VRA Core 4, and its use with Getty's Art and Architecture Thesaurus (AAT) via Linked Data.

Invoices and bills also now have dedicated vocabulary in schema.org; see the new Invoice type for details. This addresses situations in which an invoice is received that is not directly attached to an Order, for example utility bills.
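As a rough sketch of how the new VisualArtwork type might be used, here is a hypothetical JSON-LD description written as a Python dict; the property names follow this announcement, and the artwork values are invented for illustration.

```python
import json

# Hypothetical JSON-LD markup for a print, using the v1.93 VisualArtwork
# type and its supporting properties as named in this release announcement.
artwork = {
    "@context": "http://schema.org",
    "@type": "VisualArtwork",
    "name": "Melencolia I",
    "artform": "Engraving",      # the kind of artwork
    "artEdition": "1",           # which edition of a limited run
    "material": "Copper plate",  # what it was made with
    "surface": "Paper",          # what it was made on
}
doc = json.dumps(artwork, indent=2)  # ready to embed in a <script> tag
```

Serialized this way, the description can be dropped into a page as a `script type="application/ld+json"` block for search engines to pick up.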

As usual, the release notes page has full details. In recent weeks we have been taking care to document the status of all schema.org open issues and proposals in our issue tracker on the GitHub site. As always, thanks are due to everyone who contributed to this release and to the ongoing discussions on GitHub and at W3C.


Posted at 23:24

February 03

AKSW Group - University of Leipzig: Kick-Off for the BMWi project SAKE

Hi all!

One of AKSW’s Big Data projects, SAKE – Semantische Analyse Komplexer Ereignisse (Semantic Analysis of Complex Events) – kicked off in Karlsruhe. SAKE is one of the winners of the Smart Data Challenge, is funded by the German BMWi (Bundesministerium für Wirtschaft und Energie) and has a duration of 3 years. Within this project, AKSW will develop powerful methods for the analysis of industrial-scale Big Linked Data in real time. To this end, the team will extend existing frameworks like LIMES, DL-Learner, QUETSAL and FOX. Together with USU AG, Heidelberger Druckmaschinen, Fraunhofer IAIS and AviComp Controls, novel methods for tackling Business Intelligence challenges will be devised.

More info to come soon!

Stay tuned!

Axel on behalf of the SAKE team

Posted at 10:39

February 02

AKSW Group - University of Leipzig: AKSW Colloquium: Ricardo Usbeck and Ivan Ermilov on Monday, February 2, 3.00 p.m.

GERBIL – General Entity Annotation Benchmark Framework by Ricardo Usbeck

Abstract

The need to bridge between the unstructured data on the document Web and the structured data on the Data Web has led to the development of a considerable number of annotation tools. Those tools are hard to compare since published results are calculated on diverse datasets and measured in different units.

We present GERBIL, a general entity annotation system based on the BAT-Framework. GERBIL offers an easy-to-use web-based platform for the agile comparison of annotators using multiple datasets and uniform measuring approaches. To add a tool to GERBIL, all the end user has to do is provide a URL to a REST interface for the tool that abides by a given specification. The integration and benchmarking of the tool against user-specified datasets is then carried out automatically by the GERBIL platform. Currently, our platform provides results for 9 annotators and 11 datasets, with more coming. Internally, GERBIL is based on the NLP Interchange Format (NIF) and provides Java classes for implementing NIF APIs for datasets and annotators. For the paper see here.
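One reason uniform measuring matters: the same annotator runs can score quite differently under micro- versus macro-averaged F1. The sketch below is our own illustration with invented counts, not GERBIL code.

```python
# Per-dataset results of one annotator: (true pos, false pos, false neg).
# Counts are invented; the second dataset is small and hard.
runs = [
    (90, 10, 10),
    (5, 5, 15),
]

def f1(tp, fp, fn):
    """Standard F1 from raw counts, guarding against empty denominators."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Micro: pool all counts, then compute F1 once (large datasets dominate).
micro = f1(*[sum(col) for col in zip(*runs)])

# Macro: compute F1 per dataset, then average (every dataset weighs equally).
macro = sum(f1(*run) for run in runs) / len(runs)
```

With these counts micro-F1 is about 0.83 while macro-F1 is about 0.62, so two papers reporting "F1" on the same runs could legitimately publish very different numbers unless the averaging scheme is fixed by the benchmark.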

Towards Efficient and Effective Semantic Table Interpretation by Ziqi Zhang, presented by Ivan Ermilov

Abstract

Ivan will present a paper that describes TableMiner by Ziqi Zhang, the first semantic Table Interpretation method that adopts an incremental, mutually recursive and bootstrapping learning approach seeded by automatically selected ‘partial’ data from a table. TableMiner labels columns containing named entity mentions with the semantic concepts that best describe the data in those columns, and disambiguates entity content cells in these columns. TableMiner is able to use various types of contextual information outside tables for Table Interpretation, including semantic markups (e.g., RDFa/microdata annotations) which, to the best of our knowledge, have never been used in Natural Language Processing tasks. Evaluation on two datasets shows that compared to two baselines, TableMiner consistently obtains the best performance. In the classification task, it achieves significant improvements of between 0.08 and 0.38 F1 depending on the baseline method; in the disambiguation task, it outperforms both baselines by between 0.19 and 0.37 in Precision on one dataset, and between 0.02 and 0.03 F1 on the other dataset. Observations also show that the bootstrapping learning approach adopted by TableMiner can potentially deliver computational savings of between 24% and 60% against classic methods that ‘exhaustively’ process the entire table content to build features for interpretation.
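The mutually recursive idea can be pictured with a toy sketch: a column label is inferred from the currently linked cell entities, and each cell’s entity link is then re-chosen to agree with that label, until both stabilise. This illustrates the general bootstrapping pattern only, not TableMiner’s actual algorithm; all names and data below are made up.

```python
from collections import Counter

def interpret_column(cells, candidates, concept_of, max_iters=10):
    """Toy mutual bootstrapping: refine a column label and its
    cell-entity links against each other until they stabilise."""
    # Seed with each cell's top-ranked candidate entity.
    links = {c: candidates[c][0] for c in cells}
    label = None
    for _ in range(max_iters):
        # Column label = most common concept among current links.
        label = Counter(concept_of[e] for e in links.values()).most_common(1)[0][0]
        # Re-link each cell to its first candidate matching the label.
        new = {c: next((e for e in candidates[c] if concept_of[e] == label),
                       candidates[c][0])
               for c in cells}
        if new == links:
            break
        links = new
    return label, links

# Made-up example: the top candidate for "Paris" is a person, but the
# column as a whole looks like cities, so the link gets corrected.
candidates = {"Paris": ["Paris_Hilton", "Paris_France"],
              "Lincoln": ["Lincoln_Nebraska"],
              "Berlin": ["Berlin_Germany"]}
concept_of = {"Paris_Hilton": "Person", "Paris_France": "City",
              "Lincoln_Nebraska": "City", "Berlin_Germany": "City"}
label, links = interpret_column(list(candidates), candidates, concept_of)
```

After one refinement pass the column label settles on the majority concept and the initially mislinked cell is re-disambiguated to match it.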

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 11:00

January 27

Libby Miller: Tangible enough

Thinking about the purpose of prototypes:

Make new and upcoming technologies and standards tangible enough to help people think through the consequences of them.

Technology is moving fast, but it is also unevenly distributed, and the consequences – good and bad – of emerging technologies may only become apparent as they move into the mainstream. By making these consequences tangible early we can choose between possible futures.

Links

Posted at 20:14

Libby Miller: What is Radiodan for?

This is my view only, and there’s a certain amount of thinking out loud / lack of checking / potentially high bullshit level.

Yesterday I was asked to comment on a Radiodan doc and this popped out:

Posted at 11:59

January 25

Libby Miller: A quick Radiodan: Exclusively Archers

I made one of these a few months ago – they’re super simple – but Chris Lynas asked me about it, so I thought I should write it up quickly.

It’s an internet radio that turns itself on for

Posted at 14:29

January 23

Cambridge Semantics: Big Data Industry News Watch

A round up of recent industry news on the topics of Big Data and Enterprise Data Management

Posted at 15:00

Libby Miller: A quick analysis of wifi cards for using a Raspberry Pi as an access point

When Radiodan can’t access the web, it throws up an access point (AP) created by the Pi: you connect directly to that, and a captive-portal webpage lists the available wifi networks and asks you for the password of the one you want. Getting wifi credentials onto objects with no user interface is not easy, and this is the best approach we’ve found so far.
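For reference, a typical Pi access-point setup of this kind pairs hostapd (to create the AP) with dnsmasq (to hand out addresses and steer every DNS lookup to the Pi, which is what makes the captive portal appear). The values below are illustrative, not Radiodan’s actual configuration:

```ini
# /etc/hostapd/hostapd.conf -- illustrative values only
interface=wlan0
driver=nl80211
ssid=radiodan-setup
hw_mode=g
channel=6

# /etc/dnsmasq.conf -- resolve every hostname to the Pi, so any
# page the user requests lands on the captive-portal setup page
interface=wlan0
dhcp-range=10.0.0.2,10.0.0.20,12h
address=/#/10.0.0.1
```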

Posted at 09:37

January 20

Cambridge Semantics: Putting the Smarts in Data Integration

Driving business value from your data often requires integration across many sources. These integration projects can be time-consuming, expensive and difficult to manage. Any shortcuts can compromise quality and reuse. In many industries, non-compliance with data governance rules can put your firm’s reputation at risk and expose you to large fines.

Traditional data integration methods require point-to-point mapping of source and target systems. This effort typically requires a team of both business SMEs and technology professionals. These mappings are time-consuming to create and code, and errors discovered in the ETL (Extract, Transform, and Load) process force iterative passes through the whole cycle.
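A back-of-envelope sketch shows why point-to-point mapping scales badly: with n systems, every directed source-to-target pair needs its own mapping, whereas mapping each system once onto a shared semantic model needs only n mappings. The functions below are illustrative arithmetic only, not part of any product:

```python
def point_to_point_mappings(n: int) -> int:
    """Every directed source -> target pair needs its own mapping."""
    return n * (n - 1)

def shared_model_mappings(n: int) -> int:
    """Each system is mapped once onto a common semantic model."""
    return n

# Compare the two approaches as the number of systems grows.
for n in (3, 5, 10):
    print(n, point_to_point_mappings(n), shared_model_mappings(n))
# 10 systems: 90 point-to-point mappings vs. 10 against a shared model
```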

Posted at 20:47

Copyright of the postings is owned by the original blog authors. Contact us.