Planet RDF

It's triples all the way down

September 25

Bob DuCharme: Semantic web semantics vs. vector embedding machine learning semantics

It's all semantics.

Posted at 16:01

September 24

John Goodwin: Using Recurrent Neural Networks to Hallucinate New Model Army Lyrics

I decided to follow the example of

Posted at 15:36

September 23

Dublin Core Metadata Initiative: Sutton stepping down as DCMI Managing Director

2016-09-23, Stuart Sutton has announced his intention to step down as DCMI Managing Director effective 30 June 2017. Over the coming months, the DCMI Executive Committee and the Governing Board will be engaged in succession planning and the process of replacing the Managing Director. For additional information concerning the succession and appointment process, contact DCMI Chair-Elect Paul Walk at p[dot]walk[at]ed[dot]ac[dot]uk. Future announcements concerning the succession process will be posted here from time to time.

Posted at 23:59

Dublin Core Metadata Initiative: Synaptica becomes DC-2016 Gold Sponsor

2016-09-23, DCMI is pleased to announce that Synaptica is supporting DC-2016 in Copenhagen as a Gold Sponsor. Since 1995, Synaptica has been developing innovative software tools for organizing, indexing and classifying information, and for discovering knowledge. All of Synaptica's award-winning software products are built on a foundation of open standards and a commitment to client-led solutions and uncompromising customer service. Synaptica's Linked Canvas will be featured in a demonstration during the Conference Opening Reception. Linked Canvas is an easy-to-use tool designed for the cultural heritage community as well as schools and colleges to build interactive educational resources. For more information about Synaptica and Linked Canvas, visit http://dcevents.dublincore.org/IntConf/index/pages/view/sponsors16#synaptica. Visit http://dcevents.dublincore.org/IntConf/dc-2016 for more information about the conference and to register.

Posted at 23:59

Dublin Core Metadata Initiative: Danish Bibliographic Centre (DBC) becomes DC-2016 Sponsor

2016-09-23, The Danish Bibliographic Centre (DBC) joins in supporting DC-2016 in Copenhagen as Sponsor of the Conference Delegate Bags. The DBC's main task in Denmark is the development and maintenance of the bibliographic and IT infrastructure of Danish libraries. The DBC handles registration of books, music, AV materials, Internet documents, articles and reviews in newspapers and magazines in the National Bibliography, develops Danbib, the Danish union catalogue, and the infrastructure for interlibrary loan. Danbib is comprised of the National Bibliography and the holdings of the libraries. DBC also develops bibliotek.dk — the citizen's access to all Danish publications and the holdings of the Danish libraries. DBC's IT development is based on open source and service oriented architecture. DBC is a public limited company owned by Local Government Denmark and the Danish State.

Posted at 23:59

Dublin Core Metadata Initiative: Ana Alice Baptista named Chair-Elect of the DCMI Governing Board

2016-09-23, The DCMI Governing Board is pleased to announce that Ana Alice Baptista has been appointed to the DCMI Governing Board as an Independent Member. She also assumes the role of Chair-Elect of the Governing Board at the closing ceremony of DC-2016 in Copenhagen. She will succeed Paul Walk as Chair of the Board in 2017. Ana is a professor at the Information Systems Department and a researcher at ALGORITMI Center, both at the University of Minho, Portugal. She graduated in computer engineering and holds a PhD in Information Systems and Technologies. She is also a member of the Elpub conference series Executive Committee, participated in several R&D projects, and was an evaluator of project proposals under FP7. For more information about the Governing Board, visit the DCMI website at http://dublincore.org/about/oversight/.

Posted at 23:59

Dublin Core Metadata Initiative: Join us! DC-2016 in Copenhagen on 13-16 October

2016-09-23, DC-2016 in Copenhagen, Denmark on 13-16 October is rapidly approaching. The program promises a rich array of papers, project reports, presentations, demonstrations, posters, special panels, workshops and an exciting keynote by Elsevier's Bradley Allen. You will not want to miss this one! Register now at http://dcevents.dublincore.org/IntConf/index/pages/view/reg16.

Posted at 23:59

Frederick Giasson: Web Page Analysis With Cognonto

Extract Structured Content, Tag Concepts & Entities

 

Cognonto is brand new. At its core, it uses a structure of nearly 40,000 concepts. It has about 138,000 links to the external classes and concepts that define huge public datasets such as Wikipedia, DBpedia and USPTO. Cognonto is not a children’s toy. It is huge and complex… but it is very usable. Before digging into the structure itself, and before starting to write about all the use cases that Cognonto can support, I will first cover all of the tools that currently exist to help you understand Cognonto and its conceptual structure and linkages (called KBpedia).

The visible embodiment of Cognonto is the set of tools we created and made available on the cognonto.com web site. Their goal is to show the structure at work: what ties where, how the conceptual structure and its links to external schemas and datasets help discover new facts, how it can drive other services, etc.

This initial blog post will discuss the demo section of the web site. What we call the Cognonto demo is a web page crawler that analyzes web pages to tag concepts, to tag named entities, to extract structured data, to detect language, to identify topics, and so forth. The demo uses the KBpedia structure and its linkages to Wikipedia, Wikidata, Freebase and USPTO to tag content that appears in the analyzed web pages. But there is one thing to keep in mind: the purpose of Cognonto is to link public or private datasets to the structure to expand its knowledge and make these tools (like the demo) even more powerful. This means that a private organization could use Cognonto, add its own datasets and link its own schemas, to improve its own version of Cognonto or to tailor it for its own purpose.

Let’s see what the demo looks like, what information it extracts and analyzes from any web page, and how it ties into the KBpedia structure.

Analyzing a web page

The essence of the Cognonto demo is to analyze a web page. The steps performed by the demo are as follows (a rough code sketch follows the list):

  1. Crawling the web page’s content
  2. Extracting text content, defluffing and normalizing it
  3. Detecting the language used to write the content
  4. Extracting the copyright notice
  5. Extracting metadata from the web page (HTML, microformats, RDFa, etc.)
  6. Querying KBpedia to detect named entities
  7. Querying KBpedia to detect concepts
  8. Querying KBpedia to find information about the web page
  9. Analyzing extracted entities and concepts to determine the publisher
  10. Analyzing extracted concepts to determine the most likely topics
  11. Generating the analysis result set
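
Here is that rough, hypothetical sketch of what such a pipeline can look like in code. It assumes the requests, beautifulsoup4 and langdetect Python packages, and it stubs out the copyright and KBpedia-driven steps (4 and 6 to 10), since those depend on Cognonto’s own knowledge base. It illustrates the flow above; it is not the actual Cognonto implementation.

    # Rough, hypothetical sketch of a web page analysis pipeline following the
    # steps listed above. Not the Cognonto implementation: the KBpedia-specific
    # steps are stubbed out.
    import requests
    from bs4 import BeautifulSoup          # assumption: beautifulsoup4 is installed
    from langdetect import detect          # assumption: langdetect is installed

    def analyze_page(url):
        html = requests.get(url, timeout=30).text                # 1. crawl the page
        soup = BeautifulSoup(html, "html.parser")

        # 2. extract and "defluff" the text content (very naive version)
        for tag in soup(["script", "style", "header", "footer", "nav", "aside"]):
            tag.decompose()
        body_text = " ".join(soup.get_text(separator=" ").split())

        language = detect(body_text)                              # 3. detect the language

        # 5. extract metadata from the HTML meta elements (microformats/RDFa omitted)
        metadata = {m.get("name") or m.get("property"): m.get("content")
                    for m in soup.find_all("meta")
                    if m.get("content") and (m.get("name") or m.get("property"))}

        # 4, 6-10. copyright extraction and the KBpedia-driven steps (concept/entity
        # tagging, publisher and topic analysis) would go here; they require the
        # knowledge base itself.

        return {                                                  # 11. the resultset
            "url": url,
            "language": language,
            "metadata": metadata,
            "body_text": body_text[:500],   # truncated for the example
        }

    if __name__ == "__main__":
        print(analyze_page("http://www.cnn.com/"))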

To test the demo and see how it works, let’s analyze a piece of news recently published by CNN: Syria convoy attack: US blames Russia. You can start the analysis process by following this link. The first page will be:

What the demo shows is the header of the analyzed web page. The header is composed of the title of the web page and possibly a short description and an image. All of this information comes from the extracted metadata content of the page. Then you are presented with 5 tabs:

  1. Concepts: which shows the body content and extracted metadata of the web page tagged with all detected KBpedia concepts
  2. Entities: which shows the body content and extracted metadata of the web page tagged with all detected KBpedia named entities that exist in the knowledge base
  3. Analysis: which shows all the different kinds of analysis performed by the demo
  4. Graphs: which shows how the topics found during the topic analysis step tie into the KBpedia conceptual structure
  5. Export: which shows you what the final resultset looks like

Concepts tab

The Concepts tab is the first one presented to you. All of the concepts that exist in KBpedia (among its ~40,000 concepts) and that appear in the body content of the web page or its extracted metadata will be tagged. There is one important thing to keep in mind here: the demo detects what it considers to be the body content of the web page. It will defluff it, which means that it will remove the header, footer, sidebars and all other irrelevant content surrounding the body content of the page. The model used by the demo works better on article-like web pages, so some web pages may not end up with much extracted body content for that very reason.

All of the concepts that appear in red are the ones that the demo considers to be the core concepts of the web page. The ones in blue are all of the others. If you mouse over any of these tagged terms, you will be presented with a contextual menu that shows you one or more concepts that may refer to that surface form (the word in the text). For example, if you mouse over administration, you will be presented with two possible concepts for that word:

However, if you do the same for airstrikes, then you will be presented with a single unambiguous concept:

If you click on any of those links, then you will be redirected to a KBpedia reference concept view page. You will see exactly how that concept ties into the broader KBpedia conceptual structure. You will see all of its related (direct and inferred) concepts, and how it links to external schemas, vocabularies and ontologies. It will also show you lists of related entities, etc.

What all of this shows you is how these tagged concepts are in fact windows into a much broader universe, one that can be understood because all of its information is fully structured and can be reasoned upon. This is the crux of the demo. It shows that a web page is not just about its content, but about its entire context as well.
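
To make the notion of surface forms a bit more concrete, here is a small hedged sketch of how candidate concepts can be looked up by label with rdflib and SKOS. The namespace, concept URIs and labels below are invented for illustration; the real KBpedia structure is far richer.

    # Hypothetical sketch: mapping a surface form such as "administration" to one
    # or more candidate concepts through their SKOS labels. URIs are invented.
    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import SKOS

    KB = Namespace("http://example.org/kbpedia/")   # placeholder namespace

    g = Graph()
    g.add((KB.Administration, SKOS.prefLabel, Literal("administration", lang="en")))
    g.add((KB.GovernmentAdministration, SKOS.altLabel, Literal("administration", lang="en")))
    g.add((KB.AirStrike, SKOS.prefLabel, Literal("airstrikes", lang="en")))

    def candidate_concepts(surface_form):
        """Return every concept whose prefLabel or altLabel matches the surface form."""
        label = Literal(surface_form, lang="en")
        hits = set(g.subjects(SKOS.prefLabel, label)) | set(g.subjects(SKOS.altLabel, label))
        return sorted(hits)

    print(candidate_concepts("administration"))   # ambiguous: two candidate concepts
    print(candidate_concepts("airstrikes"))       # unambiguous: one candidate concept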

Entities tab

The Entities tab presents information in exactly the same manner as the Concepts tab. However, the content that is tagged is different. Instead of tagging concepts, we tag named entities. These entities (in the case of this demo) come from the entity datasets that we linked to KBpedia, namely: Wikipedia, Wikidata, Freebase and USPTO. These are a different kind of window than the concepts. These are the named things of the world that we detect in the content of the web page.

But there is one important thing to keep in mind: these are the named things that exist in the knowledge base at that moment. The demo is constrained to the tens of millions of fully structured named entities that come from these various public data sources. However, the purpose of a knowledge base is to be nurtured and extended. Organizations could add private datasets into the mix to augment the knowledge of the system or to specialize it to specific domains of interest.

Another important thing to keep in mind is that we have constrained this demo to a specific domain of things, namely organizations. The demo only considers a subset of entities from the KBpedia knowledge base: anything that is an organization. This shows how KBpedia can be sliced and diced to be domain specific. The ability to categorize millions of entities into different kinds of domains is what leads to purposeful, dedicated services.

The tag that appears in orange in the text is the entity that has been detected as the organization that published the web page. All the other entities appear in blue. If you click on one of these entities, then you will be redirected to the entity view page. That page will show you all the structured information we have about that entity in the knowledge base, and you will see how it ties into the KBpedia conceptual structure.

Analysis tab

The Analysis tab is the core of the demo. It presents several analyses of the web page that use the tagged concepts and entities to generate new information about the page. These are just some of the analyses we developed for the demo. All kinds of other analyses could be created in the future depending on the needs of our clients.

Language analysis

The first thing we detect is the language used to write the web page. The analysis is performed on the extracted body content of the page. We can detect about 125 languages at the moment. Cognonto is multilingual at its core, but for now we have only configured the demo to analyze English web pages. Non-English web pages can be analyzed, but only English surface forms will be detected.

Topic analysis

The topic analysis section shows what the demo considers to be the most important concepts detected in the web page. Depending on a full suite of criteria, one concept will score higher than another. Note that all of these concepts exist in the KBpedia conceptual structure. This means that we don’t simply “tag” a concept. We tag a concept that is part of an entire structure, with hundreds or thousands of parent and child concepts, and linked to external schemas, vocabularies and ontologies. Again, these are not simple tags; these are windows into a much broader [conceptual] world.

Publisher analysis

The publisher analysis section shows what we consider to be the organization that published the web page. This analysis is much more complex in its processing. It involves an analysis pipeline that includes multiple machine learning algorithms. However, one thing distinguishes it at its core from other, more conventional machine learning pipelines: the heavy leveraging of the KBpedia conceptual structure. We take the tagged named entities the demo discovered, we check their types, and then we analyze how they fit within KBpedia, using their SuperTypes for further analysis. Then we check where they occur in the page, compute a final likelihood score, and determine whether one of these tagged entities can be considered the publisher of the web page.

Organizational Analysis

The organizational analysis is one of the steps performed by the publisher analysis that we wanted to make explicit. What we do here is show all the organization entities that we detected in the web page, and where in the web page (metadata, body content, etc.) they appear.

The important thing to understand here is how we detect organizations. We do not just check whether the entities are explicitly declared to be of type Organization; we check whether they are of type Organization by inference. What does that mean? It means that we use the KBpedia structure to keep all the tagged named entities that can be inferred to be an Organization. Most of these entities are not directly defined to be of type kbpedia:Organization. Cognonto nonetheless determines that they are by using the KBpedia structure, and its linkages to external vocabularies, schemas and ontologies, to infer which of the tagged named entities are of type kbpedia:Organization.

Take a look at the kbpedia:Organization page. Take a look at all the core structure and external structure linkages this single concept has with external conceptual structures. It is this structure that is used to determine whether a named entity that exists in the KBpedia knowledge base is an Organization or not. There is no magic, but it is really powerful!
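
To make “by inference” concrete in RDF terms, here is a hedged sketch using rdflib and a SPARQL property path: an entity typed with any subclass of Organization is accepted even though it is never declared as an Organization directly. The class and entity URIs are invented for the example and are not KBpedia’s actual identifiers.

    # Hypothetical sketch of type checking "by inference" over a class hierarchy.
    # All URIs are invented for illustration.
    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF, RDFS

    KB = Namespace("http://example.org/kbpedia/")
    EX = Namespace("http://example.org/entity/")

    g = Graph()
    g.add((KB.NewsAgency, RDFS.subClassOf, KB.MediaCompany))
    g.add((KB.MediaCompany, RDFS.subClassOf, KB.Organization))
    g.add((EX.CNN, RDF.type, KB.NewsAgency))      # never typed as Organization directly
    g.add((EX.Damascus, RDF.type, KB.City))

    query = """
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX kb:   <http://example.org/kbpedia/>

    SELECT ?entity WHERE {
      ?entity rdf:type/rdfs:subClassOf* kb:Organization .
    }
    """

    for row in g.query(query):
        print(row.entity)   # only EX.CNN is returned, inferred through the class tree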

Metadata Extraction

All the metadata extracted by the demo is displayed at the end of the Analysis tab. This metadata comes from the HTML meta elements or from embedded microdata and RDFa structured content. Everything that got detected is displayed in this tab.

Graphs tab

The Graphs tab shows you a graphical visualization tool. Its purpose is simply to contextualize the concepts identified by the topic analysis within the upper structure of KBpedia. It shows how everything is interconnected. But keep in mind that these are just tiny snapshots of the whole picture: there exist millions of links between these concepts and billions of inferred facts!

Here is a hierarchical view of the graph:

[image: cognonto_demo_10]

Here is a network view of that same graph:

 

Export tab

The Export tab is just a way for the user to visualize the resultset generated by Cognonto, which is what the web user interface uses to display the information you are seeing. It shows that all the information is structured and could be used by other computer systems for other means.

Conclusion

At the core of everything there is one thing: the KBpedia conceptual structure. It is what is being used across the board. It is what instructs the machine learning algorithms, what helps us analyze textual content such as web pages, what helps us identify concepts and entities, and what helps us contextualize content. This is the heart of Cognonto; everything else is just nuts and bolts. KBpedia can, and should, be extended with other private and public data sources. Cognonto/KBpedia is a living thing: it heals, it adapts and it evolves.

Posted at 17:48

September 21

Frederick Giasson: Cognonto

I am proud to announce the start of a new venture called Cognonto. I am particularly proud of it because even if it is just starting, it is in fact more than eight years old. It is the embodiment of eight years of research, of experimentation, of a great deal of frustration and of great joy with my long-time partner Mike.

Eight years ago, we set a 5-to-10-year vision for our work as partners. We defined an initial series of technological goals for which we outlined a series of yearly milestones. The goals were related to helping solve decades-old problems with data integration and interoperability using a completely new research field (at the time): the Semantic Web.

And there we are, eight years later, after working an endless number of hours to create all kinds of different projects and services to pay for the research and the pieces of technology we develop for these purposes. Cognonto is the embodiment of that effort, but the effort also gave rise to a series of other purposeful projects such as Structured Dynamics, UMBEL, the Open Semantic Framework and a series of other open source collaterals.

We spent eight years creating, sanitizing, making coherent and consistent, and generating and regenerating a conceptual structure of now 38,930 reference concepts with 138,868 mapping links to 27 external schemas, vocabularies and datasets. This led to the creation of KBpedia, which is the knowledge graph that drives Cognonto. The full statistics are available here.

I can’t thank Mike enough for this long and wonderful journey that led to the creation of Cognonto. I sent him an endless number of concept lists that he diligently screened, assessed and mapped. We spent hundreds of hours discussing the nuts and bolts of the structure, arguing about its core concepts and how they should be defined and used. It was not without pain, but I believe that the result is truly astonishing.

I won’t copy/paste the Cognonto press release here; a link will suffice. It is just not possible for me to write a better introduction than the two-pager that Mike wrote for the press release. I would also suggest that you read his Cognonto introduction blog post: Cognonto is on the Hunt for Big AI Game.

In the coming weeks, I will write a lot about Cognonto: what it is, how it can be used, what its use cases are, how the information presented in the demo and the knowledge graph sections should be interpreted, and what these pages tell you.

Posted at 13:19

September 19

Leigh Dodds: Why are bulk downloads of open data important?

I was really pleased to see that at the GODAN Summit last week

Posted at 17:47

September 15

Leigh Dodds: People like you are in this dataset

One of the recent projects we’ve done at

Posted at 22:38

September 02

Dublin Core Metadata Initiative: Tongfang Co., Ltd. becomes DC-2016 Reception Sponsor

2016-09-02, DCMI is pleased to announce that Tongfang Co., Ltd. has become the DC-2016 Reception Sponsor. Tongfang is a high-tech company established in 1997. Over the years, Tongfang has taken 'developing into a world-class high-tech enterprise' as its goal, and 'serving the society with science and technology' as its mission. The company, by making use of the strengths of Tsinghua University in research and human resources, has been implementing such strategies as 'technology + capital', 'cooperation and development' and 'branding + internationalization'. With a corporate culture featuring 'action, exploration and excellence; loyalty, responsibility and value', Tongfang has been making explorations and innovations in the industries of information, energy and environment. As of 2013, Tongfang had total assets of more than $5 billion. Its annual revenue was over $3.5 billion. For more information on becoming a sponsor for DC-2016, see http://dcevents.dublincore.org/public/sponsors/Sponsor-16.pdf. Visit the conference website at http://dcevents.dublincore.org/index.php/IntConf/dc-2016/schedConf/.

Posted at 23:59

Dublin Core Metadata Initiative: RFID System Technology Co. Ltd. becomes DC-2016 Gold Sponsor

2016-09-02, DCMI is pleased to announce that Shanghai RFID System Technology Co., Ltd., has become a Gold Sponsor of DC-2016 in Copenhagen. Founded on October 10, 2004, Shanghai RFID System Technology Co., Ltd. is a leading automatic management solution provider for libraries in China. The library of Chenyi College, Jimei University, was the first library in China equipped with an RFID automatic system. In the past 12 years, the company has provided services for more than 400 libraries in China, such as the National Library of China, Shanghai Library, and Hangzhou Library. The services provided include library self-service systems, automatic book sorting systems, digital reading solutions, device monitoring, mini libraries, a cloud and big data platform, knowledge discovery systems, and mobile applications for libraries. For more information on becoming a sponsor for DC-2016, see http://dcevents.dublincore.org/public/sponsors/Sponsor-16.pdf. Visit the conference website at http://dcevents.dublincore.org/index.php/IntConf/dc-2016/schedConf/.

Posted at 23:59

August 31

Leigh Dodds: Help me use your data

I’ve been interviewed a couple of times recently by people interested in understanding how best to publish data to make it useful for others.  Once by a startup and a couple of times by researchers. The core of the discussion has essentially been the same question: “how do you know if a dataset will be useful to you?”

I’ve given essentially the same answer each time. When I’m sifting through dataset descriptions, either in a portal or via a web search, my first stage of filtering involves looking for:

  1. A brief summary of the dataset: e.g. a title and a description
  2. The licence
  3. Some idea of its coverage, e.g. geographic coverage, scope of time series, level of aggregation, etc
  4. Whether it’s in a usable format

Beyond that, there’s a lot more that I’m interested in: the provenance of the data, its timeliness and a variety of quality indicators. But those pieces of information are what I’m looking for right at the start. I’ll happily jump through hoops to massage some data into a better format. But if the licence or coverage isn’t right, then it’s useless to me.

We can frame these as questions:

  1. What is it? (Description)
  2. Can I use it? (Licence)
  3. Will it help answer my question? (in whole, or in part)
  4. How difficult will it be to use? (format, technical characteristics)

It’s frustrating how often these essentials aren’t readily available.
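
For what it’s worth, those four essentials fit into a handful of triples. Here is a hedged sketch of a dataset description using DCAT and Dublin Core terms with rdflib; the dataset URI, licence and values are invented for illustration.

    # Hypothetical sketch: a dataset description answering the four questions above
    # (what is it, can I use it, what does it cover, what format is it in).
    from rdflib import Graph
    from rdflib.namespace import DCTERMS

    description = """
    @prefix dcat: <http://www.w3.org/ns/dcat#> .
    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix ex:   <http://example.org/dataset/> .

    ex:air-quality-readings a dcat:Dataset ;
        dct:title       "Hourly air quality readings" ;                   # 1. what is it
        dct:description "Hourly readings from ten monitoring stations." ;
        dct:license     <https://creativecommons.org/licenses/by/4.0/> ;  # 2. can I use it
        dct:spatial     "Exampleshire, UK" ;                              # 3. coverage
        dct:temporal    "2015-01-01/2016-08-31" ;
        dcat:distribution [ a dcat:Distribution ;
                            dcat:mediaType "text/csv" ] .                 # 4. format
    """

    g = Graph()
    g.parse(data=description, format="turtle")
    for title in g.objects(None, DCTERMS.title):
        print("Found dataset:", title)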

Here’s an example of why this is important.

A weather data example

I’m currently working on a project that needs access to local weather observations. I want openly licensed temperature readings for my local area.

My initial port of call was the

Posted at 18:12

AKSW Group - University of Leipzig: AKSW Colloquium, 05.09.2016. LOD Cloud Statistics, OpenAccess at Leipzig University.

On the upcoming Monday (05.09.2016), the AKSW group will discuss topics related to the Semantic Web and LOD Cloud statistics. We will also have an invited speaker from the University of Leipzig Library (UBL), Dr. Astrid Vieler, talking about Open Access at Leipzig University.

LODStats: The Data Web Census Dataset

by Ivan Ermilov et al.
Presented by: Ivan Ermilov

Abstract: Over the past years, the size of the Data Web has increased significantly, which makes obtaining general insights into its growth and structure both more challenging and more desirable. The lack of such insights hinders important data management tasks such as quality, privacy and coverage analysis. In this paper, we present the LODStats dataset, which provides a comprehensive picture of the current state of a significant part of the Data Web. LODStats is based on RDF datasets from the data.gov, publicdata.eu and datahub.io data catalogs and at the time of writing lists over 9,000 RDF datasets. For each RDF dataset, LODStats collects comprehensive statistics and makes these available adhering to the LDSO vocabulary. This analysis has been regularly published and enhanced over the past five years at the public platform lodstats.aksw.org. We give a comprehensive overview of the resulting dataset.

OpenAccess at Leipzig University

Invited talk by Dr. Astrid Vieler from Leipzig University Library (UBL). The talk will be about Open Access in general and the Open Access Policy of our University in particular. She will tell us more about the rights we have toward publishers, and she will give us advice and hints on how we can increase the visibility of our publications.

After the talks, there is more time for discussion in smaller groups, as well as coffee and cake. The colloquium starts at 3 p.m. and is located on the 7th floor (Leipzig, Augustusplatz 10, Paulinum).

Posted at 09:23

August 28

Bob DuCharme: Converting between MIDI and RDF: readable MIDI and more fun with RDF

Listen to my fun!

Posted at 17:24

August 24

Frederick Giasson: Winnipeg City’s NOW [Data] Portal

Winnipeg City’s NOW (Neighbourhoods Of Winnipeg) Portal is an initiative to create a complete neighbourhood web portal for its citizens. At the core of the project we have a set of about 47 fully linked, integrated and structured datasets of things of interest to Winnipeggers. The focal point of the portal is Winnipeg’s 236 neighbourhoods, which define the main structure of the portal. The portal has six main sections: topics of interest, maps, history, census, images and economic development. The portal is meant to be used by citizens to find things of interest in their neighbourhood, to learn its history, to see images of the things of interest, to find tools to help economic development, etc.

The NOW portal is not new; Structured Dynamics was also its main technical contractor for its first release in 2013. However, we just finished helping the Winnipeg City NOW team migrate their older NOW portal from OSF 1.x to OSF 3.x and from Drupal 6 to Drupal 7; we also trained them on the new system. Major improvements accompany this upgrade, but the user interface design is essentially the same.

The first thing I will do is introduce each major section of the portal and explain its main features. Then I will discuss the new improvements to the portal.

Datasets

A NOW portal user won’t notice any of this, but the main feature of the portal is the data it uses. The portal manages 47 (and growing) fully structured, integrated and linked datasets of things of interest to Winnipeggers. What the portal does is manage entities. Each kind of entity (swimming pools, parks, places, images, addresses, streets, etc.) is defined with multiple properties and values. Several of the entities reference other entities in other datasets (for example, an assessment parcel from the Assessment Parcels dataset references neighbourhood entities and property address entities from their respective datasets).
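
As a hedged illustration of what such cross-dataset references look like in RDF (the vocabulary and URIs below are invented, not the portal’s actual schema), consider:

    # Hypothetical sketch of linked entities across datasets: an assessment parcel
    # referencing a neighbourhood and a property address. URIs and the small
    # vocabulary are invented for illustration.
    from rdflib import Graph

    data = """
    @prefix ex:  <http://example.org/now/> .
    @prefix exo: <http://example.org/now/ontology#> .

    ex:parcel-12345 a exo:AssessmentParcel ;
        exo:assessedValue 245000 ;
        exo:neighbourhood ex:st-johns ;          # reference into the Neighbourhoods dataset
        exo:propertyAddress ex:addr-100-main .   # reference into the Property Addresses dataset

    ex:st-johns a exo:Neighbourhood ;
        exo:name "St. John's" .

    ex:addr-100-main a exo:PropertyAddress ;
        exo:street ex:main-street .
    """

    g = Graph()
    g.parse(data=data, format="turtle")

    # Follow the links: which neighbourhood does each parcel belong to?
    q = """
    PREFIX exo: <http://example.org/now/ontology#>
    SELECT ?parcel ?label WHERE {
      ?parcel a exo:AssessmentParcel ;
              exo:neighbourhood/exo:name ?label .
    }
    """
    for row in g.query(q):
        print(row.parcel, "is in", row.label)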

The fact that these datasets are fully structured and integrated means that we can leverage these characteristics to create a powerful search experience: filtering the information on any of the properties, biasing the searches depending on where a keyword match occurs, etc.

Here is the list of all 47 datasets that currently exist in the portal:

  1. Aboriginal Service Providers
  2. Arenas
  3. Neighbourhoods of Winnipeg City
  4. Streets
  5. Economic Development Images
  6. Recreation & Leisure Images
  7. Neighbourhoods Images
  8. Volunteer Images
  9. Library Images
  10. Parks Images
  11. Census 2006
  12. Census 2001
  13. Winnipeg Internal Websites
  14. Winnipeg External Websites
  15. Heritage Buildings and Resources
  16. NOW Local Content Dataset
  17. Outdoor Swimming Pools
  18. Zoning Parcels
  19. School Divisions
  20. Property Addresses
  21. Wading Pools
  22. Electoral wards of Winnipeg City
  23. Assessment Parcels
  24. Libraries
  25. Community Centres
  26. Police Service Centers
  27. Community Gardens
  28. Leisure Centres
  29. Parks and Open Spaces
  30. Community Committee
  31. Commercial real estates
  32. Sports and Recreation Facilities
  33. Community Characterization Areas
  34. Indoor Swimming Pools
  35. Neighbourhood Clusters
  36. Fire and Paramedic Stations
  37. Bus Stops
  38. Fire and Paramedic Service Images
  39. Animal Services Images
  40. Skateboard Parks
  41. Daycare Nurseries
  42. Indoor Soccer Fields
  43. Schools
  44. Truck Routes
  45. Fire Stations
  46. Paramedic Stations
  47. Spray Parks Pads

Structured Search

The most useful feature of the portal, to me, is its full-text search engine. It is simple, clean and quite effective. The search engine is configured to try to give the most relevant results a NOW portal user may be searching for. For example, it will positively bias results that come from specific datasets, or matches that occur in specific property values. The goal of this biasing is to improve the quality of the returned results. This is relatively easy to do since the context of the portal is well known and we can easily boost the scoring of search results because everything is fully structured.
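
Conceptually, this kind of biasing boils down to boosting a match’s score based on which dataset it comes from and which property it matched in. Here is a small, engine-agnostic sketch; the datasets, properties and weights are invented and are not the portal’s actual configuration.

    # Hypothetical sketch of result biasing: boost matches by dataset and by the
    # property the keyword matched in. Weights and names are invented.
    DATASET_BOOST = {"Libraries": 2.0, "Parks and Open Spaces": 1.5}
    PROPERTY_BOOST = {"name": 3.0, "description": 1.0}

    def score(match, base_score=1.0):
        boost = DATASET_BOOST.get(match["dataset"], 1.0)
        boost *= PROPERTY_BOOST.get(match["property"], 1.0)
        return base_score * boost

    matches = [
        {"entity": "Millennium Library", "dataset": "Libraries", "property": "name"},
        {"entity": "Main Street", "dataset": "Streets", "property": "description"},
    ]
    for m in sorted(matches, key=score, reverse=True):
        print(m["entity"], score(m))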

Another major gain is that all the search results are fully templated. The search results do not simply return a title and some description; they template all the information the system has about the matched results and display the most relevant pieces to the users directly in the search results.

For example, if I search for an indoor swimming pool, in most cases it may be to call the front desk to get some information about the pool. This is why key pieces of information are displayed directly in the search results. That way, most users won’t even have to click on a result to get the information they were looking for; it is right there in the search results page.

Here is an example of a search for the keywords main street. As you can see, you get different kinds of results. Each result is templated to show the core information about these entities. You have the possibility to focus on particular kinds of entities, or to filter by their location in specific neighbourhoods.

[image: now--search-1]

Templated Search Results

Now let’s see some of the kinds of entities that can be searched on the portal and how they are presented to the users.

Here is an example of an assessment parcel that is located in the St. John’s neighbourhood. The address, the value, the type and the location of the parcel on a map are displayed directly in the search results.

Another kind of entity that can be searched is the property address. These are located on a map; the value of the parcel and of the building, and the zoning of the address, are displayed. The property is also linked to its assessment parcel entity, which can be clicked to get additional information about the parcel.

Another interesting type of entity that can be searched are the streets. What is interesting in this case is that you get the complete outline of the street directly on a map. That way you know where it starts and where it ends and where it is located in the city.

There are more than a thousand geo-localized images of all sorts of different things in the city that can be searched. A thumbnail of the image and the location of the thing it depicts appear in the search results.

If you were searching for a nursery for your newborn child, then you can quickly see the name, the location on a map and the phone number of the nursery directly in the search result.

These are just a few examples of the fifty different kinds of entities that can appear like this in the search results.

Mapping

The mapping tool is another powerful feature of the portal. You can search as if you were using the full-text search engine (the top search box on the portal); however, you will only get the results that can be geo-localized on a map. You can also simply browse entities from a dataset, or you can filter entities by their properties/values. You can persist entities you find on the map and save the map for future reference.

The example below shows that someone searched for a street (main street) and then persisted it on the map. Then he searched for other things like nurseries and selected the ones that are near the street he persisted, etc. That way he can visualize the different known entities in the portal on a map to better understand where things are located in the city, what exists near a certain location, within a neighbourhood, etc.

[image: now--map]

Census Analysis

Census information is vital to the good development of a city. It is necessary to understand the trends of a sector, who populates it, etc., such that the city and other organizations may properly plan their projects to have as much impact as possible.

These are some of the reasons why one of the main sections of the site is dedicated to census data. Key census indicators have been configured in the portal. Users can select different kinds of regions (neighbourhood clusters, community areas and electoral wards) to get the numbers for each of these indicators. They can then select several of these regions to compare them with each other. A chart view and a table view are available for presenting the census data.

History, Images & Points of Interest

The City took the time to write the history of each of its neighbourhoods. In addition to that, they hired professional photographers to photograph the points of interest of the city, to geo-localize them and to write a description for each of these photos. Because of this dedication, users of the portal can learn much about the city in general and about the neighbourhood they live in. This is what the History and Images sections of the website are about.

Historic buildings are displayed on a map and they can be browsed from there.

Images of points of interests in the neighbourhood are also located on a map.

Find Your Neighbourhood

Ever wondered which neighbourhood you live in? No problem: go to the home page, put your address into the Find your Neighbourhood section and you will know right away. From there you can learn more about your neighbourhood, like its history, the points of interest, etc.

Your address will be located on a map, and your neighbourhood will be outlined around it. Not only will you know which neighbourhood you live in, but you will also know where you live within it. From there you can click on the name of the neighbourhood to get to the neighbourhood’s page and start learning more about it, like its history, or to see photos of the points of interest that exist in your neighbourhood, etc.
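
Under the hood, this kind of lookup is essentially a point-in-polygon test: the geocoded address is checked against each neighbourhood’s boundary. Here is a hedged sketch using the shapely package; the boundaries and coordinates are invented and the geocoding step is omitted.

    # Hypothetical sketch of a "find your neighbourhood" lookup as a
    # point-in-polygon test. Boundaries and coordinates are invented.
    from shapely.geometry import Point, Polygon

    neighbourhoods = {
        "St. John's": Polygon([(0, 0), (0, 2), (2, 2), (2, 0)]),
        "West End":   Polygon([(2, 0), (2, 2), (4, 2), (4, 0)]),
    }

    def find_neighbourhood(lon, lat):
        address = Point(lon, lat)
        for name, boundary in neighbourhoods.items():
            if boundary.contains(address):
                return name
        return None

    print(find_neighbourhood(1.2, 0.8))   # -> "St. John's"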

Browsing Content by Topic

Because all the content of the portal is fully structured, it is easy to browse its content using a well defined topic structure. The city developed its own ontology that is used to help the users browse the content of the portal by browsing topics of interest. In the example below, I clicked the Economic Development node and then the Land use topic. Finally I clicked the Map button to display things that are related to land use: in this case, zoning and assessment parcels are displayed to the user.

This is another way to find meaningful and interesting content from the portal.

Depending on the topic you choose, and the kind of information related to that topic, you may end up with different options like a map, a list of links to documents related to that topic, etc.

Export Content

Now that I have given an overview of each of the main features of the portal, let’s go back to the geeky things. The first thing I said about this portal is that at its core, all the information it manages is fully structured, integrated and linked data. If you get to the page of an entity, you have the possibility to see the underlying data that exists about it in the system. You simply have to click the Export tab at the top of the entity’s page. You will then have access to the description of that entity in multiple different formats.
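
As a hedged sketch of what “multiple different formats” means in practice, the same entity description can be re-serialized with rdflib; the entity and vocabulary below are invented for illustration.

    # Hypothetical sketch: one entity description, several serializations.
    from rdflib import Graph

    entity = """
    @prefix ex:  <http://example.org/now/> .
    @prefix exo: <http://example.org/now/ontology#> .

    ex:sherbrook-pool a exo:IndoorSwimmingPool ;
        exo:name "Sherbrook Pool" ;
        exo:phone "311" .
    """

    g = Graph()
    g.parse(data=entity, format="turtle")

    for fmt in ("turtle", "xml", "nt"):    # Turtle, RDF/XML, N-Triples
        print("----", fmt)
        print(g.serialize(format=fmt))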

In the future, the City should (or at least I hope will) make the whole set of datasets fully downloadable. Right now you only have access to that information via that per-entity export feature. I say “hope” because this NOW portal is fully disconnected from another initiative by the city: data.winnipeg.ca, which uses Socrata. The problem is that barely any of the datasets from NOW are available on data.winnipeg.ca, and the ones that do appear are the raw ones (semi-structured, undocumented, unintegrated and unlinked): none of the normalization work, the integration work and the linkage work done by the NOW team has been leveraged to really improve the data.winnipeg.ca dataset catalog.

New with the upgrades

Those who are familiar with the NOW portal will notice a few changes. The user interface did not change that much, but multiple little things got improved in the process. I will cover the most notable of these changes.

The major changes happened in the backend of the portal. The data management in OSF for Drupal 7 is incompatible with what was available in Drupal 6. The management of the entities became easier, and the configuration of OSF networks became a breeze. A revisioning system has been added, the user interface is more intuitive, etc. There is no comparison possible. However, portal users won’t notice any of this, since these are all site administrator functions.

The first thing that users will notice is the completely new full-text search engine. The underlying search engine is almost the same, but the presentation is far better. Every entity type has gotten its own special template, which is displayed in a specific way in the search results. Most of the time results should be much more relevant, and filtering is easier and cleaner. The search experience is much better in my view.

The overall site performance is much better since different caching strategies have been put in place in OSF 3.x and OSF for Drupal. This means that most of the features of the portal should react more swiftly.

Now every type of entity managed by the portal is templated: its web page is templated in a specific way to optimize the information it conveys to users, along with its search result “mini page” when it gets returned as the result of a search query.

Multilinguality is now fully supported by the portal; however, not everything is currently templated. Expect a fully translated French NOW portal in the future.

Creating a Network of Portals

One of the most interesting features that comes with this upgrade is that the NOW portal is now in a position to participate in a network of OSF instances. What does that mean? Well, it means that the NOW portal could create partnerships with other local (regional, national or international) organizations to share datasets (and their maintenance costs).

Are there other organizations that use this kind of system? Well, there is at least one other right in Winnipeg City: MyPeg.ca, also developed by Structured Dynamics. MyPeg uses RDF to model its information and uses OSF to manage it. MyPeg is a non-profit organization that uses census (and other indicator) data to do studies on the well-being of Winnipeggers. The team behind MyPeg.ca are research experts in indicator data. Their indicator datasets (which include census data) are top notch.

Let’s hypothesize that there is interest between the two groups in collaborating. Let’s say that the NOW portal would like to use MyPeg’s census datasets instead of its own, since they are more complete and accurate and include a larger number of important indicators. What they basically want is to outsource the creation and maintenance of the census/indicator data to a local, dedicated and highly professional organization. The only things they would need to do are:

  1. Formalize their relationship by signing a usage agreement
  2. The NOW portal would need to configure the MyPeg.ca OSF network into their OSF for Drupal instance
  3. The NOW portal would need to register the datasets it wants to use from MyPeg.ca.

Once these 3 steps are done, taking no more than a couple of minutes, the system administrators of the NOW portal could start using the MyPeg.ca indicator datasets as if they existed on their own network. (The reverse could also be true for MyPeg.) Everything would be transparent to them. From then on, all the fixes and updates performed by MyPeg.ca on their indicator datasets would immediately appear on the NOW portal and be accessible to its users.

This is one possibility for collaboration. Another possibility would be simply to share the serialized datasets on a routine basis (every month, every 6 months, every year) such that the NOW portal re-imports the datasets from the files shared by MyPeg.ca. This is also possible since both organizations use the same ontology to describe the indicator data. This means that no modification is required by the City to take that new information into account; they only have to import and update their local datasets. This is the beauty of ontologies.
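
Because both sides would use the same ontology, re-importing a shared dump really is just a parse per file. A hedged sketch with rdflib (the file names are placeholders, not real MyPeg.ca resources):

    # Hypothetical sketch of the "share serialized datasets" option: the local
    # graph simply re-parses the dump files shared by the partner organization.
    from rdflib import Graph

    local = Graph()
    local.parse("now_census.ttl", format="turtle")         # existing local copy (placeholder file)
    local.parse("mypeg_census_dump.ttl", format="turtle")  # freshly shared dump (placeholder file)

    # Both datasets follow the same ontology, so the merged graph can be queried
    # exactly as before, with no remodelling step.
    print(len(local), "triples after the update")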

Conclusion

The new NOW portal is a great service for citizens of Winnipeg City. It is also a really good example of a web portal that leverages fully structured, integrated and linked data. To me, the NOW portal is a really good example of the features that should go along with a municipal data portal.

Posted at 17:33

August 16

Dublin Core Metadata Initiative: DC-2016 final program published

2016-08-16, DCMI is pleased to announce publication of the final program for DC-2016. The program consists of an array of presentations, lightning talks, papers, project reports, posters, special sessions, and workshops. To review the program, visit the Program Page at http://dcevents.dublincore.org/IntConf/dc-2016/schedConf/program where titles link to abstracts. Registration is open at http://dcevents.dublincore.org/IntConf/index/pages/view/reg16 with early rates available through 2 September 2016. Significant registration savings are available for DCMI members or ASIST members wishing to attend the collocated DC-2016 and ASIST conferences. The ASIST program is available at https://www.asist.org/events/annual-meeting/annual-meeting-2016/program/ and with seminars and workshops at https://www.asist.org/events/annual-meeting/annual-meeting-2016/seminars-and-workshops/.

Posted at 23:59

Dublin Core Metadata Initiative: Mike Lauruhn appointed to DCMI Governing Board

2016-08-16, DCMI is pleased to announce that Mike Lauruhn has accepted an appointment to the DCMI Governing Board as an Independent Member. Mike is Technology Research Director at Elsevier Labs and has been a longstanding participant in DCMI, serving as a member of the Education and Outreach Committee's "Linked Data for Professional Education (LD4PE) initiative" and as co-Program Chair for the DCMI annual conference in 2010. Before joining Elsevier Labs in 2010, he was a consultant with Taxonomy Strategies LLC working with numerous private companies, nonprofits, and government agencies to help define and implement taxonomies and metadata schemas. He began his library career cataloging at the California Newspaper Project at the Center for Bibliographic Studies & Research at the University of California, Riverside. Mike's three year term will begin at the close of DC-2016 in Copenhagen. For additional information, visit the Governing Board page at http://dublincore.org/about/oversight/#lauruhn.

Posted at 23:59

Semantic Web Company (Austria): Attend and contribute to the SEMANTiCS 2016 in Leipzig

The 12th edition of SEMANTiCS, a well-known platform for professionals and researchers who make semantic computing work, will be held in the city of Leipzig from September 12th till 15th. We are proud to announce the final program of the SEMANTiCS conference. The program will cover 6 keynote speakers, 40 industry presentations, 30 scientific paper presentations, 40 poster & demo presentations and a huge number of satellite events. Special talks will be given by Thomas Vavra from IDC and Sören Auer, who will feature the LEDS track. On top of that there will be a fishbowl session ‘Knowledge Graphs – A Status Update’ with lightning talks from Hans Uszkoreit (DFKI) and Andreas Blumauer (SWC). This week, the set of our distinguished keynote speakers has been fixed and we are quite excited to have them at this year’s edition of SEMANTiCS. Please join us to listen to talks from representatives of IBM, Siemens, Springer Nature, Wikidata, International Data Corporation (IDC), Fraunhofer IAIS, Oxford University Press and the Hasso-Plattner-Institut, who will share their latest insights on applications of Semantic technologies with us. To register and be part of SEMANTiCS 2016 in Leipzig, please go to: http://2016.semantics.cc/registration.

Share your ideas, tools and ontologies, last minute submissions
Meetup: Big Data & Linked Data – The Best of Both Worlds  

On the first eve of the SEMANTiCS conference we will discuss how Big Data & Linked Data technologies could become a perfect match. This meetup gathers experts on Big and Linked Data to discuss the future agenda on research and implementation of a joint technology development.

  • Register (free)

  • If you are interested to present your idea, approach or project which links Semantic technologies with Big Data in an ad-hoc lightning talk, please get in touch with Thomas Thurner (t.thurner@semantic-web.at).

WORKSHOPS/TUTORIALS

This year’s SEMANTiCS is starting on September 12th with a full day of exciting and interesting satellite events. In 6 parallel tracks scientific and industrial workshops and tutorials are scheduled to provide a forum for groups of researchers and practitioners to discuss and learn about hot topics in Semantic Web research.

How to find users and feedback for your vocabulary or ontology?

The Vocabulary Carnival is a unique opportunity for vocabulary publishers to showcase and share their work in the form of a poster and a short presentation, and to meet the growing community of vocabulary publishers and users to build useful semantic, technical and social links. You can join the Carnival Minute Madness on the 13th of September.

How to submit to ELDC?

The European Linked Data Contest awards prizes to stories, products, projects or persons presenting novel and innovative projects, products and industry implementations involving linked data. The ELDC is more than yet another competition. We envisage building a directory of the best European projects in the domain of Linked Data and the Semantic Web. This year the ELDC is awarded in the categories Linked Enterprise Data and Linked Open Data, with €1.500,- for each of the winners. Submission deadline is August 31, 2016.

7th DBpedia Community Meeting in Leipzig 2016

Co-located with SEMANTiCS, the next DBpedia meeting will be held in Leipzig on September 15th. Experts will speak about topics such as Wikidata: bringing structured data to Wikipedia with 16.000 volunteers. The 7th edition of this event covers a DBpedia showcase session, breakout sessions and a DBpedia Association meeting where we will discuss new strategies and which directions are important for DBpedia. If you would like to become part of the DBpedia community and present your ideas, please submit your proposal or check our meeting website: http://wiki.dbpedia.org/meetings/Leipzig2016

Sponsorship opportunities

We would be delighted to welcome new sponsors for SEMANTiCS 2016. You will find a number of sponsorship packages with an indication of benefits and prices here: http://semantics.cc/sponsorship-packages.

Special offer: You can buy a special SEMANTiCS industry ticket for €400 which includes a poster presentation at our marketplace. So take the opportunity to increase the visibility of your company, organisation or project among an international and high impact community. If you are interested, please contact us via email to semantics2016@fu-confirm.de.  

Posted at 15:58

August 15

Semantic Web Company (Austria): Introducing a Graph-based Semantic Layer in Enterprises

Things, not Strings
Entity-centric views on enterprise information and all kinds of data sources provide means to get a more meaningful picture about all sorts of business objects. This method of information processing is as relevant to customers, citizens, or patients as it is to knowledge workers like lawyers, doctors, or researchers. People actually do not search for documents, but rather for facts and other chunks of information to bundle them up to provide answers to concrete questions.

Strings, or names for things, are not the same as the things they refer to. Still, those two aspects of an entity get mixed up regularly, nurturing a Babylonian language confusion. Any search term can refer to different things, which is why Google, too, has rolled out its own knowledge graph to help organize information on the web at a large scale.

Semantic graphs can build the backbone of any information architecture, not only on the web. They can enable entity-centric views also on enterprise information and data. Such graphs of things contain information about business objects (such as products, suppliers, employees, locations, research topics, …), their different names, and relations to each other. Information about entities can be found in structured (relational databases), semi-structured (XML), and unstructured (text) data objects. Nevertheless, people are not interested in containers but in entities themselves, so they need to be extracted and organized in a reasonable way.

Machines and algorithms make use of semantic graphs to retrieve not only simply the objects themselves but also the relations that can be found between the business objects, even if they are not explicitly stated. As a result, ‘knowledge lenses’ are delivered that help users to better understand the underlying meaning of business objects when put into a specific context.

Personalization of information
The ability to take a view on entities or business objects in different ways when put into various contexts is key for many knowledge workers. For example, drugs have regulatory aspects, a therapeutic character, and yet another meaning to product managers or sales people. One benefits quickly when confronted only with those aspects of an entity that are really relevant in a given situation. This rather personalized information processing places heavy demands on a semantic layer on top of the data layer, especially when information is stored in various forms and scattered around different repositories.

Understanding and modelling the meaning of content assets and of interest profiles of users are based on the very same methodology. In both cases, semantic graphs are used, and also the linking of various types of business objects works the same way.

Recommender engines based on semantic graphs can link similar contents or documents that are related to each other in a highly precise manner. The same algorithms help to link users to content assets or products. This approach is the basis for ‘push-services’ that try to ‘understand’ users’ needs in a highly sophisticated way.

‘Not only MetaData’ Architecture
Together with the data and content layer and its corresponding metadata, this approach unfolds into a four-layered information architecture as depicted here.

Following the NoSQL paradigm, which is about ‘Not only SQL’, one could call this content architecture ‘Not only Metadata’, thus ‘NoMeDa’ architecture. It stresses the importance of the semantic layer on top of all kinds of data. Semantics is no longer buried in data silos but rather linked to the metadata of the underlying data assets. Therefore it helps to ‘harmonize’ different metadata schemes and various vocabularies. It makes the semantics of metadata, and of data in general, explicitly available. While metadata most often is stored per data source, and therefore not linked to each other, the semantic layer is no longer embedded in databases. It reflects the common sense of a certain domain and through its graph-like structure it can serve directly to fulfill several complex tasks in information management:

  • Knowledge discovery, search and analytics
  • Information and data linking
  • Recommendation and personalization of information
  • Data visualization

Graph-based Data Modelling
Graph-based semantic models resemble the way human beings tend to build their own models of the world. Any person, not only subject matter experts, organizes information by at least the following six principles (a small RDF sketch of these principles follows the list):

  1. Draw a distinction between all kinds of things: ‘This thing is not that thing’
  2. Give things names: ‘This thing is my dog Goofy’ (some might call it Dippy Dawg, but it’s still the same thing)
  3. Categorize things: ‘This thing is a dog but not a cat’
  4. Create general facts and relate categories to each other: ‘Dogs don’t like cats’
  5. Create specific facts and relate things to each other: ‘Goofy is a friend of Donald’, ‘Donald is the uncle of Huey, Dewey, and Louie’, etc.
  6. Use various languages for this; e.g. the above-mentioned fact in German is ‘Donald ist der Onkel von Tick, Trick und Track’ (remember: the thing called ‘Huey’ is the same thing as the thing called ‘Tick’ – it’s just the name or label for this thing that is different in different languages).
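
Here is the promised sketch: a few of the facts above expressed in RDF with rdflib. The URIs and the tiny vocabulary are invented for illustration.

    # A small, hypothetical sketch of the six principles in RDF.
    from rdflib import Graph, Namespace, Literal
    from rdflib.namespace import RDF, RDFS

    EX = Namespace("http://example.org/")
    g = Graph()

    # 1. & 2. distinct things, given names (labels)
    g.add((EX.Goofy, RDFS.label, Literal("Goofy", lang="en")))
    # 6. ... in several languages: same thing, different labels
    g.add((EX.Huey, RDFS.label, Literal("Huey", lang="en")))
    g.add((EX.Huey, RDFS.label, Literal("Tick", lang="de")))

    # 3. categorize things
    g.add((EX.Goofy, RDF.type, EX.Dog))
    # 4. a general fact relating categories (simplified here as a direct link)
    g.add((EX.Dog, EX.dislikes, EX.Cat))
    # 5. specific facts relating things to each other
    g.add((EX.Goofy, EX.friendOf, EX.Donald))
    g.add((EX.Donald, EX.uncleOf, EX.Huey))

    for s, p, o in g:
        print(s, p, o)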

These fundamental principles for the organization of information are well reflected by semantic knowledge graphs. The same information could be stored as XML, or in a relational database, but it’s more efficient to use graph databases instead for the following reasons:

  • The way people think fits well with information that is modelled and stored when using graphs; little or no translation is necessary.
  • Graphs serve as a universal meta-language to link information from structured and unstructured data.
  • Graphs open up doors to a better aligned data management throughout larger organizations.
  • Graph-based semantic models can also be understood by subject matter experts, who are actually the experts in a certain domain.
  • The search capabilities provided by graphs let you find out unknown linkages or even non-obvious patterns to give you new insights into your data.
  • For semantic graph databases, there is a standardized query language called SPARQL that allows you to explore data.
  • In contrast to traditional ways of querying databases, where knowledge about the database schema/content is necessary, SPARQL allows you to ask "tell me what is there" (as the sketch after this list shows).
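
A hedged example of such a "tell me what is there" query, run with rdflib over a tiny invented graph: without knowing anything about the schema, it lists which classes occur and how often.

    # Hypothetical "tell me what is there" query over an unknown graph.
    from rdflib import Graph

    data = """
    @prefix ex: <http://example.org/> .
    ex:Goofy  a ex:Dog .
    ex:Pluto  a ex:Dog .
    ex:Donald a ex:Duck .
    """

    g = Graph()
    g.parse(data=data, format="turtle")

    q = """
    SELECT ?cls (COUNT(?s) AS ?instances) WHERE {
      ?s a ?cls .
    } GROUP BY ?cls ORDER BY DESC(?instances)
    """
    for row in g.query(q):
        print(row.cls, row.instances)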

Standards-based Semantics
Making the semantics of data and metadata explicit is even more powerful when based on standards. A framework for this purpose has evolved over the past 15 years at the W3C, the World Wide Web Consortium. Although initially designed to be used on the World Wide Web, this stack of standards has been adopted by many enterprises for Enterprise Information Management. They now benefit from being able to integrate and link data from internal and external sources at relatively low cost.

At the base of all those standards, the Resource Description Framework (RDF) serves as a ‘lingua franca’ to express all kinds of facts that can involve virtually any kind of category or entity, and also all kinds of relations. RDF can be used to describe the semantics of unstructured text, XML documents, or even relational databases. The Simple Knowledge Organization System (SKOS) is based on RDF. SKOS is widely used to describe taxonomies and other types of controlled vocabularies. SPARQL can be used to traverse and make queries over graphs based on RDF or standard schemes like SKOS.

With SPARQL, far more complex queries can be executed than with most other database query languages. For instance, hierarchies can be traversed and aggregated recursively: a geographical taxonomy can be used to find all documents about places in a certain region, even though the region itself is never mentioned explicitly in those documents.
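
In SPARQL, this kind of recursive traversal is written with property paths. Continuing the invented geographic taxonomy above, and assuming documents are tagged with concepts via dct:subject (e.g. ex:doc1 dct:subject ex:Copenhagen), the following query finds every document about a place anywhere below Europe, even though ‘Europe’ never appears in the documents themselves:

    # Recursive hierarchy traversal with a SPARQL property path (skos:broader+).
    # 'g' is an rdflib Graph holding the taxonomy above plus document triples
    # such as: ex:doc1 dct:subject ex:Copenhagen .
    query = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX dct:  <http://purl.org/dc/terms/>
    PREFIX ex:   <http://example.org/geo/>

    SELECT DISTINCT ?doc
    WHERE {
      ?doc   dct:subject   ?place .
      ?place skos:broader+ ex:Europe .
    }
    """

    for row in g.query(query):
        print(row[0])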

Standards-based semantics also makes it possible to reuse already existing knowledge graphs. Many government organisations have published high-quality taxonomies and semantic graphs using semantic web standards. These can easily be picked up and extended with an organisation’s own data and domain-specific knowledge.
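
A common way to pick up such an existing graph is to load it next to one’s own vocabulary and connect the two with SKOS mapping properties. The sketch below is purely illustrative: both the file name and the external taxonomy URL are placeholders, not real datasets:

    # Reusing an externally published SKOS taxonomy and mapping an internal
    # concept to it. All names and URLs are hypothetical placeholders.
    from rdflib import Graph, Namespace, URIRef
    from rdflib.namespace import SKOS

    EX = Namespace("http://example.org/vocab/")
    g = Graph()
    g.parse("internal-vocabulary.ttl", format="turtle")   # our own SKOS scheme
    g.parse("https://data.example.gov/taxonomy.rdf")      # published taxonomy

    # State that the internal concept means the same as the external one
    external = URIRef("https://data.example.gov/taxonomy/energy-policy")
    g.add((EX.EnergyPolicy, SKOS.exactMatch, external))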

Semantic Knowledge Graphs will grow with your needs!
Standards-based semantics provides yet another advantage: it is becoming increasingly easy to hire skilled people who have worked with standards like RDF, SKOS or SPARQL before. Even so, experienced knowledge engineers and data scientists remain a comparatively rare species, so it is crucial to grow graphs and modelling skills gradually. Starting with SKOS and extending an enterprise knowledge graph step by step, by introducing additional schemas and by mapping to other vocabularies and datasets, is a well-established agile procedure model.

A graph-based semantic layer in an enterprise can be expanded step by step, just like any other network. As with a street network, start with the main roads, then introduce more and more connecting roads, and classify streets, places, and intersections with an increasingly fine-grained classification system. The result is an evolving semantic graph that serves more and more as a map of your data, content and knowledge assets.

Semantic Knowledge Graphs and your Content Architecture
Semantics serves as a kind of glue between unstructured and structured information and as a foundation layer for data integration efforts. But even for enterprises dealing mainly with documents and text-based assets, semantic knowledge graphs do a great job.

Semantic graphs extend the functionality of a traditional search index. They don’t simply annotate documents and store occurrences of terms and phrases; they introduce concept-based indexing in contrast to term-based approaches. Remember: semantics helps to identify the things behind the strings. The same applies to concept-based search over content repositories: documents are linked to the semantic layer, so the knowledge graph can be used not only for typical retrieval but also to classify, aggregate, filter, and traverse the content of documents.
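
As a rough sketch of what this enables, and assuming documents have been linked to concepts via dct:subject, a single SPARQL query can then aggregate an entire repository along the knowledge graph, for example counting documents per top-level topic (all URIs invented):

    # Concept-based aggregation: count documents per top-level topic by walking
    # the SKOS hierarchy. 'g' holds both the taxonomy and the document links.
    query = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX dct:  <http://purl.org/dc/terms/>

    SELECT ?topic (COUNT(DISTINCT ?doc) AS ?docs)
    WHERE {
      ?doc     dct:subject   ?concept .
      ?concept skos:broader* ?topic .
      FILTER NOT EXISTS { ?topic skos:broader ?parent }   # top-level topics only
    }
    GROUP BY ?topic
    ORDER BY DESC(?docs)
    """

    for row in g.query(query):
        print(row[0], row[1])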

PoolParty combines Machine Learning with Human Intelligence

Semantic knowledge graphs have the potential to innovate data and information management in any organisation. Besides questions of integration, it is crucial to develop strategies to create and sustain the semantic layer efficiently.

The semantic technologies that can be used for this endeavour range from fully manual to fully automated approaches. The promise of deriving high-quality semantic graphs from documents fully automatically has not been fulfilled to date. On the other hand, purely handcrafted semantics is error-prone, incomplete, and too expensive. The best solution often lies in a combination of approaches. PoolParty combines Machine Learning with Human Intelligence: extensive corpus analysis and corpus learning support taxonomists, knowledge engineers and subject matter experts in the maintenance and quality assurance of semantic knowledge graphs and controlled vocabularies. As a result, enterprise knowledge graphs are more complete, up to date, and consistently used.

“An Enterprise without a Semantic Layer is like a Country without a Map.”

Posted at 13:34

August 10

AKSW Group - University of Leipzig: AKSW Colloquium, 15th August, 3pm, RDF query relaxation

On the 15th of August at 3 PM, Michael Röder will present the paper “RDF Query Relaxation Strategies Based on Failure Causes” by Fokou et al. in P702.

Abstract

Recent advances in Web-information extraction have led to the creation of several large Knowledge Bases (KBs). Querying these KBs often results in empty answers that do not serve the users’ needs. Relaxation of the failing queries is one of the cooperative techniques used to retrieve alternative results. Most of the previous work on RDF query relaxation computes a set of relaxed queries and executes them in a similarity-based ranking order. Thus, these approaches relax an RDF query without knowing its failure causes (FCs). In this paper, we study the idea of identifying these FCs to speed up the query relaxation process. We propose three relaxation strategies based on various information levels about the FCs of the user query and of its relaxed queries as well. A set of experiments conducted on the LUBM benchmark shows the impact of our proposal in comparison with a state-of-the-art algorithm.

The paper is available at ResearchGate.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted at 09:03

August 09

schema.org: schema.org update: hotels, datasets, "health-lifesci" and "pending" extensions...

Schema.org 3.1 has been released! Many thanks to everyone in the community who has contributed to this update, which includes substantial new vocabulary for describing hotels and accommodation, some improvements around dataset description, as well as the usual collection of new examples, bugfixes, usability, infrastructural, standards compatibility and conceptual consistency improvements.

This release builds upon the recent 3.0 release. In version 3.0 we created a health-lifesci extension as a new home for the extensive collection of medical/health terms that were introduced back in 2012. Publishers and webmasters do not need to update their markup for this change, it is best considered an improvement to the structure of our documentation. Our extension system allows us to provide deeper coverage of specialist topics without cluttering the core project pages. Version 3.0 also included some improvements from the FIBO project, improving our representation of various financial products.

We have also introduced a special extension called "pending", which provides a place for newly proposed schema.org terms to be documented, tested and revised. We hope that this will help schema proposals get wider visibility and review, supporting greater participation from non-developer collaborators. You should not need to be a computer programmer to be part of our project, and "pending" is one step towards making work-in-progress schema proposals more visible without requiring knowledge of highly technical systems like GitHub. We have linked each term in pending.schema.org to the technical discussions at Github, but also to a simple feedback form. We anticipate updating the "pending" area relatively frequently, in between formal releases.

The site also features a new "how we work" document, oriented towards the Web standards community and toolmakers, explaining the evolving process we have adopted towards creating new and improved schemas. See also commentary on this in the UK government technology blog post about making job adverts more open with schema.org.

Many people were involved in these updates, but particular thanks are due to Martin Hepp for leading the hotels/accommodation design, and to Marc Twagirumukiza for chairing the "schemed" W3C community group that led the creation of our new health-lifesci extension.

Finally, we would like to dedicate this release to Peter Mika, who has served on our steering group since the early days. Peter has stepped down as Yahoo's representative, passing his duties to Nicolas Torzec. Thanks, Peter! Welcome, Nicolas...

For more details on version 3.1 of schema.org, check out the release notes.

Posted at 17:47

August 04

Semantic Web Company (Austria): PoolParty Academy is opening in September 2016

PoolParty Academy offers three E-Learning tracks that enable customers, partners and individual professionals to learn Semantic Web technologies and PoolParty Semantic Suite in particular.

You can pre-register for the PoolParty Academy training tracks at the academy’s website or join our live class-room at the biggest European industrial Semantic Web conference – SEMANTiCS 2016.

read more

Posted at 07:16

August 02

AKSW Group - University of Leipzig: Article accepted in Journal of Web Semantics

We are happy to announce that the article “DL-Learner – A Framework for Inductive Learning on the Semantic Web” by Lorenz Bühmann, Jens Lehmann and Patrick Westphal was accepted for publication in the Journal of Web Semantics: Science, Services and Agents on the World Wide Web.

Abstract:

In this system paper, we describe the DL-Learner framework, which supports supervised machine learning using OWL and RDF for background knowledge representation. It can be beneficial in various data and schema analysis tasks with applications in different standard machine learning scenarios, e.g. in the life sciences, as well as Semantic Web specific applications such as ontology learning and enrichment. Since its creation in 2007, it has become the main OWL and RDF-based software framework for supervised structured machine learning; it includes several algorithm implementations and usage examples, and has applications building on top of the framework. The article gives an overview of the framework with a focus on algorithms and use cases.

Posted at 07:54

August 01

Dublin Core Metadata Initiative: DC-2016 Preliminary Program Announced

2016-08-01, DCMI is pleased to announce the publication of the Preliminary Program for DC-2016 at http://dcevents.dublincore.org/IntConf/dc-2016/schedConf/program. The program includes 28 Full Papers, Project Reports, and Presentations on Metadata as well as 14 Posters. Six Special Sessions on significant metadata topics as well as 6 half- and full-day Workshops round out the program. The keynote will be delivered by Bradley P. Allen, Chief Architect at Elsevier, the world's leading scientific publisher. The Final Program including the authors and abstracts of Papers, Project Reports, Posters, and Presentations on Metadata will be available on 15 August. Registration is now open at http://dcevents.dublincore.org/IntConf/index/pages/view/reg16 with an early registration rate available through 2 September 2016.

Posted at 23:59

July 31

Bob DuCharme: SPARQL in a Jupyter (a.k.a. IPython) notebook

With just a bit of Python to frame it all.

Posted at 15:15

July 23

Leigh Dodds: Reputation data portability

Yesterday I went to the

Posted at 11:24

July 18

AKSW Group - University of Leipzig: AKSW Colloquium, 18.07.2016, AEGLE and node2vec

On Monday 18.07.2016, Kleanthi Georgala will give her Colloquium presentation for her paper “An Efficient Approach for the Generation of Allen Relations”, which was accepted at the European Conference on Artificial Intelligence (ECAI) 2016.

Abstract

Event data is increasingly being represented according to the Linked Data principles. The need for large-scale machine learning on data represented in this format has thus led to the need for efficient approaches to compute RDF links between resources based on their temporal properties. Time-efficient approaches for computing links between RDF resources have been developed over the last years. However, dedicated approaches for linking resources based on temporal relations have been paid little attention to. In this paper, we address this research gap by presenting AEGLE, a novel approach for the efficient computation of links between events according to Allen’s interval algebra. We study Allen’s relations and show that we can reduce all thirteen relations to eight simpler relations. We then present an efficient algorithm with a complexity of O(n log n) for computing these eight relations. Our evaluation of the runtime of our algorithms shows that we outperform the state of the art by up to 4 orders of magnitude while maintaining a precision and a recall of 1.
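
For readers unfamiliar with Allen’s interval algebra: the thirteen relations between two intervals are fully determined by comparing their start and end points. The following sketch is a plain pairwise classifier written only to illustrate the standard definitions; it is not the AEGLE algorithm, nor its reduction to eight relations described in the abstract:

    # Naive classifier for Allen's thirteen interval relations (illustration
    # only, NOT the AEGLE approach). Intervals are (start, end) with start < end.
    def allen_relation(a, b):
        (a1, a2), (b1, b2) = a, b
        if a2 < b1:  return "before"
        if b2 < a1:  return "after"
        if a2 == b1: return "meets"
        if b2 == a1: return "met-by"
        if a1 == b1 and a2 == b2: return "equals"
        if a1 == b1: return "starts" if a2 < b2 else "started-by"
        if a2 == b2: return "finishes" if a1 > b1 else "finished-by"
        if b1 < a1 and a2 < b2: return "during"
        if a1 < b1 and b2 < a2: return "contains"
        return "overlaps" if a1 < b1 else "overlapped-by"

    print(allen_relation((1, 3), (2, 5)))   # overlaps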

Afterwards, Tommaso Soru will present a paper considered the latest chapter of the Everything-2-Vec saga, which encompasses outstanding works such as Word2Vec and Doc2Vec. The paper, “node2vec: Scalable Feature Learning for Networks” [PDF] by Aditya Grover and Jure Leskovec, was accepted for publication at the International Conference on Knowledge Discovery and Data Mining (KDD), 2016 edition.

Posted at 12:56

July 11

Dublin Core Metadata Initiative: FINAL call for DC-2016 Presentations and Best Practice Posters/Demos

2016-07-11, The submission deadline of 15 July is rapidly approaching for the Presentations and Best Practice Posters and Demos tracks at DC-2016. Both presentations and posters/demos provide the opportunity to practitioners and researchers specializing in metadata design, implementation, and use to present their work at the International Conference on Dublin Core and Metadata Applications in Copenhagen. No paper is required for presentations or posters/demos. Accepted submissions in the Presentations track will have approximately 20-25 minutes to present and 5-10 minutes for questions and discussion. Proposal abstracts will be reviewed for selection by the Program Committee. The presentation slide decks and the poster images will be openly available as part of the permanent record of the DC-2016 conference. If you are interested in presenting at DC-2016, please submit a proposal abstract through the DC-2016 submission system before the 15 July deadline at http://dcevents.dublincore.org/index.php/IntConf/dc-2016/schedConf/cfp. For a fuller description of the Presentations track, see http://dcevents.dublincore.org/IntConf/index/pages/view/pre16.

Posted at 23:59

Copyright of the postings is owned by the original blog authors. Contact us.