It's triples all the way down
In the Huffington Post, Steve Hamby, the Chief Technology Officer of Orbis Technologies, Inc., writes an evaluation of why he thinks 2012 will be the year of the Semantic Web. He includes a report on the use of RDFa and says “Was the strategy successful? … I would say yes.”
Posted at 09:05
Posted at 14:01
I had a great time reading a paper on Semantic Search
Posted at 16:53
Posted at 17:10
Open Sahara Server, the open source semantic web server, includes a pluggable API for adding smart search, query and analysis functionality. Open Sahara can be used to analyse and annotate documents from the Internet or intranets. The resulting knowledge can be queried, and Open Sahara can present the query results as feeds, charts, or even overlays on a map. This article will show how to create map overlays.
We will use the OpenLayers JavaScript library to display a map, and use the Open Sahara REST-based query API to show matching documents as a cluster layer on top of that map. Such a cluster layer is a nice way to show which areas of a region have many matches and which areas have few matches. Knowledge of the OpenLayers JavaScript library is assumed. The article demonstrates how to use the Open Sahara REST API to represent search results as a map cluster layer.
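To make the cluster-layer idea a little more concrete, here is a minimal Python sketch of the grouping step behind such a layer. It is not Open Sahara's or OpenLayers' actual algorithm: the function name cluster_points, the one-degree cell size and the sample coordinates are all assumptions made for illustration.

```python
from collections import defaultdict
import math


def cluster_points(points, cell_size=1.0):
    """Group (lon, lat) points into square grid cells of `cell_size` degrees.

    Returns a dict mapping each cell's lower-left corner to the number of
    points inside it. Hypothetical helper, for illustration only.
    """
    counts = defaultdict(int)
    for lon, lat in points:
        cell = (math.floor(lon / cell_size) * cell_size,
                math.floor(lat / cell_size) * cell_size)
        counts[cell] += 1
    return dict(counts)


# Made-up document locations: two close together, one further away.
matches = [(4.89, 52.37), (4.90, 52.38), (2.35, 48.86)]
clusters = cluster_points(matches)
```

Each cell count could then be drawn as one marker whose size reflects the number of matches, which is roughly what a cluster layer shows.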
Posted at 09:21
[http://purl.org/stuff/true and http://purl.org/stuff/false]
I'm sure they already had URIs somewhere before (http://dbpedia.org/resource/True is nearly there...) but it seemed a nice idea to give them some solid (?) semantics too, fortunately there's at least one media type available - so it's "true as in Javascript". Took a few minutes to set up to give that media type. Tried PHP first but it doesn't seem properly configured on this server (which is weird, I'm sure I've got PHP stuff live). Anyhow, Apache2 config for hyperdata included mod_python from who-knows-when, so I used that.
true.py is:
from mod_python import apache

def index(req):
    # Serve the literal with a JavaScript media type
    req.content_type = "application/javascript"
    return "true"
with .htaccess (same dir) as:
RewriteRule ^true$ true.py
- plus corresponding stuff for false.py.
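mod_python is no longer maintained, so for anyone trying this today a rough equivalent is sketched below as a plain WSGI application. This is not the setup described above: the path handling and the 404 fallback are my assumptions, and a single app serves both true and false.

```python
# Map the last path segment to the JavaScript literal it should return.
LITERALS = {"true": b"true", "false": b"false"}


def application(environ, start_response):
    """Serve 'true' or 'false' with a JavaScript media type."""
    # Take the final path segment, e.g. "/stuff/true" -> "true".
    name = environ.get("PATH_INFO", "").rstrip("/").rsplit("/", 1)[-1]
    body = LITERALS.get(name)
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    start_response("200 OK", [
        ("Content-Type", "application/javascript"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For a quick local try-out, the standard library's wsgiref.simple_server.make_server can host this application on a port of your choice.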
I don't like the way the PURL redirects, can that be done transparently I wonder, keeping the same URI in the address bar?
Posted at 18:27
Browsers have certainly evolved since the WorldWideWeb browser in 1990 into pretty sophisticated pieces of kit, supporting rich views of HTML and many other media types, along with a powerful version of code-on-demand through Javascript. But in certain respects they're still very primitive. It was probably unavoidable, but there's a significant conceptual gap between what the browser can do as a general-purpose tool and what it can do as a container for site-localized Web Applications. Take Gmail as an example of the latter - very much the same ballpark as desktop mail applications. But move away from that domain and all Gmail's functionality becomes inaccessible. We're still a way off a genuine Web of Applications.
One obstacle to maximizing the Webiness of Web Applications is found around the way buttons are used, directly mimicking the behaviour of desktop applications. But on the Web, the best affordances are associated with links (i.e. URIs + HTTP). In this context we should expect more of Web Applications - that the application should be built primarily as a Web API, i.e. a regular Web Site, so that the affordances are available to other applications. It should be trivial for me to check the contents of my Gmail inbox from the comfort of my own Home Page. However I'd hazard that the business models of the big Web brands are likely to hinder development in these directions - Google, Facebook, Amazon etc. run somewhat counter to the open Web, in that they are motivated to keep you in their domain (or in extreme cases like Apple and MS, in their own devices). Web Intents seem to me to be a good start towards enabling more flexible yet uniformly accessible interactions.
Over the years the read/write nature, or rather lack of it, has been discussed an awful lot. Even though the first browser included an in-place page editor, the current model still doesn't really support this. One big reason for this is that HTML - even HTML5 - falls short of supporting the full range of HTTP methods. The predominant approach to writing to the Web is through major intermediation by Content Management Systems. While CMSs are generally a very good idea, the fact that they're built on an effectively hobbled client means they aren't as Webby as they should be. There are genuine technical obstacles to generic writeability, notably those related to authentication and authorization, though hopefully WebID will help there.
The metaphor of the browser is itself quite limiting. Generally we only have one Web document open - visible on the screen - at a time because we can only read one thing at a time. Even with the development of tabs, the browser still essentially reflects this modal model of Web resources. I think I read about work on accessing data across tabs, but as far as I can see it doesn't exist yet. Ok, desktop applications are also under the restriction that we can only look at one thing at a time. But it's a lot more common to interact with multiple independent data sources/sinks and processing components there.
A browser can pretty much support a general-purpose HTTP client (through script), but because we're so used to thinking in terms of the Web of Documents and requirements there, the one page at a time modality is deep in the mindset. Service mashups, be they client or server-side, all really aim towards focusing down on a (primarily read-only) single-document view. A critical aspect is that traditionally the link has basically just one meaning - navigate to another page (do a GET and display the results). But while a link on a Web of Data could correspond to the same thing, it could also mean 'GET the data and merge it into the local store' or 'use this URI to filter the current view' or any number of pivot-like operations.
Ok, this is in danger of turning into another rant...let me sidestep by highlighting one specific browser affordance.
The Turn-Around Button?
One link-oriented metaphor for the browser Back Button could be walking down a footpath, junctions in the footpath corresponding to the available links presented to us on a Web page. Clicking the back button has the effect of walking backwards to the previous junction. We are still facing in the same direction. But what if we metaphorically turn around? Ok, the outlinks on the page will look just the same, but other data is available, our whole history. Why not present the current page alongside a recent history page (like chrome://history/) so we can hop back further - a turn around button. Yes, the Back Button may drop down a list showing the history, but richer information could be provided in the main window such as the links followed as a tree.
From a data perspective, variations on a Back Button might mean 'remove this data from the local store' or simply 'Undo'.
Dunno. Work needed on RDFAffordances.
See also: Identifying Applications State
Posted at 11:29
Together with REEEP (Renewable Energy and Energy Efficiency Partnership) the Semantic Web Company (SWC) has composed a fundamental publication on the topic of Linked Open Data.

Linked Open Data: The Essentials provides answers to the following key questions:
Read more about this publication and find out how to obtain a copy.
Posted at 08:18
Posted at 22:01
That's Descriptions of Runtime Klasses. Some simple Java for getting RDF out of code trees.
The RDF can be used to generate class diagrams, like this:

An interesting aspect of the Web Beep project is processor pipelines. To optimize things I needed to play with parameters easily so wound up building a system interface covering the processors and pipelines. As it stands in the source now, the configuration is set up from Java structures. But to see what the configuration is, a recursive toString() on the Java structures yields a fairly structured text description of the configuration (there's an example on the How It Works page).
This led me to think that if such descriptions could be used to describe existing configurations, they could also be used to set up those configurations. The format's ad hoc, so first it made sense to look at using something standard. The processor pipelines are essentially graphs (with annotations) so RDF was naturally the hammer I chose. The general processors/pipelines model is encoded (better word?) in the Java class structure, so if I could get that in RDF it'd be a good start. It's general-purpose stuff so I've split it off as a separate project at github and given it a silly name.
This kind of thing's been done before, in fact I'm hoping to incorporate David Huynh's doclet (for use with Javadoc to generate RDF) as well in the near future. But that approach gets its data 'statically' from the source, whereas the parameters at runtime are important for Web Beep's processors etc. I've made a start on the write-up with the code (ermm, Javadoc's todo :), but one key thing is just using a describe() method in the kind of places you might use a toString(). It should return a snippet of Turtle-syntax RDF describing the object in which it appears. I've also made a start on some easy-to-use utility methods that use reflection to extract a description of objects which doesn't rely on them having a describe() method, bit of a lighter touch.
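The project itself is Java, but the describe() idea is easy to sketch in a few lines of Python. Everything named here (the Processor class, its fields, the ex: prefix, the helper functions) is hypothetical and not taken from the project's code, and the @prefix declaration for ex: is left out.

```python
def turtle_literal(value):
    """Render a Python value as a Turtle literal (numbers bare, the rest quoted)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = (str(value).replace("\\", "\\\\")
               .replace('"', '\\"').replace("\n", "\\n"))
    return f'"{escaped}"'


def describe_object(obj, prefix="ex"):
    """Lighter touch: use reflection (vars) to describe any object as Turtle."""
    lines = [f"[] a {prefix}:{type(obj).__name__}"]
    for attr, value in sorted(vars(obj).items()):
        lines.append(f"    {prefix}:{attr} {turtle_literal(value)}")
    return " ;\n".join(lines) + " ."


class Processor:
    """Hypothetical stand-in for a pipeline processor with runtime parameters."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold

    def describe(self):
        """Hand-written description, used where you might use a toString()."""
        return ("[] a ex:Processor ;\n"
                f"    ex:name {turtle_literal(self.name)} ;\n"
                f"    ex:threshold {turtle_literal(self.threshold)} .")


lowpass = Processor("lowpass", 0.5)
print(lowpass.describe())
```

The hand-written describe() gives full control over the vocabulary used, while describe_object works on objects that know nothing about RDF, at the cost of exposing whatever attribute names happen to exist.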
As a sanity check on the generated RDF I made a (pretty trivial) SPARQL query with XSLT transform to GraphViz dot format, the result of which can be used (with straightforward command-line tools) to generate images like the one above. [I remembered half way through that Redland's rapper utility can output dot format, but that's RDFy (see screenshot) and I'm after something much more app-specific.] There's a little script which shows how the image was arrived at.
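The route described above goes through SPARQL and XSLT. As a rough illustration of the final step only, here is a Python sketch that turns already-extracted triples straight into GraphViz dot. The function name and the sample class names are made up for the example.

```python
def triples_to_dot(triples, graph_name="classes"):
    """Render (subject, predicate, object) name triples as a GraphViz digraph."""
    lines = [f"digraph {graph_name} {{"]
    for subject, predicate, obj in triples:
        # One labelled edge per triple; node names are quoted for safety.
        lines.append(f'    "{subject}" -> "{obj}" [label="{predicate}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical class relationships, not taken from the real code tree.
sample = [
    ("Pipeline", "hasProcessor", "Processor"),
    ("Encoder", "subClassOf", "Processor"),
]
dot_source = triples_to_dot(sample)
```

Saved to a file such as classes.dot, the output can be rendered with the GraphViz command-line tool, for example `dot -Tpng classes.dot -o classes.png`.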
Posted at 21:12
It’s been a while coming, but (finally), I’d like to present this 3rd article in the series. Together, we’re building a Linked Data application to map the English Indices of Multiple Deprivation Stats. The final source code for the app can be found on github. If you missed the last blog post in the series, you can find it here.
Note: These articles use Github gists for showing code-snippets, and if you’re viewing this in a feed reader they might not show up. So I recommend you read it on the web instead.
The basic principle of the app is that the user pans or zooms or searches by postcode to change which bit of the map they are looking at. The app detects that, retrieves the required data using SPARQL, then draws it on the map. You can play with the finished app on the OpenDataCommunities site here.
In the last article we explored how the deprivation data was stored, wrote a template SPARQL query, and drew a standard Google map centred on a starting location.
By the way, I’ve created a new Github repo especially for this blog-series, so that you can watch the application evolve through the git commits. Everything that we did in parts 1 and 2 is in commit bce302.
In this article, we’ll write some JavaScript to build and execute the right SPARQL query (based on the template from last time), to retrieve the deprivation data about the LSOAs currently displayed on the map. The code from this article is in commit 17e211.
On the web server on which OpenDataCommunities is hosted, we’ve enabled Cross-Origin Resource Sharing (CORS), so that Ajax requests for the data can be made from sites not on the same domain. However, for this to be honoured, you’ll need to host the app on a web server (such as Apache), while you develop it. Just opening the html from your disk in a browser won’t work.
Much of the code we’ll write in this article will be in a file
called map-manager.js, in the javascripts/swirrl
directory. As you might expect, the MapManager will be responsible
for dealing with the interaction with the map. The following code
snippet explains the structure of the file:
We’re going to add some code to the main JavaScript closure (in
the HTML file) which will listen for
Google maps idle events (i.e. when the map stops being
zoomed or dragged).
When we notice that the map has become idle, we’ll ask the
MapManager to refresh the map.
The MapManager will report back to the main
closure, using jQuery events, so that it can start listening again
for idle events.
The constructor for the MapManager takes the Google Map object and the initial score domain (from the drop down) as its parameters, and sets some stuff up.
Interesting things to note:
- We store this in a variable called self so that when this gets set to other things (in jQuery callbacks etc), we can always get a reference to the current MapManager object.
- The lsoaDataRetrieved event will be triggered by another function when the deprivation data has been retrieved and is ready for use. For now, we’re just going to log out the results (there’s a gist later on with an example log-output), and tell the calling code that we’ve finished.
The refresh function (along with the constructor) will form the public API for the MapManager. Let’s add it to the prototype:
- We trigger the started event to tell others that we’re starting the process of refreshing the map.
- The getTiles function (for brevity, not included in this article – see the source on github for details*) interrogates the Google map and determines which 0.1×0.1 lat/long tiles are visible in the viewport.
- We compare the result of getTiles with the set of tiles from the previous time refresh was called. The reason we do this is for efficiency: we don’t need to request data for tiles for which we already have data.
- We call getLsoaData, passing in the set of tiles currently visible.
The getLsoaData function is responsible for building the right SPARQL query to call (based on the template query in the previous article), and executing it against the OpenDataCommunities SPARQL endpoint.
That code might look a bit complicated, but it’s not really. Let me break it down a bit.
- buildSparql is a nested function (as it won’t be needed outside the scope of getLsoaData). It’s responsible for interpolating the bottom-left and top-right lat/long values of tiles into the template SPARQL query.
- callAjaxSparqlPaging is another nested function, which calls itself recursively until all the pages of data have been retrieved (the SPARQL endpoint will return the data in 1000-result ‘pages’).
- For each page of results (in the $.ajax success callback), we call setLsoaData, which just sets the data for an LSOA (such as label, centroid lat/long, score, and URI) into a nested object (with the top-level properties being the tile’s corners, and the inner objects’ properties being the notation of the LSOA, e.g. ‘E01005061’). See the example log-output later on for an example of this structure.
- The body of the getLsoaData function calls the SPARQL query for each tile in our list of tiles.
As I mentioned earlier, the main JavaScript closure in the HTML file is responsible for instantiating a MapManager and calling refresh. Let’s see what that looks like:
- We define a bindMapIdle function, and then call it. As I described at the top of the article, this calls the refresh function when we see that the map is idle.
- The started event listener makes a note of the time, and then removes the idleListener if it exists (so that we only call refresh once at a time). It also shows the busy ‘spinner’.
- When refresh has finished, we log how long it took, hide the spinner, and re-bind the idle listener.
- The zoomToWide and zoomOK handlers just show and hide the zoom warning message (shown if the user zooms out too much and we can’t show all the data).
We’re now ready to see what all our code does.
If you’ve been following along, just open the map.html file (from your web server) in your browser. (If you’ve not been following along, just check out the code from this commit in Github).
Open your browser’s debug/console window (i.e. Web Inspector in Chrome/Safari, or Firebug in Firefox), and hit refresh. You should see a bunch of log-output lines including information on what requests were made against the SPARQL endpoint and how long it took (“busy duration”).
Just before the busy duration message, there should be an entry
that looks like this: [>Object] (in Webkit-based
browsers, at least). Click the triangle to expand the object, and
you can see what data we have for the LSOAs with centroids in the
current viewport. (This is what is logged out from the
lsoaDataRetrieved handler in the
MapManager constructor.)
As I mentioned earlier, the top-level properties are the tiles. Each tile is an object whose properties correspond to the LSOAs. Each LSOA contains data such as its label, the lat/long, the score for the currently selected domain, and the URI of the LSOA.
For example:
Try dragging and zooming the map, and watch what happens in the console window. The further zoomed out you are, the more tiles of data you will see.
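The tile bookkeeping behind this behaviour can be sketched outside the browser. The Python below is not the app's getTiles (that is JavaScript, in the github source); it is a minimal sketch assuming a viewport given as south/west/north/east degrees, with tiles identified by the integer index of their bottom-left corner. All names are made up.

```python
import math

TILE = 0.1  # tile size in degrees, as described in the article


def tile_index(degrees):
    """Index of the tile containing a coordinate (index / 10 = degrees)."""
    # Round first so tiny floating-point errors don't shift a boundary.
    return math.floor(round(degrees / TILE, 9))


def visible_tiles(south, west, north, east):
    """Return the set of (lat_index, lng_index) tiles overlapping a viewport."""
    lat_range = range(tile_index(south), tile_index(north) + 1)
    lng_range = range(tile_index(west), tile_index(east) + 1)
    return {(lat, lng) for lat in lat_range for lng in lng_range}


# First refresh: a made-up viewport.
previous = visible_tiles(53.42, -2.31, 53.58, -2.18)

# The user pans east; only tiles we have not seen need fetching.
current = visible_tiles(53.42, -2.21, 53.58, -2.08)
to_fetch = current - previous
```

Here the first viewport covers six tiles, and after the pan only the two tiles in the newly exposed column need a query. Zooming out widens both ranges, which is why more tiles appear in the console.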

In the next instalment (I promise not to leave it so long this time), we’ll get the boundary information for all the LSOAs and plot that on the map as polygons. If there’s time, we’ll also make these interactive.
*To be honest, I’m not particularly proud of this bit of code – it’s pretty hacky, but it does the job!
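For readers who can't see the gists, the paging and storage steps described earlier can be sketched as follows. This is a Python approximation, not the app's JavaScript: fetch_page stands in for the Ajax call, the stop condition (a page shorter than 1000 rows means we're done) is my assumption, and the row fields and tile key format are invented for the example.

```python
PAGE_SIZE = 1000  # the article says results come back in 1000-result pages


def fetch_all_pages(fetch_page, query, page=1, collected=None):
    """Recursively fetch pages until one comes back short.

    `fetch_page(query, page)` is supplied by the caller and returns a list
    of result rows; how the endpoint is asked for a given page is left to it.
    """
    collected = [] if collected is None else collected
    rows = fetch_page(query, page)
    collected.extend(rows)
    if len(rows) < PAGE_SIZE:
        return collected
    return fetch_all_pages(fetch_page, query, page + 1, collected)


def set_lsoa_data(store, tile_key, row):
    """File one result row under its tile, keyed by the LSOA notation."""
    store.setdefault(tile_key, {})[row["notation"]] = {
        "label": row["label"],
        "lat": row["lat"],
        "lng": row["lng"],
        "score": row["score"],
        "uri": row["uri"],
    }


def fake_fetch(query, page):
    """Stand-in for the Ajax call: serves 2500 made-up rows in pages."""
    start = (page - 1) * PAGE_SIZE
    stop = min(start + PAGE_SIZE, 2500)
    return [{"notation": f"E{n:08d}", "label": f"LSOA {n}", "lat": 53.4,
             "lng": -2.2, "score": 0.0, "uri": f"http://example.org/lsoa/{n}"}
            for n in range(start, stop)]


tile_key = "53.4,-2.2,53.5,-2.1"  # hypothetical key built from tile corners
store = {}
for row in fetch_all_pages(fake_fetch, "SELECT ..."):
    set_lsoa_data(store, tile_key, row)
```

With the fake data this makes three requests (1000, 1000 and 500 rows) and leaves store holding one tile whose properties are the LSOA notations.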
Posted at 14:03
What do they mean?
SOPA (Stop Online Piracy Act) was introduced in the United States House of Representatives on October 26, 2011.
PIPA (Protect IP Act) is a proposed law with the stated goal of giving the US government and copyright holders additional tools to curb access to "rogue websites dedicated to infringing or counterfeit goods", especially those registered outside the U.S.
Visual Summary
Source: AmericanCensorship.org
The world's response
On the 18th of January, some of the world's biggest websites blacked out for a day to protest against censorship (see images below).
Google (Search engine)

Firefox (Browser)

WordPress (Blog engine)

Facebook (Social network)
Read the Facebook page about SOPA:
Mark Zuckerberg
The internet is the most powerful tool we have for creating a more open and connected world. We can't let poorly thought out laws get in the way of the internet's development. Facebook opposes SOPA and PIPA, and we will continue to oppose any laws that will hurt the internet. The world today needs political leaders who are pro-internet. We have been working with many of these folks for months on better alternatives to these current proposals. I encourage you to learn more about these issues and tell your congressmen that you want them to be pro-internet.
The impact SOPA and PIPA could have
Source: khanacademy.org
When will a decision be made?
On Jan 24th, Congress will vote to pass internet censorship in the Senate.
Posted at 09:02
JEdwards is a little sub-project I've just been putting together in Java. Screenshot.
It's so named for two reasons:
Having said that, it does have a couple of features that may be of interest to sane developers:
Neither is entirely finished, but both are useable/reusable (Apache 2 license, or somesuch).

I've been using Eclipse for most of my dev stuff for years now. When I was doing things in Node.js I wound up configuring it to have a file explorer pane, a text editor pane (for Javascript, HTML, Turtle or SPARQL) and three terminal panes all connected to the local shell. Eclipse was basically a (slow) sledgehammer to crack a nut. I did spend a while looking for a way of setting these things up using separate apps, but was beaten by the problem of pinning the windows to the workspace. I believe it should be possible using Devil's Pie or similar, but I had no joy. But as it happened I wanted a terminal emulator in Java anyhow and had played with syntax highlighting before.
In Scute I'd put together some basic highlighting for Turtle, except when I came to look at it again it was a bit too hardcoded to reuse, and Javascript is quite complicated... Looking around I came across jsyntaxpane, which is a pluggable highlighter which takes its config from a JFlex lexer. It'd got the necessary for Javascript, so I decided to use that instead of my hacky code. I found a SPARQL/Flex file on the Web that someone had prepared for IntelliJ IDEA which, although geared to do other things, saved me a bit of time writing out the SPARQL patterns. Here's sparql.flex.
For the terminal emulator I started with the JConsole UI from BeanShell, to which I've added the bits which talk to the bash shell. It works ok on this Ubuntu machine; I've no idea what would be needed to set it up for a different OS. The source for that is here.
I started Scute, a desktop RDF toolkit, just over a year ago. I did get some bits working fairly well - I was using the SPARQL bits for real - but then I got distracted and left it largely unusable... This JEdwards bit of coding has got me back into it, and tightened up how I was thinking about the dev process. I must write this up properly. The main idea is, while it should be built from reusable components, the way it's set up as a whole will be optimized for how I want to work. Somewhat inspired by woodcarving, where a lot of the time what's best isn't a general purpose tool (wood router or software IDE) but a highly focused tool (1/4" No.4 fishtail gouge or JEdwards). If the resulting code is useful for other people, great, but the motivation isn't to create a product, just to help my own personal workflow. Horse before cart dogfood.
The reusable components part comes from testing. I'm lazy about tests at the best of times, and Scute is all about GUI so is a bit tricky to test. But I reckon component-level functional tests make a fair substitute for unit tests. Anyhow, more about this another day.
Posted at 18:12
W3C today published the final report of the Linked Enterprise Data Workshop, hosted by W3C on 6-7 December in Cambridge, MA, USA. This workshop provided a way for the community to meet and discuss some of the challenges when deploying applications relying on the principles of Linked Data. The presentations covered many different topics, ranging from the benefits a set of additional conventions would bring to specific technical issues such as the challenges of dealing with the reality that URLs do change sometimes, as well as the need for a more robust security model, and specific gaps in the current set of standards.
Participants of the Workshop agreed that W3C should create a Working Group to define a “Linked Data Platform”. This is expected to be an enumeration of specifications which constitute Linked Data, with some small additional specifications to cover specific functionality such as pagination. We anticipate a draft charter will be available in the coming weeks.
Posted at 14:28
As the PoolParty Team is present at SemTechBiz Berlin 2012 (February 6-7), we want you to join us. This is why we have set up a little lottery to give away a full conference pass (€795) plus our unique PoolParty Cocktail Shaker in a set.
How to enter the SKOSsy-lottery:
Together with our PoolParty Suite, we are ready to present SKOSsy at our booth in the SemTechBiz Berlin 2012 exhibition area. SKOSsy is a handsome tool, which generates SKOS-based seed thesauri in German or in English by extracting data from DBpedia. See our finger exercise on a thesaurus describing the world of Alan Turing – done with SKOSsy.
Let us know which knowledge realm you are interested in and join the lottery now. Good luck, and see you in Berlin.
Posted at 14:15
Abstract
This article aims to clarify some basic ideas about motivation: how the main motivating factors work, and how to give positive and negative feedback as a leader.
Which factors motivate?
What kinds of skills do motivating factors boost?
First see the following video about the "Candle problem"
TED - Daniel Pink on the surprising science of motivation
Money narrows your focus, and can only improve your mechanical skills. Real-world problems cannot be solved with mechanical skills alone.
Choose the recognition level in the company
The recognition level is the point at which you, as a leader, tell an employee that "this is a good job".
"This work is mediocre. I would be embarrassed to show this." (Steve Jobs)
or
"This is incredible! Really, insanely great! You are a star!" (Steve Jobs)
Working with Steve Jobs means that you are expected to do extraordinarily good work. The level of recognition in this case is very high. This working culture pushes you to go beyond your limits.
Excellence, perfect, strong
Limit negative feedback within the company, and teach leaders to give feedback correctly
The basic rules of giving negative feedback as a leader are the following:
Give positive feedback
Yes, you should also give positive feedback, because one piece of positive feedback can be stronger than a negative one. We can teach people with positive feedback, and there are also some rules for it:
Posted at 12:32
Speaking on the phone to my brother, I told him about the Listy Thing I've been working on, and he pointed me to workflowy. It's an outline/list todo thing that already does a big chunk of what I had in mind for Listy Thing (quite funny they've also got a 'y' on the end). The UI is awesome, which on the one hand is inspiring in demonstrating feasibility, on the other scary, showing how far I have to go.
It is basically what I'm after, only I want something backed by RDF so that more data can be associated with nodes (especially nodes which correspond to Web resources), the data can be reused, and many alternate views are possible.
I'm still a little stuck on the fundamental question of how best to represent lists, I guess I just have to try things out. Had some good suggestions on the G+ page - there's even an Ordered List Ontology.
The issue's a bit conflicted, because on the one hand useful ordering is generally tied to some particular property (e.g. dc:date) so the list structure can be generated on demand (via SPARQL or whatever), no additional ordering is needed in the data. But then as far as user experience is concerned, as a list is being put together the order can be totally arbitrary - i.e. there is an order, only we're not quite sure what it is yet. This might suggest using rdf:List as a general purpose mechanism.
I think I'll try some kind of low-cost property (with a numeric value). So a property, which after all is just another kind of resource, gets minted when the list is created in the UI. Ideally I suppose it'd be a bnode but a quasi-disposable URI will do. Dunno, give it an rdfs:label on the fly and associate it with user/date of creation?
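To see how that might behave, here's a quick Python sketch with triples as plain tuples. The property URI, the item names and the idea of using fractional values to slot an item between two others are all assumptions for the sake of the example, nothing is implemented yet.

```python
# A quasi-disposable ordering property, minted when the list is created.
# The URI is hypothetical.
ORDER = "http://purl.org/stuff#order-a1b2"


def ordered_items(triples, order_property):
    """Return subjects sorted by the numeric value of `order_property`."""
    positions = [(value, subject)
                 for subject, predicate, value in triples
                 if predicate == order_property]
    return [subject for value, subject in sorted(positions)]


triples = [
    ("ex:buy-milk", ORDER, 2.0),
    ("ex:write-post", ORDER, 1.0),
    ("ex:write-post", "rdfs:label", "Write blog post"),
    # Dropped between the other two in the UI, so it gets a value in between.
    ("ex:fix-bug", ORDER, 1.5),
]

todo = ordered_items(triples, ORDER)
```

The order lives entirely in the property values, so an item can take part in several lists at once, each with its own ordering property, which is awkward to do with rdf:List.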
I use the namespace http://purl.org/stuff# for "disposable" classes and properties (feel free to follow suit). They're Cool URIs in the sense that they'll always resolve (although I must add RDF docs to that URI), disposable in the sense that they appear in instance data but won't have any more definition.
Posted at 08:25
It was with great sadness that I learned yesterday of the passing of my friend and colleague, Greg Leptoukh. Greg was a physical scientist at the NASA Goddard Space Flight Center and a coordinator of the Earth Science Information Partners (ESIP) Federation Information Quality cluster. Greg was dedicated to leveraging information systems to improve the usability of data for scientists; reducing technical barriers for data use and improving user comprehension of data generation and use.
I had the pleasure to work with Greg on the Multi-Sensor Data Synergy Advisor (MDSA), a prototype semantic extension to the already successful Giovanni online data analysis tool. Giovanni has proven to be a successful tool for reducing the technical barriers in science data processing, analysis, and visualization and information provided through Giovanni has played a role in over 400 science publications to date. With MDSA, Greg intended to show how Giovanni could be instrumented to provide provenance, quality, and expert knowledge about data to interested users. Greg was extremely enthusiastic about the potential of semantic technologies to power these enhancements; ontologies to describe concepts important to data generation and use and rules to expose and explain scenarios that may lead to misunderstood analysis results. I will always admire Greg’s enthusiasm for what we had been able to accomplish, and what we would be able to accomplish in future projects. Greg clearly saw what we were doing as a means to empower scientists, a noble goal if ever there was.
I am thankful for having had the opportunity to have worked with Greg, and incredibly sad he was not able to see the fruition of this work.
You will be missed, friend.
Posted at 19:04
Posted at 19:10
The HTML Data Task Force of the W3C Semantic Web Interest Group has published two documents today:
Both documents are Working Drafts, with the goal of publishing a final version as Interest Group Notes. Comments and feedback are welcome; please send them to the public-html-data-tf@w3.org mailing list.
Posted at 16:03
Knowing how, where, when and why content was produced is an important part of making a trustworthy web. However, it is often difficult to interchange this provenance information between systems. For example, it’s often difficult to locate or find provenance information for a web page. Even if the provenance information is located, it is often only available as text or if it is available in a structured way it does not use a common terminology — making it difficult to create software that can leverage this information.
The Provenance Working Group was chartered to help address these limitations. The group has been working diligently to create a family of specifications (called PROV) that allow for the interchange of provenance. The group is looking for your feedback. This post provides an overview of the various working drafts that have been published and should help you find your way around.
The set of specs at this point addresses two aspects of provenance interoperability introduced above:
PROV-AQ: Provenance Access and Query addresses how to both make available and retrieve provenance information for Web resources. The document specifies how to use existing Web technologies such as HTTP, link headers, and SPARQL to accomplish this. Where possible the specification attempts to be agnostic to the format of the provenance being accessed.
Once some provenance is obtained, it is important for the information to be understandable in a machine interpretable fashion. The Working Group has defined a data model (PROV-DM) that provides facilities for representing the entities, people and activities involved in producing a piece of data or thing in the world. The data model is domain-agnostic and has well defined extensibility points. Importantly, the data model has a corresponding OWL ontology (PROV-O) that encodes the PROV-DM. PROV-O is envisioned to specify the serialization for exchanging provenance information.
To help orient users of PROV-O and PROV-DM, the working group has developed a primer (PROV-Primer) that introduces the core constructs of the data model and provides examples using PROV-O. It is recommended that users and reviewers of the specification begin with the primer before moving to the ontology or data model.
The group is looking for feedback of all types: Would you expose provenance using PROV-AQ? Can you represent your provenance information using the PROV-O data model? Does PROV-O integrate well with your Linked Data or other Semantic Web infrastructure?
Let us know what you think.
The PROV family of specifications:
Posted at 18:56
Too long; read later - here's a demo : SPARQL Sliders Test
+Ian Hickson posted a lovely semweb use case:
"I'd like a search tool for furniture that works like Google's Flight Search does for flights. That is, with sliders so I can say what type of furniture (table), what range of widths (1-2m), lengths (2-5m), and heights (1-2m), what material (wood), what thickness, what price range, etc, I'd like, with the list of available products updating in real time."
As it happens I wanted a slider thingy ages ago, so this was a good prompt to make a demo of the front end part which takes the values from slider components and uses them in a SPARQL query.
For convenience/lack of available data the demo runs against DBpedia via the SNORQL SPARQL Explorer. As furniture and its dimensions weren't available it uses cities and their populations and elevations.
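The slider-to-query step might look something like the following Python sketch. It is not the demo's actual code: the query template, the `dbo:` property names and the `build_query` helper are all assumptions chosen for illustration. Slider values are coerced to numbers before substitution so that nothing but numbers ends up in the query.

```python
from string import Template

# Hypothetical query template; the dbo: class and property names are
# assumptions for illustration, not taken from the demo itself.
QUERY_TEMPLATE = Template("""\
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?city ?population ?elevation WHERE {
  ?city a dbo:City ;
        dbo:populationTotal ?population ;
        dbo:elevation ?elevation .
  FILTER (?population >= $pop_min && ?population <= $pop_max)
  FILTER (?elevation >= $elev_min && ?elevation <= $elev_max)
}
LIMIT $limit
""")


def build_query(pop_range, elev_range, limit=50):
    """Turn two (min, max) slider ranges into a SPARQL query string."""
    # Coerce to numbers so arbitrary strings cannot leak into the query.
    pop_min, pop_max = (int(v) for v in pop_range)
    elev_min, elev_max = (float(v) for v in elev_range)
    if pop_min > pop_max or elev_min > elev_max:
        raise ValueError("slider minimum must not exceed maximum")
    return QUERY_TEMPLATE.substitute(
        pop_min=pop_min, pop_max=pop_max,
        elev_min=elev_min, elev_max=elev_max,
        limit=int(limit),
    )


# Example: cities of 100k-500k people at up to 1000 m elevation.
query = build_query((100000, 500000), (0, 1000))
```

The resulting string can then be sent to whichever SPARQL endpoint holds the data.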
So how would you get real data?
First of all, furniture vendors could either provide dumps of their data or, more Webby, mark up their sites with RDFa and/or HTML5 microdata using e.g. the GoodRelations e-commerce vocabulary.
Ultimately, for a front end like these sliders to work, the data would need to go in a store with a SPARQL endpoint. But, triplestores shouldn't be thought of as just a wacky alternative to a SQL database. A triplestore is just a cache of a little chunk of the Linked Data Web. The question of where the store resides and how the data is collected is entirely open. Following the more traditional DB model, a service might aggregate the data published by known furniture suppliers and provide the endpoint online.
But alternately, a local user agent (I think Chris Bizer had a little Java example, can't find the link...there are others) could crawl the Web to answer the query just-in-time. The advantage of this approach is that it's more thorough and the only real option for totally arbitrary queries, the downside being that its answer will probably take longer than milliseconds. But remember triplestores are caches: not every little bit of information would have to be discovered and read from every page. There are vocabs for dataset and vocab discovery (remind me of the acronyms please :) Note too that you're not limiting your client agent to a single datastore. Traditional backends (SQL or NoSQL) are effectively isolated silos; triplestores are integrated with the links of the Web.
Incidentally, this is something that might be nice to express as a Web Intent, along the lines of "make me a query from this template with these parameters and apply it to this endpoint, putting the results into this widget" (that's a bit verbose for a general-purpose intent, but you get the gist). c.f. RDFAffordances.
Posted at 14:01
The W3C Provenance Working Group has published two new documents:
Both documents are First Public Working Drafts; feedback and
comments are welcome! Please use the public-prov-comments@w3.org
mailing list to provide your comments.

Posted at 15:59
Posted at 15:10
A wee rant.
Ok, I'm totally with the consensus that the future is Cloud-based, and to be a little more specific Platform-based and to be even more specific primarily HTTP-based. To back that up, cf.
But to expand something I mentioned in passing here recently :
in one respect the emperor is stark-bollock naked. Browsers are currently a really sucky environment for client development. Sure, the HTML/CSS-based (standard!) rendering is wonderful. As shown with Node.js (and despite what Google are saying around Dart), Javascript is a reasonably pleasant, perfectly capable programming language. The growth of Ajax and JSON has shown inter-system comms is workable. There are some good dev tools and libraries. So why does working with this stuff feel like pulling your own teeth?
Here I could point to the traditional DOM API, blame the W3C for all the world's ills and an awful lot of people would nod and smile knowingly. But although that's arguably valid (heh), I reckon the problem is more systematic and can mostly be blamed on browser developers.
Ok, blame is too strong. The decisions made over the years and the directions taken have generally been perfectly rational in the context of the prevailing conditions. But there have been feedback loops at work. The flashy [sic] chrome [sic] surrounding HTML dev, from the img tag onwards, has pulled Web developers in like moths around a flame. So the browser developers act to improve that experience. Meanwhile server-side tech has developed out of the corporate legacy of silo-based systems. Let me quote Steve Yegge there: "It's a big stretch even to get most teams to offer a stubby service to get programmatic access to their data and computations.". The way services are offered over the Web, even Web 2.0 services still have a big hangover from this mentality. I'd argue that most Web APIs are only marginally better than SOAPy stubs. Largely because XML and JSON aren't particularly Web-friendly. Ok, don't bite my head off, let me qualify that.
First XML. There have been plenty of arguments over the years around XHTML, and back in the day (I wonder how old that phrase is) there were arguments about the XML nature of RSS. Postel's Law, the "Robustness Principle" got cited a lot. Let me give you some deja vu:
Be liberal in what you accept, and conservative in what you send.
What a lot of people misinterpreted was the keyword robust. A robust system is one designed to be able to fail gracefully or continue working acceptably with noisy data. That's exactly what we want for the Web, right? Well not necessarily, if I was ordering a book from Amazon, and there was a partial failure, I'd rather they didn't make a best-guess when it came to taking money off my credit card (I think paraphrasing Tim Bray there). Anyhow, XML is not robust, by design. XML is designed to bail out completely at the first sniff of anything dodgy. As it happens, the way XML is often served on the Web is without proper regard for the media type, i.e. dodgy and hence broken.
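To see the bail-out behaviour concretely, here is a small Python sketch using the standard library's ElementTree parser. The two sample documents and the `try_parse` helper are made up for illustration; the point is only that a single missing close tag makes the parser reject the whole document instead of returning a best-guess tree.

```python
import xml.etree.ElementTree as ET

well_formed = "<book><title>Weaving the Web</title></book>"
# The same document with the closing </title> tag missing.
malformed = "<book><title>Weaving the Web</book>"


def try_parse(text):
    """Return the root tag if the XML is well-formed, else None."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        # No partial tree comes back: the parser rejects the whole document.
        return None


ok_result = try_parse(well_formed)
bad_result = try_parse(malformed)
```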
Sorry, that was gratuitous deviation, the real reason I'd say XML isn't Web friendly, like JSON, is in the way people use it. Whether data is conveyed as name-value pairs or through more complex structures, the key parts are generally just simple strings. But by itself, a string on the Web is next to useless. You or I can (maybe) read it, or even paste it into Google and get a definition. But what is a poor machine client to do? What makes the Web are links. It's 101 but somehow still manages to be overlooked: the link has two facets: a universally unambiguous name (URI/IRI) and a protocol for following it (HTTP). If a client on the Web encounters a link, it can follow its nose to find out more information about it. That's what we as humans do in browsers all the time, yet when it comes to Web services for some reason a simple string is seen as adequate to identify something.
Ok, with XML, the HTML DOM and to some extent JSON there's been some justifiable resistance to the use of URIs for names, because namespaces have traditionally been unintuitive at best and agony at worst. Using URIs instead of simple strings certainly adds a burden (it doesn't have to be that great, check Turtle syntax), but its benefits far outweigh the costs.
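To give a feel for how light that burden can be, here is a minimal Python sketch of the Turtle-style trick: declare a prefix once, then write short names that expand to full URIs. The prefix table and the `expand` helper are hypothetical, written only to illustrate the idea, and the sketch ignores the finer points of Turtle's grammar.

```python
# Prefix table, as it might be declared with @prefix lines in a Turtle file.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dbr": "http://dbpedia.org/resource/",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:name' into a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("unknown or missing prefix in %r" % name)
    return prefixes[prefix] + local


full_uri = expand("foaf:name")
```

The short form stays readable for humans, while a client still ends up with a globally unambiguous name it can dereference.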
The thing is, you'll hear talk of snowflake APIs - only one implementation of each exists - but what gets overlooked is that by their very nature, most APIs just aren't Webby. The client must have prior knowledge that the service at endpoint X uses API Y. What you end up with is effectively a series of 1:1 client-server connections. That, while the uniform interface of REST may mean it's less brittle than an RPC connection, still means tight coupling.
Ok, you might argue that for any communication to take place, some prior knowledge is required. Sure, but that can be minimised - just like the way we follow links for more information in a browser, a service client can follow links to get more information. This is only a small conceptual step, but what it enables is hugely powerful. Above everything else, it's what Linked Data and the Semantic Web get right.
I reckon that browser developers, with their emphasis on doc-oriented HTML have a natural tendency to carry their experience in that domain across and apply it to data. Naturally namespace-less XML and JSON will seem preferable through that lens. But in practice, documents and data are apples and oranges. Browsers have been optimized over the years for the former, incidentally making the latter harder than necessary.
It's funny how you don't hear so much about service mashups these days, despite their undeniable coolness. I'll assert that it's because developing for Web data in the browser is bloody hard work, especially when there are NxN arbitrary API mappings to know.
Overall it's actually something of a miracle that the notion of cloud-based platforms has emerged.
I had planned to say more about Cloud Computing Outside of the Browser - or to put it another way, evolving old-fashioned non-browser Rich Internet Clients (as well as server-server and every other non-browser configuration). But ranting's worn me out. Anyhow, in short, I reckon that for the foreseeable future, non-browser clients in many circumstances are probably preferable to browser-based equivalents, primarily because they're easier to develop (as I keep saying, I reckon the agent model of combined client/server units is a good way to go). While I personally welcome HTML5 and the APIs as a clean-up of document markup and processing, when it comes to data it isn't even a Band-Aid.
Posted at 19:00
Reviewing the interview we made with Les Kneebone (project manager of the vocabulary projects at Education Services Australia) in November 2010 we can see that ESA has been one of the early adopters of SKOS as a standard for thesaurus development. Les said then: “We had already identified SKOS as an important standard for ScOT so it was natural to select PoolParty as our new thesaurus management tool”. Around a year later ESA's vocabulary site went online with PoolParty as its basis.
We asked Les to comment on his statement from last year and he confirmed that SKOS continues to be central to the ESA vocabulary business model and that it has also been important for ESA that PoolParty has been flexible enough to support continued publication of non-RDF formats, especially IMS VDEX.
In the course of this project it became more and more obvious that SKOS can be used not only as yet another format for publishing thesauri but also as a unified model to build thesauri in general. This approach made possible several improvements to the vocabulary development model and the maintenance process of ESA. Since all data is stored as RDF in a triple store, and SKOS and RDF are flexible formats supporting interoperability and interchangeability of data, many manual transformations that had to be done before are no longer needed, and all other systems using the vocabularies are dynamically fed by PoolParty, which offers the data in the formats they need (see image below).
Les states that while some manual processes still exist to support legacy systems, PoolParty ensures the integrity and richness of ESA data. Support and customizations for legacy systems can be achieved in the confidence that the linked-data capabilities are centrally managed and stored in the PoolParty triple store.
From the publishing perspective, the previous vocabulary publishing site has been replaced by the PoolParty Linked Data Frontend (LD-Frontend) that has been customized especially for this project to offer more flexibility in the display and the layout of the data. Similar to the frontend for the Austrian Geological Survey mentioned in a previous blog post, the LD-Frontend has been adapted to the ESA styleguide and the display of the data in the HTML view of the frontend has been adapted to be more user-friendly (see screenshot below).
From ESA’s perspective Les commented here that for the vocabulary manager, edits to the frontend styles and templates are intuitive and can be tested in staging environments. But he also stated that for publishing, support is important, and that SWC was very responsive.
Of course we asked Les to give a preview of the next steps for ESA. He stated that they include language translation projects so that its vocabularies, especially Schools Online Thesaurus (ScOT), can be accessed by wider markets and by students of other languages. He also stated that PoolParty handles multi-lingual thesauri very well.
We here at SWC are glad to see PoolParty used in more and more applications and usage scenarios. We are looking forward to the next steps that will be done in this project and also to see how the data offered by the ESA vocabulary site is used in other applications.
Thanks to Les Kneebone from ESA for his contribution to this blog post.
Posted at 15:29
I just heard about Dart (via Seth Ladd and Edd), a new Web programming language from Google. It aims to fulfil the role Javascript currently has, only doing it better. On the pro side, new languages are inherently cool, and Javascript can be a real pain. On the con side it seems unlikely that any browsers other than Chrome will support it in the foreseeable future, except potentially via translation to Javascript, i.e. This Page Best Viewed with Chrome
It's hard not to see echoes of the old Microsoft arrogantly pushing its own product here (remember VBScript?), although Google have in recent years made NIH an artform. But who cares about politics, how's this going to affect the Web?
Well, Code-on-Demand does appear in Fielding's thesis (slightly bizarrely as an 'optional constraint') and has been around since the early days. Pluggable clients are certainly a good idea, and Google have been leaders in moving Rich Internet Applications as opaque desktop apps into the browser using Javascript. The apps are still pretty opaque (View Source on gmail if you doubt that) but they do at least more-or-less run cross-browser.
I've not read much of the Dart docs yet, not tried it at all, but first impressions are that it's a nice clean syntax not unlike JS (or for that matter Java, C# or Python...) and they've already got a good bunch of libs together (even if they do include RPC, yuck!).
As an aside, it should be noted that there's a cost to the standardization of today's browser as Web client (in the process of being defined via HTML5 and associated APIs). It does mean an effective monoculture of HTTP clients. Arguably you can write whatever kind of client you like (probably in Javascript) and host it inside a browser, but they have been optimized for a fairly specific app scope. If you stray from the general model of a Web of HTML Documents you're in for an uphill journey. The arbitrary desktop client has more freedom to use HTTP more creatively, but then there won't be one on everyone's desktop. (Personally I like the notion of Web agents (where an agent = client + server + persistence + code) as an abstraction for Web components, as in "Two Webs!" [pdf - heh]. I wonder, is there a HTTP server in Dart yet?)
Looking at the "Leaked internal dart email" (as with UK politics, it's probably sensible to take the "Leaked" aspect with a pinch of salt), there does seem to be some motivation for Dart coming in response to the success of iOS. I'm pretty sure a new language isn't the best response to this, but it certainly makes a change to the usual big proprietary Flash/Silverlight kind of issues. Google are still talking of evolving Javascript, but it does raise the question of what Dart will offer that couldn't be achieved using JS. Optional typing is the feature they seem to be plugging most. So I wondered if anyone had worked on adding static types to JS. Funnily enough, the first few hits refer to iOS. Oh dear, we're really not talking iOS envy, are we?
It's a little surprising that Google haven't thrown their expertise at the JS-is-a-mess issue previously, I don't see a groundbreaking dev tool and pattern library out there (funnily enough the Dart Editor is based on Eclipse, which does seem a bit un-groundbreaking (although I'm not criticising the choice, Eclipse is my main IDE)).
Whatever, it should be interesting to watch how this pans out. Dart will almost certainly be a very cool language, albeit engendering ambivalence everywhere outside Google. Give me a shout when it includes libs for non-HTML Web languages (i.e. gimmee RDF :)
Posted at 19:48