Planet RDF

It's triples all the way down

January 07

W3C Read Write Web Community Group: Read Write Web — Q4 Summary — 2017

Summary

TPAC 2017 kicked off in California, achieving its highest attendance to date, with some saying it may have been the best TPAC ever.  Extensive Strategic Highlights were also published.

HTML 5.2 is now a recommendation, with HTML 5.3 coming.  Work on payments made progress, with some demos presented at Money 20/20.  There was also a nice look ahead to 2018 and semantic web trends from Dataversity.

Activity in the community group picked up slightly this quarter, with around 75 messages, almost double the previous quarter.  More details below.

Communications and Outreach

Aside from Fedora, there was some outreach this quarter with CERN, where it all began, with a view to possibly reusing Web Access Control.


Community Group

In the CG there were calls for team members, with Sebastian Samaruga starting a project.  There was also an announcement of the Fedora API Spec CR and a fantastic post by Ruben entitled “Paradigm Shifts for the Decentralized Web“.

solid

Applications

By convention, it’s been decided to add the ‘solid-app’ tag to new and existing apps for the solid platform, which will hopefully make apps searchable.  One new app, twee-fi, was written from scratch quite quickly and seems to work with all our servers.  There is a general move to patch existing apps to follow this pattern so that both OIDC and TLS auth can be leveraged.

There have been updates to rdflib.js, solid-ui and solid-app-set.  I also made a small console-based playground, solid-libraries, that shows how these libraries fit together and allows a few commands as examples.  Additionally, I have started trying to patch apps to use the new auth system, starting with the pastebin tutorial.  Hopefully more apps will be patched this quarter.

Last but not Least…

The OWL Time ontology is now a W3C REC. The ontology provides a vocabulary for expressing facts about topological (ordering) relations among instants and intervals, together with information about durations, and about temporal position including date-time information.

Posted at 08:15

January 03

W3C Data Activity: W3C study on Web data standardization

The Web has had a huge impact on how we exchange and access information. The Web of data is growing rapidly, and interoperability depends upon the availability of open standards, whether intended for interchange within small communities, or for use … Continue reading

Posted at 16:13

December 31

Bob DuCharme: SPARQL and Amazon Web Service's Neptune database

Promising news for large-scale RDF development.

Posted at 14:53

December 21

John Goodwin: Using Machine Learning to write the Queen’s Christmas Message

In their excellent book “The Indisputable Existence of Santa Claus”

Posted at 11:00

December 18

AKSW Group - University of Leipzig: SANSA 0.3 (Semantic Analytics Stack) Released

Dear all,

We are happy to announce SANSA 0.3 – the third release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink to provide scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples, Turtle, RDF/XML, N-Quad format
  • Reading OWL files in various standard formats
  • Support for multiple data partitioning techniques
  • SPARQL querying via Sparqlify (with some known limitations until the next Spark 2.3.* release)
  • SPARQL querying via conversion to Gremlin path traversals (experimental)
  • RDFS, RDFS Simple, OWL-Horst (all in beta status), EL (experimental) forward chaining inference
  • Automatic inference plan creation (experimental)
  • RDF graph clustering with different algorithms
  • Rule mining from RDF graphs based on AMIE+
  • Terminological decision trees (experimental)
  • Anomaly detection (beta)
  • Distributed knowledge graph embedding approaches: TransE (beta), DistMult (beta), several further algorithms planned

Deployment and getting started:

  • There are template projects for SBT and Maven for Apache Spark as well as for Apache Flink available to get started.
  • The SANSA jar files are on Maven Central, i.e. in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • There is example code for various tasks available.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects Big Data Europe, HOBBIT, SAKE, Big Data Ocean, SLIPO, QROWD and BETTER.

Greetings from the SANSA Development Team

Posted at 10:15

December 17

Egon Willighagen: Winter solstice challenge #2: first submission

[Figure: citation data for articles, colored by availability of a full text as indicated in Wikidata. Mind the artifacts. Reproduce with this query and the new GraphBuilder functionality.]
Bianca Kramer (of 101 Innovations fame) is the first to submit results for the Winter solstice challenge, and it's impressive! She has an overall score of 54%, based on her own publications and the first-level citations!

So, the current rankings in the challenge are as follows.

Highest Score

  1. Bianca Kramer

Best Tool

  1. Bianca Kramer

I'm sure she would be more than happy for you to use her tool to calculate your score. If you're patient, you may even wish to take it one level deeper.

What are you talking about??
Well, the original post sheds some light on this, but basically scientific writing has become so dense that a single paper does not provide enough information. And if you cannot read the cited papers, you may not be able to precisely reproduce what they did. Now that many countries are steadily heading to 100% #OpenAccess, it is time to start thinking about the next step. So, is the knowledge you built on also readily available, or is that still locked away?

For example, take the figure on the right-hand side: it shows when the articles I cited in my work were published (a subset, because it is based on data in Wikidata, using the increasing amount of I4OC data). We immediately see some indication of the availability of the cited papers: the more yellow, the more available. However, keep in mind that this is based on "full text availability" information in Wikidata, which is very sparse. That is what makes Bianca's approach so powerful: it uses (the equally wonderful) oadoi.org.

You will also note the immediate quality issues. Apparently, this data tells me I am citing articles from the future :) You can also see that I am citing some really old articles.
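(For the curious, here is a sketch of the kind of Wikidata query behind such a figure, written in Python against the public SPARQL endpoint. It is not the exact query from the post; the author QID is a placeholder, and the properties used are P50 = author, P2860 = cites work, P577 = publication date, P953 = full work available at URL.)

import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?cited ?date ?fullText WHERE {
  ?article wdt:P50 wd:Q000000 .           # placeholder author QID
  ?article wdt:P2860 ?cited .             # the article cites ?cited
  ?cited wdt:P577 ?date .                 # publication date of the cited work
  OPTIONAL { ?cited wdt:P953 ?fullText }  # full-text URL, where recorded
}
"""

response = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"})
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    year = row["date"]["value"][:4]
    availability = "full text" if "fullText" in row else "unknown"
    print(year, availability, row["cited"]["value"])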

Posted at 06:56

December 16

Ebiquity research group UMBC: Videos of ISWC 2017 talks

Videos of almost all of the talks from the 16th International Semantic Web Conference (ISWC) held in Vienna in 2017 are online at videolectures.net. They include 89 research presentations, two keynote talks, the one-minute madness event and the opening and closing ceremonies.

Posted at 23:15

December 15

Dublin Core Metadata Initiative: CEDIA Joins as new Regional Member

I am delighted to report that the Corporación Ecuatoriana para el Desarrollo de Investigación y la Academia (CEDIA) has agreed to join DCMI as a regional member. CEDIA is a national membership organisation which represents higher-education and research institutions in Ecuador. The organisation has a keen interest in metadata standardisation, especially in the domain of data repositories, and has been active in this area since 2009. CEDIA is a strong proponent of Open Access, and encourages and supports the adoption of interoperable metadata and specialised metadata vocabularies among its member organisations.

Posted at 10:56

December 14

W3C Data Activity: W3C Workshop on Linked Data and Privacy

W3C is inviting position papers for a workshop on data controls and linked data vocabularies to be held in Vienna, Austria on 7-8 March 2018. This is motivated by the challenges for addressing privacy across an ecosystem of services involving … Continue reading

Posted at 11:26

W3C Data Activity: Dataset Exchange WG publishes use cases and requirements

The Dataset Exchange Working Group (DXWG) is pleased to announce the publication of the First Public Working Draft of the Dataset Exchange Use Cases and Requirements. The working group will produce a second version of the Data Catalog (DCAT) Vocabulary, guidance for … Continue reading

Posted at 11:14

December 01

Libby Miller: #Makevember


Posted at 16:38

November 29

Ebiquity research group UMBC: paper: Automated Knowledge Extraction from the Federal Acquisition Regulations System

Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS)

Srishty Saha and Karuna Pande Joshi, Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS), 2nd International Workshop on Enterprise Big Data Semantic and Analytics Modeling, IEEE Big Data Conference, December 2017.

With increasing regulation of Big Data, it is becoming essential for organizations to ensure compliance with various data protection standards. The Federal Acquisition Regulations System (FARS) within the Code of Federal Regulations (CFR) includes facts and rules for individuals and organizations seeking to do business with the US Federal government. Parsing and gathering knowledge from such lengthy regulation documents is currently done manually and is time and human intensive. Hence, developing a cognitive assistant for automated analysis of such legal documents has become a necessity. We have developed a semantically rich approach to automate the analysis of legal documents and have implemented a system to capture various facts and rules, contributing towards building an efficient legal knowledge base that contains details of the relationships between various legal elements, semantically similar terminologies, deontic expressions and cross-referenced legal facts and rules. In this paper, we describe our framework along with the results of automating knowledge extraction from the FARS document (Title 48, CFR). Our approach can be used by Big Data users to automate knowledge extraction from large legal documents.

Posted at 01:56

November 25

Leigh Dodds: Data assets and data products

A lot of the work that we’ve done at the ODI over the last few years has involved helping organisations to recognise their data assets.

Many organisations will have their IT equipment and maybe even their desks and chairs asset tagged. They know who is using them, where they are, and have some kind of plan to make sure that they only invest in maintaining the assets they really need. But few will be treating data in the same way.

That’s a change that is only just beginning. Part of the shift is in understanding how those assets can be used to solve problems. Or help them, their partners and customers to make more informed decisions.

Often that means sharing or opening that data so that others can use it. Making sure that data is at the right point of

Posted at 11:56

November 24

Leigh Dodds: We CAN get there from here

On Wednesday, as part of the Autumn Budget,

Posted at 20:39

November 23

Dublin Core Metadata Initiative: Webinar: Save the Children Resource Libraries

Update: Branka Kosovac will join Joseph Busch in presenting this webinar. DCMI is pleased to announce a new webinar: Save the Children Resource Libraries: Aligning Internal Technical Resource Libraries with a Public Distribution Website. Presented by Joseph Busch, Founder of Taxonomy Strategies, and Branka Kosovac, Taxonomy Strategies associate and Principal of dotWit Consulting, the webinar will discuss a recent project which has established an internal library of technical resources at the international Save the Children charity.

Posted at 10:56

November 19

Bob DuCharme: SPARQL queries of Beatles recording sessions

Who played what when?

Posted at 15:40

October 29

Bob DuCharme: An HTML form trick to add some convenience to life

With a little JavaScript as needed.

Posted at 15:07

October 28

Leigh Dodds: The state of open licensing, 2017 edition

Let’s talk about open data licensing. Again.

Last year I wrote a post, the

Posted at 12:42

Dublin Core Metadata Initiative: Director Transition

2017 has been a year of transition, with Stuart Sutton stepping down after six years at the helm of DCMI as its Managing Director. He has been succeeded by Paul Walk. DCMI would like to extend its gratitude to Stuart for his tremendous service during a challenging period. Stuart is known for his warmth and generosity, which combined have made so many people feel welcomed into the DCMI community. Some of our community were able to contribute their personal thoughts and messages for Stuart, and we compiled these into this document, which we had printed for Stuart in a little book to serve as a memento of his time as Director.

Posted at 10:19

October 26

Ebiquity research group UMBC: W3C Recommendation: Time Ontology in OWL

W3C Recommendation: Time Ontology in OWL

The Spatial Data on the Web Working Group has published a W3C Recommendation of the Time Ontology in OWL specification. The ontology provides a vocabulary for expressing facts about topological (ordering) relations among instants and intervals, together with information about durations, and about temporal position including date-time information. Time positions and durations may be expressed using either the conventional Gregorian calendar and clock, or using another temporal reference system such as Unix-time, geologic time, or different calendars.
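As a flavour of the vocabulary, here is a minimal, unofficial sketch using Python's rdflib, modeling a meeting as a time:Interval bounded by two time:Instants; the ex: names are invented for illustration:

from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

TIME = Namespace("http://www.w3.org/2006/time#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("time", TIME)

# A meeting modeled as a time:Interval bounded by two time:Instants.
g.add((EX.meeting, RDF.type, TIME.Interval))
g.add((EX.meeting, TIME.hasBeginning, EX.start))
g.add((EX.meeting, TIME.hasEnd, EX.end))
g.add((EX.start, RDF.type, TIME.Instant))
g.add((EX.start, TIME.inXSDDateTime,
       Literal("2017-10-26T09:00:00", datatype=XSD.dateTime)))
g.add((EX.end, RDF.type, TIME.Instant))
g.add((EX.end, TIME.inXSDDateTime,
       Literal("2017-10-26T10:00:00", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))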

Posted at 16:54

October 23

Leigh Dodds: What is a Dataset? Part 2: A Working Definition

A few years ago I wrote a post called “

Posted at 19:44

October 16

Ebiquity research group UMBC: Agniva Banerjee on Managing Privacy Policies through Blockchain

Link before you Share: Managing Privacy Policies through Blockchain

Agniva Banerjee

11:00am Monday, 16 October 2017

An automated access-control and audit mechanism that enforces users’ data privacy policies when sharing their data across third parties, by utilizing privacy policy ontology instances with the properties of blockchain.

Posted at 17:42

October 15

Egon Willighagen: Two conference proceedings: nanopublications and Scholia


[Figure: the nanopublication conference article in Scholia.]
It takes effort to move scholarly publishing forward. And the traditional publishers have not all shown themselves to be good at that: we're still basically stuck with machine-broken channels like PDFs and ReadCubes. They all seem to love text mining, but only if they can do it themselves.

Fortunately, there are plenty of people who do like to make a difference and like to innovate. I find this important, because if we do not do it, who will? Two researchers who make such an effort recently published their work as conference proceedings: Tobias Kuhn and Finn Nielsen. And I am happy to have been able to contribute to both efforts.

Nanopublications
Tobias works on nanopublications, which innovate how we make knowledge machine readable. I have stressed in my blog for years how important this is. Nanopublications describe how knowledge is captured, make it FAIR, and, importantly, link the knowledge to the research that led to it. His recent conference proceedings paper details how nanopublications can be used to establish incremental knowledge. That is, given two sets of nanopublications, it determines which have been removed, added, and changed. The paper continues by outlining how that can be used to reduce, for example, download sizes and how it can help establish an efficient change history.
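For readers new to the model, here is a rough sketch of the four-graph shape of a nanopublication, built with Python's rdflib. The np: terms follow the nanopub schema (http://www.nanopub.org/nschema#); the assertion itself is invented, and the provenance and pubinfo graphs are left empty for brevity:

from rdflib import ConjunctiveGraph, Namespace, RDF, URIRef

NP = Namespace("http://www.nanopub.org/nschema#")
EX = Namespace("http://example.org/np1#")

g = ConjunctiveGraph()
head = g.get_context(EX.head)
assertion = g.get_context(EX.assertion)

# Head graph: wires the assertion, provenance and pubinfo graphs together.
head.add((EX.np, RDF.type, NP.Nanopublication))
head.add((EX.np, NP.hasAssertion, EX.assertion))
head.add((EX.np, NP.hasProvenance, EX.provenance))
head.add((EX.np, NP.hasPublicationInfo, EX.pubinfo))

# Assertion graph: the actual (invented) claim.
assertion.add((EX.aspirin, URIRef("http://example.org/treats"), EX.headache))

print(g.serialize(format="trig"))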

Scholia
And Finn developed Scholia, an interface not unlike Web-of-Science, but based on Wikidata and therefore fully on CCZero data. And, with a community actively adding the full history of scholarly literature and the citations between papers, courtesy of the Initiative for Open Citations. This is opening up a lot of possibilities: from keeping track of articles citing your work, to getting alerts about articles publishing new data on your favorite gene or metabolite.

Kuhn T, Willighagen E, Evelo C, Queralt-Rosinach N, Centeno E, Furlong L. Reliable Granular References to Changing Linked Data. In: d'Amato C, Fernandez M, Tamma V, Lecue F, Cudré-Mauroux P, Sequeda J, et al., editors. The Semantic Web – ISWC 2017. vol. 10587 of Lecture Notes in Computer Science. Springer International Publishing; 2017. p. 436-451. doi:10.1007/978-3-319-68288-4_26


Nielsen FÅ, Mietchen D, Willighagen E. Scholia and scientometrics with Wikidata. arXiv.org; 2017. Available from: http://arxiv.org/abs/1703.04222.

Posted at 11:42

October 03

W3C Read Write Web Community Group: Read Write Web — Q3 Summary — 2017

Summary

A relatively quiet quarter in the community group, however, some good progress has been made behind the scenes.

There was a new release of schema.org this quarter.  The Shapes Constraint Language (SHACL) is now a W3C Recommendation, and ActivityPub is a Candidate Recommendation.

There was light discussion on the mailing list; however, there was also a major release of node solid server (4.0.0), which includes a new authentication method, WebID-OIDC.


Communications and Outreach

There was some outreach with the folks at the Rebooting the Web of Trust workshop, and I’ll also be talking to the team from Remote Storage about reusing JSON-LD as part of their model for reading and writing to the web with HTTP verbs.


Community Group

A quiet quarter in the Community Group, though Sebastian Samaruga has been investigating the idea of semantic business intelligence and has started a blog and codebase on the subject.  There was also testing of the new version of node solid server, with some success.


solid

Applications

As mentioned above node solid server 4.0.0 now has WebID-OIDC support, N3 patches and many other features.

A new version of solid auth client was released, which will allow login via both TLS and OIDC.

It is possible to try out the latest version on the solid test server, which will lead the user to a default set of apps, and a data browser which can both read and write data and has quite a few built-in apps that are fired off depending on the (rdf) type of data being viewed.


Last but not Least…

Virtuoso 8.0 was also released, with ABAC (Attribute-based Access Controls) and the WebID+TLS+Delegation Protocol. Existing open standards (such as TLS, HTTPS, URIs, and RDF) are leveraged to aid the development and deployment of new and more-agile services and solutions.

Posted at 17:19

October 02

Dublin Core Metadata Initiative: Responding To The Post-truth Phenomenon

In 2016, the Oxford Dictionaries declared that their Word Of The Year was "post-truth", which they defined as: "relating to or denoting circumstances in which objective facts are less influential in shaping public opinion than appeals to emotion and personal belief". This decision reflects the way in which the concept of post-truth has quickly become a widely-acknowledged phenomenon and a mainstream concern. It should be of particular concern to those, such as librarians, with a professional interest in the provision of access to accurate information.

Posted at 07:46

September 27

Libby Miller: Capturing button presses from bluetooth hands free kits on a Raspberry Pi

Is there anything better than this wonky and unreliable hack for capturing keypresses from a handsfree kit?

sudo hciconfig hci0 down      # take the Bluetooth adapter down...
sudo hciconfig hci0 up        # ...and back up, to reset it
sudo hcidump -l 1 | grep ACL  # watch the HCI dump for ACL packets

As the kit connects, I see in syslog

Sep 27 21:17:10 gvoice bluetoothd[532]: Unable to get connect data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)

Sep 27 21:17:10 gvoice bluetoothd[532]: Unable to get connect data for Headset Voice gateway: getpeername: Transport endpoint is not connected (107)

I can see it appearing as

Sep 27 21:14:29 gvoice kernel: [  827.342038] input: B8:D5:0B:4C:CF:59 as /devices/virtual/input/input6

evtest gives

sudo evtest
No device specified, trying to scan all of /dev/input/event*
Available devices:
/dev/input/event0: B8:D5:0B:4C:CF:59
Select the device event number [0-0]: 0
[...]
    Event code 402 (KEY_CHANNELUP)
    Event code 403 (KEY_CHANNELDOWN)
    Event code 405 (KEY_LAST)
  Event type 2 (EV_REL)
Key repeat handling:
  Repeat type 20 (EV_REP)
    Repeat code 0 (REP_DELAY)
      Value    300
    Repeat code 1 (REP_PERIOD)
      Value     33
Properties:
Testing ... (interrupt to exit)

but there are never any events.
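(For comparison, a minimal sketch of reading the same device programmatically, assuming the python-evdev package; it should simply block if the kit stays silent:)

from evdev import InputDevice, categorize, ecodes

dev = InputDevice("/dev/input/event0")  # the handsfree kit's input node
for event in dev.read_loop():           # blocks waiting for events
    if event.type == ecodes.EV_KEY:
        print(categorize(event))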

(I’m asking as I have it nicely hooked up to google voice recogniser via

Posted at 21:28

September 23

Ebiquity research group UMBC: talk: Automated Knowledge Extraction from the Federal Acquisition Regulations System

In this week’s meeting, Srishty Saha, Michael Aebig and Jiayong Lin will talk about their work on extracting knowledge from the US FAR System.

Automated Knowledge Extraction from the Federal Acquisition Regulations System

Srishty Saha, Michael Aebig and Jiayong Lin

11am-12pm Monday, 25 September 2017, ITE346, UMBC

The Federal Acquisition Regulations System (FARS) within the Code of Federal Regulations (CFR) includes facts and rules for individuals and organizations seeking to do business with the US Federal government. Parsing and extracting knowledge from such lengthy regulation documents is currently done manually and is time and human intensive. Hence, developing a cognitive assistant for automated analysis of such legal documents has become a necessity. We are developing a semantically rich legal knowledge base representing legal entities and their relationships, semantically similar terminologies, deontic expressions and cross-referenced legal facts and rules.

Posted at 21:26

September 22

Tetherless World Constellation group RPI: WebSci ’17 Tutorial Note– Analyzing Geolocated Data with Twitter

Speaker:

Prof. Bruno Gonçalves, New York University

(http://www.bgoncalves.com/)

Schedule

09:00-10:20 theory session

10:30-12:00 practical session

Theory Session:

GPS-enabled smartphone: provides precise geographic locations

Jan '17 global digital snapshot

Social MEowDia Explained- different behaviors on different social media

Twitter:

Anatomy of a tweet: short (Twitter started as a messaging system), hashtag, how many times shared, timestamp, location (comes from your GPS system), background info—metadata

Metadata:

Text-content, User, Geo, URL, etc.

Geolocated Tweets:

Follows a user’s geo info over time

GPS Coordinates vs World Population

Smartphone ownership—highest among adults, higher education/ income levels (results from survey)

Market Penetration: larger user group in higher GDP countries

Age Distribution

Demographics: ICWSM’11 375(2011)

Language and Geography: different languages show different distributions across geographic locations, for example, Spanish and English distributions in NYC

Multilayer Network (layers, top to bottom):

Retweet (information layer) → Mention → Follower (social layer)

Link Function–ICWSM’ 11, 89 (2011)

Cluster—retweets ~= agreement; mention ~= discussion

Retweets and mention have very different meanings

The Strength of Ties: chains of ties

Interviews to find out how individuals found out about opportunities

Mostly from acquaintances or friends of friends

It is argued that the degree of overlap of two individuals' social networks varies directly with the strength of their tie to one another.

Neighborhood Overlap

Network Structures: arrows: retweets; clusters: different friendship communities; dots: users; a person/user serves as a bridge between communities.

Links: internal, between groups, intermediary, etc.

Groups

Geography

Retweet (information layer) → Mention → Follower (social layer) → Geographic location

Twitter follower distance

Locality: measures the percentage of a user's friends who live in the same country.

Co-occurrences and social ties

Geotagged Flickr Photos

Divide the world into a grid and count the number of cells in which two individuals were present within a given time interval

Measure: sharing photos within a period of time in the same grid cell predicts the likelihood of becoming friends (see the sketch below)
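A rough sketch of this measure in Python; the cell size, window and coordinates below are invented for illustration:

def bucket(lat, lon, timestamp, cell_deg=0.5, window_s=86400):
    """Map an event to a (grid cell, time window) bucket."""
    return (int(lat // cell_deg), int(lon // cell_deg), int(timestamp // window_s))

def co_occurrences(events_a, events_b):
    """Count distinct buckets in which both users appear.
    events_* are iterables of (lat, lon, unix_timestamp)."""
    return len({bucket(*e) for e in events_a} & {bucket(*e) for e in events_b})

# Two users photographed in the same cell on the same day, once.
a = [(40.7, -74.0, 1500000000), (48.8, 2.35, 1500400000)]
b = [(40.6, -73.9, 1500050000)]
print(co_occurrences(a, b))  # -> 1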

Mobility: school/work—home—vacation—move to different city/country

Airline Flights: in Europe within 24h

Commuting: train, subway, bus, etc.

Realistic Epidemic Spreading

Human Mobility: Statistical Model

Privacy (Sci Rep 3, 1376(2013))

How many indicators we need to identify a unique person.

Mobility and Social Network (PLoS One 9, E92196 (2014))

Geo-Social Properties- Matrix of social behavior over distance: Probability of a link, reciprocity, Clustering, Triangle disparity

Geo-Social Model:

  • Starting position of user u
  • Either visit a random neighbor or jump to a new location
  • New position of u

Model fitting: probability of visiting old friend vs meeting new friend

Human Diffusion: how people are moving around on map (J.R.Sco. Interface 12, 20150473 (2015))

Residents and Tourists

City Communities

Practical Session:

https://github.com/bmtgoncalves/WebSci17

Environment Requirement: anaconda & python

Registering an Application

API basics

The Twitter module provides the OAuth interface; we just need to provide the right credentials.

Best to keep the credentials in a dict and parametrize our calls with the dict keys to switch accounts.

.Twitter(auth) takes an OAuth instance as an argument and returns a Twitter object.

Authenticating with the API

In the remainder of this course, the accounts dict will live inside the twitter_accounts.py file.

4 basic types of objects: tweets, users, entities, places.

Searching for Tweets

.search.tweets(query, count)  https://dev.twitter.com/docs/api/1.1/get/search/tweets

  • query is the content to search for
  • count is the maximum number of results to return (from most recent tweets)

returns a dict with a list of ‘statuses’
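A minimal sketch putting the above together, using the sixohsix twitter package as in the tutorial scripts; the credentials are placeholders:

from twitter import Twitter, OAuth

accounts = {
    "default": OAuth("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET",
                     "CONSUMER_KEY", "CONSUMER_SECRET"),
}

t = Twitter(auth=accounts["default"])  # Twitter() takes an OAuth instance

results = t.search.tweets(q="#WebSci17", count=10)  # dict with 'statuses'
for status in results["statuses"]:
    print(status["user"]["screen_name"], status["text"])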

Social Connections

.friends.ids() and .followers.ids() return a list of up to 500 of a user’s friends or followers for a given screen_name or user_id.

The result is a dict containing multiple fields.

User Timeline

.statuses.user_timeline() returns a set of tweets posted by a single user.

Important options:

  • include_rts = ‘true’ to include retweets
  • count = 200 is the max # of tweets to return in each call
  • trim_user = ‘true’ to not include the user information
  • max_id = 1234 to include only tweets with an id lower than 1234

Each call returns at most 200 tweets; with multiple calls you can get all of a user’s tweets, up to 3200 (see the pagination sketch below)
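A sketch of the max_id pagination loop implied above; assumes an authenticated Twitter object t as in the earlier sketch:

def fetch_timeline(t, screen_name, max_tweets=3200):
    """Page backwards through a timeline, 200 tweets per call."""
    tweets, max_id = [], None
    while len(tweets) < max_tweets:
        kwargs = {"screen_name": screen_name, "count": 200,
                  "include_rts": "true", "trim_user": "true"}
        if max_id is not None:
            kwargs["max_id"] = max_id
        batch = t.statuses.user_timeline(**kwargs)
        if not batch:
            break
        tweets.extend(batch)
        max_id = batch[-1]["id"] - 1  # only tweets older than the last seen
    return tweets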

Social Interaction

Data processing extended from user timeline

NetworkX–networkx_demo.py

Highly productive software for complex networks

Comes with anaconda

Simple python interface

Four different types of graphs

  • Graph—undirected graph
  • DiGraph—directed graph
  • MultiGraph—multi-edged graph
  • MultiDiGraph—multi-edged directed graph

Similar interface for all graphs

Nodes can be any type of python object

Growing graph—add nodes, edges, etc.

Graph Properties

  • .nodes() returns a list of nodes
  • .edges()
  • .degree() returns a dict with each node's degree; .in_degree()/.out_degree() for DiGraph
  • .is_connected()
  • .is_weakly/strongly_connected()
  • .connected_components()
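A minimal example matching the calls listed above:

import networkx as nx

g = nx.DiGraph()
g.add_edges_from([("alice", "bob"), ("bob", "carol"), ("carol", "alice")])

print(g.nodes())                    # node list
print(g.in_degree())                # in-degree of each node
print(nx.is_strongly_connected(g))  # True for this 3-cycle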

Snowball Sampling–snowball.py

Commonly used in Social Science and Computer Science

  • Start with a single node
  • Get friends list
  • For each friend get the friend list
  • Repeat for a fixed number of layers or until enough nodes have been collected

Generates a connected component graph
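A rough sketch of the loop; t is an authenticated Twitter object as before, and rate limits are ignored for brevity:

def snowball(t, seed, layers=2):
    """Breadth-first friend expansion from a seed user id."""
    seen, frontier, edges = {seed}, [seed], []
    for _ in range(layers):
        next_frontier = []
        for user_id in frontier:
            for friend in t.friends.ids(user_id=user_id)["ids"][:500]:
                edges.append((user_id, friend))
                if friend not in seen:
                    seen.add(friend)
                    next_frontier.append(friend)
        frontier = next_frontier
    return edges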

Streaming Geocoded data–twitter_location.py

The streaming api provides real-time data, subject to a filter

Use TwitterStream instead of the Twitter object

  • .statuses.filter(track=q) will return tweets that match the query q in real time
  • the call returns a generator that you can iterate over
  • .statuses.filter(locations=bb) will return tweets that occur within the bounding box bb in real time

bb is a comma-separated list of lon/lat coordinates: the south-west corner first, then the north-east corner.
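A sketch of the streaming filter; the credentials are placeholders and the bounding box is illustrative:

from twitter import TwitterStream, OAuth

auth = OAuth("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET",
             "CONSUMER_KEY", "CONSUMER_SECRET")
stream = TwitterStream(auth=auth)

# Bounding box roughly covering New York City (SW lon/lat, NE lon/lat).
for tweet in stream.statuses.filter(locations="-74.3,40.5,-73.7,40.9"):
    coords = tweet.get("coordinates")
    if coords:
        print(coords["coordinates"], tweet.get("text", ""))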

Shapefiles

Open specification developed by ESRI, still the current leader in commercial GIS software

Shapefiles aren't actually single files, but a set of files sharing the same name with different extensions.

The actual set of files changes depending on the contents, but 3 files are usually present:

  • .shp—also commonly referred to as the shapefile contains geometric info
  • .dbf—a simple database containing the feature attribute table
  • .shx—a spatial index

QGIS

Pyshp–shapefile_load.py

Pyshp defines utility functions to load and manipulate shapefiles programmatically.

The shapefile module handles the most common operations:

  • .Reader(filename) returns a reader object
  • reader.records()/iterRecords()
  • reader.shapes()/iterShapes()
  • reader.shapeRecords()/iterShapeRecords()

shape objects contain several fields:
bbox: lower-left and upper-right x,y coordinates (lon/lat)
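A minimal pyshp sketch matching the calls above; 'countries.shp' is a placeholder filename:

import shapefile  # the pyshp module

reader = shapefile.Reader("countries.shp")
for shape_rec in reader.iterShapeRecords():
    print(shape_rec.record, shape_rec.shape.bbox)  # attributes + lon/lat bbox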

Simple shapefile plot–plot_shapefile.py

Shapely–shapefile_shape_properties.py

Shapely defines geometric objects under shapely.geometry (Point, Polygon, MultiPolygon, shape())

and common operations (.crosses, .contains, etc.)

shape objects provide useful fields to query a shape's properties: .centroid, .area, .bounds, etc.

Filter Points with a shapefile–shapefile_filter.py
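A rough sketch of the filtering step with shapely; the filename is a placeholder, and shapely's shape() accepts pyshp's __geo_interface__:

import shapefile
from shapely.geometry import Point, shape

sf = shapefile.Reader("countries.shp")
polygon = shape(sf.shapes()[0])  # first shape as a shapely geometry

def inside(lon, lat):
    """True if the point lies within the polygon."""
    return polygon.contains(Point(lon, lat))

print(inside(-74.0, 40.7))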

Twitter Places–shapefile_filter_places.py

Twitter defines a “coordinates” field in tweets

There is also a place field that we glossed over

The place object also contains geographic info, but at a coarser resolution than the coordinates field

Each place has a unique place_id, a bounding_box and some geographical information such as country and full_name.

Places can be of several different types: admin, city, neighborhood, poi

Place Attributes: key, street_address, phone, post_code, region, iso3, twitter, URL, app:id, etc.

Filter points and places–plot_shapefile_points.py   

Aggregation–shapefile_filter_aggregate.py

Posted at 21:51

Leigh Dodds: The Lego Analogy

I think Lego is a great analogy for understanding the importance of data standards and registers.

Lego have been making plastic toys and bricks

Posted at 12:35

September 19

Leigh Dodds: Mapping wheelchair accessibility, how google could help

This month Google announced

Posted at 07:15

Copyright of the postings is owned by the original blog authors. Contact us.