Wikidata, open data, and interoperability
This week I’m attending a conference titled Collaborations Workshop 2019, run by the Software Sustainability Institute of the UK. The conference focuses on interoperability, documentation, training and sustainability. I’m blogging my notes from the talks I attend. All credit goes to the presenter, and all mistakes are my own.
Franziska Heine presented a keynote on Wikidata, a Wikimedia project that provides structured data to Wikipedia and other data sets. Franziska is Head of Software & Development at Wikimedia Deutschland.
Franziska’s talk was titled “Wikidata, Interoperability and the Future of Scientific Work”.
The Wikidata project
Franziska said she’s very excited to be here and talk about Wikidata, as it’s such a big part of what her team does. She cares about making Wikipedia, which started 20 years ago, into something that remains meaningful in the future.
Wikidata gives interwiki links semantics, so that computers can understand the relationships between the pieces of data. When you ask Siri or Google Assistant a question, the answer often comes from Wikidata. Franziska also showed us a map of the world with a data overlay sourced from Wikidata. (I can’t find a link to that specific map, alas.)
Wikidata has more than 20,000 active editors per month. Its number of edits is the highest in the entire Wikimedia movement, surpassing even that of the English-language Wikipedia.
How Wikidata works
The core of Wikidata is a database of items. Each item describes a concept in the world. Each item has an ID number (“Q number”). Items also have descriptions and language information. In Wikipedia, the content for each language is completely separate. So, you can have the same topic in various languages, each with entirely different content. By contrast, in Wikidata all the languages are properties of the single data item. So, for example, each item has a description, and the description may be available in various languages.
Each item is also linked to all the various Wikipedia instances.
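To make the item model concrete, here's a minimal sketch of my own (not from the talk) in Python. The Item class, its field names and the describe helper are hypothetical, and the descriptions are illustrative rather than copied from Wikidata. Q42 is the real ID for Douglas Adams. The point is that one item holds descriptions in many languages, plus links to the Wikipedia article in each language.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Item:
    """Hypothetical, simplified stand-in for a Wikidata item."""
    qid: str  # the "Q number"
    # language code -> description text (illustrative values below)
    descriptions: Dict[str, str] = field(default_factory=dict)
    # wiki name -> article title on that Wikipedia
    sitelinks: Dict[str, str] = field(default_factory=dict)

    def describe(self, lang: str, fallback: str = "en") -> Optional[str]:
        """Return the description in `lang`, or in `fallback` if missing."""
        return self.descriptions.get(lang) or self.descriptions.get(fallback)


item = Item(
    qid="Q42",
    descriptions={
        "en": "English writer and humorist",
        "de": "britischer Schriftsteller",
    },
    sitelinks={"enwiki": "Douglas Adams", "dewiki": "Douglas Adams"},
)

print(item.describe("de"))  # German description exists
print(item.describe("fr"))  # no French description, so falls back to English
```

Contrast this with Wikipedia, where each language edition would hold its own separate article with no shared record behind it.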
Each item has a number of statements (pieces of information), such as date of birth, place of birth, date of death, and so on. Each statement lists the sources of the information. It is of course possible that different sources may provide conflicting information about a particular statement. For example, there may be different opinions about the date of birth of a person.
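Here's a rough sketch, again my own and not Wikidata's actual data model, of how statements with sources can sit side by side even when they disagree. The Statement class, the conflicting_properties helper, and all the values and source names are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Statement:
    """Hypothetical, simplified statement: a property, a value, and sources."""
    prop: str
    value: str
    sources: List[str] = field(default_factory=list)


def conflicting_properties(statements: List[Statement]) -> Dict[str, Set[str]]:
    """Return properties for which the statements give more than one value."""
    values_by_prop: Dict[str, Set[str]] = defaultdict(set)
    for statement in statements:
        values_by_prop[statement.prop].add(statement.value)
    return {p: v for p, v in values_by_prop.items() if len(v) > 1}


# Made-up data: two sources disagree about the date of birth.
statements = [
    Statement("date of birth", "1850-01-01", ["Hypothetical source A"]),
    Statement("date of birth", "1851-01-01", ["Hypothetical source B"]),
    Statement("place of birth", "Exampletown", ["Hypothetical source A"]),
]

conflicts = conflicting_properties(statements)
print(conflicts)
```

Keeping both values, each with its own source, means the disagreement is recorded rather than silently resolved.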
Wikidata can be edited by people, but there are also bots that do the updates. The concepts within Wikidata are not built primarily for humans to navigate, but rather for machines to understand. For example, Wikidata is able to give Siri and Google Assistant information in ways that Wikipedia can’t.
But can humans look at the data?
Yes! You can use the Wikidata Query Service to access the data. To get started, grab an example query and then adapt it. The query language is SPARQL.
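As a starting point, here's a small sketch of my own that builds a request URL for the query service's SPARQL endpoint without sending it. The query is the well-known "cats" example (items that are an instance of, P31, house cat, Q146), not one of the queries Franziska showed. The names ENDPOINT, CAT_QUERY and build_query_url are mine.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Example query: ten items that are an instance of (P31) house cat (Q146),
# with their English labels.
CAT_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def build_query_url(sparql: str) -> str:
    """Build a GET URL that asks the endpoint for JSON results."""
    params = {"query": sparql.strip(), "format": "json"}
    return ENDPOINT + "?" + urlencode(params)


url = build_query_url(CAT_QUERY)
print(url)
```

You can fetch the resulting URL with any HTTP client, or paste the query text into the query service's web editor. If you script requests, it's good practice to send a descriptive User-Agent header.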
Franziska showed us some interesting query results:
The location of trees grown from seeds that have travelled around the moon.
  

