Tag Archives: wikidata

Using wikidata for linked data WordPress indexes⤴

from @ Sharing and learning

A while back I wrote about getting data from wikidata into a WordPress custom taxonomy. Shortly thereafter Alex Stinson said some nice things about it, and as a result that post got a little attention.

Well, I now have a working prototype plugin which is somewhat more general purpose than my first attempt.

1. Custom Taxonomy Term Metadata from Wikidata

Here’s a video showing how you can create a custom taxonomy term with just a name and the wikidata Q identifier, and the plugin will pull down relevant wikidata for that type of entity:

[similar video on YouTube]

2. Linked data index of posts

Once this taxonomy term is used to tag a post, you can view the term’s archive page, and if you have a linked data sniffer you will see that the metadata from Wikidata is embedded in machine-readable form using schema.org. Here’s a screenshot of what the OpenLink structured data sniffer sees:

Or you can view the Google structured data testing tool output for that page.
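
To give a rough idea of what that machine-readable markup involves, here is a minimal sketch of a template fragment that emits schema.org RDFa for a person term. It is illustrative only: the function name and the term meta keys (wd_id, wd_name, wd_description) are made up for illustration, not necessarily what the plugin actually uses.

<?php
/*
 * Sketch only: output schema.org RDFa for a person term on its archive page.
 * The function name and term meta keys are illustrative assumptions.
 */
function omni_person_term_rdfa( $term_id ) {
	$wd_id = get_term_meta( $term_id, 'wd_id', true );
	$name  = get_term_meta( $term_id, 'wd_name', true );
	$desc  = get_term_meta( $term_id, 'wd_description', true );
	?>
	<div vocab="https://schema.org/" typeof="Person">
		<span property="name"><?php echo esc_html( $name ); ?></span>
		<span property="description"><?php echo esc_html( $desc ); ?></span>
		<a property="sameAs" href="https://www.wikidata.org/entity/<?php echo esc_attr( $wd_id ); ?>">Wikidata</a>
	</div>
	<?php
}
?>

A harvester reading the archive page picks up the Person resource and can follow the sameAs link (and similar links to ISNI or VIAF) to find more metadata.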

Features

  • You can create terms for custom taxonomies with just a term name (which is used as the slug for the term) and the Wikidata Q number identifier. The relevant name, description and metadata are pulled down from Wikidata.
  • Alternatively you can create a new term when you tag a post and later edit the term to add the wikidata Q number and hence the metadata.
  • The metadata retrieved from Wikidata varies to be suitable for the class of item represented by the term, e.g. birth and death details for people, date and location for events.
  • Term archive pages include the metadata from wikidata as machine readable structured data using schema.org. This includes links back to the wikidata record and other authority files (e.g. ISNI and VIAF). A system harvesting the archive page for linked data could use these to find more metadata. (These onward links put the linked in linked data and the web in semantic web.)
  • The type of relationship between the term and posts tagged with it is recorded in the schema.org structured data on the term archive page. Each custom taxonomy is for a specific type of relationship (currently about and mentions, but it would be simple to add others).
  • Shortcodes allow each post to list, e.g. in a simple text widget, the entries from a custom taxonomy that are relevant to it (see the sketch after this list).
  • This is a self-contained plugin: it includes default term archive page templates, so there is no need for a custom theme. The archive page is pretty basic (based on the twentysixteen theme), so you would get better results if you used it as the basis for an addition to a custom theme.
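
As a rough indication of how the shortcodes mentioned above might work (this is a sketch with made-up names, not the plugin’s actual code), something along these lines would list the about terms for the current post:

<?php
/*
 * Sketch only: a shortcode that lists the terms from an 'about' taxonomy
 * attached to the current post. The shortcode name, taxonomy slug and
 * function name are made up for illustration.
 */
function omni_about_terms_shortcode( $atts ) {
	$terms = get_the_terms( get_the_ID(), 'about' );
	if ( empty( $terms ) || is_wp_error( $terms ) ) {
		return '';
	}
	$items = array();
	foreach ( $terms as $term ) {
		$items[] = '<li><a href="' . esc_url( get_term_link( $term ) ) . '">'
			. esc_html( $term->name ) . '</a></li>';
	}
	return '<ul>' . implode( '', $items ) . '</ul>';
}
add_shortcode( 'omni_about', 'omni_about_terms_shortcode' );
?>

Dropped into a post or a text widget as [omni_about], that would give a linked list of the relevant terms.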

How’s it work / where is it

It’s on github. Do not use it on a production WordPress site. It’s definitely pre-alpha, and undocumented, and I make no claims for the code to be adequate or safe. It currently lacks error trapping / exception handling, and more seriously it doesn’t sanitize some things that should be sanitized. That said, if you fancy giving it a try do let me know what doesn’t work.

It’s based around two classes: one sets up a custom taxonomy and provides some methods for outputting terms and term metadata in HTML with suitable schema.org RDFa markup; the other handles getting the wikidata via SPARQL queries and storing this data as term metadata. Getting the wikidata via SPARQL is much improved over the way it was done in the original post I mentioned above. Other files create taxonomy instances, provide some shortcode functions for displaying taxonomy terms, and provide default term archive templates.
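
For a flavour of that approach (the real code is on github and differs; the function name, query and meta keys below are made up for illustration), fetching a person’s dates from the Wikidata query service and caching them as term metadata might look roughly like this:

<?php
/*
 * Sketch only: query the Wikidata SPARQL endpoint for a person's birth and
 * death dates and cache them as term metadata. Names are illustrative.
 */
function omni_sparql_person_dates( $term_id, $wd_id ) {
	$sparql = 'SELECT ?birth ?death WHERE { '
		. 'wd:' . $wd_id . ' wdt:P569 ?birth . '            // P569: date of birth
		. 'OPTIONAL { wd:' . $wd_id . ' wdt:P570 ?death } ' // P570: date of death
		. '} LIMIT 1';
	$response = wp_remote_get(
		'https://query.wikidata.org/sparql?format=json&query=' . urlencode( $sparql )
	);
	if ( is_wp_error( $response ) ) {
		return false;
	}
	$data = json_decode( wp_remote_retrieve_body( $response ) );
	if ( empty( $data->results->bindings ) ) {
		return false;
	}
	$result = $data->results->bindings[0];
	update_term_meta( $term_id, 'wd_birth_date', $result->birth->value );
	if ( isset( $result->death ) ) {
		update_term_meta( $term_id, 'wd_death_date', $result->death->value );
	}
	return true;
}
?>

The plugin generalises this idea: which properties are requested depends on the type of entity the term represents.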

Where’s it going

It’s not finished. I’ll see to some of the deficiencies in the coding, but I also want to get some more elegant output, e.g. single indexes / archives of terms from all taxonomies, no matter what the relationship between the post and the item that the term relates to.

There’s no reason why the source of the metadata need be Wikidata. The same approach could be used with any source of metadata, or by creating the term metadata in WordPress. As such this is part of my exploration of WordPress as a semantic platform. Using taxonomies related to educational properties would be useful for any instance of WordPress being used as a repository of open educational resources, or to disseminate information about courses, or to provide metadata for PressBooks being used for open textbooks.

I also want to use it to index PressBooks such as my copy of Omniana. I think the graphs generated may be interesting ways of visualizing and processing the contents of a book for researchers.

Licenses: Wikidata is CC0; the Wikidata logo used in the featured image for this post is sourced from Wikimedia and is also CC0, but is a registered trademark of the Wikimedia Foundation used with permission. The plugin, as a derivative of WordPress, will be licensed as GPLv2 (the bit about NO WARRANTY is especially relevant).

The post Using wikidata for linked data WordPress indexes appeared first on Sharing and learning.

Getting data from wikidata into WordPress custom taxonomy⤴

from @ Sharing and learning

I created a custom taxonomy to use as an index of people mentioned. I wanted it to work nicely as linked data, and so wanted each term in it to refer to the wikidata identifier for the person mentioned. Then I thought, why not get the data for the terms from wikidata?

Brief details

There are lots of tutorials on how to set up a custom taxonomy with custom metadata fields. I worked from this one from smashingmagazine to get a taxonomy called people, with a custom field for the wikidata id.
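
In outline (a sketch only; the tutorial has the full version, including the add and edit form fields), that setup looks something like this:

<?php
// Sketch of the set-up described above: register the 'people' taxonomy and
// save a Wikidata ID entered on the add/edit term forms as term meta.
// See the tutorial linked above for the complete version.
function omni_register_people_taxonomy() {
	register_taxonomy( 'people', 'post', array(
		'labels'       => array( 'name' => 'People', 'singular_name' => 'Person' ),
		'hierarchical' => false,
		'show_ui'      => true,
	) );
}
add_action( 'init', 'omni_register_people_taxonomy' );

function omni_save_wd_id( $term_id ) {
	if ( isset( $_POST['wd_id'] ) ) {
		update_term_meta( $term_id, 'wd_id', sanitize_text_field( $_POST['wd_id'] ) );
	}
}
add_action( 'created_people', 'omni_save_wd_id' );
add_action( 'edited_people', 'omni_save_wd_id' );
?>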

Once the wikidata ID is entered, this code will fetch & parse the data (it’s a work in progress as I add more fields):

<?php
// Fetch the JSON description of a Wikidata entity, given its Q number.
function omni_get_wikidata($wd_id) {
	print('getting wikidata<br />');
	if ('' !== trim( $wd_id ) ) {
		$wd_api_uri = 'https://wikidata.org/entity/'.$wd_id.'.json';
		$json = file_get_contents( $wd_api_uri );
		$obj = json_decode( $json );
		return $obj;
	} else {
		return false;
	}
}

// Pull a value of the given datatype out of a Wikidata claim, if present.
function get_wikidata_value($claim, $datatype) {
	if ( isset( $claim->mainsnak->datavalue->value->$datatype ) ) {
		return $claim->mainsnak->datavalue->value->$datatype;
	} else {
		return false;
	}
}

// Update a term in the people taxonomy with the name, description and
// metadata fetched from Wikidata.
function omni_get_people_wikidata($term) {
	$term_id = $term->term_id;
	$wd_id = get_term_meta( $term_id, 'wd_id', true );
	$args = array();
	$wikidata = omni_get_wikidata($wd_id);
	if ( $wikidata ) {
		$wd_name = $wikidata->entities->$wd_id->labels->en->value;
		$wd_description = $wikidata->entities->$wd_id->descriptions->en->value;
		$claims = $wikidata->entities->$wd_id->claims;
		$type = get_wikidata_value($claims->P31[0], 'id'); // P31: instance of
		if ( 'Q5' === $type ) {                            // Q5: human
			if ( isset( $claims->P569[0] ) ) {             // P569: date of birth
				$wd_birth_date = get_wikidata_value($claims->P569[0], 'time');
				print( $wd_birth_date.'<br/>' );
			}
		} else {
			echo(' Warning: that wikidata is not for a human, check the ID. ');
			echo(' <br /> ');
		}
		$args['description'] = $wd_description;
		$args['name'] = $wd_name;
		print_r( $args );print('<br />');
		update_term_meta( $term_id, 'wd_name', $wd_name );
		update_term_meta( $term_id, 'wd_description', $wd_description );
		wp_update_term( $term_id, 'people', $args );
	} else {
		echo(' Warning: no wikidata for you, check the Wikidata ID. ');
	}
}
add_action( 'people_pre_edit_form', 'omni_get_people_wikidata' );
?>

(Note: don’t add this to the edited_people hook unless you want a long wait while it causes itself to be called every time it is called…)

That on its own wasn’t enough. While the name and description of the term were being updated, the values displayed for them in the edit form weren’t updated until the page was refreshed. (Figuring out that it was mostly working took a while.) A bit of JavaScript inserted into the edit form fixed this:

function omni_taxonomies_edit_fields( $term, $taxonomy ) {
    $wd_id = get_term_meta( $term->term_id, 'wd_id', true );
    $wd_name = get_term_meta( $term->term_id, 'wd_name', true ); 
    $wd_description = get_term_meta( $term->term_id, 'wd_description', true ); 
//JavaScript required so that name and description fields are updated 
    ?>
    <script>
      // Grab the term edit form and its name and description fields.
      var f = document.getElementById("edittag");
      var n = document.getElementById("name");
      var d = document.getElementById("description");
      function updateFields() {
        n.value = "<?php echo($wd_name) ?>";
        d.innerHTML = "<?php echo($wd_description) ?>";
      }
      // Assign the handler itself (not the result of calling it) and run it once
      // now, so the form shows the values just fetched from Wikidata without a
      // page refresh.
      f.onsubmit = updateFields;
      updateFields();
    </script>
    <tr class="form-field term-group-wrap">
        <th scope="row">
            <label for="wd_id"><?php _e( 'Wikidata ID', 'omniana' ); ?></label>
        </th>
        <td>
            <input type="text" id="wd_id"  name="wd_id" value="<?php echo $wd_id; ?>" />
        </td>
    </tr>
    <?php
}
add_action( 'people_edit_form_fields', 'omni_taxonomies_edit_fields', 10, 2 );

 

The post Getting data from wikidata into WordPress custom taxonomy appeared first on Sharing and learning.

Wikidata driven timeline⤴

from @ Sharing and learning

I have been to a couple of wikidata workshops recently, both involving Ewan McAndrew; in between them I read Christine de Pizan’s Book of the City of Ladies(*). Christine de Pizan is described as one of the first women in Europe to earn her living as a writer, which made me wonder what other female writers were around at that time (e.g. Julian of Norwich and, err…). So, at the second of these workshops, I took advantage of Ewan’s expertise, and the additional bonus of Navino Evans, cofounder of Histropedia, also being there, to create a timeline of medieval European female writers. (By the way, it’s interesting to compare this to Asian female writers. I was interested in Christine de Pizan and wanted to see how she fitted in with others who might have influenced her or attitudes to her, and so didn’t think that Chinese and Japanese writers fitted into the same timeline.)

Histropedia timeline of medieval female authors (click on image to go to interactive version)

This was generated from a SPARQL query:

#Timeline of medieval European female writers
#defaultView:Timeline
SELECT ?person ?personLabel ?birth_date ?death_date ?country (SAMPLE(?image) AS ?image) WHERE {
  ?person wdt:P106 wd:Q36180; # find everything that is a writer
          wdt:P21 wd:Q6581072. # ...and a human female
  OPTIONAL{?person wdt:P2031 ?birth_date} # use floruit if present for birth/death dates
  OPTIONAL{?person wdt:P2032 ?death_date} # as some very imprecise dates give odd results
  ?person wdt:P570 ?death_date. # get their date of death
  OPTIONAL{?person wdt:P569 ?birth_date} # get their birth date if it is there
  ?person wdt:P27 ?country.   # get their country
  ?country wdt:P30  wd:Q46.   # we want country to be part of Europe
  FILTER (year(?death_date) < 1500) FILTER (year(?death_date) > 600)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  OPTIONAL { ?person wdt:P18 ?image. }
}
GROUP BY ?person ?personLabel ?birth_date ?death_date ?country
Limit 100

[run it on wikidata query service]

Reflections

I’m still trying to get my head around SPARQL; Ewan and Nav helped a lot, but I wouldn’t want to pass this off as exemplary SPARQL. In particular, I have no idea how to optimise SPARQL queries, and the way I get birth_date and death_date to be the start and end of when the writer flourished, if that data is there, seems a bit fragile.

It was necessary to use floruit dates because some of the imprecise birth & death dates led to very odd timeline displays: born C12th, died C13th showed as being alive for 200 years.

There were other oddities in the wikidata. When I first tried, Julian of Norwich didn’t appear because she was a citizen of the Kingdom of England, which wasn’t listed as a country in Europe. Occitania, on the other hand, was. That was fixed. More difficult was a writer from Basra who was showing up because Basra was in the Umayyad Caliphate, which included Spain and so was classed as a European country. Deciding what we mean by European has never been easy.

Given the complexities of the data being represented, it’s no surprise that the Wikidata data model isn’t simple. In particular I found that dealing with qualifiers for properties was mind bending (especially with another query I tried to write).

Combining my novice level of SPARQL and the complexity of the Wikidata data model, I could definitely see the need for SPARQL tutorials that go beyond the simple “here’s how you find a triple that matches a pattern” level.

Finally: Histropedia is pretty cool.

Footnote:

The Book of the City of Ladies is a kind of Women in Red for Medieval Europe. Rosalind Brown-Grant’s translation for Penguin Classics is very readable.

The post Wikidata driven timeline appeared first on Sharing and learning.