Tag Archives: lrmi

Schema course extension update⤴

from @ Sharing and learning

This progress update on the work to extend schema.org to support the discovery of any type of educational course is cross-posted from the Schema Course Extension W3C Community Group. If you are interested in this work please head over there.

What aspects of a course can we now describe?
As a result of work so far addressing the use cases that we outlined, we now have answers to the following questions about how to describe courses using schema.org:

As with anything in schema.org, many of the answers proposed are not the final word on all the detail required in every case, but they form a solid basis that I think will be adequate in many instances.

What new properties are we proposing?
In short, remarkably few. Many of the aspects of a course can be described in the same way as for other creative works or events. However we did find that we needed to create two new types Course and CourseInstance to identify whether the description related to a course that could be offered at various times or a specific offering or section of that course. We also found the need for three new properties for Course: courseCode, coursePrerequisites and hasCourseInstance; and two new properties for CourseInstance: courseMode and instructor.

There are others under discussion, but I highlight these as proposed because they are being put forward for inclusion in the next release of the schema.org core vocabulary.

showing how Google will display information about courses in a search galleryMore good news:  the Google search gallery documentation for developers already includes information on how to provide the most basic info about Courses. This is where we are going ?

Schema course extension progress update⤴

from @ Sharing and learning

I am chair of the Schema Course Extension W3C Community Group, which aims to develop an extension for schema.org concerning the discovery of any type of educational course. This progress update is cross-posted from there.

If the forming-storming-norming-performing model of group development still has any currency, then I am pretty sure that February was the “storming” phase. There was a lot of discussion, much of it around the modelling of the basic entities for describing courses and how they relate to core types in schema (the Modelling Course and CourseOffering & Course, a new dawn? threads). Pleased to say that the discussion did its job, and we achieved some sort of consensus (norming) around modelling courses in two parts

Course, a subtype of CreativeWork: A description of an educational course which may be offered as distinct instances at different times and places, or through different media or modes of study. An educational course is a sequence of one or more educational events and/or creative works which aims to build knowledge, competence or ability of learners.

CourseInstance, a subtype of Event: An instance of a Course offered at a specific time and place or through specific media or mode of study or to a specific section of students.

hasCourseInstance, a property of Course with expected range CourseInstance: An offering of the course at a specific time and place or through specific media or mode of study or to a specific section of students.

(see Modelling Course and CourseInstance on the group wiki)

This modelling, especially the subtyping from existing schema.org types allows us to meet many of the requirements arising from the use cases quite simply. For example, the cost of a course instance can be provided using the offers property of schema.org/Event.

The wiki is working to a reasonable extent as a place to record the outcomes of the discussion. Working from the outline use cases page you can see which requirements have pages, and those pages that exist point to the relevant discussion threads in the mail list and, where we have got this far, describe the current solution.  The wiki is also the place to find examples for testing whether the proposed solution can be used to mark up real course information.

As well as the wiki, we have the proposal on github, which can be used to build working test instances on appspot showing the proposed changes to the schema.org site.

The next phase of the work should see us performing, working through the requirements from the use cases and showing how they can be me. I think we should focus first on those that look easy to do with existing properties of schema.org/Event and schema.org/CreativeWork.

schema for courses⤴

from @ Sharing and learning

This is essentially an invite to get involved with building a schema extension for educational courses, by way of a description of work so far. If you want to reply it’s sent as an email schema.org mail list.

About a year ago there was a flurry of discussion about wanting to markup descriptions of courses in schema. Vicky Tardiff-Holland produced a proposal which we discussed in LRMI and elsewhere as a result of which various suggestions were and comments were added to that proposal.

I also led some work in LRMI around scope, use cases, requirements, existing data; which I hope will lead to validating/refining the proposal by some example data that could be used to demonstrate that it met the use cases.

I am up for another push on courses. I share the doc I was working on in the hope that it is good starting point. It’s a bit long, so here is an overview of what it contains:

  • scope: concerning discovery of any type of educational course (online/offline, long/short, scheduled/on-demand) Educational course defined as “some sequence of events and/or creative works which aims to build knowledge, competence or ability of learners”. (out of scope: information about students and their progression etc; information needed internally for course management rather than discovery)
  • comparators: a review of some established ways of sharing similar data
  • use cases
  • requirements arising from the use cases
  • mapping to some existing examples. I used hypothes.is to annotate existing web pages that describe different types of course, e.g. from Coursera or a University, tagging the requirement that the data was relevant to. Here’s an example of a page as tagged (click on a yellow highlight to show the relevant requirement as a comment with a tag)
    hypothes.is aggregates the selected information for each tag, to give a list of the information relevant to each use case, for example cost

I think the next step would be to review the use cases and requirements in light of some of the observations from the mapping, and to look again at the proposal to see how it reflects the data available/required. But first I want to try to get more people involved, see whether anyone has a better idea for how to progress, or if anyone wants to check the work so far and help move it forward.

Finally, I’m aware the docs and discussions so far around schema for courses are a scattered set of scraps and drafts. If there is enough interest it would be really useful to have it in one place.

Presentation: LRMI – using schema.org to facilitate educational resource discovery on the web and beyond⤴

from @ Sharing and learning

Today I am in London for the ISKO Knowledge Organisation in Learning and Teaching meeting, where I am presenting on LRMI and schema.org to facilitate educational resource discovery on the web and beyond. My slides are here, mostly they cover similar ground to presentations I’ve given before which have been captured on video or which I have written up in more detail. So here I’ll just point to my slides for today and summarise the new stuff.

LRMI uptake

People always want to know how much LRMI exists in the wild, and now schema.org reports this infomation. Go to the schema.org page for any class or property and at the top it says in how many domains they find markup for it. Obviously this misses that not all domains are equal in extent or importance: finding LRMI on pjjk,net should not count as equal to finding it on bbc.co.uk, but as a broad indicator it’s OK: finding a property on 10 domains or 10,000 domains is a valid comarison. LRMI properties are mostly reported as found on 100-1000 domains (e.g. learning resource type) or 10-100 domains (e.g. educational alignment). A couple of LRMI properties have greater usage, e.g. typical age range and is based on URL (10-50,00 and 1-10,000 domains respectively), but I guess that reflects their generic usefulness beyond learning resources. We know that in some cases LRMI is used for internal systems but not exposed on web pages, but still the level of usage is not as high as we would like.

I also often get asked about support for creating LRMI metadata, this time I’m including a mention of how it is possible to write WordPress plugins and themes with schema / LRMI support, and the drupal schema.org plugin. I’m also aware of “tagging tools” associated with various repositories, e.g. the learning registry and the Illinois Shared Learning Environment. I think it’s always going to be difficult to answer this one as the best support will always come from customising whatever CMS an organisation uses to manage their content or metadata and will be tailored to their workflow and the types of resources and educational contexts they work in.

As far implementation for search I still cover google custom search, as in the previous presentations.

Current LRMI activities

The DCMI LRMI task group is active, one of our priorities is to improve the support for people who want to use LRMI. Two activities are nearing fruitition: firstly, we are hoping to provide examples for relevant properties and type on the schema.org web site. Secondly, we want to provide better support for the vocabularies used for properties such as alignment type (in the Alignment Object), learning resource type etc, by way of clear definitions and machine readable vocabulary encodings (using SKOS). We are asking for public review and comment on LRMI vocabularies, so please take a look and get in touch.

Other work in progress is around schema for courses and extending some of the vocabularies mentioned above. We have monthly calls, if you would like to lend a hand please do get in touch.

Checking schema.org data with the Yandex structured data validator API⤴

from @ Sharing and learning

I have been writing examples of LRMI metadata for schema.org. Of course I want these to be valid, so I have been hitting various online validators quite frequently. This was getting tedious. Fortunately, the Yandex structured data validator has an API, so I could write a python script to automate the testing.

Here it is

import httplib, urllib, json, sys 
from html2text import html2text
from sys import argv

noerror = False

def errhunt(key, responses):                 # a key and a dictionary,  
    print "Checking %s object" % key         # we're going on an err hunt
    if (responses[key]):
        for s in responses[key]:             
            for object_key in s.keys(): 
                if (object_key == "@error"):              
                    print "Errors in ", key
                    for error in s['@error']:
                        print "tError code:    ", error['error_code'][0]
                        print "tError message: ", html2text(error['message'][0]).replace('n',' ')
                        noerror = False
                elif (s[object_key] != ''):
                    errhunt(object_key, s)
                    print "No errors in %s object" % key
        print "No %s objects" % key

    script, file_name = argv 
    print "tError: Missing argument, name of file to check.ntUsage: yandexvalidator.py filename"

    file = open( file_name, 'r' )
    print "tError: Could not open file ", file_name, " to read"

content = file.read()

    validator_url = "validator-api.semweb.yandex.ru"
    key = "12345-1234-1234-1234-123456789abc"
    params = urllib.urlencode({'apikey': key, 
                               'lang': 'en', 
                               'pretty': 'true', 
                               'only_errors': 'true' 
    validator_path = "/v1.0/document_parser?"+params
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "*/*"}
    validator_connection = httplib.HTTPSConnection( validator_url )
    print "tError: something went wrong connecting to the Yandex validator." 

    validator_connection.request("POST", validator_path, content, headers)
    response = validator_connection.getresponse()
    if (response.status == 204):
        noerror= True
        response_data = response.read()   # to clear for next connection
        response_data = json.load(response)
    print "tError: something went wrong getting data from the Yandex validator by API." 
    print "tcontent:n", content
    print "tresponse: ", response.read()
    print "tstatus: ", response.status
    print "tmessage: ", response.msg 
    print "treason: ", response.reason 
    print "n"


if noerror :
    print "No errors found."
    for k in response_data.keys():
        errhunt(k, response_data)


$ ./yandexvalidator.py test.html
No errors found.
$ ./yandexvalidator.py test2.html
Checking json-ld object
No json-ld objects
Checking rdfa object
No rdfa objects
Checking id object
No id objects
Checking microformat object
No microformat objects
Checking microdata object
Checking http://schema.org/audience object
Checking http://schema.org/educationalAlignment object
Checking http://schema.org/video object
Errors in  http://schema.org/video
	Error code:     missing_empty
	Error message:  WARNING: Не выполнено обязательное условие для передачи данных в Яндекс.Видео: **isFamilyFriendly** field missing or empty  
	Error code:     missing_empty
	Error message:  WARNING: Не выполнено обязательное условие для передачи данных в Яндекс.Видео: **thumbnail** field missing or empty  

Points to note:

  • I’m no software engineer. I’ve tested this against valid and invalid files. You’re welcome to use this, but it might not work for you. (You’ll need your own API key). If you see something needs fixing, drop me a line.
  • Line 51: has to be an HTTPS connection.
  • Line 58: we ask for errors only (at line 46) so no news is good news.
  • The function errhunt does most of the work, recursively.

The response from the API is a json object (and json objects are converted into python dictionary objects by line 62), the keys of which are the “id” you sent and each of the structured data formats that are checked. For each of these there is an array/list of objects, and these objects are either simple key-value pairs or the value may be an array/list of objects. If there is an error in any of the objects, the value for the key “@error” gives the details, as a list of Error_code and Error_message key-value pairs. errhunt iterates and recurses through these lists of objects with embedded lists of objects.

LRMI examples for schema.org⤴

from @ Sharing and learning

It’s been over two years since the LRMI properties were added to schema.org. One thing that we should have done much sooner is to create simple examples of how they can be used for the schema.org website (see, for example, the bottom of the Creative Work page). We’re nearly there.

We have two examples in the final stages of preparation, so close to ready that you can see previews of what we propose to add (at the bottom of that page).

The first example is very simple, just a few lines describing a lesson plan for US second grade teachers (NB, the lesson plan itself is not included in the example):

  <h1>Designing a treasure map</h1>
  <p>Resource type: lesson plan, learning activity</p>
  <p>Target audience: teachers</p>
  <p>Educational level: US Grade 2</p>
  <p>Location: <a href="http://example.org/lessonplan">http://example.org/lessonplan</a></p>

With added microdata that becomes

<div itemscope itemtype="http://schema.org/CreativeWork">
    <h1 itemprop="name">Designing a treasure map</h1>
    <p>Resource type: 
      <span itemprop="learningResourceType">lesson plan</span>, 
      <span itemprop="learningResourceType">learning activity</span>
    <p>Target audience: 
      <span itemprop="audience" itemscope itemtype="http://schema.org/EducationalAudience">
        <span itemprop="educationalRole">teacher</span></span>s.
    <p itemprop="educationalAlignment" itemscope itemtype="http://schema.org/AlignmentObject">
        <span itemprop="alignmentType">Educational level</span>: 
        <span itemprop="educationalFramework">US Grade Levels</span> 
        <span itemprop="targetName">2</span>
        <link itemprop="targetUrl" href="http://purl.org/ASN/scheme/ASNEducationLevel/2" />
    <p>Location: <a itemprop="url" href="http://example.org/lessonplan">http://example.org/lessonplan</a></p>

(Other flavours of schema.org markup are at the bottom of the AlignmentObject preview.)

This is illustrates a few points:

  • free text learning resource types, which can be repeated
  • the audience for the resource (teachers) is different from the grade level of the end users (pupils)
  • educationalRole is a property of EducationalAudience, not the CreativeWork being described
  • how the AlignmentObject should be used to specify the grade level appropriateness of the resource
  • human readable grade level information is supplemented with a machine readable URI as the targetUrl

The second example is more substantial. It is based on a resource from BBC bitesize (though we have hacked the HTML around a bit). Here’s the HTML

    <h1>The Declaration of Arbroath</h1>
    <p>A lesson plan for teachers with associated video. 
       Typical length of lesson, 1 hour. 
       Recommended for children aged 10-12 years old.
    <p>Subject: Wars of Scottish independence</p>
    <p>Alignment to curriculum:</p>
            National Curriculum: KS 3 History: The middle ages (12th to 15th century)
            SCQF: Level 2
            Curriculum for Excellence: Social studies: people past events and societies: The Wars of Independence
    <p>Location: <a href="http://example.org/lessonplan">http://example.org/lessonplan</a></p>
        <source src="http://example.org/movie.mp4" type="video/mp4" />
        Duration 03:12

and here’s the JSON-LD:

<script type="application/ld+json">
  "@context":  "http://schema.org/",
  "@type": "WebPage",
  "name": "The Declaration of Arbroath",
  "about": "Wars of Scottish independence",
  "learningResourceType": "lesson plan",
  "timeRequired": "1 hour",
  "typicalAgeRange": "10-12",
  "audience": {
      "@type": "EducationalAudience",
      "educationalRole": "teacher"
  "educationalAlignment": [
  "educationalAlignment": [
      "@type": "AlignmentObject",
      "alignmentType": "educationalSubject",
      "educationalFramework": " Curriculum for Excellence: ",
      "targetName": "Social studies: people past events and societies",
      "targetUrl": "http://example.org/CFE/subjects/3362"      
      "@type": "AlignmentObject",
      "alignmentType": "educationalLevel",
      "educationalFramework": "SCQF",
      "targetName": "Level 2",
      "targetUrl":  "http://example.org/SCQF/levels/2"      
      "@type": "AlignmentObject",
      "alignmentType": "educationalLevel",
      "educationalFramework": "National Curriculum",
      "targetName": "KS 3",
      "targetUrl": "http://example.org/ENC/levels/KS3"
      "@type": "AlignmentObject",
      "alignmentType": "educationalSubject",
      "educationalFramework": "National Curriculum",
      "targetName": "History: The middle ages (12th to 15th century)",
      "targetUrl" : "http://example.org/ENC/subjects/3102"
  "url" : "http://example.org/lessonplan",
  "video": {
    "@type": "VideoObject",
    "description": "Video description",
    "duration": "03:12",
    "name": "Video Title",
    "thumbnailUrl": "http://example.org/thubnail.mp4",
    "uploadDate": "2000-01-01",
    "url" : "http://example.org/movie.mp4"

(Again other flavours of schema.org markup are at the bottom of the AlignmentObject preview.)

The additional points to note here are that

  • we chose to mark up the lesson plan as the learning resource rather than the video. Nothing wrong with marking up the video as a learning resource, but things like educational alignment will be more explicit for lesson plans. (Again, neither the lesson plan nor the video are in the example)
  • the typical learning time of the lesson plan is not the same as the duration of the video
  • this example treat the English and Scottish curricula as providing subjects for study at given educational levels. It would be possible to go deeper and align to specific competence statements (e.g. say that the alignment is that the resource “teaches” Curriculum for Excellence outcome SOC 2-01a “I can use primary and secondary sources selectively to research events in the past.”
  • the educational subject (i.e. the educational context of the resource in terms of those subjects studied at school) is different from the topic that the resource is about
  • it’s a shame that there are no real URIs to use for the targets of these aligments
  • while you can omit the url property for the described resource, we thought it safer to include it; the default value is the URI of the webpage in which the markup is included, which may be what you need, but in the case of stand-alone JSON-LD you’re lost without an explicit value for “url”

As ever, comments would be welcome.

WordPress as a semantic web platform?⤴

from @ Sharing and learning

For the work we’ve been doing on semantic description of courses we needed a platform for creating and editing instance data flexibly and easily. We looked at callimachus and Semantic MediaWiki; in the end we went with the latter because of JAVA version incompatibility problems with the other, but it has been a bit of a struggle. I’ve used WordPress for publishing information about resources on a couple of projects, for Cetis publications and for learning resources, and have been very happy with it. WordPress handles the general task of publishing stuff on the web really well, it is easily extensible through plugins and themes, I have nearly always found plugins that allow me to do what I want and themes that with a little customization allow me to present the information how I want. As a piece of open source software it is used on a massive scale (about a quarter of all web domains use it) and has the development effort and user support to match. For the previous projects my approach was to have a post for each resource I wanted to describe and to set the title, publication date and author to be those for the resource, I used the main body of the post for a description and used tags and categories for classification, e.g. by topic or resource type; other metadata could be added using WordPress’s Custom Fields, more or less as free text name-value pairs. While I had modified themes so that  the semantics of some of this information was marked up with microdata or RDFa embedded in the HTML, I was aware that WordPress allowed for more than I was doing.

The possibility of using WordPress for creating and publishing semantic data hinges on two capabilities that I hadn’t used before: firstly the ability to create custom post types so that for each resource type there can be a corresponding post type; secondly the ability to create custom metadata fields that go beyond free text. I used these in conjunction with a theme which is a child theme of the current default, TwentyFifteen, which sets up the required custom types and displays them. Because I am familiar with it and it is quite general purpose, I chose the schema.org ontology to implement, but I think the ideas in this post would be applicable to any RDF vocabulary. When creating examples I had in mind a dataset describing the books that I own and the authors I am interested in.

I started by using a plugin, Custom Post Type UI, to create the post types I wanted but eventually was doing enough in php as a theme extensions (see below) that it made sense just to add a function to create the the post types. This drops the dependency on the plugin (though it’s a good one) and means the theme works from the outset without requiring custom types to be set up manually.

add_action( 'init', 'create_creativework_type' );
function create_creativework_type() {
  register_post_type( 'creativework',
      'labels' => array(
        'name' => __( 'Creative Works' ),
        'singular_name' => __( 'Creative Work' )
      'public' => true,
      'has_archive' => true,
      'rewrite' => array('slug' => 'creativework'),
      'supports' => array('title', 'thumbnail', 'revisions' )

The key call here is to the WP function register_post_type() which is used to create a post type with the same name as the schema.org resource type / class; so I have one of these for each of the schema.org types I use (so far Thing, CreativeWork, Book and Person). This is hooked into the WordPress init process so it is done by the time you need those post types.

I do use a plugin to help create the custom metadata fields for every property except the name property (for which I use the title of the post). Meta Box extends the WordPress API with some functions that make creating metadata fields in php much easier. These metadata fields can be tailored for particular data types, e.g. text, dates, numbers, urls and, crucially, links to other posts. That last one gives you what you need for to create relationships between the resources you describe in WordPress, which can expressed as triples. Several of these custom fields can be grouped together into a “meta box” and attached as a group to specific post types so that they are displayed when editing posts of those types. Here’s what declaring a custom metadata field for the author relationship between a CreativeWork and a Person looks like with MetaBox (for simplicity I’ve omitted the code I have for declaring the other properties of a Creative Work and some of the optional parameters). I’m using the author property as an example because a repeatable link to another resource is about as complicated a property as you get.

function semwp_register_creativework_meta_boxes( $meta_boxes )
    $prefix = 'semwp_creativework_';

    // 1st meta box
    $meta_boxes[] = array(
        'id'         => 'main_creativework_info',
        'title'      => __( 'Main properties of a schema.org Creative Work', 'semwp_creativework_' ),
        // attach this box to the following post types
        'post_types' => array('creativework', 'book' ),

	// List of meta fields
	'fields'     => array(
            // Author
            // Link to posts of type Person.
                'name'        => __( 'Author (person)', 'semwp_creativework_' ),
                'id'          => "{$prefix}authors",
                'type'        => 'post',
                'post_type'   => 'person',
                'placeholder' => __( 'Select an Item', 'semwp_creativework_' ),
            // set clone to true for repeatable fields
            'clone' => true
    return $meta_boxes;

What this gives when editing a post of type book is this:


WordPress uses a series of nested templates to display content, which are defined in the theme and can either be specific to a post type or generic, the generic ones being used as a fall back if a more specific one does not exist. As I mentioned I use a child theme of TwentyFifteen which means that I only have to include those files that I change from the parent. To display the main content of posts of type book I need a file called content-book.php (the rest of the page is common to all types of post), which looks like this

<article resource="?<?php the_ID() ; ?>#id" id="?<?php the_ID(); ?>" <?php post_class(); ?> vocab="http://schema.org/" typeof="Book">

<header class="entry-header">
        if ( is_single() ) :
            the_title( '

<h1 class="entry-title" property="name">', '<;/h1>' );
        else :
            the_title( sprintf( '

<h2 class="entry-title"><;a href="%s" rel="bookmark">', esc_url( get_permalink() ) ), '</a></h2>

;' );

<div class="entry-content">
    <?php semwp_print_creativework_author(); ?>
    <?php semwp_print_book_bookEdition(); ?>
    <?php semwp_print_book_numberOfPages(); ?>
    <?php semwp_print_book_isbn(); ?>
    <?php semwp_print_book_illustrator(); ?>
    <?php semwp_print_creativework_datePublished(); ?>
    <?php semwp_print_book_bookFormat(); ?>
    <?php semwp_print_creativework_sameAs(); ?></div>

<footer class="entry-footer">
    <?php twentyfifteen_entry_meta(); ?>
    <?php edit_post_link( __( 'Edit', 'twentyfifteen' ), '<span class="edit-link">', '</span>' ); ?>
    <?php semwp_print_extract_rdf_links(); ?>


Note the RDFa in some of the html tags, for example the <article> tag includes

resource= [url]#id vocab="http://schema.org/" typeof="Book"

and the title is output in an <h1> tag with the


attribute. Exposing semantic data as RDFa is one (good) thing, but what about other formats? A useful web service called RDF Translator helps here. It has an API which allowed me to put a link at the foot of each resource page to the semantic data from that page in formats such as RDF/XML, N3 and JSON-LD; it’s quite not what you would want for fully fledged semantic data publishing but it does show the different views of the data that can be extracted from what is published.

Also note that most of the content is printed through calls to php functions that I defined for each property, semwp_print_creativework_author() looks like this (again a repeatable link to another resource is about as complex as it gets:

function semwp_print_alink($id) {
     if (get_the_title($id))       //it's a object with a title
         echo sprintf('<a property="url" href="%s"><span property="name">%s</span></a>', esc_url(get_permalink($id)), get_the_title($id) );
     else                          //treat it as a url
         echo sprintf('<a href="%s">%s</a>', esc_url($id), $id );
function semwp_print_creativework_author()
    if ( rwmb_meta( 'semwp_creativework_authors' ) )
	echo '

By: ';
	$authors = rwmb_meta( 'semwp_creativework_authors' );
        foreach ( $authors as $author )
               echo '<span property="author" typeof="Person">';
               echo '</span>';
        echo '


So in summary, for each resource type I have two files of php/html code: one which sets up a custom post type, custom metadata fields for the properties of that type (and any other types which inherit them) and includes some functions that facilitate the output of instance data as HTML with RDFa; and another file which is the WordPress template for presenting that data. Apart from a few generally useful functions related to output as HTML and modifications to other theme files (mostly to remove embedded data which I found distracting) that’s all that is required.

The result looks like this:

Note, this image is linked to the page on wordpress that is shows, click on it if you want to explore the little data that there is there, but please do be aware that it is a development site which won't always be working properly.
Note, this image is linked to the page on my WordPress install that it shows, click on it if you want to explore the little data that there is there, but please do be aware that it is a development site which won’t always be working properly.

And here’s the N3 rendering of the data in that page as converted by RDF Translator:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://www.pjjk.net/semanticwp/book/the-day-of-the-triffids> rdfa:usesVocabulary schema: .

<http://www.pjjk.net/semanticwp/book/the-day-of-the-triffids?36#id> a schema:Book ;
    schema:author [ a schema:Person ;
            schema:name "John Wyndham"@en-gb ;
            schema:url <http://www.pjjk.net/semanticwp/person/john-wyndham> ] ;
    schema:bookEdition "Popular penguins (2011)"@en-gb ;
    schema:bookFormat ""@en-gb ;
    schema:datePublished "2011-09-01"^^xsd:date ;
    schema:illustrator [ a schema:Person ;
            schema:name "John Griffiths"@en-gb ;
            schema:url <http://www.pjjk.net/semanticwp/person/john-griffiths> ] ;
    schema:isbn "0143566539"@en-gb ;
    schema:name "The day of the triffids"@en-gb ;
    schema:numberOfPages 256 ;
    schema:sameAs "http://www.amazon.co.uk/Day-Triffids-Popular-Penguins/dp/0143566539/"@en-gb,
        "https://books.google.co.uk/books?id=zlqAZwEACAAJ"@en-gb .

Further work: Ideas and a Problem

There’s a ton of other stuff that I can think of that could be done with this, from the simple, e.g. extend the range of types supported, to the challenging, e.g. exploring ways of importing data or facilitating / automating the creation of new post types from known ontologies, output in other formats, providing a SPARQL end point &c &c… Also, I suspect that much of what I have implemented in a theme would be better done as a plugin.

There is one big problem that I only vaguely see a way around, and that is illustrated above in the screenshot of the editing interface for the ‘about’ property. The schema.org/about property has an expected type of schema.org/Thing; schema.org types are hierarchical, which means the value for about can be a Thing or any subtype of Thing (which is to say of any type). This sort of thing isn’t unique to schema.org. However, the MetaBox plugin I use will only allow links to be made to posts of one specific type, and I suspect that reflects something about how WordPress organises posts of different custom types. I don’t think there is any way of asking it to show posts from a range of different types and I don’t think there is any way of saying that posts of type person are also of type thing and so on.  In practice this means that at the moment I can only enter data that shows books as being about unspecific Things; I cannot, for example, say that a biography is a book about a Person. I can only see clunky ways around this.

[Aside: the big consumers of schema data (Google, Bing, Yahoo, Yandex) will also permit text values for most properties and try to make what sense of it they can, so you could say that for any property either a string literal or a link to another resource should be permitted. This, I think, is a peculiarity of schema.org. The screenshot above of the data input form shows that the about field is repeated to provide the option of a text-only value, an approach hinting at one of the clunky unscalable solutions to the general problem described above.]

What next? I might set a student project around addressing some of these extensions. If you know a way around the selecting different type problem please do drop me a line. Other than that I can see myself extending this work slowly if it proves useful for other stuff, like creating examples of pages with schema.org or LRMI data in them. If anyone is really interested in the source code I could put it on github.