Tag Archives: schema.org

Progress report for Educational and Occupational Credentials in schema.org⤴

from @ Sharing and learning

[This is cross-posted from the Educational and Occupational Credentials in schema.org W3C community group; if you are interested, please direct your comments there.]

Over the past few months we have been working systematically through the 30-or-so outline use cases for describing Educational and Occupational Credentials in schema.org, suggesting how they can be met with existing schema.org terms, or failing that working on proposals for new terms to add. Here I want to summarize the progress against these use cases, inviting review of our solutions and closure of any outstanding issues.

Use cases enabled

The list below summarizes information from the community group wiki for those use cases that we have addressed, with links to the outline use case description, the wiki page showing how we met the requirements arising from that use case, and proposed new terms on a test instance of schema.org (may be slow to load).

1.1 Identify subtypes of credential

1.2 Name search for credential

1.3 Identify the educational level of a credential

1.4 Desired/required competencies

1.6 Name search for credentialing organization

1.8 Labor market value

1.11 Recognize current competencies

1.13 Language of Credential

2.1 Coverage

2.2 Quality assurance

2.5 Renewal/maintenance requirements

2.6 Cost

3.1 Find related courses, assessments or learning materials

3.3 Relate credentials to competencies

3.4 Find credentialing organization

4.2 Compare credentials

  • Credentials can be compared in terms of any of the factors above, notably cost, competencies required, recognition and validity.

4.3 Build directories

1.5 Industry and occupation analysis

1.7 Career and education goal

1.10 Job vacancy

3.2 Job seeking

Use cases that have been ‘parked’

The following use cases have not been addressed; either they were identified as low priority or there was insufficient consensus as to how to enable them:

1.9 Assessment (see issue 5, no way to represent assessments in schema.org)

1.12 Transfer value: recognizing current credentials (a complex issue, relating to “stackable” credentials, recognition, and learning pathways)

2.3 Onward transfer value (as previous)

2.4 Eligibility requirements (discussed, but no consensus)

3.5 Find a service to verify a credential (not discussed, low priority)

4.1 Awarding a Credential to a Person  (not discussed, solution may be related to personal self-promotion)

4.4 Personal Self-promotion (pending discussion)

4.5 Replace and retire credentials (not discussed, low priority)

Summary of issues

As well as the unaddressed use cases above, there are some caveats about the way other use cases have been addressed. I have tried to be inclusive / exhaustive in what I have called out as an issue – I hope many of them can be acknowledged and left for future contributions to schema.org; we just need to clarify that they have been.

  • Issue 1: whether EducationalOccupationalCredential is a subtype of CreativeWork or Intangible.
  • Issue 2: competenceRequired only addresses the simplest case of individual required competencies.
  • Issue 3: whether accreditation is a form of recognition.
  • Issue 4: the actual renewal / maintenance requirements aren’t specified.
  • Issue 5: there is no way to represent Assessments in schema.org.
  • Issue 6: there is no explicit guidance on how to show required learning materials for a Course in schema.org.

There is an issues page on the wiki for tracking progress in disposing of these issues.

Summary of proposed changes to schema.org

Many of the use cases were addressed using terms that already exist in schema.org. The changes we currently propose are:

Addition of a new type EducationalOccupationalCredential

Addition of four properties with domain EducationalOccupationalCredential:

Addition of EducationalOccupationalCredential to the domain of two existing properties (with changes to their definition to reflect this):

Addition of EducationalOccupationalCredential to the range of three existing properties:
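As a concrete illustration of the proposals, here is a minimal sketch (in Python, emitting JSON-LD) of how a credential might be marked up with the proposed EducationalOccupationalCredential type. The competenceRequired property is one of the proposals discussed above; the credential name and competence value are invented for illustration, and the terms are still proposals on a test instance, not released schema.org vocabulary.

```python
import json

# A minimal JSON-LD sketch using the proposed EducationalOccupationalCredential
# type. competenceRequired is among the proposed properties discussed above;
# the concrete values are illustrative assumptions, not real credentials.
credential = {
    "@context": "http://schema.org/",
    "@type": "EducationalOccupationalCredential",
    "name": "Certificate in Basic Welding (illustrative)",
    # Issue 2 above: this only covers the simplest case, a single
    # required competency expressed as text.
    "competenceRequired": "Basic MIG welding",
}
print(json.dumps(credential, indent=2))
```

Note that, per issue 2 above, a plain text value like this only handles the simplest case of individually listed required competencies.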


Using wikidata for linked data WordPress indexes⤴

from @ Sharing and learning

A while back I wrote about getting data from wikidata into a WordPress custom taxonomy. Shortly thereafter Alex Stinson said some nice things about it, and as a result that post got a little attention.

Well, I have now a working prototype plugin which is somewhat more general purpose than my first attempt.

1. Custom Taxonomy Term Metadata from Wikidata

Here’s a video showing how you can create a custom taxonomy term with just a name and the wikidata Q identifier, and the plugin will pull down relevant wikidata for that type of entity:

[similar video on YouTube]

2. Linked data index of posts

Once this taxonomy term is used to tag a post, you can view the term’s archive page, and if you have a linked data sniffer, you will see that the metadata from Wikidata is embedded in machine readable form using schema.org. Here’s a screenshot of what the OpenLink structured data sniffer sees:

Or you can view the Google structured data testing tool output for that page.

Features

  • You can create terms for custom taxonomies with just a term name (which is used as the slug for the term) and the Wikidata Q number identifier. The relevant name, description and metadata are pulled down from Wikidata.
  • Alternatively you can create a new term when you tag a post and later edit the term to add the wikidata Q number and hence the metadata.
  • The metadata retrieved from Wikidata varies to be suitable for the class of item represented by the term, e.g. birth and death details for people, date and location for events.
  • Term archive pages include the metadata from wikidata as machine readable structured data using schema.org. This includes links back to the wikidata record and other authority files (e.g. ISNI and VIAF). A system harvesting the archive page for linked data could use these to find more metadata. (These onward links put the linked in linked data and the web in semantic web.)
  • The type of relationship between the term and posts tagged with it is recorded in the schema.org structured data on the term archive page. Each custom taxonomy is for a specific type of relationship (currently about and mentions, but it would be simple to add others).
  • Short codes allow each post to list the entries from a custom taxonomy that are relevant for it using a simple text widget.
  • This is a self-contained plugin. The plugin includes default term archive page templates without the need for a custom theme. The archive page is pretty basic (based on the twentysixteen theme), so you would get better results if you used it as the basis for an addition to a custom theme.

How’s it work / where is it

It’s on github. Do not use it on a production WordPress site. It’s definitely pre-alpha, and undocumented, and I make no claims for the code to be adequate or safe. It currently lacks error trapping / exception handling, and more seriously it doesn’t sanitize some things that should be sanitized. That said, if you fancy giving it a try do let me know what doesn’t work.

It’s based around two classes: one which sets up a custom taxonomy and provides some methods for outputting terms and term metadata in HTML with suitable schema.org RDFa markup; the other handles getting the wikidata via SPARQL queries and storing this data as term metadata. Getting the wikidata via SPARQL is much improved on the way it was done in the original post I mentioned above. Other files create taxonomy instances, provide some shortcode functions for displaying taxonomy terms and provide default term archive templates.
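To give a flavour of the second class’s job, here is a hypothetical sketch (in Python, though the plugin itself is PHP) of building the kind of SPARQL query one would send to the Wikidata Query Service to fetch basic metadata for a term’s Q identifier. The exact queries the plugin uses vary by entity class; the query below is a deliberately simplified stand-in using the standard prefixes the Wikidata endpoint predefines.

```python
# A simplified, hypothetical sketch of fetching term metadata from Wikidata
# via SPARQL, as the plugin does (the plugin's real queries differ and vary
# by entity class). The wd:, rdfs: and schema: prefixes are predefined by
# the Wikidata Query Service at https://query.wikidata.org/sparql.
def build_term_query(qid):
    """Return a SPARQL query for the English label and description of one entity."""
    return (
        "SELECT ?label ?description WHERE { "
        f"wd:{qid} rdfs:label ?label ; schema:description ?description . "
        'FILTER(LANG(?label) = "en") '
        'FILTER(LANG(?description) = "en") }'
    )

# The query string would be POSTed to the endpoint with format=json;
# the network call is omitted from this sketch.
query = build_term_query("Q42")
print(query)
```

The returned JSON bindings would then be stored as term metadata, which is roughly what the plugin’s wikidata-handling class does after each SPARQL call.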

Where’s it going

It’s not finished. I’ll see to some of the deficiencies in the coding, but also I want to get some more elegant output, e.g. single indexes / archives of terms from all taxonomies, no matter what the relationship between the post and the item that the term relates to.

There’s no reason why the source of the metadata need be Wikidata. The same approach could be with any source of metadata, or by creating the term metadata in WordPress. As such this is part of my exploration of WordPress as a semantic platform. Using taxonomies related to educational properties would be useful for any instance of WordPress being used as a repository of open educational resources, or to disseminate information about courses, or to provide metadata for PressBooks being used for open textbooks.

I also want to use it to index PressBooks such as my copy of Omniana. I think the graphs generated may be interesting ways of visualizing and processing the contents of a book for researchers.

Licenses: Wikidata is CC:0, the wikidata logo used in the featured image for this post is sourced from wikimedia and is also CC:0 but is a registered trademark of the wikimedia foundation used with permission. The plugin, as a derivative of WordPress, will be licensed as GPLv2 (the bit about NO WARRANTY is especially relevant).


Educational and occupational credentials in schema.org⤴

from @ Sharing and learning

Since the summer I have been working with the Credential Engine, which is based at Southern Illinois University, Carbondale, on a project to facilitate the description of educational and occupational credentials in schema.org. We have just reached the milestone of setting up a W3C Community Group to carry out that work.  If you would like to contribute to the work of the group (or even just lurk and follow what we do) please join it.

Educational and occupational credentials

By educational and occupational credentials I mean diplomas, academic degrees, certifications, qualifications, badges, etc., that a person can obtain by passing some test or examination of their abilities. (See also the Connecting Credentials project’s glossary of credentialing terms.)  These are already alluded to in some schema.org properties that are pending, for example an Occupation or JobPosting’s qualification or a Course’s educationalCredentialAwarded. These illustrate how educational and occupational credentials are useful for linking career aspirations with discovery of educational opportunities. The other entity type to which educational and occupational credentials link is Competence, i.e. the skills, knowledge and abilities that the credential attests. We have been discussing some work on how to describe competences with schema.org in recent LRMI meetings, more on that later.
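The pending properties mentioned above can be sketched as follows (in Python, emitting JSON-LD): a Course naming the credential it awards, and a JobPosting naming the qualification it requires, with the shared credential being what links education to career aspirations. All the concrete names and values here are invented for illustration; only the types and the two pending property names come from the text above.

```python
import json

# A sketch of the pending schema.org properties mentioned above. The course,
# job and credential names are invented; educationalCredentialAwarded and
# qualification are the pending properties named in the text.
course = {
    "@context": "http://schema.org/",
    "@type": "Course",
    "name": "Introduction to Welding (illustrative)",
    "educationalCredentialAwarded": "Certificate in Basic Welding (illustrative)",
}
job = {
    "@context": "http://schema.org/",
    "@type": "JobPosting",
    "title": "Welder (illustrative)",
    "qualification": "Certificate in Basic Welding (illustrative)",
}
# The shared credential is the link between career aspiration and
# educational opportunity.
print(json.dumps([course, job], indent=2))
```

A fuller model would describe the credential as an entity in its own right, rather than as a text value, which is precisely what the community group set out to enable.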

Not surprisingly there is already a large amount of relevant work done in the area of educational and occupational credentials. The Credential Engine has developed the Credential Transparency Description Language (CTDL), which has a lot of detail, albeit with a US focus and far more detail than would be appropriate for schema.org. The Badge Alliance has a model for open badges metadata that is applicable more generally. There is a W3C Verifiable Claims working group which is looking at credentials more widely, and at the claim to hold one. Also, there are many frameworks which describe credentials in terms of the level and extent of the knowledge and competencies they attest; in the US Connecting Credentials covers this domain, while in the EU there are many national qualification frameworks and a Framework for Qualifications of the European Higher Education Area.

Potential issues

One potential issue is collision with existing work. We’ll have to make sure that we know where the work of the educational and occupational credential working group ends, i.e. what work would best be left to those other initiatives, and how we can link to the products of their work. Related to that is scope creep. I don’t want to get involved in describing credentials more widely, e.g. issues of identification, authentication, authorization; hence the rather verbose formula of ‘educational and occupational credential’. That formula also encapsulates another issue, a tension I sense between the educational world and the workplace: does a degree certificate qualify someone to do anything, or does it just relate to knowledge? Is an exam certificate a qualification?

The planned approach

I plan to approach this work in the same way that the schema course extension community group worked. We’ll use brief outline use cases to define the scope, and from these define a set of requirements, i.e. what we need to describe in order to facilitate the discovery of educational and occupational credentials. We’ll work through these to define how to encode the information with existing schema.org terms, or if necessary, propose new terms. While doing this we’ll use a set of examples to provide evidence that the information required is actually available from existing credentialing organizations.

Get involved

If you want to help with this, please join the community group. You’ll need a W3C account, and you’ll need to sign an assurance that you are not contributing any intellectual property that cannot be openly and freely licensed.


Schema course extension update⤴

from @ Sharing and learning

This progress update on the work to extend schema.org to support the discovery of any type of educational course is cross-posted from the Schema Course Extension W3C Community Group. If you are interested in this work please head over there.

What aspects of a course can we now describe?
As a result of work so far addressing the use cases that we outlined, we now have answers to the following questions about how to describe courses using schema.org:

As with anything in schema.org, many of the answers proposed are not the final word on all the detail required in every case, but they form a solid basis that I think will be adequate in many instances.

What new properties are we proposing?
In short, remarkably few. Many of the aspects of a course can be described in the same way as for other creative works or events. However, we did find that we needed to create two new types, Course and CourseInstance, to identify whether the description relates to a course that could be offered at various times or to a specific offering or section of that course. We also found the need for three new properties for Course: courseCode, coursePrerequisites and hasCourseInstance; and two new properties for CourseInstance: courseMode and instructor.

There are others under discussion, but I highlight these as proposed because they are being put forward for inclusion in the next release of the schema.org core vocabulary.
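Pulling the proposed terms together, a course description might look like the following sketch (in Python, emitting JSON-LD). The two types and five properties are the ones proposed above; every concrete value is invented for illustration.

```python
import json

# A sketch combining the proposed Course and CourseInstance types with the
# five proposed properties named in the text. All values are illustrative.
course = {
    "@context": "http://schema.org/",
    "@type": "Course",
    "name": "Introduction to Computer Science (illustrative)",
    "courseCode": "CS101",
    "coursePrerequisites": "None",
    "hasCourseInstance": {
        # A specific offering of the course, at a particular time
        # and in a particular mode of study.
        "@type": "CourseInstance",
        "courseMode": "online",
        "instructor": {"@type": "Person", "name": "A. Lecturer"},
    },
}
print(json.dumps(course, indent=2))
```

Because Course subtypes CreativeWork and CourseInstance subtypes Event, everything else (name, description, dates, location and so on) comes for free from the existing vocabulary.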

[Image showing how Google will display information about courses in a search gallery]

More good news: the Google search gallery documentation for developers already includes information on how to provide the most basic info about Courses. This is where we are going.

Schema course extension progress update⤴

from @ Sharing and learning

I am chair of the Schema Course Extension W3C Community Group, which aims to develop an extension for schema.org concerning the discovery of any type of educational course. This progress update is cross-posted from there.

If the forming-storming-norming-performing model of group development still has any currency, then I am pretty sure that February was the “storming” phase. There was a lot of discussion, much of it around the modelling of the basic entities for describing courses and how they relate to core types in schema (the Modelling Course and CourseOffering & Course, a new dawn? threads). Pleased to say that the discussion did its job, and we achieved some sort of consensus (norming) around modelling courses in two parts:

Course, a subtype of CreativeWork: A description of an educational course which may be offered as distinct instances at different times and places, or through different media or modes of study. An educational course is a sequence of one or more educational events and/or creative works which aims to build knowledge, competence or ability of learners.

CourseInstance, a subtype of Event: An instance of a Course offered at a specific time and place or through specific media or mode of study or to a specific section of students.

hasCourseInstance, a property of Course with expected range CourseInstance: An offering of the course at a specific time and place or through specific media or mode of study or to a specific section of students.

(see Modelling Course and CourseInstance on the group wiki)

This modelling, especially the subtyping from existing schema.org types allows us to meet many of the requirements arising from the use cases quite simply. For example, the cost of a course instance can be provided using the offers property of schema.org/Event.

The wiki is working to a reasonable extent as a place to record the outcomes of the discussion. Working from the outline use cases page you can see which requirements have pages, and those pages that exist point to the relevant discussion threads in the mail list and, where we have got this far, describe the current solution.  The wiki is also the place to find examples for testing whether the proposed solution can be used to mark up real course information.

As well as the wiki, we have the proposal on github, which can be used to build working test instances on appspot showing the proposed changes to the schema.org site.

The next phase of the work should see us performing, working through the requirements from the use cases and showing how they can be met. I think we should focus first on those that look easy to do with existing properties of schema.org/Event and schema.org/CreativeWork.

Is there a Library shaped black hole in the web? Event summary.⤴

from @ Open World

Is there a Library shaped black hole in the web? was the question posed by an OCLC event at the Royal College of Surgeons last week that focused on exploring the potential benefits of using linked data to make library data available to users through the web. For a comprehensive overview of the event, I’ve put together a Storify of tweets here: https://storify.com/LornaMCampbell/oclc-linked-data

Following a truly dreadful pun from Laura J Wilkinson…

Owen Stephens kicked off the event with an overview of linked data and its potential to be a lingua franca for publishing library data. Some of the benefits that linked data can afford to libraries include improving search, discovery and display of library catalogue record information, improved data quality and data correction, and the ability to work with experts across the globe to harness their expertise. Owen also introduced the Open World Assumption which, despite the coincidental title of this blog, was a new concept to me. The Open World Assumption states that

“there may exist additional data, somewhere in the world to complement the data one has at hand”.

This contrasts with the Closed World Assumption which assumes that

“data sources are well-known and tightly controlled, as in a closed, stand-alone data silo.”

— Learning Linked Data glossary, http://lld.ischool.uw.edu/wp/glossary/

Traditional library catalogues worked on the basis of the closed world assumption, whereas linked data takes an open world approach and recognises that other people will know things you don’t. Owen quoted Karen Coyle: “the catalogue should be an information source, not just an inventory”, and noted that while data on the web is messy, linked data provides the option to select sources we can trust.

Cathy Dolbear of Oxford University Press gave a very interesting talk from the perspective of a publisher providing data to libraries and other search and discovery services. OUP provides data to library discovery services, search engines, Wikidata, and other publishers. Most OUP products tend to be discovered by search engines; only a small number of referrals, 0.7%, come from library discovery services. OUP have two OAI-PMH APIs but they are not widely used, and they are very keen to learn why. The publisher’s requirements are primarily driven by search engines, but they would like to hear more from library discovery services.

Neil Jeffries of the Bodleian Digital Library was not able to be present on the day, but he overcame the inevitable technical hitches to present remotely.  He began by arguing that digital libraries should not be seen as archives or museums; digital libraries create knowledge and artefacts of intellectual discourse rather than just holding information. In order to enable this knowledge creation, libraries need to collaborate, connect and break down barriers between disciplines.  Neil went on to highlight a wide range of projects and initiatives, including VIVO, LD4L, CAMELOT, that use linked data and the semantic web to facilitate these connections. He concluded by encouraging libraries to be proactive and to understand the potential of both data and linked data in their own domain.

Ken Chad posed a question that often comes up in discussions about linked data and the semantic web; why bother?  What’s the value proposition for linked data?  Gartner currently places linked data in the trough of disillusionment, so how do we cross the chasm to reach the plateau of productivity?  This prompted my colleague Phil Barker to comment:

Ken recommended using the Jobs-to-be-Done framework to cross the chasm. Concentrate on users, but rather than just asking them what they want, focus on asking them what they are trying to do and identify their motivating factors – e.g. how will linked data help to boost my research profile?

For those willing to take the leap of faith across the chasm, Gill Hamilton of the National Library of Scotland presented a fantastic series of Top Tips! for linked data adoption which can be summarised as follows:

  • Strings to things, aka people smart, machines stupid – library databases are full of strings; people are really smart at reading strings, but unfortunately machines are really stupid. Turn strings into things with URIs so machines can make sense of them.
  • Never, ever, ever dumb down your data.
  • Open up your metadata – license your metadata CC0 and put a representation of it into the Open Metadata Registry.  Open metadata is an advert for your collections and enables others to work with you.
  • Concentrate on what is unique in your collections – one of the unique items from the National Library of Scotland that Gill highlighted was the order for the Massacre of Glencoe.  Ahem. Moving swiftly on…
  • Use open vocabularies.

Simples! Linked Data is still risky though: services go down, URIs get deleted, and there’s still more playing around than actual doing; however, it’s still worth the risk to help us link up all our knowledge.

Richard J Wallis brought the day to a close by asking how libraries can exploit the web of data to liberate their data. The web of data is becoming a web of related entities and it’s the relationships that add value. Google recognised this early on when they based their search algorithm on the links between resources. The web now deals with entities and relationships, not static records.

One way to encode these entities and relationships is using Schema.org. Schema.org aims to help search engines interpret information on web pages so that it can be used to improve the display of search results. Schema.org has two components: an ontology for naming the types and characteristics of resources, their relationships with each other, and constraints on how to describe these characteristics and relationships; and the expression of this information in machine readable formats such as microdata, RDFa Lite and JSON-LD. Richard noted that Schema.org is a form of linked data, but “it doesn’t advertise the fact”, and added that libraries need to “give the web what it wants, and what it wants is Schema.org.”
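As a small illustration of the JSON-LD flavour Richard describes, here is a sketch (in Python) of a bibliographic item described as an entity with typed relationships rather than a flat record. The book, author and authority URI are invented for illustration; sameAs is the standard schema.org property for linking an entity to other descriptions of it, such as authority file entries.

```python
import json

# A sketch of a catalogue item as schema.org entities and relationships
# serialised as JSON-LD. All names and the authority URI are illustrative.
record = {
    "@context": "http://schema.org/",
    "@type": "Book",
    "name": "An Example Catalogue Item",
    "author": {
        "@type": "Person",
        "name": "J. Author",
        # Onward link into the wider web of data, e.g. an authority file.
        "sameAs": "http://example.org/authority/123",
    },
}
print(json.dumps(record, indent=2))
```

It is these typed links between entities, rather than the record itself, that let search engines and other consumers connect library data into the wider web.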

If you’re interested in finding out more about Schema.org, Phil Barker and I wrote a short Cetis Briefing Paper on the specification which is available here: What is Schema.org? Richard Wallis will also be presenting a Dublin Core Metadata Initiative webinar on Schema.org and its applicability to the bibliographic domain on the 18th of November; registration here: http://dublincore.org/resources/training/#2015wallis.


ETA: Phil Barker has also written a comprehensive summary of this event over at his own blog, Sharing and Learning, here: A library shaped black hole in the web?