Tag Archives: JobPosting

JDX: a schema for Job Data Exchange

from @ Sharing and learning

[This rather long blog post describes a project that I have been involved with through consultancy with the U.S. Chamber of Commerce Foundation.  Writing this post was funded through that consultancy.]

The U.S. Chamber of Commerce Foundation has recently proposed a modernized schema for job postings based on the work of HR Open and Schema.org, the Job Data Exchange (JDX) JobSchema+. It is hoped JDX JobSchema+ will not just facilitate the exchange of data relevant to jobs, but will do so in a way that helps bridge the various other standards used by relevant systems.  The aim of JDX is to improve the usefulness of job data including signalling around jobs, addressing such questions as: what jobs are available in which geographic areas? What are the requirements for working in these jobs? What are the rewards? What are the career paths? This information needs to be communicated not just between employers and their recruitment partners and to potential job applicants, but also to education and training providers, so that they can create learning opportunities that provide their students with skills that are valuable in their future careers. Job seekers empowered with greater quantity and quality of job data through job postings may secure better-fitting employment faster and for longer duration due to improved matching. Preventing wasted time and hardship may be particularly impactful for populations whose job searches are less well-resourced and those for whom limited flexibility increases their dependence on job details which are often missing, such as schedule, exact location, and security clearance requirement. These are among the properties that JDX provides employers the opportunity to include for easy and quick identification by all.  In short, the data should be available to anyone involved in the talent pipeline. This broad scope poses a problem that JDX also seeks to address: different systems within the talent pipeline data ecosystem use different data standards so how can we ensure that the signalling is intelligible across the whole ecosystem?

The starting point for JDX was two of the most widely used data standards relevant to describing jobs: the HR Open Standards Recruiting standard, part of the foremost suite of standards covering all aspects of the HR sector, and the schema.org JobPosting schema, which is used to make data on web pages accessible to search engines, notably Google’s Job Search. These, and an analysis of the information required around jobs, job descriptions and job postings, their relationships to other entities such as organizations, competencies, credentials, experience and so on, were modelled in RDF to create a vocabulary of classes, properties, and concept schemes that can be used to create data. The full data model, which can be accessed on GitHub, is quite extensive: the description of jobs that JDX enables goes well beyond what is required for a job posting advertising a vacancy. A subset of the full model comprising those terms useful for job postings was selected for pilot testing, and this is available in a more accessible form on the Chamber Foundation’s website and is documented on the Job Data Exchange website. The results of the data analysis, modelling and piloting were then fed back into the HR Open and schema.org standards that were used as a starting point.

This is where things start to get a little complicated, as it means JDX has contributed to three related efforts.

JobPostings in schema.org

The modelling and piloting highlighted and addressed some issues that were within schema.org’s scope of enabling the provision of structured data about job postings on the web. These were discussed through a W3C Community Group on Talent Marketplace Signalling, and the solutions were reconciled with schema.org’s wider model and scope as a web-wide vocabulary that covers many other types of things apart from Jobs. The outcomes include that schema.org/JobPosting has several new properties (or modifications to how existing properties are used) allowing for such things as: a job posting with more than one vacancy, a job posting with a specified start date, a job posting with requirements other than competencies — i.e. physical, sensory and security clearance requirements, and more specific information about contact details and location within the company structure for the job being advertised.
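As a rough illustration of the kinds of additions described above, the sketch below builds a job posting as JSON-LD in Python. The property names used (totalJobOpenings, jobStartDate, physicalRequirement, sensoryRequirement, securityClearanceRequirement, employmentUnit, applicationContact) are my reading of the schema.org JobPosting page and are not listed in this post; the employer, contact address and all values are invented.

```python
import json

# Hypothetical job posting: every value here is invented for illustration.
job_posting = {
    "@context": "https://schema.org",
    "@type": "JobPosting",
    "title": "Warehouse Operative",
    "datePosted": "2020-03-01",
    "hiringOrganization": {"@type": "Organization", "name": "Example Logistics"},
    # More than one vacancy behind a single posting.
    "totalJobOpenings": 3,
    # A specified start date.
    "jobStartDate": "2020-04-06",
    # Requirements other than competencies.
    "physicalRequirement": "Able to lift loads of up to 20 kg",
    "sensoryRequirement": "Able to read small printed labels",
    "securityClearanceRequirement": "Basic background check",
    # Location within the company structure, and who to contact.
    "employmentUnit": {"@type": "Organization", "name": "Northern Distribution Centre"},
    "applicationContact": {"@type": "ContactPoint", "email": "jobs@example.org"},
}


def to_json_ld(posting: dict) -> str:
    """Serialize the posting as a JSON-LD string suitable for embedding in a web page."""
    return json.dumps(posting, indent=2)


if __name__ == "__main__":
    print(to_json_ld(job_posting))
```

Whether a given consumer, such as a search engine, makes use of any of these properties is a separate question from whether schema.org defines them.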

Because schema.org and JDX are both modelled in RDF as sets of terms that can be used to make independent statements about entities (rather than a record-based model such as XML documents) it was relatively easy to add terms to schema.org that were based on those in JDX. The only reason that the terms added to schema.org are not exactly the same as the terms in JDX JobSchema+ is that it was sometimes necessary to take into account already existing properties in schema.org, and the wider purpose and different audience of schema.org.

JDX in HROpen

As with schema.org, JDX highlighted some issues that are within the scope of the HROpen Standards Recruiting standard, and the aim is to incorporate the lessons learnt from JDX into that standard. However, the Recruiting standard is part of the inter-linked suite of specifications that HROpen maintains across all aspects of the HR domain, and these standards are in plain JSON, a record-based format specified through JSON-Schema files rather than RDF Schema. This makes integration of new terms and modelling approaches from JDX into HROpen more complicated than was the case with schema.org. As a first step the property definitions have been translated into JSON-Schema and partially integrated into the suite of HROpen standards. However, some of the structures, for example those for describing Organizations, were significantly different to how other HROpen standards treat the same types of entity, and so these were kept separate. The plan for the next phase is to further integrate JDX into the existing standards, enhance the use cases and documentation, and include RDF, JSON Schema, and XML XSD.

JDX JobSchema+ RDF Schema

Finally, of course, JDX still exists as an RDF Schema, currently on GitHub. The work on integration with HROpen surfaced some errors and other issues, which have been addressed. Likewise, feeding back into schema.org JobPosting means that there are new relationships between terms in JDX and schema.org that can be encoded in the JDX schema. There is also potential for other changes and remodelling as a result of findings from the JDX pilot of job postings. But given the progress made with integrating lessons learnt into schema.org and the HROpen Recruiting standard, what is the role of the RDF Schema compared to these other two?

Standard Strengths and Interoperability

Each of the three standards has strengths in its own niche. Schema.org provides a widely scoped vocabulary, mostly used for disseminating information on the open web. The most obvious consumers of data that use terms from schema.org are search engines trying to make sense of text in web pages, so that they can signal the key aspects of job postings with less ambiguity than can easily be done by processing natural text. Of course such data is also useful for any system that tries to extract data from webpages. Schema.org is also widely used as a source of RDF terms for other vocabularies, after all it doesn’t make much sense for every standard to define its own version of a property for the name of the thing being described, or a textual description of it—more on this below in the discussion of harmonization.

HROpen Standards are designed for system-to-system interoperability within the HR domain. If organization A and organization B (not to mention organizations C through to Z) have systems that do the same sort of thing with the same sort of data, then using an agreed standard for the data they care about clearly brings efficiencies by allowing for systems to be designed to a common specification and for organizations to share data where appropriate. This is the well understood driving force for interoperability specifications.

But what about when two organizations are using the same sort of data for different things? For example, it might be that they are part of different verticals which interact with each other but have significant differences aside from where they overlap; or it might be that one organization provides a horizontal service, such as web search, across several verticals. This is where it is useful to have a common set of “terms” from which data providers can pick and choose what is appropriate for communicating different aspects of what they care about to those who provide services that intersect or overlap with their own concern. For example a fully worked specification for learning outcomes in education would include much that is not relevant to the HR domain and much that overlaps; furthermore HR and education providers use different systems for other aspects of their work: HR will care about integration with payroll systems, education about integration with course management systems. There is no realistic prospect that the same data standards can be used to the extent that the record formats will be the same; however with the RDF approach of entity-focused description rather than defining a single record structure, there is no reason why some of the terms that are used to describe the HR view of competency shouldn’t also be used to describe the education view of learning outcomes. Schema.org provides a broad horizontal layer of RDF terms that can be used across many domains; JDX provides a deeper dive into the more specific vocabulary used in jobs data.

Data Harmonization

This approach to allowing mutual intelligibility between data standards in different domains to the extent that the data they care about overlaps (or, for that matter, competing data standards in the same domain) is known as data harmonization. RDF is very much suited to harmonization for these reasons:

  • its entity-based modelling approach does not pre-impose the notion of data requirements or inter-relationships between data elements in the way that a record-based modelling approach does;
  • in the RDF data community it is assumed that different vocabularies of terms (classes and properties for describing aspects of a resource) and concepts (providing the means to classify resources) will be developed in such a way that someone can mix and match terms from relevant vocabularies to describe all the entities that they care about; and
  • as it is assumed that there will be more than one relevant vocabulary it has been accepted that there will be related terms in separate vocabularies, and so the RDF schema that describe these vocabularies should also describe these relationships.

JDX was designed in the knowledge that it overlaps with schema.org. For example JDX deals with providing descriptions of organizations (who offer jobs), and with things that have names and so does schema.org. It is not necessary for JDX to define its own class of Organizations or property of name, it simply uses the class and property defined by schema.org. That means that any data that conforms to the JDX RDF schema automatically has some data that conforms with schema.org. No need to extract and transform RDF data before loading it when the modelling approach and vocabularies used are the same in the first place.
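A minimal sketch of that point, using plain Python tuples as (subject, predicate, object) statements rather than a real RDF library. The JDX namespace URI and the JDX property name are placeholders I have made up for illustration; only the schema.org and RDF identifiers are real.

```python
SDO = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
JDX = "https://example.org/jdx/"  # hypothetical namespace, not the real JDX URI

# Data described with JDX, which reuses schema.org's Organization class and name property.
jdx_data = [
    ("https://example.org/org/acme", RDF_TYPE, SDO + "Organization"),
    ("https://example.org/org/acme", SDO + "name", "Acme Widgets"),
    ("https://example.org/job/123", RDF_TYPE, JDX + "JobPosting"),
    ("https://example.org/job/123", SDO + "name", "Widget Assembler"),
    # Hypothetical JDX-only property, standing in for the deeper JDX vocabulary.
    ("https://example.org/job/123", JDX + "exampleJdxProperty", "some JDX-specific value"),
]


def schema_org_statements(triples):
    """Return the statements that already use schema.org terms, with no transformation."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if p.startswith(SDO) or (p == RDF_TYPE and str(o).startswith(SDO))
    ]


if __name__ == "__main__":
    for statement in schema_org_statements(jdx_data):
        print(statement)
```

Three of the five statements come out unchanged: the organization's type and name, and the job's name. Nothing was converted; those statements were schema.org data all along.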

Sometimes the match in terminology isn’t so good. At some point in the future we might, for example, be prepared to say that everything JDX calls a JobPosting is something that schema.org calls a JobPosting and vice versa. In this case we could add to the JDX schema a declaration that these are equivalent classes. In other cases we might say that some class of things in JDX form a subset of what schema.org has grouped as a class, in which case we could add to the JDX schema a declaration that the JDX class is a subclass of the schema.org class. Similar declarations can be made about properties.
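The declarations described here are themselves just RDF statements. The sketch below writes a few out as N-Triples from Python. The OWL and RDFS predicate URIs are the standard ones, but the JDX namespace and the JDX class and property names are placeholders I have invented, and no claim is made that JDX actually declares any of these relationships.

```python
SDO = "https://schema.org/"
JDX = "https://example.org/jdx/"  # hypothetical namespace, not the real JDX URI

OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_SUBPROPERTY_OF = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"

# Illustrative schema-level declarations (all JDX terms here are made up).
declarations = [
    # "Everything JDX calls a JobPosting is a schema.org JobPosting, and vice versa."
    (JDX + "JobPosting", OWL_EQUIVALENT_CLASS, SDO + "JobPosting"),
    # "This JDX class is a subset of a broader schema.org class."
    (JDX + "ExampleEmployer", RDFS_SUBCLASS_OF, SDO + "Organization"),
    # The same kind of statement about a property.
    (JDX + "exampleLabel", RDFS_SUBPROPERTY_OF, SDO + "name"),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for (s, p, o) in triples)


if __name__ == "__main__":
    print(to_ntriples(declarations))
```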

The reason why this is useful is that RDF schema are written in RDF and RDF data includes links to the definitions of the terms in the schema, so data about jobs and organizations and all the other entities described with JDX can be in a data store linked to the definitions of the terms used to describe them. These definitions can link to other definitions of related terms all accessible for querying.  This is linked data at the schema level. For a long time we have referred to this network of data along with definitions, which were seen as sprawling across the internet, as the Semantic Web, but more recently it has been found to be useful for datastores to be more focused, and the result of data about a domain along with the schema for those data is now commonly known as a knowledge graph. What matters is the consequence that by querying the data provided about things along with information about relationships between the data terms used we can achieve interoperability across data provided in different data standards. If a query system knows that some data relates to what JDX calls a JobPosting (because the data links to the JDX schema), and that everything JDX calls a JobPosting schema.org also calls a JobPosting (let’s say this is declared in the schema) then when asked about schema.org  JobPostings the query system knows it can return information about JDX JobPostings. RDF data management systems do this routinely and, for the end user, transparently.
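To make the JobPosting example concrete, here is a toy version of what such a query system does, written in plain Python rather than with a real RDF store. It assumes, purely for illustration, that the schema declares the two JobPosting classes equivalent; the JDX namespace URI and the data are invented.

```python
SDO = "https://schema.org/"
JDX = "https://example.org/jdx/"  # hypothetical namespace, not the real JDX URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"

# Schema-level statement (assumed for this example, not taken from the JDX schema).
schema = [(JDX + "JobPosting", OWL_EQUIVALENT_CLASS, SDO + "JobPosting")]

# Instance data from two providers using different vocabularies.
data = [
    ("https://example.org/job/1", RDF_TYPE, JDX + "JobPosting"),
    ("https://example.org/job/2", RDF_TYPE, SDO + "JobPosting"),
    ("https://example.org/org/acme", RDF_TYPE, SDO + "Organization"),
]


def equivalent_classes(cls, schema_triples):
    """Return cls plus every class linked to it by owl:equivalentClass, in either direction."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema_triples:
            if p != OWL_EQUIVALENT_CLASS:
                continue
            if s == current and o not in found:
                found.add(o)
                frontier.append(o)
            elif o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, data_triples, schema_triples):
    """Find everything typed as cls or as any class declared equivalent to it."""
    wanted = equivalent_classes(cls, schema_triples)
    return sorted({s for (s, p, o) in data_triples if p == RDF_TYPE and o in wanted})


if __name__ == "__main__":
    print(instances_of(SDO + "JobPosting", data, schema))
```

Asking for schema.org JobPostings returns both job/1 and job/2, even though job/1 was only ever described as a JDX JobPosting. Real RDF data management systems do a great deal more than this, but the principle is the same.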

That’s lovely if your data is in RDF; what if it is not? Most system-to-system interoperability standards don’t use RDF. This is the problem taken on by the  Data Ecosystem Schema Mapper (DESM) Tool. The approach it takes is to create local RDF schema describing the classes, properties and classifications used in these standards. The local RDF schema can assert equivalences between the RDF terms corresponding to each standard, or from each standard to an appropriate formal RDF vocabulary such as JDX.  Data can then be extracted from the record formats used and expressed as RDF using technologies such as the RDF Mapping Language (RML). This would allow us to build knowledge graphs that draw on data provided in existing systems, and query them without knowing what format or standard the data was originally in. For example, an employer could publish data in JSON using HR Open Standards’ Recruiting Standard. This data could be translated to the RDF representation of the standard created with the DESM Tool. Relationships expressed in the schema for the RDF representation would allow mapping of some or all of the data to JDX JobSchema+, schema.org JobPosting and other relevant standards. (The other standards may cover only part of the data, for example mapping skills requirements to standards used for competencies as learning objectives in the education domain.) This provides a route to translating data between standards that cover the same ground, and also provides data that can link to other domains.
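The extraction step can be pictured with a very small Python sketch. This is a hand-rolled stand-in for what a mapping language such as RML does declaratively, not RML itself, and not how the DESM Tool works internally. The JSON field names are invented and are not taken from the HR Open Recruiting standard; the target properties are schema.org ones chosen for illustration.

```python
import json

SDO = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping from fields in a record-based format to RDF properties.
FIELD_TO_PROPERTY = {
    "positionTitle": SDO + "title",
    "openings": SDO + "totalJobOpenings",
    "startDate": SDO + "jobStartDate",
}


def record_to_triples(record_json: str, base_uri: str):
    """Turn one JSON record into (subject, predicate, object) statements.

    Fields that have no entry in the mapping are simply left out.
    """
    record = json.loads(record_json)
    subject = base_uri + str(record["id"])
    triples = [(subject, RDF_TYPE, SDO + "JobPosting")]
    for field, prop in FIELD_TO_PROPERTY.items():
        if field in record:
            triples.append((subject, prop, record[field]))
    return triples


if __name__ == "__main__":
    example_record = json.dumps(
        {"id": "42", "positionTitle": "Widget Assembler", "openings": 2, "payrollCode": "X9"}
    )
    for triple in record_to_triples(example_record, "https://example.org/job/"):
        print(triple)
```

The payrollCode field is dropped because nothing in the mapping covers it, which mirrors the point that other standards may cover only part of the data.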


Stuart Sutton, of Sutton & Associates, led the creation of the JDX JobSchema+ and originated many of the ideas described in this blog post.

Many thanks to people who commented on drafts of this post, including Stuart Sutton, Danielle Saunders, Jeanne Kitchens, Joshua Westfall, Kim Bartkus. Any errors remaining are my fault.

Writing this post was part of work funded by the U.S. Chamber of Commerce Foundation.

The post JDX: a schema for Job Data Exchange appeared first on Sharing and learning.

One year of Talent Marketplace Signaling

from @ Sharing and learning

I chair the Talent Marketplace Signaling W3C Community Group; this progress report is cross-posted from its blog.

It is one year since the initial call for participation in the Talent Marketplace Signaling W3C Community Group. That seems like a good excuse to reflect on what we have done so far, where we are, and what’s ahead.

I’m biased, but I think progress has been good. We have 35 participants in the group, and we have had some expansive discussions to outline the scope and aims of the group, the detail of which we filled in with issues and use cases. We also had some illuminating discussions about how we conceptualize the domain we are addressing (see most of August in the mail list archive). Most importantly, I think that we have made good on the aim arising from our initial kick-off meeting to identify issues arising from use cases and fix them individually with discrete enhancements to schema.org. Here’s a list of the fixes we have suggested that have been accepted by schema.org, drawn from the schema.org release log:

Translating those back to our use cases / issues we can now:

Looking forward…

First I want to note that many of those contributions have been accepted into what schema.org calls its pending section, which it defines as “a staging area for work-in-progress terms which have yet to be accepted into the core vocabulary”. While there are caveats about terms in pending being subject to change and that they should be used with caution, their acceptance into the core of the schema.org vocabulary relies on them being shown to be useful. So we have a task remaining of promoting and highlighting the use of these terms and showing how they are used. Importantly, “use” here means not just publishing data, but the existence of services built on that data.

Looking at the remaining issues that we identified from our use cases and examples, it seems that we have come to the end of those that can be picked off individually and dealt with without consequences elsewhere. Several are issues of choice, along the lines of “there’s more than one way to do X, can we clarify which is best?” Best practice is difficult to define and identify, and there will be winners and losers whatever option is picked. The choice will depend on analysis of what existing practice currently is, as well as trade-offs such as simplicity versus expressiveness. Another example where existing practice is important comes with issues that will affect how Google services such as Job Search work. Specifically, Google recommends values for employmentType that don’t seem to match all requirements, and these values are just textual tokens whereas we might want to suggest the more flexible and powerful DefinedTerm. However, we don’t want to recommend practice that conflicts with getting job postings listed properly by Google. While some Google search products leverage schema.org terms, the requirements that they specify for value spaces like the different employmentTypes are not defined in schema.org; and while schema.org development is open, other channels are required to make suggestions that affect Google products. The final category of open issue that I see is where a new corner of our domain needs to be mapped, rather than just one or two new terms provided. This is the case for providing information about assessments, and for where we touch on providing information about the skills etc. that a person has.
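The employmentType trade-off mentioned above can be sketched as two alternative encodings of the same posting, written here as Python dictionaries that would serialize to JSON-LD. PART_TIME is, as far as I know, one of the tokens Google documents. The DefinedTerm version is only an illustration of what the group might suggest: the term, its code and the term set URL are invented, and nothing here implies that Google or schema.org currently accepts this form.

```python
import json

# Encoding 1: a plain textual token.
posting_with_token = {
    "@context": "https://schema.org",
    "@type": "JobPosting",
    "title": "Classroom Assistant",
    "employmentType": "PART_TIME",
}

# Encoding 2: a DefinedTerm pointing into a (hypothetical) published term set.
posting_with_defined_term = {
    "@context": "https://schema.org",
    "@type": "JobPosting",
    "title": "Classroom Assistant",
    "employmentType": {
        "@type": "DefinedTerm",
        "name": "Term-time only",
        "termCode": "TERM_TIME",
        "inDefinedTermSet": "https://example.org/employment-types",
    },
}


def employment_type_label(posting: dict) -> str:
    """Return a display label whichever of the two encodings the publisher used."""
    value = posting["employmentType"]
    if isinstance(value, dict):
        # Prefer the human-readable name, falling back to the code.
        return value.get("name") or value.get("termCode", "")
    return value


if __name__ == "__main__":
    for posting in (posting_with_token, posting_with_defined_term):
        print(employment_type_label(posting))
        print(json.dumps(posting, indent=2))
```

The second form can express employment types that fall outside a fixed token list, at the cost of every consumer having to handle both shapes.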

So, there is more work to be done. I think starting with some further work on examples and best practice is a good idea. This will involve looking at existing usage, and mapping relevant parts of schema.org to other specifications (that latter task is happening in other fora, so probably something to report on here rather than start as a separate task). As ever, more people in the group and engagement from key players are essential to success, so we should continue to try to grow the membership of the group.

Thank you all for your attention and contributions over the last year; I’m looking forward to more in the coming months.

Acknowledgement / disclosure

I (Phil Barker) remain grateful for the continued support of the US Chamber of Commerce Foundation, who fund my involvement in this group.

The post One year of Talent Marketplace Signaling appeared first on Sharing and learning.

On Talent Pipeline Management

from @ Sharing and learning

I’m prompted by a #femEdTech tweet to write about some of the work I’m involved in regarding linking education to employment:

This is going to be a tricky topic to write about: if I get it wrong one way or another I will either offend people with whom I enjoy working or seem to be giving the opposite message to the one I intend.

The work in question is on Talent Signalling for the Job Data Exchange, but what I have in mind in particular is some of the wider context for that work, which goes under the banner of Talent Pipeline Management. Now, there is a lot that I don’t like about the rhetoric and metaphors here. I won’t dwell on them; if you’re likely to get it you won’t need it explaining. Once I got past that, what impressed me was the idea brought in from supply chain management, explained to me by Bob Sheets, that if you want to go beyond a low-quality commodity-like approach (by analogy, cheap components sourced with price as the only criterion) you need to “go deep”. That is, you need to build a deep relationship to create understanding (it’s all social constructivism now) between all those involved in education, training and learning, those involved in recruitment, and those involved in strategic planning for the local economy.

The approach seems much deeper than I have seen in the UK, for example in industry liaison committees at Universities, because it involves getting all levels & contexts of education provider together to work with industry and business on things like curricula and training opportunities. This is described in detail through the TPM Academy. Again, anyone from an education background will flinch at the industry-focused utilitarian view of education shown in how it is presented, but the underlying idea seems valuable.

So my current thoughts and questions are: how does this look from the learner/worker/job seeker point of view? [Quick note to self: check on whether they are included in the conversations defining curricula.] I think that is key to keeping this work on the right side of education being just about satisfying the need for cheap labour. Secondary question: is my glibly stated opinion that this goes deeper than approaches I’ve seen in the UK just an admission of ignorance? [Answers in the comments please!]

Going forward, my work will continue to look at the data that can be communicated through things like job adverts, course and qualification descriptions, trying to build the underlying infrastructure that allows “faster clearer signals” and stronger linkages between employment and education / training to help build these deeper relationships. I’m also getting involved in how individuals’ achievements can be represented semantically, so that will bring in a whole raft of questions about who controls the creation and dissemination of this data.

The post On Talent Pipeline Management appeared first on Sharing and learning.

Talent marketplace signalling and schema.org JobPostings

from @ Sharing and learning

For some time now I have been involved in the Data Working Group of the Job Data Exchange (JDX) project. That project aims to help employers and technology partners better describe their job positions and hiring requirements in a machine-readable format. This will allow employers to send clearer signals to individuals, recruitment, educational and training organizations about the skills and qualifications that are in demand. The data model behind JDX, which has been developed largely by Stuart Sutton working with representatives from the HR Open Standards body, leverages schema.org terms where possible. Through the development of this data model, as well as from other input, we have many ideas for guidance on, and improvements to, the schema.org JobPosting schema. In order to advance those ideas through a broader community and feed them back to schema.org, we have now created the Talent Marketplace Signaling W3C Community Group.

In the long term I hope that the better expression of job requirements in the same framework as can be used to describe qualifications and educational courses will lead to better understanding and analysis of what is required and provided where, and to improvements in educational and occupational prospects for individuals.


About the Talent Marketplace Signaling Community Group

Currently, workforce signaling sits at the intersection of a number of existing schema.org types: Course, JobPosting, Occupation, Organization, Person and the proposed EducationalOccupationalCredential. The TalentSignal Community Group will focus initially on the JobPosting Schema and related types. I think the TalentSignal CG can help by:

  • providing guidance on how to use existing schema.org terms to describe JobPostings;
  • proposing refinements (e.g. improved definitions) to existing schema.org types serving the talent pipeline; and
  • suggesting new types and properties where improved signaling cannot otherwise be achieved.

I hope that the outcomes of this work will be discrete improvements to the JobPostings schema, e.g. small changes to definitions, changes to how things like competences are represented and linked to JobPostings, and guidance, probably on the schema.org wiki, about using the JobPosting schema to mark up job adverts. Of course, whatever the Community Group suggests, it’s up to the schema.org steering group to decide on whether they are adopted into schema.org, and then it’s up to the search engines and other data consumers as to whether they make any use of the mark up.

The thinking behind having a wider remit than the currently envisaged work is to avoid setting up a whole series of new groups every time we have a new idea [lesson learnt from moving from LRMI to Course description to educational and occupational credentials].

Call for participation

If you’ve read this far you must be somewhat interested in this area of work, so why not join the TMS Community Group to show your support for the JDX and, more broadly, for the need for improved workforce signaling in the talent marketplace? You can join via the pink/tan button on the Talent Signal CG web page. You will need to have a W3C account and to be signed in in order to join (see the top right of the page to sign in or join). The only restriction on joining is that you must give some assurances about the openness of the IPR of any contributions that you make. The outcomes of this work will feed into a specification that anyone can use, so there must be no hidden IPR restrictions in there.

The group is open to all stakeholders so please feel free to share this information with your colleagues and network.


I’m being paid by the US Chamber of Commerce Foundation to carry out this work. Thank you US CCF!

The post Talent marketplace signalling and schema.org JobPostings appeared first on Sharing and learning.