Tag Archives: PressBooks

Using the WordPress REST API to post a book from WikiSource to PressBooks with python

from @ Sharing and learning

I am using Pressbooks to build an online edition of Southey and Coleridge’s Omniana. I transcribed the text for Volume I on wikisource. This post is about how I got that text into pressbooks; copy and paste didn’t appeal, so I thought I would try using the WordPress REST API. You could probably write a PHP plugin that would do this, but I find python a bit easier for exploratory work, so I used that.

Getting the data from Wikisource is reasonably trivial. On Wikisource I have transcluded the page transcriptions into a single HTML file of the whole book. This file is relatively easy to parse into the individual articles for posting to Pressbooks, especially as I added <hr /> tags before each article (even the first) and added a stop marker at the end.
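
A minimal sketch of that splitting step, assuming the whole book is held in one string. The function name and the exact form of the stop marker (an HTML comment here) are illustrative assumptions, not what the transclusion actually contains.

```python
import re


def split_articles(html, stop_marker="<!-- stop -->"):
    """Split transcluded HTML into articles at each <hr /> tag.

    stop_marker is a hypothetical marker; use whatever was put in the page.
    """
    # Discard everything from the stop marker onwards, if it is present.
    body = html.split(stop_marker, 1)[0]
    # Split on <hr>, <hr/> or <hr />, ignoring case.
    chunks = re.split(r"<hr\s*/?>", body, flags=re.IGNORECASE)
    # Text before the first <hr /> is not an article, so skip it.
    return [chunk.strip() for chunk in chunks[1:] if chunk.strip()]


sample = (
    "<h1>Omniana</h1>"
    "<hr /><h2>1. First</h2><p>a</p>"
    "<hr /><h2>2. Second</h2><p>b</p>"
    "<!-- stop --><p>footer</p>"
)
articles = split_articles(sample)
print(len(articles))
```

Because there is an <hr /> before even the first article, everything ahead of the first split point can simply be dropped.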

In the longer term I want to start indexing the PressBook Omniana using wikidata for linked data. This will let me look at the semantic graph of what Southey and Coleridge were interested in.

First steps with the WordPress API

I’ve not used the WordPress API before, but it is well documented and there is a useful series of articles on envatoTuts+: Introducing the WP REST API.

Put /wp-json onto the end of a WordPress blog URL and you can see the routes and endpoints (e.g. this blog, my Pressbooks/Omniana). (I use the JSON viewer chrome plugin to make these easier to read.) I found wp-api-python very useful in helping make requests against these in python. It’s available via pip as wordpress-api, and I found it required the python libraries requests, beautifulsoup4, requests-oauthlib and six. It authenticates via OAuth, so on WordPress you need the WordPress REST API – Oauth1.0a plugin or similar; there’s more than you need to know about how OAuth works on envatoTuts+.
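
As a rough illustration of what the /wp-json index contains, the sketch below lists each route with its HTTP methods. It works on an already-decoded index; the sample dict is a cut-down, hand-written stand-in for the real JSON, and the function name is made up for this example.

```python
def list_routes(index):
    """Return sorted (route, methods) pairs from a decoded /wp-json index."""
    return sorted(
        (route, sorted(info.get("methods", [])))
        for route, info in index.get("routes", {}).items()
    )


# Hand-written sample in the shape of the index, not real server output.
sample_index = {
    "routes": {
        "/wp/v2/posts/(?P<id>[\\d]+)": {
            "namespace": "wp/v2",
            "methods": ["GET", "POST", "PUT", "PATCH", "DELETE"],
        },
        "/wp/v2/posts": {"namespace": "wp/v2", "methods": ["GET", "POST"]},
    }
}

for route, methods in list_routes(sample_index):
    print(route, methods)
```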

I installed the Oauth1.0a plugin for the network on WordPress multisite and Pressbooks test servers. Network activation seemed to generate errors on Pressbooks and plain multisite WordPress, so I activated it only for the individual blog/book. Then in the Users tab on the admin screen I was able to view and set up applications:

Add Application screen from the OAuth1.0a plugin

Filling out the details and clicking on save consumer gave me a client key and client secret.

Back in python I used these to poke around the various API endpoints of my test multisite installation of WordPress, e.g.

from wordpress import API
base_url = "http://wordpress.home.local/test"
api_path = "/wp-json/wp/v2/"
wpapi = API(
    url=base_url,
    consumer_key="thisismykey",
    consumer_secret="thisismysecret",
    api="wp-json",
    version="wp/v2",
    wp_user="phil",
    wp_pass="thisismypassword",
    oauth1a_3leg=True,
    creds_store="~/.wc-api-creds.json",
    callback="http://wordpress.home.local/test/api-test"
)
print("listing posts")
resource = "posts"
try:
    response = wpapi.get(base_url+api_path+resource)
    for post in response.json():
        print(post['id'], post['title'])
except Exception as e:
    print("couldn't get posts")
    print(e)

wpapi uses requests methods, so each call returns a requests response object (documented here). Useful properties and methods of a response r are:

  • r.ok: boolean, True if the HTTP status code is <400
  • r.content: response content in bytes
  • r.text: response content as text
  • r.headers: response headers
  • r.iter_lines(): content a line at a time
  • r.json(): response decoded from JSON
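
One way these might be combined is to check r.ok before decoding the body. The helper below is a sketch; FakeResponse is a hypothetical stand-in that only mimics the attribute names of a requests response so the example runs without a server.

```python
import json


def json_or_error(r):
    """Return the decoded JSON body of r, or raise if the request failed."""
    if not r.ok:
        # On failure, show the status and the start of the body text.
        raise RuntimeError(f"HTTP {r.status_code}: {r.text[:200]}")
    return r.json()


class FakeResponse:
    """Hypothetical stand-in using the same attribute names as requests."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self.ok = status_code < 400
        self.text = json.dumps(payload)
        self._payload = payload

    def json(self):
        return self._payload


posts = json_or_error(FakeResponse(200, [{"id": 1}]))
print(posts)
```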

Posting to WordPress

Following the envatoTuts+ Creating, Updating, and Deleting Data article and translating to python:

from wordpress import API
base_url = "http://wordpress.home.local/test"
api_path = "/wp-json/wp/v2/"
wpapi = API(
    url=base_url,
    consumer_key="thisismykey",
    consumer_secret="thisismysecret",
    api="wp-json",
    version="wp/v2",
    wp_user="phil",
    wp_pass="thisismypassword",
    oauth1a_3leg=True,
    creds_store="~/.wc-api-creds.json",
    callback="http://wordpress.home.local/test/api-test"
)

print("creating new post")
resource = "posts"
title = "86. Glover's Leonidas."
content = """Glover's Leonidas was unduly praised at its first appearance, and more unduly ...
..."""
excerpt = """Glover's Leonidas was unduly praised at its ..."""
data = {
    "content": content,
    "title": title,
    "excerpt": excerpt,
    "status": "draft",
    "categories": [190]
}
try:
    response = wpapi.post(base_url+api_path+resource, data)
    print(response.json())
except Exception as e:
    print("couldn't post")
    print(e)

The posts resource collection allows creation and retrieval  (POST and GET methods); a specific posts/(?P<id>[\d]+) resource allows update and delete (PUT, PATCH and DELETE methods).
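
The posts/(?P<id>[\d]+) pattern is an ordinary regular expression with a named group, so it can be tried out directly in python. This sketch only shows how an endpoint string relates to the pattern; the helper names are invented for illustration.

```python
import re

# The route pattern for a single post, as shown in the wp-json index.
ITEM_ROUTE = re.compile(r"posts/(?P<id>[\d]+)")


def item_endpoint(post_id):
    """Build the endpoint for one specific post."""
    return f"posts/{int(post_id)}"


def post_id_from_endpoint(endpoint):
    """Return the numeric id if the endpoint names one post, else None."""
    match = ITEM_ROUTE.fullmatch(endpoint)
    return int(match.group("id")) if match else None


print(item_endpoint(86))
print(post_id_from_endpoint("posts/86"))
print(post_id_from_endpoint("posts"))
```

An update or delete would then go to the endpoint built by item_endpoint rather than to the posts collection.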

The keys for the data dict are the same as the schema for the WordPress API method, which are also shown in the arguments listed in the JSON returned by wp-json for each endpoint under each route.
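
Since the index lists the accepted arguments for each endpoint, a data dict can be checked against them before posting. The sketch below assumes the route entry has been pulled out of the decoded index; the sample entry is abbreviated and hand-written, and the function name is hypothetical.

```python
def unknown_keys(data, route_info, method="POST"):
    """Return keys in data that the route does not list as args for method."""
    allowed = set()
    for endpoint in route_info.get("endpoints", []):
        if method in endpoint.get("methods", []):
            allowed.update(endpoint.get("args", {}))
    return sorted(set(data) - allowed)


# Abbreviated, hand-written stand-in for the /wp/v2/posts route entry.
sample_route = {
    "endpoints": [
        {"methods": ["GET"], "args": {"page": {}, "search": {}}},
        {
            "methods": ["POST"],
            "args": {"title": {}, "content": {}, "excerpt": {},
                     "status": {}, "categories": {}},
        },
    ]
}

data = {"title": "test", "content": "test", "exerpt": "typo in key"}
print(unknown_keys(data, sample_route))
```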

Posting to Pressbooks

Pressbooks has a whole extended set of API routes and endpoints: there are no ‘posts’ resources, but there are front-matter, back-matter, parts and chapters, all under the /pressbooks/v2/ path.

There is some documentation on the Pressbooks site. I’m posting articles as chapters into a Pressbooks site that already has some organised content, so I don’t have to worry about setting them up. Adapting from the above, changing the URL and credentials to those for my local test instance of Pressbooks, and changing the api-path, version, and resource name, this posts a test chapter to the content part of my book, as a “numberless” chapter-type:

from pprint import pprint

from wordpress import API

base_url = "http://books.home.local/omniana"
api_path = "/wp-json/pressbooks/v2/"
wpapi = API(
    url=base_url,
    consumer_key="thisismykey",
    consumer_secret="thisismysecret",
    api="wp-json",
    version="pressbooks/v2",
    wp_user="phil",
    wp_pass="thisismypassword",
    oauth1a_3leg=True,
    creds_store="~/.wc-api-creds3.json",
    callback="http://books.home.local/omniana/api-test"
)
print("creating new chapter")
resource = "chapters"
# chapter-type and part are ids from my test book; yours will differ
data = {
    "content": "test",
    "title": "test",
    "status": "publish",
    "chapter-type": 48,
    "part": 27
}
try:
    response = wpapi.post(base_url+api_path+resource, data)
    pprint(response.json())
except Exception as e:
    print("couldn't post")
    print(e)

Finding the ids for chapter-type and part needs a little detective work. You can, of course, use an API call to GET the parts and list their names and ids, in a similar way to listing the posts in the first example above; or you can just edit the part or chapter-type in the Pressbooks admin interface and inspect the URL. It’s also worth noting that you need a different creds_store for each OAuth provider you connect to.
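
For the API route, something like the sketch below could turn the decoded list of parts into a title-to-id lookup. It assumes each part carries an id and a title with a rendered value, as WordPress posts do; I have not confirmed that shape for Pressbooks, and the sample titles are invented.

```python
def ids_by_title(items):
    """Map each item's rendered title to its id.

    Assumes WordPress-style items: {"id": ..., "title": {"rendered": ...}}.
    """
    return {item["title"]["rendered"]: item["id"] for item in items}


# Invented sample standing in for the decoded JSON from a GET on parts.
sample_parts = [
    {"id": 27, "title": {"rendered": "Main Body"}},
    {"id": 31, "title": {"rendered": "Appendices"}},
]

lookup = ids_by_title(sample_parts)
print(lookup["Main Body"])
```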

Next Steps

As I said, reading through and parsing the transcluded page transcriptions wasn’t too hard (I put some markers in the transclusion to help). I made some changes to the content before posting it: perhaps the most interesting issue was changing the wiki style footnotes to Pressbooks style.
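
To give a flavour of the footnote change, here is a simplified sketch rather than the code actually used. It assumes the wiki HTML marks references with a sup element linking to a cite_note list item, and rewrites them as the Pressbooks [footnote] shortcode; real wiki output may differ in detail, and regular expressions are only adequate for markup this regular.

```python
import re

# Assumed shape of a footnote body in the wiki's reference list.
NOTE = re.compile(
    r'<li id="cite_note-(?P<n>\d+)">.*?'
    r'<span class="reference-text">(?P<text>.*?)</span>\s*</li>',
    re.DOTALL,
)
# Assumed shape of the in-text reference marker.
REF = re.compile(
    r'<sup[^>]*class="reference"[^>]*>\s*'
    r'<a href="#cite_note-(?P<n>\d+)">.*?</a>\s*</sup>',
    re.DOTALL,
)
REF_LIST = re.compile(r'<ol class="references">.*?</ol>', re.DOTALL)


def inline_footnotes(html):
    """Replace wiki reference markers with [footnote]...[/footnote]."""
    notes = {m.group("n"): m.group("text").strip() for m in NOTE.finditer(html)}

    def replace(match):
        text = notes.get(match.group("n"))
        # Leave the marker untouched if its note text was not found.
        return f"[footnote]{text}[/footnote]" if text is not None else match.group(0)

    # Inline the notes first, then drop the now-redundant reference list.
    return REF_LIST.sub("", REF.sub(replace, html))


sample = (
    '<p>Leonidas<sup id="cite_ref-1" class="reference">'
    '<a href="#cite_note-1">[1]</a></sup> was praised.</p>'
    '<ol class="references"><li id="cite_note-1">'
    '<span class="mw-cite-backlink"><a href="#cite_ref-1">^</a></span> '
    '<span class="reference-text">A poem by Glover.</span></li></ol>'
)
converted = inline_footnotes(sample)
print(converted)
```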

At the time of writing, I have started posting to the live/public instance of Omniana on Pressbooks but still have to sort some formatting issues: removing line breaks, making sure that the CSS selectors are appropriate for WordPress; that shouldn’t take long to fix.

Then I want to start indexing the articles using wikidata for linked data.

The post Using the WordPress REST API to post a book from WikiSource to PressBooks with python appeared first on Sharing and learning.

PressBooks and ePub as an OER format.

from @ Sharing and learning

PressBooks does a reasonable job of importing ePub, so that ePub can be used as a portable format for open text books. But, of course, there are limits.

I have been really impressed with PressBooks, the extension to WordPress for authoring eBooks. Like WordPress it is available as a hosted service from PressBooks.com and to host yourself from PressBooks.org. I have been using the latter for a few months. It looks like a great way of authoring, hosting, using, and distributing open books. Reports like this from Steel Wagstaff about Publishing Open Textbooks at UW-Madison really show the possibilities for education that open up if you do that. There you can read what work Steel and others have been doing around PressBooks for authoring open textbooks, with interaction (using hypothes.is and H5P), connections to their VLE (LTI), and responsible learning analytics (xAPI).

PressBooks also supports replication of content from one PressBook install to another, which is great, but what is even greater is support of import from other content creation systems. We’re not wanting monoculture here.

Open text books are, of course, a type of Open Educational Resource, and so when thinking about PressBooks as a platform for open text books you’re also thinking about PressBooks and OER. So what aspects of text-books-as-OER does PressBooks support? What aspects should it support?

OER: DERPable, 5Rs & ALMS

Frameworks for thinking about requirements for openness in educational resources go back to the very start of the OER movement. Back in the early 2000s, when JISC was thinking about repositories and Learning Objects as ways of sharing educational resources, Charles Duncan used to talk about the need for resources to be DERPable: Discoverable, Editable, Repurposable and Portable. At about the same time in the US, David Wiley was defining Open Content in terms of four, later five Rs and ALMS. The five Rs are well known: the permissions to Retain, Reuse, Revise, Remix and Redistribute. ALMS is a less memorable, more tortured acronym, relating to technical choices that affect openness in practice. The choices relate to: Access to editing tools, the Level of expertise required to use these tools, the content being Meaningfully editable, and being Self-sourced (i.e. there not being separate source and distribution files).

Portability of ePub and editing in PressBooks

I tend to approach these terms back to front: I am interested in portable formats for disseminating resources, and systems that allow these to be edited. For eBooks / open textbooks my format of choice for portability is currently ePub, which is essentially HTML and other assets (images, stylesheets, etc.) with metadata, in a zip archive. Being HTML-based, ePub is largely self-sourced, and can be edited with suitable tools (though there may be caveats around some of the other assets such as images and diagrams). Furthermore, WordPress in general and PressBooks specifically make editing, repurposing and distributing easy without requiring knowledge of HTML. It’s a good platform for remixing, revising, reusing, retaining content. And the key to this whole ramble of a blog post is the ‘import from ePub’ feature.
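
Because an ePub is a zip archive, python’s standard zipfile module is enough to peek inside one. The sketch below lists the HTML documents in an archive; to stay self-contained it builds a toy archive in memory, which is not a complete, valid ePub, and the file names in it are invented.

```python
import io
import zipfile


def html_documents(epub_file):
    """List the (X)HTML content documents inside an ePub archive."""
    with zipfile.ZipFile(epub_file) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        ]


def build_sample_epub():
    """Build a toy ePub-like zip in memory (not a valid ePub)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body><p>One</p></body></html>")
        archive.writestr("OEBPS/style.css", "p { margin: 0; }")
    buffer.seek(0)
    return buffer


print(html_documents(build_sample_epub()))
```

With a real book, html_documents can be given the path to the downloaded .epub file instead of the in-memory buffer.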

So how does the combination of ePub and PressBooks work in practice? My first thought was to go to OpenStax and download one of their text books as ePub, but as far as I can see the best-known open textbook project doesn’t seem to make ePub available (Apple’s iPub is similar, but I don’t do iBooks so couldn’t download one). So I went to Siyavula and downloaded one of their CC:BY textbooks as an ePub. I chose that download for import into PressBooks and got a screen that lets me choose which parts of the ePub to import and what type of content to import it as.

List of sections of the ePub with tick box for whether to import in PressBooks, and radio button options for what type of book part to import as

After choosing which parts to import and hitting the import button at the bottom of the page, the content is there to edit and republish in PressBooks.

From here you can edit or add content (including by import from other sources), rearrange the content, and set options for publishing it. There is other work to be done. You will need to choose a decent theme to display your book with style. You will also need to make sure internal links work as your PressBooks permalink URL scheme might not match the URLs embedded in the content. How easy this is will vary depending on choices made when the book was created and your own knowledge of some of the WordPress tools that can be used to make bulk edits.

I am not really interested in distributing maths text books, so I won’t link to the end result of this specific example. I did once write a book in a book sprint with some colleagues, and that was published as an ePub. So here is an imported & republished version of Into The Wild (PressBook edition). I didn’t do much polishing of this: it uses a stock theme, and I haven’t fixed internal links, e.g. footnotes.

Limitations

Of course there are limits to this approach. I do not expect that much (if any) of the really interesting interactive content would survive a trip through ePub. Also much of Steel’s work that I described up at the top is PressBook platform specific. So that’s where cloning from PressBooks to PressBooks becomes useful. But ePub remains a viable way of getting textbook content into the PressBooks platform.

Also, while WordPress in general, and hence PressBooks, is a great way of distributing content, I haven’t looked much at whether metadata from the ePub is imported. On first sight none of it is, so there is work to do here in order to make the imported books discoverable. That applies to the package level metadata in ePubs, which is a separate file from the content. However, what also really interests me is the possibility of embedding education-specific schema.org metadata into the HTML content in such a way that it becomes transportable (easy, I think) and editable on import (harder).
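
For the package level metadata, the separate file is the ePub’s OPF package document, which holds Dublin Core elements such as title and language. A rough sketch of reading them with the standard library follows; the sample OPF is hand-written with invented values, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, as used for metadata in OPF package documents.
DC = "{http://purl.org/dc/elements/1.1/}"


def package_metadata(opf_xml):
    """Collect Dublin Core elements from an OPF document into a dict of lists."""
    root = ET.fromstring(opf_xml)
    metadata = {}
    for element in root.iter():
        if element.tag.startswith(DC) and element.text:
            name = element.tag[len(DC):]
            metadata.setdefault(name, []).append(element.text.strip())
    return metadata


# Hand-written sample with invented values, not taken from a real book.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:creator>First Author</dc:creator>
    <dc:creator>Second Author</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
</package>"""

print(package_metadata(sample_opf))
```

Values are kept as lists because elements such as creator can legitimately appear more than once.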

The post PressBooks and ePub as an OER format. appeared first on Sharing and learning.