
Video including closed captions, method 2

from @cullaloe | Tech, tales and imagery

Further to this recent post on capturing, processing and captioning video, my colleague, Dr. Audrey Cameron, advised me to try YouTube for capturing the .srt captions file more quickly. I am thankful to her for this, because although I was aware of YouTube’s rapidly improving automatic captioning (I use it myself when I watch with sound off, for example), I didn’t know that the .srt file can be downloaded. Here’s a revised approach I am using today.

Capturing the video and audio

For capturing an old-fashioned lecture-style talk using KeyNote, I use the facility to record a presentation (from the Play menu). The presentation can be exported as a .m4v video file with sound. At the same time as recording the presentation in KeyNote, I also record myself on my Fuji X-T2 camera and capture a high quality audio track separately on a Zoom H-1 recorder. A “pro” tip is to clap just before you start presenting – it leaves a nice spike in the audio waveforms, making it easy to line up the separate tracks. You can also pretend to be Steven Spielberg or Alfred Hitchcock, according to your genre, by saying, “Action!” at the same time.

Processing

Import the video and audio tracks into iMovie, align them using the spike from your clap, and check that the audio is the same length as the video track of the speaker. I found that for longer videos, over 15 minutes, they can be different. This difference produces an echo effect, eventually separating the video from the soundtrack, like a Swedish movie. Adjusting the audio to align properly can easily be done in iMovie using the speed adjustment. Once the clips are aligned, you can turn the audio level of the video clips down to zero, so only the high quality track remains.

The next thing I do is to change the video track of the presenter to “Picture in picture”, so viewers can see me presenting within the slides: I think this is a bit of a substitute for one of the features I miss from live presenting, which is managing the attention of the viewer. I normally do this by blanking the display, which has the effect of moving the eyes in the room from the screen to my face – a powerful way to add contrast to your talk. This “mini me” within the slides can be faded in or out, according to what you want the students to focus on at any point. Other effects are possible, like switching to embedding the presentation within the presenter video.

The finished project can be exported via the Share menu to a .mp4 file.

Getting the transcript

The video can be uploaded to YouTube now: you’ll need a verified account to upload clips longer than 15 minutes, which means giving Google your phone number. I baulked at this at first, but expediency is the slayer of principle, and in this case, privacy. Make sure you click “Private” when saving the clip. These lectures are not for public consumption.

After a while – maybe 30 minutes, depending on the length of your video – the automatically-generated captions file can be edited using a really nice editing interface in YouTube Studio designed for the purpose. You will need to add punctuation and, if you wish, make corrections to your own commentary. Once that has been done, the .srt file is ready to download.

Media Hopper Create

Your video can now be used within your local VLE, in my case, Blackboard Learn, by uploading via the media manager, Media Hopper. Once uploaded, you can then add closed captions by uploading the .srt file alongside it. Students then have a choice whether to access captions within the video sequences or not.

Another Hitch

I was about to post this, all smug, like, as I uploaded the latest video made with this method, when I hit a “file too large” error when uploading to Media Hopper. The video I had made was just short of 18 minutes and had a file size of 1.2GB. Now, mp4 is an efficient container format so I maybe made too many “best quality” choices in making the video: high definition 1080p for the presenter, same for the KeyNote. Rather than go back and do it all again, I resorted to ffmpeg to make me a reduced bitrate version. I thought halving the bitrate might produce a file half the size.

$ ffmpeg -i mybigvideo.mp4 	# find out what the current bitrate is..
...
  Duration: 00:17:28.55, start: 0.000000, bitrate: 9978 kb/s
...
$ ffmpeg -i mybigvideo.mp4 -b:v 4489k mysmallervideo.mp4 	# -b:v sets the target video bitrate

This made (after thrashing my 5-year old MacBook Air for about 25 minutes) a file – as hoped for – half the size (673 MB).
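
As a rough sanity check before committing to a long re-encode, the expected video size is roughly bitrate times duration: at 4489 kb/s over the 1048 seconds or so of this clip, that is in the region of 570 MB of video, with the audio track and container overhead on top. A quick back-of-envelope calculation using the numbers above (bc does integer arithmetic, so this is approximate, not a prediction):

$ echo "4489 * 1048 / 8 / 1024" | bc 	# kbit/s x seconds / 8 / 1024 gives approximate MB
574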

Deployment to a website

To use the video and captions file together within a webpage is straightforward, except that the captions need to be in a different format. This format is Web Video Text Tracks (VTT), and is easily obtained using ffmpeg:

$ ffmpeg -i srtfile.srt subtitlefile.vtt

The web page needs the following code (adapted to your own file paths, obviously):

<p>  
<video width="640" height="360" controls="controls">
	<source src="https://www.learn.ed.ac.uk/path-to-video.mp4" type="video/mp4">  
	<track src="https://www.learn.ed.ac.uk/path-to-vtt-file.vtt" kind="captions" srclang="en" label="English" default>
	Your browser does not support the video tag.
</video>
</p>

Conclusion

Video production for the ‘digital first’ teaching strategies needed in response to COVID-19 measures or similar is a non-trivial task. Including closed captions is an additional time multiplier. Personally, I don’t like asynchronous teaching at all: it misses so many important aspects of good pedagogy, aspects which are easily ignored by administrators of education in the pursuit of apparent economies or easy fixes. I am at ease, however, with an established workflow.

I am thankful to my colleague, Audrey, for her patience and support in helping me get to this solution. We both have a lot of videos to make, and now it will not take me as long as it might have done.

Video capture and production including closed captions

from @cullaloe | Tech, tales and imagery

Introduction

I am required to produce video resources for my students who are coming to the University very soon, either in person or digitally: our teaching under the COVID adjustments is “digital first”. We are also particularly keen to support those students who might require captions on their videos. This isn’t just for those who might be hearing or visually impaired; it’s for all students who might sometimes like the extra clarity provided by words on the screen that reflect the words spoken by the presenter.

Here’s my take on a workflow model to make this work well. There are existing facilities to do this semi-automatically – uploaded videos can have an automated transcription generated but this takes a lot of time, and requires the creator to go back on subsequent days to hand-edit any errors or make any other adjustments.

I like to do a task and complete it: I like to get it right and take my time over that. Once it’s done, I like to move on to the next set of tasks. To that end, here is a way I have found of creating video, with quality captioning for those who need it, and the ability to switch it off for those who find it a distraction.


Subtitles vs. captions

Subtitles are embedded in DVD movies and the like for languages other than that on the audio. Multiple subtitle tracks can be embedded within a title, or switched off if the audio track is in the native language of the speaker. Captions, on the other hand, are a transcription of dialogue in the video. Closed captions are distinguished from open captions by being able to be switched off if required. Open captions are often embedded in the video and cannot be turned off. My intention in this workflow is to provide closed captions.

Capturing the video and audio

I captured my video for the proof-of-concept using a Fuji X-T2 APS-C digital mirrorless camera which takes 1080p (1920 x 1080) video. Although the camera records stereo audio, I prefer to capture audio separately using a Zoom H-1 recorder. The quality is much better, not least because the mic is with the speaker, not the camera.

Processing

I import the video and audio tracks into iMovie, trim out the top and tail and make other edits, and remove the embedded audio track captured by the camera. This is replaced with the Zoom audio, which can be a bit of a fiddle to align well with the video. The audio waveform can help here but that depends on the recording environment. Depending on the exposure settings on the camera, you might want to “enhance” the video in iMovie for a contrastier image. Once you have added any title sequences (or credits) and transitions, export the finished video (via the “share” menu) as an mp4 file.

Getting the transcript

Open the mp4 file in QuickTime and export it as an audio file; the default m4a format is OK. This file can be uploaded to a blank Word document in Office 365 via the Dictate drop-down menu (select Transcribe). This will do a pretty decent job of turning your dialogue into text.

Save the .docx file and convert it to a plain text format - I prefer markdown, although this isn’t necessary provided you can finish up with a plain text file. I do the conversion using pandoc:

$ pandoc transcript.docx -o transcript.md
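
If you would rather skip markdown, pandoc can write plain text directly; the filename here is just for illustration:

$ pandoc transcript.docx -t plain -o transcript.txt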

This file can now be edited, correcting any errors in the transcription and chunking the dialogue to make it show at the right time during the video. This is easily done with the video window open on the desktop next to your editor. You should add section markers and timestamps for subtitles as you go through the video. A minimal example:

1
00:00:01,000 --> 00:00:04,000
Welcome to my interesting video.
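
Subsequent chunks follow the same pattern: a sequence number, a timing line, the caption text, and a blank line between cues. The timings below are placeholders:

2
00:00:04,000 --> 00:00:08,500
Each chunk stays on screen between its start and end times.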

Finally, you will need this transcript in the correct format for embedding into the video file. This is a simple matter of changing the filetype to .srt (which is a “SubRip Subtitle” file).
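
On a Mac this is just a rename, either in the Finder or in the terminal (the filenames here are illustrative):

$ mv transcript.md transcript.srt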

Add Closed Captions to the video file

The ffmpeg tool is best for this. This tool (and others you might need) can be installed simply using brew if it isn’t already on your system. I don’t propose to detail how to do this, but in essence, it’s a matter of typing $ brew install ffmpeg in a terminal window. Once you have it, add the captions:

$ ffmpeg -i videofile.mp4 -i transcriptfile.srt -c:v copy -c:a copy  -c:s mov_text -metadata:s:s:0 language=eng output.mp4

The program takes two input files, your video and the srt file containing the captions and timings. The video and audio streams are merely copied to the output file; the captions are added as a text subtitle track marked as English.
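
If you want to confirm the captions made it in, asking ffmpeg to describe the output file should list a subtitle stream alongside the video and audio (the exact stream numbering will vary):

$ ffmpeg -i output.mp4 	# look for a "Subtitle: mov_text" stream in the listing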

Finished, part one

The video file you have just created is now shareable: users can play it on their machines and opt to switch captioning on or off if they wish. Their computer may choose to control this behaviour automatically if local settings allow it.

I need to distribute this video using resources within the VLE (virtual learning environment), in my case Blackboard Learn and Media Hopper. This is where it gets sticky.

Media Hopper Create

It’s easy to upload a video to the Blackboard VLE by clicking on Media Hopper Create in the Tools menu. This is very nice, but this process strips out the captions track. Embedding the uploaded video offers no CC option to viewers and no captions are visible. This is clearly a fault in the Media Hopper Create system.

You can ask for subtitles to be created for the uploaded video but this is an automated and low-quality service that isn’t really any good. It creates, ironically, a CC track that is inferior to the one included in the uploaded file.

A workaround

I have found a way of getting around this difficulty: within Learn, add a new item. This is effectively a webpage, and using the editing tools, you can upload two files. The first is the mp4 video file (it is not necessary for this file to have the embedded captions track).

The second file contains the captions and timing information in our srt file, but needs to be in a different format. This format is Web Video Text Tracks (VTT), and is easily obtained using ffmpeg:

$ ffmpeg -i srtfile.srt subtitlefile.vtt
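
For reference, the result is still a plain text file: a VTT file starts with a WEBVTT header and uses a full stop rather than a comma before the milliseconds, so the cue from the earlier example comes out something like this:

WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to my interesting video.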

Having uploaded these two files to the Learn Item, it is necessary to edit the HTML using the built-in HTML editor (click the double chevron at the right end of the tool bar to reveal it). The item source should be edited to contain a <video> tag:

<p>  
<video width="640" height="360" controls="controls">
	<source src="https://www.learn.ed.ac.uk/path-to-video" type="video/mp4">  
	<track src="https://www.learn.ed.ac.uk/path-to-vtt-file" kind="captions" srclang="en" label="English" default>
	Your browser does not support the video tag.
</video>
</p>

The paths to the video and the vtt files are available in the links put there by Learn when you uploaded the files. It is not necessary to keep these links.

Finished, part two

This is a workaround but we now have the facility to make and upload good quality video with closed captioning that can be viewed by students within their Learn course.