Using HTML5 and JavaScript to Deliver Text-Based Audio Descriptions

IBM Research-Tokyo recently partnered with NCAM to research ways to deliver online audio descriptions via text-to-speech (TTS) methods, rather than using human recordings. IBM and NCAM explored two approaches that use the new HTML5 media elements (<video>, <audio> and <track>) as well as JavaScript and TTML (Timed Text Markup Language):

  1. Writing and time-stamping a description script, then delivering the descriptions as hidden text in real time in such a way that a user’s screen reader will read them aloud. The descriptions remain otherwise invisible and inaudible to non-screen-reader users.
  2. Writing and time-stamping descriptions, then recording them using TTS technology. At the time of playback, each description is individually retrieved and played aloud at intervals corresponding to the time-stamped script.
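
To make the two approaches more concrete, here is a minimal sketch of how time-stamped descriptions might be dispatched during playback. It is not the project's actual code. The cue data, the clip file names, and the function names (dueCues, announce, playClip, attachDescriptions) are all hypothetical, and the use of an ARIA live region for the hidden text in approach 1 is an assumption about one common way to get a screen reader to speak dynamically inserted text.

```javascript
// Hypothetical time-stamped script: start times (seconds), text, and clip
// file names are invented for illustration only.
const cues = [
  { start: 2.0, text: "A woman enters the lab.", clip: "desc-001.mp3" },
  { start: 7.5, text: "She opens a laptop.", clip: "desc-002.mp3" },
];

// Return the cues whose start time falls in (previousTime, currentTime].
function dueCues(cueList, previousTime, currentTime) {
  return cueList.filter(
    (cue) => cue.start > previousTime && cue.start <= currentTime
  );
}

// Approach 1 (assumed technique): put the text into a visually hidden
// element marked aria-live="polite" so a screen reader announces it.
function announce(region, cue) {
  region.textContent = cue.text;
}

// Approach 2: play a TTS clip that was recorded ahead of time.
// Browser only; play() may be rejected by autoplay policies.
function playClip(cue) {
  new Audio(cue.clip).play().catch(() => {});
}

// Call deliver(cue) for each cue as the video's playhead passes it.
function attachDescriptions(video, cueList, deliver) {
  let last = -1; // so that a cue starting at 0 still fires
  video.addEventListener("timeupdate", () => {
    const now = video.currentTime;
    if (now < last) {
      // The user seeked backwards: restart tracking from the new position.
      last = now;
      return;
    }
    for (const cue of dueCues(cueList, last, now)) {
      deliver(cue);
    }
    last = now;
  });
}

// Browser usage (element IDs are hypothetical):
//   const video = document.getElementById("movie");
//   const region = document.getElementById("description-region");
//   attachDescriptions(video, cues, (cue) => announce(region, cue)); // approach 1
//   attachDescriptions(video, cues, playClip);                       // approach 2
```

Both approaches share the same timing logic and differ only in the deliver step, which is why the sketch passes it in as a callback. One known limitation of this simple version is that seeking forward past several cues delivers all of them at once.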

Read a full description of the project, see the demonstration files and, if you want, download the code to see how it all works; then come back and let us know what you think.