Helloooo, can anybody hear me?!
Can PDF users actually hear me? Not likely, but if you are using Acrobat 4.05 and Windows, you can definitely hear your PDF files! That’s right, Text-to-Speech technology (TTS) has finally arrived.
PDF TTS is one area that Adobe seems to have nailed down tight and solid the very first time. After running extensive tests with it, I was left very impressed. For someone with no prior knowledge or experience with TTS, the whole installation and configuration experience was almost effortless.
Where do I start?
You start your venture into this new PDF technology by downloading a 10-page Adobe White Paper titled ‘Optimizing Adobe PDF Files for Accessibility’ [PDF: 86 kb].
Go through the well-laid-out web pages to find out exactly what you will need for your system and where to get it. Adobe has done a fine job of documenting everything it is responsible for, and has provided all the required links for any extras you may need.
What’s it all about?
PDF TTS allows Acrobat to read your PDF files out loud through your computer’s sound system. It does so with human-like voices (courtesy of Microsoft’s TTS engines, thanks Bill). PDF TTS is designed mainly to meet the accessibility needs of visually challenged PDF users. However, as Adobe points out, and as you will soon discover, being able to hear your PDF document appeals to more than just visually challenged users. This technology has many other potential uses as well.
How does it all work?
TTS Properties include:
TTS Methods include:
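To give a feel for how a form script might drive speech, here is a minimal sketch. It assumes the TTS object exposes an `available` property plus `qText` and `talk` methods, as described in Adobe’s Acrobat JavaScript documentation; check the reference for your Acrobat version before relying on them. The `speakText` helper and the `createFakeTts` stand-in are hypothetical names, introduced only so the sketch can run outside Acrobat.

```javascript
// Hypothetical helper: speak `text` through a TTS-like object, if one is usable.
// Assumes the object has `available`, `qText(text)` and `talk()`.
function speakText(tts, text) {
  if (!tts || !tts.available) {
    return false; // no speech engine installed, so stay silent
  }
  tts.qText(text); // queue the text to be spoken
  tts.talk();      // start speaking whatever is queued
  return true;
}

// Stand-in object that records calls; only for trying the helper outside Acrobat.
function createFakeTts(available) {
  const calls = [];
  return {
    available: available,
    calls: calls,
    qText: function (text) { calls.push("qText:" + text); },
    talk: function () { calls.push("talk"); }
  };
}

const fake = createFakeTts(true);
speakText(fake, "Page 2 of 10");
console.log(fake.calls.join(", ")); // qText:Page 2 of 10, talk
```

Inside Acrobat you would presumably pass the application’s own TTS object instead of the stand-in. Checking `available` first means the form stays silent, rather than failing, on a machine with no speech engine installed.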
According to Adobe’s Components for Enabling Accessible PDF Forms Web page, the following actions have sound and speech cues attached to them:
Sound cues are user-configurable items that play a .wav file to give the user an indication of what is happening inside Acrobat. Sound cues are stored in the Acrobat/SoundCues folder. The following actions have sound cues attached to them:
- Document open, close, activate, save, print (DocOpen.wav, DocClose.wav, DocSave.wav, DocPrint.wav)
- Page turn (PageTurn.wav)
- Keystroke handling when in a text edit:
  - If a backspace or delete is pressed, speak the letter and a sound cue for deletion (ActionDelete.wav)
  - Cut/copy/paste, play a sound cue (ActionCut.wav, ActionCopy.wav, ActionPaste.wav)
  - Home/end, play a sound cue and/or speak the letter (KeyHome.wav, KeyEnd.wav)
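The list above boils down to a lookup from action to file. Here is a small sketch of that table; the action labels (`doc-open`, `page-turn` and so on) and the `soundCuePath` function are hypothetical names of my own, and only the file names and folder come from Adobe’s list.

```javascript
// Sound cue files named in the list above, keyed by a hypothetical action label.
// "activate" is omitted because the list gives no file name for it.
const SOUND_CUE_FILES = {
  "doc-open": "DocOpen.wav",
  "doc-close": "DocClose.wav",
  "doc-save": "DocSave.wav",
  "doc-print": "DocPrint.wav",
  "page-turn": "PageTurn.wav",
  "delete": "ActionDelete.wav",
  "cut": "ActionCut.wav",
  "copy": "ActionCopy.wav",
  "paste": "ActionPaste.wav",
  "home": "KeyHome.wav",
  "end": "KeyEnd.wav"
};

// Return the cue's path inside the SoundCues folder, or null if no cue is listed.
function soundCuePath(action) {
  if (!Object.prototype.hasOwnProperty.call(SOUND_CUE_FILES, action)) {
    return null;
  }
  return "Acrobat/SoundCues/" + SOUND_CUE_FILES[action];
}

console.log(soundCuePath("page-turn")); // Acrobat/SoundCues/PageTurn.wav
```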
Speech cues are user-configurable items that invoke the text-to-speech engine to give the user context. The following actions will have speech cues attached to them (actual text may vary given usability testing; see AFSpeech.js for final strings):
- Application initialize (string not customizable).
- Document open, close, activate, save, print: speaks the document title and action type. E.g. Document 1040EZ printed.
- Page turn: speaks the page number. E.g. Page 2 of 10.
- Field focus and blur: speaks the value of the field. E.g. Text field ‘Last Name’, value is Smith.
- Alert dialogs. E.g. Alert dialog error: you have not filled in the income field. Press okay to continue.
- Keystroke handling when in a text edit:
  - As individual keys are pressed, the letters for those keys should be spoken.
  - After a space is pressed, the entire word should be spoken.
  - If a backspace or delete is pressed, speak the letter and a sound cue for deletion.
  - If a selection is made, speak the selection in a high-pitched voice.
  - Home/end, play a sound cue and/or speak the letter.
  - Cursor movement, speak the letter.
  - Control-left-arrow and control-right-arrow should move by words (and speak the word moved to).
For example, given the text: ‘This is a test’. Pressing HOME speaks ‘T’. Pressing RIGHT ARROW speaks ‘h’. Pressing LEFT ARROW would speak ‘T’. And pressing RIGHT ARROW again would speak ‘h’.
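The first three key presses of that example can be modelled in a few lines. This is only a sketch of the described behaviour, not Adobe’s AFSpeech.js; `createSpokenCursor` is a hypothetical name, and I am assuming that the letter spoken is simply the character at the cursor’s new position.

```javascript
// Hypothetical model of the cursor cues: each move returns the letter to speak.
function createSpokenCursor(text) {
  let position = 0; // index of the character the cursor sits on
  const last = Math.max(text.length - 1, 0); // never move past the final character
  return {
    home: function () {
      position = 0;
      return text.charAt(position);
    },
    right: function () {
      position = Math.min(position + 1, last);
      return text.charAt(position);
    },
    left: function () {
      position = Math.max(position - 1, 0);
      return text.charAt(position);
    }
  };
}

const cursor = createSpokenCursor("This is a test");
console.log(cursor.home(), cursor.right(), cursor.left()); // T h T
```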
Speech cues are interruptible. That is, if two actions are performed (both of which require a speech cue) then the first speech cue may be interrupted by the second speech cue. This is to allow for quick traversal over the document without buffering up an undue amount of speech. An example of this behavior can be demonstrated by quickly tabbing through the fields on a page.
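Here is one way to picture that interruption, as a rough sketch. The `createSpeechChannel` function and its log format are hypothetical; the only idea taken from Adobe’s description is that a new cue cuts off the one still being spoken instead of queuing behind it.

```javascript
// Hypothetical speech-cue channel: a new cue replaces whatever is still being spoken.
function createSpeechChannel() {
  let current = null; // cue being spoken right now, if any
  const log = [];     // record of what happened, for inspection
  return {
    log: log,
    // Start a new cue, cutting off the current one instead of buffering behind it.
    speak: function (text) {
      if (current !== null) {
        log.push("interrupted: " + current);
      }
      current = text;
      log.push("started: " + text);
    },
    // Called when the engine reaches the end of the current cue.
    finish: function () {
      if (current !== null) {
        log.push("finished: " + current);
        current = null;
      }
    }
  };
}

// Tabbing quickly from one field to the next: the second cue arrives early.
const channel = createSpeechChannel();
channel.speak("First Name");
channel.speak("Last Name");
channel.finish();
channel.log.forEach(function (entry) { console.log(entry); });
```

The log shows ‘First Name’ being interrupted and only ‘Last Name’ running to completion, which is what keeps fast tabbing from building up a backlog of speech.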
OK, impress me even more!
Because this is a PDF Forms-based implementation of TTS, there are countless ways you can use it interactively and dynamically. Try this:
You can really let your imagination run wild with this. There are almost endless uses for this system, and all you need is a bare-bones PDF Form with two simple fields on it. It couldn’t be any simpler.
What’s the drawback?
At present, this PDF TTS system only works with the contents of PDF Forms. It will not read out the text of a standard PDF page (more about that below), nor will it read out any text contained in PDF images. However, with a bit of careful planning and some hidden text fields, you can easily overcome this so that your PDF files are fully TTS-compliant!
Here’s an insanely easy example. An entire existing ebook in PDF format could easily be made to read out loud by generating a transparent overlay template containing a hidden text field on each page of the book. Then it would only be a matter of copying and pasting the ASCII text for each page into the corresponding hidden field.
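The bookkeeping behind that idea is simple enough to sketch. The code below does not touch Acrobat at all; it only plans one hidden field per page from an array of page texts. The `planHiddenFields` function, the `ttsPage` naming scheme and the shape of the returned objects are all hypothetical, chosen purely for illustration.

```javascript
// Hypothetical planner: one hidden text field per page, holding that page's plain text.
function planHiddenFields(pageTexts) {
  return pageTexts.map(function (text, index) {
    return {
      name: "ttsPage" + (index + 1), // hypothetical naming scheme, one field per page
      page: index,                   // zero-based page index
      hidden: true,                  // the field exists for speech, not for display
      value: text.replace(/\s+/g, " ").trim() // collapse line breaks and extra spaces
    };
  });
}

const plan = planHiddenFields(["Chapter  One\n", "It was a dark night."]);
console.log(plan[0].name, "->", plan[0].value); // ttsPage1 -> Chapter One
```

Collapsing the whitespace is my own choice here, on the assumption that stray line breaks from copied page text add nothing when the text is spoken.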
The vast majority of this task can easily be automated with my FREE PDF Enhancement Generator [PDF: 100 kb] available at Planet PDF.
Suddenly ebooks are accessible to a much bigger market! I see even more profits being derived from existing PDF formatted ebooks. Ebook publishers everywhere should love me to pieces for telling them this and handing over everything they need to make it happen, free of charge, and with minimal effort. I’ve done my part as a PDF researcher, now it’s up to ebook publishers to do theirs. It’s an important job, so get it done.
It’s also an investment for the future. Eventually, I suspect, the Microsoft TTS engines will be available for other platforms. When they are, your ebooks will already be ready for action because this PDF TTS technology is cross-platform compatible.
All this is merely the beginning. There’s no longer any reason for Windows users not to have this type of accessibility, starting now. The technology is free, and it’s quick and easy to implement (Planet PDF has all the resources, tools and training material you need for the task). Let’s use the technology to help make the world a better place for everyone.
All this new TTS stuff is so great that I almost forgot to mention the other free accessibility tools that Adobe provides for dealing with actual PDF text. Here is a list of them, with their URLs.