PDFs and their content – Part 2

Editor’s Note:
Jim King is a Senior Principal Scientist and PDF Architect at Adobe. This article originally appeared on Inside PDF, and has been reprinted with permission.

I think PDF forms represent a very powerful and significant tool. Increasingly, we want both humans and computers to read documents. However, the requirements for easy reading are considerably different for each.

For a long time the primary use of computers was ‘data processing’. Business data processing works with structured data consisting primarily of numbers and text strings whose meaning and properties are well defined and known a priori. Much of the data semantics is built into the data processing software. In the last decade or two, the distribution and sharing of information among humans has moved in to share the primary spot.

Humans, when given numbers and strings, also need a context in which to understand their meaning and significance.

When creating a PDF form, a designer makes a very explicit decision about which information is needed by humans and which is needed by our data processing software. People do document processing, and for the most part computers do data processing.

Those places where fill-able blanks occur in a form define the data that is being collected or displayed. The background or "artwork" of the form turns the raw data into a document that provides the context in which the human can understand the data.

Here is a diagram showing how an artwork presentation layer and a data layer come together to make a document from which both the human and the computer can obtain exactly what they need.

PDF forms maintain this separation of layers, and the data layer can be imported into or exported out of the form artwork layer. Humans see the composed document, while the computer can process only the data, using traditional data processing software. Either the computer or the human can supply the contents of the data layer, whether for presenting or for gathering data.
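To make the separation concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how PDF itself stores or exchanges form data: the `Field` class, the `ARTWORK` list, the field names, and the `compose` and `export_data` helpers are all hypothetical names invented for this example. The artwork layer holds the labels a human needs for context, the data layer is a plain mapping of field names to values, and the two are combined only when a person needs to read the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str   # key the data processing software uses
    label: str  # artwork text that gives a human the context


# Hypothetical artwork layer: the fixed, human-facing part of the form.
ARTWORK = [
    Field("customer", "Customer name"),
    Field("quantity", "Quantity ordered"),
    Field("unit_price", "Unit price (USD)"),
]


def compose(artwork, data):
    """Merge the data layer into the artwork to get human-readable lines."""
    return [f"{field.label}: {data.get(field.name, '')}" for field in artwork]


def export_data(artwork, data):
    """Return only the data layer, limited to fields the form defines."""
    return {field.name: data[field.name] for field in artwork if field.name in data}


# The data layer can come from a person filling in blanks or from software.
data = {"customer": "Ada Lovelace", "quantity": 3, "unit_price": 9.5}

# The human reads the composed document.
for line in compose(ARTWORK, data):
    print(line)

# The computer works with the exported data alone.
exported = export_data(ARTWORK, data)
total = exported["quantity"] * exported["unit_price"]
```

In this sketch the program never needs the labels to compute `total`, and the reader never needs to know the field keys to understand the composed lines, which is the division of labor described above.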

So, I think forms offer a very clever way for computers and humans each to see the part of a document that is most suitable and necessary for them to process.


About the Author: Jim King
