With the advent of Adobe Acrobat 5.0, it is now possible to use a PDF form in your web applications. Adobe has beefed up its JavaScript implementation so that most of the things you would ever want to do with a PDF can be done programmatically. There is even an object called ADBC (Acrobat Database Connectivity) that allows database access in much the same way as Microsoft's ActiveX Data Objects and serves as the PDF's interface to ODBC. Subordinate to it are the Connection object and the Statement object. As the names imply, the Connection object manages the actual connectivity to the database, and the Statement object executes SQL statements.
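As a rough illustration of how those pieces fit together, here is a minimal sketch. The method names (`newConnection`, `newStatement`, `execute`, `nextRow`, `getRow`) are the ones I recall from Adobe's Acrobat JavaScript reference, so check them against the reference for your version. The DSN, table, and column names are made up, and the ADBC object is passed in as a parameter so the function can also be exercised outside Acrobat with a stand-in.

```javascript
// Sketch only: read one column from the first row of a query via ADBC.
// Inside Acrobat you would pass the global ADBC object as `adbc`.
// "FormsDSN", "Applicants", and "LastName" below are hypothetical names.
function firstValue(adbc, dsn, sql, column) {
  // Connection object: manages the link to the ODBC data source.
  var connection = adbc.newConnection(dsn);
  if (connection == null) {
    throw new Error("Could not connect to data source: " + dsn);
  }

  // Statement object: executes SQL against that connection.
  var statement = connection.newStatement();
  if (statement == null) {
    throw new Error("Could not create a statement on: " + dsn);
  }

  // How execute/nextRow report failure may differ between Acrobat
  // versions, so any error they raise is simply allowed to propagate.
  statement.execute(sql);
  statement.nextRow();

  // getRow returns a Row whose properties are named after the columns;
  // each column exposes its data through `.value`.
  var row = statement.getRow();
  return row[column].value;
}

// In Acrobat, usage might look like:
//   var name = firstValue(ADBC, "FormsDSN",
//                         "SELECT LastName FROM Applicants", "LastName");
//   this.getField("lastName").value = name;
```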

This is all well and good, but why do this when you could do a standard HTML form with server scripting behind it? There are several reasons.

  • Your form looks just like the printed version. WYSIWYG. This may flatten the learning curve if you are moving from paper forms to the web. Or perhaps your forms are already in PDF, and you don't want to reinvent the wheel.
  • The ability to save the document in PDF. Duh.
  • Collaboration. Version 5.0 allows users to annotate PDFs as if the files were on their own desktops, and the annotations are instantly updated on the server so that real-time collaboration is possible. It is also possible to download the form, mark it up offline, and upload the annotations later.
  • Encryption. Acrobat 5.0 offers 128-bit RC4 encryption right out of the box. You can publish your document on your web server, restricting access to only those you specify on your trusted certificate list.
  • Digital signatures. This is a pretty cool feature. It uses the RSA algorithm to produce the public/private key pair and the X.509 standard for certificates.

The neat thing is, once your form is published on your web server (intranet, internet, whatever), users who are in your trusted certificate list can log in to the document, fill out the form, sign it, and submit it. Through JavaScript, you can validate data down to the field level. You can also specify, through code, that the document should be submitted only if the signature is valid and authenticated. The form data can be submitted in Adobe's Forms Data Format (FDF), HTML, or XML, or you can send the entire PDF. Choosing HTML puts the field data into the body of an HTTP request, which can be parsed and dealt with at the server.
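Here is a minimal sketch of field-level validation plus a signature-gated submit. It assumes the `getField`, `signatureValidate`, and `submitForm` methods from the Acrobat JavaScript reference; the status code for "valid and verified" is written from memory and should be checked against the reference. Field names, the URL, and the validation rules are hypothetical, and the document is passed in as `doc` (inside Acrobat that would be `this`) so the logic can be tried with a stand-in object.

```javascript
// Sketch only. Inside Acrobat, `doc` would be the document (`this`).

// Field-level validation: returns a list of human-readable problems.
// `rules` maps a field name to a regular expression its value must match.
function validateFields(doc, rules) {
  var problems = [];
  for (var name in rules) {
    if (!Object.prototype.hasOwnProperty.call(rules, name)) continue;
    var field = doc.getField(name);
    var value = field == null ? "" : String(field.value);
    if (!rules[name].test(value)) {
      problems.push(name + " is missing or malformed");
    }
  }
  return problems;
}

// Assumption: signatureValidate() returns 4 when the signature is valid
// and the signer's identity has been verified. Confirm for your version.
var SIG_VALID_AND_VERIFIED = 4;

// Submits only when every field passes and the signature checks out.
function submitIfSigned(doc, sigFieldName, url, rules) {
  var problems = validateFields(doc, rules);
  if (problems.length > 0) {
    return { submitted: false, problems: problems };
  }
  var sig = doc.getField(sigFieldName);
  if (sig == null || sig.signatureValidate() !== SIG_VALID_AND_VERIFIED) {
    return { submitted: false, problems: ["signature is not valid and verified"] };
  }
  // Passing false as the second argument asks for HTML (URL-encoded)
  // form data instead of the default FDF.
  doc.submitForm(url, false);
  return { submitted: true, problems: [] };
}

// In Acrobat, usage might look like (all names hypothetical):
//   submitIfSigned(this, "applicantSignature",
//                  "http://intranet.example/submit",
//                  { ssn: /^\d{3}-\d{2}-\d{4}$/ });
```

Returning the list of problems rather than popping up alerts keeps the check separate from whatever you use to tell the user what went wrong.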

I'm using this setup at work. I have a customer who does oversight for a federal program, and he wants to move away from paper. The problem was that a signature is a requirement for this program. So he sent me the form in Microsoft Word format, I distilled it into PDF, added Acrobat form fields, and published it. The signatures are all housed in a secure directory, and the scripting writes data to the database only when Acrobat validates the signature on the form. Though we haven't put it into production yet, it seems to work as expected.
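For the receiving end of an HTML-format submission, the parse-then-write step can be sketched roughly as follows. This is not the code from my setup: it assumes a JavaScript runtime with `URLSearchParams` (Node.js, for example), the table and field names are invented, and it only builds a parameterized statement rather than talking to a real database.

```javascript
// Sketch only: turn an HTML-format form submission into a parameterized
// INSERT. "Submissions", "lastName", and "program" are hypothetical names.

function parseFormBody(body) {
  var fields = {};
  // URLSearchParams decodes application/x-www-form-urlencoded data,
  // including "+" for spaces and %XX escapes.
  new URLSearchParams(body).forEach(function (value, name) {
    fields[name] = value;
  });
  return fields;
}

function buildInsert(table, fields, allowedColumns) {
  // Only whitelisted columns make it into the SQL text; the values travel
  // separately as parameters and are never spliced into the statement.
  var columns = allowedColumns.filter(function (c) {
    return Object.prototype.hasOwnProperty.call(fields, c);
  });
  if (columns.length === 0) {
    throw new Error("no recognized fields in submission");
  }
  var placeholders = columns.map(function () { return "?"; });
  return {
    sql: "INSERT INTO " + table + " (" + columns.join(", ") + ") VALUES (" +
         placeholders.join(", ") + ")",
    params: columns.map(function (c) { return fields[c]; })
  };
}

var fields = parseFormBody("lastName=O%27Brien&program=Pilot+A&extra=x");
var insert = buildInsert("Submissions", fields, ["lastName", "program"]);
console.log(insert.sql);
console.log(insert.params);
```

Unknown fields such as `extra` are dropped rather than rejected here; whether that is the right call depends on how strict the program's record-keeping needs to be.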

Caveat: This only works in version 5.0. Everyone accessing the form has to have either Acrobat 5.0 or Acrobat Approval 5.0 on their machine; the annotation, save-to-disk, and signature features are not available in Acrobat Reader. As for the encryption, don't confuse key sizes: the break people usually cite, by Ian Goldberg in 1997, was against a 40-bit key, and using 250 computers it took him about four hours. As far as I know, nobody has brute-forced a 128-bit key. I still wouldn't rely on this alone if you are doing top secret work, but it will keep the boys in accounting or whatever from peeping where they shouldn't. Unless they are hella 1337 accountants and have a beowulf cluster or something.
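A rough way to see how much key length matters: brute-force work doubles with every added bit. The sketch below scales a baseline of about four hours on 250 computers, which I am assuming applies to an exhaustive 40-bit search, up to 128 bits on the same hardware. The baseline and the helper names are assumptions for illustration only.

```javascript
// Sketch only: how brute-force time scales with key length.
// Assumed baseline: a full 40-bit key search in about 4 hours.

// How many times more keys there are at targetBits than at baseBits.
function scaleFactor(baseBits, targetBits) {
  return 2n ** BigInt(targetBits - baseBits);
}

// Whole years (rounded down) to search targetBits on the same hardware
// that needs baseHours to search baseBits.
function yearsToSearch(baseBits, baseHours, targetBits) {
  var hours = scaleFactor(baseBits, targetBits) * BigInt(baseHours);
  return hours / (24n * 365n);
}

console.log(scaleFactor(40, 128).toString());      // 2^88 times more work
console.log(yearsToSearch(40, 4, 128).toString()); // years on the same hardware
```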