Overview

Formatting objects (FO) are part of the XSL (Extensible Stylesheet Language) specification, the companion to XSLT, and are used to create printable output from an XML document. Compare XHTML and CSS, which are used to present information in a web browser: the document is usually one gigantic page, which can be a huge pain in the ass if you want hard copy output. FO exists to provide this hard copy output, and theoretically you could use it to lay out and print an entire book. Since formatting objects are still a young technology, there aren't very many FO processors available yet. The two most common and complete are the Apache project's FOP and XEP by RenderX. Both render the FO document directly to PDF and, depending on the processor, to other formats such as PS and PCL. Other processors are available, many of which convert the FO document to a TeX document, which must then be processed by TeX to get printable output.

The FO specification gives a stylesheet author the ability to define a number of page layouts and all of the properties of those pages: margins, headers, footers, and so on. These pages can then be arranged in a page sequence.
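As a rough sketch of what such a page definition can look like, here is a minimal FO document with one page layout, a header, and a footer. The master name "letter", the dimensions, and the text are made up for illustration, and the attribute names follow the XSL 1.0 Recommendation; processors built against earlier drafts may expect slightly different names.

```xml
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- One page layout: US letter with margins on all four sides -->
    <fo:simple-page-master master-name="letter"
        page-width="8.5in" page-height="11in"
        margin-top="0.5in" margin-bottom="0.5in"
        margin-left="1in" margin-right="1in">
      <!-- The body leaves room at top and bottom for header and footer -->
      <fo:region-body margin-top="0.75in" margin-bottom="0.75in"/>
      <fo:region-before extent="0.5in"/>  <!-- header area -->
      <fo:region-after extent="0.5in"/>   <!-- footer area -->
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="letter">
    <!-- Static content repeats on every page of the sequence -->
    <fo:static-content flow-name="xsl-region-before">
      <fo:block text-align="center">Sample Report</fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">Page <fo:page-number/></fo:block>
    </fo:static-content>
    <!-- The flow holds the content that spills from page to page -->
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Body text goes here.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```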

Say you wanted to print a report, duplexed (printed on both sides of the paper), with a two-inch right margin for even pages and a two-inch left margin for odd pages (for hole punching and putting into binders). FO gives you the power to do this by allowing you to define separate page layouts for a cover page, an abstract page, a table of contents page, and then the even and odd pages. You can put these pages together using a page sequence, so that you have one cover page, one abstract page, one table of contents page, and then alternating even and odd pages.
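A minimal sketch of that kind of layout follows, assuming the binding edge falls on the left of odd pages and on the right of even pages. To keep it short it defines only a cover page plus the odd and even body pages; abstract and table of contents pages would be added the same way, as further single-page references. All master names and dimensions are illustrative, and the attribute names follow the XSL 1.0 Recommendation.

```xml
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="cover"
        page-width="8.5in" page-height="11in"
        margin-top="1in" margin-bottom="1in"
        margin-left="1in" margin-right="1in">
      <fo:region-body/>
    </fo:simple-page-master>

    <!-- Odd pages: wide margin on the left for hole punching -->
    <fo:simple-page-master master-name="odd-page"
        page-width="8.5in" page-height="11in"
        margin-top="1in" margin-bottom="1in"
        margin-left="2in" margin-right="1in">
      <fo:region-body/>
    </fo:simple-page-master>

    <!-- Even pages: wide margin on the right -->
    <fo:simple-page-master master-name="even-page"
        page-width="8.5in" page-height="11in"
        margin-top="1in" margin-bottom="1in"
        margin-left="1in" margin-right="2in">
      <fo:region-body/>
    </fo:simple-page-master>

    <!-- One cover page, then odd and even layouts chosen by page number -->
    <fo:page-sequence-master master-name="report">
      <fo:single-page-master-reference master-reference="cover"/>
      <fo:repeatable-page-master-alternatives>
        <fo:conditional-page-master-reference
            master-reference="odd-page" odd-or-even="odd"/>
        <fo:conditional-page-master-reference
            master-reference="even-page" odd-or-even="even"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="report">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Report text goes here.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```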

Process

I'll compare the process of getting a PDF out of an XML document with the process of getting HTML out of the same document.

It is pretty straightforward to write an XSLT stylesheet to transform an XML document into HTML. In some cases, you only have to replace the XML tags with proper HTML tags and add the header material (<head>, <title>, etc.). Once you have the XML document and a stylesheet to transform it, you can run a processor on the pair to create an HTML document. This can either be done on the server, before a web browser looks at it, or, if you have a browser with a built-in XSLT processor, such as IE 5.5 or Mozilla, you can simply include a stylesheet directive in the XML document pointing to the stylesheet. When the browser loads the XML document, it will automatically load and process the stylesheet as well. To summarize:

  1. Generate the XML document (maybe from a database).
  2. Use an XSLT stylesheet to transform the XML into HTML.
  3. View the HTML in a browser.
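To make the browser-side route concrete, here is a small sketch. The element names (report, title, para) and the file name report-to-html.xsl are invented for illustration. The second line of the XML document is the stylesheet directive mentioned above.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="report-to-html.xsl"?>
<report>
  <title>Quarterly Sales</title>
  <para>Sales rose in the third quarter.</para>
</report>
```

A matching stylesheet, again only a sketch, mostly swaps the XML tags for HTML ones and adds the header material:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- The root element becomes the HTML skeleton -->
  <xsl:template match="/report">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="para"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each para becomes an HTML paragraph -->
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```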

Compare this to the process of generating a PDF file (using FOP) from an XML document. For this, you'll also have to create an XSLT stylesheet, but rather than spitting out HTML, it will convert the XML to another XML document where the tags are in the FO namespace. This stylesheet will in general be more complicated, but only because you have to include a lot more header information that describes the page formats and sequences.
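A stylesheet of this kind might look roughly like the sketch below. It assumes a hypothetical input document with a report root element containing title and para children; those names, the master name "page", and the font and spacing values are illustrative only. Note how much of it is page setup rather than content mapping.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- The root element produces the FO "header": page layout and sequence -->
  <xsl:template match="/report">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-width="8.5in" page-height="11in"
            margin-top="1in" margin-bottom="1in"
            margin-left="1in" margin-right="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <!-- Select elements explicitly so stray whitespace is not copied -->
          <xsl:apply-templates select="title | para"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Content mapping: each source element becomes a formatted block -->
  <xsl:template match="title">
    <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
      <xsl:value-of select="."/>
    </fo:block>
  </xsl:template>

  <xsl:template match="para">
    <fo:block space-after="6pt">
      <xsl:value-of select="."/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
```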

Once you have the stylesheet, you process your XML document with it the same way you would for the HTML example. Instead of getting HTML, you now have an FO document, which must then be passed through an FO processor like FOP to get printable (PDF) output. It's just one extra step.

  1. Generate the XML document (maybe from a database).
  2. Use an XSLT stylesheet to transform the XML into an FO XML document.
  3. Run an FO processor (such as FOP) on the FO document.
  4. View the PDF in a reader.

Uses

So far, formatting objects haven't really found a lot of use in the real world. This is probably mostly because the technologies surrounding FO are so new, and because the formatting object specification is still in flux. However, there are some places where formatting objects will work to great advantage.

In particular, I envision FO being used to generate reports for business applications (something I am experimenting with now). Anyone who has used Crystal Reports knows what a bitch it can be; I think that formatting objects will eventually become easier to use than Crystal, but not before a good WYSIWYG editor exists for creating page layouts and generating the XSLT stylesheets used to convert XML into FO documents. If there were a web-based application that did this, programmers could give their users the XML over the web and let users create their own reports, storing the resulting stylesheets on the server and doing all the processing there as well. Of course, there is a long way to go in this arena before FO approaches the usability of Crystal.

Conclusion

Since the only difference between HTML and PDF output is the stylesheet and an extra processing step, there is a lot of flexibility and extensibility available to present the same data in a lot of different formats. This is, of course, one of the big promises of XML and XSLT, and FO gives another type of output which makes this presentation mutability more of a reality.

References

Harold, Elliotte Rusty. "XSL Formatting Objects" XML Bible. 3 Jun. 2001 <http://www.ibiblio.org/xml/books/bible2/chapters/ch18.html>.

The Extensible Stylesheet Language (XSL). 10 Jul. 2001. The World Wide Web Consortium. 11 Aug. 2001. <http://www.w3c.org/Style/XSL/>.
