[Update: thanks for the suggestions; we went with RenderX and have it running from cron to rebuild our product manual as it changes]
I've been playing around with DocBook this weekend, converting our product manual to HTML and PDF. I'm using the DocBook XSL stylesheets to convert to HTML and XSL-FO, and then using an FO processor to convert from XSL-FO to PDF.
Apache FOP is a free FO processor, but the version that Gentoo emerged for me borks on our manual; it either stops generating pages after page 36, or spins in an infinite loop.
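For anyone who wants to try the same two-step pipeline, here is a rough sketch in Python of how the steps could be chained. It assumes `xsltproc` and the `fop` command-line wrapper are on the PATH; the stylesheet path and the `build_commands`/`render` helper names are only placeholders for illustration, and the stylesheet location in particular varies between distributions.

```python
import subprocess
from pathlib import Path

# Assumed location of the DocBook XSL "fo" stylesheet; adjust for your system.
FO_STYLESHEET = "/usr/share/sgml/docbook/xsl-stylesheets/fo/docbook.xsl"


def build_commands(source, stylesheet=FO_STYLESHEET):
    """Return the two command lines: DocBook -> XSL-FO, then XSL-FO -> PDF."""
    src = Path(source)
    fo = src.with_suffix(".fo")    # intermediate XSL-FO file
    pdf = src.with_suffix(".pdf")  # final output
    to_fo = ["xsltproc", "-o", str(fo), stylesheet, str(src)]
    to_pdf = ["fop", "-fo", str(fo), "-pdf", str(pdf)]
    return [to_fo, to_pdf]


def render(source):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(source):
        subprocess.run(cmd, check=True)
```

Calling `render("manual.xml")` would leave `manual.fo` and `manual.pdf` next to the source. Swapping in a different FO processor should only mean changing the second command.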
I also tried XMLroff, which is the only C based FO processor I've found (based on libxml2). It segfaults straight away for me, so it's not immediately useful; maybe a future release will work.
I've downloaded trial versions of two commercial offerings: RenderX XEP and Lunasil XINC.
RenderX seems to work ok, but blanks out every odd page after page 11, so it's a little bit hard to figure out if we want to pay $300 for the full version. It does look promising though, and the price doesn't sound that bad.
Lunasil XINC appears to be based on an older version of Apache FOP, and doesn't have support for PDF bookmarks. It works though, which is more than can be said for the real Apache FOP that I tried. Lunasil XINC is only $95 for the full version.
Does anyone else have any experience in this area and care to share it? Has anyone dared to implement XSL-FO -> PDF using PHP?
At SitePoint (http://www.sitepoint.com/), we've used RenderX XEP for several years to publish all of our books. As the one responsible for keeping our book rendering system healthy, I can certainly testify to the quality and reliability of the product.
We're in the process of implementing PDF rendering of all the articles on sitepoint.com using the product, but as the render is a relatively heavy process we will be pre-rendering on content submission, rather than upon request.
Also, Apache FOP consumes too much memory. It takes more than 1 GB to process the PHP manual. I'd also be glad to know if there is a better (and free) FO processor.
I wrote a complete FO script to convert a technical manual from a custom XML format to PDF using Apache FOP. I tried running the script under RenderX to see if it could run any faster, and the results were not at all what I expected. I have the feeling the implementation of the XSL-FO specification is still immature... a bit like HTML a few years ago.
The script is now used in production using Apache FOP. It generated a manual of over 600 pages, including multiple SVG graphics. It does work well, but works better on the graphic designer's monster PC. Rasterizing the SVG images eats up a lot of memory and CPU. With the images removed, generation time dropped from 15 minutes to a few seconds.
The solution is definitely not for real-time web transactions, but if you can afford generating the documents overnight, it's good enough.
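For the overnight approach, one simple trick is to regenerate a PDF only when its source has changed since the last run. A minimal sketch in Python, assuming plain file modification times are a good enough signal; `needs_rebuild` is a made-up helper name, not part of any FO tool:

```python
import os


def needs_rebuild(source, target):
    """Return True if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    # Rebuild only when the source was modified after the last render.
    return os.path.getmtime(source) > os.path.getmtime(target)
```

A nightly job could loop over the source files, call `needs_rebuild(xml_path, pdf_path)` for each, and only run the slow FO processor on the ones that return True.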
The good part about FOP compared to RenderX is that there is no GUI.
I also generated a few documents using PDFLib with PHP (take care, there are changes between PHP4 and PHP5... and documentation was a problem a few months ago). It's not too hard, but for complex documents it might get harder. I can't really compare on large scales.
We are using Ecrion XF Rendering Server from www.ecrion.com (http://www.ecrion.com) to process invoices from XML to PDF (via XSL-FO). It is very fast and it costs only $995. Very reliable and consumes a lot less memory.
