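Apache PDFBox 1.4.0 Released

Published on December 21, 2010
PDFBox is a Java library for working with PDF documents.

PDFBox 1.4.0 is a feature-improvement release in the 1.x series. It contains many improvements and bug fixes related to text extraction, AES encryption, and the handling of malformed PDF documents.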


New Features

   [PDFBOX-865] - Optional Content Groups (OCGs aka layers): initial support
   [PDFBOX-913] - Add program which decompresses object streams


Improvements

   [PDFBOX-521] - Improved PDF Text Extraction that notes paragraph boundaries
   [PDFBOX-885] - Add constructors from super class to PDFTextStripperByArea to support encoding
   [PDFBOX-893] - Performance improvement in PDFStreamEngine and Matrix (patch included)
   [PDFBOX-909] - Add support for a 6 element matrix
   [PDFBOX-914] - Using TextToPDF to create a PDF from the empty string produces unreadable PDF file (patch included)

Bug Fixes

   [PDFBOX-28] - Splitting a PDF creates unnecessarily large chunks
   [PDFBOX-671] - Cannot use PDFToImage to convert Chinese PDF pages into images.
   [PDFBOX-751] - Text Extraction truncates last character when image page has sideways text
   [PDFBOX-759] - Special characters not extracted
   [PDFBOX-779] - All English characters and some Chinese words are separated by a space
   [PDFBOX-806] - Failure to extract dc:description when the value is the node text
   [PDFBOX-854] - PDPageContentStream.drawString() doesn't work with all PDFs
   [PDFBOX-872] - ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream
   [PDFBOX-881] - Incorrect output when word spacing is achieved by matrix translation
   [PDFBOX-883] - Special characters are not correctly handled anymore when printing or exporting to image
   [PDFBOX-887] - CCITTFaxDecodeFilter doesn't use the abbreviated names for image parameters
   [PDFBOX-888] - Decrypt doesn't allow more than 3 args
   [PDFBOX-889] - Empty page causes NPE in importPage
   [PDFBOX-896] - PDFViewer doesn't render landscape mode correctly
   [PDFBOX-897] - NullPointerException PDFFont#getEncodingFromFont with a PDF book because Type1Encoding is null
   [PDFBOX-898] - COSStreamArray NullPointerException. firstStream is null if COSArray contains no items
   [PDFBOX-900] - ArrayIndexOutOfBoundsException with extracting labels from malformed document
   [PDFBOX-902] - ClassCastException caused by unhandled Markup Annotations.
   [PDFBOX-907] - Encrypted Key not correctly calculated when the meta data is not encrypted
   [PDFBOX-910] - certain sequences (such as endstrea[^m]) are eaten by BaseParser#readUntilEndStream
   [PDFBOX-918] - Can't parse PDF
   [PDFBOX-921] - NumberFormatException when parsing a type1 font
