All of the interesting technological, artistic or just plain fun subjects I'd investigate if I had an infinite number of lifetimes. In other words, a dumping ground...

Monday 28 July 2008

Java access to Microsoft file formats

Apache POI - Java API To Access Microsoft Format Files


The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you can read and write MS Excel files using Java. Soon, you'll be able to read and write Word, PowerPoint and Visio files using Java. POI is your Java Excel solution as well as your Java Word solution. However, we have a complete API for porting other OLE 2 Compound Document formats, and welcome others to participate.

OLE 2 Compound Document Format based files include most Microsoft Office files such as XLS and DOC as well as MFC serialization API based file formats.

At this time, none of our releases support the new Office Open XML file formats, such as .xlsx or .docx. Work to support these is in progress, and people interested should follow the dev list. We expect this support to make it into a full release by the summer.
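Since the releases described here handle the OLE 2 based formats but not Office Open XML, it can help to check which container a file actually uses before passing it to a library. The sketch below is not part of POI; the class and method names (`OfficeContainerSniffer`, `sniff`) are made up for illustration. It relies only on two well-known signatures: OLE 2 Compound Documents start with the eight bytes `D0 CF 11 E0 A1 B1 1A E1`, and the Office Open XML formats are ZIP archives, which normally start with `PK\x03\x04`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not a POI class: guesses the container format
// of an Office file from its leading bytes.
final class OfficeContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // Signature at the start of every OLE 2 Compound Document (.xls, .doc, ...).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature "PK\x03\x04" (.xlsx, .docx are ZIP archives).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    private OfficeContainerSniffer() { }

    // Classifies a buffer holding the first bytes of a file.
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    // Reads up to eight bytes from the file and classifies them.
    static Container sniff(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                read += n;
            }
        }
        return sniff(Arrays.copyOf(header, read));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller might use `OfficeContainerSniffer.sniff(path)` and only hand the file to an OLE 2 reader when the result is `OLE2`. Note that a `ZIP` result only says the file is a ZIP archive; it does not prove the archive is an Office Open XML document.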

As a general policy, we try to collaborate as much as possible with other projects to provide this functionality. Examples include: Cocoon, for which there are serializers for HSSF; OpenOffice.org, with whom we collaborate in documenting the XLS format; and Lucene, for which we provide format interpreters. When practical, we donate components directly to those projects to POI-enable them.

