<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> <!--*** This is a generated file. Do not edit. ***--> <link rel="stylesheet" href="../skin/tigris.css" type="text/css"> <link rel="stylesheet" href="../skin/mysite.css" type="text/css"> <link rel="stylesheet" href="../skin/site.css" type="text/css"> <link media="print" rel="stylesheet" href="../skin/print.css" type="text/css"> <title>POI-HDGF - Java API To Access Microsoft Visio Format Files</title> </head> <body bgcolor="white" class="composite"> <!--================= start Banner ==================--> <div id="banner"> <table width="100%" cellpadding="8" cellspacing="0" summary="banner" border="0"> <tbody> <tr> <!--================= start Group Logo ==================--> <td width="50%" align="left"> <div class="groupLogo"> <a href="http://poi.apache.org"><img border="0" class="logoImage" alt="Apache POI" src="../resources/images/group-logo.jpg"></a> </div> </td> <!--================= end Group Logo ==================--> <!--================= start Project Logo ==================--><td width="50%" align="right"> <div align="right" class="projectLogo"> <a href="http://poi.apache.org/"><img border="0" class="logoImage" alt="POI" src="../resources/images/project-logo.jpg"></a> </div> </td> <!--================= end Project Logo ==================--> </tr> </tbody> </table> </div> <!--================= end Banner ==================--> <!--================= start Main ==================--> <table width="100%" cellpadding="0" cellspacing="0" border="0" summary="nav" id="breadcrumbs"> <tbody> <!--================= start Status ==================--> <tr class="status"> <td> <!--================= start BreadCrumb ==================--><a href="http://www.apache.org/">Apache</a> | <a href="http://poi.apache.org/">POI</a><a href=""></a> <!--================= end BreadCrumb 
==================--></td><td id="tabs"> <!--================= start Tabs ==================--> <div class="tab"> <span class="selectedTab"><a class="base-selected" href="../index.html">Home</a></span> | <script language="Javascript" type="text/javascript"> function printit() { if (window.print) { window.print() ; } else { var WebBrowser = '<OBJECT ID="WebBrowser1" WIDTH="0" HEIGHT="0" CLASSID="CLSID:8856F961-340A-11D0-A96B-00C04FD705A2"></OBJECT>'; document.body.insertAdjacentHTML('beforeEnd', WebBrowser); WebBrowser1.ExecWB(6, 2);//Use a 1 vs. a 2 for a prompting dialog box WebBrowser1.outerHTML = ""; } } </script><script language="Javascript" type="text/javascript"> var NS = (navigator.appName == "Netscape"); var VERSION = parseInt(navigator.appVersion); if (VERSION > 3) { document.write(' <a title="PRINT this page OUT" href="javascript:printit()">PRINT</a>'); } </script> </div> <!--================= end Tabs ==================--> </td> </tr> </tbody> </table> <!--================= end Status ==================--> <table id="main" width="100%" cellpadding="8" cellspacing="0" summary="" border="0"> <tbody> <tr valign="top"> <!--================= start Menu ==================--> <td id="leftcol"> <div id="navcolumn"> <div class="menuBar"> <div class="menu"> <span class="menuLabel">Apache POI</span> <div class="menuItem"> <a href="../index.html">Top</a> </div> </div> <div class="menu"> <span class="menuLabel">HDGF</span> <div class="menuItem"> <span class="menuSelected">Overview</span> </div> </div> </div> </div> <form target="_blank" action="http://www.google.com/search" method="get"> <table summary="search" border="0" cellspacing="0" cellpadding="0"> <tr> <td><img height="1" width="1" alt="" src="../skin/images/spacer.gif" class="spacer"></td><td nowrap="nowrap"> Search Apache POI<br> <input value="poi.apache.org" name="sitesearch" type="hidden"><input size="10" name="q" id="query" type="text"><img height="1" width="5" alt="" src="../skin/images/spacer.gif" 
class="spacer"><input name="Search" value="GO" type="submit"></td><td><img height="1" width="1" alt="" src="../skin/images/spacer.gif" class="spacer"></td> </tr> <tr> <td colspan="3"><img height="7" width="1" alt="" src="../skin/images/spacer.gif" class="spacer"></td> </tr> <tr> <td class="bottom-left-thick"></td><td bgcolor="#a5b6c6"><img height="1" width="1" alt="" src="../skin/images/spacer.gif" class="spacer"></td><td class="bottom-right-thick"></td> </tr> </table> </form> </td> <!--================= end Menu ==================--> <!--================= start Content ==================--><td> <div id="bodycol"> <div class="app"> <div align="center"> <h1>POI-HDGF - Java API To Access Microsoft Visio Format Files</h1> </div> <div class="h3"> <a name="Overview"></a> <div class="h3"> <h3>Overview</h3> </div> <p>HDGF is the POI Project's pure Java implementation of the Visio file format.</p> <p>Currently, HDGF provides a low-level, read-only API for accessing Visio documents. It also provides a <a href="http://svn.apache.org/repos/asf/poi/trunk/src/scratchpad/src/org/apache/poi/hdgf/extractor/">way</a> to extract the textual content from a file. </p> <p>At this time, there is no <em>usermodel</em> API or similar, only low-level access to the streams, chunks and chunk commands. Users are advised to check the unit tests to see how everything works. They are also well advised to read the documentation supplied with <a href="http://web.archive.org/web/20071212220759/http://www.gnome.ru/projects/vsdump_en.html">vsdump</a> to get a feel for how Visio files are structured.</p> <p>To get a feel for the contents of a file, and to track down where data of interest is stored, HDGF comes with <a href="http://svn.apache.org/repos/asf/poi/trunk/src/scratchpad/src/org/apache/poi/hdgf/dev/">VSDDumper</a> to print out the contents of the file. 
Users should also make use of <a href="http://web.archive.org/web/20071212220759/http://www.gnome.ru/projects/vsdump_en.html">vsdump</a> to probe the structure of files.</p> <div class="frame note"> <div class="label">Note</div> <div class="content"> This code currently lives in the <a href="http://svn.apache.org/viewcvs.cgi/poi/trunk/src/scratchpad/">scratchpad area</a> of the POI SVN repository. Ensure that you have the scratchpad jar or the scratchpad build area in your classpath before experimenting with this code. </div> </div> <a name="Steps+required+for+write+support"></a> <div class="h4"> <h4>Steps required for write support</h4> </div> <p>Currently, HDGF can only read Visio files; it cannot write them back out again. We believe the following steps would need to be taken to implement write support.</p> <ol> <li>Re-write the decompression support in LZW4HDGF as HDGFLZW, which will be much better documented, and also under the ASL. <strong>Completed October 2007</strong> </li> <li>Add compression support to HDGFLZW. <strong>In progress - works for small streams but encoding goes wrong on larger ones</strong> </li> <li>Have HDGF just write back the raw bytes it read in, and have a test to ensure the file is unchanged.</li> <li>Have HDGF generate the bytes to write out from the Stream stores, using the compressed data as appropriate, without re-compressing. Plus a test to ensure the file is unchanged.</li> <li>Have HDGF generate the bytes to write out from the Stream stores, re-compressing any streams that were decompressed. Plus a test to ensure the file is unchanged.</li> <li>Have HDGF re-generate the offsets in pointers for the locations of the streams. Plus a test to ensure the file is unchanged.</li> <li>Have HDGF re-generate the bytes for all the chunks, from the chunk commands. 
Tests to ensure the chunks are serialized properly, and then that the file is unchanged.</li> <li>Alter the data of one command, but keep it the same length, and check that Visio can open the file when written out.</li> <li>Alter the data of one command, to a new length, and check that Visio can open the file when written out.</li> </ol> <div id="authors" align="right">by Nick Burch</div> </div> </div> </div> </td> <!--================= end Content ==================--> </tr> </tbody> </table> <!--================= end Main ==================--> <!--================= start Footer ==================--> <div id="footer"> <table summary="footer" cellspacing="0" cellpadding="4" width="100%" border="0"> <tbody> <tr> <!--================= start Copyright ==================--> <td colspan="2"> <div align="center"> <div class="copyright"> Copyright © 2002-2012 The Apache Software Foundation. All rights reserved.<br> Apache POI, POI, Apache, the Apache feather logo, and the Apache POI project logo are trademarks of The Apache Software Foundation. </div> </div> </td> <!--================= end Copyright ==================--> </tr> <tr> <td align="left"> <!--================= start Host ==================--> <!--================= end Host ==================--></td><td align="right"> <!--================= start Credits ==================--> <div align="right"> <div class="credit"> <a href="http://validator.w3.org/check/referer"><img width="88" height="31" alt="Valid HTML 4.01!" src="../skin/images/valid-html401.png" class="logoImage"></a><a href="http://jigsaw.w3.org/css-validator/"><img width="88" height="31" alt="Valid CSS!" 
src="../skin/images/vcss.png" class="logoImage"></a><a href="http://forrest.apache.org/"><img border="0" class="logoImage" alt="Built with Apache Forrest" src="../skin/images/built-with-forrest-button.png" width="88" height="31"></a> </div> </div> <!--================= end Credits ==================--> </td> </tr> </tbody> </table> </div> <!--================= end Footer ==================--> </body> </html>