Technical documentation

Content of the odt2hdoc folder

  • The Ant script "odt_to_hdoc.ant", which is called when the convert button is clicked on the hdocConverter website.

  • The "source" folder contains:

    • a copy of the current odt doc with the extension ".zip"

    • an unzipped folder containing all the elements of the odt file

    • the XSL stylesheet used for the XSL transformation

  • The "sourcesJava" folder contains:

    • the src folder, which can be imported as a Java project in order to edit it (note that, after editing it, you should rebuild the JAR file with the JRE currently used by your server)

    • the JAR file created from the Java sources and called by the Ant script to perform the first transformation

  • The "outXslt" folder contains:

    • the build folder in which the content.xml updated by the xsl sheet will be dropped

  • The "outJava" folder contains:

    • the content.xml updated by the SAX parser (Java)

  • The "out" folder contains:

    • the build folder which will be zipped and which contains all the needed files for the hdoc creation.

The ant script

The ant script is the entry point of the converter. Here's how it works:

  1. First of all, this script will copy the input file into the source folder with the .zip extension (instead of .odt)

  2. Secondly, it will unzip the file into the odt folder

  3. Then, the most interesting part of the script:

    • It will run the Java SAX parser, which will add the sections

    • It will run the XSLT stylesheet, which takes the content.xml created by the Java parser as input.

      This stylesheet is "modele.xslt", which is contained in the source folder.

    • Next, it will move all the pictures from the odt doc to the out folder

    • Then, it will create the structure of the hdoc file in the out folder.

  4. To conclude, it will zip the archive and delete all the temporary files.
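Steps 1 and 2 work because an odt file is an ordinary ZIP archive. The real converter performs them inside the Ant script; the Java class below is only a minimal sketch of the same idea, written to make the mechanics visible. The class name OdtUnzipSketch and the method copyAndUnzip are hypothetical and are not part of the converter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical illustration only: the converter does this work in Ant.
public class OdtUnzipSketch {

    /** Copies the odt file to workDir with a .zip extension, then extracts it into workDir/odt. */
    public static Path copyAndUnzip(Path odtFile, Path workDir) throws IOException {
        Path work = workDir.toAbsolutePath().normalize();
        Files.createDirectories(work);

        // Step 1: copy the document, swapping the .odt extension for .zip.
        String zipName = odtFile.getFileName().toString().replaceFirst("\\.odt$", "") + ".zip";
        Path zipCopy = work.resolve(zipName);
        Files.copy(odtFile, zipCopy, StandardCopyOption.REPLACE_EXISTING);

        // Step 2: extract every entry of the archive into the "odt" folder.
        Path target = work.resolve("odt");
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(zipCopy))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = target.resolve(entry.getName()).normalize();
                // Refuse entries whose path would land outside the target folder.
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target folder: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        return target;
    }
}
```

After a call such as OdtUnzipSketch.copyAndUnzip(input, sourceFolder), the returned folder contains content.xml together with the pictures and other files embedded in the document.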

The Sax Parser (Java)

The aim of the Java parser is to add the sections recursively whenever a title is encountered. This could have been done with an XPath expression in XSLT, but a Java solution was considered the better choice.

The task takes two arguments: the input file path, given through the ANT variable ${InputPath}, and the output file path, set to outJava/content.xml.
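To illustrate what "adding the sections" can look like, here is a minimal sketch of a SAX handler that opens a section at each heading and closes the sections that the new heading ends. It is not the converter's actual code. It assumes that headings are the ODT text:h elements with a text:outline-level attribute, that all headings share the same parent element, and that the wrapper is called section; the class name SectionSketch is hypothetical as well.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: copies the input and wraps each heading and what follows it in <section>.
public class SectionSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    // Outline levels of the sections that are currently open (innermost on top).
    private final Deque<Integer> open = new ArrayDeque<>();
    private int depth = 0;          // number of elements currently open
    private int containerDepth = 0; // depth at which the headings live (assumed constant)

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (qName.equals("text:h")) {
            String lvl = atts.getValue("text:outline-level");
            int level = (lvl == null) ? 1 : Integer.parseInt(lvl);
            // A new heading ends every open section of the same or a deeper level.
            while (!open.isEmpty() && open.peek() >= level) {
                open.pop();
                out.append("</section>");
            }
            containerDepth = depth;
            open.push(level);
            out.append("<section>");
        }
        out.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i)).append("=\"")
               .append(escape(atts.getValue(i))).append('"');
        }
        out.append('>');
        depth++;
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        depth--;
        // When the element that contains the headings ends, close what is still open.
        while (!open.isEmpty() && depth < containerDepth) {
            open.pop();
            out.append("</section>");
        }
        out.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(escape(new String(ch, start, length)));
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Parses the given XML and returns a copy with the section elements added. */
    public static String addSections(String xml) throws Exception {
        SectionSketch handler = new SectionSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }
}
```

The stack of open levels is what makes the nesting work: a level 2 heading that follows a level 1 heading opens a section inside the first one, while the next level 1 heading closes both. A streaming parser keeps this state in a few lines, which is harder to express when grouping siblings in XSLT 1.0.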

The Xslt sheet

The XSL transformation sheet is named modele.xslt and is located in the source folder. The goal of this transformation is to wrap all the HDoc content generated so far in a valid HDoc structure, with html, head and body tags. If you want to look more closely at the stylesheet or improve it, feel free to open the modele.xslt file. The ANT script sets the output to the xslt/outXslt directory.
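The wrapping step can be tried outside Ant with the XSLT processor that ships with Java. The sketch below is an assumption-laden illustration: the inline stylesheet is a hypothetical stand-in and NOT the real modele.xslt, it ignores namespaces and any HDoc-specific attributes, and the class name WrapSketch is made up. It only shows the general shape of "put the existing content inside html, head and body".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: wraps the children of the root element in html/head/body.
public class WrapSketch {

    // Stand-in stylesheet for illustration; the real rules live in modele.xslt.
    static final String XSLT =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/*\">"
        + "<html><head/><body><xsl:copy-of select=\"*\"/></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stand-in stylesheet to the given XML and returns the result as a string. */
    public static String wrap(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

To experiment with the real stylesheet instead, the StreamSource built from XSLT can be replaced by one that reads the modele.xslt file, with the content.xml produced by the Java parser as input.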