Convert PDF Files to HTML

This sample illustrates how to convert a PDF into an HTML page.

Solid Framework has supported a PdfToHtmlConverter class for many years. This class provides a very simple mechanism for creating HTML from a PDF. The HTML it creates is “re-flowed”, which is ideal for reading the PDF on a small device. However, “re-flowed” HTML pages do not look the same as the original PDF, since the order of some elements may have been modified.

This sample builds the HTML using the Core Model rather than the PdfToHtmlConverter class. This allows fine-tuning of the conversion process and creates an HTML page that is visually similar to the original PDF.

To use the sample:

  1. Download a license from solidframework.net.
  2. Download the latest version of SolidFramework.dll.
  3. Download the PdfToHtml sample code, which is provided as a zip file.
  4. Extract the contents of the zip file.
  5. In Visual Studio, add a reference to SolidFramework.dll.
  6. The solution should then build successfully.
  7. Specify the path to the PDF as a command line option.

    Setting a command line option

  8. Run the project.
  9. The first time you try to convert a file you will get an error, because license information has not yet been entered. The easiest way to fix this is to set the path to the location where you saved the license file.

    Specifying the path to the license file

  10. Alternatively, you can enter the contents of the license.xml file directly into your code.
  11. Running the sample will then create an HTML file in the same folder as the source file.
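
As an illustration of steps 7 and 11, the relationship between the command line argument and the output location can be sketched as follows. This is a language-neutral sketch written in Python, not the sample's own code (the sample is a .NET project). The helper name html_output_path and the usage text are hypothetical, and it assumes the HTML file keeps the PDF's base name, which the steps above do not state explicitly.

```python
import sys
from pathlib import Path


def html_output_path(pdf_path):
    """Return the .html path in the same folder as the PDF (hypothetical helper)."""
    source = Path(pdf_path)
    if source.suffix.lower() != ".pdf":
        raise ValueError("expected a .pdf file, got: " + str(pdf_path))
    # Assumption: same folder and base name, with the extension swapped to .html.
    return source.with_suffix(".html")


def main(argv):
    # argv[0] is the program name; argv[1] is the path to the PDF (step 7).
    if len(argv) != 2:
        print("usage: pdftohtml <path-to-pdf>", file=sys.stderr)
        return 1
    print(html_output_path(argv[1]))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Checking the argument count up front gives a clear usage message when the command line option from step 7 has not been set.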