Solid Framework 10.0.9726

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.9726.

Among other improvements, this version includes faster performance and improved handling of Non-Standard Encoded files.

Solid Framework 10.0.9638

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.9638.

Solid Framework 10.0.9452

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.9452.

MAIN FEATURES

HTML Improvements

  • Prevention of duplicated text when exporting to HTML.
  • Implementation of the “Windows Connected File” feature when exporting to HTML with linked images, so that the folder containing the images is deleted automatically if the HTML file is deleted.
  • Support for renaming the constructed HTML file to an arbitrary name without causing navigation issues.

Note that this version contains some changes in the way that files are created when exporting to HTML, which could potentially cause minor breakage to existing code.

Other Improvements

  • Add support for optionally retaining headers and footers when converting to Excel.
  • Add support for converting to Unicode Text.
  • Improved editability of some files that were previously reconstructed with text boxes.

 

Breaking Change

The property PdfToOfficeJobEnvelope.OcrEngine was always set to TextRecoveryEngine.SolidOCR, which caused problems if the license did not support OCR.

The property has now been removed. This will cause compile-time errors, which can be resolved by removing any explicit reference to the property.

This affects: PdfToWordJobEnvelope, PdfToExcelJobEnvelope, PdfToPowerPointJobEnvelope, PdfToTextJobEnvelope, PdfToHtmlJobEnvelope and PdfToDataJobEnvelope.

Solid Framework 10.0.9340

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.9340.

MAIN FEATURES

Improved OCR

  • Optical character recognition has been improved, particularly with regard to the identification of list items.

Improved cell detection when exporting to Excel

This relates to PDFs in which several data items share a cell border. Previously the data were treated as part of a single large cell; they are now separated into distinct cells.

Improved extraction of list items that are beneath a transparent watermark

 

Solid Framework 10.0.9084

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.9084.

MAIN FEATURES

Better reconstruction when converting to HTML

  • Layout is now correct even for non-contiguous page ranges.
  • Layout is improved when reconstructing tables.

Smaller files created when saving a modified PDF

  • Object Stream Compression is now supported

Improvements for Accessibility

  • Alt-text is now correctly extracted when reconstructing Word documents.

Further Improvements in the handling of Non-Standard Encoded text

 

Updated 3rd Party Libraries

 

Solid Framework 10.0.8920

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.8920.

MAIN FEATURES

Improved consistency when dealing with borderless tables.

Better Table Detection

Identifying whether or not text is part of a table is complex, particularly when there are no borders to delimit the table edges. This release increases the accuracy with which this is achieved.

More Logical Tables are now Merged Correctly

Solid Framework aims to recreate a single table where it appears as if it has been split over multiple pages within the PDF (i.e. it is a single “logical” table).

This release improves detection of table columns, which allows more logical tables to be recognised and reconstructed.

Improved Column Title Detection

This release is better at identifying text that represents column titles (i.e. the first row of a table).

Solid Framework 10.0.8870

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 10.0.8870.

This is a major release and includes a number of breaking changes.

MAIN FEATURES

Extraction of wide tables that were tiled over multiple pages

If a table within a spreadsheet is wider than the paper size then it will be created as a multi-page “tiled” PDF. Previously this would have resulted in the file being reconstructed with multiple tables.

Solid Framework 10 is able to reconstruct these pages to give a single wide table.

 

Example of a tiled PDF being reconstructed as a wide spreadsheet

This functionality is enabled by default, but can be disabled by setting DetectTiledPages to false.

 

Exact mode for converting to HTML

Solid Framework has been able to reconstruct HTML for many years. This has been done by “reflowing” the document which results in a web page that may be easy to read, but that may not look like the original PDF.

Solid Framework 10 now allows HTML to be created that looks very similar to the PDF by setting ExactMode to true.

 

Better Handling of Z-Order

Some PDFs with complex layers were not being reconstructed correctly. Solid Framework 10 now handles these files better.

 

Further Improvements in OCR

Font types and styles are now applied more consistently, giving a more visually pleasing document.

 

Improved Tolerance of Corrupt PDFs

A number of PDF files that could not previously be converted due to errors can now be corrected and a valid document reconstructed from them.

Breaking Changes

A number of changes have been made to simplify the API and improve consistency between the managed and native C++ SDKs.

Run-time errors

License.Import now throws InvalidLicenseException immediately for invalid licenses, rather than delaying the error until conversion is attempted. Code that relied on the delayed error may therefore behave slightly differently.
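
One way to adapt is to treat license import as a start-up check. The following is a minimal sketch only: the types in the mock namespace are hypothetical stand-ins written for this example (including the rule that an empty key is invalid), not the SDK's own classes, and TryImportLicense is an illustrative helper name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the SDK's real types.
namespace mock {
struct InvalidLicenseException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct License {
    // Mimics the behaviour described above: the error surfaces at import time.
    // The "empty key is invalid" rule is a placeholder for real validation.
    static void Import(const std::string& key) {
        if (key.empty()) throw InvalidLicenseException("invalid license");
    }
};
}  // namespace mock

// Returns true if the license was accepted; otherwise stores the message in
// 'error' so the caller can stop before any conversion is queued.
bool TryImportLicense(const std::string& key, std::string& error) {
    try {
        mock::License::Import(key);
        return true;
    } catch (const mock::InvalidLicenseException& e) {
        error = e.what();
        return false;
    }
}

// Small self-check of the sketch.
void DemoLicenseImport() {
    std::string error;
    assert(TryImportLicense("placeholder-key", error));
    assert(!TryImportLicense("", error));
    assert(error == "invalid license");
}
```

Handling the exception at the import call keeps the failure close to its cause, instead of inside a later conversion step.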

 

Compile-time errors

The following may cause compile-time errors:

Code Removal

  • LicenseCollection has been removed. Use SolidFramework.License.Import instead
  • ValidateOnly and VerifyOnly properties have been removed from PdfToPdfAConverter. Use Validate and Verify methods instead

Deprecation

  • PagesModel.PagesCount has been deprecated in favor of PagesModel.PageCount
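
As an illustration of how a deprecation like this typically behaves, the sketch below uses the standard C++ [[deprecated]] attribute on a hypothetical mock class. The member functions shown are assumptions made for the example and do not reproduce the SDK's actual declarations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mock for illustration; not the SDK's real PagesModel.
namespace mock {
class PagesModel {
public:
    explicit PagesModel(std::size_t pages) : pages_(pages) {}

    // Preferred name.
    std::size_t PageCount() const { return pages_; }

    // Old name kept as a forwarding alias; compilers emit a warning on use.
    [[deprecated("Use PageCount instead")]]
    std::size_t PagesCount() const { return PageCount(); }

private:
    std::size_t pages_;
};
}  // namespace mock

// Migrated call site: only the new name is used, so no warning is produced.
std::size_t DemoPageCount() {
    mock::PagesModel model(3);
    assert(model.PageCount() == 3);
    return model.PageCount();
}
```

A deprecated member still compiles, so existing code keeps working while the warning points at each call site that needs updating.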

Renamed properties

  • PdfPageHolder.CommentsCount has been renamed to PdfPageHolder.CommentCount
  • PdfPageHolder.LinksCount has been renamed to PdfPageHolder.LinkCount
  • OcrTextRegion.OcrLines has been renamed to OcrTextRegion.OcrLineCount
  • OCRTransformationResult.GetPageWordsCount has been renamed to OCRTransformationResult.GetPageWordCount
  • OCRTransformationResult.GetPageConfidentWordsCount has been renamed to OCRTransformationResult.GetPageConfidentWordCount
  • OCRTransformationResult.GetDocumentWordsCount has been renamed to OCRTransformationResult.GetDocumentWordCount
  • OCRTransformationResult.GetDocumentConfidentWordsCount has been renamed to OCRTransformationResult.GetDocumentConfidentWordCount

Property replaced with Method

  • The ViewerPreferences property of PdfDocument.Catalog has been replaced with a GetViewerPreferences(bool create) method and a RemoveViewerPreferences() method
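
The method names suggest a get-or-create pattern. The sketch below shows that pattern with a hypothetical mock class. The assumption that GetViewerPreferences(false) returns a null pointer when no preferences exist, and the hideToolbar field, are inventions for this example rather than documented SDK behaviour.

```cpp
#include <cassert>
#include <memory>

// Hypothetical mock for illustration; not the SDK's real Catalog.
namespace mock {
struct ViewerPreferences {
    bool hideToolbar = false;  // invented field, used only by this example
};

class Catalog {
public:
    // Assumed semantics: return the existing object, and create one only
    // when 'create' is true. Returns nullptr if absent and not created.
    ViewerPreferences* GetViewerPreferences(bool create) {
        if (!prefs_ && create) prefs_ = std::make_unique<ViewerPreferences>();
        return prefs_.get();
    }

    void RemoveViewerPreferences() { prefs_.reset(); }

private:
    std::unique_ptr<ViewerPreferences> prefs_;
};
}  // namespace mock

void DemoViewerPreferences() {
    mock::Catalog catalog;
    assert(catalog.GetViewerPreferences(false) == nullptr);  // nothing yet

    mock::ViewerPreferences* prefs = catalog.GetViewerPreferences(true);
    prefs->hideToolbar = true;
    assert(catalog.GetViewerPreferences(false)->hideToolbar);  // same object

    catalog.RemoveViewerPreferences();
    assert(catalog.GetViewerPreferences(false) == nullptr);  // removed again
}
```

Splitting the property into explicit get and remove calls makes it clear at the call site whether reading the preferences may modify the document.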

C++ API specific changes

Changes to Names

  • Methods in the C++ API now all use Pascal case (e.g. setOutputPath is now SetOutputPath)
  • Methods starting with GetIs, GetHas and GetWas have had the “Get” prefix dropped
  • PagesModelBase class has been renamed to PagesModel
  • ConverterBase classes have been renamed to Converter (e.g. PdfToWordConverterBase is now PdfToWordConverter)

 

Other Changes

  • SolidFramework.cpp no longer includes stdafx.h
  • CustomData properties have been removed from the Converter classes. (Custom data can be captured within a lambda expression when setting the std::function progress/warning callback, or stored within a subclass when overriding OnProgress/OnWarning.)
  • Collections in the C++ API are now exported as std::vectors

 

Solid Framework 9.2.8681

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 9.2.8681.

During routine testing of our large PDF collection, we found that a very small number of files could no longer be converted. This affected approximately 1 PDF in 100,000.

This release resolves the issue.

 

There are no other changes from version 9.2.8680.

Solid Framework 9.2.8680

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 9.2.8680.

 

MAIN FEATURES

 

Improved Handling of Z-Order when Reconstructing PowerPoint

PowerPoint and Word both support multiple layers of content, with those on top potentially obscuring those beneath. In some situations the order of these layers was incorrect, resulting in text or other items on the page being “lost”. This has been resolved for a number of samples that exhibited the problem.

 

Further improvements in OCR

Font types and styles are now applied more consistently, giving a more visually pleasing document.

 

Improved tolerance of corrupt PDFs

A number of PDF files that could not previously be converted due to errors can now be corrected and a valid document reconstructed from them.

Ongoing improvements in self-documentation

A number of parameters that previously took obscure numeric values now use Enums instead. This is aimed at improving the long-term maintainability of code, but may cause compile-time errors.

This affects PagesModel and PdfDocument classes only.
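
The sketch below shows why such a change can break existing call sites, using a C++ scoped enum. The PageRotation enum and the Rotate method are hypothetical examples. The notes above do not say which parameters changed or how the SDK declares them.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical mock for illustration; not taken from the SDK.
namespace mock {
enum class PageRotation { None = 0, Clockwise90 = 90, Rotate180 = 180 };

struct Page {
    int degrees = 0;

    // Previously this might have taken a bare int such as 90.
    void Rotate(PageRotation rotation) {
        degrees = (degrees + static_cast<int>(rotation)) % 360;
    }
};
}  // namespace mock

// A scoped enum does not convert implicitly from int, which is why call
// sites that pass raw numbers stop compiling after this kind of change.
static_assert(!std::is_convertible<int, mock::PageRotation>::value,
              "int must not convert implicitly to PageRotation");

void DemoEnumParameter() {
    mock::Page page;
    // page.Rotate(90);                            // would no longer compile
    page.Rotate(mock::PageRotation::Clockwise90);  // intent is explicit
    assert(page.degrees == 90);
}
```

The fix at each failing call site is to replace the number with the matching named value.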

 

Solid Framework 9.2.8564.1

The latest public release of Solid Framework SDK is now available for download from the developer portal at www.solidframework.net. This is version 9.2.8564.

 

MAIN FEATURES

TextRecoveryLanguage

TextRecoveryLanguage is used to specify the language of the document that needs to have text recovered by OCR when creating a CoreModel.

Previously this was always set to “english”. It now defaults to “automatic”. For non-English documents this option will now allow the same result to be created whether conversion is performed using the “Converter” classes or the CoreModel.

 

Improved Merging of Logical Tables

If a table is split over multiple pages within the PDF then an attempt is made to stitch these back together into a single table.

We have resolved issues that prevented some tables from merging correctly.

 

Improved handling of Chinese Language files

Several issues associated with the reconstruction of Chinese Language files have been resolved.

 

Renaming of properties to improve self-documentation

KeepNonTableContent has been created as an alias for TablesFromContent. The option specifies how non-table text and images within a PDF should be handled when reconstructing Excel documents.

The name TablesFromContent does not clearly identify what the option does.

It has therefore been deprecated and users are advised to use “KeepNonTableContent” instead.