SOLID FRAMEWORK 10.0.20070

Language detection, page orientation detection and OCR character recognition have been improved for non-Latin languages. To include these improvements, update the required files using traineddata.zip from the Solid Framework downloads.
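
As a rough illustration of the update step above, the following Python sketch unpacks a downloaded traineddata.zip into a folder. The function name extract_traineddata and the target folder are hypothetical; these notes do not state where Solid Framework expects the files, so use the location your installation already uses.

```python
import zipfile
from pathlib import Path


def extract_traineddata(zip_path, target_dir):
    """Unpack the archive into target_dir and return the sorted file names it contained.

    Both arguments are assumptions for illustration: zip_path is wherever
    traineddata.zip was downloaded to, and target_dir is whichever folder
    your installation reads its trained data from.
    """
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Existing files with the same names are overwritten.
        archive.extractall(target)
        # Report files only, skipping directory entries.
        return sorted(n for n in archive.namelist() if not n.endswith("/"))
```

Calling extract_traineddata("traineddata.zip", "path/to/your/data/folder") returns the list of extracted file names, which can be checked against the folder contents before restarting the application.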

Next release expected on 08 July 26

Improvements: 

  • [xlsx] Improved detection of vertical text in table headers.
  • [json] Optimized the use of memory during the export of the internal database to JSON.
  • [docx] Improved the detection of vector text.
  • [pptx] Improved the order of objects to reflect reading order instead of PDF order.
  • [docx] Improved detection of invalid Unicode characters in PDF encoding and their replacement with valid characters.
  • [docx] Improved detection of strikethrough font effect.
  • [pptx] Improved the detection of embedded fonts when available on the machine.
  • [docx] Improved the detection of text with Type 3 fonts.
  • [xlsx] Improved borderless table column detection.

Bugfixes: 

  • [office] Fixed a bug interfering with the detection of a dark background color in a document.
  • [docx] Fixed a bug causing an image to overlay the searchable text layer of a document.
  • [docx] Fixed a bug preventing the accurate color detection of a transparent background element.
  • [xlsx] Fixed a bug causing the height of a row to partially clip text.
  • [docx] Fixed a bug causing certain cells in the header of a table to become merged.
  • [docx] Fixed a bug that could cause font matching errors.

SOLID FRAMEWORK 10.0.19910

Language detection, page orientation detection and OCR character recognition have been improved for non-Latin languages. To include these improvements, update the required files using traineddata.zip from the Solid Framework downloads.

Next release expected on 27 May 26

Improvements: 

  • [docx] Improved detection of vector text. 
  • [docx] Improved paragraph detection. 
  • [docx] Improved soft hyphen detection. 
  • [docx] Improved detection of certain non-standard encoded characters. 
  • [docx] Implemented detection of additional list items in Chinese, Japanese and Korean language documents. 
  • [office] Implemented the inclusion of producer information in conversion output. 

Bugfixes: 

  • [docx] Fixed a bug preventing the detection of text in a specific document. 
  • [docx] Fixed a bug where the height of an empty textbox prevented the document from being opened in Microsoft Word. 
  • [docx] Fixed a bug causing Type 3 text in a font bounding box to be clipped. 
  • [docx] Fixed a bug causing the substitution of a symbol for a non-standard encoded character. 
  • [docx] Fixed a bug causing performance issues in a document. 
  • [docx] Fixed a bug causing underline style to be detected as a line shape. 
  • [xlsx] Fixed a bug resulting in rows becoming merged in a document.
  • [pptx] Fixed a bug interfering with image transparency detection.  

SOLID FRAMEWORK 10.0.19752

Language detection, page orientation detection and OCR character recognition have been improved for non-Latin languages. To include these improvements, update the required files using traineddata.zip from the Solid Framework downloads.

Next release expected on 1 Apr 26

Improvements: 

  • [office] Enabled text detection for PDFs with non-standard encoding in the following languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Manipuri (Meetei Mayek), Oriya, Punjabi, Santali, Tamil, Telugu and Thai.
  • [docx] Implemented a text filtering procedure for scanned pages where text is presented as a hexadecimal string.
  • [docx] Improved underline property detection.
  • [office] Implemented use of the font family from a PDF file to select fonts installed on the operating system.
  • [office] Improved the cell merge algorithm used in table detection.
  • [docx] Improved detection of text when converted with a trial watermark.

Bugfixes: 

  • [docx] Fixed a structure detection issue preventing successful conversion of a document.
  • [docx] Fixed a bug where auto-rotation prevented accurate text detection.
  • [pptx] Fixed a bug preventing the successful conversion of a document on a specific platform.
  • [docx] Fixed a bug causing the overlapping of text and shapes.
  • [docx] Fixed a bug where a signature component interfered with detection of adjacent text.