SOLID FRAMEWORK 10.0.19752

Language detection, page orientation detection and OCR character recognition have been improved for non-Latin languages. To include these improvements, update the required files using traineddata.zip from the Solid Framework downloads.

Next release expected on 1 Apr 26

Improvements: 

  • [office] Enabled text detection for PDFs with non-standard encoding in the following languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Manipuri (Meetei Mayek), Oriya, Punjabi, Santali, Tamil, Telugu and Thai.
  • [docx] Implemented a text filtering procedure for scanned pages where text is represented as hexadecimal strings.
  • [docx] Improved underline property detection.
  • [office] Implemented use of the font family from a PDF file to select fonts installed on the operating system.
  • [office] Improved the cell merge algorithm in table detection.
  • [docx] Improved text detection when converting with a trial watermark.

Bugfixes: 

  • [docx] Fixed a structure detection issue preventing successful conversion of a document.
  • [docx] Fixed a bug where auto rotation prevented accurate text detection.
  • [pptx] Fixed a bug preventing the successful conversion of a document on a specific platform.
  • [docx] Fixed a bug causing the overlapping of text and shapes.
  • [docx] Fixed a bug where a signature component interfered with detection of adjacent text.