SOLID FRAMEWORK 10.0.18950

Solid Framework SDK has been updated.

Note:

Language detection, page orientation detection, and OCR character recognition have been improved for non-Latin languages. To include these improvements, update the required files using traineddata.zip from the Solid Framework downloads.
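
As a rough illustration of the update step described in the note, the sketch below unpacks a downloaded traineddata.zip into the folder an application loads its trained data from. The function name, both paths, and the assumption that the archive can simply be extracted over the existing files are hypothetical and not taken from the Solid Framework documentation; adjust them to match your deployment.

```python
import zipfile
from pathlib import Path


def update_trained_data(archive_path, target_dir):
    """Extract every file in the archive into target_dir.

    Hypothetical helper: archive_path and target_dir are assumptions,
    not locations defined by Solid Framework. Existing files with the
    same name are overwritten.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # Record the regular files (not directory entries) being written.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
        archive.extractall(target)
    return sorted(names)
```

Calling update_trained_data("traineddata.zip", "path/to/trained/data") returns the sorted list of extracted file names, which can be logged to confirm which language files were refreshed.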

Improvements: 

  • [docx]  Content detection algorithm extended using pdf tags to bias output.
  • [office]  Improved detection of bound orientation for specific languages and the preferred orientation of that language.
  • [office]  Improved content detection of Korean language documents.
  • [docx]  Improved the detection of textboxes with similar layout to a two-cell table.
  • [docx]  Improved the detection of table header content.
  • [docx]  Improved table detection when cells in multiple columns contain only one hyphen.
  • [docx]  Improved the detection of line spacing.
  • [docx]  Improved detection of textboxes between column breaks.
  • [docx]  Improved the detection of grouped objects with similar layout to a table.
  • [docx]  Improved the detection of page headers.
  • [json]  Implemented strikethrough property in json export.
  • [json]  Improved the detection of text bounds.
  • [docx]  Improved the detection of tables within columns of text.
  • [docx]  Improved the detection of header content after page orientation change within a document.
  • [pptx]  Improved detection of character spacing.
  • [office]  Improved OCR analysis of non-Latin glyphs.

Bugfixes: 

  • [docx]  Fixed a bug preventing the detection of some tables in a certain document.
  • [docx]  Fixed a bug causing four characters on a variant color background to be omitted from conversion output.
  • Fixed a bug causing additional text to be included in the target of a URL link.
  • [pptx]  Fixed a bug causing a word to be mislocated on a slide.
  • [xlsx]  Fixed a bug causing the cells of one row to become merged on a certain document.

SOLID FRAMEWORK 10.0.18816

Solid Framework SDK has been updated.

Improvements: 

  • [docx] Improved the detection of text when the left margin of the scanned document contains noise. 
  • [docx] Improved list detection in pdfs that use image bullets.
  • [office] Improved the detection of graphic tables. 
  • [xlsx] Improved column detection of partially bordered tables.  
  • [docx] Implemented internal bookmarks to specific pages in the Word document to match the pdf links.
  • [json] Improved detection of small caps.  
  • [docx] Improved the order detection of overlapping shapes and images.  
  • [docx] Improved detection of column breaks.  
  • [docx] Improved detection of vertical Japanese text. 
  • [docx] Improved detection of borderless tables.  
  • [json] Improved detection of ‘Table of Contents’ bounds.  
  • [json] Improved handling of arbitrary text rotation in json export.

Bugfixes: 

  • [json] Fixed an issue causing partial detection of a line of text. 
  • [docx] Fixed an issue causing incorrect merging of separate tables on a page. 
  • [docx] Fixed an issue causing rows of certain table content to become merged.

SOLID FRAMEWORK 10.0.18708

Solid Framework SDK has been updated.

Improvements: 

  • [docx]     Improved column and row detection of hybrid split tables.
  • [office]     Implemented the recognition of non-standard encoded vertical Japanese characters.
  • [office]     Improved the precision of non-standard encoded Arabic character coordinates.
  • [docx]     Improved detection of single-column non-table content.
  • [docx]     Improved table detection.
  • [office]     Improved the rendering of Type 3 font glyphs.
  • [docx]     Improved the optical character recognition of large images on 32-bit platforms.
  • [docx]     Implemented list recognition in pdfs that use image bullets.
  • [json]     Implemented the option to ignore the detection of tiled pages in json export.
  • [json]     Improved json export of pages that exceed Microsoft size
  • [json]     Added support for nested tables placed inside the corresponding cell contents.
  • [office]     Applied custom language string options for Chinese text recovery.
  • [office]     Improved initialization mode for use of Thai trained data language file.
  • [docx]     Improved the detection of list items to prevent inclusion of undesirable footnote content.
  • [office]     Implemented automatic rotation detection of Japanese and Korean documents using optical character recognition.

Bugfixes: 

  • [docx]     Fixed a bug causing the misdetection of multiple glyph shapes representing a single letter “e” in a document.
  • [docx]     Fixed a bug preventing detection of a borderless table when the table contained extended spaces between rows.
  • [docx]     Fixed a bug causing the detection of unnecessary column breaks in a document with right-to-left aligned Arabic text.
  • [docx]     Fixed a relative height calculation issue preventing a very large document from opening in Microsoft Word.
  • [docx]     Fixed a bug preventing the detection of columns when ignoring the tagged table structure of a pdf.