SOLID FRAMEWORK 10.0.18610

Solid Framework SDK has been updated.

Improvements: 

  • [docx]     Improved optical character recognition of black text on a gray background.
  • [json]      Added support for the link property in JSON conversion output.
  • [json]      Improved the detection of annotation identification.
  • [docx]     Improved the forced detection of non-standard encoding of Arabic, Chinese, Japanese and Korean characters in multi-page documents.
  • [docx]     Improved the optical character recognition preprocessing of vector text.
  • [docx]     Improved detection of paragraph styles.
  • [docx/xlsx]     Improved the detection of borderless tables.
  • [json]     Improved hyperlink detection.
  • [json]     Improved reliability of paragraph coordinates of rotated textboxes.
  • [json]     Improved detection of the page rotation angle when autorotate is manually disabled.

Bugfixes: 

  • [docx]     Fixed a bug causing an extra line of body text to be detected in the header.
  • [json]      Improved the detection of page orientation for pages containing inconsistently oriented text.
  • [docx]     Fixed a bug resulting in the partial loss of specific bounding boxes of a document.

SOLID FRAMEWORK 10.0.18460

Solid Framework SDK has been updated.

Improvements: 

  • [docx]    Improved the detection of small images composed of thousands of objects in the original PDF container.
  • [docx]    Improved the order detection of overlapping shapes and images.
  • [docx]    Improved the detection of Japanese characters.
  • [docx]    Improved the stability of font color detection in text boxes with varying fill colors.
  • [docx]    Improved the optical character recognition preprocessing of vector text.
  • [docx]    Improved the stability of page orientation when vertical Japanese text is detected.
  • [docx]    Improved the detection of bullet and list items in Korean language documents.
  • [docx]    Improved the detection of Latin characters in Korean language documents.
  • [docx]    Improved the detection of Japanese language.
  • [docx]    Implemented post processing of images to improve optical character recognition.
  • [docx]    Improved detection of full page black image overlaid with images of white text.

Bugfixes: 

  • [xlsx]     Fixed an issue that caused various rows of a large table to be combined.
  • [docx]    Fixed an issue causing a textbox to convert with an incorrect fill color.
  • [docx]    Fixed an issue preventing the detection of a hyperlink.
  • [docx]    Fixed an issue resulting in text being converted as an image.
  • [docx]    Fixed an issue that caused the conversion time of a specific document to be extended.

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.18370

Improvements: 

  • [docx] Improved the detection of standard office bar charts and variants. 
  • [docx] Improved the detection of Chinese language.  
  • [docx] Improved the optical character recognition preprocessing of vector text.  
  • [docx] Improved the column detection of left to right aligned text.  
  • [docx] Improved the stability of graphic color detection.  
  • [docx] Improved the detection of header content.  
  • [docx] Improved detection of white text located on a dark background.   
  • [docx] Improved handling of text where the text and background colour match.  
  • [docx] Improved table detection.  
  • [docx] Improved the detection of diagrams.  
  • [docx] Improved detection of black text located on a grey background.    
  • [office] Improved language and page orientation detection. 

Bugfixes: 

  • [docx] Fixed an issue causing Latin characters in a Chinese document to be misplaced. 
  • [docx] Fixed an issue where a large graphic element caused text recovery failure. 
  • [pdf] Fixed an issue preventing the marked property from being retained. 
  • [docx] Fixed an issue preventing the detection of the correct bounds of a graphic element. 
  • [docx] Fixed a performance issue where dense vector graphics prevented successful optical character recognition of a file. 
  • [docx] Fixed an issue causing a conversion delay for a complex one-page document.
  • [docx] Fixed a bug preventing the rendering of the first page of a detected Table of Contents. 

SOLID FRAMEWORK 10.0.18270

Solid Framework SDK has been updated.

Improvements: 

  • [pdf] Introduced an option to save PDF page orientation as tagged data instead of auto-rotating.
  • [office] Improved the algorithm for averaging text properties of a paragraph that contains Unicode groups describing a single Arabic glyph.
  • [docx] Improved list detection. 
  • [docx] Improved the text line assembly of Arabic content with diacritics. 
  • [office] Improved detection of small caps text. 
  • [office] Improved detection of Arabic language when minimal English text is near the Arabic language. 
  • [docx] Improved detection of header content. 
  • [docx] Improved z-order placement of graphics in conversion output.
  • [docx] Improved detection of narrow columns on borderless tables. 
  • [docx] Improved conversion result of self-intersecting glyph outlines.
  • [pdf] Improved tag support of various layout options. 
  • [office] Improved page margin calculation to be multiples of 1/4 inch for the imperial measurement system and 1/4 centimetre for the metric measurement system. 
  • [docx] Improved the rendering of Type 3 fonts. 
  • [docx] Improved use of tab stops to space content on a single line.  
  • [docx] Improved conversion result when encoding of original PDF contains large, broken text areas.
  • [json] Added support for detection of table headers.
  • [json] Added support for the rectangle span element.
  • [json] Improved detection of even-odd page header bounds. 
  • [json] Improved detection of table headers. 
  • [json] Added support for XObject ID for annotation graphic groups and textboxes.
  • [json] Improved detection of span bounds for line with small caps.
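
Note on the page margin item above: snapping a margin to a quarter-unit grid can be pictured with the minimal sketch below. It is illustrative only and is not the SDK's implementation. The function name snap_margin, the use of PDF points (72 per inch) as the input unit, and the round-to-nearest behaviour are all assumptions.

```python
import math

POINTS_PER_INCH = 72.0
POINTS_PER_CM = POINTS_PER_INCH / 2.54


def snap_margin(points, metric=False):
    """Snap a margin given in points to the nearest 1/4 inch or 1/4 centimetre.

    Hypothetical helper: the grid size follows the release note, the
    rounding mode (nearest, halves rounded up) is an assumption.
    """
    step = (POINTS_PER_CM if metric else POINTS_PER_INCH) / 4.0
    return math.floor(points / step + 0.5) * step


# A 70 pt margin lands on the 1 inch grid line in the imperial system.
imperial = snap_margin(70.0)
# A 28 pt margin lands on the 1 cm grid line in the metric system.
metric = snap_margin(28.0, metric=True)
```

Keeping the grid in the measurement system of the target document is what lets the resulting margins show up as round values in a word processor's page setup dialog.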

Bugfixes: 

  • [pdf] Fixed a compression algorithm issue that caused the corruption of data during conversion of a specific file. 
  • [docx] Fixed an issue causing the background of an image to become transparent.  

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.18108

Solid Framework SDK has been updated.

Improvements: 

  • [Office] Improved algorithms required to lay out right-to-left body paragraph text.
  • [Office] Improvements to right-to-left text character matching and diacritic handling.
  • [Office] Improved Tatweel (Arabic) language detection. 
  • [Office] Optimized detection performance for non-standard encoded characters using Tesseract.  
  • [docx] Improved detection of footnote text. 
  • [docx] Improved header detection. 
  • [docx] Column detection improvements. 
  • [Office] Improved rendering of a specific Type 3 font.

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 

Misc: 

  • All projects are now compiled using C++17 language features.

SOLID FRAMEWORK 10.0.18028

Solid Framework SDK has been updated.

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 
  • [docx] Fixed a chunk usage error preventing successful conversion of a file.
  • [docx] Fixed an issue preventing all text boxes located in front of graphic from being detected. 
  • [pptx] Fixed an issue preventing an orange colored graph line from rendering.
  • [office] Fixed an issue causing Arabic characters to be misordered.
  • [docx] Fixed an issue preventing detection of the right table border. 
  • [office] Fixed a glyph width issue causing illegible text output. 
  • [docx] Fixed a bug that caused an extra tab to be detected after a bullet point.

Office Fidelity: 

  • [docx] Improved detection of parentheses in right-to-left aligned text when the PDF characters are incorrectly encoded.
  • [docx] Improved detection of hanging indents for right-to-left aligned text. 
  • [office] Improved detection of Latin characters within Arabic documents.
  • [office] Improved detection of parentheses in right-to-left aligned Arabic text.
  • [office] Improved alignment detection of right-to-left aligned Hebrew documents.
  • [office] Improved detection of Arabic text when the PDF characters are incorrectly encoded. 
  • [docx] Improved hybrid table cell detection. 
  • [docx] Improved detection of custom numbered list. 
  • [docx] Improved hybrid table row detection. 
  • [docx] Implemented custom metadata field support.
  • [docx] Improved hybrid table column detection.

SOLID FRAMEWORK 10.0.17926

Solid Framework SDK has been updated.

Feature Update: 

  • Enabled support for a licensed installation of IRIS.

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 
  • [docx] Fixed an issue preventing one image of many from being correctly rendered. 
  • [docx] Fixed an issue preventing successful conversion of a file on Linux operating systems only. 
  • [docx] Fixed an issue preventing the detection of a Table of Contents due to the text order of the file. 
  • [pdf] Fixed an issue preventing PDF/A-2b validation of a document when certain font combinations are installed.

Office Fidelity: 

  • [docx] Improved detection of breaks on scanned documents containing Arabic text. 
  • [office] Streamlined optical character recognition workflow of large documents containing non-standard encoded text. 
  • [office] Allowed page snapshot deletion where annotations exist. 
  • [office] Improved processing of non-standard encoded characters to Unicode.
  • [office] Improved detection of combined characters. 
  • [office] Improved detection of Arabic diacritic characters. 
  • [office] Improved detection of transparent watermarks over scanned pages. 
  • [docx] Improved detection of Table of Contents. 
  • [rtf] Improved detection of characters when converting to RTF. 
  • [docx] Improved detection of shapes when converting to DOCX. 
  • [docx] Improved detection of serial images that contain underlines.  

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.17650

Solid Framework SDK has been updated.

Feature Update: 

  • Enabled language detection for Arabic language documents. 

Bugfixes: 

  • [docx] Improved detection of bold font style on a specific document. 
  • [docx] Resolved a page count issue on macOS. 
  • [docx] Improved detection of small caps font style effect. 

Office Fidelity: 

  • [office] Improved whitespace detection between Arabic words. 
  • [docx] Improved header and footer detection targeting one-page documents with a horizontal line below body content. 
  • [docx] Improved detection of tables that overlap with body content. 
  • [docx] Improved detection of word order with right-to-left aligned text. 
  • [docx] Improved detection of Arabic characters when the text layer does not match the character glyphs. 

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.   

SOLID FRAMEWORK 10.0.17490

Solid Framework SDK has been updated.

Improvements: 

  • Improvements to PDF rendering clipping algorithms:
    • Added method for detecting a polygon that had degenerated into a polyline. 
    • Added methods for detecting self-intersections of polygon contours. 
    • Added method for automatic error detection in the clipping algorithm. 
    • Added method for changing the direction of a polygon. 
    • Support for winding/alternate rules has been added prior to polygon clipping. 
    • Rewrote the method for finding polygon intersections. 
    • Rewrote the method for adding the found intersections of polygons to the polygon structures. 
    • All clipping algorithm methods have been updated to operate with the same tolerance. 
    • Improved the accuracy of determining the type of vertices found near a polygon. 
    • Improved processing of polygon edges located very close to each other. 
  • Header and footer improvements specifically targeting one-page documents: 
    • Improved the exclusion of graphic lines, images and labels. 
    • Improved the exclusion of large tables, images and footnotes. 
    • Improved the exclusion of headings and titles. 
    • Improved the exclusion of images or text located close to other page content. 
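
Note on the clipping items above: a few of the named building blocks (polygon direction and its reversal, detection of a polygon that has collapsed to a polyline, and the winding versus alternate fill rules) can be sketched with textbook techniques. The sketch below is not the SDK's code. All function names are hypothetical, the tolerance value is an assumption, and intersection finding is deliberately left out.

```python
def signed_area(poly):
    """Shoelace formula: positive for counter-clockwise vertex order."""
    total = 0.0
    for i, (x0, y0) in enumerate(poly):
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def reverse_direction(poly):
    """Return the same contour traversed in the opposite direction."""
    return list(reversed(poly))


def is_degenerate(poly, tol=1e-9):
    """True when every vertex is collinear, i.e. the polygon is a polyline."""
    if len(poly) < 3:
        return True
    x0, y0 = poly[0]
    anchor = next(((x, y) for x, y in poly[1:]
                   if abs(x - x0) > tol or abs(y - y0) > tol), None)
    if anchor is None:
        return True  # all vertices coincide
    dx, dy = anchor[0] - x0, anchor[1] - y0
    return all(abs(dx * (y - y0) - dy * (x - x0)) <= tol for x, y in poly)


def winding_number(point, poly):
    """Signed count of how many times the contour winds around the point."""
    px, py = point
    wn = 0
    for i, (x0, y0) in enumerate(poly):
        x1, y1 = poly[(i + 1) % len(poly)]
        cross = (x1 - x0) * (py - y0) - (px - x0) * (y1 - y0)
        if y0 <= py < y1 and cross > 0:    # upward edge, point on its left
            wn += 1
        elif y1 <= py < y0 and cross < 0:  # downward edge, point on its right
            wn -= 1
    return wn


def crossing_count(point, poly):
    """Number of edges crossed by a ray from the point towards +x."""
    px, py = point
    count = 0
    for i, (x0, y0) in enumerate(poly):
        x1, y1 = poly[(i + 1) % len(poly)]
        if (y0 <= py < y1) or (y1 <= py < y0):
            x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if x_at > px:
                count += 1
    return count


def inside(point, poly, rule="winding"):
    """Apply the non-zero winding rule or the alternate (even-odd) rule."""
    if rule == "winding":
        return winding_number(point, poly) != 0
    return crossing_count(point, poly) % 2 == 1


# A self-intersecting five-pointed star: the two rules disagree at its centre.
star = [(0, 3), (-2, -3), (3, 1), (-3, 1), (2, -3)]
centre_winding = inside((0, 0), star, rule="winding")
centre_alternate = inside((0, 0), star, rule="alternate")
```

The star shows why the fill rule has to be settled before clipping: under the winding rule its centre is filled, under the alternate rule it is a hole, so the two rules describe different regions for the same contour.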

Bugfixes: 

  • [pdf] Fixed an issue preventing conversion with the PDF/A-1a and PDF/A-2b standards due to a specific page structure.
  • [docx] Resolved an issue with CAD source content where vertical text around architectural details is displaced. 
  • [docx] Fixed an issue that caused the tops of characters in one line of text to be clipped.
  • [docx] Resolved an issue causing five rows of a table to be incorrectly merged. 
  • [docx] Fixed an issue causing two columns of a table to be merged into one. 
  • [docx] Fixed an issue in the clipping engine preventing successful rendering of a specific PDF.
  • [docx] Fixed an issue causing the last line of a right-to-left direction paragraph to have a hanging indent. 
  • [docx] Resolved an issue where right-to-left text was incorrectly left aligned. 
  • [docx] Fixed an issue preventing the rendering of leader (tabbing) characters in table of contents containing right-to-left text. 
  • [docx] Fixed incorrectly wrapped right-to-left text causing a page overflow issue.  
  • [docx] Resolved an indentation and alignment issue affecting list items in a right-to-left document.

Office Fidelity: 

  • [docx] Improved table of contents detection by optimizing sections across pages. 
  • [office] Improved GNSE detection to independently recognize glyphs and Unicode values in separate stages.
  • [office] Improved support for Arabic diacritical marks using analysis of scale and character spacing. 
  • [docx] Improved border line termination in specific table cases.
  • Improved the left margin alignment of a document. 
  • [docx] Resolved an issue causing text misplacement when viewed on Office 2016 only. 
  • [docx] Fixed a hybrid table detection issue resulting in two additional columns. 
  • [docx] Fixed an issue causing line shapes to be rendered as underlines. 
  • [pptx] Fixed an issue that caused a block of text in a table to be incorrectly divided into six rows.
  • [docx] Resolved an issue that caused one table to be incorrectly split into two tables. 
  • [docx] Resolved an issue causing a textbox to be divided in two parts. 
  • [docx] Improved Unicode detection of Arabic language characters.
  • [docx] Improved alignment and indentation of content with right-to-left text direction.

SOLID FRAMEWORK 10.0.17360

Solid Framework SDK has been updated.

Feature Updates: 

  • Added the ability to set the text recovery language to any language when the corresponding Tesseract traineddata file is available. 
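
Note on the feature above: Tesseract language models are conventionally stored as <code>.traineddata files in a tessdata directory, and several languages can be requested with a "+"-separated string such as eng+kor. A minimal pre-flight check, assuming you already know which tessdata directory is in use (how the SDK locates it is not stated in these notes), might look like the sketch below. The function name missing_traineddata and the example directory are hypothetical.

```python
from pathlib import Path


def missing_traineddata(languages, tessdata_dir):
    """Return the language codes that have no <code>.traineddata file.

    `languages` uses Tesseract's "+"-separated form, e.g. "eng+kor".
    Hypothetical helper, not part of the Solid Framework API.
    """
    folder = Path(tessdata_dir)
    codes = [code for code in languages.split("+") if code]
    return [code for code in codes
            if not (folder / f"{code}.traineddata").is_file()]


# Example call against an assumed directory; an empty list means every
# requested language has a model file in that directory.
missing = missing_traineddata("eng+ara", "/usr/share/tessdata")
```

Running a check like this before setting the text recovery language makes a missing model show up as a clear message instead of a failed or degraded conversion.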

Bugfixes: 

  • [docx] Fixed an issue that caused an extra paragraph to be inserted after a specific graphic group. 
  • [docx] Resolved an issue preventing the detection of a first page header. 
  • [docx] Fixed an issue that caused an extra space to be inserted incorrectly affecting content layout. 
  • [docx] Resolved an issue preventing the detection of a small portion of red background color on a graphic. 
  • [docx] Fixed an issue that caused invisible text to be converted as visible when OCR detects very few words.
  • [docx] Fixed an issue preventing successful conversion of a file on Linux arm64 operating systems only. 
  • [docx] Fixed an issue preventing successful GNSE detection of text when Roboto font is used. 
  • [docx] Resolved an issue preventing the successful conversion of a file containing detailed images. 
  • [docx] Fixed an issue that caused header content to be shifted down one line on certain pages of a document. 
  • [docx] Resolved an issue that caused a document to be incorrectly clipped diagonally causing content loss.  
  • [docx] Fixed an issue preventing text from being detected on certain pages of a specific scanned document. 
  • [docx] Resolved an issue that caused a font style change in the empty space following an underlined descending letter. 

Office Fidelity: 

  • [docx] Improved detection of page X of Y page number format. 
  • [docx] Improved consistency of detection of multi-line headers. 
  • [docx] Improved detection of repeated table headers as body content instead of as header content. 
  • [docx] Improved the recovery of Korean text when GNSE is enabled. 
  • [docx] Improved borderless table detection. 
  • [docx] Improved detection of alternating headers supporting odd and even pages. 
  • [docx] Improved detection of alternating footers on odd and even pages. 
  • [docx] Improved table detection. 
  • [docx] Improved detection of merged cells. 
  • [docx] Improved detection of self-intersecting glyph outlines when GNSE is enabled.
  • [docx] Improved recognition of multi-column layouts.
  • [docx] Extended character detection for non-standard encodings to use Tesseract OCR when required.

Security: 

  • Security scanning of our codebase is automated as part of our compilation process.