SOLID FRAMEWORK 10.0.18460

Solid Framework SDK has been updated.

Improvements: 

  • [docx] Improved the detection of small images composed of thousands of objects in the original PDF container.
  • [docx] Improved the order detection of overlapping shapes and images.
  • [docx] Improved the detection of Japanese characters.
  • [docx] Improved the stability of font color detection in text boxes with varying fill colors.
  • [docx] Improved the optical character recognition preprocessing of vector text.
  • [docx] Improved the stability of page orientation when vertical Japanese text is detected.
  • [docx] Improved the detection of bullet and list items in Korean language documents.
  • [docx] Improved the detection of Latin characters in Korean language documents.
  • [docx] Improved the detection of Japanese language.
  • [docx] Implemented post-processing of images to improve optical character recognition.
  • [docx] Improved the detection of a full-page black image overlaid with images of white text.

Bugfixes: 

  • [xlsx] Fixed an issue that caused various rows of a large table to be combined.
  • [docx] Fixed an issue causing a textbox to convert with an incorrect fill color.
  • [docx] Fixed an issue preventing the detection of a hyperlink.
  • [docx] Fixed an issue resulting in text being converted as an image.
  • [docx] Fixed an issue that extended the conversion time of a specific document.

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.18370

Improvements: 

  • [docx] Improved the detection of standard office bar charts and variants. 
  • [docx] Improved the detection of Chinese language.  
  • [docx] Improved the optical character recognition preprocessing of vector text.  
  • [docx] Improved the column detection of left-to-right aligned text.
  • [docx] Improved the stability of graphic color detection.  
  • [docx] Improved the detection of header content.  
  • [docx] Improved detection of white text located on a dark background.   
  • [docx] Improved handling of text where the text and background color match.
  • [docx] Improved table detection.  
  • [docx] Improved the detection of diagrams.  
  • [docx] Improved detection of black text located on a grey background.    
  • [office] Improved language and page orientation detection. 

Bugfixes: 

  • [docx] Fixed an issue causing Latin characters in a Chinese document to be misplaced. 
  • [docx] Fixed an issue where a large graphic element caused text recovery failure. 
  • [pdf] Fixed an issue preventing the marked property from being retained. 
  • [docx] Fixed an issue preventing the detection of the correct bounds of a graphic element. 
  • [docx] Fixed a performance issue where dense vector graphics prevented successful optical character recognition of a file. 
  • [docx] Fixed an issue causing a conversion delay for a complex one-page document.
  • [docx] Fixed a bug preventing the rendering of the first page of a detected Table of Contents. 

SOLID FRAMEWORK 10.0.18270

Solid Framework SDK has been updated.

Improvements: 

  • [pdf] Introduced an option to save PDF page orientation as tagged data instead of auto-rotating.
  • [office] Improved the algorithm for averaging the text properties of a paragraph that contains groups of Unicode characters describing a single Arabic glyph.
  • [docx] Improved list detection. 
  • [docx] Improved the text line assembly of Arabic content with diacritics. 
  • [office] Improved detection of small caps text. 
  • [office] Improved detection of Arabic language when minimal English text is near the Arabic language. 
  • [docx] Improved detection of header content. 
  • [docx] Improved z-order placement of graphics in conversion output.
  • [docx] Improved detection of narrow columns on borderless tables.
  • [docx] Improved conversion result of self-intersecting glyph outlines.
  • [pdf] Improved tag support of various layout options. 
  • [office] Improved page margin calculation to be multiples of 1/4 inch for the imperial measurement system and 1/4 centimetre for the metric measurement system. 
  • [docx] Improved the rendering of Type 3 fonts. 
  • [docx] Improved use of tab stops to space content on a single line.  
  • [docx] Improved the conversion result when the encoding of the original PDF contains large, broken text areas.
  • [json] Added support for detection of table headers.
  • [json] Added support for the rectangle span element.
  • [json] Improved detection of even-odd page header bounds.
  • [json] Improved detection of table headers.
  • [json] Added support for XObject ID for annotation graphic groups and textboxes.
  • [json] Improved detection of span bounds for line with small caps.
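
Illustration for the page margin item above: a minimal sketch of snapping a margin to a multiple of 1/4 inch or 1/4 centimetre. The function name `snap_margin`, the use of PDF points as the unit, and rounding to the nearest multiple are assumptions made for this example; they are not taken from the Solid Framework API.

```python
POINTS_PER_INCH = 72.0
POINTS_PER_CM = POINTS_PER_INCH / 2.54


def snap_margin(margin_pt: float, metric: bool = False) -> float:
    """Snap a margin given in points to the nearest quarter unit.

    Hypothetical helper: quarter inch for imperial, quarter centimetre
    for metric. Nearest-multiple rounding is an assumption.
    """
    step = (POINTS_PER_CM if metric else POINTS_PER_INCH) / 4.0
    # Note: Python's round() resolves exact .5 ties to the even multiple.
    return round(margin_pt / step) * step


if __name__ == "__main__":
    print(snap_margin(70.0))               # imperial: nearest 1/4 inch
    print(snap_margin(30.0, metric=True))  # metric: nearest 1/4 cm
```

With these assumptions, a 70 pt margin snaps to 72 pt (one inch), and a 30 pt margin in a metric document snaps to four quarter-centimetre steps (1 cm).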

Bugfixes: 

  • [pdf] Fixed a compression algorithm issue that caused the corruption of data during conversion of a specific file. 
  • [docx] Fixed an issue causing the background of an image to become transparent.  

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.18108

Solid Framework SDK has been updated.

Improvements: 

  • [office] Improved the algorithms required to lay out right-to-left body paragraph text.
  • [office] Improved right-to-left text character matching and diacritic handling.
  • [office] Improved Tatweel (Arabic) language detection.
  • [office] Optimized detection performance for non-standard encoded characters using Tesseract.
  • [docx] Improved detection of footnote text. 
  • [docx] Improved header detection. 
  • [docx] Column detection improvements. 
  • [office] Improved rendering of a specific Type 3 font.

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 

Misc: 

  • All projects are now compiled using C++17 language features.

SOLID FRAMEWORK 10.0.18028

Solid Framework SDK has been updated.

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 
  • [docx] Fixed a chunk usage error preventing successful conversion of a file.
  • [docx] Fixed an issue preventing all text boxes located in front of a graphic from being detected.
  • [pptx] Fixed an issue preventing an orange colored graph line from rendering.
  • [office] Fixed an issue causing Arabic characters to be misordered.
  • [docx] Fixed an issue preventing detection of the right table border.
  • [office] Fixed a glyph width issue causing illegible text output.
  • [docx] Fixed a bug that caused an extra tab to be detected after a bullet point.

Office Fidelity: 

  • [docx] Improved detection of parentheses in right-to-left aligned text when the PDF characters are incorrectly encoded.
  • [docx] Improved detection of hanging indents for right-to-left aligned text.
  • [office] Improved detection of Latin characters within an Arabic document.
  • [office] Improved detection of parentheses in right-to-left aligned Arabic text.
  • [office] Improved alignment detection of a right-to-left aligned Hebrew document.
  • [office] Improved detection of Arabic text when the PDF characters are incorrectly encoded. 
  • [docx] Improved hybrid table cell detection. 
  • [docx] Improved detection of custom numbered list. 
  • [docx] Improved hybrid table row detection. 
  • [docx] Implemented custom metadata field support.
  • [docx] Improved hybrid table column detection.

SOLID FRAMEWORK 10.0.17926

Solid Framework SDK has been updated.

Feature Update: 

  • Enable support of a licensed installation of IRIS. 

Bugfixes: 

  • [docx] Fixed an issue preventing successful conversion of a file. 
  • [docx] Fixed an issue preventing one image of many from being correctly rendered. 
  • [docx] Fixed an issue preventing successful conversion of a file on Linux operating systems only. 
  • [docx] Fixed an issue preventing the detection of a Table of Contents due to the text order of the file. 
  • [pdf] Fixed an issue preventing PDF/A-2b validation of a document when certain font combinations are installed.

Office Fidelity: 

  • [docx] Improved detection of breaks on scanned documents containing Arabic text. 
  • [office] Streamlined optical character recognition workflow of large documents containing non-standard encoded text. 
  • [office] Allowed page snapshot deletion where annotations exist. 
  • [office] Improved processing of non-standard encoded characters to Unicode.
  • [office] Improved detection of combined characters. 
  • [office] Improved detection of Arabic diacritic characters. 
  • [office] Improved detection of transparent watermarks over scanned pages. 
  • [docx] Improved detection of Table of Contents. 
  • [rtf] Improved detection of characters when converting to RTF. 
  • [docx] Improved detection of shapes when converting to DOCX. 
  • [docx] Improved detection of serial images that contain underlines.  

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.

SOLID FRAMEWORK 10.0.17650

Solid Framework SDK has been updated.

Feature Update: 

  • Enabled language detection for Arabic language documents. 

Bugfixes: 

  • [docx] Improved detection of bold font style on a specific document. 
  • [docx] Resolved a page count issue on macOS. 
  • [docx] Improved detection of small caps font style effect. 

Office Fidelity: 

  • [office] Improved whitespace detection between Arabic words. 
  • [docx] Improved header and footer detection targeting one-page documents with a horizontal line below body content. 
  • [docx] Improved detection of tables that overlap with body content. 
  • [docx] Improved detection of word order with right-to-left aligned text. 
  • [docx] Improved detection of Arabic characters when the text layer does not match the character glyphs. 

Security 

  • A limited number of third-party libraries have been updated to include the latest security fixes.   

SOLID FRAMEWORK 10.0.17490

Solid Framework SDK has been updated.

Improvements: 

  • Improvements to pdf rendering clipping algorithms: 
    • Added method for detecting a polygon that had degenerated into a polyline. 
    • Added methods for detecting self-intersections of polygon contours. 
    • Added method for automatic error detection in the clipping algorithm. 
    • Added method for changing the direction of a polygon. 
    • Support for winding/alternate rules has been added prior to polygon clipping. 
    • Rewrote the method for finding polygon intersections. 
    • Rewrote the method for adding the found intersections of polygons to the polygon structures. 
    • All clipping algorithm methods have been updated to operate with the same tolerance. 
    • Improved the accuracy of determining the type of vertices found near a polygon. 
    • Improved processing of polygon edges located very close to each other. 
  • Header and footer improvements specifically targeting one-page documents: 
    • Improved the exclusion of graphic lines, images and labels. 
    • Improved the exclusion of large tables, images and footnotes. 
    • Improved the exclusion of headings and titles. 
    • Improved the exclusion of images or text located close to other page content. 
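
Illustration for the clipping items above: a minimal sketch of three of the ideas named in that list (detecting a polygon that has degenerated into a polyline, changing the direction of a polygon, and the winding versus alternate fill rules). All function names are hypothetical, and the code uses textbook techniques (a collinearity test, list reversal, and the winding-number point test). It does not show how Solid Framework implements them.

```python
import math
from typing import Iterator, List, Tuple

Point = Tuple[float, float]


def edges(poly: List[Point]) -> Iterator[Tuple[Point, Point]]:
    """Yield consecutive vertex pairs, closing the contour."""
    return zip(poly, poly[1:] + poly[:1])


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula: positive for counter-clockwise contours (y up)."""
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges(poly)) / 2.0


def reverse_direction(poly: List[Point]) -> List[Point]:
    """Return the same contour traversed in the opposite direction."""
    return poly[::-1]


def is_degenerate(poly: List[Point], tol: float = 1e-9) -> bool:
    """True when all vertices are collinear, i.e. the polygon is a polyline."""
    if len(poly) < 3:
        return True
    x0, y0 = poly[0]
    base = next(((x, y) for x, y in poly[1:]
                 if abs(x - x0) > tol or abs(y - y0) > tol), None)
    if base is None:  # every vertex coincides with the first one
        return True
    dx, dy = base[0] - x0, base[1] - y0
    return all(abs(dx * (y - y0) - dy * (x - x0)) <= tol for x, y in poly)


def winding_number(point: Point, poly: List[Point]) -> int:
    """Count signed crossings of a rightward ray from `point`."""
    px, py = point
    wn = 0
    for (x1, y1), (x2, y2) in edges(poly):
        is_left = (x2 - x1) * (py - y1) - (px - x1) * (y2 - y1)
        if y1 <= py < y2 and is_left > 0:    # upward edge, point on its left
            wn += 1
        elif y2 <= py < y1 and is_left < 0:  # downward edge, point on its right
            wn -= 1
    return wn


def inside(point: Point, poly: List[Point], rule: str = "winding") -> bool:
    """Apply the non-zero winding rule or the alternate (even-odd) rule."""
    wn = winding_number(point, poly)
    # For one closed contour, crossing parity equals winding-number parity.
    return wn != 0 if rule == "winding" else wn % 2 != 0


def pentagram() -> List[Point]:
    """Self-intersecting five-pointed star on the unit circle."""
    ring = [(math.cos(math.radians(90 + 72 * k)),
             math.sin(math.radians(90 + 72 * k))) for k in range(5)]
    return [ring[i] for i in (0, 2, 4, 1, 3)]


if __name__ == "__main__":
    star = pentagram()
    print(inside((0.0, 0.0), star, "winding"))    # centre, winding rule
    print(inside((0.0, 0.0), star, "alternate"))  # centre, alternate rule
```

The pentagram shows why the fill rule has to be settled before clipping: its centre is wrapped twice by the contour, so it counts as inside under the winding rule and outside under the alternate rule, while a point in one of the star's tips is inside under both.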

Bugfixes: 

  • [pdf] Fixed an issue preventing conversion to the PDF/A-1a and PDF/A-2b standards due to a specific page structure.
  • [docx] Resolved an issue with CAD source content where vertical text around architectural details is displaced.
  • [docx] Fixed an issue that caused the tops of characters in one line of text to be clipped.
  • [docx] Resolved an issue causing five rows of a table to be incorrectly merged. 
  • [docx] Fixed an issue causing two columns of a table to be merged into one. 
  • [docx] Fixed an issue in the clipping engine preventing successful rendering of a specific pdf. 
  • [docx] Fixed an issue causing the last line of a right-to-left direction paragraph to have a hanging indent. 
  • [docx] Resolved an issue where right-to-left text was incorrectly left aligned. 
  • [docx] Fixed an issue preventing the rendering of leader (tabbing) characters in table of contents containing right-to-left text. 
  • [docx] Fixed incorrectly wrapped right-to-left text causing a page overflow issue.  
  • [docx] Resolved an indentation and alignment issue at list items in a right-to-left document.

Office Fidelity: 

  • [docx] Improved table of contents detection by optimizing sections across pages. 
  • [office] Improved GNSE detection to independently recognize glyphs and Unicode values in separate stages.
  • [office] Improved support for Arabic diacritical marks using analysis of scale and character spacing.
  • [docx] Improved border line termination in specific table cases.
  • Improved the left margin alignment of a document. 
  • [docx] Resolved an issue causing text misplacement when viewed on Office 2016 only. 
  • [docx] Fixed a hybrid table detection issue resulting in two additional columns. 
  • [docx] Fixed an issue causing line shapes to be rendered as underlines. 
  • [pptx] Fixed an issue that caused a block of text in a table to be incorrectly divided into six rows.
  • [docx] Resolved an issue that caused one table to be incorrectly split into two tables. 
  • [docx] Resolved an issue causing a textbox to be divided in two parts. 
  • [docx] Improved Unicode detection of Arabic language characters.
  • [docx] Improved alignment and indentation of content with right-to-left text direction.

SOLID FRAMEWORK 10.0.17360

Solid Framework SDK has been updated.

Feature Updates: 

  • Added the ability to set the text recovery language to any language when the corresponding Tesseract traineddata file is available. 
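
Illustration for the feature above: Tesseract stores each language model in a file named `<code>.traineddata` (for example `deu.traineddata`). The following is a minimal sketch, assuming the models live in a single directory, of checking whether a model is present before requesting that recovery language. The helper names and the directory argument are hypothetical and are not part of the Solid Framework API.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def available_languages(tessdata_dir: PathLike) -> List[str]:
    """List language codes that have a .traineddata file in the directory."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def has_language(lang: str, tessdata_dir: PathLike) -> bool:
    """True when `<lang>.traineddata` exists in the given directory."""
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


if __name__ == "__main__":
    import sys

    # Usage: python check_lang.py <tessdata directory> <language code>
    directory, code = sys.argv[1], sys.argv[2]
    if has_language(code, directory):
        print(f"{code}: model found")
    else:
        print(f"{code}: model missing; found {available_languages(directory)}")
```

A check of this kind lets an application fall back to a default language, or report a clear error, instead of starting a conversion with a recovery language that has no model installed.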

Bugfixes: 

  • [docx] Fixed an issue that caused an extra paragraph to be inserted after a specific graphic group. 
  • [docx] Resolved an issue preventing the detection of a first page header. 
  • [docx] Fixed an issue that caused an extra space to be inserted incorrectly affecting content layout. 
  • [docx] Resolved an issue preventing the detection of a small portion of red background color on a graphic. 
  • [docx] Fixed an issue that converted invisible text as visible when OCR detection results in very few words.
  • [docx] Fixed an issue preventing successful conversion of a file on Linux arm64 operating systems only. 
  • [docx] Fixed an issue preventing successful GNSE detection of text when Roboto font is used. 
  • [docx] Resolved an issue preventing the successful conversion of a file containing detailed images. 
  • [docx] Fixed an issue that caused header content to be shifted down one line on certain pages of a document. 
  • [docx] Resolved an issue that caused a document to be incorrectly clipped diagonally, causing content loss.
  • [docx] Fixed an issue preventing text from being detected on certain pages of a specific scanned document. 
  • [docx] Resolved an issue that caused a font style change in the empty space following an underlined descending letter. 

Office Fidelity: 

  • [docx] Improved detection of page X of Y page number format. 
  • [docx] Improved consistency of detection of multi-line headers. 
  • [docx] Improved detection of repeated table headers as body content instead of as header content. 
  • [docx] Improved the recovery of Korean text when GNSE is enabled. 
  • [docx] Improved borderless table detection. 
  • [docx] Improved detection of alternating headers supporting odd and even pages. 
  • [docx] Improved detection of alternating footers on odd and even pages. 
  • [docx] Improved table detection. 
  • [docx] Improved detection of merged cells. 
  • [docx] Improved detection of self-intersecting glyph outlines when GNSE is enabled.
  • [docx] Improved recognition of multi-column layouts.
  • [docx] Extended character detection for non-standard encoding to use Tesseract OCR when required.

Security: 

  • Security scanning of our codebase is automated as part of our compilation process. 

SOLID FRAMEWORK 10.0.17268

Solid Framework SDK has been updated.

Feature Update:

Our Windows releases are now compiled with Visual Studio 2022 build tools.

PDF to .DOCX conversion improvements include:

  1. Fixed an issue preventing the underline style from applying to all characters of a word in a specific document.  
  2. Improved our compliance with the OpenXML standard when handling malformed hyperlinks.
  3. Improved our compliance with the OpenXML standard after text layout improvements.
  4. Improved our detection of list hierarchy. 
  5. Resolved a page count issue on macOS caused by the Helvetica font. 
  6. Fixed an issue causing a false link to be detected. 
  7. Resolved an issue preventing the detection of a page number in the footer on specific layout styles.  
  8. Improved inconsistent footer detection on specific layout styles.
  9. Improved detection and implementation of inline small graphic groups. 
  10. Improved detection of headers and footers located unusually far from the page edge. 
  11. Improved detection of underline style for descending letters g, p and y. 
  12. Improved detection of different odd and even headers and footers. 
  13. Reduced false detections of section title body content as headers. 
  14. Improved detection of headers when watermark graphics cross the header content. 
  15. Improved detection of borderless tables. 

PDF to PDF/A conversion improvements include:

  1. Fixed an issue with annotations and fields preventing verification of compliance with the PDF/A-2a and PDF/A-2b standards.
  2. Resolved an issue preventing conversion to the PDF/A-1a standard due to the limit for real values.