Developing software using Solid Framework

What operating systems does Solid Framework support?

Solid Framework comes in two flavors.

Native Solid Framework

The native version is pure C++ and runs on Windows and OS X.

We are currently working hard on providing a version for Linux which we hope to release in the near future.

Solid Framework for .NET

The .NET wrapped version is a breeze to work with. It requires the .NET Framework 4.0 or later.  It is not supported on Windows 9x.

The framework has been tested on:

  • Windows 10
  • Windows 8
  • Windows 7
  • Windows Server 2016
  • Windows Server 2012 R2 x64
  • Windows Server 2008 R2 x64

The Solid Framework .dll is built as an “AnyCPU” assembly and automatically runs as native x86 or x64 code, depending on the process that loads it.

Is SolidFramework just for .NET?

No. We offer two versions of the library: a .NET library and a native C++ DLL that can be used without .NET.

What .NET programming languages can be used when writing software that works with Solid Framework?

Solid Framework is a CLS-compliant class library. This means that the class library only exposes features that are common across all .NET languages. For example, unsigned types and overloaded methods are not used since these features are not available in all supported languages.

For simplicity, all the samples and documentation are in C#. Commonly used CLS-compliant languages include:

  • C#
  • Visual Basic .NET
  • J#

Is it possible to use C++ to write software that includes Solid Framework?

Absolutely! Several of our customers use C++ for their web-based or app-based products.

If C++ is used then .NET does not need to be available on the machine.

Several samples demonstrating the use of C++ are available in the Downloads section of the Solid Framework portal.

Which versions of Visual Studio can Solid Framework be used with?

The Solid Documents team develop using VS 2013 and VS 2015 and we currently target the “v12” MSVC runtime (the version that shipped with VS 2013).

We also know that some of our customers are using VS 2017.

Is it possible to use Visual C++ 6.0?

The quick answer is that we have never tried so we don’t know.

The longer answer is that Solid Framework makes use of some newer language and library features, for example std::shared_ptr (added to the C++ standard in 2011) and std::wstring. As such we think that there may be problems using Solid Framework with VC++ 6.0.

Having said that, we deliberately do not link statically to customer apps, which makes us less dependent on specific third-party library versions (such as the version of the MSVC runtime). If the customer can actually compile against our SolidFramework.h and SolidFramework.cpp API interface files, then our library will work (these interface files then perform version-agnostic dynamic loading of the rest of our system).

How do I download the Solid Framework SDK?

The Solid Framework SDK can be downloaded using the self-service Solid Framework Developer Portal.

You will need to create an account in order to access the portal.

Click here to see a video tutorial on how to download the SDK.

After completing your download, place SolidFramework.dll in the source folder of your project.

Add a reference to this assembly from your project. Click here to see a video tutorial.

Parallel Processing under Solid Framework

Solid Framework is not intrinsically thread safe.

If parallel processing is required then we recommend using the “JobProcessor” class. This class will spin up a number of independent JobHandler processes, with each process performing a single conversion.

The JobProcessor will queue conversion requests and allocate them to a JobHandler process when one becomes idle.

By default, the JobProcessor will launch as many concurrent JobHandler processes as you have cores on your machine. You can restrict the number of parallel JobHandlers by setting JobProcessor::WorkerCount. 

Typically you will wish to set the number of workers to less than the number of cores on the machine, since this will allow other tasks to continue while Solid Framework is converting files.
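As a minimal sketch of that advice, the helper below picks a worker count that leaves one core free. The class name WorkerPlanner, the method ChooseWorkerCount and the one-core reserve are illustrative assumptions and are not part of Solid Framework; Environment.ProcessorCount is the standard .NET way to read the number of logical cores.

```csharp
using System;

public static class WorkerPlanner
{
    // Hypothetical helper (not part of Solid Framework): returns the number of
    // workers to use, keeping 'reservedCores' cores free but never going below 1.
    public static int ChooseWorkerCount(int coreCount, int reservedCores)
    {
        return Math.Max(1, coreCount - reservedCores);
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        int workers = ChooseWorkerCount(cores, 1);
        Console.WriteLine("Cores: {0}, workers: {1}", cores, workers);
        // Assumption: 'workers' would then be assigned to the WorkerCount
        // setting of your JobProcessor instance.
    }
}
```

On a single-core machine the helper still returns 1, so at least one conversion can always run.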

Currently, JobProcessor is only available for .NET. We hope to release a native C++ version in the near future.


I have just updated my license using the latest Machine ID generator, and now I can’t use Solid Framework. What has gone wrong?

The machine ID generator was recently updated to solve problems related to virtual or multiple network cards installed on the same machine.  The new generator generates longer IDs that are not compatible with versions of Solid Framework prior to 9.2.7514.
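To tell whether an installed build is affected, its version number can be compared against 9.2.7514. The sketch below uses the standard System.Version class; the class name MachineIdCompatibility and the method SupportsNewMachineIds are hypothetical, and the version string is assumed to come from wherever you record your installed version (for example the file properties of SolidFramework.dll).

```csharp
using System;

public static class MachineIdCompatibility
{
    // First Solid Framework version that accepts the longer machine IDs.
    private static readonly Version MinimumVersion = new Version(9, 2, 7514);

    // Hypothetical helper: true if the given version string is 9.2.7514 or later.
    public static bool SupportsNewMachineIds(string frameworkVersion)
    {
        return new Version(frameworkVersion) >= MinimumVersion;
    }

    public static void Main()
    {
        Console.WriteLine(SupportsNewMachineIds("9.2.7000")); // False
        Console.WriteLine(SupportsNewMachineIds("9.2.8284")); // True
    }
}
```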

There are two solutions:

  1. Update the version of Solid Framework to at least 9.2.7514
  2. Download and use the version of the Machine ID generator compatible with older versions. This can be found at

How do I get and use a license for Solid Framework?

To use the features of Solid Framework, you need a license from Solid Documents. Licenses, including trial licenses, can be created using the self-service Solid Framework Developer Portal. These licenses depend on a machine-specific ID, and there is a utility available at the Developer Portal to generate these IDs.

To use the Solid Framework features you must embed the location of your license in your code. Click here to see the video tutorial.

// Solid Framework (Professional) license
License.Import(new StreamReader(@"C:\Users\Joe\Documents\Visual Studio 2010\Projects\FrameworkProject\license.xml"));

Image Processing and OCR

What is the focus for OCR accuracy?

SolidFramework is primarily aimed at the reconstruction of business documents.

As such OCR is unashamedly biased towards accurately recognising the content of such documents.

What is CGM?

CGM is an abbreviation for Color Gray Mono. Originally our image processing was targeted at archiving functionality: creating small scanned pages for PDF/A archive files while preserving the page image quality as much as possible. To this end, we recognize zones of the page image based on their nature and break the page up into appropriate components. For example:

  • a color photograph is extracted from the rest of the page, downsampled (typically to 150dpi) and compressed as JPEG
  • for a color graphic or text heading (palette image) we resample the colors to a smaller palette (like 16 or 256 colors) and use lossless compression (think of it as a GIF or PNG)
  • for monochrome text we extract as either 1 or 2 bits per pixel (anti-aliased text) and store losslessly using CCITT FAX compression

This segmentation, selective use of lossy or lossless compression and selective downsampling allows us to build a composite image page in PDF which is far smaller than a single scanned image page would be.

What pre-processing does SolidOCR support?

Solid CGM also includes all the obvious pre-processing functionality required to process scanned page images:

  • deskew
  • auto rotate (determining dominant page text orientation)
  • despeckle (“salt” and “pepper” noise removal)
  • dynamic thresholding (OCR is typically a monochrome process but for that to work, we need to establish “paper” and “ink” shades and limits)
  • scanner noise removal (typically black bars at the edges of pages)
  • staple, punch hole and folded corner noise removal
  • 90 and 270 degree text component detection (minor text components not in the same orientation as the rest of the page)
  • vector table detection
  • vector underline removal and repair (fixing the text character descenders that the underline may have “sliced”)
  • inverse text component detection (either for the whole page or for smaller text components: typically white on black text but can be any colors)

What needs to be done to perform OCR on CJK or Greek Text?

SolidFramework uses Tesseract to perform OCR on Chinese, Japanese, Korean and Greek language documents.

Information about how to do this can be found in the document Performing OCR using Tesseract.


How do I get hold of columns within a document?

Columns are a property of the “Section” object.

How can I find the location of a piece of text on a page?

Provided that the CoreModel was created with PdfOptions.ExposeTargetDocumentPagination set to true, the LayoutDocument can be obtained from the CoreModel once it has been created.

Each object within the CoreModel.Topic (except runs) has an associated Layout object. This layout object contains information about the location of the object within the document.

To get the layout object, call LayoutDocument.FindLayoutObject(ID), where ID is the identifier of the SolidObject, which can be found using SolidObject.GetID().

For each paragraph in the CoreModel.Topic there will be a matching LayoutParagraph which provides access to the location.

When I try to convert a PDF to PDF/A, I get a conversion status of PdfAError, and yet conversion appears to have happened. What does this mean?

ConversionStatus.PdfAError means “There was a problem in the source document that meant that it was not PDF/A compliant”.

However, SolidFramework may have been able to resolve these problems to create a compliant document, in which case an output file would have been generated.

It is therefore necessary to check whether the ConversionResults contains a path to a file, which would indicate that conversion was able to occur.

Typical code is as follows:

if (res == ConversionStatus.Success || res == ConversionStatus.PdfAError)
{
    // Get the location of the generated file
    if (conv.Results[0] != null && conv.Results[0].Paths.Count == 1)
    {
        string path = conv.Results[0].Paths[0];
        // Do something with the file
    }
}

How do I remove tagging from a PDF file?

Tagging uses a set of standard structure types to allow page content (text, graphics and images) to be extracted and reused for other purposes.

For example, Solid Framework uses tags, if present, to identify tables within a PDF. This can allow more accurate extraction of table data from a PDF. The problem with this is that tags are optional.

The same textual data may be identified as a table if tags are present, but identified as ordinary text if they are absent. This causes problems if you are trying to compare apparently similar files where one is tagged and the other is not.

Solid Framework allows tags to be removed from a PDF using the following code:

string taggedFile = @"c:\test\tagged.pdf";     // Path to PDF that contains tags (example path)
string untaggedFile = @"c:\test\untagged.pdf"; // Path for the PDF with tags removed (example path)

PdfDocument doc = new PdfDocument(taggedFile);
doc.SaveAs(untaggedFile, SolidFramework.Plumbing.OverwriteMode.ForceOverwrite, true);

Problem solving

How do I get debugging information from Solid Framework?

Solid Framework can emit a detailed text file log during processing. This can be very useful in allowing Solid Documents to identify where a problem is occurring.

Additional log files will be created by individual JobHandler processes if they are used. These files will have the letters “jh” and an ID included in their filename.

Note: in versions up to and including 9.2.8284, if the log file name does not end in “.txt” then all JobHandler processes will use the same log file which may cause file access contention and occasional conversion failures.
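One way to guard against this is to normalise the log file name before handing it to the framework. The sketch below is plain .NET; the class name LogPathHelper and the method EnsureTxtExtension are hypothetical and not part of Solid Framework. The comparison is deliberately exact (lower-case “.txt”), because it is an assumption either way whether the framework treats “.TXT” the same.

```csharp
using System;

public static class LogPathHelper
{
    // Hypothetical helper: returns the path unchanged if it already ends in
    // ".txt" (exact, case-sensitive match); otherwise appends ".txt".
    public static string EnsureTxtExtension(string logPath)
    {
        return logPath.EndsWith(".txt", StringComparison.Ordinal)
            ? logPath
            : logPath + ".txt";
    }

    public static void Main()
    {
        Console.WriteLine(EnsureTxtExtension(@"c:\test\solidframework.txt")); // c:\test\solidframework.txt
        Console.WriteLine(EnsureTxtExtension(@"c:\test\solidframework.log")); // c:\test\solidframework.log.txt
    }
}
```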

The pattern for using a log is typically something like this:

string logPath = @"c:\test\solidframework.txt";
if (System.IO.File.Exists(logPath)) {
    // Assumed intent: remove the log left over from a previous run
    System.IO.File.Delete(logPath);
}

SolidFramework.Plumbing.Logging.Instance.Path = logPath;