Getting Started with .NET

Hyland Document Filters provides robust document processing capabilities that can be easily integrated into your .NET applications. Follow these instructions to set up your environment.

Installing the Bindings

The Document Filters .NET bindings are available on NuGet under the package name Hyland.DocumentFilters, which provides the easiest and recommended way to integrate Document Filters into your .NET applications.

If needed, you can also access the source code for the bindings in the bindings/dotnet directory of the Document Filters GitHub repository. This allows for modifications and manual builds when customization is required.

Note

Earlier versions shipped the bindings as the pre-compiled Perceptive.DocumentFilters project, which is still included in the package. To upgrade a project to Hyland.DocumentFilters, update the project reference and change the namespace from Perceptive.DocumentFilters to Hyland.DocumentFilters.

Note

The Document Filters native DLLs are built for a specific architecture. When your application targets AnyCPU, the process architecture is only determined at runtime, so both the 32-bit and 64-bit versions of the Document Filters DLLs must be present.

If you're using the NuGet package and targeting .NET Core, the selection of the appropriate DLLs is automated. However, if you are targeting .NET Framework, you must ensure that the DLLs are discoverable according to the Default Probing rules for Unmanaged (native) libraries.
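When troubleshooting an AnyCPU deployment, it can help to confirm which architecture the process actually runs as, since that decides which native DLL is needed. The following is a minimal sketch that uses only standard .NET APIs (Environment.Is64BitProcess and RuntimeInformation.ProcessArchitecture); the ArchitectureCheck class name is hypothetical and is not part of Document Filters.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ArchitectureCheck
{
    // Describe the bitness and CPU architecture of the current process.
    public static string Describe()
    {
        string bitness = Environment.Is64BitProcess ? "64-bit" : "32-bit";
        return $"{bitness} process ({RuntimeInformation.ProcessArchitecture})";
    }

    public static void Main()
    {
        // The native Document Filters DLL that is loaded must match this architecture.
        Console.WriteLine(Describe());
    }
}
```

Running this from the same host and with the same build settings as your application shows whether the 32-bit or the 64-bit native libraries need to be discoverable.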

Using Published Bindings

To use the official Document Filters .NET bindings, you can add the Hyland.DocumentFilters package from NuGet to your project using Visual Studio or the command line.

Via the Command Line:

Run the following command in your terminal:

dotnet add package Hyland.DocumentFilters

Via Visual Studio:

Follow these steps to add the bindings via Visual Studio:

1. Open your project in Visual Studio.
2. In Solution Explorer, right-click your project and select Manage NuGet Packages.
3. In the NuGet Package Manager window, select the Browse tab.
4. In the search box, type Hyland.DocumentFilters.
5. Select the Hyland.DocumentFilters package and choose the desired version.
6. Click Install to add the package to your project.

After installation, the package is listed under your project's Dependencies in Solution Explorer, and you can start using the Document Filters API.

Installing from NuGet gives you an official, versioned build of the bindings and makes integration and later updates straightforward.

Building the Bindings

If you need to modify the .NET bindings or build them manually, follow these steps:

1. Clone the Document Filters GitHub repository.
2. Navigate to the bindings/dotnet directory.
3. Open the solution file in Visual Studio or another preferred IDE.
4. Make any necessary changes to the code.
5. Build the project to generate the custom .NET bindings for your application.

Once built, you can reference the custom bindings in your .NET project manually, or package them into a local NuGet package for reuse.

Initializing and Calling Document Filters

C#

using Hyland.DocumentFilters;

var api = new Hyland.DocumentFilters.Api();
api.Initialize("YOUR_LICENSE_KEY_HERE", ".");

The preceding code imports the Hyland.DocumentFilters namespace, creates a new Api instance, and initializes it with a license key. Replace "YOUR_LICENSE_KEY_HERE" with your actual license key.

The second argument controls where Document Filters looks for resources, such as configuration files and fonts. The value "." tells it to look in the same directory as the Document Filters shared libraries.

Note: ISYSdf11.dll must either be in the same folder as the currently executing assembly or be discoverable through the Default Probing rules for Unmanaged (native) libraries.
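To keep the license key out of source code, the initialization shown above can read the key from the environment instead. This is only a sketch: the DOCFILTERS_LICENSE_KEY variable name is a hypothetical choice, not something Document Filters defines, and the only Document Filters calls used are the Api constructor and Initialize from the example above.

```csharp
using System;
using Hyland.DocumentFilters;

// DOCFILTERS_LICENSE_KEY is a hypothetical variable name chosen for this sketch.
var licenseKey = Environment.GetEnvironmentVariable("DOCFILTERS_LICENSE_KEY");
if (string.IsNullOrWhiteSpace(licenseKey))
{
    Console.Error.WriteLine("Set DOCFILTERS_LICENSE_KEY before running.");
    return 1;
}

// Same calls as the example above; "." is the resource path from that example.
var api = new Api();
api.Initialize(licenseKey, ".");

Console.WriteLine("Document Filters initialized.");
return 0;
```

Failing early with a clear message when the variable is missing tends to be easier to diagnose than passing an empty key to Initialize.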

Extracting Text

C#

using Hyland.DocumentFilters;

var api = new Hyland.DocumentFilters.Api();
api.Initialize("YOUR_LICENSE_KEY_HERE", ".");

using var doc = api.GetExtractor("filename.doc");
doc.Open(Hyland.DocumentFilters.OpenType.BodyAndMeta);

while (!doc.EndOfStream)
{
    var text = doc.GetText(4096);
    Console.Out.WriteLine(text);
}

The preceding code loads the file filename.doc into an extractor named doc. Because doc is declared with a using declaration, the extractor is closed automatically when doc goes out of scope.

It then opens the extractor with BodyAndMeta indicating that we want to extract both the text (body) and the metadata of the document.

Finally, it calls GetText in a loop, requesting up to 4096 at a time, until the extractor reports EndOfStream.
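If you want the extracted text as a single string or file rather than console output, the same loop can append each chunk to a buffer. The sketch below uses only the calls from the example above plus standard .NET types (StringBuilder, File); it assumes GetText returns a string, as the example suggests, and the output name filename.txt is a placeholder.

```csharp
using System;
using System.IO;
using System.Text;
using Hyland.DocumentFilters;

var api = new Api();
api.Initialize("YOUR_LICENSE_KEY_HERE", ".");

using var doc = api.GetExtractor("filename.doc");
doc.Open(OpenType.BodyAndMeta);

// Append chunks without adding separators, so chunk boundaries
// do not introduce line breaks that are not in the document.
var buffer = new StringBuilder();
while (!doc.EndOfStream)
{
    buffer.Append(doc.GetText(4096));
}

// filename.txt is a placeholder output path for this sketch.
File.WriteAllText("filename.txt", buffer.ToString(), Encoding.UTF8);
Console.WriteLine($"Extracted {buffer.Length} characters.");
```

Note that writing each chunk with WriteLine, as in the console example, adds a line break after every chunk; appending avoids that when the exact text matters.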

Converting a Document

C#

using Hyland.DocumentFilters;

var api = new Hyland.DocumentFilters.Api();
api.Initialize("YOUR_LICENSE_KEY_HERE", ".");

using var doc = api.GetExtractor("filename.doc");
using var canvas = api.MakeOutputCanvas("output.pdf", Hyland.DocumentFilters.CanvasType.PDF);

doc.Open(Hyland.DocumentFilters.OpenType.FormatImage);

canvas.RenderPages(doc);

The preceding code loads the file filename.doc into an extractor named doc. Because doc is declared with a using declaration, the extractor is closed automatically when doc goes out of scope.

It also creates a new canvas object of type CanvasType.PDF and stores it as canvas. The canvas will also be automatically closed when it goes out of scope.

It then opens the extractor with FormatImage, indicating that we want to convert the file to an image-based output. This causes the file to be paginated.

Finally, it calls RenderPages to output every page from doc into the canvas. You could also iterate over the pages manually and call RenderPage for each one.

Did you know?

You can render more than one document to a canvas. If you want to stitch multiple files together, load each document into its own Extractor, then call RenderPage or RenderPages to render them onto a single canvas.
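As a rough illustration of stitching, the sketch below renders several documents onto one PDF canvas. It uses only the calls shown in the conversion example (GetExtractor, Open, MakeOutputCanvas, RenderPages); the input and output file names are hypothetical placeholders.

```csharp
using Hyland.DocumentFilters;

var api = new Api();
api.Initialize("YOUR_LICENSE_KEY_HERE", ".");

// Hypothetical input files, rendered in the order listed.
string[] inputs = { "cover.doc", "report.doc", "appendix.doc" };

// A single canvas receives the pages of every document.
using var canvas = api.MakeOutputCanvas("combined.pdf", CanvasType.PDF);

foreach (var path in inputs)
{
    // Each document gets its own extractor, closed at the end of the iteration.
    using var doc = api.GetExtractor(path);
    doc.Open(OpenType.FormatImage);
    canvas.RenderPages(doc);
}
```

Because the canvas is declared before the loop, it stays open until every document has been rendered and is closed once when it goes out of scope.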