Getting Started with Java¶

Hyland Document Filters provides robust document processing capabilities that can be easily integrated into your Java applications. Follow these instructions to set up your environment.

Clone and Include the Document Filters Repository¶

The Document Filters GitHub repository contains the necessary files and libraries.

Installing the Bindings¶

The Java bindings JAR file, ISYS11df.jar, can be found in the bindings/java/lib directory of the Document Filters GitHub repository. While the same JAR file can be used across all platforms, you will need to obtain the appropriate native binaries for each platform you wish to support. The native binaries are included in the release ZIP files for each platform.

Note

The run.sh and run.cmd scripts included with the Java samples in the Document Filters GitHub repository automatically handle downloading the release binaries for the current platform.
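
Because the native file name differs per platform, it can help to check which name the JVM will actually look for. The following is a minimal sketch using only the standard System.mapLibraryName method; the class name NativeLibraryName is hypothetical and not part of Document Filters, and only the base name ISYS11df comes from this guide.

```java
// Hypothetical helper, not part of Document Filters: shows the platform-specific
// file name the JVM derives from the base name "ISYS11df".
final class NativeLibraryName {

    // Base name taken from this guide (ISYS11df.jar and its native library).
    static final String BASE_NAME = "ISYS11df";

    // Delegates to the JDK. It typically adds a "lib" prefix plus ".so" (Linux)
    // or ".dylib" (macOS), and only a ".dll" suffix on Windows.
    static String platformFileName() {
        return System.mapLibraryName(BASE_NAME);
    }

    public static void main(String[] args) {
        System.out.println(System.getProperty("os.name") + " -> " + platformFileName());
    }
}
```

Running it on each target platform shows which file from the release ZIP needs to be on the library path there.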

Integrating with Maven¶

Add the JAR and native binaries: ISYS11df.jar is not hosted on Maven Central, so you need to include the JAR file and the native binaries manually.
Copy the JAR: Copy ISYS11df.jar into your project directory (e.g., a libs folder).

Update your pom.xml:

pom.xml

<dependencies>
    <!-- Add Document Filters JAR as a system-scoped dependency -->
    <dependency>
        <groupId>com.perceptive</groupId>
        <artifactId>documentfilters</artifactId>
        <version>11.0</version>
        <scope>system</scope>
        <systemPath>${project.basedir}/libs/ISYS11df.jar</systemPath>
    </dependency>
</dependencies>

<build>
    <plugins>
        <!-- Copy the native library into the build output so it can be found via java.library.path -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-antrun-plugin</artifactId>
            <version>1.8</version>
            <executions>
                <execution>
                    <phase>process-resources</phase>
                    <!-- Bind the antrun "run" goal; without it the copy never executes -->
                    <goals>
                        <goal>run</goal>
                    </goals>
                    <configuration>
                        <target>
                            <copy file="path/to/native/binaries/ISYS11df.dll" todir="${project.build.directory}/native/"/>
                        </target>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

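Configure Native Libraries: Point the java.library.path system property at the directory that contains the native binaries (for example, -Djava.library.path=target/native). Set it when the JVM is launched, such as in your Maven build script or run command; changing the property from running code generally has no effect, because the JVM reads it at startup.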
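
To confirm that the native binary is reachable through java.library.path, a small diagnostic can walk the directories on that path. This is a minimal sketch that uses only the JDK; LibraryPathCheck and findInPath are hypothetical names, not part of Document Filters.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper, not part of Document Filters.
final class LibraryPathCheck {

    // Returns the first existing file named fileName in any directory of searchPath.
    static Optional<Path> findInPath(String searchPath, String fileName) {
        if (searchPath == null || searchPath.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String fileName = System.mapLibraryName("ISYS11df");
        String searchPath = System.getProperty("java.library.path");
        Optional<Path> hit = findInPath(searchPath, fileName);
        System.out.println(hit.map(p -> "Found " + p)
                .orElse(fileName + " not found in java.library.path: " + searchPath));
    }
}
```

Run it with the same -Djava.library.path value you pass to your application to see whether the binary would be found.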

Integrating with Gradle¶

Add the JAR and native binaries: Copy the ISYS11df.jar to your project's libs directory.

Update your build.gradle:

build.gradle

dependencies {
    // Add Document Filters JAR as a compile-time dependency
    implementation files('libs/ISYS11df.jar')
}

task copyNativeLibs(type: Copy) {
    from 'path/to/native/binaries'
    into "$buildDir/nativeLibs"
}

// Ensure native libraries are available (the 'run' task is provided by the 'application' plugin)
run {
    dependsOn copyNativeLibs
    systemProperty 'java.library.path', "$buildDir/nativeLibs"
}

Configure Native Libraries: As with Maven, set java.library.path in your Gradle run configuration so that it points to the directory containing the native binaries.

Integrating with Ant¶

Add the JAR and native binaries: Place the ISYS11df.jar and native binaries in your project folder (e.g., lib and native folders).

Update build.xml:

build.xml

<project name="DocumentFiltersProject" basedir="." default="run">

<path id="classpath">
    <pathelement location="lib/ISYS11df.jar"/>
</path>

<target name="run">
    <java classname="com.perceptive.App" fork="true">
        <classpath refid="classpath"/>
        <jvmarg value="-Djava.library.path=./native"/>
    </java>
</target>

</project>

Configure Native Libraries: Set the java.library.path using the jvmarg to point to the directory containing the native binaries.

Initializing and calling Document Filters¶

Once the package is installed, you can begin using it in your application.

Java

App.java

import com.perceptive.documentfilters.*;

public class App {

    private static final String LICENSE_KEY = "YOUR_LICENSE_KEY_HERE";

    public static void main(String[] args) {
        try {
            DocumentFilters api = new DocumentFilters();
            api.Initialize(LICENSE_KEY, ".");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Explanation:

The code imports the com.perceptive.documentfilters package.
A new DocumentFilters instance is created and initialized with a license key. Replace "YOUR_LICENSE_KEY_HERE" with your actual license key.
The second parameter, ".", specifies the directory that holds configuration files and resources, such as fonts; here it is the current working directory.
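
Rather than hardcoding the license key in source, it can be read from the environment at startup. The following is a minimal sketch under stated assumptions: the variable name DOCFILTERS_LICENSE_KEY and the class LicenseKeySource are hypothetical and are not defined by Document Filters.

```java
// Hypothetical helper, not part of Document Filters.
final class LicenseKeySource {

    // Assumed variable name for illustration; choose whatever fits your deployment.
    static final String ENV_VAR = "DOCFILTERS_LICENSE_KEY";

    // Validates and trims a raw value, failing early with a clear message.
    static String resolve(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Set " + ENV_VAR + " to your Document Filters license key");
        }
        return rawValue.trim();
    }

    public static void main(String[] args) {
        try {
            String licenseKey = resolve(System.getenv(ENV_VAR));
            // Avoid printing the key itself; report only that it was found.
            System.out.println("License key loaded (" + licenseKey.length() + " characters)");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The resolved value can then be passed as the first argument to Initialize in place of the LICENSE_KEY constant.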

Note: The native library ISYS11df.(dll/so/dylib) is loaded by a call to System.loadLibrary("ISYS11df"). For more details, refer to the Java documentation for System.loadLibrary.
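
If the native library cannot be found, System.loadLibrary throws an UnsatisfiedLinkError. A preflight check along the following lines can make that failure easier to diagnose. It is a minimal sketch using only the JDK; NativeLoadCheck and tryLoad are hypothetical names, not part of Document Filters.

```java
// Hypothetical fail-fast check, not part of Document Filters.
final class NativeLoadCheck {

    // Tries to load a native library by base name and reports why it failed.
    static boolean tryLoad(String baseName) {
        try {
            System.loadLibrary(baseName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load " + System.mapLibraryName(baseName)
                    + ": " + e.getMessage());
            System.err.println("java.library.path = "
                    + System.getProperty("java.library.path"));
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("ISYS11df")
                ? "Native library loaded"
                : "Native library missing");
    }
}
```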

Extracting Text¶

Once the Document Filters library is initialized, you can begin extracting text from documents. The following Java code snippet demonstrates how to load a document and extract its contents using the Document Filters API. This example focuses on extracting text from a Word document (.doc file).

Java

App.java

import com.perceptive.documentfilters.*;

public class App {

    private static final String LICENSE_KEY = "YOUR_LICENSE_KEY_HERE";

    public static void main(String[] args) {
        try {
            DocumentFilters api = new DocumentFilters();
            api.Initialize(LICENSE_KEY, ".");

            try (Extractor doc = api.GetExtractor("filename.doc")) {
                doc.Open(isys_docfilters.IGR_BODY_AND_META);

                while (!doc.getEOF()) {
                    String text = doc.GetText(4096);
                    // print, not println: a chunk can end mid-line, so avoid adding line breaks
                    System.out.print(text);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Explanation:

The code initializes DocumentFilters and opens the document filename.doc through an Extractor instance.
It uses IGR_BODY_AND_META to extract both the document body and metadata.
The GetText method reads the document's content in chunks of up to 4096 characters; the loop continues until getEOF() reports the end of the document.
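
The read-until-EOF loop is a common pattern when the full text is needed as a single string. The sketch below illustrates the same pattern with the standard java.io.Reader as a stand-in, since it does not depend on the Document Filters library; ChunkedRead and readAll are hypothetical names, and the Reader is only an analogy for the Extractor.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Illustrative stand-in: accumulates fixed-size chunks until end of input.
final class ChunkedRead {

    // Reads up to chunkSize characters at a time and joins them without separators.
    static String readAll(Reader reader, int chunkSize) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[chunkSize];
        int n;
        while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
            out.append(buffer, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // A tiny chunk size makes it obvious that chunks can split lines.
        String text = readAll(new StringReader("first line\nsecond line"), 4);
        System.out.println(text);
    }
}
```

Appending chunks to a StringBuilder keeps the original line breaks intact, because a chunk boundary can fall in the middle of a line.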

Converting a Document¶

After initializing the Document Filters library, you can convert documents into different formats, such as PDF. The following Java code snippet demonstrates how to load a Word document (.doc file) and convert it into a PDF using the Document Filters API.

Java

App.java

import com.perceptive.documentfilters.*;

public class App {

    private static final String LICENSE_KEY = "YOUR_LICENSE_KEY_HERE";

    public static void main(String[] args) {
        try {
            DocumentFilters api = new DocumentFilters();
            api.Initialize(LICENSE_KEY, ".");

            try (Extractor doc = api.GetExtractor("filename.doc");
                 Canvas canvas = api.MakeOutputCanvas("output.pdf", isys_docfilters.IGR_DEVICE_IMAGE_PDF, "")) {

                doc.Open(isys_docfilters.IGR_FORMAT_IMAGE);

                for (int pageIndex = 0, pageCount = doc.GetPageCount(); pageIndex < pageCount; ++pageIndex) {
                    try (Page page = doc.GetPage(pageIndex)) {
                        canvas.RenderPage(page);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Explanation:

This code converts filename.doc to a PDF by rendering each page onto a Canvas that writes to output.pdf.
The extractor is opened with IGR_FORMAT_IMAGE, which prepares the document for image-based output and causes it to be paginated.
The loop calls GetPage for each index from 0 to GetPageCount() - 1 and renders the page with RenderPage; the try-with-resources blocks close each Page, the Canvas, and the Extractor when they are no longer needed.