Tutorials
This section collects tutorials for the Hyland Document Filters SDK. Each tutorial gives step-by-step guidance on a single task, covering document conversion, data extraction, and customization options. The tutorials are written for both beginners and experienced developers who are integrating Document Filters into their applications.
Opening Files
How do I open a document from disk? | This sample demonstrates how to use the Hyland Document Filters SDK to extract text from a document. It provides a high-level workflow for initializing the Document Filters API, opening a document, and reading its text content in chunks until the end of the document is reached. |
How do I open a document from memory? | This sample demonstrates how to use the Hyland Document Filters SDK to extract text from a document loaded into memory. It provides a high-level workflow for initializing the Document Filters API, opening a document, and reading its text content in chunks until the end of the document is reached. |
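The chunked-read workflow that both tutorials above describe can be sketched in general terms. The example below is only an illustration of the loop shape and does not use the Document Filters API: `read_in_chunks` is a hypothetical helper, and a standard-library `io.StringIO` object stands in for an opened document. The linked tutorials show the actual SDK calls.

```python
import io


def read_in_chunks(stream, chunk_size=4096):
    """Yield successive text chunks until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty string signals the end of the stream
            break
        yield chunk


# Stand-in for an opened document (assumption: not an SDK object).
document = io.StringIO("Hello, Document Filters! " * 3)

# Read the content 16 characters at a time and reassemble it.
text = "".join(read_in_chunks(document, chunk_size=16))
```

Reading in fixed-size chunks keeps memory use bounded regardless of document size, which is why the loop runs until the source reports that no content remains.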
Converting Files
How do I convert a document to a PDF file? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into PDF format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering its pages into a structured PDF output. |
How do I convert a document to Classic HTML? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into classic HTML format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and extracting its content and images for web presentation. |
How do I convert a document to JSON in Hi-Def mode? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into HiDef JSON format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering its content into a structured JSON output suitable for web presentation. |
How do I convert a document to Markdown in Hi-Def mode? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into HiDef Markdown format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering its content into a structured Markdown output suitable for web presentation. |
How do I convert a document to Markdown in Text mode? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into Text-mode Markdown format. It outlines a high-level workflow for initializing the Document Filters API, opening a document in Text mode, and rendering its content into structured Markdown output suitable for web presentation. |
How do I convert a document to paginated HiDef HTML? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into HiDef HTML format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering its content into a structured HTML output suitable for web presentation. |
How do I convert a document to PNG images? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into PNG format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering each page into individual PNG images. |
How do I convert a document to Structured XML? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a document into XML format. It provides a high-level workflow for initializing the Document Filters API, opening a document, and rendering its pages into a structured XML output. |
How do I convert a PDF to Markdown with Table Detection? | This sample demonstrates how to use the Hyland Document Filters SDK to convert a PDF document into HiDef Markdown format with table detection enabled. It provides a high-level workflow for initializing the Document Filters API, opening a PDF document, and rendering its content into a structured Markdown output suitable for web presentation. |
How do I convert HTML to Hi-Def? | This sample demonstrates how to use the Hyland Document Filters SDK to convert an HTML document into Hi-Def format. It outlines a high-level workflow for initializing the Document Filters API, opening an HTML file, and handling embedded images. The tutorial also covers different methods for resolving images, including external URLs, local disk references, and in-memory stream processing. |
Extracting Files
How do I extract metadata from a document? | This sample demonstrates how to use the Hyland Document Filters SDK to extract metadata from a document. It provides a high-level workflow for initializing the Document Filters API, opening a document, and retrieving its metadata. |
How do I extract sub-documents from documents and archives? | This sample demonstrates how to use the Hyland Document Filters SDK to extract metadata from subfiles within a document archive. It provides a high-level workflow for initializing the Document Filters API, opening a document, and iterating through its subfiles to read their associated metadata. |
How do I extract text and metadata from a document? | This sample demonstrates how to use the Hyland Document Filters SDK to extract text and metadata from a document. It provides a high-level workflow for initializing the Document Filters API, opening a document, retrieving its metadata, and reading its text content in chunks until the end of the document is reached. |
How do I process password-protected files? | Document Filters provides a flexible callback-based approach to unlock password-protected files, whether they are documents or nested within containers. This includes scenarios where each item may have different passwords. |
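The callback-based approach mentioned for password-protected files can be illustrated with a generic sketch. Everything here is hypothetical and is not the Document Filters API: `unlock_items`, `lookup_password`, and the item names are invented for illustration. The point is only that the processing code asks a caller-supplied function for a password per item, so each item can have a different one.

```python
from typing import Callable, Dict, Optional

# Hypothetical callback type: receives an item name, returns a password or None.
PasswordCallback = Callable[[str], Optional[str]]


def unlock_items(
    items: Dict[str, Optional[str]], get_password: PasswordCallback
) -> Dict[str, bool]:
    """Try to unlock each item; `items` maps name -> required password (None if unprotected)."""
    results = {}
    for name, required in items.items():
        if required is None:
            results[name] = True  # unprotected items need no callback
            continue
        # Ask the caller for this specific item's password.
        results[name] = get_password(name) == required
    return results


# Caller-side password store (assumption: illustrative values only).
known_passwords = {"report.docx": "alpha", "archive.zip/budget.xlsx": "beta"}


def lookup_password(item_name: str) -> Optional[str]:
    return known_passwords.get(item_name)


items = {
    "report.docx": "alpha",
    "archive.zip/budget.xlsx": "beta",  # nested item with its own password
    "notes.txt": None,
    "locked.pdf": "gamma",  # no password known for this one
}
status = unlock_items(items, lookup_password)
```

Keeping the password lookup in a callback means the caller decides where passwords come from (a prompt, a vault, a fixed table) without changing the processing loop.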
Other
How do I capture log messages? | Hyland Document Filters provides a callback-based logging mechanism that allows applications to dynamically control log levels and capture detailed diagnostic information. This system gives developers full visibility into internal operations across modules while maintaining control over verbosity and performance. |
How do I compare documents? | Comparing documents presents a multifaceted challenge due to the dynamic nature of content. Documents can undergo various changes such as additions, deletions, modifications, reformatting, and even movement across pages. These alterations not only impact the visual appearance but also the underlying structure and semantics of the content. |
How do I create a barcode? | This sample demonstrates how to use the Hyland Document Filters SDK to generate a QR code and save it as a PNG image. It provides a high-level workflow for initializing the Document Filters API, creating an output canvas, and annotating it with a QR code. |
How do I integrate with another OCR engine? | Hyland Document Filters provides built-in OCR integration with Tesseract, but you can also use a custom OCR engine such as EasyOCR, PaddleOCR, or any AI-powered service by registering an OCR callback. This allows you to incorporate OCR into the document processing pipeline while leveraging Document Filters' layout analysis and flexible output formats like Markdown and JSON. |
How do I localize metadata? | This sample demonstrates how to use the Hyland Document Filters SDK to localize an Outlook message and generate a PDF document. It provides a high-level workflow for initializing the Document Filters API, localizing message fields, opening the document, and rendering its content into a PDF format. |