Document Filters 26.1 Release
AI and analytics workflows rely on two things: faithful conversions and output you can trust downstream. Document Filters 26.1 improves both. Markdown output is now available to trial users, so teams can test LLM ingestion pipelines during the proof-of-concept phase. Word-level location tracking in MDAST provides the granular context needed for citation and grounding. Expanded format coverage for healthcare imaging means specialized medical records are processed with the same reliability as standard business documents.
Release Highlights
Markdown output in trial mode – Developers building generative AI solutions often need to validate extraction quality before committing to a tool. With Markdown output enabled in trial mode, teams can prototype and test RAG pipelines and LLM ingestion workflows during the evaluation phase and confirm that Document Filters meets their accuracy needs.
More precise structure for downstream AI – Precision is critical for grounded AI responses. The new word-level splitting in MDAST (using JSON_INCLUDE_WORD_LOCATIONS) enables applications to pinpoint exact text coordinates. This granular data supports features like precise citation highlighting in search UIs and accurate grounding for Large Language Model (LLM) responses.
Expanded format coverage for healthcare – Medical archives are complex and often contain specialized imaging formats. Added support for DICOM JPEG SOF3 (3-channel) imaging lets healthcare applications process patient records that include these images without dropping the visual data, which helps interoperability and compliance in medical content management.
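To illustrate how word-level locations might be used for citation highlighting, here is a minimal Python sketch that walks an MDAST-style JSON tree and returns the location of each matching word. It is not taken from the Document Filters API. It assumes that word-level splitting produces one `text` node per word and that each node carries a `position` object; the sample data, the `position` field layout (`page`, `x`, `y`, `width`, `height`), and the helper names `iter_text_nodes` and `find_word` are all hypothetical. Check the release notes for the actual schema.

```python
import json
from typing import Iterator

# Hypothetical sample output. The real Document Filters MDAST schema may use
# different field names and coordinate units.
SAMPLE = json.loads("""
{
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {"type": "text", "value": "Precise",
         "position": {"page": 1, "x": 72, "y": 100, "width": 38, "height": 12}},
        {"type": "text", "value": "grounding",
         "position": {"page": 1, "x": 114, "y": 100, "width": 52, "height": 12}}
      ]
    }
  ]
}
""")


def iter_text_nodes(node: dict) -> Iterator[dict]:
    """Depth-first walk that yields every node whose type is 'text'."""
    if node.get("type") == "text":
        yield node
    for child in node.get("children", []):
        yield from iter_text_nodes(child)


def find_word(tree: dict, word: str) -> list[dict]:
    """Return the (assumed) position of each text node matching `word`."""
    target = word.casefold()
    return [
        node["position"]
        for node in iter_text_nodes(tree)
        if node.get("value", "").strip().casefold() == target
        and "position" in node
    ]


hits = find_word(SAMPLE, "grounding")
print(hits)
```

A search UI could pass each returned rectangle to its highlighter, and an LLM pipeline could attach the same rectangles to a generated answer as grounding evidence.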
Release Links
- Document Filters 26.1 Release Notes
- Document Filters 26.1 Software Bill of Materials
- Enhancement Requests
