
Document Filters 25.4 Release

Nabih Metri
Product Manager

Every detail matters when preparing documents for AI and analytics. The 25.4 release of Document Filters adds new controls for rendering, OCR, and text extraction that produce cleaner, more usable output. It brings PDF rendering modes tuned for on-screen display or print, OCR of embedded images in Excel and PDF files for deeper text capture, and exclusion margins that remove unwanted headers and footers from extracted text. Together, these enhancements yield cleaner data, sharper visual fidelity, and more reliable results for AI and other downstream systems.

Release Highlights

Expanded OCR of embedded images in text documents – OCR_INLINE_IMAGES now extracts text from embedded images in Excel and PDF files, making hidden content searchable in even more formats.

Smarter PDF rendering – New PDF_RENDER_MODE option lets you fine-tune output for on-screen viewing or high-quality print, improving annotation and layout fidelity.

Cleaner extractions – Define PDF_EXCLUSION_MARGINS to automatically skip headers, footers, or side notes during text or Markdown extraction for PDFs, improving results for AI, curation, and search.
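
To make the idea behind exclusion margins concrete, here is a minimal conceptual sketch in Python. It does not call Document Filters and is not how the product implements PDF_EXCLUSION_MARGINS; the `TextBlock`, `Margins`, and `filter_blocks` names are hypothetical, coordinates are assumed to be in points measured from the top-left corner of the page, and the rule of dropping any block that is not fully inside the remaining area is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBlock:
    """A run of text with its bounding box (hypothetical type for this sketch)."""
    text: str
    x0: float  # left edge, points from the left of the page
    y0: float  # top edge, points from the top of the page
    x1: float  # right edge
    y1: float  # bottom edge


@dataclass(frozen=True)
class Margins:
    """Widths of the excluded bands along each page edge, in points."""
    top: float = 0.0
    right: float = 0.0
    bottom: float = 0.0
    left: float = 0.0


def filter_blocks(blocks, page_width, page_height, margins):
    """Keep only blocks that lie entirely inside the area left after the margins."""
    left = margins.left
    top = margins.top
    right = page_width - margins.right
    bottom = page_height - margins.bottom
    return [
        b for b in blocks
        if b.x0 >= left and b.y0 >= top and b.x1 <= right and b.y1 <= bottom
    ]


# A US Letter page (612 x 792 points) with a header, a body line, and a footer.
page = [
    TextBlock("ACME Corp Confidential", 72, 20, 300, 34),
    TextBlock("Quarterly revenue rose 4%.", 72, 100, 540, 114),
    TextBlock("Page 3 of 12", 280, 760, 340, 772),
]

kept = filter_blocks(page, 612, 792, Margins(top=50, bottom=50))
body_text = "\n".join(b.text for b in kept)
print(body_text)
```

With 50-point top and bottom margins, only the body line survives, which is the kind of cleanup that helps downstream AI, curation, and search. For the actual option syntax and accepted values, refer to the Document Filters documentation.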

Document Filters Resources