This framework efficiently extracts text, metadata, images, and structured data from over 97 document and image formats.
Kreuzberg is a high-performance document intelligence framework written in Rust that processes PDFs, Office files, images, and many other formats. Alongside its native Rust API, it offers bindings for languages such as Python, Java, Node.js, and C#. You can use it from the command line, for example by running `kreuzberg process document.pdf`, or by importing the language-specific bindings into your application (e.g., `import kreuzberg` in Python).
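As a rough illustration of the command-line route, the sketch below wraps the `kreuzberg process document.pdf` invocation quoted above in a small Python helper. It uses only the standard library. The helper names `build_command` and `run_kreuzberg` are hypothetical, and the code assumes the `kreuzberg` executable is on your `PATH` and writes its result to standard output; check the CLI help for the actual subcommands and output options.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(document: Path) -> List[str]:
    """Build the CLI invocation quoted in the text above.

    Assumption: the `process` subcommand takes the document path as its
    only required argument.
    """
    return ["kreuzberg", "process", str(document)]


def run_kreuzberg(document: Path) -> Optional[str]:
    """Run the CLI on one document and return whatever it prints to stdout.

    Returns None when the `kreuzberg` executable is not installed, so the
    caller can fall back to another extraction path.
    """
    if shutil.which("kreuzberg") is None:
        return None
    completed = subprocess.run(
        build_command(document),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as text rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout
```

Calling `run_kreuzberg(Path("document.pdf"))` returns the captured output, or `None` if the CLI is missing. Shelling out like this keeps the host application independent of any particular binding; for tighter integration, the language bindings mentioned above are probably the better fit.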
Developers needing to programmatically extract structured and unstructured data from diverse document formats should consider Kreuzberg.