OCR - blame (366934) – ChemE contents

366934	Nuno Oliveira	2025-03-02 12:12:23
Initial content.

# Available OCR systems

Most of these systems combine classical OCR with AI models for best results.

## olmOCR

olmOCR is an open-source tool designed for high-throughput conversion of
PDFs and other documents into plain text while preserving natural
reading order. It supports tables, equations, handwriting, and more.

Links:

  - [Demo](https://olmocr.allenai.org/).

  - [GitHub](https://github.com/allenai/olmocr).

  - [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).

## Zerox OCR

A dead-simple way to OCR a document for AI ingestion. Documents are
meant to be a visual representation, after all, so for weird layouts,
tables, charts, and the like, vision models are a natural fit.

Links:

  - [GitHub](https://github.com/getomni-ai/zerox).

  - [Hosted version](https://getomni.ai/ocr-demo).
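
The vision-model idea described for Zerox can be illustrated with a minimal sketch. It assumes a pipeline in which every page is first rendered to an image and then transcribed by a vision model, with the per-page results joined in page order. This is not Zerox's actual API: `PageImage`, `ocr_document`, and the `transcribe` callable are hypothetical names used only for illustration, and the model call is replaced by a stand-in function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PageImage:
    """Hypothetical container for one rendered page (not part of any library)."""
    number: int   # 1-based page number
    data: bytes   # raw image bytes of the rendered page


def ocr_document(pages: Iterable[PageImage],
                 transcribe: Callable[[bytes], str]) -> str:
    """Transcribe each page image and join the results in page order.

    `transcribe` stands in for a vision-model call (assumption): it takes
    the image bytes of one page and returns the text for that page.
    """
    ordered = sorted(pages, key=lambda page: page.number)
    chunks = [transcribe(page.data).strip() for page in ordered]
    # Skip pages for which the model returned nothing.
    return "\n\n".join(chunk for chunk in chunks if chunk)


def fake_transcribe(image: bytes) -> str:
    """Stand-in for a real vision model; it only reports the image size."""
    return f"(page image of {len(image)} bytes)"


if __name__ == "__main__":
    demo_pages = [PageImage(2, b"\x00" * 20), PageImage(1, b"\x00" * 10)]
    print(ocr_document(demo_pages, fake_transcribe))
```

Passing the model in as a plain callable keeps the page loop independent of any particular provider, so the same function works with a hosted model, a local one, or a test stub.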