366934 Nuno Oliveira 2025-03-02 12:12:23
Initial content.

# Available OCR systems

Most of these systems combine classical OCR with AI models for best results.

## olmOCR

olmOCR is an open-source tool designed for high-throughput conversion of
PDFs and other documents into plain text while preserving natural
reading order. It supports tables, equations, handwriting, and more.

Links:
- [Demo](https://olmocr.allenai.org/).
- [GitHub](https://github.com/allenai/olmocr).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).

## Zerox OCR

A dead simple way of OCR-ing a document for AI ingestion. Documents are
meant to be a visual representation after all, with weird layouts,
tables, charts, etc., so vision models just make sense.

Links:
- [GitHub](https://github.com/getomni-ai/zerox).
- [Hosted version](https://getomni.ai/ocr-demo).
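
The vision-model approach can be pictured as a per-page loop: each page image is handed to a model that returns Markdown, and the page results are joined in order. The sketch below is only an illustration of that idea under stated assumptions. It is not the Zerox or olmOCR API; `ocr_document`, `PageTranscriber`, and `fake_transcriber` are hypothetical names, and the fake transcriber stands in for a real vision-model call.

```python
from typing import Callable, Iterable

# Hypothetical type: takes one page image (raw bytes) and returns Markdown text.
PageTranscriber = Callable[[bytes], str]


def ocr_document(pages: Iterable[bytes], transcribe: PageTranscriber) -> str:
    """Transcribe each page image in order and join the results."""
    parts = []
    for page in pages:
        text = transcribe(page).strip()
        if text:  # skip pages that produced no text
            parts.append(text)
    # Separate pages with a blank line so Markdown blocks do not merge.
    return "\n\n".join(parts)


def fake_transcriber(image: bytes) -> str:
    """Stand-in for a real vision-model call (assumption): decodes the bytes."""
    return image.decode("utf-8")


if __name__ == "__main__":
    pages = [b"# Page 1\n", b"  ", b"Page 2"]
    print(ocr_document(pages, fake_transcriber))
```

Passing the transcriber in as a callable keeps the loop independent of any particular model or hosted service, so a real client could be swapped in without changing `ocr_document`.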