366934 Nuno Oliveira 2025-03-02 12:12:23
# Available OCR systems

Most of these systems combine classical OCR with AI models for best results.

## olmOCR

olmOCR is an open-source tool designed for high-throughput conversion of
PDFs and other documents into plain text while preserving natural
reading order. It supports tables, equations, handwriting, and more.

Links:
- [Demo](https://olmocr.allenai.org/).
- [GitHub](https://github.com/allenai/olmocr).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).

2b1591 Nuno Oliveira 2025-03-09 11:37:56
## Mistral OCR

Mistral OCR is the OCR engine behind Le Chat, from Mistral AI.

Links:
- [Le Chat](https://chat.mistral.ai/chat).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43282905).

## Zerox OCR

Zerox is a dead-simple way to OCR a document for AI ingestion. Documents are
meant to be a visual representation after all, so with weird layouts,
tables, charts, and the like, vision models just make sense.

Links:
- [GitHub](https://github.com/getomni-ai/zerox).
- [Hosted version](https://getomni.ai/ocr-demo).