# Available OCR systems

Most of these systems combine classical OCR with AI for best results.

## olmOCR

olmOCR is an open-source tool designed for high-throughput conversion of
PDFs and other documents into plain text while preserving natural
reading order. It supports tables, equations, handwriting, and more.

Links:
- [Demo](https://olmocr.allenai.org/).
- [GitHub](https://github.com/allenai/olmocr).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).


## Mistral OCR

OCR engine from Mistral AI, used in Le Chat.

Links:
- [Le Chat](https://chat.mistral.ai/chat).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43282905).


## Zerox OCR

A dead simple way of OCR-ing a document for AI ingestion. Documents are
meant to be a visual representation after all, so for weird layouts,
tables, charts, etc., vision models just make sense.

Links:
- [GitHub](https://github.com/getomni-ai/zerox).
- [Hosted version](https://getomni.ai/ocr-demo).