# Available OCR systems

Most of these systems combine classical OCR with AI models for best results.

## olmOCR

olmOCR is an open-source tool designed for high-throughput conversion of
PDFs and other documents into plain text while preserving natural
reading order. It supports tables, equations, handwriting, and more.

Links:
- [Demo](https://olmocr.allenai.org/).
- [GitHub](https://github.com/allenai/olmocr).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).

## Mistral OCR

The OCR engine for Le Chat (Mistral AI).

Links:
- [Le Chat](https://chat.mistral.ai/chat).
- [Hacker News discussion](https://news.ycombinator.com/item?id=43282905).

## Zerox OCR

A dead simple way of OCR-ing a document for AI ingestion. Documents are
meant to be a visual representation after all, with weird layouts,
tables, charts, etc., so vision models just make sense.

Links:
- [GitHub](https://github.com/getomni-ai/zerox).
- [Hosted version](https://getomni.ai/ocr-demo).
