Commit 366934
2025-03-02 12:12:23 Nuno Oliveira: Initial content.| /dev/null .. OCR.md | |
| @@ 0,0 1,25 @@ | |
| + | # Available OCR systems |
| + | |
| + | Most of these systems combine classical OCR with AI models for best results. |
| + | |
| + | ## olmOCR |
| + | |
| + | olmOCR is an open-source tool designed for high-throughput conversion of |
| + | PDFs and other documents into plain text while preserving natural |
| + | reading order. It supports tables, equations, handwriting, and more. |
| + | |
| + | Links: |
| + | - [Demo](https://olmocr.allenai.org/). |
| + | - [GitHub](https://github.com/allenai/olmocr). |
| + | - [Hacker News discussion](https://news.ycombinator.com/item?id=43174298). |
| + | |
| + | |
| + | ## Zerox OCR |
| + | |
| + | A dead simple way of OCR-ing a document for AI ingestion. Documents are |
| + | meant to be a visual representation after all, with weird layouts, |
| + | tables, charts, etc. Vision models just make sense for this! |
| + | |
| + | Links: |
| + | - [GitHub](https://github.com/getomni-ai/zerox). |
| + | - [Hosted version](https://getomni.ai/ocr-demo). |
