Commit 366934

2025-03-02 12:12:23 Nuno Oliveira: Initial content.
/dev/null .. OCR.md
@@ 0,0 1,25 @@
+ # Available OCR systems
+
+ Most of these systems combine classical OCR with AI models for best results.
+
+ ## olmOCR
+
+ olmOCR is an open-source tool designed for high-throughput conversion of
+ PDFs and other documents into plain text while preserving natural
+ reading order. It supports tables, equations, handwriting, and more.
+
+ Links:
+ - [Demo](https://olmocr.allenai.org/).
+ - [GitHub](https://github.com/allenai/olmocr).
+ - [Hacker News discussion](https://news.ycombinator.com/item?id=43174298).
+
+
+ ## Zerox OCR
+
+ A dead simple way of OCR-ing a document for AI ingestion. Documents are
+ meant to be a visual representation after all, with weird layouts,
+ tables, charts, etc., so vision models are a natural fit.
+
+ Links:
+ - [GitHub](https://github.com/getomni-ai/zerox).
+ - [Hosted version](https://getomni.ai/ocr-demo).