kraken

OCR engine

An OCR system optimized for historical and non-Latin scripts

OCR engine for all the languages

GitHub

748 stars
27 watching
131 forks
Language: Python
Last commit: 17 days ago
Linked from 1 awesome list

Topics: alto-xml, handwritten-text-recognition, hocr, htr, layout-analysis, neural-networks, ocr, optical-character-recognition, page-xml

Related projects:

Repository | Description | Stars
ocropus/hocr-tools | Tools for manipulating and analyzing multi-lingual OCR results by representing them in a standard HTML format | 370
calamari-ocr/calamari | An OCR engine with modular design and a command-line interface, providing pre-trained models and a Python API for customization | 1,049
r1me/ttesseractocr4 | An Object Pascal binding for the Tesseract OCR engine to perform optical character recognition | 145
mstksg/advent-of-code-ocr | A tool for parsing ASCII art word solutions from Advent of Code puzzles | 5
ub-mannheim/ocr-fileformat | Tool for converting and validating OCR file formats | 180
ncsu-libraries/ocracoke | A Rails application that enables the creation of OCR capabilities for indexing text from page images and providing search results in IIIF format | 33
hamdikahloun/windows_ocr | An OCR library allowing developers to embed high-quality character recognition functionality in their products | 18
ibm/max-ocr | An optical character recognition system deployed as a web service using a trained Tesseract OCR model | 47
kba/hocr-spec | A specification for an embedded OCR workflow and output format | 74
openphilology/tei-ocr | Customizes TEI XML for metadata from OCR processes to capture detailed layout and content information | 1
jean-baptiste-camps/froc-mss | Develops models to transcribe handwritten text from Old French and Old Occitan medieval manuscripts | 0
kscanne/tesseract-gle-uncial | Provides training data and scripts to enhance OCR accuracy for Irish Gaelic fonts | 3
kraken-ci/kraken | A continuous integration and testing system designed to focus on automated testing | 138
ub-mannheim/ocr-gt-tools | A web-based tool for editing and annotating OCR transcriptions of scanned text | 48