Resources
OCR resources
Resources and data for developing a language-aware error profiler for OCR'd documents and the PoCoTo tools.
Manuals, lexica, and OCR test data for PoCoTo and the profiler.
15 stars
6 watching
2 forks
Language: Lex
Last commit: over 3 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| cisocrgroup/pocoto | A Java-based tool for correcting errors in OCR'd historical documents | 40 |
| ocropus/hocr-tools | Tools for manipulating and analyzing multi-lingual OCR results by representing them in a standard HTML format | 370 |
| lascivaroma/lexical | Develops OCR models and ground truth data for a Latin lexical resource | 1 |
| aslez/concor | A software package for concordance analysis in R | 9 |
| lex4all/lex4all | Software tool to generate pronunciation lexicons for low-resource languages using speech recognition and machine learning algorithms | 21 |
| cpitclaudel/alectryon | Tools for processing Coq code and prose in technical documents | 236 |
| ploc-org/cnpl | A collection of annual reports on domestic programming languages in China | 234 |
| talyssonoc/commonregexruby | Extracts common information from text strings in various formats | 79 |
| chreul/ocr_testdata_earlyprintedbooks | Provides test data and models for training Optical Character Recognition (OCR) systems on historical printed books | 10 |
| peterc/whatlanguage | Language detection library using Bloom filters for speed and memory efficiency | 685 |
| osrf/osrf_testing_tools_cpp | Provides common testing tools and utilities for C++ projects | 33 |
| oncybersec/oscp-enumeration-cheat-sheet | A cheat sheet for conducting enumeration during penetration testing and security assessments | 102 |
| openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing | 1,190 |
| mittagessen/kraken | An OCR system optimized for historical and non-Latin scripts | 748 |