 OCR_GS_Data
 OCR dataset
 A collection of double-checked gold standard data for training and testing OCR engines.
13 stars
 5 watching
 14 forks
 
Language: HTML 
last commit: over 8 years ago 
Linked from 1 awesome list
 Related projects:
| Repository | Description | Stars | 
|---|---|---|
|  | Provides gold standard data for training and testing optical character recognition (OCR) engines. | 15 | 
|  | Provides test data and models for training Optical Character Recognition (OCR) systems on historical printed books. | 10 | 
|  | A dataset of 2D images and 3D data generated from the Grand Theft Auto game engine for object localization research. | 135 | 
|  | A dataset of human-labeled text excerpts validated against the Sustainable Development Goals. | 28 | 
|  | An OCR-as-a-Service using Tesseract and Docker with scalable architecture and support for multiple languages. | 1,346 | 
|  | Compiles lists of publicly available geospatial datasets from various cloud platforms. | 535 | 
|  | A C# wrapper around Google Test to enhance its user interface | 131 | 
|  | A platform for collecting and disseminating data for global sustainability indicators | 62 | 
|  | A collection of Go-based resources and tools for data science tasks | 879 | 
|  | Training data for a handwriting recognition system | 21 | 
|  | A collection of scanned book pages with ground truth annotations for OCR research and text analysis | 12 | 
|  | A large video dataset collected from various open-source websites for use in computer vision and multimedia applications. | 94 | 
|  | A JavaScript OCR engine using Emscripten compiled C code | 98 | 
|  | An OCR library allowing developers to embed high-quality character recognition functionality in their products. | 18 | 
|  | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,191 |