ocr-fileformat
OCR file converter
Tool for converting and validating OCR file formats
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
182 stars
20 watching
23 forks
Language: JavaScript
Last commit: 2 months ago
Linked from 1 awesome list
Tags: alto, finereader, hocr, ocr, ocr-d, page-xml, transformation, validation
Related projects:
Repository | Description | Stars |
---|---|---|
ub-mannheim/ocr-gt-tools | A web-based tool for editing and annotating OCR transcriptions of scanned text | 48 |
ocropus/hocr-tools | Tools for manipulating and analyzing multilingual OCR results stored in hOCR, a standard HTML-based format | 373 |
hamdikahloun/windows_ocr | An OCR library allowing developers to embed high-quality character recognition functionality in their products. | 18 |
kba/hocr-spec | A specification for an embedded OCR workflow and output format | 74 |
cneud/ocr-conversion | A collection of scripts and stylesheets for converting data between different OCR formats. | 72 |
onb-rd/hocrtools | Utilities to process and transform hOCR files into ALTO format using XSLT transformations | 6 |
suyesh/ocr_space | An API wrapper for a service that converts images to text | 70 |
manisandro/gimagereader | A tool that converts images and documents into editable text using OCR | 1,653 |
mittagessen/kraken | An OCR system optimized for historical and non-Latin scripts, providing layout analysis, character recognition, and support for various formats. | 757 |
ryanfb/ancientgreekocr-ocr-evaluation-tools | A collection of tools and scripts to evaluate the accuracy of Optical Character Recognition (OCR) systems | 22 |
hougesen/mdsf | A tool for formatting and verifying markdown code with various programming languages and formatters. | 26 |
ibm/max-ocr | An optical character recognition system deployed as a web service using a trained Tesseract OCR model | 47 |
dobro/uef-lib | A collection of Erlang functions for text manipulation and formatting | 15 |
gotenberg/gotenberg-php | A PHP client for interacting with a stateless API to convert various document formats into PDF files | 237 |
eddieantonio/ocreval | A collection of tools and utilities for evaluating the performance and quality of OCR output | 57 |