ocr-fileformat

OCR file converter

Tool for converting and validating OCR file formats

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

182 stars
20 watching
23 forks
Language: JavaScript
Last commit: 2 months ago
Linked from 1 awesome list

Topics: alto, finereader, hocr, ocr, ocr-d, page-xml, transformation, validation

Related projects:

Repository | Description | Stars
ub-mannheim/ocr-gt-tools | A web-based tool for editing and annotating OCR transcriptions of scanned text | 48
ocropus/hocr-tools | Tools for manipulating and analyzing multi-lingual OCR results by representing them in a standard HTML format | 373
hamdikahloun/windows_ocr | An OCR library allowing developers to embed high-quality character recognition functionality in their products | 18
kba/hocr-spec | A specification for an embedded OCR workflow and output format | 74
cneud/ocr-conversion | A collection of scripts and stylesheets for converting data between different OCR formats | 72
onb-rd/hocrtools | Utilities to process and transform hOCR files into ALTO format using XSLT transformations | 6
suyesh/ocr_space | An API wrapper for a service that converts images to text | 70
manisandro/gimagereader | A software tool that enables the conversion of images and documents into editable text using OCR technology | 1,653
mittagessen/kraken | An OCR system optimized for historical and non-Latin scripts, providing layout analysis, character recognition, and support for various formats | 757
ryanfb/ancientgreekocr-ocr-evaluation-tools | A collection of tools and scripts to evaluate the accuracy of Optical Character Recognition (OCR) systems | 22
hougesen/mdsf | A tool for formatting and verifying markdown code with various programming languages and formatters | 26
ibm/max-ocr | An optical character recognition system deployed as a web service using a trained Tesseract OCR model | 47
dobro/uef-lib | A collection of Erlang functions for text manipulation and formatting | 15
gotenberg/gotenberg-php | A PHP client for interacting with a stateless API to convert various document formats into PDF files | 237
eddieantonio/ocreval | A collection of tools and utilities for evaluating the performance and quality of OCR output | 57