ocr-fileformat

OCR file converter

Tool for converting and validating OCR file formats

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

180 stars
20 watching
22 forks
Language: JavaScript
last commit: about 1 month ago
Linked from 1 awesome list

Topics: alto, finereader, hocr, ocr, ocr-d, page-xml, transformation, validation

Related projects:

Repository | Description | Stars
ub-mannheim/ocr-gt-tools | A web-based tool for editing and annotating OCR transcriptions of scanned text | 48
ocropus/hocr-tools | Tools for manipulating and analyzing multi-lingual OCR results by representing them in a standard HTML format | 370
hamdikahloun/windows_ocr | An OCR library allowing developers to embed high-quality character recognition functionality in their products | 18
kba/hocr-spec | A specification for an embedded OCR workflow and output format | 74
cneud/ocr-conversion | A collection of scripts and stylesheets for converting data between different OCR formats | 71
onb-rd/hocrtools | Utilities to process and transform hOCR files into ALTO format using XSLT transformations | 6
suyesh/ocr_space | An API wrapper for a service that converts images to text | 70
manisandro/gimagereader | A software tool that enables the conversion of images and documents into editable text using OCR technology | 1,634
mittagessen/kraken | An OCR system optimized for historical and non-Latin scripts | 748
ryanfb/ancientgreekocr-ocr-evaluation-tools | A collection of tools and scripts to evaluate the accuracy of Optical Character Recognition (OCR) systems | 22
hougesen/mdsf | A tool to format markdown code snippets using various formatters | 23
ibm/max-ocr | An optical character recognition system deployed as a web service using a trained Tesseract OCR model | 47
dobro/uef-lib | A collection of Erlang functions for text manipulation and formatting | 15
gotenberg/gotenberg-php | A PHP client for interacting with a stateless API to convert various document formats into PDF files | 226
eddieantonio/ocreval | A collection of tools and utilities for evaluating the performance and quality of OCR output | 57