MathOCR
Document analyzer
A software project for recognizing and analyzing printed scientific documents, with a particular focus on mathematical expressions.
A scientific document recognition system
167 stars
11 watching
41 forks
Language: Java
Last commit: about 2 years ago
Linked from 1 awesome list
latex, optical-character-recognition, scientific-document-recognition
Related projects:
Repository | Description | Stars |
---|---|---|
jsv4/opencontracts | A document analytics platform providing features for managing documents, extracting layout information and vector embeddings, annotating documents, and querying them using LlamaIndex. | 717 |
bobld/documentlayoutanalysis | Develops tools and algorithms for analyzing the layout and structure of PDF documents | 583 |
icij/datashare | An application that helps investigative journalists analyze and search documents, using natural language processing and entity recognition techniques. | 597 |
mingyuan-xia/patdroid | An Android-specific toolkit for analyzing and understanding APK files | 118 |
open-korean-text/elasticsearch-analysis-openkoreantext | An Elasticsearch analyzer plugin for analyzing Korean text using the Open-Korean Text module. | 127 |
tingxueronghua/chartllama-code | A multimodal LLM for understanding and generating charts in various formats. | 196 |
wangqianwen0418/discrilens | A tool for analyzing and visualizing discrimination in machine learning models | 7 |
ddmcdonald/sparser | A model-driven, rule-based language text analysis system for extracting information from large volumes of text | 57 |
runem/web-component-analyzer | Analyzes web components and emits documentation in various formats | 506 |
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,733 |
ohjeongwook/darungrim | Analyzes software patches to identify vulnerabilities and weaknesses | 359 |
johannesbuchner/languagecheck | A tool to analyze and improve the language of scientific papers before submission. | 97 |
tylabs/qs_old | A tool to analyze and extract malicious content from office documents and executables | 126 |
dlang-community/d-scanner | Analyzes D source code for syntax, style, and security issues | 242 |
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65 |