TabInOut
Table extractor
A framework for extracting information from tables in scientific literature using a rule-based approach.
Framework for information extraction from tables
41 stars
5 watching
10 forks
Language: Python
Last commit: over 5 years ago
Linked from 1 awesome list
information-extraction, literature, rule-based, rule-engine, rule-language, table-information-extraction, text-mining, wizard-steps
Related projects:
Repository | Description | Stars |
---|---|---|
nikolamilosevic86/tabledisentangler | A tool for automatically annotating tables in research papers with information about their functions and relationships. | 20 |
xyntopia/pydoxtools | A Python library for extracting information from unstructured documents using AI techniques and customizable pipelines. | 77 |
tabulapdf/tabula-java | Extracts tables from PDF files using Java | 1,843 |
the-black-knight-01/tabulo | Automated table detection and extraction using deep learning | 198 |
eyurtsev/kor | Extracts structured data from unstructured text using large language models | 1,629 |
monarch-initiative/ontogpt | An LLM-based tool for extracting structured information from text with ontology-based grounding. | 609 |
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65 |
anonyfox/elixir-scrape | A tool for extracting structured data from web resources using information-retrieval techniques. | 328 |
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 69 |
bikash/documentunderstanding | Research and development of tools and techniques for extracting information from images and PDFs using deep learning and graph neural networks. | 96 |
dice-group/fox | A framework that integrates Linked Data Cloud with diverse NLP algorithms to extract high-accuracy RDF triples from natural language. | 191 |
nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms. | 359 |
nicolay-r/arekit | A toolkit for efficiently processing large text collections and extracting relations between objects in documents. | 59 |
eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9 |
cocacola-lab/chatie | A framework for extracting information from unannotated text using large language models | 789 |