TabInOut
Table extractor
A framework for extracting information from tables in scientific literature using a rule-based approach.
42 stars
5 watching
10 forks
Language: Python
Last commit: almost 6 years ago
Linked from 1 awesome list
Topics: information-extraction, literature, rule-based, rule-engine, rule-language, table-information-extraction, text-mining, wizard-steps
Related projects:
Repository | Description | Stars |
---|---|---|
nikolamilosevic86/tabledisentangler | A tool for automatically annotating tables in research papers with information about their functions and relationships. | 20 |
xyntopia/pydoxtools | A Python library for extracting information from unstructured documents using AI techniques and customizable pipelines. | 78 |
tabulapdf/tabula-java | Extracts tables from PDF files using Java | 1,859 |
the-black-knight-01/tabulo | Automated table detection and extraction using deep learning | 198 |
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638 |
monarch-initiative/ontogpt | An LLM-based tool for extracting structured information from text with ontology-based grounding. | 626 |
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65 |
anonyfox/elixir-scrape | A tool for extracting structured data from web resources using information-retrieval techniques. | 328 |
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 71 |
bikash/documentunderstanding | Research and development of tools and techniques for extracting information from images and PDFs using deep learning and graph neural networks. | 96 |
dice-group/fox | A framework that integrates Linked Data Cloud with diverse NLP algorithms to extract high-accuracy RDF triples from natural language. | 191 |
nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms. | 362 |
nicolay-r/arekit | A toolkit for efficient document-level relation extraction from large text collections. | 61 |
eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9 |
cocacola-lab/chatie | A framework for extracting information from unannotated text using large language models | 795 |