TabInOut

Table extractor

A framework for extracting information from tables in scientific literature using a rule-based approach.
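
As a rough illustration of what "rule-based" extraction from a table can mean, the sketch below pairs a pattern for column headers with a pattern for cell values. This is not TabInOut's actual API or rule language: the `Rule` class, the `apply_rules` function, and the sample table are all hypothetical and made up for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: which columns to look in and what value to capture."""
    name: str
    header_pattern: str  # regex matched against column headers
    value_pattern: str   # regex with one capture group, applied to cells


def apply_rules(header, rows, rules):
    """Return (rule name, row index, captured value) for every matching cell."""
    results = []
    for rule in rules:
        # Select the columns whose header matches the rule's header pattern.
        cols = [i for i, h in enumerate(header)
                if re.search(rule.header_pattern, h, re.IGNORECASE)]
        for r, row in enumerate(rows):
            for c in cols:
                m = re.search(rule.value_pattern, row[c])
                if m:
                    results.append((rule.name, r, m.group(1)))
    return results


# Made-up table in the style of a clinical-trial summary (illustrative only).
header = ["Group", "Patients (n)", "Mean age (years)"]
rows = [
    ["Placebo", "52", "61.3 ± 8.2"],
    ["Treatment", "49", "59.8 ± 7.9"],
]
rules = [
    Rule("num_patients", r"patients", r"^(\d+)$"),
    Rule("mean_age", r"mean age", r"^(\d+(?:\.\d+)?)"),
]
extracted = apply_rules(header, rows, rules)
```

Keeping rules as data rather than code is one plausible design: a domain expert can add or adjust a rule without touching the matching logic.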

42 stars
5 watching
10 forks
Language: Python
Last commit: almost 6 years ago
Linked from 1 awesome list

Tags: information-extraction, literature, rule-based, rule-engine, rule-language, table-information-extraction, text-mining, wizard-steps

Related projects:

Repository | Description | Stars
nikolamilosevic86/tabledisentangler | A tool for automatically annotating tables in research papers with information about their functions and relationships | 20
xyntopia/pydoxtools | A Python library for extracting information from unstructured documents using AI techniques and customizable pipelines | 78
tabulapdf/tabula-java | Extracts tables from PDF files using Java | 1,859
the-black-knight-01/tabulo | Automated table detection and extraction using deep learning | 198
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638
monarch-initiative/ontogpt | An LLM-based tool for extracting structured information from text with ontology-based grounding | 626
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65
anonyfox/elixir-scrape | A tool for extracting structured data from web resources using information-retrieval techniques | 328
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 71
bikash/documentunderstanding | Research and development of tools and techniques for extracting information from images and PDFs using deep learning and graph neural networks | 96
dice-group/fox | A framework that integrates Linked Data Cloud with diverse NLP algorithms to extract high-accuracy RDF triples from natural language | 191
nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms | 362
nicolay-r/arekit | A toolkit for efficient document-level relation extraction from large text collections | 61
eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9
cocacola-lab/chatie | A framework for extracting information from unannotated text using large language models | 795