squeeze

Extractor

A tool to extract relevant information from text

Extract rich information from any text (URLs, TODOs, etc.)

17 stars
2 watching
0 forks
Language: Rust
Last commit: almost 3 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| fourdigits/wagtail_textract | A Django package that enhances Wagtail's document search with text extraction capabilities using Tesseract and Textract libraries. | 33 |
| knowitall/reverb | Extracts binary relationships from English sentences at scale | 543 |
| deviantech/rack-referrals | Extracts information about referring search engines from HTTP requests. | 17 |
| coleifer/micawber | A library for extracting metadata and content from URLs | 636 |
| gmarty/xgettext | Tools for extracting translatable strings from source code written in template languages. | 77 |
| cantino/ruby-readability | A tool for extracting readable content from web pages written in Ruby. | 925 |
| steelthread/mimeograph | A CoffeeScript library for extracting text from PDFs and creating searchable files | 28 |
| referefref/aiocrioc | An automated tool that extracts and analyzes indicators of compromise from text data using natural language processing and OCR techniques. | 31 |
| emersonelectricco/boomerang | A tool designed to safely capture off-network web resources for network defense and security analysis | 37 |
| yaa110/rake-rs | A Rust library implementing a keyword extraction algorithm to automatically identify relevant words in text | 33 |
| anonyfox/elixir-scrape | A tool for extracting structured data from web resources using information-retrieval techniques. | 328 |
| recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 69 |
| iseahound/vis2 | An automated OCR tool using computer vision for image text extraction | 159 |
| cocacola-lab/chatie | A framework for extracting information from unannotated text using large language models | 789 |
| feichao93/temme | A lightweight, CSS-based selector for extracting structured data from HTML documents. | 273 |