yomu
File extractor
A Ruby library for extracting text and metadata from various file formats.
Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
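A minimal sketch of how such an extraction might look, assuming the `yomu` gem and a Java runtime are installed. `Yomu.new`, `#text`, and `#metadata` come from the gem; `SUPPORTED_EXTENSIONS`, `supported_document?`, and `extract_document` are hypothetical helpers introduced only for this illustration, and the extension list simply mirrors the formats named above.

```ruby
# Extensions named in the description above (illustrative, not exhaustive).
SUPPORTED_EXTENSIONS = %w[.doc .docx .pages .odt .rtf .pdf].freeze

# Hypothetical helper: true when the file extension is one of those listed.
def supported_document?(path)
  SUPPORTED_EXTENSIONS.include?(File.extname(path).downcase)
end

# Hypothetical helper: returns a hash with the extracted text and metadata,
# or nil when the extension is not in the list above.
def extract_document(path)
  return nil unless supported_document?(path)

  # Loaded lazily so the helpers above work without the gem installed.
  require 'yomu'

  doc = Yomu.new(path)
  { text: doc.text, metadata: doc.metadata }
end
```

Calling `extract_document('sample.pages')` would then return the document's text and a metadata hash (for example, `result[:metadata]['Content-Type']`), provided the file exists and the gem's dependencies are available.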
499 stars
12 watching
125 forks
Language: Ruby
last commit: over 1 year ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
jonmagic/grim | A tool for extracting pages from PDFs and converting them to images and text strings. | 216 |
jimm/midilib | A Ruby library for reading and writing MIDI file formats | 181 |
exiftool-rb/exiftool.rb | A Ruby library that wraps ExifTool to extract metadata from images and videos. | 71 |
yohasebe/lemmatizer | A Ruby library that provides a lemmatizer for text in English. | 108 |
geemus/formatador | A library for formatting text output, including tables and progress bars. | 451 |
tmm1/emoji-extractor | A Ruby script that extracts high-resolution emoji images from Apple's font files | 558 |
cantino/ruby-readability | A Ruby tool for extracting readable content from web pages. | 925 |
recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 69 |
jkongie/mobi | A Ruby gem to extract metadata from MOBI files | 38 |
rom-rb/rom-yaml | Provides YAML-based data mapping and serialization support for Ruby objects | 28 |
robotools/extractor | A tool for extracting data from font binaries into UFO objects. | 52 |
gunnarmorling/quarkus-pdf-extract | A Quarkus-based microservice to extract text from PDF files | 24 |
yoshoku/rumale | A Ruby machine learning library providing interfaces to various algorithms | 785 |
gomoob/php-metadata-extractor | A PHP wrapper to call the Java metadata-extractor library. | 9 |
coleifer/micawber | A library for extracting metadata and content from URLs | 636 |