metagoofil
Document scraper
Extracts metadata from public documents available on websites
Metadata harvester
1k stars
58 watching
205 forks
Language: Python
Last commit: 8 months ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
gomoob/php-metadata-extractor | A PHP wrapper to call the Java metadata-extractor library. | 9 |
meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine | 290 |
jaimeiniesta/metainspector | A Ruby gem for web scraping and extracting metadata from web pages. | 1,036 |
needmorecowbell/giggity | A tool to scrape and store hierarchical data about GitHub organizations, users, or repositories. | 126 |
erikriver/opengraph | A Python module to extract and parse metadata from web pages using the Open Graph Protocol. | 228 |
unkl4b/gitminer | Automated tool for gathering code information from GitHub repositories | 2,092 |
neon-jungle/wagtail-metadata | A tool to help with metadata for search engines and social media platforms. | 116 |
barasher/go-exiftool | A Go wrapper around ExifTool to extract metadata from various file types. | 252 |
davemolk/gogetjs | Tools for extracting and analyzing JavaScript files from web pages | 40 |
pachterlab/ffq | A tool to fetch and display metadata from various public databases | 551 |
jkongie/mobi | A Ruby gem to extract metadata from MOBI files | 38 |
michaelhelmick/lassie | Library for retrieving basic content from websites | 613 |
jgomezdans/get_modis | Downloads MODIS data from the USGS repository using a standardized interface | 62 |
aantron/lambdasoup | A functional HTML scraping and manipulation library | 383 |
holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,153 |