minet
Web scraper
A command line tool and Python library for extracting data from various web sources.
A webmining CLI tool & library for Python.
286 stars
15 watching
26 forks
Language: Python
Last commit: about 1 month ago
Tags: cli, python, webmining
Related projects:
Repository | Description | Stars |
---|---|---|
jaimeiniesta/metainspector | A Ruby gem for web scraping and extracting metadata from web pages. | 1,036 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140 |
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
gushonorato/mechanize | A web scraping and automation tool for Elixir. | 30 |
unkl4b/gitminer | An automated tool for gathering code information from GitHub repositories. | 2,092 |
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
needmorecowbell/giggity | A tool to scrape and store hierarchical data about GitHub organizations, users, or repositories. | 126 |
mdsecactivebreach/linkedint | A Python-based tool for extracting and analyzing LinkedIn data for reconnaissance purposes during adversary simulation. | 476 |
martinsbalodis/web-scraper-chrome-extension | A web scraping tool integrated into a Chrome browser extension. | 1,314 |
meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine. | 290 |
manning23/mspider | A Python-based tool for web crawling and data collection from various websites. | 348 |
emersonelectricco/boomerang | A tool designed to safely capture off-network web resources for network defense and security analysis. | 37 |
malfrats/xeuledoc | A tool to fetch information about public Google documents from various services. | 846 |
spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 536 |