blacklight-collector
Website scraper
A tool for scraping website content and analyzing browser behavior
202 stars
14 watching
36 forks
Language: TypeScript
Last commit: about 1 month ago

Related projects:
Repository | Description | Stars |
---|---|---|
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 536 |
skallwar/suckit | A Rust-based web scraping tool that recursively visits and downloads websites to disk. | 747 |
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,937 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662 |
propublica/upton | A web scraping framework that simplifies the process by handling repetitive tasks and providing options for efficient data retrieval. | 1,613 |
martinsbalodis/web-scraper-chrome-extension | A web scraping tool integrated into a Chrome browser extension. | 1,314 |
spider-rs/spider | A web crawler and scraper written in Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140 |
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863 |
fimad/scalpel | A web scraping library providing a declarative interface on top of an HTML parsing library to extract data from HTML pages. | 323 |
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable. | 343 |
medialab/minet | A command line tool and Python library for extracting data from various web sources. | 286 |
archiveteam/wpull | Downloads and crawls web pages, allowing for the archiving of websites. | 556 |