blacklight-collector

Website scraper

A tool for scraping website content and analyzing browser behavior.

GitHub

202 stars
14 watching
36 forks
Language: TypeScript
last commit: about 1 month ago

Related projects:

Repository | Description | Stars
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681
spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 536
skallwar/suckit | A Rust-based web scraping tool that recursively visits and downloads websites to disk. | 747
rust-scraper/scraper | A Rust library for parsing and querying HTML documents using CSS selectors. | 1,937
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662
propublica/upton | A web scraping framework that simplifies the process by handling repetitive tasks and provides options for efficient data retrieval. | 1,613
martinsbalodis/web-scraper-chrome-extension | A web scraping tool integrated into a Chrome browser extension. | 1,314
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863
fimad/scalpel | A web scraping library providing a declarative interface on top of an HTML parsing library to extract data from HTML pages. | 323
tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable. | 343
medialab/minet | A command line tool and Python library for extracting data from various web sources. | 286
archiveteam/wpull | Downloads and crawls web pages, allowing for the archiving of websites. | 556