QueryList

Web Scraper Framework

A PHP framework for building web scrapers and crawlers with a focus on ease of use and extensibility.

The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

GitHub

3k stars
74 watching
441 forks
Language: PHP
Last commit: 4 months ago
Linked from 1 awesome list

crawler, querylist, scraper, spider

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,537 |
| feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
| veliovgroup/spiderable-middleware | Intercepts requests from web crawlers and proxies them to a prerendering service that renders the HTML. | 38 |
| chenjiandongx/github-spider | A Python-based web crawler for scraping GitHub user and repository data. | 264 |
| zhuyingda/webster | A framework for automating web scraping and crawling tasks using Node.js. | 515 |
| apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 15,604 |
| benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
| kiddyuchina/beanbun | A PHP framework for building distributed web crawlers with modular design and extensibility. | 1,248 |
| technosophos/querypath | A PHP library for manipulating XML and HTML documents, supporting various input formats and offering robust functionality through chaining. | 823 |
| joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 58 |
| mvdbos/php-spider | A flexible PHP web crawler with configurable traversal algorithms and filters. | 1,332 |
| howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling. | 1,752 |
| joseconstela/webparsy | A Node.js library and CLI for scraping websites using Puppeteer and YAML definitions. | 44 |
| xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,827 |
| hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826 |