pspider

Web crawler

A parallel web crawler framework built using PHP and MySQLi

A parallel crawling tool written in pure PHP

GitHub

266 stars
41 watching
110 forks
Language: PHP
Last commit: over 9 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage | 1,828
manning23/mspider | A Python-based tool for web crawling and data collection from various websites | 348
crawlzone/crawlzone | A PHP framework for asynchronous internet crawling and web scraping | 78
mvdbos/php-spider | A flexible PHP web crawler with configurable traversal algorithms and filters | 1,336
jmg/crawley | A Pythonic framework for building high-speed web crawlers with flexible data extraction and storage options | 188
uscdatascience/sparkler | A high-performance web crawler built on Apache Spark that fetches and analyzes web resources in real time | 411
postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites | 809
stewartmckee/cobweb | A flexible web crawler that can be used to extract data from websites in a scalable and efficient manner | 226
rivermont/spidy | A simple command-line web crawler that automatically extracts links from web pages and can be run in parallel for efficient crawling | 340
cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks | 188
hominee/dyer | A fast and flexible web crawling tool with features like asynchronous I/O and event-driven design | 135
bplawler/crawler | A Scala-based DSL for programmatically accessing and interacting with web pages | 149
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits, and concurrency limits, with customizable content handlers for parsing and processing crawled pages | 380
kiddyuchina/beanbun | A PHP framework for building distributed web crawlers with modular design and extensibility | 1,249
wspl/creeper | A framework for building cross-platform web crawlers using Go | 780