distribute_crawler
Crawler framework
A distributed web crawler framework built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis coordinates the distributed crawling, and Graphite displays crawler status.
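The core of the distribution model is a crawl frontier shared through Redis: workers pop URLs from a common Redis queue, and a Redis set of request fingerprints deduplicates work across processes (this is the pattern the scrapy-redis library implements with its `Scheduler` and `RFPDupeFilter`). Below is a minimal single-process sketch of that idea using in-memory stand-ins for the Redis structures; `SharedFrontier` is a hypothetical name for illustration, not an API from this project.

```python
from collections import deque

class SharedFrontier:
    """In-memory stand-in for the Redis-backed frontier used in
    distributed crawling: in a real deployment `queue` would be a
    Redis list and `seen` a Redis set, so every worker process
    pops from (and deduplicates against) the same crawl state."""

    def __init__(self):
        self.queue = deque()  # pending URLs, FIFO
        self.seen = set()     # fingerprints of URLs already scheduled

    def push(self, url):
        # Deduplicate before enqueueing, analogous to checking a
        # request fingerprint against a shared Redis set.
        if url not in self.seen:
            self.seen.add(url)
            self.queue.append(url)

    def pop(self):
        # A worker takes the next URL; returns None when the
        # frontier is empty.
        return self.queue.popleft() if self.queue else None

frontier = SharedFrontier()
frontier.push("https://example.com/a")
frontier.push("https://example.com/a")  # duplicate, ignored
frontier.push("https://example.com/b")
print(frontier.pop())  # https://example.com/a
```

Because both the queue and the dedup set live in one Redis instance, any number of spider processes on different machines can share this frontier without coordinating with each other directly.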
3k stars
367 watching
2k forks
Language: Python
last commit: almost 8 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A high-performance NoSQL document-oriented database system written in C++ | 26,503 |
| | A Node.js-based community platform with MongoDB and Redis integration | 9,335 |
| | An Object-Document Mapper framework for interacting with MongoDB databases in Ruby | 3,916 |
| | A MongoDB pipeline extension for Scrapy spiders that enables real-time data insertion and buffering options | 357 |
| | Provides 'soft delete' functionality for Mongoid documents | 122 |
| | A Docker Compose setup for a sharded MongoDB cluster with replication | 510 |
| | A Redis-based sharding solution for Phalcon applications | 16 |
| | Automates real-time market data retrieval and storage from the Huobi exchange, publishing updates to Redis for backtesting and analysis | 39 |
| | A Ruby gem that provides a clean and efficient way to delete data from MongoDB databases | 2 |
| | A MongoDB driver written in Crystal that provides a high-performance interface to MongoDB | 72 |
| | An application that visualizes Redis instances and analyzes query patterns and spikes | 3,071 |
| | Tools and libraries for importing data into MongoDB databases | 555 |
| | Generates randomized data for testing MongoDB databases | 34 |
| | A Python library providing an object-document mapper for MongoDB with synchronous and asynchronous support | 69 |
| | Extracts hashes from MongoDB databases in SCRAM-SHA-1 or SCRAM-SHA-256 format for use with hashcat | 7 |