distribute_crawler

Crawler framework

A distributed web crawler framework using Scrapy, Redis, MongoDB, and Graphite for efficient crawling and data storage.

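A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis coordinates the distributed workers, and Graphite displays crawler status.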
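
The description above says only that Redis provides the distribution layer. As a rough illustration of what that can mean, the following is a minimal sketch of a URL queue and seen-set shared through Redis, so that several crawler processes pull from one frontier and skip URLs another process has already queued. It is not taken from the project's code: the class name `SharedFrontier`, the helper `fingerprint`, and the key names `crawler:queue` and `crawler:seen` are all hypothetical. The `client` argument is assumed to behave like a redis-py client for the `sadd`, `lpush`, and `rpop` commands.

```python
import hashlib


def fingerprint(url: str) -> str:
    """Return a stable hex digest used to recognise an already-seen URL."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


class SharedFrontier:
    """URL queue plus seen-set kept in Redis so many workers can share them.

    `client` is any object exposing sadd, lpush and rpop with redis-py
    semantics, for example redis.Redis(decode_responses=True).
    The class and the default key names are hypothetical.
    """

    def __init__(self, client, queue_key="crawler:queue", seen_key="crawler:seen"):
        self.client = client
        self.queue_key = queue_key
        self.seen_key = seen_key

    def push(self, url: str) -> bool:
        """Enqueue url unless some worker has already seen it."""
        # SADD returns 1 only when the member is new, so the duplicate check
        # and the "mark as seen" step happen in one atomic Redis command.
        if self.client.sadd(self.seen_key, fingerprint(url)) == 0:
            return False
        self.client.lpush(self.queue_key, url)
        return True

    def pop(self):
        """Return the oldest queued URL, or None when the queue is empty."""
        # LPUSH at the head and RPOP from the tail gives first-in, first-out.
        return self.client.rpop(self.queue_key)
```

Because every worker talks to the same two Redis keys, adding another crawler process needs no coordination beyond pointing it at the same Redis server.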

GitHub

3k stars
367 watching
2k forks
Language: Python
Last commit: over 7 years ago
Linked from 1 awesome list

Related projects:

Repository | Description | Stars
mongodb/mongo | A high-performance, NoSQL document-oriented database system written in C++ | 26,366
cnodejs/nodeclub | A Node.js-based community platform with MongoDB and Redis integration | 9,335
mongodb/mongoid | An Object-Document Mapper framework for interacting with MongoDB databases in Ruby | 3,916
sebdah/scrapy-mongodb | A MongoDB pipeline extension for Scrapy spiders that enables real-time data insertion and buffering options | 357
simi/mongoid_paranoia | Provides soft-delete functionality for Mongoid documents | 122
minhhungit/mongodb-cluster-docker-compose | A Docker Compose setup for a sharded MongoDB cluster with replication | 503
ienaga/redisplugin | A Redis-based sharding solution for Phalcon applications | 16
mmmaaaggg/ibats_huobifeeder_old | Automates real-time market data retrieval and storage from the Huobi exchange, publishing updates to Redis for use in backtesting and analysis | 39
databasecleaner/database_cleaner-mongo | A Ruby gem that provides a clean and efficient way to delete data from MongoDB databases | 2
elbywan/cryomongo | A MongoDB driver written in Crystal that provides a high-performance interface to the MongoDB database | 72
nkrode/redislive | An application that visualizes Redis instances and analyzes query patterns and spikes | 3,072
pkosiec/mongo-seeding | Tools and libraries for importing data into MongoDB databases | 554
duckie/mongo_smasher | Generates randomized data for testing MongoDB databases | 34
refty/mongo-thingy | A Python library providing an object-document mapper for MongoDB with support for synchronous and asynchronous operations | 69
philsmd/mongodb2hashcat | Extracts hashes from MongoDB databases in SCRAM-SHA-1 or SCRAM-SHA-256 format for use with hashcat | 7