scrapy-mongodb

Data pipeline

A MongoDB pipeline extension for Scrapy spiders that enables real-time data insertion and buffering options.

MongoDB pipeline for Scrapy. This module supports both standalone MongoDB deployments and replica sets. scrapy-mongodb inserts items into MongoDB as soon as your spider extracts them.
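
To illustrate the difference between immediate insertion and buffering, here is a minimal sketch of a Scrapy-style item pipeline. It is not scrapy-mongodb's actual code: the class names `BufferedMongoPipeline` and `InMemoryCollection` and the `buffer_size` parameter are hypothetical, and the in-memory collection only stands in for a PyMongo collection so the example runs without a database.

```python
class BufferedMongoPipeline:
    """Hypothetical sketch of an item pipeline with optional buffering."""

    def __init__(self, collection, buffer_size=0):
        # `collection` is assumed to expose insert_one / insert_many,
        # as a PyMongo collection does.
        self.collection = collection
        self.buffer_size = buffer_size
        self.buffer = []

    def process_item(self, item, spider=None):
        doc = dict(item)
        if self.buffer_size <= 1:
            # No buffering: write each item as soon as it arrives.
            self.collection.insert_one(doc)
        else:
            # Buffering: collect items and write them in batches.
            self.buffer.append(doc)
            if len(self.buffer) >= self.buffer_size:
                self.flush()
        return item

    def flush(self):
        if self.buffer:
            self.collection.insert_many(self.buffer)
            self.buffer = []

    def close_spider(self, spider=None):
        # Write whatever is still buffered when the spider finishes.
        self.flush()


class InMemoryCollection:
    """Stand-in for a MongoDB collection (assumption, for demonstration)."""

    def __init__(self):
        self.docs = []
        self.calls = 0  # number of write operations issued

    def insert_one(self, doc):
        self.docs.append(doc)
        self.calls += 1

    def insert_many(self, docs):
        self.docs.extend(docs)
        self.calls += 1


collection = InMemoryCollection()
pipeline = BufferedMongoPipeline(collection, buffer_size=2)
for n in range(3):
    pipeline.process_item({"n": n})
pipeline.close_spider()
```

With `buffer_size=2`, the three items above are stored in two write operations: one batch of two, then one leftover item flushed when the spider closes. Buffering reduces the number of round trips, but items are no longer visible in the database the moment they are scraped.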

GitHub

357 stars
26 watching
99 forks
Language: Python
last commit: over 3 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| yougov/mongo-connector | Enables real-time data synchronization between MongoDB and other systems. | 1,880 |
| mongodb/pymodm | Provides an object-oriented interface to MongoDB | 354 |
| scille/umongo | A Python library for interacting with MongoDB using object-document mapping and asynchronous support | 448 |
| refty/mongo-thingy | A Python library providing an object-document mapper for MongoDB with support for synchronous and asynchronous operations. | 69 |
| emacsorphanage/mongo | Provides a way to interact with MongoDB databases using Emacs Lisp | 47 |
| markroddy/duckdb-pytables | An extension for DuckDB that allows running SQL queries on arbitrary data sources using Python functions. | 83 |
| mongodb-labs/pymongoexplain | Provides a simplified interface to explain MongoDB commands in PyMongo | 3 |
| davidlatwe/montydb | A pure Python-implemented alternative to MongoDB. | 583 |
| mehd-io/pypi-duck-flow | A project to build data pipelines and visualizations for analyzing Python package download data from PyPi. | 148 |
| holgerd77/django-dynamic-scraper | An app that allows you to manage Scrapy spiders through a Django admin interface. | 1,153 |
| enterprisedb/mongo_fdw | A PostgreSQL extension that enables interaction with MongoDB databases through foreign data wrappers. | 330 |
| msamogh/nonechucks | Library that provides dynamic data cleaning and filtering capabilities for PyTorch datasets and samplers | 377 |
| rick446/mmm | A tool for setting up multi-master replication with MongoDB | 69 |
| doableware/djongo | Provides a bridge between Django and MongoDB databases | 1,886 |
| emicklei/mora | A RESTful API for interacting with MongoDB databases | 315 |