twarc
Data archiver
A command-line tool and Python library for archiving Twitter JSON data via the Twitter API
1k stars
35 watching
255 forks
Language: Python
last commit: about 1 year ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
archivesunleashed/twut | An open-source toolkit for analyzing Twitter archives using Apache Spark. | 9 |
fisadev/twistorpy | A tool to back up a Twitter user's tweets to a JSON file | 4 |
simonlindgren/2wttr | Collects and processes tweets from the Twitter API using Academic access | 20 |
dapivei/tweetple | A Python library that provides a simple interface to stream information from Twitter's Full-Archive Search Endpoint. | 12 |
janezkranjc/twitter-tap | A tool for collecting tweets from Twitter's search API and storing them in a MongoDB database | 80 |
twitter/elephant-bird | A collection of input formats and utilities for working with compressed data files in various formats. | 1,138 |
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 30 |
peterk/warcworker | A web archiving tool for high-fidelity preservation of websites. | 55 |
eldraco/twitter-stats | A tool to retrieve and display Twitter account statistics. | 4 |
shohil-kishore/twitter-data-toolkit | A tool to collect and aggregate Twitter data using the v2 API for research purposes. | 7 |
n0tan3rd/squidwarc | An archival crawler built on top of Chrome or Chromium that preserves the web in a high-fidelity, user-scriptable manner | 169 |
ryanmcgrath/twython | Provides access to Twitter data and functionality via a Python interface | 1,855 |
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling them and writing WARC files. | 1,398 |
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 46 |
chfoo/warcat | Tool for handling Web Archive files | 150 |