twarc

Data archiver

A tool for archiving Twitter JSON data via the Twitter API

A command line tool (and Python library) for archiving Twitter JSON

GitHub

1k stars
35 watching
255 forks
Language: Python
last commit: about 1 year ago
Linked from 2 awesome lists


Related projects:

Repository | Description | Stars
archivesunleashed/twut | An open-source toolkit for analyzing Twitter archives using Apache Spark. | 9
fisadev/twistorpy | A tool to back up a Twitter user's tweets to a JSON file | 4
simonlindgren/2wttr | Collects and processes tweets from the Twitter API using Academic access | 20
dapivei/tweetple | A Python library that provides a simple interface to stream information from Twitter's Full-Archive Search Endpoint. | 12
janezkranjc/twitter-tap | A tool for collecting tweets from Twitter's search API and storing them in a MongoDB database | 80
twitter/elephant-bird | A collection of input formats and utilities for working with compressed data files in various formats. | 1,138
nla/httrack2warc | Converts HTTrack crawls to WARC files by reconstructing requests and responses from logs | 30
peterk/warcworker | A web archiving tool that archives websites with high-fidelity preservation capabilities. | 55
eldraco/twitter-stats | A tool to retrieve and display Twitter account statistics. | 4
shohil-kishore/twitter-data-toolkit | An easy-to-use web application that collects and merges Twitter data from the v2 API into four JSON files. | 7
n0tan3rd/squidwarc | An archival crawler built on top of Chrome or Chromium to preserve the web in a high-fidelity, user-scriptable manner | 169
ryanmcgrath/twython | Provides access to Twitter data and functionality via a Python interface | 1,855
archiveteam/grab-site | A web crawler designed to back up websites by recursively crawling and writing WARC files. | 1,402
webrecorder/har2warc | Converts HTTP Archive format to Web Archive format | 46
chfoo/warcat | Tool for handling Web Archive files | 150