I've collected approximately 1 million tweets of a political nature since starting a few hours ago. Uncompressed, the size is around 5 gigabytes per one million tweets, so we're looking at around 500 GB for 100 million tweets. Keep in mind that every tweet has ...
...redundant data like the user object embedded in each tweet. Also, each tweet may have other tweets embedded in it (quotes, retweets, etc.). I have a Python script I can make available to separate all of that if needed. But we're looking at about half a terabyte here for everything.
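For anyone curious what that kind of separation might look like, here is a minimal sketch. It is not the script mentioned above. It assumes the tweets are stored one JSON object per line and use Twitter API v1.1 field names (`id_str`, `user`, `retweeted_status`, `quoted_status`); the function names `split_tweet` and `split_lines` are made up for illustration.

```python
import json

# Assumed field names, following the Twitter API v1.1 tweet JSON layout.
EMBEDDED_KEYS = ("retweeted_status", "quoted_status")


def split_tweet(tweet, tweets, users):
    """Flatten one tweet dict into `tweets` and `users`, both keyed by id_str."""
    tweet = dict(tweet)  # shallow copy so the caller's dict is left untouched
    user = tweet.pop("user", None)
    if user is not None:
        users[user["id_str"]] = user       # store each user object once
        tweet["user_id"] = user["id_str"]  # keep only a reference in the tweet
    for key in EMBEDDED_KEYS:
        embedded = tweet.pop(key, None)
        if embedded is not None:
            tweet[key + "_id"] = embedded["id_str"]
            split_tweet(embedded, tweets, users)  # embedded tweets carry users too
    tweets[tweet["id_str"]] = tweet


def split_lines(lines):
    """Split an iterable of JSON lines into (tweets, users) dictionaries."""
    tweets, users = {}, {}
    for line in lines:
        line = line.strip()
        if line:
            split_tweet(json.loads(line), tweets, users)
    return tweets, users
```

Passing an open file object to `split_lines` would work, since it only iterates over lines. This version holds everything in memory, so for a dataset in the hundreds of gigabytes it would probably need to write users and tweets out to separate files or a database instead.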
