That 80 GB compressed file covers tweets from just an 18-hour period (if I remember correctly; it was somewhere between 18 and 24 hours). I was ingesting 2,000+ tweets for every second of the timeline.
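As a rough sanity check on those figures, here is a back-of-envelope sketch. It assumes the low end of both numbers (exactly 2,000 tweets per second, exactly 18 hours) and that "80 gig" means 80 decimal gigabytes, so treat the result as a ballpark only.

```python
# Back-of-envelope estimate from the figures quoted above.
TWEETS_PER_SECOND = 2_000        # "2,000+", so this is a lower bound
HOURS = 18                       # low end of the 18-24 hour range
COMPRESSED_BYTES = 80 * 10**9    # assumption: 80 decimal GB

total_tweets = TWEETS_PER_SECOND * HOURS * 3600
bytes_per_tweet = COMPRESSED_BYTES / total_tweets

print(f"{total_tweets:,} tweets")
print(f"{bytes_per_tweet:.0f} compressed bytes per tweet")
```

Under those assumptions that works out to roughly 130 million tweets, at around 600 compressed bytes each.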
Yeah, this method is great for researching specific time periods (it could also be used to do a full-space search right after Trump tweets, to see whether the same accounts / bots retweet each time, etc.). But for keyword search, it wouldn't be helpful. I am exploring ways to find methods ..
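For the time-window idea (checking whether the same accounts retweet quickly every time), a minimal sketch might look like the following. The record layout (`user`, `retweet_of`, `created_at`), the function name `repeat_retweeters`, and the five-minute window are all hypothetical, chosen only for illustration; they are not the format of the actual dump.

```python
from collections import Counter
from datetime import datetime, timedelta

def repeat_retweeters(source_times, retweets,
                      window=timedelta(minutes=5), min_hits=2):
    """Return accounts that retweeted within `window` of at least
    `min_hits` different source tweets.

    source_times: {source_tweet_id: datetime it was posted}
    retweets: iterable of dicts with hypothetical keys
              'user', 'retweet_of', 'created_at'
    """
    hits = Counter()
    for src_id, posted_at in source_times.items():
        # A set, so each account counts at most once per source tweet.
        fast = {
            rt["user"]
            for rt in retweets
            if rt["retweet_of"] == src_id
            and timedelta(0) <= rt["created_at"] - posted_at <= window
        }
        hits.update(fast)
    return {user for user, n in hits.items() if n >= min_hits}

# Tiny made-up example: two source tweets, three accounts.
sources = {1: datetime(2020, 1, 1, 12, 0), 2: datetime(2020, 1, 2, 9, 0)}
sample = [
    {"user": "a", "retweet_of": 1, "created_at": datetime(2020, 1, 1, 12, 1)},
    {"user": "a", "retweet_of": 2, "created_at": datetime(2020, 1, 2, 9, 2)},
    {"user": "b", "retweet_of": 1, "created_at": datetime(2020, 1, 1, 12, 3)},
    {"user": "c", "retweet_of": 2, "created_at": datetime(2020, 1, 2, 11, 0)},
]
suspects = repeat_retweeters(sources, sample)
```

Here only account `a` retweets inside the window for both source tweets. At real volumes the inner scan over all retweets would need to be replaced with something indexed by source tweet ID.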