The Kavanaugh #dataset has been published. This collection contains over 56 million tweets from 3.2 million unique Twitter users spanning the nomination process. The dataset is ~315 GB uncompressed / 11 GB compressed. pushshift.io/kavanaugh-twit
#bigdata #datascience #nlp
Let’s get to work this weekend. We’re gonna need a bigger boa-I mean computer.
The best way to do this would probably be to use Spark and Hadoop for the neural net, but that would cost a lot.
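For a first pass that avoids a cluster entirely, a single machine can stream the compressed file instead of unpacking all ~315 GB. What follows is only a rough sketch under assumptions the announcement does not confirm: that the dump is gzip-compressed, newline-delimited JSON, and that each tweet carries a `user.id` field. `count_tweets_per_user` is a hypothetical helper name.

```python
import gzip
import json
from collections import Counter


def count_tweets_per_user(path):
    """Stream a gzip-compressed NDJSON file and count tweets per user ID.

    Assumed layout (not confirmed by the dataset announcement):
    one JSON object per line, with the author under tweet["user"]["id"].
    """
    counts = Counter()
    # "rt" decompresses on the fly and yields text lines one at a time,
    # so the full uncompressed file never has to exist on disk or in memory.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                tweet = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than abort the run
            if not isinstance(tweet, dict):
                continue
            user = tweet.get("user") or {}
            user_id = user.get("id") if isinstance(user, dict) else None
            if user_id is not None:
                counts[user_id] += 1
    return counts
```

Called as `count_tweets_per_user("tweets.ndjson.gz")` (a hypothetical filename), this keeps only the per-user counter in memory, so the footprint grows with the number of users, not the number of tweets.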


