I should have the #kavanaugh Twitter dataset ready for download on Friday. I am doing some sanity checks on the data and prepping it for the dump. I'll have a more accurate number for the tweet count in a day or two.
#datasets #bigdata #datascience
The current size of the dataset uncompressed is around 375 GB of tweets. That should be over 100 million tweets.
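As a rough sanity check on those two figures, here is a minimal sketch of the arithmetic. It assumes 1 GB = 10^9 bytes and treats 100 million as a floor on the tweet count; the function name is made up for illustration and is not part of any tooling mentioned in the thread.

```python
def average_bytes_per_tweet(total_bytes, tweet_count):
    """Average uncompressed size of one stored tweet, in bytes."""
    return total_bytes / tweet_count


# Figures from the thread; the decimal definition of a gigabyte is an assumption.
DATASET_BYTES = 375 * 10**9
TWEET_COUNT = 100 * 10**6

avg = average_bytes_per_tweet(DATASET_BYTES, TWEET_COUNT)
print(f"{avg:.0f} bytes per tweet")  # prints "3750 bytes per tweet"
```

So "over 100 million tweets" in 375 GB works out to an average of at most about 3.75 KB per stored tweet, metadata included.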
Will there be handles or bios attached to each tweet? I was thinking I would use keywords like maga or resist, etc. in those to act as a kind of negative/positive labeling method.
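The bio-keyword idea could be sketched roughly like this. Only the keywords "maga" and "resist" come from the thread; the function name, the choice of which keyword maps to "positive" versus "negative", and the rule of leaving mixed or empty bios unlabeled are all assumptions made for illustration.

```python
import re

# Hypothetical keyword sets. Which side is called "positive" is arbitrary here.
POSITIVE_KEYWORDS = {"maga"}
NEGATIVE_KEYWORDS = {"resist"}


def label_from_bio(bio):
    """Return 'positive', 'negative', or None when the bio gives no clear signal."""
    # Lowercase and split into alphanumeric tokens, so "#MAGA" becomes "maga".
    tokens = set(re.findall(r"[a-z0-9]+", (bio or "").lower()))
    has_pos = bool(tokens & POSITIVE_KEYWORDS)
    has_neg = bool(tokens & NEGATIVE_KEYWORDS)
    if has_pos == has_neg:
        # Neither keyword set matched, or both did: leave the account unlabeled.
        return None
    return "positive" if has_pos else "negative"


print(label_from_bio("Proud dad. #MAGA"))    # positive
print(label_from_bio("Teacher. #Resist"))    # negative
print(label_from_bio("Coffee and hiking"))   # None
```

Matching is on whole tokens, so a bio containing only "resistance" would stay unlabeled in this sketch. Labels produced this way are noisy and would probably need spot checks before being used as training data.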
I have a tweet classification neural network I built as a side project basically ready to go for it. It was built for 140-character tweets, though, so I'll need a few adjustments.
No, I just need to make some topology design decisions. It uses convolution and a GRU recurrent layer. Just a question of what adjustments to make. The code changes will be easy.
Does this program do sentiment analysis on data? Do you have a website showing some previous work that was done using this program? I am very interested in learning more! Thank you!