Pushshift is processing 250 million tweets related to the Coronavirus and will be publishing a comprehensive list of hashtags associated with all tweets that mention the Coronavirus.
I expect the job to complete by tomorrow morning. The list should contain thousands of hashtags.
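For anyone wanting to build a similar list from their own collection, here is a minimal sketch of hashtag counting. It is not Pushshift's actual pipeline. It assumes each tweet is stored as one JSON object per line and carries hashtags under `entities.hashtags` with a `text` field, as in the standard v1.1 tweet payload. The `count_hashtags` helper and the sample records are made up for illustration.

```python
import json
from collections import Counter


def count_hashtags(lines):
    """Count hashtags across an iterable of JSON-encoded tweets, one per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        tweet = json.loads(line)
        # Assumed layout: tweet["entities"]["hashtags"] is a list of {"text": ...}
        for tag in tweet.get("entities", {}).get("hashtags", []):
            # Lowercase so that #COVID19 and #covid19 are counted together
            counts[tag["text"].lower()] += 1
    return counts


# Illustrative sample records, not real data
sample = [
    '{"id_str": "1", "entities": {"hashtags": [{"text": "COVID19"}, {"text": "coronavirus"}]}}',
    '{"id_str": "2", "entities": {"hashtags": [{"text": "covid19"}]}}',
]

top = count_hashtags(sample).most_common(2)
print(top)  # [('covid19', 2), ('coronavirus', 1)]
```

The most frequent tags from a run like this can then be fed back in as extra search terms.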
Once the job is completed, I will make the hashtag data available. This should help other researchers who are currently collecting Coronavirus-related tweets by giving them more search terms with which to gather data directly related to the pandemic.
Stay tuned!
You're a saint! The volume of tweets exceeded what the data collection routine I threw together could handle.
It's a huge dataset. I'd like to publish all of it, but I first have to check Twitter's TOS for research applications. At a bare minimum, I think I could publish the tweet IDs along with some code to rehydrate them, which should work.
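Rehydrating means turning a list of tweet IDs back into full tweet objects by looking them up in bulk. Below is a rough sketch of the batching side only, not the code promised above. It assumes the lookup endpoint accepts up to 100 IDs per request (the limit of the v1.1 `statuses/lookup` endpoint). The `lookup` callable is a hypothetical stand-in for whatever authenticated API client you use.

```python
from itertools import islice

BATCH_SIZE = 100  # assumption: the lookup endpoint accepts up to 100 IDs per call


def batches(ids, size=BATCH_SIZE):
    """Yield successive lists of at most `size` IDs."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def rehydrate(ids, lookup):
    """Yield tweet objects for `ids`.

    `lookup` is a hypothetical callable that takes a list of IDs and returns
    a list of tweet dicts; deleted or protected tweets are simply absent.
    """
    for chunk in batches(ids):
        for tweet in lookup(chunk):
            yield tweet


def fake_lookup(chunk):
    # Stand-in for a real API call; pretends the tweet with ID "3" was deleted.
    return [{"id_str": i} for i in chunk if i != "3"]


ids = [str(n) for n in range(1, 251)]
tweets = list(rehydrate(ids, fake_lookup))
print(len(tweets))  # 249
```

A real client would also need authentication and rate-limit handling, both left out here.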
Many thanks! Are the tweets and hashtags only in English or also in other languages?
I too am impressed with the volume of COVID-19 data streaming on Twitter. I’m getting 40GB/day of raw JSON data, which is a good approximation of the 1% threshold. People are literally not talking about anything else.