After testing, it appears that Pushshift can ingest, index and archive 10 million Telegram messages per day. With some additional tweaking, I should be able to push that up to 50-100 million per day.
I'm uncertain how many total messages there are, but we definitely have the capacity ...
to stream a good chunk of the total messages. I am still working on improving the code.
More updates to come soon!
#bigdata #datascience
