25.5 million #Gab posts are now available. I have cleaned up the structure, and the file format is ndjson. This is the first (and largest) dump, with additional data coming soon. I'm working on creating searchable ES indexes.
Location: files.pushshift.io/misc/GAB_posts
#datascience #bigdata #datasets
You can parse out whatever fields you want with Python. "jq" is a program on Linux / Mac that makes working with ndjson very easy. Ping me personally and I'll help.
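For anyone who wants a starting point, here is a minimal Python sketch of pulling a few fields out of ndjson, where each line is one JSON object. The field names `id` and `body` and the helper names `iter_ndjson` and `pick_fields` are made up for illustration; check the actual dump for its real field names.

```python
import json

def iter_ndjson(lines):
    """Yield one parsed object per non-empty line of ndjson."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)

def pick_fields(record, fields):
    """Keep only the requested top-level fields (None if a field is missing)."""
    return {name: record.get(name) for name in fields}

# Tiny in-memory sample; "id" and "body" are assumed field names,
# not the confirmed schema of the dump.
sample = [
    '{"id": 1, "body": "hello", "extra": {"a": 1}}',
    '',
    '{"id": 2, "body": "world"}',
]

rows = [pick_fields(rec, ["id", "body"]) for rec in iter_ndjson(sample)]
for row in rows:
    print(row)
```

For the real file, pass an open file handle instead of the sample list (for example `with open(path, encoding="utf-8") as f:` and then `iter_ndjson(f)`), so the data is read one line at a time rather than loaded into memory all at once.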
