#Gab data update -- we have worked hard to fill in all remaining gaps within the dataset and have normalized / flattened the data to retain the proper post hierarchy. A few members of our team have been working hard to collect all media associated with Gab posts. So far, we ...
..have collected millions of media files associated with the Gab dataset so that researchers can better understand and analyze what type of media is associated with Gab posts. We are still working very hard to create a comprehensive corpus for all Gab content. #bigdata #datasets
This Tweet was deleted by the Tweet author. Learn more
It's a file of JSON objects separated by new lines. There are a few utilities out there that can read the data including some short Python scripts. It depends on your platform and level of knowledge to help suggest the best tools.
The format is platform agnostic. You can use Python, Perl, Java, Visual Basic, etc. For Python, reading the file is as trivial as:

    for line in sys.stdin:
        data = json.loads(line)

The data variable would now be a dict object of a Gab post.
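For anyone who wants something slightly more complete than the two-liner, here is a minimal sketch of the same idea with the imports included. The helper name `iter_posts` is hypothetical, and the blank-line handling is an assumption about what a dump might contain, not something stated about the Gab files.

```python
import json


def iter_posts(lines):
    """Yield one dict per non-empty line of newline-delimited JSON.

    `lines` can be any iterable of strings, e.g. sys.stdin or an open file.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: skip blank lines rather than fail on them
        yield json.loads(line)
```

Passing `sys.stdin` to `iter_posts` reproduces the loop from the tweet when the file is piped in; passing an open file object works the same way. Because it is a generator, only one post is held in memory at a time, which matters for a large dump.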