I've created a Python module that makes reading the Reddit monthly file dumps easier for researchers. It opens .zst files and reads them line by line without loading the entire file into memory.
github.com/pushshift/zrea
#bigdata #datasets
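For anyone curious what "line by line without loading the entire file into memory" can look like in practice, here is a minimal sketch. It is not the module's actual code or API. The function names `iter_lines` and `iter_zst_lines` are made up for illustration, and it assumes the third-party `zstandard` package, UTF-8 text, and that the dumps need a large decompression window (hence `max_window_size=2**31`).

```python
import codecs


def iter_lines(stream, chunk_size=2**20):
    """Yield text lines from a binary stream, one chunk in memory at a time."""
    # An incremental decoder copes with multi-byte UTF-8 characters
    # that happen to be split across two chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += decoder.decode(chunk)
        lines = buffer.split("\n")
        buffer = lines.pop()  # the last piece may be an incomplete line
        for line in lines:
            yield line
    # Flush whatever is left after the final chunk.
    buffer += decoder.decode(b"", final=True)
    if buffer:
        yield buffer


def iter_zst_lines(path):
    """Yield lines from a .zst file (hypothetical helper, not the module's API)."""
    import zstandard  # third-party: pip install zstandard

    with open(path, "rb") as fh:
        # Assumption: the dump was compressed with a large window, so the
        # default decompressor limit has to be raised.
        dctx = zstandard.ZstdDecompressor(max_window_size=2**31)
        with dctx.stream_reader(fh) as reader:
            yield from iter_lines(reader)
```

Usage would be something like `for line in iter_zst_lines("dump.zst"): ...`, where `dump.zst` is a placeholder file name. Only one decompressed chunk plus the current partial line is held in memory at a time.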
Seriously, I was complaining in my lab some days ago about this!
It does -- but some people want to have the code encapsulated entirely in Python so this should help with that.


