I keep running into a bug when scraping #reddit comments with #Python's #PRAW package. Anyone in my network able to help me debug? #computationalsocialscience #soctwitter
Assuming you already checked, but just in case, do the public archives of "all" comments meet your needs? E.g. the files at files.pushshift.io/reddit/comments
Or the "real-time" / SQL interface in BigQuery?
It does. I was just invested in using PRAW because I had already put so much time into trying to get it fully functioning by the time someone mentioned it. It's looking like I'll need to transition.
Exactly, and for some reason I see using Python for the scraping as being higher in cultural capital, but at the end of the day it's not working and I need to use a resource that works.
Are you working with all subreddits or just certain ones? Happy to help.
Two different datasets:
one with all threads (+ thread comments collapsed as a thread corpus) for about 15 specific subreddits over a pre-defined time period
second very similar corpus structure, but pulling threads that include specific keywords from 4 more general subreddits
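For what it's worth, here is a rough sketch of how both corpora could be built from the archived comment files rather than through PRAW. It assumes the dump has already been decompressed to newline-delimited JSON and that each comment carries the `subreddit`, `created_utc`, `link_id`, and `body` fields. The function name `build_thread_corpus` and its parameters are hypothetical, and the keyword match here only looks at comment text, since comment dumps do not include the submission title or selftext.

```python
import json
from collections import defaultdict


def build_thread_corpus(lines, subreddits, start_utc, end_utc, keywords=None):
    """Collapse comments into one text blob per thread.

    lines      -- iterable of JSON strings, one comment per line (assumed format)
    subreddits -- subreddit names to keep (case-insensitive)
    start_utc, end_utc -- Unix timestamps; keeps start_utc <= created_utc < end_utc
    keywords   -- optional list; keep only threads whose collapsed text
                  contains at least one keyword (case-insensitive)
    """
    wanted = {s.lower() for s in subreddits}
    threads = defaultdict(list)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        comment = json.loads(line)
        if comment.get("subreddit", "").lower() not in wanted:
            continue
        # created_utc may be stored as a string or a number, so coerce it.
        created = int(comment.get("created_utc", 0))
        if not (start_utc <= created < end_utc):
            continue
        # link_id identifies the thread (submission) a comment belongs to.
        threads[comment["link_id"]].append((created, comment.get("body", "")))

    corpus = {}
    for link_id, comments in threads.items():
        comments.sort(key=lambda pair: pair[0])  # chronological order
        text = "\n".join(body for _, body in comments)
        if keywords and not any(k.lower() in text.lower() for k in keywords):
            continue
        corpus[link_id] = text
    return corpus
```

For the first dataset you would pass the ~15 subreddit names and the time window with `keywords=None`; for the second, pass the 4 general subreddits plus the keyword list. In practice `lines` could simply be an open file handle over the decompressed dump.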
It would be great to have some help, because the SQL queries from the BigQuery guide that was sent over (pushshift.io/using-bigquery) aren't working with Google BigQuery's new update.
(You will also want to use Felipe's tables for the most recent data)


