Due to popular demand, I am now creating thorough documentation for the Pushshift search API:
github.com/pushshift/api
As an example, to find comments posted to Reddit related to Houston over the past 4 hours and find relevant posts: api.pushshift.io/reddit/search/
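A rough sketch of how a query like that could be put together in Python. The link above is cut off after `/reddit/search/`, so the `comment/` path segment and the `q`, `after`, and `size` parameter names used here are assumptions, not something confirmed in this thread; check them against the documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint: the link in the tweet is truncated after /reddit/search/,
# so the "comment/" segment is a guess for illustration only.
BASE_URL = "https://api.pushshift.io/reddit/search/comment/"


def build_search_url(query, after="4h", size=25, base_url=BASE_URL):
    """Build a search URL for comments matching `query`.

    `q`, `after`, and `size` are assumed parameter names; "4h" is an
    assumed shorthand for "the past 4 hours".
    """
    params = {"q": query, "after": after, "size": size}
    return base_url + "?" + urlencode(params)


# Build (but do not send) a query for comments mentioning Houston.
url = build_search_url("houston")
print(url)
```

The function only builds the URL, so it can be inspected or logged before any request is sent.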
Yep! It's intended to be a complement to PRAW. PRAW is a great module, so hopefully this adds a lot of functionality to it.
Do you plan to add an offset/page function, or should we rely on using "before" with created_utc if we want more pages of results?
Using the before / after parameters works best for the time being. Do you think adding pagination to the API would be helpful?
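For anyone wanting more than one page in the meantime, paging with `before` could look roughly like this. It is only a sketch: `fetch_page` is a hypothetical callable that you would write yourself to call the API with a given `before` value, and the code assumes each result carries a `created_utc` field and that `before` is exclusive (only strictly older items come back).

```python
def paginate_before(fetch_page, start_before, max_pages=10):
    """Collect results page by page, walking backward in time.

    fetch_page   -- hypothetical callable: takes a `before` timestamp and
                    returns a list of dicts, each with a "created_utc" key
    start_before -- timestamp to start from (newest bound)
    max_pages    -- safety cap on the number of requests
    """
    results = []
    before = start_before
    for _ in range(max_pages):
        page = fetch_page(before)
        if not page:
            break  # nothing older left
        results.extend(page)
        # The oldest timestamp on this page becomes the next `before`.
        before = min(item["created_utc"] for item in page)
    return results
```

One caveat with this approach: if several items share the exact timestamp at a page boundary, an exclusive `before` can skip some of them, so a real client may want to de-duplicate by ID and re-query the boundary second.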
It assists if you want >500 results (e.g. to do data science), but is ultimately just a shortcut for before/after. Also on older data,
BigQuery would probably be more sane and performant (just my guess - not sure though!)
(e.g. just ask BQ instead of paginating 5 years of submissions)

