If you are interested in data for all of the Stack Overflow sites (over 100), Pushshift currently has all questions, answers and comments for all Stack sites available for research. There is an older dump via the Internet Archive, but that data is a pain in the ass to work with and is also in XML format. Our data is in ndjson format and has all available fields for each type of data (questions, answers, comments and users).
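
For anyone who hasn't worked with ndjson before: it is just one JSON object per line, so a file can be streamed record by record without loading everything into memory. Here is a minimal sketch of reading it in Python. The helper name `iter_ndjson` and the sample field names (`id`, `type`, `title`, `parent_id`) are made up for illustration and are not necessarily the fields in the actual dumps.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines instead of failing on them
            yield json.loads(line)


# Hypothetical sample records; the real field names may differ.
sample = io.StringIO(
    '{"id": 1, "type": "question", "title": "How do I parse ndjson?"}\n'
    '{"id": 2, "type": "answer", "parent_id": 1}\n'
)

records = list(iter_ndjson(sample))
```

With a real file you would pass an open file handle instead of the `io.StringIO` sample, e.g. `with open(path, encoding="utf-8") as f:` and then loop over `iter_ndjson(f)`.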


