An idea I've been thinking about implementing is designing endpoints that let other data engineers send data to Pushshift for permanent archival and indexing. The goal is to help crowdsource data ingest projects and create a centralized repo for data.
This is a great idea, actually. But to do this, you first need to build trust that Pushshift can store and manage the data correctly, because data is such a delicate good to handle.
Another component is how Pushshift is going to treat this data in terms of privacy and other legal requirements.
One idea is to save the data in encrypted form, so that even whoever manages the system is not able to see it.
Very good points! Yeah, we need to make sure we cover all of these issues.

