3/ You've also probably seen hints at the vision that's in my head. But, translating that will take time and a course heading is not enough. So, the first thing I want to do — my first objective — is, > To make Twitter data more accessible to researchers.
4/ If you follow me, there is a good chance you're into statistics, machine learning, or data science. But, if you're an academic working in isolation, you may not have time to experiment with the Twitter API and set up the infrastructure for collection.
5/ I want to build a simple interface so you can do that part quickly and on an ad hoc basis. I want to lower the barriers to entry, so that we can get more eyeballs on the problem.
6/ To that end, my first key result will be developing,
$ brittlewit sample | gzip > tweets.jsonl.gz
This is the lowest hanging fruit.
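For readers who want to consume the output of the command above, here is a minimal Python sketch of reading the compressed file back. It assumes, based only on the `.jsonl.gz` extension, that the file holds one JSON object per line. The function name `read_tweets` is hypothetical and is not part of brittlewit.

```python
import gzip
import json
from typing import Iterator


def read_tweets(path: str) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of a gzipped JSONL file.

    Hypothetical helper: assumes one JSON object per line, as the
    .jsonl extension suggests.
    """
    # Mode "rt" decompresses and decodes to text on the fly, so the whole
    # file never has to fit in memory.
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines rather than fail on them
            yield json.loads(line)
```

Usage would look like `for tweet in read_tweets("tweets.jsonl.gz"): ...`, which streams records one at a time instead of loading the whole sample at once.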
7/ The next key result will be developing,
$ brittlewit filter --track="keywords,..."
and,
$ brittlewit filter --follow="handle1,handle2,..."
8/ There are plenty of libraries that do this in [your favorite language] but they are designed for app building more than for researcher needs. I'll keep adding next steps to this thread as I go.
9/ A very, very rough draft is good to go! (Please star and follow on @github if you're interested!) https://github.com/jbn/brittlewit pic.twitter.com/DO21eA9FH8
Replying to @generativist @github
It's not Go (Python) but you might be interested in some of the work of the @documentnow project, especially twarc https://github.com/docnow/twarc which has a similar form-factor and http://www.docnow.io/catalog/ for sharing datasets.
Replying to @edsu @generativist and
I say that knowing full well that there is tons of value in coming at the same problem from multiple angles and working out the details for yourself. In software the wheel always needs reinvention.
Yea. I have an existing library I wrote that I'm kinda porting that does some... "clever" stuff with coalescing client credentials to get around API limits. But, their work is slick and a great path to follow at the same time.
*Oh* *your* work :)
heh, of course I am tooting my own horn here. We talked a bit about coalescing credentials too https://github.com/DocNow/twarc/issues/245 … but as a relatively small community of archivists we were concerned about getting folks suspended.
Ah, yea -- that's what I assumed is why you see it so rarely. Definitely going to dig in a bit deeper tomorrow. Thanks again!