It looks like the recent interest in archiving (political) Instagram Stories has been percolating for a little bit: https://medium.com/s/story/the-government-has-an-instagram-problem-8a087e8a58ab … https://www.washingtonpost.com/news/powerpost/paloma/the-technology-202/2019/01/07/the-technology-202-policymakers-are-embracing-instagram-stories-open-government-advocates-are-worried/5c3218371b326b66fc5a1bbc/?utm_term=.9b11561731e8 … https://twitter.com/AOC/status/1085337947354918912 …
http://webrecorder.io fwiw, does a nice job of archiving Instagram. Here's webrecorder-player playing back @aoc's Instagram: pic.twitter.com/HmWgHrQACv
One drawback is that you need to click into each item when recording to ensure that the various media files are collected. I've heard that Webrecorder are working on automated behaviors for particular sites, like the existing autoscroll, for speeding up repetitive tasks.
Another drawback is that Instagram Stories are temporary. So you would have to actively watch for these as they happen, in order to record them, instead of setting and forgetting something like an Archive-It web archiving job.
In an ideal world it would be great if user behaviors could be shared across the Webrecorder and Archive-It platforms. What if web archivists could develop behaviors like people used to create Greasemonkey scripts, and they could get deployed to web archiving platforms?
Replying to @edsu
Not insane at all! Here's a prototype already implementing a solution based on @SeleniumHQ: https://webis.de/downloads/publications/papers/stein_2018v.pdf … See Section 3.4: User Simulation Scripts. Web archiving and Selenium is a match made in heaven.
Thanks for sharing this, I wasn't aware of this research! We are taking a similar user behavior approach. We are especially interested in replay/reproducibility after the capture, which has become more complex. Will look at the publication!
I'd be especially curious if you have any insights on how to improve the current request-response fuzzy matching system. Perhaps the ML you've employed for QA could also be used to supplement the rule-based approach? If that's something you're interested in, we should chat :)
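For readers unfamiliar with the term, here is a minimal sketch of what rule-based fuzzy matching of replay requests can look like: the URL requested at replay time is normalized and compared against normalized captured URLs. This is an illustration only, not the rule set used by Webrecorder or by the paper's prototype; the parameter list `VOLATILE_PARAMS` and the function names are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters assumed to change between capture
# and replay (cache busters, timestamps). Real rule sets are site-specific.
VOLATILE_PARAMS = {"_", "cb", "t", "ts", "timestamp", "nonce"}


def canonicalize(url: str) -> str:
    """Lowercase scheme and host, drop volatile parameters, sort the rest."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in VOLATILE_PARAMS
    )
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def build_index(captured_urls):
    """Map each canonical form to the first captured URL that produced it."""
    index = {}
    for url in captured_urls:
        index.setdefault(canonicalize(url), url)
    return index


def fuzzy_lookup(index, requested_url):
    """Return the captured URL matching the request, or None if there is none."""
    return index.get(canonicalize(requested_url))


if __name__ == "__main__":
    index = build_index(["https://example.com/api/feed?user=aoc&_=1547600000"])
    print(fuzzy_lookup(index, "https://example.com/api/feed?_=1547699999&user=aoc"))
```

A rule like this only covers differences someone anticipated, which is where a learned model could plausibly supplement it.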
Replying to @IlyaKreymer @edsu
@KieselJohannes Perhaps you can comment better than I can right now. We have looked at fuzzy matching. See Section 3.5.
So far we only used the known fuzzy matching. ML in this regard is definitely something we are interested in! I'll get back to you tomorrow