There are some browser extensions out there that let you take full-page screenshots. Maybe also couple that with multiple archiving services such as http://archive.is. I'd also be interested to hear if there are self-hosted clones of these services.
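As a rough illustration of scripting the archiving side, here is a minimal sketch that only builds request URLs for the Wayback Machine: one for its "Save Page Now" path and one for its availability lookup. The function names are made up for this example, nothing is sent over the network, and other archiving services would need their own builders.

```python
from urllib.parse import quote, urlencode


def wayback_save_url(page_url: str) -> str:
    """Build the Wayback Machine 'Save Page Now' URL for a page.

    Hypothetical helper: it only constructs the URL; requesting it
    (for example in a browser) is what asks for a new capture.
    """
    # Keep URL punctuation intact so the target address stays readable.
    return "https://web.archive.org/save/" + quote(page_url, safe=":/?&=%#")


def wayback_availability_url(page_url: str) -> str:
    """Build the availability lookup URL that reports the closest snapshot."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


if __name__ == "__main__":
    target = "https://example.com/some/page"  # placeholder address
    print(wayback_save_url(target))
    print(wayback_availability_url(target))
```

Keeping a local full-page screenshot alongside any remote capture still matters, since a remote copy can later become inaccessible.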
-
They’re still subject to US law and take down content upon request. It would be good to 1) demystify that and set expectations, and 2) figure out a way to decentralise the archive so it can protect content legally.
-
Agreed on both points. It seems, however, that some of those exclusions are the result of retroactively honoring newly added robots.txt rules, which to me seems a bit overzealous.
-
And those are retroactive? Or have you only lost crawling going forward?
-
My understanding is that they are (at least partially) automated, and yes, they are retroactive. So all records from before that change are no longer accessible either. See https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/
-
Looks like they added a robots.txt file with a "Disallow all" rule, which http://Archive.org used to respect (they announced last year that they would stop doing so, but only gradually: https://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/ )
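For reference, a "disallow all" robots.txt is just two lines. The sketch below uses Python's standard `urllib.robotparser` to show how any crawler that honors the file would treat every path on the site. The user-agent string and URLs are placeholders, and this is not the Internet Archive's actual crawler logic.

```python
from urllib.robotparser import RobotFileParser

# A "disallow all" robots.txt: every user agent is barred from every path.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-honoring crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-archiver" and example.com are hypothetical placeholders.
    for page in ("https://example.com/", "https://example.com/about"):
        print(page, is_allowed(DISALLOW_ALL, "example-archiver", page))
```

The file only says what a crawler may fetch now. Hiding snapshots captured before the rule was added is a policy choice made by the archive, not something robots.txt itself requires.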
-
Does respecting a newly added robots.txt also mean doing so retroactively and removing all historical records collected until then?
-
Yes: "We have also seen an upsurge of the use of robots.txt files to remove entire domains from search engines when they transition from a live web site into a parked domain, which has historically also removed the entire domain from view in the Wayback Machine."
-
Right, I read that, I meant - should it?
-
That's a tough question... Legally speaking, not doing so is a risk (one they now take, though website owners can still ask them to remove the archive)
-
Fair. I think there’s a misconception to correct then, because I often see it recommended to journalists or activists as a tool to preserve evidence. And for one reason or another, it just isn’t.
-
.@botherder What exactly are you trying to achieve by doing this? Ironically, you seem to have become a stalker yourself. Take a look in the mirror.
-
By doing what? Using Internet Archive?
-
Did you reach out to @internetarchive to see what happened?
-
Yep. I wrote to them in September last year and never got an answer.
-
How rude ;)
-
They didn't reply to journalistic requests either.
-
.@josephfcox You're not a journalist, you’re an internet blogger. FYI, we've already written to http://archive.org explaining that you are acting like an obsessed pest, and asking them to respect our right to have content removed.
-
Can you be a little more specific, @FlexiSPYLtd, on what you mean by "our right to have content removed"?
-
I thought it was well known you can have your site excluded from @internetarchive. Not exactly a secret.
-
Calm down, FlexiSPY employee @originalhater