2/ Question 1: Am I in legal jeopardy by doing so? I abided by robots.txt, but the scrape clearly violates the TOS. That said, I'm not aware of many cases where non-profit scraping for academic purposes has been penalized.
3/ Question 2: What's the standard for protecting identities in free-form text? People have a right to anonymity and "forgetting". Do I grep all possible names or use a language tagger to remove proper nouns that look like names?
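For the "grep all possible names" option, here is a rough sketch of what that might look like, not a vetted anonymization method. `KNOWN_NAMES`, `build_name_pattern`, `redact_names`, and the `[NAME]` placeholder are all made up for illustration; a real name list would have to come from somewhere like a census file or the site's user roster.

```python
import re

# Hypothetical name list, for illustration only.
KNOWN_NAMES = ["Alice", "Bob", "Nguyen"]

def build_name_pattern(names):
    """Compile one case-insensitive regex matching any listed name as a whole word."""
    # Longest names first so longer alternatives are tried before their prefixes.
    escaped = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def redact_names(text, pattern, placeholder="[NAME]"):
    """Replace every match of the name pattern with the placeholder."""
    return pattern.sub(placeholder, text)

NAME_PATTERN = build_name_pattern(KNOWN_NAMES)

print(redact_names("I told alice and Bob about it.", NAME_PATTERN))
```

A list-based pass like this misses any name not on the list and over-redacts ordinary words that double as names, which is the trade-off against the tagger approach that flags proper nouns by context.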
4/ Or, Question 3: Do you think such a dataset is unethical to release, period?