
Conversation

Jill Walker Rettberg
@jilltxt
This is apparently the list of Norwegian "bad words" used to filter out web pages, so that the AI is *not* trained on them. So don't write "sinnsykt" ("insanely") or "jævlig" ("damned") on a web page if you want your text to become part of the AI's collective consciousness.
github.com
List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/no at master · LDNOOBW/List-of-Dirty-Naught...
List of Dirty, Naughty, Obscene, and Otherwise Bad Words - List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/no at master · LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
7:53 AM · Dec 5, 2022
Jill Walker Rettberg
@jilltxt
·
Dec 5, 2022
Replying to
@jilltxt
But then GPT-3 and the like aren't trained on Norwegian, as far as I can see? The bad words are from footnote 20 here: http://arxiv.org/abs/2104.08758.
arxiv.org
Documenting Large Webtext Corpora: A Case Study on the Colossal...
Large language models have led to remarkable progress on many NLP tasks, and researchers are turning to ever-larger text corpora to train them. Some of the largest corpora available are made by...