dataset: all words appearing in the title or text of an english wikipedia page, filtered as: filtered_words = [w for w in words if not (has_nonascii(w) or len(w) < 4 or digit_ratio(w) > .4)]
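A minimal sketch of that filter, for anyone who wants to reproduce it. The thread names `has_nonascii` and `digit_ratio` but never shows their bodies, so the definitions below are assumptions about what they probably do, and `filter_words` is a hypothetical wrapper name.

```python
def has_nonascii(w):
    # Assumed definition: True if any character is outside 7-bit ASCII.
    return any(ord(c) > 127 for c in w)


def digit_ratio(w):
    # Assumed definition: fraction of characters that are digits.
    # Returns 0.0 for an empty string to avoid dividing by zero.
    return sum(c.isdigit() for c in w) / len(w) if w else 0.0


def filter_words(words):
    # Same predicate as in the tweet: drop words that contain non-ASCII
    # characters, are shorter than 4 characters, or are more than 40% digits.
    return [w for w in words
            if not (has_nonascii(w) or len(w) < 4 or digit_ratio(w) > .4)]


sample = ["cat", "wikipedia", "café", "a1b2", "1990s", "abc12"]
print(filter_words(sample))  # ['wikipedia', 'abc12']
```

Note that a word sitting exactly at a 0.4 digit ratio (like `abc12`) survives, since the comparison is strict.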
Replying to @browserdotsys @AmbrosialArts and
oh i thought you were proposing a dual index where we would look up a word in the L2R index or the R2L index based on whether that word’s first or last character had more entropy
1 reply 0 retweets 6 likes -
Replying to @chaosprime @browserdotsys and
sorry about increasing your monstrousness metric on that basis since it’s monotonic
1 reply 0 retweets 5 likes -
Replying to @chaosprime @AmbrosialArts and
I definitely like the high-grade autism implied by keeping two copies of every book in your collection so you can have two libraries with different sorting order
1 reply 0 retweets 13 likes -
Replying to @browserdotsys @chaosprime and
i thought i would also but ....after experiencing this thread
1 reply 0 retweets 5 likes -
Replying to @eigenrobot @bigmastertroll and
"Is Backwards Alphabetization Better Than Forward", the greatest thread in the history of Twitter, locked by @jack after 12,239 pages of heated debate,
2 replies 1 retweet 10 likes -
Replying to @chaosprime @eigenrobot and
please, how is this still happen
2 replies 0 retweets 4 likes -
Replying to @jenphalian @chaosprime and
I wonder if bigram frequencies would give a different result
1 reply 0 retweets 3 likes -
Replying to @browserdotsys @jenphalian and
huh
In [83]: ent(first_bigram)
Out[83]: 5.39144586487275
In [84]: ent(last_bigram)
Out[84]: 5.494890595823844
1 reply 0 retweets 4 likes
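For reference, a rough sketch of how numbers like these could be computed. The thread does not show `ent`, so this assumes it is Shannon entropy over the empirical distribution, in bits (the log base is a guess), and that `first_bigram` and `last_bigram` are simply the first and last two characters of each filtered word. The word list here is a toy stand-in, not the Wikipedia dataset.

```python
from collections import Counter
from math import log2


def ent(items):
    # Assumed definition: Shannon entropy (base 2) of the empirical
    # distribution of the values in `items`.
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Toy stand-in for the filtered word list (every word has length >= 4).
words = ["thread", "threat", "backwards", "forwards", "alphabet", "entropy"]

# Assumption: a word's "first bigram" is its first two characters and its
# "last bigram" is its last two characters.
first_bigram = [w[:2] for w in words]
last_bigram = [w[-2:] for w in words]

print(ent(first_bigram), ent(last_bigram))
```

Whether the original counted each distinct word once or weighted words by how often they occur is not stated; this sketch counts every entry in the list as given.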
ruh roh
Replying to @chaosprime @jenphalian and
good thing this Dangerous Knowledge is buried down here at the bottom of a thread everyone has muted
0 replies 0 retweets 7 likes