i was thrown off by a bunch of nearby words that each had several very distinct meanings. i think some of the similarities were for different meanings, and that took me a long time to sort out
ah apparently it's expensive to generate and store the ranking data for each word, sad times
Quote Tweet
Replying to @QiaochuYuan
Or just store it in base64 and assume the user won't cheat. One problem is that I've only generated the top 1k nearest for ~4k of the 3m tokens in the dataset. I might decide to live without that, or decide to generate a few hundred k more and pay for a bit more disk.
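
A minimal sketch of the idea in the quoted tweet: precompute the top-k nearest words for a target, then ship the list base64-encoded so it is not readable at a glance. Everything here is an assumption for illustration: the toy `vectors` dict, cosine similarity as the ranking measure, JSON as the serialization, and the helper names `top_k_nearest`, `encode_ranking`, and `decode_ranking`. Base64 only obscures the data; anyone who wants to cheat can decode it.

```python
import base64
import json
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_nearest(target, vectors, k):
    """Return the k words most similar to `target`, most similar first."""
    scored = [
        (cosine(vectors[target], vec), word)
        for word, vec in vectors.items()
        if word != target
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def encode_ranking(ranking):
    """Serialize a ranking to JSON and obscure it with base64 (not encryption)."""
    return base64.b64encode(json.dumps(ranking).encode("utf-8")).decode("ascii")


def decode_ranking(blob):
    """Inverse of encode_ranking."""
    return json.loads(base64.b64decode(blob).decode("utf-8"))


# Hypothetical 2-d embeddings; a real dataset would have millions of tokens.
vectors = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "dog": [0.7, 0.3],
    "car": [0.0, 1.0],
}

ranking = top_k_nearest("cat", vectors, k=3)
blob = encode_ranking(ranking)
print(blob)
print(decode_ranking(blob))
```

This brute-force ranking compares the target against every other word, which is why generating it for each possible target gets expensive as the vocabulary grows.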