The new "Rinzler" big data archiving and index system (written purely in Go) was just tested on a 64 core machine and can encode and index a whopping 500,000 records per second. This take JSON data, indexes key fields, compresses with zstd and applies ...
#bigdata #datascience
Conversation
Reed Solomon redundant data packets for each row. Even with redundancy, each row of data is compressed to around 25-30% of it's original size. The goal is to reach one million encodings per second.
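The post doesn't share any code, but a minimal sketch of that per-row pipeline could look like the following, assuming the klauspost zstd and reedsolomon packages and an illustrative 4 data + 2 parity shard layout (none of these specifics are confirmed by the post):

```go
// Sketch of the described pipeline (not the actual Rinzler code):
// parse a JSON record, lift out key fields for an index, compress the
// row with zstd, then add Reed-Solomon parity shards for redundancy.
package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/klauspost/compress/zstd"
	"github.com/klauspost/reedsolomon"
)

func main() {
	row := []byte(`{"id":"t1_abc","author":"someone","body":"hello world"}`)

	// 1. Index key fields: here we just pull "id" and "author" into a map;
	//    a real system would feed these into an on-disk index.
	var fields map[string]any
	if err := json.Unmarshal(row, &fields); err != nil {
		log.Fatal(err)
	}
	index := map[string]any{"id": fields["id"], "author": fields["author"]}

	// 2. Compress the raw row with zstd.
	enc, err := zstd.NewWriter(nil)
	if err != nil {
		log.Fatal(err)
	}
	compressed := enc.EncodeAll(row, nil)

	// 3. Add Reed-Solomon redundancy: split the compressed row into 4 data
	//    shards and compute 2 parity shards (counts are assumptions).
	rs, err := reedsolomon.New(4, 2)
	if err != nil {
		log.Fatal(err)
	}
	shards, err := rs.Split(compressed)
	if err != nil {
		log.Fatal(err)
	}
	if err := rs.Encode(shards); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("index=%v raw=%dB compressed=%dB shards=%d\n",
		index, len(row), len(compressed), len(shards))
}
```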
If it can reach 1.5 million encodings per second, it could theoretically catalog and index every Reddit comment (over 5.5 billion) in one day.
Wow I can't even do basic math -- 1.5 million a second would index Reddit in one hour. Whoops.
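(For reference: 5.5 billion comments ÷ 1.5 million per second ≈ 3,667 seconds, which is a little over an hour.)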
