I would love to learn to read very large files at a decent speed (e.g. genomic files in the tens of GBs, where I'm looking for "close" matches to some gene sequence). How could things like memory alignment, threads, parallelism, mmap, SIMD, or other techniques be used to help? @cmuratori
-
Thanks for answering! I am going to look all of these up. Let me take the chance to thank you for your inspiring work! I look forward to Star Code Galaxy.
-
Compress the file with LZ4 HC (or use btrfs + compression) and use iotop to figure out how close you're getting to the available disk bandwidth. There's no point optimizing past what a fast NVMe drive can do unless you're planning on RAIDing them or something.
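A quick way to sanity-check that ceiling (besides watching iotop) is to time a plain sequential read yourself. A minimal Python sketch, where the function name and chunk size are my own choices, not from the thread:

```python
import time

def measure_read_bandwidth(path, chunk_size=4 * 1024 * 1024):
    """Sequentially read `path` in large chunks and return MB/s.

    Unbuffered reads (buffering=0) keep Python's own buffering out of
    the measurement; note the OS page cache can still inflate the
    number on a second run over the same file.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)
```

If your scanner processes data slower than this number, the disk is not your bottleneck and compression buys you less.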
-
And if you do multithreaded or multiprocess access, it will often look like random reads to the disk scheduler and drive firmware. For most NVMe drives, random IO is about a third as fast as sequential. But that might be fine if you're more CPU-bound.
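One common way to parallelize while keeping each worker's access pattern sequential is to hand every thread one contiguous slice of the file. A minimal Python sketch of that idea (the function names and the boundary-overlap handling are my own, not from the thread):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def scan_chunk(path, offset, length, pattern):
    """Search one contiguous slice of the file for exact matches.

    Each worker reads its own slice front to back, so the per-thread
    access stays streaming even though several threads run at once.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Read len(pattern) - 1 extra bytes so a match straddling the
        # chunk boundary is still seen by the chunk it starts in.
        data = os.pread(fd, length + len(pattern) - 1, offset)
    finally:
        os.close(fd)
    hits = []
    i = data.find(pattern)
    while i != -1:
        if i < length:  # matches starting in the overlap belong to the next chunk
            hits.append(offset + i)
        i = data.find(pattern, i + 1)
    return hits

def parallel_scan(path, pattern, workers=4):
    """Split the file into `workers` contiguous slices and scan them in parallel."""
    size = os.path.getsize(path)
    chunk = -(-size // workers)  # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(scan_chunk, path, off, chunk, pattern)
                   for off in range(0, size, chunk)]
        return sorted(h for f in futures for h in f.result())
```

This sketch does exact matching; a "close"-match scanner would replace `data.find` with an approximate-matching step, but the chunking and overlap logic stays the same.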