So, what's a good speed for CSV parsing? And what would people expect to get? I'm surveying the landscape and seeing a big range of functionality, from all-singing, all-dancing field recognition (date vs floating point vs int vs string) to super-basic and/or wrong ...
Yours is definitely one of the better-looking ones. So Python's behavior is your inspiration? I almost immediately tripped over the CR-LF that RFC 4180 specifies versus the LF-only files you see in reality. Doing what you do (CR, LF, *or* CR-LF) seems pragmatic. In your experience are files consistent..
.. *internally*? That is, if you see CR-LF or CR alone on the first line, will you see another line-ending convention later? Or multiple quote conventions? (e.g. backslash-" vs "").
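One way to check the internal-consistency question on a given file is to count which line terminators show up outside quoted fields. This is only a rough sketch: `line_ending_counts` is a made-up helper name, it assumes RFC 4180 style doubled quotes (`""`), and it does not handle backslash-escaped quotes at all.

```python
from collections import Counter

def line_ending_counts(data: bytes, quote: int = ord('"')) -> Counter:
    """Count CR-LF, LF, and lone-CR terminators that occur outside quoted fields.

    Hypothetical helper: assumes quotes are escaped by doubling ("") and
    ignores backslash escapes entirely.
    """
    counts = Counter()
    in_quotes = False
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b == quote:
            # A doubled quote toggles twice, so the state ends up unchanged.
            in_quotes = not in_quotes
        elif not in_quotes:
            if b == 0x0D:  # CR
                if i + 1 < n and data[i + 1] == 0x0A:
                    counts["CRLF"] += 1
                    i += 1  # consume the LF that belongs to this CR-LF pair
                else:
                    counts["CR"] += 1
            elif b == 0x0A:  # LF
                counts["LF"] += 1
        i += 1
    return counts

# Made-up sample mixing all three conventions, with an LF inside a quoted field.
sample = b'a,b\r\n"x\ny",2\n3,4\r5,6\r\n'
counts = line_ending_counts(sample)
is_consistent = len(counts) <= 1
```

A file that sticks to one convention yields a counter with at most one key, so `is_consistent` is a cheap first check before deciding how lenient the parser needs to be.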
For speed, maybe look at https://bitbucket.org/ewanhiggs/csv-game … as a *starting* point. That benchmark's biggest problem is its lack of diversity in the inputs that it benchmarks. So your best bet is to go find some actual real world data sets.
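For a rough baseline on your own data, a throughput number in MB/s is easy to get. The sketch below is not the csv-game harness; `measure_throughput` is a hypothetical helper that times Python's standard `csv.reader`, and the synthetic input only stands in for a real data set.

```python
import csv
import io
import time

def measure_throughput(text: str) -> tuple[int, float]:
    """Return (row count, MB/s) for one pass of csv.reader over `text`.

    Hypothetical helper for illustration; a single timed pass is noisy,
    so repeat it and take the best run for anything serious.
    """
    n_bytes = len(text.encode("utf-8"))
    start = time.perf_counter()
    # newline="" leaves the line endings untouched so csv.reader sees them as-is.
    rows = sum(1 for _ in csv.reader(io.StringIO(text, newline="")))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return rows, n_bytes / elapsed / 1e6

# Synthetic stand-in input (an assumption): quoted commas, CR-LF line endings.
synthetic = "".join(f'{i},"name, {i}",{i * 0.5}\r\n' for i in range(10_000))
rows, mb_per_s = measure_throughput(synthetic)
print(f"{rows} rows at {mb_per_s:.1f} MB/s")
```

Swapping `synthetic` for the contents of a few real files (wide vs narrow, heavily quoted vs not) gives the input diversity that a single benchmark file lacks.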
I also wrote up some stuff here, including alternative architectures: https://www.reddit.com/r/rust/comments/8ur32t/is_there_a_zerocopy_csv_parser_for_generic_csv/ …