Is there a fast way to just check a JSON file for syntactic correctness? I have ~13K JSON files I want to cat into jq and it keeps breaking in the middle when it hits a broken file...
[This Tweet was deleted by the Tweet author.]
That works, but would take around 10 minutes to get through everything. Obviously not a dealbreaker here, but curious for when I scale this up by 10x or 100x.
Switch the for loop to xargs / GNU parallel and saturate your IO / CPUs.
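For anyone following along, a minimal sketch of the xargs variant might look like the function below. It assumes jq is on the PATH and that find/xargs are the GNU findutils versions (for -print0, -0, -r and -P); the function name list_broken_json and the default of 4 jobs are made up for illustration. Note that jq also accepts an empty file or several concatenated JSON values, so this is a looser check than "exactly one valid document".

```shell
# Sketch: print the path of every *.json file under a directory that jq cannot parse.
# list_broken_json is a hypothetical helper name, not part of jq or xargs.
list_broken_json() {
  dir="${1:-.}"    # directory to scan (default: current directory)
  jobs="${2:-4}"   # number of jq processes to run at once (assumed default)
  find "$dir" -type f -name '*.json' -print0 |
    xargs -0 -r -n 1 -P "$jobs" \
      sh -c 'jq empty "$1" >/dev/null 2>&1 || printf "%s\n" "$1"' _
}
```

Calling `list_broken_json . 8` would scan the current directory with eight jq processes at a time. The `empty` filter makes jq parse the input without printing anything, so only the exit status matters.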
And this hack
cat *.json > /dev/null && parallel jq <yadda yadda>
to load them into the Linux page cache in one fell swoop. It can do wonders if IO is the bottleneck.
FWIW it's not too bad with parallel and jq, but I continue to believe that it could be a lot faster if it weren't trying to construct the objects in memory etc.:
ls *.json | parallel jq '<' {} '&>' /dev/null '||' echo {}
real 0m21.016s
user 6m20.112s
sys 0m32.337s
yajl has a json_verify utility. It should be a lot faster since it's a performance-oriented streaming library. It also has json_reformat, which you can use to either beautify or minify JSON. There are faster library implementations, but it's nice having those utilities available.
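A rough sketch of how json_verify could slot into the same kind of pipeline is below. It assumes the yajl command-line tools are installed, that json_verify reads the document from stdin and exits non-zero when it is invalid, and that find/xargs are the GNU findutils versions; the function name list_broken_json_yajl and the default of 4 jobs are made up for illustration.

```shell
# Sketch: print the path of every *.json file under a directory that json_verify rejects.
# list_broken_json_yajl is a hypothetical helper name, not part of yajl.
list_broken_json_yajl() {
  dir="${1:-.}"    # directory to scan (default: current directory)
  jobs="${2:-4}"   # number of json_verify processes to run at once (assumed default)
  find "$dir" -type f -name '*.json' -print0 |
    xargs -0 -r -n 1 -P "$jobs" \
      sh -c 'json_verify < "$1" >/dev/null 2>&1 || printf "%s\n" "$1"' _
}
```

Since the validator only needs to scan the stream rather than build objects, this is the shape of check the thread is asking about; whether it actually beats jq on a given machine would need to be timed.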
Surprisingly slim pickings here - maybe folks could help populate this with suggestions (via a PR)
I only know about the yajl utilities because at one point I did json<tab> to see if I already had a tool for it and then used what I already had installed. Same reason that I use libxml2 for verification and minification of xml even though I think libxml2 is a terrible library.
I have yajl since i3-wm depends on it. Choosing json handling tools based on my choice of window manager is clearly the only logical approach.
Can see we use it in github.com/GrapheneOS/gra among other places. Real thought went into choosing tools for CSS, HTML, JS, etc. though.