Replying to
You can "cheat" and get an estimate with `head -n 10000 data.file > sample.file; ls -lh sample.file`. That should tell you the size on disk for 10k lines. Then divide the size of the full data file by the size of the sample and multiply by 10,000 to estimate the line count.
Replying to
Yup! It assumes that the first however many lines are representative. Often that's true, but not always. Bigger samples help too, but if it gets big enough, your sample is just the whole dataset and you save nothing haha.
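For anyone who wants to try the trick above, here is a rough sketch of it as a shell function. The name `estimate_lines` and its arguments are made up for illustration, and it uses `wc -c` to get exact byte counts instead of reading a rounded size off `ls`. It still assumes the first lines are about as long as the rest, so treat the result as a ballpark figure.

```shell
#!/bin/sh
# Hypothetical helper: estimate a file's line count from the byte size
# of its first N lines. Assumes N is a positive integer and that the
# first N lines are representative of the rest of the file.
estimate_lines() {
    file=$1
    sample_lines=${2:-10000}

    # Bytes in the first sample_lines lines, and in the whole file.
    # tr strips the padding some wc implementations add.
    sample_bytes=$(head -n "$sample_lines" "$file" | wc -c | tr -d '[:space:]')
    total_bytes=$(wc -c < "$file" | tr -d '[:space:]')

    # If the sample already covers the whole file, there is nothing
    # to extrapolate: just count the lines exactly.
    if [ "$sample_bytes" -ge "$total_bytes" ]; then
        wc -l < "$file" | tr -d '[:space:]'
        echo
        return
    fi

    # lines ~= total_bytes / (sample_bytes / sample_lines)
    echo $(( total_bytes * sample_lines / sample_bytes ))
}
```

Something like `estimate_lines data.file` uses the default 10k-line sample, and `estimate_lines data.file 100000` trades a bit more reading for a sample that is more likely to be representative.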