Usually Python, AppleScript, or Visual Basic. Hilarious how many folks will trust automating these processes without manually checking the data over tho. Same problem exists
Replying to @mrhunsaker @deevybee
Trying to bite my tongue, but there is a difference between data entry and data analysis. Also, please stop with the straw men. This thread was not intended to cover this, but it's not a great idea to use a full-blown spreadsheet for data entry.
I think half the internet is now discussing this point. For the point I raised, it's a lost cause if I have to defend using well-commented, version-controlled scripts for data analysis. Senior folks should be pushing students to adopt these best practices.
True. If manual observation of computational results is included I agree. But most trust the code to put the right thing in the right place. Open .CSV with Excel, visually confirm okay. Then import CSV into R or SAS
Why open in Excel? Check data class of columns, look at min/max values, are factors expected, plot variables to sense check. All of these steps are better done in #opensource software. #opensource helps make our work reproducible, and with code we can reuse these data checks.
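The checks listed in that tweet (column class, min/max, expected factor levels) could be scripted roughly as below. This is only a minimal sketch using the Python standard library; the function name `summarize_columns` and the sample data are made up for illustration and are not from the thread.

```python
import csv
import io


def summarize_columns(csv_text):
    """Return a per-column summary: inferred type plus min/max or distinct levels."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    if not rows:
        return summary
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        try:
            # A column counts as numeric only if every value parses as a number.
            numbers = [float(v) for v in values]
        except (TypeError, ValueError):
            # Otherwise treat it as a factor and list its distinct levels.
            summary[name] = {"type": "text", "levels": sorted(set(values))}
        else:
            summary[name] = {"type": "numeric", "min": min(numbers), "max": max(numbers)}
    return summary


# Hypothetical sample data with a misspelled factor level and a suspicious score.
SAMPLE = """subject,group,score
s1,control,12.5
s2,treated,14.0
s3,treatd,140
"""

summary = summarize_columns(SAMPLE)
for column, info in summary.items():
    print(column, info)
```

Here the unexpected level `treatd` and the out-of-range maximum both show up in the summary, and the same script can be rerun on every new export of the data.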
Ffs. Open in Gnumeric then. Or Deducer in R. Write your own visualization software. Open source is irrelevant here. I say LOOK at the data because if EVERY DATA POINT is not correct then you have to start the process again.
I can dream of every data point being correct, but I can't imagine it ever happening.
Then every analysis you ever performed is fraudulent and you should stop now. I'm not kidding. It is unethical to report analyses of data that are incorrect - when you have the means to ensure correctness.
That seems a naïve view. When you expect errors you can plan for them. I have a dataset with 300 million cattle movements, self-recorded by farmers. Farmers are humans so some errors will be there. Do you suggest I: a) abandon this dataset b) check each entry c) something else?
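One possible reading of "plan for them" is rule-based flagging, so that only suspect records need a human look. The sketch below is an assumption, not the poster's actual method: the field names (`from_farm`, `to_farm`, `date`), the rules, and the sample records are all hypothetical.

```python
from datetime import date


def flag_suspect_movements(records, today):
    """Return (index, reasons) for each record that breaks a plausibility rule."""
    flagged = []
    for index, rec in enumerate(records):
        reasons = []
        if not rec.get("from_farm") or not rec.get("to_farm"):
            reasons.append("missing farm id")
        elif rec["from_farm"] == rec["to_farm"]:
            reasons.append("origin equals destination")
        try:
            moved_on = date.fromisoformat(rec.get("date") or "")
        except ValueError:
            reasons.append("unparseable date")
        else:
            if moved_on > today:
                reasons.append("date in the future")
        if reasons:
            flagged.append((index, reasons))
    return flagged


# Hypothetical records, invented purely to exercise each rule.
RECORDS = [
    {"from_farm": "A", "to_farm": "B", "date": "2017-03-01"},
    {"from_farm": "A", "to_farm": "A", "date": "2017-03-02"},
    {"from_farm": "C", "to_farm": "", "date": "2017-13-40"},
    {"from_farm": "D", "to_farm": "E", "date": "2030-01-01"},
]

flagged = flag_suspect_movements(RECORDS, today=date(2018, 1, 1))
for index, reasons in flagged:
    print(index, reasons)
```

Rules like these cannot prove that the remaining records are correct; they only narrow down which entries deserve manual checking.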
What I will not say is that your data are any more or less reliable because you don't use a spreadsheet.
Hmm, so it's about doing it without dependence on proprietary code and in a replicable way. It's a great mix of things that lead me to think it's better to use something that isn't a binary file sitting outside version control.