In the last couple of years I’ve met many PhD students and early-career researchers who are in doubt about, or discouraged from, learning a statistical programming language (R, Python, Julia, …). I’ve made a list of 10 reasons to start programming.
[THREAD]
In my field (medicine), most statistical analyses that I come across are still done using point-and-click software (PACS) such as SPSS.
1) Reproducible statistical analyses
This means recording ALL steps of the analysis in the form of a readable program (script). This IS statistical programming. Great things will happen when you program! Like being able to reproduce your results after, say, three months or so.
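To make this concrete, here is a minimal sketch in Python of what "ALL steps in one script" could look like. The measurements, the exclusion rule and the function name `bootstrap_mean_ci` are made up for illustration. The point is only that every step, including the random seed, is written down so a rerun gives the same answer.

```python
import random
from statistics import mean

# Step 1: the raw data (made-up measurements; None marks a missing value)
raw = [4.9, 5.3, None, 6.1, 5.8, 5.0, None, 6.4, 5.5, 5.7]

# Step 2: the exclusion rule, written down instead of clicked away
clean = [value for value in raw if value is not None]

# Step 3: the analysis, with a fixed seed so a rerun gives identical numbers
def bootstrap_mean_ci(values, n_resamples=2000, seed=2018):
    """Approximate 95% percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return lower, upper

lower, upper = bootstrap_mean_ci(clean)
print(f"n={len(clean)}  mean={mean(clean):.2f}  95% CI=({lower:.2f}, {upper:.2f})")
```

Run it today or in three months: same data, same seed, same output.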
Sure, SPSS also comes with “scripts”, but in my experience these often fail to contain all the steps taken and to be readable.
2) The world moved on
For good reasons. Uptake of “new” statistical methods in PACS is generally slow. REALLY slow. Sooner or later you’ll need to use methods that aren’t available in PACS. By that time they are probably already available in popular programming languages.
For example: I regularly suggest penalized regression for analyses in small data sets (for reasons explained here: slideshare.net/MaartenvanSmed). These methods are well trusted and have been around for decades, but you’ll very quickly run into the limitations of PACS.
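As a rough illustration of what penalization does, here is a minimal Python sketch of ridge regression with a single predictor. With one predictor and an unpenalized intercept, the ridge slope has the closed form S_xy / (S_xx + penalty). The data and the function name `ridge_simple` are made up for this example. For real analyses you would use an established package and choose the penalty by cross-validation.

```python
from statistics import mean

def ridge_simple(x, y, penalty):
    """Fit y = intercept + slope * x with an L2 penalty on the slope only."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / (s_xx + penalty)      # penalty = 0 gives ordinary least squares
    intercept = y_bar - slope * x_bar    # the intercept is not penalized
    return intercept, slope

# Made-up small data set, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]

for penalty in (0.0, 1.0, 10.0):
    intercept, slope = ridge_simple(x, y, penalty)
    print(f"penalty={penalty:5.1f}  intercept={intercept:.3f}  slope={slope:.3f}")
```

The larger the penalty, the more the slope is shrunk towards zero, which is exactly the behaviour you want to be able to control in a small data set.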
3) Code sharing
Inspire and be inspired. No doubt you’ll do clever data analyses. Wouldn’t it be great to share clever code with peers? Perhaps not. But at least you’ll be able to use programs of your peers. Easy!
Places on the web such as github.com and stats.stackexchange.com are full of freely available programming code. You’ll be able to use these sources effectively. You’ll appreciate it.
4) Career perspectives
Of the post-doc positions in biomedicine currently listed on the Dutch vacancy website (AcademicTransfer), more than half require programming skills (14/24). Would you rather pursue a data science job outside academia? Great! You. Need. To. Learn. How. To. Program.
5) Data visualization
Data visualization is essential to get your message across. Programming languages such as R and Python are great for “Data viz”. It’ll REALLY change the way you can communicate about research. This should probably be reason no. 1 to start programming.
And you’ll be welcomed in the world of statistics GIFs! Great, no?
(source: paulvanderlaken.com/2017/11/15/adv )
6) Copy-paste frustration
If you’re like me, you’ll find copy-pasting statistical output one-by-one into your Word document very frustrating, and terrifyingly error-prone. The copy-paste burden decreases significantly once you can program your tables and copy-paste them in one go.
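A minimal sketch of a "programmed table", in Python using only the standard library. The group names, the numbers and the function name `summary_table` are made up for illustration. The output is tab-separated, which most word processors and spreadsheets will accept as a table in a single paste.

```python
from statistics import mean, stdev

# Made-up measurements per group, for illustration only
data = {
    "control": [5.1, 4.8, 5.6, 5.0],
    "treatment": [6.2, 5.9, 6.8, 6.1],
}

def summary_table(groups):
    """Return a tab-separated table with n, mean and SD for each group."""
    lines = ["group\tn\tmean\tsd"]
    for name, values in groups.items():
        lines.append(
            f"{name}\t{len(values)}\t{mean(values):.2f}\t{stdev(values):.2f}"
        )
    return "\n".join(lines)

print(summary_table(data))
```

If the data change, you rerun the script and paste once, instead of retyping every cell.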
The people behind R Markdown brought presentation through (statistical) programming to a whole new level. Check out the great documents, slide decks, books and websites that you’ll be able to make in R using R Markdown, here: rmarkdown.rstudio.com/gallery.html
7) Get what you need
After long preparation and endless trying, the software doesn’t provide the output you thought you asked for. You’re stuck. Relatable? Programming means you’ll be able to create new software, or change or extend existing software, to get exactly what you asked for.
8) Develop interactive graphs and web-apps
Am I going too far here? Well, maybe. But take a look at what’s possible here:
shiny.rstudio.com/gallery/
9) When you need help of a statistician
Consultations with a statistician (like me) are often prompted by reason 7 👆. Such problems are generally resolved quickly when you know how to program. And even better: next time you encounter this problem you won’t need me at all.
10) Because you want to
After this personal top 10, there is much more to say about the pros and cons of learning statistical programming. For instance, many programming languages (R, Python, Julia, …) are free (as in beer), unlike PACS such as SPSS. I’ll stop now.
Final thought: as a frequent #rstats user, I can certainly recommend learning R. For some start-up tips, check this great thread: twitter.com/dsquintana/sta or this free two-hour R tutorial to find out if it’s for you:
Still unsure whether R is the right place for your data analyses?
Learn the basics of #rstats in just 2 hours (for free!) at r-tutorial.nl.
No programming experience required.
