Analytics work needs to be two things to scale well:
- Discoverable
- Reproducible
And [this'll be a bit controversial, but] to get there:
- Use SQL when possible (vs Py/R).
- Don't share work primarily through git.
Read more, and fight me!
https://towardsdatascience.com/how-to-scale-your-analytics-org-by-ditching-git-3d8d4ce398d1 …
Replying to @imrobertyi
If you get into any fights, I will join your side. the SQL-by-default take is probably the spiciest thing in here but it’s a take I very much agree with. So much of DS work is about counting, and SQL is hard to beat at that, particularly with modern data warehouses
2 replies 0 retweets 12 likes -
Replying to @imightbemary @imrobertyi
But why would you use such an impoverished language when perfectly good abstractions can sit on top?
1 reply 0 retweets 2 likes -
Replying to @RussellSPierce @imightbemary
what sort of abstractions are you thinking @RussellSPierce?
1 reply 0 retweets 0 likes -
Replying to @imrobertyi @imightbemary
dbplyr, ibis, etc. Going into a programming language lets you abstract over SQL implementations - and you get a whole (familiar and maintained) toolchain along with it to write tests, functions, doc strings, code coverage, etc.
2 replies 0 retweets 5 likes -
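To make the point in the reply above concrete, here is a rough sketch of what "SQL wrapped in a programming language" can look like. It is not dbplyr or ibis; it uses only Python's built-in `sqlite3`, and the names `count_by`, `demo`, and the `events` table are made up for illustration. The counting still happens in SQL, but the query now lives in a function with a docstring that an ordinary test runner can exercise.

```python
import sqlite3


def count_by(conn, table, column):
    """Return {value: row_count} for `column` in `table`, counted in SQL."""
    # Table and column names cannot be bound as query parameters,
    # so reject anything that is not a plain identifier.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    query = f'SELECT "{column}", COUNT(*) FROM "{table}" GROUP BY "{column}"'
    return dict(conn.execute(query).fetchall())


def demo():
    # Hypothetical in-memory data, standing in for a real warehouse table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, kind TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(1, "click"), (1, "view"), (2, "click"), (3, "click")],
    )
    return count_by(conn, "events", "kind")
```

Calling `demo()` counts the four sample rows by `kind`. A library such as dbplyr or ibis would go further and generate the SQL for different backends, which this sketch does not attempt.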
i think calling that toolchain 'familiar' is the issue here -- there are many different paths, they aren't better or worse, but what feels familiar/simple/easy is different for people who have taken different paths, analytics people should get comfortable with that pic.twitter.com/psQCakL2N9
2 replies 0 retweets 5 likes -
in my experience, the people who tend to hate SQL most are coming from academia. this makes some degree of sense, since it's difficult to learn SQL without working for a company that has a real database to practice on
2 replies 0 retweets 7 likes
another possible way to think of this is methods-first (DS in this diagram) and business-context-first (analyst in this diagram). if your home base is the methods you learned in your academic training, it makes sense that you feel less fluent when you move to SQL