I got asked what this "standardization" looks like. I wish I knew, and I'd say so outright if I did! But I still have some thoughts based on what I've seen work on both eng and DS teams. https://twitter.com/imightbemary/status/1404471774297812998 …
In practice, this often seems to manifest as opinionated data warehousing. It allows you to enforce the relevant assumptions and business context without anyone knowing you've done so.
For reducing variance in operations, I immediately think of experimentation. There are commonly agreed-upon ways to run experiments, and that consistency removes a lot of the mental and organizational overhead that makes cross-team collaboration hard.
The procedure of experimentation is a way of ensuring quality and (at least in theory) will make comparison of results across experiments meaningful.
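One way to picture an agreed-upon procedure is a single shared function that every team calls to read out an experiment, so results are computed the same way each time. This is only a minimal sketch: the names `ExperimentResult` and `compare_conversion` are made up for illustration, and the pooled two-proportion z-test is my assumption of a reasonable default, not something the thread prescribes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentResult:
    lift: float      # absolute difference in conversion rate (treatment - control)
    z_score: float   # pooled two-proportion z statistic
    p_value: float   # two-sided p-value under the normal approximation


def compare_conversion(control_conversions: int, control_n: int,
                       treatment_conversions: int, treatment_n: int) -> ExperimentResult:
    """Hypothetical shared readout: pooled two-proportion z-test."""
    if control_n <= 0 or treatment_n <= 0:
        raise ValueError("both groups need at least one observation")

    p_control = control_conversions / control_n
    p_treatment = treatment_conversions / treatment_n
    pooled = (control_conversions + treatment_conversions) / (control_n + treatment_n)

    # Standard error of the difference under the null of equal rates.
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treatment_n))
    if se == 0:
        raise ValueError("conversion rate is 0 or 1 in both groups; test is undefined")

    z = (p_treatment - p_control) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ExperimentResult(lift=p_treatment - p_control, z_score=z, p_value=p_value)
```

If every team reads out experiments through the same call, a lift reported by one team means the same thing as a lift reported by another, which is the comparability the procedure is supposed to buy.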
Another source of unproductive variance in operations is people coding in subtly different environments. In a worst-case scenario, different versions of software packages can generate different results.
In a best-case scenario, a DS tries to replicate an analysis, realizes their setup is different from the author's, and decides it's too much work to build on someone else's work, instead spinning up their own version of the thing they're trying to replicate.
Maybe they'll do a fine job reproducing it, maybe they won't, but in either case it's a huge waste of time. To really make sure everyone is contributing to the same body of work, it should take minimal effort for a DS to start where another analysis left off.
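A cheap guard against environment drift is to ship the package versions an analysis was run with and check them before building on it. Here is a rough sketch using the standard library's `importlib.metadata`. The function name `find_version_drift` and the idea of passing pins as a plain dict are assumptions for illustration; in practice a lock file or container image would probably carry this information.

```python
from importlib import metadata


def find_version_drift(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the pinned one. `installed` is None
    when the package is not installed at all.

    `get_version` defaults to importlib.metadata.version and is only a
    parameter so the check can be exercised without touching the real
    environment.
    """
    drift = {}
    for package, expected in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            drift[package] = (expected, installed)
    return drift
```

Calling `find_version_drift({"pandas": "2.1.0"})` at the top of a notebook (the version number here is just a placeholder) would surface a mismatch before anyone spends an afternoon chasing results that differ for reasons unrelated to the analysis.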
This is all maybe a long-winded way of saying that the key to standardization might just be code! It's natural to think of data as the most important resource we have at our disposal, but that's not quite right.
As DS, we earn our keep by using our expertise to extract value from data. Code is the way we formalize and scale that expertise, whether it's SQL, Scala, or something in between.

