Right now I'm reading the book Accelerate by Nicole Forsgren, which is the capstone of years of studying the best practices used by high-performing software eng teams. It's related to a work project I'm doing, but it's got a lot of ideas I'd like to see used for data science.
"Preapproved libraries, packages, toolchains" seems most related to the quality of your data platform. How fragmented is your data warehouse or your tool ecosystem? Does one person use sklearn while another uses statsmodels and a third uses TensorFlow?
Translating between different tools and libraries can be a real nightmare. Worst case scenario, you get different answers from each. Best case, you have to spend extra time converting between them just to build on work done in another part of your organization.
"Processes" might seem easy to generalize, but it's harder than you think. IMO the most important dimension of this is being aligned with the rest of your data org (and adjacent orgs) on what DS can and can't do.
After that, it helps to think through what project lifecycles and workflows are shared across DS. It should be easy for stakeholders to understand what it means to work with DS, as well as easy for DS outside of your team to know where they're at if they join an in-flight project.
These are high-level thoughts, but in general, DS has a lot to learn from the DevOps world. Some companies try to solve the "shift left" issue by embedding their DS in a matrix organization, and that CAN work, but I'm not convinced it's the scalable answer.
Just look at infosec. There's not a security engineer embedded on every single feature team. Modern software is not perfectly secure, but it's still a hell of a lot more secure than it's been in the past.
So too with data science. We want to make it easy for decision makers to do the right thing when it comes to data usage. We don't need every decision to be perfectly data-driven, but we do want decisions to be a hell of a lot more data-driven than they have been in the past.
And we won't get there by trying to brute force our way in. We gotta shift left.
(Btw, if you have good examples of shifting DS left, I'd love to see them!)

