It's fun to speculate about what direction DS as a profession is going, but it's also instructive to dig into how it grew into its modern form. This paper's an example of that, a snapshot from a few years before the term itself was coined circa 2008 https://projecteuclid.org/download/pdf_1/euclid.ss/1009213726 …
Some of his reflections from working in industry resonate strongly with me: * Focus on finding a good solution—that’s what consultants get paid for * Live with the data before you plunge into modeling
Others kind of caught me off guard coming from a statistician in industry: * Search for a model that gives a good solution, either algorithmic or data * Predictive accuracy on test sets is the criterion for how good the model is
The first of those two is a little less surprising to me, given that "a good solution" could mean a lot of different things, but the second both makes a lot of sense to me and is weird to think about.
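To make "predictive accuracy on test sets" concrete, here's a minimal sketch of how I read it (my illustration, not Breiman's code). The synthetic data, the 75/25 split, and the helper names `fit_line` and `mse` are all assumptions made up for this example, and since it's a regression the "accuracy" is mean squared error on the held-out points.

```python
import random

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def mse(xs, ys, intercept, slope):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)

# Hypothetical synthetic data: y = 2 + 3x + Gaussian noise
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

# Hold out the last 25% as a test set the model never sees while fitting
split = 150
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

intercept, slope = fit_line(train_x, train_y)

# The criterion from the quote: error on data the model was not fit to
test_mse = mse(test_x, test_y, intercept, slope)

# Naive baseline for comparison: always predict the training mean
baseline_mse = mse(test_x, test_y, sum(train_y) / len(train_y), 0.0)

print(f"slope={slope:.2f} test_mse={test_mse:.2f} baseline_mse={baseline_mse:.2f}")
```

The fitted `slope` is the kind of thing you'd read off to learn how the response relates to the input, while `test_mse` against `baseline_mse` is the test-set yardstick. Same model, two different questions.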
That's probably because of the type of DS I am. I've done some forecasting and productionized precious few ML models, but I've mostly modeled to do what Breiman describes as "extracting information about how nature is associating response variables to input variables"
I perceive caring about accuracy first as ML eng territory, and I perceive MLE as stemming directly from computer science. This is sort of silly of me, given that "data scientist" has been an overloaded junk title for most of the time I've held it (and, if I'm honest, still is)
Specialization within the DS world is still emerging and while some of the boundaries between types of DS roles have gotten sharper in the last few years, they're all still coming from the same lineage
"Terabytes of data are pouring into computers from many sources, both scientific, and commercial, and there is a need to analyze and understand the data," Breiman says, like a prophet foretelling HBR articles to come
And reflecting on his work in the '90s (!!!), he was already seeing how cross-functional this line of work can be: "there has been a noticeable move toward statistical work on real world problems and reaching out by statisticians toward collaborative work with other disciplines"
The problems we're solving with data today are greater in scale and complexity, which means we're getting the luxury of focusing on narrower subsets of those problems. Now we ask for analysts, scientists, MLEs, analytics engineers, etc. instead of just DSes or statisticians