7) Each of these scenarios is a bit different, and I don't necessarily know what the right answer is in all of them. But in each case, the statistics were (a) reasonable, (b) correct, and (c) net harmful to their decision-making process.
8) The key insight: you're not choosing between looking at statistics or acting randomly. You have a prior coming in, based on your intuition and critical thinking. The question is whether data is more or less useful than your priors, and whether you combine them well.
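One way to picture "combining them well" is a precision-weighted average, where each source counts in proportion to how reliable it is. This is a minimal sketch, assuming both the prior and the data can be summarized as a mean with a standard deviation; the function name `combine` and all the numbers are hypothetical, picked only for illustration.

```python
def combine(prior_mean, prior_sd, data_mean, data_sd):
    """Precision-weighted average of a prior and a data estimate.

    This is the standard normal-normal update: each source is weighted
    by 1 / sd**2, so a noisy source moves the answer very little.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / data_sd ** 2
    total_precision = prior_precision + data_precision
    mean = (prior_precision * prior_mean + data_precision * data_mean) / total_precision
    sd = (1.0 / total_precision) ** 0.5
    return mean, sd


# Hypothetical numbers: a gut estimate of 100 (sd 10) and a noisy statistic of 160 (sd 40).
mean, sd = combine(100.0, 10.0, 160.0, 40.0)

# Treating the two sources as equally reliable would land much further from the prior.
naive_average = (100.0 + 160.0) / 2

print(f"weighted: {mean:.1f} (sd {sd:.1f}), naive: {naive_average:.1f}")
```

With these made-up inputs the weighted estimate stays close to the prior, while the naive average gets dragged most of the way toward the noisy number. That gap is roughly what combining them badly looks like.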
9) In Bob's case, his data is technically correct! But there are two core issues: a) his revenue might not be the same in both cases; maybe the mobile app makes more than 4x as much revenue as the B2B product. b) also: valuation isn't all that matters! I'd prefer earnings.
10) Bob probably would have been better off just saying "let's build the business that seems the best" and ignoring valuation. In Alice's case, her data is probably being misinterpreted.
11) Yes, there is a positive correlation between having 10k employees and being successful: you can only hire 10k employees if you've done well. So there's a correlation, but the direction of causation is probably reversed: success lets you hire, rather than hiring making you successful.
12) And how about Zed? Well, what, in the end, is an impression? One of the important properties of Super Bowl ads: they're talked about again and again and again, in lots of places that are hard to track. The direct views significantly underestimate their impact.
13) And in this case, a simple gut check might have made Zed realize that _obviously_ Super Bowl ads have a large impact, and a lot of that is the chatter. So there are lots of ways to use data poorly. That doesn't make it useless--there are also lots of ways to use it well!
14) But if you do a mediocre job of using data, it just adds noise which distracts you from your baseline reasonable judgement. There is a fairly high bar that statistical analysis has to overcome to be net useful!
16) And this is a failure mode that a _lot_ of people fall into. The vast majority of statistics that I see quoted are useless. The times when stats are more likely to be useful are when they are answering a very specific, intentional question.
18) But aimlessly generating data just distracts. It's also very similar to a trap that some interview candidates fall into, particularly those with strong math backgrounds: Given a hard, messy question, they'll try to solve it exactly. And if they can't, they get flummoxed.
19) The flip side of overfit, irrelevant data: Fermi estimates. Trying to estimate quantitative factors without knowing all the relevant data is hard, but you can often get reasonable bounds on them. And those bounds can be extremely useful.
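To make the bounds idea concrete, here is a rough sketch of a Fermi estimate: break the quantity into factors, guess a low and a high value for each, and multiply through. The scenario (yearly revenue of a small coffee shop) and every number in it are hypothetical guesses, not data, and `fermi_bounds` is just an illustrative helper name.

```python
import math


def fermi_bounds(factors):
    """Multiply per-factor (low, high) guesses into overall bounds.

    Returns (low, high, point), where point is the geometric mean of the
    two bounds, a common choice when the range spans orders of magnitude.
    """
    low = math.prod(lo for lo, _ in factors)
    high = math.prod(hi for _, hi in factors)
    point = math.sqrt(low * high)
    return low, high, point


# Hypothetical guesses for a small coffee shop's yearly revenue.
factors = [
    (100, 400),  # customers per day
    (3, 8),      # dollars spent per customer
    (300, 365),  # days open per year
]

low, high, point = fermi_bounds(factors)
print(f"between ${low:,} and ${high:,}, roughly ${point:,.0f}")
```

Even a range this wide settles some questions: under these guesses the shop is not a ten-million-dollar business, and it is not a ten-thousand-dollar one either.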
1/ Sam for this thread. Most orgs a) have no idea how to use the data they have and b) don't realize that it typically requires significant cleansing before it is even usable. Many published stats are flawed because of this oversight.