I’m going to explain in this thread why this analysis is ignorant junk at best. (For anyone who doesn't know me & cares about credentials, I'm a lawprof/former econ prof with a PhD in economics from MIT & have published many stats-forward articles https://scholar.google.com/citations?user=nYi57uEAAAAJ&hl=en …)
When Benford’s “law” applies, we expect a distribution of numbers to exhibit more values that begin with a small leading digit than a larger one.
The reason is that, eg, to get to 20 you have to first go through 10, 11, …, 19, so there are ten numbers that begin with a “1” before any that begin with a “2”. Under some conditions this means you expect to see a first-digit distribution like this one: pic.twitter.com/Yntw3vpI6h
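For readers who want the numbers behind that picture, here is a minimal sketch of the expected first-digit shares. It assumes the standard statement of the Benford distribution, P(d) = log10(1 + 1/d), which the thread itself doesn't spell out; the function name `benford_expected` is made up for illustration.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of values whose first digit is `digit`,
    assuming the standard Benford formula log10(1 + 1/d)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

# Expected share for each possible leading digit, 1 through 9.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.3f}")
```

The shares fall steadily from about 30% for a leading “1” to under 5% for a leading “9”, and they sum to 1.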
Here’s the Wikipedia discussion of Benford’s “law” for those who are interested. https://en.wikipedia.org/wiki/Benford%27s_law …
The anonymous @statsguyphd did an initial analysis of precinct-level data. Someone else picked up their ball and has posted data and code at https://github.com/cjph8914/2020_benfords … (see @statsguyphd’s pinned tweet image below) pic.twitter.com/tpmdZ1AuSy
I went to the github link, and here are the results as applied to Chicago precinct-level data. WOW!!!! EXTREMELY ANOMALOUS!!!!! Biden/Harris have way too few precincts with leading-digit 1 and Trump/Pence has too *many*! Anomaly! Right??? Right? right? Um: Nope. pic.twitter.com/gaqDRht76x
I downloaded the Chicago csv file provided on the github link (Chi was first on the list) & with a little work found that of 2k+ precincts, only 9 had at least 1000 votes. Only 2 of those 9 precincts have at least 1000 *Biden* votes & *none* has at least 1000 Trump votes.
Why does that matter? Three reasons. First, it means the only way a candidate can have a leading “1” is to have precinct-level votes of 10-19 or 100-199. Second, Biden's avg precinct-level total in the data set is about 379. Third, Trump's avg precinct-level total is 78.
Put it all together & you see Trump has a lot more leading-digit “1” values than Benford’s “law” predicts because Trump's precinct-level vote totals are *low*. Chi voters don't dig Trump; he has many precinct totals in 1/10-19/100s. Chi voters like Biden; his totals are > 200.
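To make the mechanism concrete, here is a rough sketch with made-up ranges, not the Chicago data. It assumes, purely for illustration, that an unpopular candidate's precinct totals are spread evenly over 10-199 and a popular candidate's over 200-999; `leading_digit` and `share_leading_one` are hypothetical helper names.

```python
def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    if n <= 0:
        raise ValueError("leading digit is undefined for non-positive counts")
    return int(str(n)[0])

def share_leading_one(lo: int, hi: int) -> float:
    """Share of the integers lo..hi (inclusive) whose leading digit is 1."""
    values = range(lo, hi + 1)
    ones = sum(1 for v in values if leading_digit(v) == 1)
    return ones / len(values)

# Hypothetical ranges, chosen only to illustrate the argument.
low_share = share_leading_one(10, 199)    # unpopular candidate: small totals
high_share = share_leading_one(200, 999)  # popular candidate: totals of 200+

print(f"totals in 10-199:  {low_share:.1%} lead with 1")
print(f"totals in 200-999: {high_share:.1%} lead with 1")
```

Under those assumptions the low-total candidate gets a leading “1” well over half the time and the high-total candidate never does, with no fraud anywhere. The only thing driving the gap is where the totals sit.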
As with so much else, the ANOMALY!!! analysis boils down to suggesting that Democrats' popularity with Democratic voters is somehow surprising. Extremely anomalous! Maybe even *fraudulent*. Step off.
Benford’s “law” ain't a law; that's why I've been using quotes. The distribution in question provides a good fit under conditions that apply sometimes but not others. It fits poorly, eg, when, as here, data have a fixed max & are distributed away from 1/10-19/100-199. https://en.wikipedia.org/wiki/Benford%27s_law … pic.twitter.com/FXZyjidROU
Once a trained statistician, or even a kind of thoughtful person, sees this is precinct-level data & thinks about the size of precincts & the candidates' relative popularity, it should take that person about 14 seconds to realize Benford’s “law” analysis is garbage-in/garbage-out.
Still don’t believe me? Ok, let's do a simple analysis of the precinct-level vote *totals*. There are 30 precincts with a leading-digit “1” and WOW THAT’S EVEN FEWER THAN FOR BIDEN ALONE!!!!! ZOMG THE FRAUD EXTENDS TO THE PRECINCT-LEVEL **TOTAL** NUMBERS OF VOTES!!!! Nah.
Stop with the caps and !. You need to calm down / You’re being too loud.
This @statsguyphd's analysis isn’t evidence of anything even slightly anomalous. In the data, anyway. What it is evidence of is - to be charitable - deeply irresponsible sloppiness. The anonymous person behind the account should retract and apologize to those they duped.
If you still think there’s something here, I challenge you to go get precinct-level data from heavily pro-Trump areas with a similar distribution of total precinct-level votes, and redo this analysis. See what you get.
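If you want to take up that challenge, something like the following sketch would tabulate observed first-digit shares next to the Benford shares for any list of precinct counts. The function name `first_digit_table` and the `sample` list are invented for illustration; swap in real counts from whatever data you pull. Zero counts are skipped because they have no leading digit.

```python
import math
from collections import Counter

def first_digit_table(counts):
    """Map each digit 1-9 to (observed share, Benford expected share).

    Zero or negative counts are dropped, since they have no leading digit.
    """
    digits = [int(str(c)[0]) for c in counts if c > 0]
    tally = Counter(digits)
    n = len(digits)
    return {
        d: (tally[d] / n if n else 0.0, math.log10(1 + 1 / d))
        for d in range(1, 10)
    }

# Made-up counts for demonstration only; not real election data.
sample = [12, 150, 87, 203, 45, 118, 9, 310, 164, 76]
table = first_digit_table(sample)

for d, (observed, expected) in table.items():
    print(f"{d}: observed {observed:.2f}  Benford {expected:.2f}")
```

A gap between the two columns is not, by itself, evidence of anything. As above, you first have to ask whether the range of precinct sizes and the candidate's popularity would produce that gap on their own.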