2 - My student then pointed me to a tweet that was exploring this type of analysis, though they hadn't applied Benford's law. So I chimed in.
3 - I did not know what data they used, so I found a source for the context they referenced. I could not initially find write-ins versus non-write-ins, so I looked at candidate counts instead.
4 - I then wrote a quick script to gather that data. Here is an example of what the data-gathering portion of the process looked like: pic.twitter.com/9zfKvJhqU2
5 - With the data now available in code, I created a process to analyze how well the first digits conform to Benford's distribution. Conformity is often tested with a chi-squared goodness-of-fit test.
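A minimal sketch of the first-digit counting step, not the code from the thread. The function names and the sample numbers are hypothetical and only illustrate the idea of turning raw counts into nine first-digit tallies.

```python
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading digit of a non-zero integer."""
    return int(str(abs(n))[0])


def first_digit_counts(values):
    """Tally leading digits 1..9, skipping zeros (they have no leading digit)."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    # Return a fixed-length list so position i holds the tally for digit i + 1.
    return [counts.get(d, 0) for d in range(1, 10)]


# Hypothetical vote counts, for illustration only.
sample_counts = [1234, 987, 45, 120, 3021, 19, 2, 760, 1500]
observed = first_digit_counts(sample_counts)
```

Returning a fixed list of nine entries keeps the tallies aligned with the nine Benford probabilities, including digits that never appear.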
6 - I wrote the code to produce the discrete Benford's distribution. The code looks like this: pic.twitter.com/hKpAZgAW1b
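The original code is only available as an image, so here is a small sketch of one way to build the distribution. It uses the standard Benford first-digit formula, P(d) = log10(1 + 1/d) for d = 1..9. The function name is an assumption.

```python
import math


def benford_probabilities():
    """Benford's first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


benford = benford_probabilities()

# Print each digit with its expected share of leading digits.
for digit, p in enumerate(benford, start=1):
    print(f"{digit}: {p:.4f}")
```

The nine probabilities sum to 1, and the digit 1 is expected to lead about 30.1% of the time.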
7 - Now that I had the data and the distribution, I simply needed to perform the test. To do that, I used scipy's chisquare. Before running it, however, you need to produce the expected counts (not just the percentages). That part is simple.
8 - To do that, you take the total number of observations (the number of values that the first-digit counts are derived from) and multiply it by each of the Benford's distribution frequencies. This looks like this: pic.twitter.com/JRkbFvfBcZ
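A sketch of the expected-count step followed by the test itself. The thread uses scipy's chisquare. This version computes the Pearson statistic by hand so the arithmetic is visible and compares it with the standard 5% critical value for 8 degrees of freedom. The observed counts are hypothetical, not the election data.

```python
import math

# Benford's first-digit probabilities for d = 1..9.
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Hypothetical observed first-digit counts (digits 1..9), illustration only.
observed = [30, 18, 12, 10, 8, 7, 6, 5, 4]

total = sum(observed)                    # number of values the digits came from
expected = [total * p for p in benford]  # expected counts, not percentages

# Pearson chi-squared statistic: sum over digits of (O - E)^2 / E.
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value of the chi-squared distribution, 8 degrees of freedom, alpha = 0.05.
CRITICAL_8DF_05 = 15.507
rejects_benford = chi2_stat > CRITICAL_8DF_05

print(f"chi2 = {chi2_stat:.3f}, reject at 5%: {rejects_benford}")
```

With scipy installed, `scipy.stats.chisquare(f_obs=observed, f_exp=expected)` should give the same statistic along with a p-value. Nine digit categories leave 8 degrees of freedom, which is also the default that call uses.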
9 - The final process, put together, has some additional code to handle the data and count the digits from that webpage. It comes in two parts: the script setup and function definitions first, then the script itself in the next tweet. pic.twitter.com/2biVrnZ2P3
11 - In the end, Biden's vote data from that page deviates from Benford's distribution far more than Trump's does. Here is what it looks like visually: pic.twitter.com/7qPivR9zQX