Friday, 22 May 2026

Benford's Law - Refresh and month 1

While writing the F1 scrutineering posts, I realised that this was exactly the kind of testing I needed to be doing for my Benford's Law project (https://fulltimesportsfan.wordpress.com/2021/03/17/obey-benfords-its-the-law-an-introduction-to-my-benfords-law-project/).

This makes it an excellent opportunity to redo that project, but better, and to finalise it. 

The Benford's law project focussed on the leading digit of all numbers in the lead articles for one year of BBC.com front pages. 

It began in February 2021. 

The 28 daily news articles contained 436 numbers written in digits rather than spelled out (about 16 per day).
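To make the "expected" side of the comparison concrete, here is a minimal sketch of how the Benford expectations for 436 numbers could be worked out. It uses the standard Benford probability P(d) = log10(1 + 1/d); the function names `benford_expected` and `leading_digit` are my own illustrations, not the code used for the project.

```python
import math

def benford_expected(n_numbers):
    """Expected leading-digit counts under Benford's law, P(d) = log10(1 + 1/d)."""
    return {d: n_numbers * math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(text):
    """First non-zero digit of a numeral such as '0.045' or '1,200' (None if there is none)."""
    for ch in text:
        if ch in "123456789":
            return int(ch)
    return None

# 436 is the number of numerals counted in the February 2021 articles.
expected = benford_expected(436)
```

Under these assumptions, a leading 1 is expected about 131 times out of 436 and a leading 9 only about 20 times.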

The data looks like this:

[Figure: bar chart of the observed number of appearances of each leading digit compared to expected, with the difference expressed as a standardised residual. One is massively over-represented, with a standardised residual of 4.7.]

Calculated, it's:
X² = 37.434
df = 8
p-value = 9.576 × 10⁻⁶

The difference between the expected and the observed is statistically significant. 

Therefore, the leading digits do not obey Benford's law. 
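As a sanity check, the p-value can be recomputed from the reported X² and df alone. The sketch below avoids external libraries by using the closed-form upper tail of the chi-square distribution, which only exists for even degrees of freedom (df = 8 qualifies). The function name `chi2_sf_even_df` is hypothetical, and this is not necessarily how the original figure was calculated.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form only holds for positive even df")
    half = x / 2
    term, total = 1.0, 0.0
    for k in range(df // 2):
        total += term          # add (x/2)^k / k!
        term *= half / (k + 1)  # advance to the next term of the series
    return math.exp(-half) * total

# Values reported in the post: X² = 37.434 with df = 8.
p_value = chi2_sf_even_df(37.434, 8)
```

The result should land very close to the 9.576 × 10⁻⁶ quoted above; any small gap would come from X² having been rounded to three decimals.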

Obviously, this is just one month's worth of data. Most of the deviation comes from the digits 1 and 2. 

1 is massively over-represented (with a standardised residual of 4.7) and 2 is under-represented (standardised residual of -2.4). 3, 4, 7 and 8 appear about as often as expected, while 5, 6 and 9 are under-represented: 5 and 9 only slightly, 6 significantly (-2.07).
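For anyone wanting to reproduce the per-digit figures, here is a rough sketch of how residuals like these can be computed. It assumes "standardised residual" means the Pearson residual, (observed - expected) / sqrt(expected); some statistics packages additionally divide by sqrt(1 - p), so the numbers above may have been produced slightly differently. The names `pearson_residuals` and `chi_square` are illustrative only.

```python
import math

# Benford probabilities for leading digits 1 to 9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def pearson_residuals(observed):
    """Pearson residuals for nine observed counts (digits 1 to 9, in order)."""
    if len(observed) != 9:
        raise ValueError("expected nine counts, one per leading digit")
    n = sum(observed)
    expected = [n * p for p in BENFORD]
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

def chi_square(observed):
    """Chi-square statistic, which is the sum of the squared Pearson residuals."""
    return sum(r * r for r in pearson_residuals(observed))
```

With this definition the squared residuals add up to X², which is why a single residual of 4.7 already accounts for a large share of the total.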

Further reports to follow (I make no promises on the timeline; the World Cup and the Tour de France will keep me busy).
