Wednesday 20 April 2022

Benford's Law - From February 2021 to the end of July 2021

Today's post was supposed to be about cycling, and withdrawals from the Giro Rosa/Giro d'Italia Femminile compared to withdrawals in the men's Tour de France, but it requires more prose than I am presently capable of (running fencing competitions takes it out of you). 

Instead, let us return to an update to the Benford's Law project which has been chugging along in the background. 

In July, I recorded the first digits in the top news article on the BBC website on 25/31 days. In those 25 articles, there were 261 numbers with leading digits. That's 10-11 per day, which is fewer than February but the same as March and May.

July numbers - 

  Azzeqb.png 

No number appeared exactly as often as expected; 8 was the closest, only 0.1% away from expected.

1 and 7 are the most different to their expected values with 1 being over-represented and 7 under-represented. 

If you add together all the values of (observed - expected) squared, each divided by its expected value, the calculated test statistic is 3.6, the lowest monthly total so far.

The critical chi-squared value for 9 items with only one line is ~15.507. The test statistic is smaller than the critical value, therefore the difference is not significant. This data does not disobey Benford's Law.
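For anyone who wants to repeat the calculation, here is a minimal Python sketch of the test described above. It is not the spreadsheet I actually use; the function names are made up for illustration, the expected counts assume the standard Benford proportions of log10(1 + 1/d), and the critical value is simply the 15.507 quoted above.

```python
import math

# Critical chi-squared value quoted in the post (9 digits, so 8 degrees of freedom).
CRITICAL_VALUE = 15.507


def benford_expected(total):
    """Expected count of each leading digit 1-9 under Benford's Law."""
    return {d: total * math.log10(1 + 1 / d) for d in range(1, 10)}


def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over the digits 1-9.

    `observed` maps each leading digit to how many times it was seen.
    """
    total = sum(observed.values())
    expected = benford_expected(total)
    return sum(
        (observed.get(d, 0) - expected[d]) ** 2 / expected[d]
        for d in range(1, 10)
    )


def disobeys_benford(observed):
    """True if the test statistic exceeds the critical value."""
    return chi_squared(observed) > CRITICAL_VALUE
```

To use it, pass in a dictionary of counts per leading digit, for example `chi_squared({1: 30, 2: 18, ...})`, filling in your own tallies for all nine digits.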

If we look at the rolling total from February to the end of July, there have been 1860 numbers with leading digits.

Rolling total numbers

  Azz9IX.png 

No number exactly matched its expected value. 1 is the number furthest away from its expected value and remains over-represented.

If you add together all the values of (observed - expected) squared, each divided by its expected value, the calculated test statistic is 2.45, reducing as it should with more numbers.

The critical chi-squared value for 9 items with only one line is ~15.507. The test statistic is smaller than the critical value, therefore the difference is not significant. This data does not disobey Benford’s Law.

This is a reduction from the test statistic of the total to May, but it's not as low as it was in April.

Wednesday 13 April 2022

Formula 1 2022 - Australian Grand Prix

Both Melbourne and Sebastian Vettel are back!

Although one of those two had a much better weekend than the other.



From a Ferrari point of view, that could have gone better, but not much.  I'm not going to card Carlos Sainz jnr because ... well he out-performed the car and was calm and steady so often last season that I'll forgive him one bad race.

(Okay, I will forgive him many bad races but only because I <3 both my lovely Ferrari boys)


Wednesday 6 April 2022

The Draw for the 2022 World Cup

Normally, after a World Cup draw, I do an alternative "draw" based purely on the national team rankings.  For this World Cup, that is difficult, because 3 places have yet to be filled.  Yes, 3/32 teams (nearly 10%) still haven't qualified but they've had the draw anyway.

Now obviously, a pure potluck draw doesn't have quite the same mathematical issues as mine (I did actually calculate the draw by rankings for all 18 possible permutations, and will add them slowly to the data-blog) but is there any need to have a draw this early?!

Saturday 2 April 2022

Formula 1 2022 - Saudi Arabian Grand Prix

Late due to computer issues 

Once it was clear that Mick Schumacher was fine after that crash, I said I'd be happy with whatever the result was. Hopefully, the FIA will see reason and require the circuit to make safety changes. There's no point enforcing safety upgrades at Spa and so on, if they're not going to do the same to every race on the calendar.

With regard to Ferrari's results, second and third place are not bad. I can't expect Ferrari to win every race - I can hope, but I can't expect.
  AAMY5O.png