Saturday, 30 October 2021

F1 2021 - US Grand Prix

Something about this race felt like it marked a real sea-change. 

For so long, if any non-Mercedes car was in the lead and a Mercedes was hunting, the result was inevitable. That Mercedes was going to close and close and then press the go-faster button; cue *yet another* Mercedes win. 

This time that didn't happen. 

The Mercedes caught up, but couldn't get within DRS range of Verstappen's Red Bull. 

Now, I'm hoping this means an end to the automatic Mercedes dominance (yes, two very dominant teams is not that much better than just having one dominant team, but it is better). Would I like the pretty red cars to reach the stage of not being one of the teams that is automatically overtaken by any chasing Mercedes? Yes, but right now I am more concerned with them cleanly beating the McLarens and getting third place. 

That effort is being hamstrung by some poor tyre choices (for instance, Sainz jnr having to start on soft tyres) and by slow pitstops, which seem to have become a bane again.

This means the updated diagram contains one red card and one yellow card. The yellow card will be upgraded to red if pitstop performance doesn't improve.

I am going to give Sainz jnr another off-the-chart cookie for putting up with the nonsense. Also deserving of cookies for putting up with the nonsense is Michael Masi, F1 race director, or "chief cat-herder of the naughty schoolchildren" as the role is also known. Never have I heard someone quite so done with teams' whinging as Masi here.

Thursday, 21 October 2021

Copa America 2021 Network Diagrams

Following my Euro <strike>2020</strike> 2021 posts (link), L raised an interesting point.  While my theory that "closer to the centre = more likely to win, and less tightly connected = more likely to go out" has held up over several Euros, one Rugby World Cup, one Women's Football World Cup and one Men's Football World Cup (after the group stages), in those situations I knew enough about the relative strengths of the teams that my opinion might have been biased.  The Copa America 2021 gave me an opportunity to demonstrate whether my theory worked in a competition where I wouldn't know as much about the relative strength of the teams taking part.
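As a rough sketch of how the "closer to the centre" idea could be put into numbers: the snippet below builds a graph linking national teams to the clubs their players belong to, then scores each national team by harmonic centrality (the sum of 1/distance to every node it can reach). This is an assumption on my part about how one might measure "central"; the diagrams themselves come from a layout, not from this calculation. The squads, nation names and club names are entirely hypothetical placeholders, not real Copa America data.

```python
from collections import deque

# Hypothetical squads: national team -> clubs its players play for.
# These are placeholders, NOT real squad lists.
squads = {
    "Nation A": ["Club 1", "Club 2"],
    "Nation B": ["Club 2", "Club 3"],
    "Nation C": ["Club 3", "Club 4"],
    "Nation D": ["Club 5"],  # shares no club with anyone: unconnected
}


def build_graph(squads):
    """Link each national team to every club that supplies it a player."""
    graph = {}
    for nation, clubs in squads.items():
        graph.setdefault(nation, set())
        for club in clubs:
            graph[nation].add(club)
            graph.setdefault(club, set()).add(nation)
    return graph


def distances(graph, start):
    """Breadth-first search: number of links from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def harmonic_centrality(graph, node):
    """Sum of 1/distance to every other reachable node (unreachable nodes add 0)."""
    return sum(1 / d for other, d in distances(graph, node).items() if other != node)


graph = build_graph(squads)
scores = {nation: harmonic_centrality(graph, nation) for nation in squads}
most_central = max(scores, key=scores.get)
least_central = min(scores, key=scores.get)
```

Harmonic centrality is used here rather than plain closeness because it still gives a sensible (low) score to a team that is cut off from the rest of the graph.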

There were some complicating factors, starting with the COVID pandemic.  Due to the pandemic, CONMEBOL stated that nations could nominate squads of 28.  The squad lists had to be with CONMEBOL by the 10th of June.  Not all nations nominated 28, which does leave the diagrams looking lopsided.

Plotting the first round teams gives me the following diagram:



The national team closest to the centre is Chile, with Racing the club team closest to the centre.

River Plate and Bolivar are the club teams with the most players (7) followed by Atlético Madrid with 6.  The lower numbers of players per club team are reflected in how much more spread out this diagram is compared to the equivalent Euro 2020 diagram.

Bolivia are the only unconnected team.  Eight of the ten teams go through to the second round, and from this diagram, I would expect Bolivia to be one of the first to go out.  The most likely other team to be eliminated is unclear from the diagram, though I would guess one of Peru, Venezuela or Paraguay.  Brazil, however, are actually one of the outer teams, which even I know suggests a problem with the system.

Even if you include all the players, including injury replacements (CONMEBOL rules allowing outfield replacements as well as goalkeeper replacements) and COVID-related replacements, you still get a very similar image.


Chile are still the central national team with Universidad Católica (the Chilean team of that name) as the central club team.  The clubs with the most players remain River Plate and Bolivar with 7 players and Atlético Madrid with 6.

Bolivia, Venezuela and Paraguay are the most outlying teams, followed by Peru, Ecuador and Brazil.

I suspect that the shape of this diagram is influenced by two factors:

1 - a lot of teams have players who play for club teams that only have representatives for that national team so the diagram is spaced out. 

2 - a lot of Brazil and Argentina players play for European clubs and those two teams have disproportionately few players playing in South America, but have many players playing for the same teams so they're pulling each other away from the centre.

Once two teams (Bolivia and Venezuela) were eliminated to give the quarterfinal teams, the diagram looked like this:


Chile remain the national team closest to the centre, with Universidad de Chile the club team closest to the centre.

The most outlying teams are Peru, Brazil, Ecuador and Paraguay.  Looking at the quarterfinal teams, my theory still doesn't work for the Copa America.  (Again, I think because a lot of Brazilians play in Europe.)

It took until the semifinal diagram for Brazil to no longer be one of the outlying teams.  Peru are now the outliers, with the remaining 3 teams (Argentina, Brazil and Colombia) forming a solid triangle.



Argentina or Peru are the national team closest to the centre, with Everton (?!!) the club team closest to the centre.

Peru stick out.

Following the semifinals, the final diagram looks like this:



It was a Brazil vs Argentina final, which I suspect everyone with a financial interest in South American football was hoping for.  The final diagram is much more interlinked than Euro or World Cup finals.  None of the linking teams are South American, which makes it very clear that *all* the money is in the European game.  This undoubtedly does have a distorting effect on other Federations' Federation Cups (e.g. player release for the Africa Cup of Nations).

Wednesday, 13 October 2021

F1 2021 - Turkish Grand Prix



Yes, that is a cookie for Ferrari.  Even more unexpectedly, it's a cookie for the strategy team!  Yes, I am astounded.

But they tried.  Not everything came off (see leaving Leclerc out maybe 2 laps too long), but they tried.  And one thing definitely succeeded: sending Sainz out to give Leclerc a tow when it looked like he might not make it out of Q2.  After years of watching Ferrari struck by indecision, it's so refreshing.  Interestingly, I have no idea who the Ferrari strategy person is, which is a good sign of a strategy person doing their job.

The committee have said I am not allowed to give Sainz a cookie just because.  But he definitely deserves a cookie just because.

Wednesday, 6 October 2021

Benford's Law Posts - Back From A Break With May's Results

This follows the three previous posts.

I was better at remembering to add the daily article in May, adding articles on 29 of 31 days.

Looking at May's articles only, 313 leading digit numbers were used (10-11 per day, slightly more than April, about the same as March and less than February).

3 is appearing at its expected percentage. 1 and 7 are the most different to their expected values, with 1 being over-represented and 7 under-represented. If you take (observed - expected) squared, divided by the expected, for each digit and add those values together, the calculated test statistic is 6.67, slightly higher than April.

The critical chi squared value for 9 items with only one line (8 degrees of freedom, at the 5% significance level) is ~ 15.507

The test statistic is smaller than the critical value, therefore the difference is not significant. This data does not disobey Benford's Law.
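For anyone who wants to repeat the calculation, here is a minimal sketch of it in Python. It works out the Benford expected count for each leading digit (total x log10(1 + 1/digit)), then adds up (observed - expected) squared over expected. The digit counts below are made-up placeholders that simply sum to 313; they are not my actual May counts, and the function names are just ones chosen for this example.

```python
import math


def benford_expected(total):
    """Expected count for each leading digit 1-9 under Benford's Law."""
    return {d: total * math.log10(1 + 1 / d) for d in range(1, 10)}


def chi_squared_statistic(observed):
    """Sum over the nine digits of (observed - expected)^2 / expected."""
    total = sum(observed.values())
    expected = benford_expected(total)
    return sum(
        (observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10)
    )


# Hypothetical leading-digit counts (placeholders, NOT the real May data).
hypothetical_counts = {1: 100, 2: 55, 3: 39, 4: 30, 5: 25, 6: 20, 7: 15, 8: 16, 9: 13}

# Critical value quoted in the post: 9 digits, so 8 degrees of freedom, 5% level.
CRITICAL_VALUE = 15.507

statistic = chi_squared_statistic(hypothetical_counts)
significant = statistic > CRITICAL_VALUE
print(f"test statistic = {statistic:.2f}, significant = {significant}")
```

Swapping in the real counts for a month (or for the rolling total) is just a matter of replacing the placeholder dictionary.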

If we look at the rolling total from February to the end of May, there have been 1254 numbers with leading digits.

2 and 3 are the numbers closest to their expected values. 1 is the number furthest away from its expected value and remains over-represented; the next furthest away is 6, which is under-represented. If you take (observed - expected) squared, divided by the expected, for each digit and add those values together, the calculated test statistic is 2.84.

The critical chi squared value for 9 items with only one line (8 degrees of freedom, at the 5% significance level) is ~ 15.507

The test statistic is smaller than the critical value, therefore the difference is not significant. This data does not disobey Benford’s Law.

Interestingly, as more numbers from articles are added, you would expect the calculated test statistic to reduce.  Previously, it has (February = 8.6, February + March = 3.49, February + March + April = 2.29), but the test statistic has increased this time to 2.84, possibly explained by the articles from the 1st, 7th and 8th of May being very skewed towards the number 1 and having a lot of numbers in them.