Identifying some issues with my models that affected past results.
So, I screwed up.
This is not unexpected: this was year 1 of rolling out the new models, so there were bound to be some mistakes. But it's important that I'm transparent about what went wrong and what it affected, so that you all know you can trust me.
The issues were with how Minnow calculated its efficiency metrics. Since Branchy also uses those metrics, the issue affected Branchy as well. For those of you who don't care about the math, feel free to skip to the next section, where I break down how the fix affected the data. Essentially, I was failing to properly anchor my parameters, and I wasn't stabilizing my alpha hyperparameter. That's why the issue became apparent when I went to update my rankings for the Sweet 16: the values all shifted.
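I'm not going to reproduce Minnow's actual fitting code here, but for the curious, here's a minimal Python sketch of what "anchoring" and "stabilizing" mean in this context, assuming an iterative ratings solver. The function names and constants below are mine for illustration, not Minnow's actual internals:

```python
import numpy as np

def anchor_ratings(off, deff, league_avg):
    """Re-center offensive/defensive efficiency ratings each iteration so the
    league average stays fixed. Without an anchor like this, the whole rating
    scale can drift between fits, which is one way values "shift" on refresh."""
    off = off - off.mean() + league_avg
    deff = deff - deff.mean() + league_avg
    return off, deff

def stabilize_alpha(alpha, prior=0.5, shrink=0.9, lo=0.0, hi=1.0):
    """Shrink alpha toward a prior and clamp it to a sane range,
    so no single update can swing the hyperparameter wildly."""
    alpha = shrink * alpha + (1 - shrink) * prior
    return min(max(alpha, lo), hi)

# The anchored ratings keep a stable mean no matter the raw inputs:
off, deff = anchor_ratings(np.array([120.0, 100.0, 80.0]),
                           np.array([95.0, 105.0, 100.0]), 100.0)
print(off.mean(), deff.mean())  # → 100.0 100.0
```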
First, let’s look at how the changes affect the results of the 2025 MLBR (Machine Learning Battle Royale). Since Minnow and Branchy were both affected, the composite was naturally affected as well; every other value remains the same. Next to the rank, I include in parentheses the change in rank from the old evaluation: (+1) means the model moved up one spot, and (-1) means it moved down one (+ is a better rank, - is a worse rank).
Accuracy:

| Rank | Model | Old Score | New Score | Diff |
|---|---|---|---|---|
| 1 | Branchy Brackets | 88.89% | 90.48% | +1.59% |
| 2 (+1) | Composite | 82.54% | 85.71% | +3.17% |
| 3 (-1) | Torvik | 84.13% | NA | NA |
| 4 (-1) | Resumetric | 82.54% | NA | NA |
| T5 (+1) | Minnow | 79.37% | 80.95% | +1.58% |
| T5 | MNPI | 80.95% | NA | NA |
| 7 | Chalk | 77.78% | NA | NA |
Log loss (lower is better):

| Rank | Model | Old Score | New Score | Diff |
|---|---|---|---|---|
| 1 | Branchy Brackets | 0.3494 | 0.3509 | +0.0015 |
| 2 | Resumetric | 0.3628 | NA | NA |
| 3 | Composite | 0.3753 | 0.3821 | +0.0068 |
| 4 (+1) | Torvik | 0.4047 | NA | NA |
| 5 (-1) | Minnow | 0.3846 | 0.4072 | +0.0226 |
| 6 | MNPI | 0.4383 | NA | NA |
| 7 | Chalk | 2.0468 | NA | NA |
(For context, 0.693 is the log loss of a random classifier)
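That 0.693 is just the natural log of 2: a "model" that calls every game a coin flip pays -ln(0.5) per game no matter what happens. A quick sketch, with a from-scratch log loss rather than any library's version:

```python
import math

def log_loss(y_true, p_pred):
    """Mean negative log-likelihood of predicted win probabilities.
    y_true: 1 if the predicted team won, else 0. Lower is better."""
    eps = 1e-15  # guard against log(0) on hard 0/1 predictions
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# A random classifier says 50% every game and scores ln(2) regardless of outcomes:
outcomes = [1, 0, 1, 1, 0]
print(round(log_loss(outcomes, [0.5] * len(outcomes)), 3))  # → 0.693
```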
Tourney points:

| Rank | Model | Old Score | New Score | Diff |
|---|---|---|---|---|
| 1 | Branchy Brackets | 1760 | 1790 | +30 |
| 2 (+1) | Composite | 1630 | 1760 | +130 |
| 3 (-1) | Resumetric | 1740 | NA | NA |
| 4 | Torvik | 1270 | NA | NA |
| 5 | Minnow | 1140 | 1150 | +10 |
| 6 | Chalk | 1090 | NA | NA |
| 7 | MNPI | 1060 | NA | NA |
Expected tourney points:

| Rank | Model | Old Score | New Score | Diff |
|---|---|---|---|---|
| 1 | Resumetric | 1114.85 | NA | NA |
| 2 | Branchy Brackets | 1079.35 | 1079.66 | +0.31 |
| 3 | Composite | 955.24 | 944.28 | -10.96 |
| 4 | Minnow | 904.86 | 865.30 | -39.56 |
| 5 | MNPI | 764.66 | NA | NA |
Most of the differences were subtle, but a couple stood out. In particular, the composite saw a moderate leap in accuracy and a huge leap in tourney points. I’m going to chalk that up to random chance more than anything analytically significant, since the difference comes down to a few predicted probabilities shifting just barely past 50%. We can confirm that suspicion by noting that the composite’s expected tourney points actually decreased (if only slightly). However, if the composite performs similarly well after this year’s tournament upon review, then we can start to sing its praises.
At first glance, the updated scores might seem a bit concerning: all three affected models scored worse in log loss, which I identified in my original post as likely the most important metric, and Minnow and the composite both lost expected tourney points. We have to remember that we’re looking at a relatively small sample size, and that the 2025 tournament was exceptionally chalky. The issues I fixed in Minnow had a few different practical effects, but one of them was that the old, buggy version was more confident in favorites. That served it well in a tourney as chalky as 2025, but I remain confident that a more balanced approach will benefit it in the long run, especially since the fixes are well-founded and principled. I’ll keep an eye on it moving forward, and if the old versions are routinely outperforming the new ones once we’ve built up a dataset of 3-5 years, then we can contemplate switching back, and perhaps do some research into why having untethered parameters benefits these models.
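To see why extra confidence in favorites pays off only when favorites keep winning, here's a quick per-game illustration (the 80%/60% probabilities are hypothetical, not Minnow's actual outputs):

```python
import math

def game_log_loss(y, p):
    """Log loss contribution of one game: outcome y (1 = predicted team won),
    predicted win probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confident model (80%) vs cautious model (60%) on the same favorite:
print(round(game_log_loss(1, 0.80), 3))  # favorite wins: confident pays 0.223
print(round(game_log_loss(1, 0.60), 3))  # favorite wins: cautious pays 0.511
print(round(game_log_loss(0, 0.80), 3))  # upset: confident pays 1.609
print(round(game_log_loss(0, 0.60), 3))  # upset: cautious pays 0.916
```

In a chalky year with few upsets, the confident model collects the small penalty almost every game; the occasional upset is where the cautious model earns its keep.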
There are changes up and down the rankings, with Minnow in particular (and, to a lesser extent, Branchy) now slightly more favorable to underdogs, but the most obvious changes are at the top, so let’s focus on that:
| Team | Minnow | Branchy |
|---|---|---|
| Duke | 30% (-1%) | 42% (-10%) |
| Arizona | 12% (+0%) | 36% (+13%) |
| Michigan | 19% (-3%) | 10% (-2%) |
| The Field | 39% (+4%) | 12% (-1%) |
As expected, the models flattened out somewhat, with Minnow shifting some confidence away from the favorites and back to the field.
As for Branchy, honestly, who knows what that model is thinking. This might not inspire a lot of faith in me, but your guess is as good as mine when it comes to why that model decided to start liking Arizona more. The only inputs that changed for Branchy were the adjusted efficiency metrics provided by Minnow. My best guess is that those subtle differences taught Branchy new interactions that snowballed into a big change at the top.
Here’s a breakdown of the changes to Minnow rankings further down the list:
| # | Team | Old Rank | New Rank | Change |
|---|---|---|---|---|
| 1 | Maryland | 134 | 182 | 🔻 48 |
| 2 | Utah | 122 | 167 | 🔻 45 |
| 3 | Penn St. | 142 | 181 | 🔻 39 |
| 4 | Navy | 163 | 127 | 🔺 36 |
| 5 | Rutgers | 144 | 179 | 🔻 35 |
| 6 | Georgia Tech | 158 | 186 | 🔻 28 |
| 7 | East Tennessee St. | 150 | 123 | 🔺 27 |
| 8 | Oregon | 103 | 130 | 🔻 27 |
| 9 | Boston College | 149 | 176 | 🔻 27 |
| 10 | Austin Peay | 183 | 158 | 🔺 25 |
| # | Team | Old Rank | New Rank | Change |
|---|---|---|---|---|
| 1 | Howard | 225 | 192 | 🔺 33 |
| 2 | Siena | 186 | 154 | 🔺 32 |
| 3 | UMBC | 202 | 172 | 🔺 30 |
| 4 | North Dakota St. | 130 | 103 | 🔺 27 |
| 5 | Tennessee St. | 168 | 147 | 🔺 21 |
| 6 | High Point | 76 | 56 | 🔺 20 |
| 7 | LIU | 233 | 215 | 🔺 18 |
| 8 | Hawaii | 121 | 104 | 🔺 17 |
| 9 | Cal Baptist | 133 | 116 | 🔺 17 |
| 10 | Idaho | 177 | 160 | 🔺 17 |
| 11 | Wright St. | 151 | 135 | 🔺 16 |
| 12 | McNeese St. | 66 | 52 | 🔺 14 |
| 13 | Troy | 140 | 126 | 🔺 14 |
| 14 | Akron | 56 | 45 | 🔺 11 |
| 15 | UCF | 51 | 62 | 🔻 11 |
| 16 | Miami OH | 81 | 70 | 🔺 11 |
| 17 | Northern Iowa | 68 | 58 | 🔺 10 |
| 18 | Furman | 174 | 164 | 🔺 10 |
| 19 | Prairie View A&M | 323 | 313 | 🔺 10 |
| 20 | Queens | 203 | 194 | 🔺 9 |
| 21 | Hofstra | 79 | 71 | 🔺 8 |
| 22 | Lehigh | 281 | 273 | 🔺 8 |
| 23 | Penn | 129 | 136 | 🔻 7 |
| 24 | Saint Louis | 31 | 25 | 🔺 6 |
| 25 | Utah St. | 33 | 28 | 🔺 5 |
| 26 | Santa Clara | 40 | 35 | 🔺 5 |
| 27 | South Florida | 45 | 40 | 🔺 5 |
| 28 | Missouri | 54 | 59 | 🔻 5 |
| 29 | Connecticut | 17 | 13 | 🔺 4 |
| 30 | Virginia | 18 | 14 | 🔺 4 |
| 31 | VCU | 42 | 38 | 🔺 4 |
| 32 | Texas | 38 | 42 | 🔻 4 |
| 33 | Gonzaga | 11 | 8 | 🔺 3 |
| 34 | Arkansas | 13 | 16 | 🔻 3 |
| 35 | Saint Mary’s | 25 | 22 | 🔺 3 |
| 36 | Wisconsin | 27 | 30 | 🔻 3 |
| 37 | Ohio St. | 28 | 31 | 🔻 3 |
| 38 | SMU | 36 | 39 | 🔻 3 |
| 39 | Michigan St. | 9 | 11 | 🔻 2 |
| 40 | Texas Tech | 15 | 17 | 🔻 2 |
| 41 | Alabama | 16 | 18 | 🔻 2 |
| 42 | Kansas | 19 | 21 | 🔻 2 |
| 43 | Kentucky | 22 | 24 | 🔻 2 |
| 44 | North Carolina | 24 | 26 | 🔻 2 |
| 45 | N.C. State | 30 | 32 | 🔻 2 |
| 46 | Villanova | 39 | 41 | 🔻 2 |
| 47 | Purdue | 8 | 9 | 🔻 1 |
| 48 | Tennessee | 14 | 15 | 🔻 1 |
| 49 | St. John’s | 20 | 19 | 🔺 1 |
| 50 | Nebraska | 21 | 20 | 🔺 1 |
| 51 | Iowa | 26 | 27 | 🔻 1 |
| 52 | Miami FL | 34 | 33 | 🔺 1 |
| 53 | Clemson | 35 | 34 | 🔺 1 |
| 54 | UCLA | 37 | 36 | 🔺 1 |
| 55 | Texas A&M | 43 | 44 | 🔻 1 |
| 56 | TCU | 47 | 46 | 🔺 1 |
| 57 | Kennesaw St. | 164 | 165 | 🔻 1 |
| 58 | Duke | 1 | 1 | — 0 |
| 59 | Michigan | 2 | 2 | — 0 |
| 60 | Arizona | 3 | 3 | — 0 |
| 61 | Florida | 4 | 4 | — 0 |
| 62 | Houston | 5 | 5 | — 0 |
| 63 | Illinois | 6 | 6 | — 0 |
| 64 | Iowa St. | 7 | 7 | — 0 |
| 65 | Vanderbilt | 10 | 10 | — 0 |
| 66 | Louisville | 12 | 12 | — 0 |
| 67 | BYU | 23 | 23 | — 0 |
| 68 | Georgia | 29 | 29 | — 0 |