

A Case Study on the Fall 2023 Women's ACCs

Author: SailRank

Date: 2023-10-19

Before We Begin

This document will include lots of statistical terminology. Sorry…

That said, I have tried to make it as fun and graphical as possible so that as many of you as possible can get an understanding of how exactly we evaluate the performance of the SailRank algorithms. Below are some resources that do a better job than I could at explaining what some of the statistical terms mentioned below mean and how you should interpret them.

  • Correlation Coefficient: The correlation coefficient that you will see the most throughout this document is Spearman’s Correlation Coefficient. This number is ideal for comparing the relationship of one ranking to another ranking. This webpage gives you an introduction to how the coefficient is used to perform hypothesis tests on ranking datasets (a tiny toy example also follows this list).

  • Hypothesis Testing: Hypothesis tests likely bring flashbacks to the AP Stats class you took senior year of high school. While it may seem complicated at first, a hypothesis test is really just a set of instructions for interpreting the results of various statistical analyses. Data is fun to look at, but without a hypothesis test it is really just a bunch of numbers. This webpage does a good job of explaining what the various terms in a hypothesis test mean and how you can read them.

  • Means and Standard Deviations: Those flashbacks are just gonna keep coming, huh… As we will discuss further down, understanding means and standard deviations is crucially important to understanding SailRank. Here’s a quick refresher on what those terms mean.

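If you have never computed one of these before, here is a tiny made-up example (not ACCs data) of how Spearman’s coefficient compares two rankings of five hypothetical teams in R. A value of 1 would mean the rankings agree exactly, 0 would mean no relationship, and -1 would mean they are exact opposites.

toy_rank_a = c(1, 2, 3, 4, 5)
toy_rank_b = c(2, 1, 3, 5, 4)
cor(toy_rank_a, toy_rank_b, method = "spearman")
## [1] 0.8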

Examining The Dataset

library(readxl)     # for read_xlsx()
library(tidyverse)  # for the pipe, dplyr verbs, and ggplot2 used throughout
WACCS_Data = read_xlsx("F23W5_WACCS_Data.xlsx")
WACCS_Data
## # A tibble: 18 × 10
##    School   Abbrev SWRank SRPRank SRPre…¹ Actual Actua…² PredS…³ PredS…⁴ PredS…⁵
##    <chr>    <chr>   <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
##  1 Stanford STAN        1       1       1      1      99    87.4    75.5    99.4
##  2 Boston … BC          3       8       3      2     136   169.    131.    207. 
##  3 Cornell  COR         4       4       5      3     158   187.    133.    241. 
##  4 Yale     YALE        2       3       2      4     170   137.    120.    154. 
##  5 Harvard  HAR        10       2       4      5     182   191.    172.    210. 
##  6 Dartmou… DART        6       5       6      6     193   195.    156.    235. 
##  7 Brown    BR         12      12       7      7     201   207.    173.    240. 
##  8 Charles… COC         7       7       8      8     233   223.    204.    241. 
##  9 Navy     USNA       11      17      11      9     239   245.    179.    312. 
## 10 Georget… GTN        14      14      10     10     239   253.    232.    275. 
## 11 Tulane   TUL        16      15      12     11     252   259.    239.    278. 
## 12 MIT      MIT         5       6       9     12     271   227.    204.    250. 
## 13 Bowdoin  BOW         9       9      16     13     275   277.    244.    310. 
## 14 UPenn    UPN        15      16      15     14     276   274.    232.    315. 
## 15 USF      USF        18      10      13     15     276   270.    230.    310. 
## 16 Coast G… CGA         8      13      14     16     278   270.    228.    313. 
## 17 Tufts    TUF        17      18      18     17     297   330.    313.    348. 
## 18 NC State NCS        13      11      17     18     335   302.    259.    345. 
## # … with abbreviated variable names ¹​SRPredict, ²​ActualScore, ³​PredScoreMean,
## #   ⁴​PredScoreLower, ⁵​PredScoreUpper
Variable Descriptions

  • SWRank: Rank of a team from the Sailing World Rankings, October 2023
  • SRPRank: Relative ranking positions of the first teams from the F23W5 SailRank Women’s Fleet Racing Power Rankings
  • SRPredict: Predicted finishing place at ACCs using the SailRank Regatta Prediction Algorithm
  • Actual: Actual finishing place at ACCs
  • ActualScore: Finishing score from ACCs
  • PredScoreMean: Predicted mean score at ACCs from SRPredict
  • PredScoreLower: Lower confidence interval score at ACCs from SRPredict
  • PredScoreUpper: Upper confidence interval score at ACCs from SRPredict

Why are the SRPredict values different from the SRPRank values?

\(SRPredict\) uses rotation information when calculating the predicted places for a given race, meaning it uses the actual skipper and crew that sailed each race in the model. It utilizes the same algorithm as the one used in the SailRank Power Rankings, but with more accurate information provided to it. As rotations are of course not entered before regattas are sailed, it is not possible for us to publish these numbers ahead of time, but we are working on a possible middle-ground solution for fully predicting regattas before the event starts!


Sailing World Rankings Comparison

We would be doing ourselves a disservice if we did not acknowledge that this past week the famed Sailing World Rankings were published for the first time this season. So where did we agree (or agree to disagree)?

Quick Note: Many of the following plots will utilize a red line representing a perfect 1:1 correlation between the two metrics being compared.

WACCS_Data %>% ggplot(aes(x=SWRank, y=SRPRank, label=School)) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  geom_point() +
  labs(title = "SailRank Power Rankings vs. Sailing World Rankings", subtitle = "Week of Fall 2023 Women's ACCs", x="Sailing World Ranking", y="SailRank Power Ranking") +
  geom_text(hjust=0.5, vjust=1.4, size=3) +
  theme_minimal()+
  xlim(0,20) +
  ylim(0,20)

WACCS_Data %>% mutate(RankingDiff = SRPRank - SWRank) %>% select(School, RankingDiff) %>% arrange(desc(abs(RankingDiff)))
## # A tibble: 18 × 2
##    School         RankingDiff
##    <chr>                <dbl>
##  1 Harvard                 -8
##  2 USF                     -8
##  3 Navy                     6
##  4 Boston College           5
##  5 Coast Guard              5
##  6 NC State                -2
##  7 Yale                     1
##  8 Dartmouth               -1
##  9 Tulane                  -1
## 10 MIT                      1
## 11 UPenn                    1
## 12 Tufts                    1
## 13 Stanford                 0
## 14 Cornell                  0
## 15 Brown                    0
## 16 Charleston               0
## 17 Georgetown               0
## 18 Bowdoin                  0
cor(WACCS_Data$SRPRank, WACCS_Data$SWRank, method = "spearman")
## [1] 0.7688338

Our SailRank Power Rankings exhibit a fairly strong positive correlation (\(\rho \approx 0.77\)) with the Sailing World Rankings. Notable differences in opinion are Harvard, USF, Navy, USCGA, and Boston College.


Comparing The True Results To The Predictions

Sailing World

WACCS_Data %>% ggplot(aes(x=SWRank, y=Actual, label=School)) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  geom_point() +
  labs(title = "Actual Results vs. Sailing World Rankings", subtitle = "Week of Fall 2023 Women's ACCs", x="Sailing World Ranking", y="ACCs Actual Finishing Order") +
  geom_text(hjust=0.5, vjust=1.4, size=3) +
  theme_minimal()+
  xlim(0,20) +
  ylim(0,20)

WACCS_Data %>% mutate(RankingDiff = SWRank - Actual) %>% select(School, RankingDiff) %>% arrange(desc(abs(RankingDiff)))
## # A tibble: 18 × 2
##    School         RankingDiff
##    <chr>                <dbl>
##  1 Coast Guard             -8
##  2 MIT                     -7
##  3 Harvard                  5
##  4 Brown                    5
##  5 Tulane                   5
##  6 NC State                -5
##  7 Georgetown               4
##  8 Bowdoin                 -4
##  9 USF                      3
## 10 Yale                    -2
## 11 Navy                     2
## 12 Boston College           1
## 13 Cornell                  1
## 14 Charleston              -1
## 15 UPenn                    1
## 16 Stanford                 0
## 17 Dartmouth                0
## 18 Tufts                    0

Starting off with the Sailing World Rankings. Notable differences: Coast Guard and MIT underperformed their Sailing World Ranking by 8 and 7 places respectively, while Harvard, Brown, and Tulane all outperformed theirs by 5 spots. North Carolina State also underperformed by 5 places.

SailRank Power Rankings

WACCS_Data %>% ggplot(aes(x=SRPRank, y=Actual, label=School)) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  geom_point() +
  labs(title = "Actual Results vs. SailRank Power Rankings", subtitle = "Week of Fall 2023 Women's ACCs", x="SailRank Power Ranking", y="ACCs Actual Finishing Order") +
  geom_text(hjust=0.5, vjust=1.4, size=3) +
  theme_minimal()+
  xlim(0,20) +
  ylim(0,20)

WACCS_Data %>% mutate(RankingDiff = SRPRank - Actual) %>% select(School, RankingDiff) %>% arrange(desc(abs(RankingDiff)))
## # A tibble: 18 × 2
##    School         RankingDiff
##    <chr>                <dbl>
##  1 Navy                     8
##  2 NC State                -7
##  3 Boston College           6
##  4 MIT                     -6
##  5 Brown                    5
##  6 USF                     -5
##  7 Georgetown               4
##  8 Tulane                   4
##  9 Bowdoin                 -4
## 10 Harvard                 -3
## 11 Coast Guard             -3
## 12 UPenn                    2
## 13 Cornell                  1
## 14 Yale                    -1
## 15 Dartmouth               -1
## 16 Charleston              -1
## 17 Tufts                    1
## 18 Stanford                 0

Now looking at our own Power Rankings! Notable Differences: Navy, Boston College, and Brown all outperformed the SailRank Power Rankings by 8, 6, and 5 places respectively. NC State, MIT, and USF underperformed by 7, 6, and 5 places respectively.

SailRank Regatta Predictions

WACCS_Data %>% ggplot(aes(x=SRPredict, y=Actual, label=School)) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  geom_point() +
  labs(title = "Actual Results vs. SailRank Regatta Predictions", subtitle = "Using Race By Race Rotation Information", x="SailRank Regatta Predicted Place", y="ACCs Actual Finishing Order") +
  geom_text(hjust=0.5, vjust=1.4, size=3) +
  theme_minimal()+
  xlim(0,20) +
  ylim(0,20)

WACCS_Data %>% mutate(RankingDiff = SRPredict - Actual) %>% select(School, RankingDiff) %>% arrange(desc(abs(RankingDiff)))
## # A tibble: 18 × 2
##    School         RankingDiff
##    <chr>                <dbl>
##  1 MIT                     -3
##  2 Bowdoin                  3
##  3 Cornell                  2
##  4 Yale                    -2
##  5 Navy                     2
##  6 USF                     -2
##  7 Coast Guard             -2
##  8 Boston College           1
##  9 Harvard                 -1
## 10 Tulane                   1
## 11 UPenn                    1
## 12 Tufts                    1
## 13 NC State                -1
## 14 Stanford                 0
## 15 Dartmouth                0
## 16 Brown                    0
## 17 Charleston               0
## 18 Georgetown               0

Finally, let’s take a look at our regatta prediction algorithm. Notable differences: MIT and Bowdoin were each 3 places off, underperforming and outperforming the prediction respectively; no other team finished more than 2 places from its predicted spot.

Graphs Are Nice But Numbers Are Important Too


Below are the correlation coefficients for each of the three rankings we looked at above. Higher numbers are better!

Sailing World Rankings

cor(WACCS_Data$SWRank, WACCS_Data$Actual, method = "spearman")
## [1] 0.7254902

SailRank Power Rankings

cor(WACCS_Data$SRPRank, WACCS_Data$Actual, method = "spearman")
## [1] 0.6800826

SailRank Regatta Predictions

cor(WACCS_Data$SRPredict, WACCS_Data$Actual, method = "spearman")
## [1] 0.9545924

Summary

Both the Sailing World Rankings and the SailRank Power Rankings exhibit a moderate to strong positive correlation with the real results of Women’s ACCs. On the other hand, our regatta prediction algorithm blew the other two out of the water with a nearly perfect correlation coefficient. Impressive!


Strength in Numbers

What would happen if we combined the Sailing World Rankings with our SailRank Power Rankings by averaging the numbers?

WACCS_Data = WACCS_Data %>% mutate(avgPred = as.double(SRPRank + SWRank) / 2.0)
WACCS_Data %>% select(School, avgPred, Actual)
## # A tibble: 18 × 3
##    School         avgPred Actual
##    <chr>            <dbl>  <dbl>
##  1 Stanford           1        1
##  2 Boston College     5.5      2
##  3 Cornell            4        3
##  4 Yale               2.5      4
##  5 Harvard            6        5
##  6 Dartmouth          5.5      6
##  7 Brown             12        7
##  8 Charleston         7        8
##  9 Navy              14        9
## 10 Georgetown        14       10
## 11 Tulane            15.5     11
## 12 MIT                5.5     12
## 13 Bowdoin            9       13
## 14 UPenn             15.5     14
## 15 USF               14       15
## 16 Coast Guard       10.5     16
## 17 Tufts             17.5     17
## 18 NC State          12       18
cor(WACCS_Data$avgPred, WACCS_Data$Actual, method = "spearman")
## [1] 0.7292629

We see that this approach actually yields a higher correlation coefficient (0.729) than either the Sailing World Rankings (0.725) or the SailRank Power Rankings (0.680) achieved individually, but it still falls well short of the SailRank Regatta Prediction Algorithm (0.955).


Are Rankings/Predictions Significant?

Let’s go ahead and start by saying that we want our significance level to be \(\alpha = 0.05\) for \(95\)% confidence.

Our hypotheses will be as follows:

Null Hypothesis (\(H_0\)): A given ranking system is not significantly related to the actual results of the ACCs.

Alternative Hypothesis (\(H_1\)): A given ranking system is significantly related to the actual results of the ACCs.

Sailing World Rankings

SW.Spearman = cor.test(WACCS_Data$SWRank, WACCS_Data$Actual, method="spearman", alternative = "g")
SW.Spearman
## 
##  Spearman's rank correlation rho
## 
## data:  WACCS_Data$SWRank and WACCS_Data$Actual
## S = 266, p-value = 0.0004755
## alternative hypothesis: true rho is greater than 0
## sample estimates:
##       rho 
## 0.7254902

We get a \(P\)-Value of \(0.0004755\). As our \(P\)-Value is less than \(\alpha\) we can reject \(H_0\) and say that, at a significance of \(0.05\), the Sailing World Rankings are significantly related to the actual results of the ACCs.

SailRank Power Rankings

SRPR.Spearman = cor.test(WACCS_Data$SRPRank, WACCS_Data$Actual, method="spearman", alternative = "g")
SRPR.Spearman
## 
##  Spearman's rank correlation rho
## 
## data:  WACCS_Data$SRPRank and WACCS_Data$Actual
## S = 310, p-value = 0.001257
## alternative hypothesis: true rho is greater than 0
## sample estimates:
##       rho 
## 0.6800826

We get a \(P\)-Value of \(0.001257\). As our \(P\)-Value is less than \(\alpha\) we can reject \(H_0\) and say that, at a significance of \(0.05\), the SailRank Power Rankings are significantly related to the actual results of the ACCs.

SailRank Regatta Predictions

SRPred.Spearman = cor.test(WACCS_Data$SRPredict, WACCS_Data$Actual, method="spearman", alternative = "g")
SRPred.Spearman
## 
##  Spearman's rank correlation rho
## 
## data:  WACCS_Data$SRPredict and WACCS_Data$Actual
## S = 44, p-value = 2.336e-06
## alternative hypothesis: true rho is greater than 0
## sample estimates:
##       rho 
## 0.9545924

We get a \(P\)-Value of \(0.000002336\). As our \(P\)-Value is less than \(\alpha\) we can reject \(H_0\) and say that, at a significance of \(0.05\), the SailRank Regatta Predictions are significantly related to the actual results of the ACCs.

Summary

All 3 of the different rankings are significantly (positively) related to the actual results of the ACCs. Well done!

Now this of course does not mean that rankings are significant overall. This information is only applicable to the ACCs.


Score Predictions

Our regatta prediction algorithm works by predicting the scores of each race individually, so not only do we have predictions for the rank of each team in the event, but we also have an idea of where exactly each team should place score-wise. Or do we?

Looking At Predicted Scores

WACCS_Data %>% ggplot(aes(x=PredScoreMean, y=ActualScore, label=Abbrev)) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  geom_smooth(method = "lm", se=FALSE) +
  geom_point() +
  labs(title = "Actual Score vs. SailRank Regatta Predicted Score Mean", subtitle = "With Ideal Fit (Red) & Real Linear Fit (Blue)", x="SailRank Regatta Predicted Score Mean", y="ACCs Actual Score") +
  geom_text(hjust=1.0, vjust=1.4, size=2) +
  theme_minimal()

Hey that’s weird, our regatta prediction ranking graph looked a lot better than this. Why doesn’t this look nearly as good?

The actual rank predictions don’t use the mean prediction alone, and for good reason too. For those of you who have taken the time to look at the Plackett-Luce rating algorithm, the backbone of SailRank, you are aware that ratings are modeled by two important numbers: a mean and a standard deviation. No sailor is perfectly consistent, so we could never expect a single value to summarize skill; performances vary, and that is why the standard deviation matters in this context. To calculate our predicted ranking we add \(0.2\) standard deviations to our mean prediction and rank on that adjusted value (a minimal sketch of the idea follows below). That is also why, when we show predicted scores, it is often more helpful to show a range of scores representative of what a really good day or a really bad day could look like. An example of this can be found on our Power Rankings pages for Fleet Racing, where you can see maximums and minimums for both division scores and team scores.
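As a rough illustration, here is a minimal sketch (not the production SailRank code, which works race by race) of how a “mean plus \(0.2\) standard deviations” ranking could be built from the columns in this dataset. It assumes, as in the Z-score section further down, that \(PredScoreUpper\) sits \(0.8\) standard deviations above \(PredScoreMean\); the hypothetical SketchRank column will not exactly reproduce \(SRPredict\).

WACCS_Data %>%
  mutate(PredScoreSD  = (PredScoreUpper - PredScoreMean) / 0.8,  # assumes upper bound = mean + 0.8 SD
         RankingScore = PredScoreMean + 0.2 * PredScoreSD,       # rank on the mean shifted up by 0.2 SD
         SketchRank   = rank(RankingScore)) %>%
  select(School, SketchRank, SRPredict) %>%
  arrange(SketchRank)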

Score Ranges

Let’s take a look at those ranges in the context of the ACCs!

WACCS_Data %>% ggplot(aes(x = reorder(Abbrev, Actual), y = ActualScore)) +
  geom_bar(stat = "identity", fill = "lightblue", color = "black", width = 0.5) +
  geom_point(aes(y = PredScoreMean), shape = 3, size = 2, color = "red") +
  geom_errorbar(aes(ymin = PredScoreLower, ymax = PredScoreUpper), width = 0.2, position = position_dodge(0.5)) +
  labs(title = "Actual Regatta Scores For Teams at ACCs", y = "Regatta Score", x="Team", subtitle = "With Error Bars Indicating Predicted Range w/ Predicted Mean Value") +
  theme_minimal()

Hey, that’s pretty good. Out of the teams that competed at ACCs, only three finished outside of our predicted range: Yale, MIT, and Tufts. With some more work we could theoretically conclude either that those teams are over/underrated by our model, or that they simply had a really good/bad two days at ACCs. (Still going to have to save that for another day :( )
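If you want to verify that yourself, a quick filter on the prediction bounds pulls out the same three teams:

WACCS_Data %>%
  filter(ActualScore < PredScoreLower | ActualScore > PredScoreUpper) %>%   # outside the predicted range
  select(School, ActualScore, PredScoreLower, PredScoreUpper)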

Hypothesis Testing Predicted Means

This test will be slightly different from the ones performed on the rankings, though much of the terminology for describing the test will remain the same, so don’t fret.

Let’s go ahead and start by saying that we want our significance level to be \(\alpha = 0.05\) for \(95\%\) confidence.

cor.test(WACCS_Data$PredScoreMean, WACCS_Data$ActualScore, method="pearson", alternative = "greater")
## 
##  Pearson's product-moment correlation
## 
## data:  WACCS_Data$PredScoreMean and WACCS_Data$ActualScore
## t = 10.913, df = 16, p-value = 4.019e-09
## alternative hypothesis: true correlation is greater than 0
## 95 percent confidence interval:
##  0.8627793 1.0000000
## sample estimates:
##       cor 
## 0.9389159

Our \(PredScoreMean\) values show a strong positive relationship to \(ActualScore\)! In fact, a value of \(0.9389159\) is extremely strong in terms of prediction models.

Our hypotheses will be as follows:

Null Hypothesis (\(H_0\)): \(PredScoreMean\) does not exhibit a significant relationship to \(ActualScore\).

Alternative Hypothesis (\(H_1\)): \(PredScoreMean\) exhibits a significant relationship to \(ActualScore\).

From our output above we see that our \(T\)-Value for the test is \(10.913\) which translates to a \(P\)-value of \(0.000000004019\). As our \(P\)-value is less than \(\alpha = 0.05\) we can now reject \(H_0\) and say, at our given significance, \(PredScoreMean\) exhibits a significant relationship to \(ActualScore\).

Do you notice something different about how we calculated this correlation coefficient?

We have swapped from using a Spearman correlation to a Pearson correlation. This is because we are no longer working with ranking datasets and instead working with raw numbers that are roughly normally distributed. This means we can also move on to trying to fit a linear model to predict the actual score from the predicted score to assess the significance of the predicted score in predicting the actual score. You can read about this correlation coefficient here.
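One way to see the connection between the two coefficients: Spearman’s correlation is just Pearson’s correlation computed on the ranks of the data (R’s rank() uses average ranks for ties, matching what cor() does with method = "spearman"), so the following two calls return the same value.

cor(rank(WACCS_Data$PredScoreMean), rank(WACCS_Data$ActualScore))
cor(WACCS_Data$PredScoreMean, WACCS_Data$ActualScore, method = "spearman")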

Linear Model

To be honest, this linear model is not all that helpful in this context as we are already looking for a 1:1 relationship between our variables, but nevertheless we shall continue as it’s a good introduction to how we could use a linear model in a future Chalk Talk!

PredScoreMean.LM = lm(ActualScore~PredScoreMean, data = WACCS_Data)
summary(PredScoreMean.LM)
## 
## Call:
## lm(formula = ActualScore ~ PredScoreMean, data = WACCS_Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.958  -9.760  -2.227   8.187  43.661 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    7.95240   20.85219   0.381    0.708    
## PredScoreMean  0.96659    0.08857  10.913 8.04e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.05 on 16 degrees of freedom
## Multiple R-squared:  0.8816, Adjusted R-squared:  0.8742 
## F-statistic: 119.1 on 1 and 16 DF,  p-value: 8.039e-09

Using our linear model above, we can now conduct a hypothesis test on the predictive significance of our predicted means on the actual scores of the ACCs.

Our hypotheses will be as follows:

Null Hypothesis (\(H_0\)): \(PredScoreMean\) is not a significant predictor of \(ActualScore\) in our linear model.

Alternative Hypothesis (\(H_1\)): \(PredScoreMean\) is a significant predictor of \(ActualScore\) in our linear model.

Now from our model output we get a \(T\)-Value of \(10.913\), which then translates to a \(P\)-Value of \(0.00000000804\). As our \(P\)-Value is less than our selected \(\alpha = 0.05\), we can reject \(H_0\) and say that, at a significance of \(\alpha\), \(PredScoreMean\) is a significant predictor of \(ActualScore\) in our linear model.

Another thing we can look at is our \(R^2\) value for the model. It has a value of \(0.8816\), which we can interpret as: our model explains \(88.16\)% of the variance in \(ActualScore\).

That’s pretty reassuring.
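A quick cross-check, if you like tying numbers together: for a simple linear regression with a single predictor, \(R^2\) is just the square of the Pearson correlation we computed earlier.

cor(WACCS_Data$PredScoreMean, WACCS_Data$ActualScore)^2   # roughly 0.8816, matching the Multiple R-squared above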

Fit Check

There is one more important step with our linear model. Checking our residual plots!

Feel free to check out this page on how to read the following graphs.

plot(PredScoreMean.LM, which=1)

In this plot we are looking to see that our residuals show no clear patterns, and that they are fairly randomly distributed on either side of our line at 0. That’s another test passed.

plot(PredScoreMean.LM, which=2)

Our Q-Q plot also looks pretty fine. Those central values are especially promising! We can now check that off too!


Catch Some ZZZs

For those who have made it this far into our analysis: I’ll try not to put you entirely to sleep here, but we are gonna talk about \(Z\)-Scores.

First, as laziness led me to not input the actual SD values into the spreadsheet for this analysis (though entering the Lower and Upper values did in fact take more time…), we will have to trace back our calculations to get them.

WACCS_Data = WACCS_Data %>% 
  mutate(PredScoreSD = (PredScoreUpper - PredScoreMean) / 0.8,           # the upper bound sits 0.8 SD above the mean, so back out the SD
         ActualScoreZVal = (ActualScore - PredScoreMean) / PredScoreSD)  # Z = (actual score - predicted mean) / predicted SD
WACCS_Data %>% select(School, ActualScoreZVal) %>% arrange(-abs(ActualScoreZVal))
## # A tibble: 18 × 2
##    School         ActualScoreZVal
##    <chr>                    <dbl>
##  1 Yale                    1.53  
##  2 MIT                     1.51  
##  3 Tufts                  -1.50  
##  4 Stanford                0.776 
##  5 Boston College         -0.688 
##  6 NC State                0.612 
##  7 Georgetown             -0.526 
##  8 Charleston              0.452 
##  9 Cornell                -0.432 
## 10 Harvard                -0.394 
## 11 Tulane                 -0.278 
## 12 Coast Guard             0.143 
## 13 Brown                  -0.136 
## 14 USF                     0.117 
## 15 Navy                   -0.0759
## 16 Dartmouth              -0.0481
## 17 UPenn                   0.0475
## 18 Bowdoin                -0.0468

So now we have our \(Z\)-Scores! The \(Z\)-Score represents how far the actual regatta score was from our predicted regatta score, measured in standard deviations of the prediction. It can be used in many follow-up analyses of our errors and could even potentially be used to estimate how much of an advantage a given team has at its own home venue (once again, a project for another time).
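To make that concrete with the rounded values from the table at the top of this post: Yale’s predicted mean was about \(137\) with an upper bound of about \(154\), so the reconstructed standard deviation is roughly \((154 - 137) / 0.8 \approx 21\), and Yale’s actual score of \(170\) sits about \((170 - 137) / 21 \approx 1.5\) standard deviations above the prediction, in line with the \(1.53\) computed from the unrounded values above.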

WACCS_Data %>% ggplot(aes(x=ActualScoreZVal)) +
  geom_boxplot() +
  labs(title = "Boxplot of Z Scores of ACCs Results Compared To SailRank Regatta Predictions", x ="Z Score") +
  theme_minimal() +
  theme(axis.text.y = element_blank(),
        axis.title.y = element_blank())

WACCS_Data %>% ggplot(aes(x=ActualScoreZVal)) +
  geom_density() +
  theme_minimal() +
  labs(title = "Density of Z Scores", x="Z-Score", y="Density")

summary(WACCS_Data$ActualScoreZVal)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -1.50299 -0.36502 -0.04746  0.05858  0.37438  1.52958

It looks like we could actually tighten our prediction algorithm’s confidence range, as most of our \(Z\)-Scores fall well within the \(\pm 0.8\) SD range that we use there. However, the prediction algorithm was tuned to perform as well as possible across ALL of the over 1400 regattas we have evaluated, so not so fast. In other words, by pure chance, it looks like Women’s ACCs is making our model look really, really good!
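For reference, here is the share of ACCs teams whose actual score landed inside that \(\pm 0.8\) SD band; the teams outside it are the same three flagged in the score-range plot earlier.

mean(abs(WACCS_Data$ActualScoreZVal) <= 0.8)   # 15 of the 18 teams, roughly 0.83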


Conclusion

Congratulations to those who have made it this far!

In this case study we have determined that both the Sailing World Rankings and our own Power Rankings do a fairly good job of producing rankings that track the Women’s ACCs results with good accuracy. We have also determined that our \(SRPredict\) results are a far better fit to the actual regatta and are quite capable of predicting the actual score of a team at ACCs.

We also know that there is definitely room for improvement in our Power Rankings lists, as those should be capable of performing almost as well as \(SRPredict\) did. The issue there is most likely how skipper/crew pairings are created. We have been aware of this issue but are currently limited in our ability to address it. This study justifies moving some related tasks onto our more immediate backlog of work to get done!

I hope you have both learned something from this and have gotten a better understanding of how our models for SailRank work and what they are capable of.

Until Next Time,

Nick
