
Question for stat geeks



Uriel
12-02-2016, 07:36 AM
How do you calculate the win probability of two teams in a head-to-head matchup if all you have are the final scores from each of their games and their respective net ratings?

Chinook
12-02-2016, 08:10 AM
Dunno. Whatever number you get will just be bullshit, though.

SPURt
12-02-2016, 08:16 AM
I think there are probably a lot of ways to do this, and they likely all depend on building some kind of scoring system, which is subjective. I would probably do something like the following, though the inherent problem of improbable performances deters me from going too far down this rabbit hole:

1. Take season averages across multiple statistical categories for individual players and assign a value or score to both teams, giving a baseline for what to expect on a game-by-game basis.
2. Take the averages from the last 5 to 10 games to see if players are trending a certain way, and compare them to the season averages so a predictive score can be made for that day regardless of opponent.
3. Then go back and look at the last few matchups with the next opponent and try to see the impact that team has on the categories you value, on a player-by-player basis.
4. Look at the injury report and try to adjust your scoring based on previous games the injured player missed and how his absence changed the performance of his teammates.

You can keep going after that and try to find as many games as possible where an injured player impacted the team. It would take a lot of testing and massaging to make the system even remotely reliable.

The problem with this is that which stats you value and how you score the overall team are subjective, and the NBA is composed of amazing talent. For instance, an algorithm based on last season's stats wouldn't have suggested that Kobe would break 30 in his final game. Another example is Linsanity, which breaks a predictive algorithm in both directions. I'm sure smarter people than me are dedicating their lives to this to gain an edge for gambling purposes, but it's an interesting thought exercise!
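Steps 1 and 2 above could be sketched roughly like this in Python. It is only an illustration: the per-game "composite score", the function name `blended_score`, and the `recent_weight` knob are all made up for the example, and how you build the composite from box-score categories is exactly the subjective part.

```python
from statistics import mean

def blended_score(game_scores, recent_n=5, recent_weight=0.3):
    """Blend a season average with the average of the last recent_n games.

    game_scores: per-game composite scores, oldest first (hypothetical metric).
    recent_weight: how much to trust recent form; an assumed tuning knob.
    """
    if not game_scores:
        raise ValueError("need at least one game")
    season_avg = mean(game_scores)               # step 1: season baseline
    recent_avg = mean(game_scores[-recent_n:])   # step 2: recent trend
    return (1 - recent_weight) * season_avg + recent_weight * recent_avg

# Made-up composite scores for one player, oldest game first.
scores = [10, 12, 11, 9, 13, 15, 16, 14, 17, 18]
baseline = blended_score(scores, recent_n=5, recent_weight=0.0)
trending = blended_score(scores, recent_n=5, recent_weight=0.3)
print(baseline, trending)
```

A player who is trending up ends with a blended score above his season baseline. Steps 3 and 4 (opponent and injury adjustments) would be further corrections on top of this number.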

SPURt
12-02-2016, 08:17 AM
Originally Posted by Chinook: Dunno. Whatever number you get will just be bullshit, though.
Ha! This is the one-sentence version of my response!

Uriel
12-02-2016, 08:32 AM
I don't have a lot of data, though. My sample size is very small (only 14 games) and I only have the box scores from each of the games. Is there a simpler way to do this?

MaNu4Tres
12-02-2016, 08:36 AM
Leave it to Vegas.

SPURt
12-02-2016, 08:53 AM
This is where it gets really tricky, but you have to answer one question first: how much data is enough to establish a baseline?

If you need more than 14 games you have a few options:
1. Use career NBA averages for players that have been in the NBA.
2. Use player stats from the previous season, weighted on the assumption that the older data is less reliable than the 14 games you do have.
3. For rookies, take their non-NBA stats from the previous season and adjust them for the difference in competition level, in the same way you do with NBA vets, which is unbelievably subjective.

Once you know your threshold for "enough data", you slowly wean yourself off the previous season/career data. If you think 40 games is enough, optimising your algorithm takes a lot of tweaking of your weights to get the strongest prediction.
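One way to picture the "weaning off" is a weight that ramps up with games played. This is a minimal sketch, assuming a simple linear ramp and the 40-game threshold mentioned above. The function names are hypothetical, and a linear ramp is just one of many reasonable choices.

```python
def current_season_weight(games_played, enough_games=40):
    """Linear ramp from 0 to 1 as games_played approaches enough_games.

    enough_games is your own "enough data" threshold (40 is only an example).
    """
    return min(max(games_played, 0) / enough_games, 1.0)

def blended_estimate(current_avg, prior_avg, games_played, enough_games=40):
    """Mix this season's average with prior-season or career data."""
    w = current_season_weight(games_played, enough_games)
    return w * current_avg + (1 - w) * prior_avg

# With 14 games played, the current season gets 14/40 = 35% of the weight.
estimate = blended_estimate(current_avg=6.0, prior_avg=2.0, games_played=14)
print(estimate)
```

At 40 games or more the prior data drops out entirely, which matches the idea of no longer needing it once you have "enough".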

You can test your algorithm by going back five years and running it on games whose outcomes you already know, tweaking it until it predicts them accurately. Your system does not need new games to be perfected, or to get as close to perfected as possible.
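A backtest like that could look something like the sketch below. Note that the predictor here is swapped in purely for illustration and is not the player-based system described earlier in the thread: it treats the gap in net ratings as the expected final margin and assumes that margin is normally distributed with a standard deviation of 12 points. Both of those are assumptions, as are the function names and the made-up games.

```python
import math

def win_probability(net_rating_a, net_rating_b, margin_sd=12.0):
    """P(A beats B), treating the final margin as normal around the rating gap.

    margin_sd is an assumed spread of final margins, something to tune.
    """
    z = (net_rating_a - net_rating_b) / margin_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def backtest(games, margin_sd=12.0):
    """games: list of (net_rating_a, net_rating_b, a_won) with known outcomes.

    Returns (accuracy, brier_score); a lower Brier score is better.
    """
    hits = 0
    squared_error = 0.0
    for rating_a, rating_b, a_won in games:
        p = win_probability(rating_a, rating_b, margin_sd)
        hits += (p >= 0.5) == a_won          # did we pick the right winner?
        squared_error += (p - float(a_won)) ** 2
    n = len(games)
    return hits / n, squared_error / n

# Made-up past games: (team A net rating, team B net rating, did A win?)
past_games = [(5.0, -2.0, True), (1.0, 3.0, False),
              (0.5, 4.0, True), (-3.0, -6.0, True)]
accuracy, brier = backtest(past_games)
print(accuracy, brier)
```

Tuning then amounts to rerunning `backtest` with different settings (here only `margin_sd`) and keeping whatever scores best on the old games. With only 14 games of data, expect the resulting numbers to be noisy.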