MP vs IMPs

helene_t · December 12, 2014

The thread about bypassing spades in 1m-1♥-1NT has become a blend of two different discussions so I thought maybe it's better to start a separate thread about MPs vs IMPs.

Obviously (cross)IMPs work best (in terms of identifying the best pair) if we assume that big swings (such as making or not making a slam) reflect skill more than small swings (such as overtricks or part score sacs). This will obviously to some extent be the case, if only because a big swing can occur as an aggregate of several good decisions, for example first jamming the opps' auction effectively, then doubling them, and then defending well.

On the other hand, MPs work best if some boards allow skill to translate into bigger swings than other boards do.

In addition, I thought that MPs work better in large fields but I am not sure if that is really true.

So I made some simulations of a 27 board Mitchell (27*1 board, 9*3 or 3*9) in which I assumed that the raw score on a board was normally distributed with a mean value of

E(rawscore[board,nspair,ewpair]) = skillfactor[board] * (strength[ns]-strength[ew])

where the skillfactor was gamma distributed across boards with a shape parameter which I allowed to vary between simulations. Rate=Shape to keep the average skill factor constant between sims.

The variance of the raw score was gamma distributed across boards, independent of the skill factor.

Before calculating IMPs and MPs I rounded off to nearest multiple of 50 to allow for ties at matchpoints (rounding also applied for IMP scoring for a fairer comparison). I used butler scoring without outlier removal.
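A minimal sketch of one board of this model, using only the Python standard library. The post does not give the strength distribution or the parameters of the noise-variance gamma, so `var_shape`, `mean_variance` and the strengths in the usage lines are hypothetical placeholders, and seating NS pair i against EW pair i stands in for one round of the movement. The IMP conversion uses the standard IMP scale, applied to the unrounded difference from the datum.

```python
import random
from bisect import bisect_right

# Lower bounds of the standard IMP scale: 20-40 -> 1 IMP, 50-80 -> 2, ...
IMP_BOUNDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
              750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
              3500, 4000]


def imps(diff):
    """Convert a score difference to IMPs, keeping the sign."""
    magnitude = bisect_right(IMP_BOUNDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude


def simulate_board(ns_strength, ew_strength, shape, rng,
                   var_shape=2.0, mean_variance=300.0 ** 2):
    """Return (matchpoints, butler_imps) for the NS pairs on one board."""
    # Skill factor: gamma with rate = shape, i.e. scale = 1/shape, so mean 1.
    skill_factor = rng.gammavariate(shape, 1.0 / shape)
    # Noise variance for this board, independent of the skill factor.
    # (var_shape and mean_variance are assumed values, not from the post.)
    variance = rng.gammavariate(var_shape, mean_variance / var_shape)
    sd = variance ** 0.5

    raw = [rng.gauss(skill_factor * (ns - ew), sd)
           for ns, ew in zip(ns_strength, ew_strength)]
    scores = [50 * round(x / 50) for x in raw]  # nearest multiple of 50

    # Matchpoints: 2 for each other table beaten, 1 for each tie.
    mps = [sum(2 * (s > t) + (s == t)
               for j, t in enumerate(scores) if j != i)
           for i, s in enumerate(scores)]
    # Butler: IMPs against the datum (plain mean, no outlier removal).
    datum = sum(scores) / len(scores)
    butler = [imps(s - datum) for s in scores]
    return mps, butler


rng = random.Random(2014)
ns = [rng.gauss(0, 100) for _ in range(9)]  # hypothetical pair strengths
ew = [rng.gauss(0, 100) for _ in range(9)]
mps, butler = simulate_board(ns, ew, shape=1.0, rng=rng)
print(mps, butler)
```

Summing these per-board results over 27 boards and rank-correlating the totals with the true strengths would give one data point of the kind reported below.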

The average Spearman correlations between strength and IMP scoring was (as a function of shape parameter of the skill factor distribution and number of tables):

shape;tables  .1;3   .1;9   .1;27  1;3    1;9    1;27   10;3   10;9   10;27
correlation   0.756  0.833  0.864  0.877  0.929  0.952  0.920  0.960  0.979

For MPs:

shape;tables  .1;3   .1;9   .1;27  1;3    1;9    1;27   10;3   10;9   10;27
correlation   0.744  0.844  0.880  0.869  0.931  0.957  0.920  0.962  0.980

So it looks like for large values of the shape parameter (i.e. the skill factor is roughly the same for all boards), it doesn't matter which scoring you use, and this holds regardless of field size. But for more heterogeneous sets of boards (low values of the skill factor shape parameter), MPs are better for large fields and Butler is better for small fields, with a break-even somewhere between 3 and 9 tables.

Based on 9000 sims, using both the ew and the ns ranking so 18000 data points per parameter combination.

Maybe I should have a go with correlated noise and skill factor, which is probably realistic. This would favour matchpoints, I would think.

Of course this is all based on a huge number of simplifications and assumptions. It would be cool if someone could do a similar analysis of real data.

Vampyr · December 12, 2014

Obviously (cross)IMPs work best (in terms of identifying the best pair) if we assume that big swings (such as making or not making a slam) reflect skill more than small swings

But losing big swings is often unrelated to skill, as was pointed out in the other thread.

whereagles · December 12, 2014

@Helene: I haven't had the time yet to digest the post, but at first sight I'd say the correlation differences can be attributed to statistical fluctuations.

Or have I gotten it wrong?

helene_t · December 12, 2014

No, no, the standard errors are mostly less than 0.001

rhm · December 12, 2014

I do not share the premise.

Different scoring methods lead to some extent to simply different games, which give different incentives and require different skills.

That's why there are some players who do better at one form of the game than at the other.

There is little point in arguing which is "better" bridge or a fairer game or measures bridge skills better.

Rainer Herrmann

steve2005 · December 12, 2014

No, no, the standard errors are mostly less than 0.001

That's a very small error for what is essentially one big guess. lol

barmar · December 12, 2014

I do not share the premise.
Different scoring methods lead to some extent to simply different games, which give different incentives and require different skills.
That's why there are some players who do better at one form of the game than at the other.
There is little point in arguing which is "better" bridge or a fairer game or measures bridge skills better.

Although the "best of the best" seem to be good at all forms of the game. E.g. teams containing Meckwell tend to do well both in KOs and BAM -- they're almost always strong contenders for Spingold, Vanderbilt, Bermuda Bowl, and Reisinger.

Siegmund · December 13, 2014

You reached some different conclusions than I did, when I investigated some similar questions a while back, but we made very different assumptions, too. Some scattered thoughts:

I thought that MPs work better in large fields but I am not sure if that is really true.

This is not at all what I would expect. Whatever scoring method you use on a particular board, your result is determined 1/4 by your partnership, 1/4 by your table opponents, and 1/2 by the people against whom you are compared. Comparing against a large field diminishes the noise added by the second half, the "luck of who you are compared against". Whether you play matchpoints on a T top or are cross-imps against T other tables, the variance of your scores is proportional to 1+1/T. This can be confirmed by live results, too -- I was "blessed" with a club with a lot of 2 1/2 table games in the winter, a while back, so had some data to compare T=1,2,3,4 from real life, plus T=12 from regionals.
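The 1+1/T claim can be checked numerically for the simplest linear case, where a pair's result is its own table score minus the average of T other tables. This is a sketch under the assumption that all table scores are independent with unit variance; it does not model matchpoint ranking, which compares by sign rather than by size and would need its own check.

```python
import random
import statistics


def comparison_variance(tables, trials=20000, seed=1):
    """Estimate Var(own score - mean of `tables` other scores)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        own = rng.gauss(0, 1)
        others = [rng.gauss(0, 1) for _ in range(tables)]
        diffs.append(own - sum(others) / tables)
    return statistics.pvariance(diffs)


# Simulated variance next to the predicted 1 + 1/T.
for t in (1, 2, 3, 4, 12):
    print(t, round(comparison_variance(t), 3), round(1 + 1 / t, 3))
```

With one comparison the variance is doubled relative to a very large field, which is the point made below about head-to-head matches.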

It is one reason I am surprised by the enduring popularity of head-to-head team matches, which are cursed with all the same extra randomness caused by only a single comparison that 2 1/2 table pairs games are. Non-statisticians seem to equate knowing the name of the source of the randomness and being able to yell at him after the session, with the result not being random.

* * *

if only because a big swing can occur as an aggregate of several good decisions, for example first jamming the opps' auction effectively, then doubling them, and then defending well

This also reflects a different and imo rather unusual approach to the origin of swings. I've always taken the perspective that if nobody makes any mistakes, the expected score is close to average, and that swings occur only as a result of someone making a mistake -- whether that mistake is guessing the wrong final contract to play because the bidding has been jammed, or failing to double, or failing to defend right, or failing to declare right. Or, to put it another way, there is IMO no such thing as "creating" a swing by playing well -- only taking advantage of the opportunities for positive swings which your opponents create, and minimizing the number of opportunities for adverse swings that you create.

cherdano · December 13, 2014

So I made some simulations of a 27 board Mitchell (27*1 board, 9*3 or 3*9) in which I assumed that the raw score on a board was normally distributed with a mean value of
E(rawscore[board,nspair,ewpair]) = skillfactor[board] * (strength[ns]-strength[ew])
where the skillfactor was gamma distributed across boards with a shape parameter which I allowed to vary between simulations. Rate=Shape to keep the average skill factor constant between sims.

The variance of the raw score was gamma distributed across boards, independent of the skill factor.

(Italics are my emphasis.)

I don't understand this assumption. If there is a large swing available by superior bidding or play, then I could probably stumble into that larger swing by pure luck?

I think this assumption negates the main reason that matchpoint scoring is more accurate. If I see a large swing in your simulation, then it is very likely based on skill (since the amount of points available by luck is constant across all boards).

I do not think the same can be said in the game of "contract bridge".

helene_t · December 13, 2014

If there is a large swing available by superior bidding or play, then I could probably stumble into that larger swing by pure luck?

I think this assumption negates the main reason that matchpoint scoring is more accurate. If I see a large swing in your simulation, then it is very likely based on skill (since the amount of points available by luck is constant across all boards).
I do not think the same can be said in the game of "contract bridge".

Yes I think you are right. As I said, I should maybe have a go with correlated noise and skill factor, i.e. the swing boards have a larger luck component as well as a larger skill component.

Alternatively, if you think swing boards just have a larger luck factor, then we should keep the model as it is but the parameters for the noise distribution could be changed.

Vampyr · December 13, 2014

It is one reason I am surprised by the enduring popularity of head-to-head team matches, which are cursed with all the same extra randomness caused by only a single comparison that 2 1/2 table pairs games are.

This is very very different, because you are comparing with your teammates' table.

nige1 · December 13, 2014

But losing big swings is often unrelated to skill, as was pointed out in the other thread.

I agree. Assuming that big-swings are more related to skill than small-swings seems to beg the question.

jogs · December 14, 2014

That's a very small error for what is essentially one big guess. lol

The first difference between double dummy 1NT and observed results from actual play is huge.

Double dummy 1NT declarer averages less than 6.2 tricks.

Observed results declarer averages more than 6.8 tricks.

That is a first difference of over 0.6 tricks.

cherdano · December 14, 2014

Yes I think you are right. As I said, I should maybe have a go with correlated noise and skill factor, i.e. the swing boards have a larger luck component as well as a larger skill component.

Alternatively, if you think swing boards just have a larger luck factor, then we should keep the model as it is but the parameters for the noise distribution could be changed.

I would do something different: take the results table from a big MP tourney. For each table, determine the percentile obtained at this table by a combination of luck and skill difference between the two pairs. I am not sure this is theoretically sound, but how else do you want to mimic a board where +650, +620, +200 and +170 are common results, and where both the difference between

- game bonus or not, and

- 11 tricks or 10 tricks

may potentially be attributed to either mostly skill, or mostly luck.

(I.e., my point is that even though the standard deviation on this board may be much much larger, a 30 point difference may well point to a skill difference here. That's the point of matchpoints.)

barmar · December 14, 2014

Or, to put it another way, there is IMO no such thing as "creating" a swing by playing well -- only taking advantage of the opportunities for positive swings which your opponents create, and minimizing the number of opportunities for adverse swings that you create.

While you may not be able to create swings by playing "well", you can make things harder for the opponents, which gives them more opportunities to go wrong, which then creates swings. This is essentially why teams that are far behind in a match will bid more aggressively, as well as psyching heavily. They're challenging the opponents (who would otherwise play conservatively) to figure out what's going on.

mgoetze · December 14, 2014

Whatever scoring method you use on a particular board, your result is determined 1/4 by your partnership, 1/4 by your table opponents, and 1/2 by the people against whom you are compared.

How did you make up these numbers and what are they supposed to mean? If I play a board where it is obvious for my side to pass throughout, and the opps have an obvious claim for exactly 12 tricks at trick one, then my result is obviously determined "0%" by my partnership. Furthermore, at a given average skill level, I would expect the distribution of comparison scores to stabilize as the field gets larger, eventually making my table opponents on that particular hand the only relevant factor for determining my score.

Siegmund · December 14, 2014

How did you make up these numbers and what are they supposed to mean? If I play a board where it is obvious for my side to pass throughout, and the opps have an obvious claim for exactly 12 tricks at trick one, then my result is obviously determined "0%" by my partnership.

I thought it was self-evident, but I will use your example to try to make it clearer.

If you play a board where the opponents have an obvious 6H+6, and neither side does anything stupid, you expect to get an average board.

1) You or your partner have the power to guarantee yourself a bad board - by bidding 7C, by underleading an ace at trick 1, or whatever else.

2) Your opponents have the power to give you a good board - by failing to bid slam, or by bidding 7H, or fumbling the claim, or whatever else.

3) Each other pair in the room holding your cards, has the power to give himself a bad board, and give you one extra matchpoint, by misdefending. Collectively, all-the-other-people-holding-your-cards make 1/4 of the decisions that affect what your score on the board will be.

4) Each other pair holding the slam cards has the power to give himself a bad board, and cost you one extra matchpoint, by misdeclaring. Collectively, all-the-other-people-holding-your-opponents-cards make 1/4 of the decisions that affect what your score on the board will be.

It so happens, on your example board, that you had an easy decision, and it wasn't particularly hard for you to do your part.

In my view, on EVERY board, there is SOME result that would happen if everybody at the table did everything right. When the angels play against each other in Heaven, every board has been a 50% board since Satan was cast out :) In the real world, you can cost yourself your entitlement to an average board by a misjudgment, and so can your opponents, and so can the other people in the room. On the complicated boards where the bidding can go ten different ways, you very often make a mistake that your opponents could capitalize on (if they knew how.) Your table opponents make mistakes that give you chances back. The final outcome is a complicated mess -- determined by which side succeeded in throwing away more.

Over the course of the evening, we expect you to face approximately the same number of "interesting" decisions as your table opponents do, as the NS at other tables do, and as the EW at other tables do.

The way I choose to approach this statistics problem, we have the same question on every board -- which of the 4 groups of people makes mistakes, with what frequency and severity? The degree of difficulty faced by each side can be different on each board, as you observe. If you played an entire session of bridge able to see through the backs of your opponents' cards, but they couldn't see through yours, you would expect to get a score near 75%. You couldn't guarantee yourself more than that, because you can't force your opponents to make a mistake on every board, nor can you prevent the people at other tables from doing weird stuff so that your good boards won't always be tops.

Adapting Helene's model to my philosophy, for each board we would need to draw two difficulty scores, D_NS and D_EW, from some distribution. And for each pair, let's have some quality measurement Q_i that says how often, on a relative scale, that pair makes errors. To get a "table result" on a board, we do something like let M_i, the number of mistakes made by pair i on this board, be Poisson(D_NS * Q_i) or Poisson(D_EW * Q_i), and take M_1-M_2 as the "result" from one table, M_3-M_4 the "result" from the next, and matchpoint them. Or a more complicated model that allows errors of different sizes. The results won't depend much, qualitatively, on exactly how complicated of a model you use.
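The mistake-count model just described can be sketched as follows. Several details are assumptions not fixed by the post: the exponential distribution for the difficulty scores, the sign convention (a table result is counted from the NS side as EW mistakes minus NS mistakes), and all function names. The Poisson sampler is Knuth's multiplication method, written out because the standard library has none.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def table_result(d_ns, d_ew, q_ns, q_ew, rng):
    """EW mistakes minus NS mistakes, i.e. the result seen from NS."""
    m_ns = poisson(d_ns * q_ns, rng)
    m_ew = poisson(d_ew * q_ew, rng)
    return m_ew - m_ns


def matchpoint(results):
    """2 matchpoints per table beaten, 1 per tie."""
    return [sum(2 * (r > s) + (r == s)
                for j, s in enumerate(results) if j != i)
            for i, r in enumerate(results)]


def play_board(q_ns, q_ew, rng, mean_difficulty=1.0):
    """Matchpoint one board; table i seats NS pair i against EW pair i."""
    # One difficulty score per direction, shared by every table.
    # The exponential distribution here is a placeholder assumption.
    d_ns = rng.expovariate(1.0 / mean_difficulty)
    d_ew = rng.expovariate(1.0 / mean_difficulty)
    results = [table_result(d_ns, d_ew, a, b, rng)
               for a, b in zip(q_ns, q_ew)]
    return matchpoint(results)


rng = random.Random(7)
q_ns = [0.5, 1.0, 1.0, 1.5]  # hypothetical error rates, lower is better
q_ew = [1.0, 1.0, 1.0, 1.0]
print(play_board(q_ns, q_ew, rng))
```

Allowing errors of different sizes would only change `table_result`; the matchpointing step stays the same.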

Furthermore, at a given average skill level, I would expect the distribution of comparison scores to stabilize as the field gets larger, eventually making my table opponents on that particular hand the only relevant factor for determining my score.

No argument at all that the distribution of scores from the rest of the field will stabilize as the number of comparisons on the board increases. (It stabilizes to a different distribution according to how strong the rest of the field is.)

mgoetze · December 14, 2014

I thought it was self-evident, but I will use your example to try to make it clearer.

I still don't even know what the numbers mean. If my partnership plays horribly, the opponents do nothing special, and the field does nothing special, we are not getting 25%*0%+25%*50%+50%*50% = 37.5% on the board. We are getting 0%.

If you play a board where the opponents have an obvious 6H+6, and neither side does anything stupid, you expect to get an average board.

I didn't say it was obvious, I'm assuming it's difficult to bid and/or easy to overbid.

1) You or your partner have the power to guarantee yourself a bad board - by bidding 7C, by underleading an ace at trick 1, or whatever else.

Perhaps. Perhaps not, there are enough slam boards where any sort of sacrifice is obviously absurd, and I could underlead my ace into their KQJ opposite xxx.

Over the course of the evening, we expect you to face approximately the same number of "interesting" decisions as your table opponents do, as the NS at other tables do, and as the EW at other tables do.

Aha, so this is your assumption. I find it pretty much never holds true in any sort of 1 day event. More crucially, if you want to discuss the difference between MPs and IMPs, it is vital that at IMPs the "interesting" decisions do not all have the same weight in determining your score.

No argument at all that the distribution of scores from the rest of the field will stabilize as the number of comparisons on the board increases. (It stabilizes to a different distribution according to how strong the rest of the field is.)

And yet you claim that the result is "determined" by a constant "1/2" by the field. Again, what does this mean?

jogs · December 15, 2014

Random luck plays a huge role in results. Skill is linearly proportional to boards played. Luck is proportional to the square root of boards played.

You are on a good AX team playing world champions. You should be able to win 35-40% of 7 board matches. The longer the length of the match in terms of total boards the less likely your team can upset the champions.
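A rough illustration of the scaling argument, assuming the match margin is normally distributed with mean N * edge (skill) and standard deviation sd * sqrt(N) (luck), and ignoring ties. The 37.5% figure is just the midpoint of the 35-40% quoted above, used to calibrate the per-board edge; the function name and the board counts are hypothetical.

```python
from statistics import NormalDist


def upset_probability(boards, edge_per_board, sd_per_board):
    """P(underdog wins) when the favourite's margin is Normal(N*edge, sd*sqrt(N))."""
    mean = boards * edge_per_board
    sd = sd_per_board * boards ** 0.5
    return NormalDist(mean, sd).cdf(0.0)


# Calibrate edge/sd so that the underdog wins 37.5% of 7-board matches.
ratio = -NormalDist().inv_cdf(0.375) / 7 ** 0.5

for n in (7, 14, 28, 64, 128):
    print(n, round(upset_probability(n, ratio, 1.0), 3))
```

Only the ratio of edge to per-board standard deviation matters here, which is why the sketch fixes the standard deviation at 1.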

nige1 · December 15, 2014

I do not share the premise. Different scoring methods lead to some extent to simply different games, which give different incentives and require different skills. That's why there are some players who do better at one form of the game than at the other. There is little point in arguing which is "better" bridge or a fairer game or measures bridge skills better.

MPs and imps require slightly different skills but the skills seem to correlate and overlap. You would expect players who perform well at one form of the game to perform well at the other.

suokko · December 17, 2014

MPs and IMPs emphasize very different skill sets. Of course those same skills are required in both forms. If a player has some areas that are clearly weaker, it can make results quite different in MPs and IMPs.

Cross IMPs is also fun to play, but a good bidding decision by opponents can easily dominate the score on a specific board. In MPs a good card player can often produce a relatively large improvement to the score with a good defence.

In my experience it is clear that in MPs it is a lot easier to maintain a stable good score (55+%) than in cross IMPs. Teams is a completely different form of IMP scoring, where good teams have a decisive advantage even in short matches.

Cthulhu D · December 17, 2014

MPs and imps require different skills but the skills seem to correlate. You would expect players who perform well at one form of the game to perform well at the other.

I'm not sure correlate is the correct term. The skills are not different between the two forms of the game. Fundamentally play skills boil down to 'what is the best line for X tricks with these cards on the bidding and play to date' and bidding is 'bidding to a contract that is Y% likely to make/not go off more than Z amount (for a balancing auction over 2M)'

The only skill difference between the two is correctly setting your X and Y for the form of the game and that is a comparably small thing.

The only reason I don't like match-points is because it encourages you to do things that you consider suboptimal (for example, Jeff Goldsmith on his website has a piece where he decides to open 1NT even when he'd prefer to open 1m because he knows that the field isn't doing it and the form of scoring stakes the entire board on your judgement) because the 'field protection' is worth it.

This seems silly, it's like having an electoral system that doesn't result in the Condorcet candidate.

fromageGB · December 17, 2014

The only reason I don't like match-points is because it encourages you to do things that you consider suboptimal (for example, Jeff Goldsmith on his website has a piece where he decides to open 1NT even when he'd prefer to open 1m because he knows that the field isn't doing it and the form of scoring stakes the entire board on your judgement) because the 'field protection' is worth it.

This seems silly, it's like having an electoral system that doesn't result in the Condorcet candidate.

This is no different to the idea playing IMPs that if you are winning the match you don't bid something that might be better but has the capability of a large negative swing, you just accept a possible smaller negative for the sake of the overall result. In MPs if you decide to follow what you perceive as the field, you are accepting what you expect to be a probably worse score for the sake of not getting a significantly worse one that will jeopardise your position. Same thing.

Not silly, just tactical. Like in a UK electoral method, voting for the loony liberals if that might keep out the even loonier left, in your constituency. Or apply your own adjectives and parties!

The UK electoral method is like playing IMPS. What you do on 80% of the boards/constituencies has little bearing on the result, it is only the 20% swing constituencies/game-slam hands where your vote/bidding-play has any impact. This is why I prefer MP scoring, where every vote counts.

Cthulhu D · December 17, 2014

Right, the UK electoral system doesn't always (or even often) elect the Condorcet candidate - that was my point. The system is fundamentally flawed!

I think the two bridge situations are different. Bidding a thin slam at a teams match when you are behind is using your judgement effectively to maximise your win %. The parallels from other games are obvious - you take a big risk because the consequences of a loss are small. Similarly this is why you make a safety play at IMPs but go for the maximum trick expectancy at MPs.

Compare to the NT judgement at matchpoints issue - here you are specifically refusing to use your judgement, because the scoring system does not reward it.

This is flawed - a desirable attribute of a game is to maximise the opportunities to use your skills. Here one has been removed.

rhm · December 17, 2014

Right, the UK electoral system doesn't always (or even often) elect the Condorcet candidate - that was my point. The system is fundamentally flawed!

I think the two bridge situations are different. Bidding a thin slam at a teams match when you are behind is using your judgement effectively to maximise your win %. The parallels from other games are obvious - you take a big risk because the consequences of a loss are small. Similarly this is why you make a safety play at IMPs but go for the maximum trick expectancy at MPs.

Compare to the NT judgement at matchpoints issue - here you are specifically refusing to use your judgement, because the scoring system does not reward it.

This is flawed - a desirable attribute of a game is to maximise the opportunities to use your skills. Here one has been removed.

It is your arguments which look flawed to me.

Contract Bridge is in itself a game where certain contracts are rewarded and others not, "because the scoring system does not reward it". A tautology.

Unforced by opponents 5♥,5♠,4♣,4♦,3♥,3♠,2NT are all examples of such unrewarding contracts.

Does this make the game flawed?

On the contrary.

These are just prejudices against some form of the game (in this case MP).

Rainer Herrmann
