whereagles Posted March 1, 2010
Hi all,
We're going to hold a regional imp pairs tournament and we're debating how to determine DATUM scores for each hand. Options are:
1. DATUM = mean of all scores
2. Calculate mean M and standard deviation S of scores for each hand. Then DATUM = mean of all scores that are not more than (1.07 x S) away from the mean.
In other words, in option 2 we do away with outliers, i.e. scores that are more than 1.07 standard deviations away from the mean. (Note: the score distribution usually does not follow a gaussian curve, so don't try to make sense out of the 1.07 factor - it's just a guess by the original author. In fact, the score distribution tends to be uni, bi or trimodal, depending on the hand.)
Simulations show that option 1 tends to benefit playing with the field, whereas option 2 gives imp scores for each hand that are closer to what we see in a regular teams game.
We are leaning towards option 2, but I'd like to hear some opinions.
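For concreteness, the two options can be sketched in Python as follows. This is only a rough illustration: the function names are made up, the example scores are invented, and the use of the population standard deviation (`pstdev`) is an assumption, since the post does not say which definition of S is meant.

```python
from statistics import mean, pstdev

def mean_datum(scores):
    """Option 1: the datum is the plain mean of all scores on the board."""
    return mean(scores)

def trimmed_datum(scores, k=1.07):
    """Option 2: the mean of the scores lying within k * S of the mean M.

    S is taken as the population standard deviation (an assumption).
    """
    m = mean(scores)
    s = pstdev(scores)
    kept = [x for x in scores if abs(x - m) <= k * s]
    return mean(kept)

# Invented board: four pairs make game, one goes down, one collects a penalty.
scores = [420, 420, 420, 420, -50, 1100]
print(mean_datum(scores))     # mean of all six scores
print(trimmed_datum(scores))  # mean after dropping -50 and 1100
```

On this invented board the plain mean is 455, while the trimmed version discards both the -50 and the 1100 and lands on 420.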
hrothgar Posted March 1, 2010
These days, I am spending an awful lot of time working on automatic outlier detection. It's a hard problem at the best of times and is intractable at many others. (From what I can tell, "the best of times" means that I can use a random forest.)
Here's a couple of simple pieces of advice:
1. If you can't generate some kind of model that describes how your board results should be distributed, then trying to develop any kind of defensible outlier detection scheme is a waste of time. You have directly stated that you can't tell whether scores should be modeled as a unimodal, bimodal, or trimodal distribution. Fair enough; however, if you can't describe a "normal" result, how can you hope to describe an abnormal result?
2. If the author that you are citing is just making a random guess that anything greater than 1.07 * sigma should be treated as an outlier, then I don't have much faith in his analysis. There must be something more to this...
I tend to have a conservative bias on these sorts of things. If you aren't in a damn good position to explain precisely what you want to accomplish and describe how your changes will achieve this end, then it's probably better to do nothing at all.
The following quote touches on some of these issues:
>Simulations show that option 1 tends to benefit playing with the
>field, whereas option 2 comes close to imp scores for each hand
>that are closer to what we see in a regular teams game.
However, you fail to explain why we should care about "Playing with the field" or "IMP scores in team games".
Siegmund Posted March 1, 2010
I find it A) mildly strange that you want to IMP against a datum rather than use cross-imps, and B) considerably stranger to use either of these two methods to find a datum.
The three most obvious ones that come to my mind are double-dummy par, the score which minimizes squared imp differences (what a mean does in total-points scoring - but this would have some very odd behaviours in a few cases), and the median (has a vaguely matchpoint-like feel to it, but it's easy to calculate and takes care of outliers).
I would want to hear a REALLY good reason for your rather drastically trimmed mean proposal before I'd consider it as anything other than a bizarre outlier of a method :)
whereagles Posted March 1, 2010
Sieg: method 2 isn't an invention of mine. It was shown to me by a friend who read it somewhere when fiddling with stuff on imp pairs.
That being said, it's your three methods that I find really strange!! I've never, ever seen anyone using any of those, nor could I ever convince people those are good methods. Bridge players aren't statisticians. They want simple ways to understand the DATUM, and "mean" or "mean, minus weird results" are simple enough. "Imps least-squares" just doesn't cut it :)
Still, the cross-imp proposal seems good and in line with what I'd want. I might try that later. We'd still need to convert it to victory points, though.
Siegmund Posted March 1, 2010
I'd be interested in hearing where, if you or he happens to recall it.
The basic question somebody needs to ask is "what is a datum supposed to represent?", from which the correct type of datum to use usually will follow directly. (To my mind, as soon as you decide your event isn't going to be scored by total points, methods based on mean total-point score are automatically off the list of candidates.)
I did think that - back before the internet introduced cross-imps to everybody - median was fairly widely accepted to be better than mean, but it's never been a popular format in my part of the world, so I can only judge by internet forum traffic. You have seen bluejak's old article on the subject?
Mbodell Posted March 1, 2010
Why not cross imps? That seems the most straightforward.
NickRW Posted March 2, 2010
If you use a 2 winner movement, I don't think it makes that much difference which outliers, if any, you remove. It is maybe a little more likely to make a difference to a one winner movement - not sure really - just a gut feel - don't have so much experience scoring one winner movements.
Cross imps is generally recognised as being fairer/more accurate, if a little harder for inexperienced people to understand where their score came from. However, again for a 2 winner movement, in practice it makes no difference for most sessions whether it is scored a la Butler or cross imps.
Nick
Trinidad Posted March 2, 2010
There is basically no excuse for scoring IMPs against a datum (Butler method) when you can use cross IMPs.
In the old days, when scores were calculated by hand, it was impractical to calculate the cross IMPs. That was a good excuse. Now we have computers everywhere and the excuse is gone.
Why are cross IMPs better than scoring against a datum? The main reason is that in datum scoring all the data that you have available are reduced to an average of some sort. All other data get lost in the process. And an average is not a very good description of a data set, particularly when you are going to do complicated things with it, such as scoring IMPs. To repeat the old example: If you fire a round in front of a hare and one behind it, on average the hare is dead. All hunters know how wrong averages are.
In cross IMPs all data are used. It takes more work for the computer (2 microseconds instead of 1 microsecond), but it is a much better method.
Some of the odd things in Butler (datum scoring) that are the result of using an average:
o The sum of all EW scores is not equal to the sum of all NS scores. (Barring penalties by the TD, in MP pairs this sum is always the number of pairs x 50%, in cross IMPs this sum is always 0.)
o In theory, if you would extrapolate your IMP pairs to 2 tables, you should get the same score as a team match.
o Since your own score is also counted in the datum, you are also playing against yourself (?!?).
In short: If you are living in the 21st century use cross-IMPs. Do not go anywhere near datum scoring.
Rik
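The zero-sum point can be checked with a small Python sketch. It assumes the standard IMP scale (encoded here by the lower bound of each band, as best I recall it), a Butler datum that is the plain unrounded mean of all scores, and invented NS scores; all function names are hypothetical.

```python
from bisect import bisect_right

# Lower bound in points of each IMP band: 20-40 is 1 IMP, 50-80 is 2, ..., 4000+ is 24.
IMP_BANDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000, 3500, 4000]

def imps(diff):
    """Signed IMPs for a signed point difference."""
    magnitude = bisect_right(IMP_BANDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude

def cross_imps(ns_scores):
    """Total cross-IMPs for each NS pair against every other table."""
    return [sum(imps(mine - other) for j, other in enumerate(ns_scores) if j != i)
            for i, mine in enumerate(ns_scores)]

def butler(ns_scores):
    """IMPs for each NS pair against a plain-mean datum (an assumption)."""
    datum = sum(ns_scores) / len(ns_scores)
    return [imps(score - datum) for score in ns_scores]

ns_scores = [420, 420, 450, -50]  # invented results on one board
print(cross_imps(ns_scores), sum(cross_imps(ns_scores)))
print(butler(ns_scores), sum(butler(ns_scores)))
```

With these four invented results the NS cross-IMP totals are 9, 9, 13 and -31, which sum to 0, while the Butler scores against the 310 datum are 3, 3, 4 and -8, which sum to -2 + 4 = 2 for NS (and so -2 for EW).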
hotShot Posted March 2, 2010
Basically at Butler (using a mean) you are averaging the score, while at cross-imps you are averaging the IMPs. The results differ, because the IMP scale is not linear.
It would be nice to have a par score, but many boards don't have one. (I remember a board where I could make 6♠ while our opponents could make 6♣.)
At Butler scoring it's easier to handle outliers: you just don't use them when calculating the mean. But this does not help much if there is no par score.
Using cross-imps you want to have as many scores as possible, so that outliers from both ends even out.
gwnn Posted March 2, 2010
IMP least square sounds like a really cool method!!! Of course coolness is not quite the proper criterion but maybe it should be taken into account... :)
hanp Posted March 2, 2010
I think that it is a serious flaw if many people cannot understand the scoring method. The IMP table is hard to remember but not hard to understand. Evaluation method (2) will be incomprehensible for many competitors. Throwing away the top and bottom scores will do fine and is much easier to understand.
ArtK78 Posted March 2, 2010
>It would be nice to have a par score, but many boards don't have one. (I remember a board where I could make 6♠ while our opponents could make 6♣.)
All boards have a par. On the one that you referenced, the par would be 7♣x down one.
NickRW Posted March 2, 2010
>I think that it is a serious flaw if many people cannot understand the scoring method. The IMP table is hard to remember but not hard to understand. Evaluation method (2) will be incomprehensible for many competitors. Throwing away the top and bottom scores will do fine and is much easier to understand.
Yes, well, I too hear the "use cross imp - it's the 21st century" argument - and I also see some sense in trying to think of different methods of improving the datum for the Butler method.
I am in a situation where my club is thinking of introducing IMP scored pairs once a month (essentially a lot of the members don't like teams - but some of the better players want more opportunity to practice IMP strategy for when they do play teams outside the club - hence the IMP scored pairs compromise). As the muggins who does much of the scoring and as someone who has to make sure the other scorers know what they are doing, I have to make up my mind which is best.
At the moment I am struggling to see a better option than "standard" Butler, i.e. compute the datum after throwing out the top and bottom scores. Recently I've regularly rescored our MP sessions by IMPs - both cross imp and butler - and though using any sort of IMP scale (often) makes a difference to the placings versus MP, there is usually no difference at all for Butler vs cross imp - not for the 2 winner movements that we normally use anyway.
Nick
Later edit - have tried comparing butler vs ximp with one winner movements - as I suspected, it makes a difference which you use a good deal more often.
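The "standard" Butler datum described here (mean after throwing out the top and bottom scores) might look like this in Python. It is only a sketch: the function name and the `drop` parameter are made up, the example scores are invented, and any rounding of the datum that a real scoring program may apply is left out.

```python
def butler_datum(scores, drop=1):
    """Mean of the scores after discarding the `drop` highest and `drop` lowest.

    If there are too few scores to discard any, all of them are used.
    """
    ordered = sorted(scores)
    if len(ordered) > 2 * drop:
        ordered = ordered[drop:len(ordered) - drop]
    return sum(ordered) / len(ordered)

# Invented board: the -50 and the 1100 are thrown out before averaging.
print(butler_datum([420, 420, 420, 420, -50, 1100]))
```

Each pair's result is then IMPed against this datum in the usual way.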
helene_t Posted March 2, 2010
Given the choice I prefer least squares to random forests <_<
Partially agree with Han. I think people can understand butler in the sense that they understand the mechanics of it, but I don't think many people understand the implications of it, beyond that "you should follow roughly the same strategy as with IMPs", which is of course reasonably accurate. Then again, the same argument holds for the other methods.
Anyway, the solution is well-known. Play XIMPs. XIMPs is more accurate in determining the best pair than Butler, and it's easier to understand the implications since it's the same strategy as IMP Teams.
barmar Posted March 2, 2010
Didn't we just have a XIMP vs Butler debate a couple of months ago?
I came up with a good reason why XIMP is "correct" a few days ago. In the limiting case of just two tables, the XIMP score is the same as the score in a team game. So if IMP pairs strategy is intended to be similar to team strategy, this scoring method corresponds to that.
gwnn Posted March 2, 2010
huh? Butler is also equivalent to Teams if there are only 2 tables.
whereagles Posted March 2, 2010
I'm a bit short on time now, but I'll reread the thread and links later tonight.
I didn't think of cross-imps because, as I said, we want to convert scores into VP. But you can probably do that by dividing the total cross-imps for each board by the nr. of tables and converting the outcome to VP.
barmar Posted March 2, 2010
Let's say you make a non-vul game at one table, and go down 1 at the other table.
XIMP and teams: the scores are +10 and -10.
Butler: the datum is 185, and the scores are +5 and -5.
In this example, XIMP becomes equivalent to Butler if you divide by the number of tables, but that doesn't apply more generally. For instance, try it with a making and failing slam: XIMP is 14 IMPs, Butler is 10.
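The two-table arithmetic can be reproduced with a short Python sketch. The IMP scale is encoded as I recall the standard table, the function names are made up, and the concrete scores are assumptions: 420 for the non-vul game, 920 for a non-vul minor-suit slam, and -50 for one down.

```python
from bisect import bisect_right

# Lower bound in points of each IMP band: 20-40 is 1 IMP, 50-80 is 2, ..., 4000+ is 24.
IMP_BANDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000, 3500, 4000]

def imps(diff):
    """Signed IMPs for a signed point difference."""
    magnitude = bisect_right(IMP_BANDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude

def two_table(score_a, score_b):
    """Return (XIMP score, Butler datum, Butler score) for the pair scoring score_a."""
    ximp = imps(score_a - score_b)      # direct comparison, as in a team match
    datum = (score_a + score_b) / 2     # plain mean of the two results
    return ximp, datum, imps(score_a - datum)

print(two_table(420, -50))  # game made vs. one down
print(two_table(920, -50))  # slam made vs. one down
```

The slam case comes out as 14 against 10, matching the figures in the post. For the game case the datum is 185 and XIMP gives 10, but on the scale as encoded here a 235-point difference falls in the 220-260 band, so the Butler score is 6 rather than 5. Either way, Butler does not reproduce the teams result.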
whereagles Posted March 2, 2010
All right, I'm more or less convinced cross-imps is the deal. We'll probably divide by the number of comparisons to get a more normal-looking result (even if in fractional imps lol) that we can later insert into a VP scale and come up with a VP result.
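Dividing by the number of comparisons could be sketched as below, in Python. The IMP scale is again the standard table as I recall it, the function names and the `divisor` parameter are hypothetical, and no VP conversion is attempted because no VP scale has been specified in the thread.

```python
from bisect import bisect_right

# Lower bound in points of each IMP band: 20-40 is 1 IMP, 50-80 is 2, ..., 4000+ is 24.
IMP_BANDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000, 3500, 4000]

def imps(diff):
    """Signed IMPs for a signed point difference."""
    magnitude = bisect_right(IMP_BANDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude

def average_cross_imps(ns_scores, divisor=None):
    """Cross-IMPs per NS pair divided by `divisor` (default: number of comparisons)."""
    if divisor is None:
        divisor = len(ns_scores) - 1
    return [sum(imps(mine - other) for j, other in enumerate(ns_scores) if j != i) / divisor
            for i, mine in enumerate(ns_scores)]

# Invented board with four tables, so three comparisons per pair.
print(average_cross_imps([420, 420, 450, -50]))
```

The averages are fractional in general, and they still sum to zero across the NS pairs (up to floating-point rounding), whatever divisor is chosen.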
gwnn Posted March 2, 2010
you're right barmar, momentary lack of reason from my side, sorry
whereagles Posted March 2, 2010
I'm just worried about converting imps to VP... dividing by the number of comparisons and adding all games up may require a revised VP scale...
gwnn Posted March 2, 2010
how many boards per round Nuno? I'd be reluctant to turn imps to VP unless there were enough boards per round.
Cascade Posted March 2, 2010
How many is enough?
Not that it matters much to the IMPs but is it normal to divide by the number of comparisons? I thought one more or one less was recommended. It would matter to the VPs.
gwnn Posted March 2, 2010
I would say 6 is enough but it's just an arbitrary line that I made up now. I played in a few events where they made VP's out of 2 board matches and it wasn't very healthy. People took really swingy decisions whenever they had a little disaster in the 1st board. 2 was clearly too short and 8 is intuitively quite enough for these purposes. 6 is OK and I am not sure about 4.
mycroft Posted March 3, 2010
As many people are hinting at (or just assuming everybody understands), the biggest problem with Butler is that the datum is almost always a non-bridge result (185 being a classic). The IMP table has been constructed - key word here - with two keys in mind: cut down somewhat on the "one-hand-is-the-match" problem with total points scoring (while not disappearing it completely - that's B-A-M teams/matchpointed pairs), and make common bridge differences different IMP scores (frequently by putting the "common difference" at the bottom of an IMP band - 10, 20/30, 50, 100, 250, 450, 500, 750 off the top of my head).
The first part is irrelevant here, but the second is totally destroyed when the score you're comparing to *isn't a bridge result*. Whether that actually matters is another problem for the student, but it feels wrong to anyone who is of mathematical bent.
David's "cross-IMP confusion" argument, at least where I "live", seems to have gone out the window; people understand cross-IMPs now, either because they've played through it (and had it explained to them) enough, or because they play online, and OKB and BBO cross-IMP everything.
Yes, there's a problem if you don't pitch results and people play silly buggers on the outliers, but that tends to happen much less often in real life (and when it does, the results tend to be valid ones, such as a passed splinter or a revoke giving the contract, which you *want* to keep, as opposed to a random opening 7NT, redoubled if it comes back), and the more hands in the system, the less the outlier matters. (That's the problem - which Fred has apologized for - with BBO's cross-imp rooms; there are only 15 comparisons, and one dumb result makes a noticeable 1-2 IMP difference. With 50 comparisons, not so much.)