Intentional weird results and possible prevention?

coyot · July 9, 2005

Hello, folks!

I've recently seen another case of a pair scoring 24 IMPs for 7NTxx-13, with obvious intent (a straight bid and a 0-trick claim) by the other pair.

It is impossible to prove intent on the receiving end (and I believe that in most cases there is no intent). Most directors do adjust for these results when found - but especially when this happens in the last round, it escapes detection...

I think that there should be some significant punishment for people that spoil other people's game this way... the only problem is that things like that are very hard to achieve automatically.

One possibility would be to detect all 0 tricks claims by defense at first trick (as these are next to impossible in a real play) - and repeat offenders banned/deleted.

Other idea would involve TDs (and that is why I came to this board):

Perhaps it would be useful to have the system automatically notify the TDs about any weird results for each board - and they could blacklist the offenders globally...

This would be a more subtle solution, catching not only the most obvious cases but any freakish results. I imagine that setting the threshold for suspicious results at 300-500% of average (or anything doubled/redoubled with more than 5 under/overtricks) should be sufficient.

What would you think? Is it okay to just trust TDs to catch all those freaks? Or should there be something in the system to punish them as well?

nickf · July 9, 2005

I saw this the other day too - after an auction 1D - 7NT - X - XX ///. This was the last board of the tournament I also played in. I had been playing fast and was one of the first finished so as usual I was cruising through the other tables to see what was going on the last hand.

The auction described above was clearly retributional. On the previous board the 7NT bidder had opened a strong NT and his partner had raised to 6NT on some balanced 14 hcp filth. This drifted three off.

As soon as I saw the auction on the last board I pmed the Director, asking him to examine the auction on Bd X at Table Y. He pmed me back to say he would look at the board.

I don't know if he adjusted the score but I wonder if consideration can be made for eliminating the top (or top 2) scores in each direction when averages are calculated? This eliminates the effect of unusual scores and is the principle we employ in Australian events.

nickf

sydney

hotShot · July 9, 2005

1) Directors are only aware of such results if they are reported by a player.

2) Such an action, if it comes to the attention of someone, should be reported to abuse (at) ...com They'll know what to do.

3) I do like the idea that the software could give an alert on "strange" results, but I think it is very hard to define a "strange result" for a computer, since a lot of strange results might be simple misunderstandings.

If such an alert could be implemented, it should perhaps be directed to a yellow or to abuse, because they can react to such a violation of site rules in an appropriate way, e.g. ban a player from the site.

uday · July 9, 2005

Auctions like 7Nxx should be reported to abuse....

Rain · July 10, 2005

I think right now, when either the defenders or declarer claims, all tricks appear to have been claimed by declarer. This makes pinpointing fault difficult. It would be nice if this were changed.

Yeah, report these abusive boards to abuse, but remember that if you want on-the-spot action, like a score adjustment, contact the TD.

BurnKryten · July 10, 2005

I think in this case the defenders should refuse to accept the claim and call the director to have him take care of the situation. I think this is far superior to having the director try to sort it out after the fact.

bearmum · July 10, 2005

I think in this case the defenders should refuse to accept the claim and call the director and have him take care of the situation. I think this is far superior than having the director try to sort it out after the fact.

sounds like a good solution to me (in addition report to abuse @ u know )

Double ! · July 10, 2005

Made this suggestion once before, and shall make it again.

I recommend that the program that scores imps be modified so that the highest and the lowest scores not be used in determining the mean score. This should help prevent skewing of the scale as the result of a few seriously abnormal scores in either direction. (It's annoying to play a hand really well, get a plus score, and lose 1.2 imps because someone just posted a plus 500 the same way as the result of a poor contract at some other table.)

Gerardo · July 10, 2005

You are assuming butler score is used (get all the scores, maybe cut the extremes, get the mean). It's not.

Your score is compared with all the others, IMPs are calculated in each comparison, and what you get is the sum of all those results (then this final result is divided by the number of comparisons, to "normalize" it, but this last step is not necessary).

So, there is no mean, and no cut is possible.
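For anyone who wants to see the cross-IMP method described above in concrete terms, here is a minimal Python sketch. The names `imps` and `cross_imps` are made up for illustration, and the threshold table is the standard IMP scale as commonly published; it is an assumption about the scale, not a copy of the site's actual code.

```python
from bisect import bisect_right

# Smallest point difference that earns 1, 2, ..., 24 IMPs.
# Assumption: the standard IMP scale; check it against the official table.
IMP_STEPS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
             3500, 4000]


def imps(diff):
    """Convert a point difference into signed IMPs."""
    size = bisect_right(IMP_STEPS, abs(diff))
    return size if diff >= 0 else -size


def cross_imps(scores):
    """Average IMPs of each result against every other result on the board.

    `scores` holds the scores of one direction (say N-S), one per table.
    """
    n = len(scores)
    averages = []
    for i, mine in enumerate(scores):
        total = sum(imps(mine - other)
                    for j, other in enumerate(scores) if j != i)
        averages.append(total / (n - 1))  # the "normalizing" division
    return averages


# Three tables: two pairs make a vulnerable game, one stops in a partscore.
print(cross_imps([620, 620, 170]))  # [5.0, 5.0, -10.0]
```

Every comparison is counted once from each side with opposite sign, which is why the totals on a board always cancel out.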

nickf · July 10, 2005

Auctions like 7Nxx should be reported to abuse....

Uday is probably right here - but should this be the business of the director or of anyone who witnesses the unusual?

And where do we draw the line? I mean, I'd like to report the guy this afternoon who had a blackwood accident, wound up in a grand off three keycards and caused me to lose 5 imps against the average.

At least I *think* it was an accident. But who knows what evil lurks in the minds of men (or however that saying goes)? But you get my point?

nickf

sydney

coyot · July 10, 2005

I would perhaps want to split this topic into two:

1) Ignoring extreme results when counting the IMP average for a board - which in general has nothing to do with abusive players and should be discussed as a separate thread

2) Protecting the whole tournament from abusers. This topic could be discussed from several views:

a) spoiling the average (solved by the above)

b) giving the defending pair a good result they do not deserve.

If the defenders are supposed to refuse the claim and call the director, would not doing so be considered unfair? Would the defenders who gladly accept 24 IMPs be punished for doing so or not?

c) giving the TD more work handling these calls.

d) reporting to abuse

e) catching the offenders automatically by the system.

I personally do not fear that the last step would be too difficult. For a first start, two simple conditions could work together:

1) board results in more than 4(5) under/overtricks than bid

2) board score results in more than 300% of average score for this board.

These conditions together should cover most of unbid slams and even sacrifices gone way wrong... (i.e. non-vul down 5 against vulnerable game would be 1100 vs 600, still within 200%)
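A rough sketch of how the two conditions could be combined, in Python. Everything here is hypothetical: the function name `is_suspicious`, its parameters, and in particular the reading of "average score for this board" as the mean absolute score at the other tables are assumptions made for illustration only.

```python
from statistics import mean


def is_suspicious(level, tricks_taken, score, other_scores,
                  trick_margin=4, ratio=3.0):
    """Return True only when both conditions hold.

    level         -- contract level, 1..7
    tricks_taken  -- tricks declarer actually took
    score         -- declarer's score on this board
    other_scores  -- same-direction scores from the other tables
    """
    tricks_needed = level + 6
    # Condition 1: more than `trick_margin` under- or overtricks.
    big_trick_gap = abs(tricks_taken - tricks_needed) > trick_margin
    # Condition 2: score larger than `ratio` times the typical score.
    # Assumption: "average" means the mean absolute score elsewhere.
    typical = mean(abs(s) for s in other_scores)
    big_score = abs(score) > ratio * typical
    return big_trick_gap and big_score


# 7NTxx down 13, vulnerable, on a board where the field makes game.
print(is_suspicious(7, 0, -7600, [620, 620, 650, 620]))  # True
# Non-vul 5-level sacrifice, doubled, down 5 against a vulnerable game.
print(is_suspicious(5, 6, -1100, [-600, -600, -620]))    # False
```

The second call is the sacrifice example from the post: the trick gap is large, but 1100 against roughly 600 stays well under the 300% threshold, so the board is not flagged.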

If the system could automatically "call" the director (or perhaps notify him in less demanding way), the TD could then be given the choice of either approving the result

or assigning ave+ to defenders and some hefty penalty to offenders (plus automatically reporting them to abuse and adding them to a general blacklist).

I don't know yet if the scale of this problem is worth such a fuss, though. Maybe a simple check of the results list would show that there are only a few cases each day...

Currently I see the biggest problem in the fact that (especially when this happens in the last round of tourney) some TDs do not (for various reasons) act on it.

In my case, the defenders scored 24 IMPs on the very last board and won the tournament by about the same margin. I don't know if it is possible for TDs to adjust results when the tournament has already finished - but in this case, when I (accidentally) discovered this result and told the TD, he said something like he couldn't adjust...

If you feel that this problem is not worth any system changes, I would at least propose a very strict policy against those abusers - that is, an account ban on the 2nd offence (if not on the 1st :)).

V.

coyot · July 10, 2005

You are assuming butler score is used (get all the scores, maybe cut the extremes, get the mean). It's not.

Your score is compared with all the others, IMPs are calculated in each comparison, and what you get is the sum of all those results (then this final result is divided by the number of comparisons, to "normalize" it, but this last step is not necessary).

So, there is no mean, and no cut is possible.

This is not true. Cuts are possible even if there is no mean.

The only thing you need to do is, when walking the list of existing results, find its max and min and skip those.

Using sum of IMPs divided by amount of used results is in fact a nice and fair method, because you can gain fractions, i.e. for playing NT instead of major etc. - but it can still involve cuts of extremes.

It would only mean that you could not use a simple loop that goes through all results - you would have to use min() and max() functions on the result set and two ifs in the code.

It would probably make the whole code significantly slower - but if performance proves troubling, it can always be reworked from the simple method with min(), max() and ifs into some cleverer single-pass function.

To simplify matters further, the whole recalculation could be done only once when all the results for the board are present - no need to do the cuts on the fly. Now that would be really fairly simple and quick.
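One possible reading of this proposal, sketched in Python: once all results for the board are in, drop one top and one bottom result from the comparison set, then score every pair (including the two that were cut) against what is left. The names `imps` and `trimmed_cross_imps` are hypothetical, and the IMP table is the standard scale as commonly published, taken here as an assumption.

```python
from bisect import bisect_right

# Smallest point difference that earns 1, 2, ..., 24 IMPs (assumed scale).
IMP_STEPS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
             3500, 4000]


def imps(diff):
    """Convert a point difference into signed IMPs."""
    size = bisect_right(IMP_STEPS, abs(diff))
    return size if diff >= 0 else -size


def trimmed_cross_imps(scores):
    """Cross-IMPs where one max and one min are skipped as opponents."""
    if len(scores) < 4:
        raise ValueError("need at least 4 results to cut both extremes")
    hi = scores.index(max(scores))  # first top result
    lo = scores.index(min(scores))  # first bottom result
    kept = [i for i in range(len(scores)) if i not in (hi, lo)]
    averages = []
    for i, mine in enumerate(scores):
        # Compare against every kept result except your own.
        rivals = [scores[j] for j in kept if j != i]
        averages.append(sum(imps(mine - r) for r in rivals) / len(rivals))
    return averages


# Twelve pairs in 4S= vulnerable, one pair in 2S+2.
result = trimmed_cross_imps([620] * 12 + [170])
print(result[0], result[-1])  # 0.0 -10.0
```

Under this reading the twelve game bidders score exactly 0 and the partscore pair takes the full -10, so the board no longer sums to zero.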

Elianna · July 10, 2005

You are assuming butler score is used (get all the scores, maybe cut the extremes, get the mean). It's not.

Your score is compared with all the others, IMPs are calculated in each comparison, and what you get is the sum of all those results (then this final result is divided by the number of comparisons, to "normalize" it, but this last step is not necessary).

So, there is no mean, and no cut is possible.

I'm not sure that I understand: isn't THIS taking a mean also (but with the IMP scores instead of the raw scores)?

hotShot · July 10, 2005

There is a simple reason not to split this into more than one topic.

A score of 7NTXX-13 on one side does indeed heavily influence all results on the board. Not only will all pairs sitting the same way collect a lot of minus IMPs, those on the other side will gain about the same in plus IMPs. The tourney score is worthless after that.

So the offender does not only ruin his own score, he destroys the scores of half the participants in the tourney. If a TD wants to save his tourney he needs to adjust the board. But to what score should he adjust?

Not knowing the players' system, their ability etc., any decision made will be somewhat unjust. The best he can do is assign ave= to both sides so that the others get a fair score. But the offenders would go unpunished until "abuse" hits them.

So handling extreme scores while calculating the board result could help a lot here. Introducing some sort of "cut" is of course possible. One could calculate the cross-IMPs for all others ignoring the cut results. The players involved in a cut result get their cross-IMPs calculated including their own result, once the TD has decided not to assign an artificial score.

hotShot · July 10, 2005

Using a percentage limit might not work well. Consider cases like:

-2 nv = 100

X-2 nv = 300

XX-1 nv = 400

Or: 2♠= and 2♠X=

Gerardo · July 10, 2005

You are assuming butler score is used (get all the scores, maybe cut the extremes, get the mean). It's not.

Your score is compared with all the others, IMPs are calculated in each comparison, and what you get is the sum of all those results (then this final result is divided by the number of comparisons, to "normalize" it, but this last step is not necessary).

So, there is no mean, and no cut is possible.

This is not true. Cuts are possible even if there is no mean.

The only thing you need to do is, when walking the list of existing results, find its max and min and skip those.

Using sum of IMPs divided by amount of used results is in fact a nice and fair method, because you can gain fractions, i.e. for playing NT instead of major etc. - but it can still involve cuts of extremes.

It would only mean that you could not use a simple loop that goes through all results - you would have to use min() and max() functions on the result set and two ifs in the code.

And how do you calculate the result for the extremes? And why an honest disaster should not be included?

You end up with a hybrid/changed method, which must be fairer than the current one in the GENERAL case. Why should the top scorer get less (what you are proposing implies this)? Or the bottom scorer more?

IMO, you are ruining the method in all cases for just the corner cases (dumped hands)....

...which should be managed by averaging the hand, which has the same end result when these hands appear (the hand is discarded for calculations). BUT non-offenders should get A+, and offenders A- (and a procedural penalty if/when available; I'd say 100% of a top should be fine). AND the dumping bidder should get a suspension from tourneys. TDs must do the first; the second is not (yet, hopefully) available, and would then be at TD discretion; and abuse does the third, WHEN REPORTED.

BTW, results can be adjusted while tourneys are still listed, even after they are completed.

I'd also say even "no adjustments" TDs must correct these (<BBO hat off> as an aside, I don't see the point in "no adjustment" TDs <BBO hat on>).

inquiry · July 10, 2005

Be sure to report these to abuse; the record is easily checked, HARSH PENALTIES are frequently applied to people who do this, and repeat offenders are banned from the site. And yes, report this immediately to a yellow if the TD is not adjusting, as yellows will.

coyot · July 10, 2005

And how do you calculate the result for the extremes? And why an honest disaster should not be included?

You end up with a hybrid/changed method, which must be fairer than the current one in the GENERAL case. Why should the top scorer get less (what you are proposing implies this)? Or the bottom scorer more?

IMO, you are ruining the method in all cases for just the corner cases (dumped hands)....

1) Results for the extremes are calculated, of course, by comparison to the "cleaned" average.

2) What is the purpose of IMP scoring, compared to matchpoints? That the more extreme result you get, the bigger score you get. But, for example, playing in Main Bridge Club at a table, you would want your scoring to reflect, if possible, the ability of your pair to outbid and outplay the opponents against something that should be called the "par of the board". The score I achieve at the table should be the least possibly affected by one of the other 15 pairs having an honest disaster.

3) You seem to have misunderstood me. In fact, the top scorers will get extra benefit.

If there is a laydown 3NT for +600 and 15 pairs end up in the right contract, the one pair that scores 150 for 2NT+1 will NOT cause the 15 pairs to receive +1 IMP for doing right. (And, which is even more important, it will not cause the other 15 pairs, defending a laydown game, to receive -1 IMP for doing nothing wrong.) They will be the only suffering pair, suffering the full 450-point penalty...

Again: the general principle is that you discard, e.g., one extreme result from each end (for 16 results), determine the IMP average and then score all 16 results against this simple average. This way, one freak result does not change the par of the board - while both pairs involved in that freak result get their big win/loss - possibly even slightly bigger than under current conditions...

McBruce · July 11, 2005

coyot, you are confusing Butler scoring and cross-IMP scoring. BBO and most online sites use cross-IMPs (referred to as Average IMPs by ACBLScore).

In cross-IMPs, you compare every pair against every other pair once and take the average score.

In Butler scoring, you remove about 5-10% (one out of 13, more if there are more results) of the extreme scores from either end, take an average of the rest, and compare all the results to this completely artificial average (which often is an unmakable score).

Suppose a board has these N-S scores:

1 x 1460

4 x 1430

1 x 1400

1 x 800

2 x 710

7 x 680

2 x -100

BUTLER: The average result, taking two off the top and bottom is +905; we round to +910 and compare:

1460 wins 11

1430 wins 11

1400 wins 10

800 loses 3

710 loses 5

680 loses 6

-100 loses 14

CROSS IMPS: Each result has to be compared against each other result. If you scored 1430 you push against the three other 1430s, you lose 1 to the 1460, win 1 against the 1400, win 12 against the 800, win 12 twice against the 710s, win 13 seven times against the 680s, and win 17 twice against the two -100s. This adds to plus 161 IMPs in 17 comparisons, for a rounded average of +9.47 IMPs. Here are the rest:

1460 wins 9.94

1430 wins 9.47

1400 wins 8.71

800 loses 1.00

710 loses 2.53

680 loses 3.29

-100 loses 13.71

Most good players feel that Cross-IMPs is better. Cross-IMPs always adds to precisely zero. The Butler method above adds to -18, meaning that east-west pairs will gain an average advantage of a full IMP just for sitting the right way. (Such advantages usually--but not always--even out over the course of the session.)
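The figures above can be checked with a short Python sketch. The helper names (`imps`, `butler`, `cross_imp_totals`) are made up for illustration, the IMP table is the standard scale as commonly published (an assumption), and rounding the datum half-up to the nearest 10 is also an assumption, chosen because it matches the +905 to +910 step in the example.

```python
import math
from bisect import bisect_right

# Smallest point difference that earns 1, 2, ..., 24 IMPs (assumed scale).
IMP_STEPS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
             750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
             3500, 4000]


def imps(diff):
    """Convert a point difference into signed IMPs."""
    size = bisect_right(IMP_STEPS, abs(diff))
    return size if diff >= 0 else -size


def butler(scores, cut=2):
    """Return (datum, IMPs per result) with `cut` results dropped per end."""
    ordered = sorted(scores)
    kept = ordered[cut:len(ordered) - cut]
    average = sum(kept) / len(kept)
    datum = math.floor(average / 10 + 0.5) * 10  # round half up (assumption)
    return datum, [imps(s - datum) for s in scores]


def cross_imp_totals(scores):
    """Sum of IMPs for each result against every other result."""
    return [sum(imps(a - b) for j, b in enumerate(scores) if j != i)
            for i, a in enumerate(scores)]


scores = ([1460] + [1430] * 4 + [1400] + [800]
          + [710] * 2 + [680] * 7 + [-100] * 2)

datum, butler_imps = butler(scores)
print(datum, sum(butler_imps))               # 910 -18

totals = cross_imp_totals(scores)
print(totals[1], round(totals[1] / 17, 2))   # 161 9.47
print(sum(totals))                           # 0
```

The Butler column sums to -18 over the 18 N-S results, while the cross-IMP totals cancel exactly, as claimed.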

Some have said that Cross-IMPs 'flattens' the scores, and advocate dividing instead by (comparisons minus one). Butler critics claim that the folks who devised the IMP scale never considered artificial scores like +910 above; comparing scores against an artificial score that nobody can make is flawed.

Taking out the extreme scores in Butler doesn't do enough to reduce the effect of wild scores. The theory is that we are removing the extremes to make a better average score to compare against. But this average will often be significantly different if there are wild scores involved.

The average above (two removed from each side) is +905. Add a +7600 (7NT redoubled down 13) and the average (even with the 7600 removed) is now +940. Add instead a minus 7600 and the average is now +838. Add one at both ends and the average is still not quite the same: +875. They still have their effect.
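The shift in the trimmed average is easy to reproduce. A small Python sketch, with `trimmed_mean` as a hypothetical helper that drops two results from each end before averaging:

```python
def trimmed_mean(scores, cut=2):
    """Mean of the scores after dropping `cut` results from each end."""
    ordered = sorted(scores)
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


base = ([1460] + [1430] * 4 + [1400] + [800]
        + [710] * 2 + [680] * 7 + [-100] * 2)

print(trimmed_mean(base))                   # 905.0
print(trimmed_mean(base + [7600]))          # 940.0
print(trimmed_mean(base + [-7600]))         # 838.0
print(trimmed_mean(base + [7600, -7600]))   # 875.0
```

The wild score itself is always cut, but it pushes an ordinary result back into the kept set, and that is what moves the average.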

The only solution to deliberately wild scores in Cross-IMPs is to adjust them to an artificial score of average minus or zero, depending on fault, and recalculate all of the non-artificial scores. Any solution involving tossing out extreme scores, or comparing the scores in the middle extra times, has serious flaws.

coyot · July 11, 2005

coyot, you are confusing Butler scoring and cross-IMP scoring. BBO and most online sites use cross-IMPs (referred to as Average IMPs by ACBLScore).

...

Oh, I see the difference now... was too lazy to do the experiment and thought that Cross IMPS would produce nearly the same results...

I agree with the statement that removing extremes does not purify the average completely in most cases - but you should also consider the effect of Butler-type cuts on flatter hands:

When everyone plays 4S= and one pair plays 2S+2, every defending pair will get a bad result "for just sitting there". I'm not sure how seating algorithms work in various BBO tourneys - are they all random enough that no one will suffer an average of 1 IMP per board for sitting on the same side of the table as some aggressive freaks?

(3/4 Howell or Mitchell comes to mind)

I personally think that it would be possible to do CrossImps with the Butlerish cut - because the 2.7 IMP difference between CrossIMPs and Butler in the example is caused by the fact that two pairs went down one in the slam. This causes the difference between CrossIMP 3.29 and Butler 6IMP penalty for not reaching the small slam.

Butler has the advantage that its cuts protect the innocent from one freaky pair - and while its critics may rightfully claim that you get scored against a non-makable score, I would only say that in the long term, it does not really matter. If half of the field bids the slam and half does not, your average IMP gain in cross-IMPs will STILL look like a comparison to a non-makable score.

I don't understand the part that says that Butler compares to -18... how does the number of results come into it?

McBruce · July 11, 2005

I don't understand the part that says that Butler compares to -18... how does the number of results come into it?

1 x 1460

4 x 1430

1 x 1400

1 x 800

2 x 710

7 x 680

2 x -100

1460 wins 11 once, total +11

1430 wins 11 four times, total +44

1400 wins 10 once, total +10

800 loses 3 once, total -3

710 loses 5 twice, total -10

680 loses 6 seven times, total -42

-100 loses 14 twice, total -28

11 + 44 + 10 - 3 - 10 - 42 - 28 = -18, so the average NS pair is -1! I think this is an extreme example; oddly enough, I made it up out of thin air!

In Cross-IMPs the math works out so that it always balances.

When everyone plays 4S= and one pair plays 2S+2, every defending pair will get a bad result "for just sitting there".

Assuming 13 tables and vulnerable, in cross IMPs this is only a 0.83 IMP swing, since you lose ten once and push eleven times. Hardly a bad result. In Butler this would be an average because we discard the extremes: but who knows, maybe this is a board that only 12 out of 13 pairs will risk game on? Who are we to assume that it is a universal flat board because only 1 out of 13 pairs misses the game. My opinion is that a hand like this is an example of the disadvantages of Butler scoring.

If you Google 'Butler Cross-IMPs' you will find lots of online discussion on this.

helene_t · July 11, 2005

Removing the extremes only solves a small fraction of the problem. I don't think it's worthwhile.

Even with extremes removed, most of your success (well, at least most of the successes I enjoy in tournaments) will still come from opponents who make absurd bidding mistakes or other blunders.

It would be nice if the TDs had time to clean up dumping results, but still, dumping is not the major cause of outliers.

The only way to get clean results is to homogenize the field (when feasible), or to aggregate over multiple sessions. Otherwise, don't take the results too seriously. It's just a card game.

coyot · July 11, 2005

When everyone plays 4S= and one pair plays 2S+2, every defending pair will get a bad result "for just sitting there".

Assuming 13 tables and vulnerable, in cross IMPs this is only a 0.83 IMP swing, since you lose ten once and push eleven times. Hardly a bad result. In Butler this would be an average because we discard the extremes: but who knows, maybe this is a board that only 12 out of 13 pairs will risk game on? Who are we to assume that it is a universal flat board because only 1 out of 13 pairs misses the game. My opinion is that a hand like this is an example of the disadvantages of Butler scoring.

If you Google 'Butler Cross-IMPs' you will find lots of online discussion on this.

Oh, I see... you would have to use continuous IMP scale to get rid of this problem... that 910 Butler average happens to be near to the wrong (for NS) boundary on IMP scale...

To the quoted part:

If 12 of 13 pairs risked the game, I don't think they deserve any extra points... When the whole field bids it except for some natural pessimist, they don't play "better" bridge than the rest of the field. It is only the pessimist that should suffer on this board.

I think that we can safely assume that if a large majority of the field ends up in a game contract, the game contract should be considered "par of the board" and only those that score more than par should receive points.

If you look at it from the defending side: who are we to assume that the 12 NS pairs who bid the game play better bridge than the 12 EW pairs that just lost 0.83 IMP?

I think the scoring system should try its best to reward those who play better bridge...

Who are we to assume that it is a universal flat board because only 1 out of 13 pairs misses the game.

Pardon me, but I thought that the definition of "flat board" IS a board where a large majority of pairs playing common bidding systems and not affected by opponents psycho bids end up in the same contract and score the same result.

Even if there are alternative ways of making the contract, if the large majority of pairs chooses the one that succeeds, I consider it a flat board.

but who knows, maybe this is a board that only 12 out of 13 pairs will risk game on

It's about the same risk as betting that the next US president will be either a Democrat or a Republican... 12 of 13 people think so... and that is why no betting agency ever runs bets like this with a win:bet ratio above 1.

In other words - cutting the extreme results would bring more good than bad - and it would bring the good largely to the pairs that were not able to affect the result significantly.

There is the question if and how could extreme-cutting be incorporated into cross-imps - how to determine the score for the pairs whose results were cut out as extreme (in order not to disrupt the IMP balance etc...)

IMHO Butler with fractional IMP scale would work quite well. We're using normalized IMPs with decimal parts already (although they have their nice whole IMPs behind them) and I think it would work quite well.

[but I promise to stop spamming the forums with this topic if no one else supports the idea :-)]

Gerben42 · July 11, 2005

You can use Herman de Wael's improvement of Butler instead:

For more information about "Bastille" see:

http://users.skynet.be/hermandew/bridge/bastille.html

reisig · July 11, 2005

Why try to change a whole bridge scoring program because of a player who probably should be banned? The director should be called and an adjusted score given. Then the crazy bidder should be reported, and the proper punishment assigned by BBO.
