Zar points, useful or waste of energy

mike777 · September 7, 2005

I mentioned earlier that I tried reading Mr. Zar but could not understand page one let alone the rest.

Am I alone on this site in not understanding this debate?

What is the hypothesis that is being tested, assuming there is one? If so, could someone post it not only in math terms but also in plain bridge English terms? Both would help me.

Thank you in advance.

awm · September 7, 2005

Here's my understanding of the debate.

We would like to design a good hand evaluation measure.

The idea of such a measure is that you can look at your hand and compute some number. Your partner looks at his hand and computes some number. By combining these two numbers together, without any other information about the hands, we can decide whether we have game, whether we have slam, and so forth with a reasonable degree of accuracy.

Why do we want a hand evaluation measure?

The goal in bridge is to find your best contract. The problem is, you don't have enough bidding space to exactly describe every card to partner. Since I can't just lay my hand on the table and let partner pick a contract, I need to select a (relatively small) amount of data to communicate such that partner can do a reasonably good job.

Isn't this business about computing numbers an oversimplification?

Yes, obviously so. But people have historically used it as a starting point at the table. The first attempt at a hand evaluation system (that I know of) is the Milton Work point count (4-3-2-1, familiar to most of us) with 26 being enough for game. Once we agree on a basic system, we can start worrying about how to adjust the evaluation when some additional information about length of suits has been communicated as well.

So what's the debate about?

Zar has designed a method of hand evaluation which he believes to be good. This method counts as follows:

Start with your standard 4-3-2-1 point count. Add two points for each ace and one for each king (controls). Now add the sum of the lengths of your two longest suits. Now add the difference of your longest suit length and shortest suit length. This gives you the "number" described above.

Zar claims that you should bid game if your number plus your partner's is 52. Thresholds are also given for slam. Of course, like any system that doesn't take into account relative shapes, this is of limited accuracy. Zar has proposed fit/misfit points to adjust for that once the shapes are known.
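The count described above can be sketched in a few lines of Python. This is only an illustration of the basic count as stated in this post (no fit/misfit adjustments); the hand notation (four strings, one per suit, with "x" for a small card) and the two example hands are assumptions made up for the example.

```python
# Minimal sketch of the basic Zar count as described in this post.
# The hand format and the example hands are illustrative assumptions.
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}   # standard 4-3-2-1 count
CONTROLS = {"A": 2, "K": 1}              # two per ace, one per king
GAME_THRESHOLD = 52                      # combined count quoted for game


def zar_points(hand):
    """hand: four strings, one per suit, e.g. ["AKxxx", "Qxx", "xx", "Kxx"]."""
    honors = sum(HCP.get(card, 0) + CONTROLS.get(card, 0)
                 for suit in hand for card in suit)
    lengths = sorted((len(suit) for suit in hand), reverse=True)
    # sum of the two longest suits, plus longest minus shortest
    distribution = (lengths[0] + lengths[1]) + (lengths[0] - lengths[3])
    return honors + distribution


opener = ["AKxxx", "Qxx", "xx", "Kxx"]
responder = ["Qxxx", "Axxx", "Kxx", "xx"]
total = zar_points(opener) + zar_points(responder)
print(zar_points(opener), zar_points(responder), total >= GAME_THRESHOLD)
```

A void is written as an empty string in this notation, so a 5-4-4-0 hand still has four entries.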

The following questions are being addressed:

(1) Is Zar's evaluation scheme better than others in the literature? The other schemes include the 4-3-2-1 count, losing trick count, and various others. Zar has run through a very large library of hands, computing for each one the Zar count and comparing its "predicted number of tricks" to the actual number. His data suggest that his method outperforms the competing methods. Tysen mentions that Zar doesn't include the BUMRAP point method (basically a 4.5-3-1.5-0.75-0.25 scale). Tysen claims that the main reason Zar outperforms other methods is the reweighting of the honors (i.e. aces are underweighted in the 4-3-2-1 scheme and quacks overweighted) and that Zar's method of counting distribution by adding/subtracting suit lengths is actually less accurate than adding points for singletons and voids.

(2) Do the additional fit/misfit points Zar adds when distribution is known accurately reflect these sorts of features? Here less seems to be known; Tysen points out that Zar's scheme of adding fit/misfit points seems to weigh things differently depending on who opens (with two identical hands), which seems kind of odd.

(3) There is also some debate about what is simple to compute at the table, and what is reasonably simple to explain to opponents.

tysen2k · September 7, 2005

Superfit points are calculated straightforward – 0123 for the Zar Ruffing method and straight 3 for the ZP3. Obviously regardless of “opener” – since there is simply no opener.

Okay, then is this 2 or 4 superfit points if there is no opener?

xxxxx

x

xxx

xxxx

xxxxx

xx

xxx

mike777 · September 7, 2005

Wow, another very clear, well-written post. Thank you very much.

At the risk of getting swatted down, let me try to start at the very beginning.

"The goal in bridge is to find your best contract." For the sake of discussion, "hand evaluation" is a very important tool in achieving this goal. So Zar and others are devising hand evaluation methods to achieve this goal.

Ok, here I step out on a limb. I do not think "THE goal of bridge is to find your best contract" and therefore this debate is centered on an incorrect goal.

I would argue the goal of bridge is to WIN. Which smaller goals we need to achieve in order to win is a good debate. One goal may be to find your best contract; another may be to make life difficult for the opponents. Perhaps hand evaluation should have other goals besides reaching the best contract, perhaps not.

In any event I do not see how you find the best "hand evaluation" method without first debating what the goal of hand evaluation should be. Maybe it should be finding the best contract, maybe it should be something else.

Zar · September 7, 2005

>

I mentioned earlier that I tried reading Mr. Zar but could not understand page one let alone the rest.

<

Page one is the title, Mike – not much to understand there :-) I have lots of requests to provide a “short” and “no-data-please” presentation of the material. Even requests like “Zar, I unconditionally trust you about any information you present – no NEED to give me proofs and traces of changes of the data so I can even follow the tendency. Just strip the damn thing from any tables and numbers that make me dizzy and give me the conclusions for me to use”.

Mike, I believe you have the same type of request AND I am planning to do a “stripped-down-version” for people that do not need proof and trace of tendencies. That is a perfectly fine request and I will honor it.

>

Am I alone on this site regarding not understanding this debate? What is the hypothesis that is being tested, assuming there is one? If so could some one post it not only in Math terms but also in plain bridge english terms? Both would help me. Thank you in advance.

<

The general “Debate” is actually along the lines of a “pursuit for perfection” in the good sense of the word and as I have said several times “nobody’s perfect” so ... it boils down (apart from the pure Zar-Points-approach discussion) to trying to measure the methods used by “normal people” at the table and see which one makes more sense (and the meaning of that by itself is a partial matter of the debate).

It started with my attempts to test Zar Points against popular methods and see if it is even worth presenting here and there. It measured “aggressiveness” only and I was rightly accused that this is just “one side of the coin”. Then I decided to make a “complete coverage” of the spectrum and test all the 3 important boundaries (Game, Slam, and Grand) each from BOTH sides of the fence: overbidding and underbidding. It was a “match” on all the 105,000 boards that have between 9 and 13 tricks in Spades (out of 1,000,000 boards) in the NS direction, to see what happens. In my view that was the “ultimate test” that everyone would understand: measured in IMPs, everything on the table, nowhere to hide. Then the debate was pushed into the area you don’t like with all the Variances, Standard deviations, Means etc. (if you think you cannot understand the book, wait until I publish all the data and analysis from THIS exercise :-)

Is all this good? Absolutely. I think we all learn a lot in the process and the thread itself is very active which to me means that the stuff discussed is interesting.

ZAR

ochinko · September 7, 2005

I am thankful for the works of Zar and Tysen, and happy that they are still researching and present here in the discussion.

I would like to add two more things. The first one should be fairly obvious. BUMRAP and ZAR evaluate all the honors (except for the Ten) in the same way. 4.5-3-1.5-0.75 has exactly the same ratio between the honors as 6-4-2-1. If you add 0.25 for the Ten the first one has a sum of 10 just like Milton Work's. The second one is easier to remember because you have to add the HCP to the values of high card controls as counted in the Blue Club (A=2, K=1).

The other thing (which would be quite a relief for people that don't want to be bothered with new evaluation schemes) is that another researcher (Thomas Andrews - http://bridge.thomasoandrews.com/valuations/) found out that on NT contracts BUMRAP doesn't perform any better than Work's point count. He proposes his own evaluation method for NT (A=4, K=2.8, Q=1.8, J=1, T=0.4), which he calls "fifths" because you take one fifth of a point from Kings and Queens, and give two fifths to the Ten.
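The arithmetic behind these two remarks can be checked with a short sketch. The scale values are the ones quoted in this post; the dictionary layout and function names are just illustrative assumptions.

```python
from fractions import Fraction as F

# Honor scales as quoted in this thread (values per A, K, Q, J, T).
SCALES = {
    "Work":   {"A": F(4),    "K": F(3),     "Q": F(2),    "J": F(1),    "T": F(0)},
    "BUMRAP": {"A": F(9, 2), "K": F(3),     "Q": F(3, 2), "J": F(3, 4), "T": F(1, 4)},
    "Zar":    {"A": F(6),    "K": F(4),     "Q": F(2),    "J": F(1),    "T": F(0)},
    "Fifths": {"A": F(4),    "K": F(14, 5), "Q": F(9, 5), "J": F(1),    "T": F(2, 5)},
}


def per_suit_total(scale):
    """Total points available in one suit under the given scale."""
    return sum(scale.values())


def relative_to_jack(scale):
    """Values of A, K, Q, J expressed as multiples of the jack."""
    return [scale[h] / scale["J"] for h in "AKQJ"]


for name, scale in SCALES.items():
    print(name, float(per_suit_total(scale)),
          [float(x) for x in relative_to_jack(scale)])
```

Exact fractions are used so that the ratio comparison between BUMRAP and the 6-4-2-1 scale is not disturbed by floating-point rounding.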

Petko

tysen2k · September 7, 2005

To copy the metric being developed in the other thread on DD evaluations (which is starting to generate some interesting discussion), let's see how it applies to shapes as a whole.

Given the shape of our hand, what is the chance that we have a game?

Shape    Game?  531    Zar
4-3-3-3   25%
4-4-3-2   26%
5-3-3-2   27%
5-4-2-2   28%
6-3-2-2   30%
4-4-4-1   32%   <--
5-4-3-1   33%   <--
7-2-2-2   34%   <--    <--
6-3-3-1   34%   <--    <--
6-4-2-1   37%
5-5-2-1   38%          <--
7-3-2-1   38%
5-4-4-0   45%          <--
5-5-3-0   46%
6-4-3-0   47%
6-5-1-1   49%
7-4-1-1   50%
6-5-2-0   54%
7-4-2-0   55%

The first set of arrows shows hands that a 5-3-1 system counts as "equivalent." The second set of arrows shows some hands that Zar counts as equivalent. Which set looks more tightly clustered to you?
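One way to look at the clustering question is to group the shapes by the distribution score each method gives them and compare the range of game percentages inside each group. The sketch below does that with the percentages from the table above; the 5-3-1 rule (5 for a void, 3 for a singleton, 1 for a doubleton) and the Zar distribution formula are as described in this thread, while the function names are assumptions for illustration.

```python
from collections import defaultdict

# Game percentages by shape, copied from the table above.
GAME_PCT = {
    (4, 3, 3, 3): 25, (4, 4, 3, 2): 26, (5, 3, 3, 2): 27, (5, 4, 2, 2): 28,
    (6, 3, 2, 2): 30, (4, 4, 4, 1): 32, (5, 4, 3, 1): 33, (7, 2, 2, 2): 34,
    (6, 3, 3, 1): 34, (6, 4, 2, 1): 37, (5, 5, 2, 1): 38, (7, 3, 2, 1): 38,
    (5, 4, 4, 0): 45, (5, 5, 3, 0): 46, (6, 4, 3, 0): 47, (6, 5, 1, 1): 49,
    (7, 4, 1, 1): 50, (6, 5, 2, 0): 54, (7, 4, 2, 0): 55,
}


def points_531(shape):
    """5 for each void, 3 for each singleton, 1 for each doubleton."""
    return sum({0: 5, 1: 3, 2: 1}.get(n, 0) for n in shape)


def zar_distribution(shape):
    """Two longest suits, plus longest minus shortest."""
    s = sorted(shape, reverse=True)
    return (s[0] + s[1]) + (s[0] - s[3])


def game_range_by_score(score):
    """Map each score to the (min, max) game percentage of shapes with that score."""
    groups = defaultdict(list)
    for shape, pct in GAME_PCT.items():
        groups[score(shape)].append(pct)
    return {pts: (min(v), max(v)) for pts, v in sorted(groups.items())}


for name, score in [("5-3-1", points_531), ("Zar", zar_distribution)]:
    print(name)
    for pts, (low, high) in game_range_by_score(score).items():
        print(f"  {pts:>2} points: {low}%..{high}% (spread {high - low})")
```

The min-to-max spread is a crude measure chosen for brevity; it ignores how often each shape is dealt.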

hrothgar · September 8, 2005

It’s a good idea to check it out first though – just for yourself so you don’t get embarrassed in public. You can go to the website and check automatically the 7-4-1, 5-3-1, and 3-2-1. Yeah ... it will take some reading, sorry.

I have a better idea... BUMRAP + 5/3/1 is one of the distributions that people are most interested in. Wouldn't it make sense to bother to post the results rather than forcing folks to wade through your web site?

I'm still a bit confused regarding the accuracy of the Goren 4/3/2/1 point count...

When Tysen provided standard error calculations for a variety of hand evaluation metrics he posted the following data:

            R2     Standard Error
Zar + fit   0.74   1.05
HCP         0.65   1.21

It might be worthwhile to try to reconcile the difference...

inquiry · September 8, 2005

I am not exactly sure you want to compare ZAR initial evaluation to BUMRAP 5+3+1 using the data you created in the other thread, or BUMRAP will suffer by the comparison. But since you do, ok.. here goes.

First, the table you show here is about whether GAME can be made. This depends upon a statistical evaluation of the chance of a fit, whether the fits found are in the majors or not, etc. For example, in the table you quote so happily as showing that BUMRAP 531 bundles numbers close together, you ignore the fact that each game percentage is a mixture of shapes. Take the 5440 shape: you quote it as 45%, when in fact your data suggest it is either 49% (if the 5-4 is in the majors) or 42% (if the 5-4 is in the minors). As you correctly point out in the other thread, the difference here is related to the fact that with 5-4 in the minors, if you have a minor fit you need to take one more trick. But even with 5-4 in the majors, some percent of the hands will make game in the four card major (or NT). So this is not an easy evaluation to make. And BUMRAP 531 and ZAR take separate paths once the bidding has progressed: if a fit is found, ZAR with a 5440 hand will balloon up by at least 3 more points, and possibly by 9. That is between 0.5 and 1.5 additional tricks. And if no fit is found, the ZAR evaluation of this hand might shrink. So for "GAME" evaluation, you have to take into account the statistical probabilities of fits, and what effect such fits would have on the evaluation of these patterns (that alone is interesting enough; I have toyed with it using your data to see if ZAR's "correction" factors for FIT and MISFIT are close).

But instead let's just deal with the concept of "DISTRIBUTIONAL POINTS" from the hand patterns in isolation. To do so, don't use the GAME % data (which is a concoction of major versus minor, fit versus no-fit). Let's just take the overall trick taking potential into account. To do this, we make the following assumptions (I like it when assumptions are given so all can agree or disagree). BUMRAP distribution is +1 for each card over four in a suit, +1 for a doubleton, +3 for a singleton and +5 for a void. ZAR BASIC distribution points are twice the longest suit, plus the difference between the second longest and the shortest. For ZAR points, one trick is worth FIVE POINTS; for BUMRAP 5+3+1 one trick is worth 2.5 points.

With these assumptions, we further agree that the 4333 pattern is the base hand pattern. For BUMRAP 531 this is worth 0 distributional points. For ZAR, this is worth 8. For calculation purposes, we will subtract 8 ZAR points from this pattern and all other patterns. This is to determine how many more ZAR points (the trick taking potential) that hand pattern is worth compared to the worst holding. For BUMRAP we will calculate the points as above. Further, we will divide the ZAR points (minus the base of 8) by five to determine the "trick taking potential" of the hand pattern. We will divide BUMRAP 5+3+1 by 2.5 to determine the same value. To determine the number of tricks, we then add the "trick corrected" values for ZAR points and BUMRAP 5+3+1 to 7.8 (the number of tricks taken with 4333 hands) and compare the results with the observed trick taking potential of each distribution.

The number of tricks present in each hand pattern will be the number you determined in your investigation. This is without regard to which suits have the pattern. Here is the data....

Pattern   Trick   +trk   ZAR     Ztri   Bum     Btric
4-3-3-3   7.80    0.00   8.00    0.00   0.00    0.00
4-4-3-2   8.09    0.29   10.00   0.40   1.00    0.20
5-3-3-2   8.14    0.34   11.00   0.60   2.00    0.80
5-4-2-2   8.41    0.61   12.00   0.80   3.00    1.20
6-3-2-2   8.51    0.71   13.00   1.00   4.00    1.60
4-4-4-1   8.62    0.82   12.00   0.80   3.00    1.20
5-4-3-1   8.69    0.89   13.00   1.00   4.00    1.60
6-3-3-1   8.78    0.98   14.00   1.20   5.00    2.00
7-2-2-2   8.91    1.11   14.00   1.20   6.00    2.40
6-4-2-1   9.02    1.22   15.00   1.40   6.00    2.40
5-5-2-1   9.03    1.23   14.00   1.20   6.00    2.40
7-3-2-1   9.14    1.34   16.00   1.60   6.00    2.40
5-4-4-0   9.38    1.58   14.00   1.20   6.00    2.40
5-5-3-0   9.51    1.71   15.00   1.40   7.00    2.80
6-4-3-0   9.51    1.71   16.00   1.60   7.00    2.80
8-2-2-1   9.57    1.77   17.00   1.80   9.00    3.60
6-5-1-1   9.61    1.81   16.00   1.60   9.00    3.60
7-3-3-0   9.65    1.85   17.00   1.80   8.00    3.20
7-4-1-1   9.67    1.87   17.00   1.80   9.00    3.60
8-3-1-1   9.83    2.03   18.00   2.00   10.00   4.00
6-5-2-0   9.88    2.08   17.00   1.80   9.00    3.60
7-4-2-0   9.89    2.09   18.00   2.00   9.00    3.60

A quick examination of this table shows that the ZAR Basic initial evaluation is much closer to the trick taking potential of the VAST majority of the hands than BUMRAP 531. In fact, the BUMRAP estimate of the number of tricks is closer to the actual number of tricks on only one hand pattern, 4432, where it estimates 0.2 tricks while ZAR estimates 0.4 tricks. The pattern was worth 0.29 tricks, so ZAR was off by 0.11 tricks, BUMRAP by 0.09 tricks. And of course, both methods by definition tied for the 4333 pattern. In all other cases, ZAR is closer, and usually much closer, to the trick taking potential.

For instance, if we examine the last row of data, we find that 7-4-2-0 is 18 ZAR points (7*2 + 4 - 0). When we subtract the base of 8 this is 10 ZAR points above the base distribution. 10 ZAR points is 2 tricks (10/5 = 2). So when we add 2 to the base tricks of the 4333 hand, we calculate this as 9.8 tricks (7.8 for 4333 plus two is 9.8). The actual tricks you calculated for 7420 was 9.89. So the ZAR base estimate is "off" by 0.09 tricks. BUMRAP 5+3+1, meanwhile, would add 3 points for the 7 card suit, 1 point for the doubleton and 5 points for the void. That totals 9 points. Then we divide the 9 points by 2.5 to discover that BUMRAP 5+3+1 predicts this hand is worth 3.6 tricks more than the base hand pattern. So it overestimates the "power of this hand" by slightly more than 1.5 tricks.
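The 7-4-2-0 arithmetic above can be written out as a small sketch. It follows the assumptions stated earlier in this post (7.8 tricks for 4333, five ZAR points per trick, 2.5 points per trick for the length-plus-5-3-1 count); the function names are hypothetical, chosen only for this example.

```python
BASE_TRICKS = 7.8       # observed tricks for the 4-3-3-3 base pattern
ZAR_BASE_POINTS = 8     # ZAR distribution points of 4-3-3-3


def zar_distribution(shape):
    """Twice the longest suit, plus second longest minus shortest."""
    s = sorted(shape, reverse=True)
    return 2 * s[0] + s[1] - s[3]


def length_plus_531(shape):
    """+1 per card over four, +1 doubleton, +3 singleton, +5 void."""
    short = {0: 5, 1: 3, 2: 1}
    return sum(max(n - 4, 0) + short.get(n, 0) for n in shape)


def zar_tricks(shape):
    return BASE_TRICKS + (zar_distribution(shape) - ZAR_BASE_POINTS) / 5


def length_plus_531_tricks(shape, points_per_trick=2.5):
    # points_per_trick=2.5 is the assumption made in this post
    return BASE_TRICKS + length_plus_531(shape) / points_per_trick


shape, observed = (7, 4, 2, 0), 9.89
print(round(zar_tricks(shape) - observed, 2))              # ZAR estimate minus observed
print(round(length_plus_531_tricks(shape) - observed, 2))  # 5+3+1 estimate minus observed
```

Keeping `points_per_trick` as a parameter makes it easy to rerun the comparison with a different conversion factor.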

Anyone looking at the columns of numbers will see that for "initial" evaluation of the hand patterns for trick taking potential, ZAR BASE points handily do a much better job than 5+3+1.

Now if you go back to the old standard 3+2+1 which you abandoned, it would work better for you, but you realize from your own study that 321 is not aggressive enough when it comes to bidding games and slams. So why is it that 531 is "better" when clearly it overestimates the power of the hand pattern in isolation (as shown here by your own data)?

The answer is clear, and it is why the ZAR method works better. It can be addressed by looking at the power of such odd little hand patterns as 5440. A 5440 hand pattern is worth "only" 14 ZAR (about 1.2 tricks) while BUMRAP 531 evaluates this as 6 points (or 2.4 tricks). 5440 is way up on the game taking potential list, much higher than reflected by the 1.2 tricks shown by ZAR. But this is a statistical probability thing. If you are 5440, what are the chances you have an 8 card or better fit in one of your suits? If you find a fit, ZAR will automatically add from 3 to 9 more points to this hand, let's say an average of six. So the hand jumps up by another trick to the 2.2 range. And if a superfit is found, then with this distribution it can be even more. So with ZAR, when a fit is found, hands like 5440 can upgrade dramatically. Consider 7420. This pattern is worth 18 ZAR (2 tricks). But if you open the seven card suit and partner raises, you gain 6 more points for the void and 1 point for the doubleton (although superfit points might be more valuable) and your 2 trick evaluation becomes worth more than 3. There is a similar upward evaluation if your partner bids your four card suit.

So it turns out ZAR is "as aggressive" as, and in fact more aggressive than, 531 when fits are found, but is (excuse me for this) safer (saner?) when no fit is found or a misfit exists.

So the question for the readers and the method developers is: is it better to stretch the initial evaluation by using 531 (instead of 321, which more accurately reflects the trick taking potential of these patterns) so as to "guess" at the potential if a fit is found, or is it better to start with an accurate estimate of the potential of each hand, and then "adjust" as fits are found?

A case in point is the lowly 4441 hand. This is worth only 11 ZAR points (11-8 = 3, or about half a trick), but also 3 BUMRAP points, or slightly more than a trick. But notice what happens when the fit is found (not a sure thing, but a statistical probability). Now ZAR will add 2 points for the singleton, so his method approaches 1 full trick. And if this hand opened, there is a great chance it will have an honor (A, K, Q, J, T) in the fit suit, for another point or two. So ZAR becomes the same as BUMRAP distributionally here. In fact, when you see this hand and count your 11 ZARs, you can almost mentally add another two full ZAR points, because better than 9 times out of 10 you will get those.

In fact, the "statistical" chances for a fit make it interesting to look at which hand patterns are out of line between ZAR and BUMRAP 531 in the game percentage calculations. If you are 5440, for instance, would you stick with the mere 14 ZAR count? What is the statistical probability that you will have 17 or 20 ZAR points instead of 14? The chances are fairly good. 20 ZAR points is 2.5 tricks; BUMRAP 531 seems to assume a fit, as it counts this hand as 2.4 distributional tricks from the start. So could the real difference between ZAR and BUMRAP 531 be how well, statistically, BUMRAP predicts fits? If a fit exists, BUMRAP is ok? Something to ponder.

Ben

cherdano · September 8, 2005

Ben, first about some naming confusion: BUMRAP is just the 4.5-3-1.5-0.75-0.25 hcp scale, no distribution. BUMRAP + 531 is this plus 5 for a void, 3 for a singleton, 1 for a doubleton (no points for length). If you do BUMRAP + 531 + points for length you completely overvalue distribution.

Then there is TSP, which uses IIRC the same point scale for honors as Zar, and 531+length points for distribution (plus other stuff). So what your table shows are the TSP distribution points, not BUMRAP.

Where did you take this number 2.5 from, by which you divided the TSP distribution points? It seems completely random, and one look at your table confirms that it is nonsense. I think you should use 5 if I understand Tysen's RGB post correctly. This also makes sense since, as you have pointed out a couple of times, TSP distribution points and Zar points are pretty close.
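A rough way to see the effect of the divisor is to convert the same distribution points with 2.5 and with 5 and compare against the observed extra tricks. The sketch below uses only five rows taken from Ben's table (observed "+trk" and the 5-3-1-plus-length points), so it is an illustration rather than a full analysis; the helper name is an assumption.

```python
# pattern: (observed extra tricks over 4-3-3-3, 5-3-1-plus-length points),
# a five-row subset of the table in Ben's post.
ROWS = {
    "5-3-3-2": (0.34, 2),
    "5-4-3-1": (0.89, 4),
    "5-4-4-0": (1.58, 6),
    "6-5-1-1": (1.81, 9),
    "7-4-2-0": (2.09, 9),
}


def mean_abs_error(points_per_trick):
    """Average absolute gap between predicted and observed extra tricks."""
    errors = [abs(points / points_per_trick - extra)
              for extra, points in ROWS.values()]
    return sum(errors) / len(errors)


for divisor in (2.5, 5):
    print(divisor, round(mean_abs_error(divisor), 2))
```

On this subset the divisor of 5 tracks the observed tricks far more closely than 2.5.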

Arend

mikestar · September 8, 2005

Pattern   Trick   +trk   ZAR     Ztri   TSP    Ttric   135    135tric
4-3-3-3   7.80    0.00   8.00    0.00   0.00   0.00    0.00   0.00
4-4-3-2   8.09    0.29   10.00   0.40   1.00   0.20    1.00   0.33
5-3-3-2   8.14    0.34   11.00   0.60   2.00   0.40    1.00   0.33
5-4-2-2   8.41    0.61   12.00   0.80   3.00   0.60    2.00   0.67
6-3-2-2   8.51    0.71   13.00   1.00   4.00   0.80    2.00   0.67
4-4-4-1   8.62    0.82   12.00   0.80   3.00   0.60    3.00   1.00
5-4-3-1   8.69    0.89   13.00   1.00   4.00   0.80    1.00   1.00
6-3-3-1   8.78    0.98   14.00   1.20   5.00   1.00    3.00   1.00
7-2-2-2   8.91    1.11   14.00   1.20   6.00   1.20    3.00   1.00
6-4-2-1   9.02    1.22   15.00   1.40   6.00   1.20    4.00   1.33
5-5-2-1   9.03    1.23   14.00   1.20   6.00   1.20    4.00   1.33
7-3-2-1   9.14    1.34   16.00   1.60   6.00   1.20    4.00   1.33
5-4-4-0   9.38    1.58   14.00   1.20   6.00   1.20    5.00   1.67
5-5-3-0   9.51    1.71   15.00   1.40   7.00   1.40    5.00   1.67
6-4-3-0   9.51    1.71   16.00   1.60   7.00   1.40    5.00   1.67
8-2-2-1   9.57    1.77   17.00   1.80   9.00   1.80    5.00   1.67
6-5-1-1   9.61    1.81   16.00   1.60   9.00   1.80    6.00   2.00
7-3-3-0   9.65    1.85   17.00   1.80   8.00   1.60    5.00   1.67
7-4-1-1   9.67    1.87   17.00   1.80   9.00   1.80    6.00   2.00
8-3-1-1[space][space][space][space]9.83[space][space][space][space]2.03[space][space][space][space]18.00[space][space][space][space]2.00[space][space][space]10.00[space][space][space][space]2.00[space][space][space][space]6.00[space][space][space][space]2.00
6-5-2-0[space][space][space][space]9.88[space][space][space][space]2.08[space][space][space][space]17.00[space][space][space][space]1.80[space][space][space][space]9.00[space][space][space][space]1.80[space][space][space][space]6.00[space][space][space][space]2.00
7-4-2-0[space][space][space][space]9.89[space][space][space][space]2.09[space][space][space][space]18.00[space][space][space][space]2.00[space][space][space][space]9.00[space][space][space][space]1.80[space][space][space][space]6.00[space][space][space][space]2.00

I have added entries to the table for simple 135 at 3 points per trick. The distribution points Ben is calculating under BUM RAP are actually TSP distribution points and the scale is 5 points per trick. I have recalculated the trick values accordingly. As you see, TSP tracks quite closely with ZAR.

All of these distribution counts are improvements over 123. There really is little difference among them. Where ZAR, TSP, and BUM RAP pick up their big gains over Goren is the more accurate evaluation of honors for suit contracts.

hrothgar · September 8, 2005

I am not exactly sure you want to compare ZAR initial evaluation to BUMRAP 5+3+1 using the data you created in the other thread, or BUMRAP will suffer by the comparison. But since you do, ok.. here goes.

First, the table you show here is whether GAME can be made. This depends upon statistical evaluation of the chance of a fit, whether the fits found are in the majors or not, etc.

Ben

I think that you are missing Tysen's point:

Tysen provided a set of 4 distributions, each of which is worth a total of 14 Zar points.

He also provided a set of 4 distributions which are worth 3 distributional points using a 5/3/1 scale. This relationship has nothing to do with whether or not game can be made. It is a simple function of the shape of the hand.

Having created a set of hands that are treated identically by a given evaluator, it seems reasonable to examine these to determine whether there is any kind of clustering. In order to do so, it's necessary to impose some kind of "order" on the data... Tysen has suggested ordering the hand shapes based on the double dummy analysis that he provided in another thread.

Please feel free to suggest alternative orderings if you think that this will add to the discussion.
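The clustering check described above can be sketched roughly as follows. This is only an illustration: the average-trick figures are copied from the tables in this thread, the two scoring functions follow the definitions given in the thread, and the function and variable names are made up for the example.

```python
# Average double-dummy tricks per shape, copied from the tables in this thread.
AVERAGE_TRICKS = {
    (4, 4, 4, 1): 8.62,
    (5, 4, 3, 1): 8.69,
    (6, 3, 3, 1): 8.78,
    (7, 2, 2, 2): 8.91,
    (5, 5, 2, 1): 9.03,
    (5, 4, 4, 0): 9.38,
}


def zar_distribution(shape):
    """Zar distribution points: two longest suits plus (longest - shortest)."""
    a, b, _, d = sorted(shape, reverse=True)
    return (a + b) + (a - d)


def shortness_531(shape):
    """5 for a void, 3 for a singleton, 1 for a doubleton."""
    values = {0: 5, 1: 3, 2: 1}
    return sum(values.get(length, 0) for length in shape)


def spread_within_group(evaluator, target):
    """Shapes the evaluator scores as `target`, plus their range of average tricks."""
    group = [shape for shape in AVERAGE_TRICKS if evaluator(shape) == target]
    tricks = [AVERAGE_TRICKS[shape] for shape in group]
    return group, round(max(tricks) - min(tricks), 2)


zar_group, zar_spread = spread_within_group(zar_distribution, 14)
s531_group, s531_spread = spread_within_group(shortness_531, 3)
print(zar_group, zar_spread)    # the four shapes worth 14 Zar points
print(s531_group, s531_spread)  # the four shapes worth 3 points on the 5/3/1 scale
```

The spread is simply the gap between the best and worst shape that an evaluator treats as identical. A different ordering than double-dummy tricks could be substituted by swapping out the `AVERAGE_TRICKS` figures.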

inquiry · September 8, 2005

Ben, first about some naming confusion: BUMRAP is just the 4.5-3-1.5-0.75-0.25 hcp scale, no distribution. BUMRAP + 531 is this plus 5 for void, 3 for single, 1 for doubleton (no points for lengths). If you do BUMRAP +531 + points for lengths you completely overvalue distribution.
Then there is TSP, which uses IIRC the same point scale for honors as Zar, and 531+length points for distribution (plus other stuff). So what your table shows are the TSP distribution points, not BUMRAP.

Where did you take this number 2.5 from, by which you divided the TSP distribution points? It seems completely random, and one look at your table confirms that it is nonsense. I think you should use 5 if I understand Tysen's RGB post correctly. This also makes sense since, as you have pointed out a couple of times, TSP distribution points and Zar points are pretty close.

Arend
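To pin down the naming, here is a small sketch of "BUMRAP + 531" exactly as defined in the post above: the 4.5-3-1.5-0.75-0.25 honor scale plus 5/3/1 for void/singleton/doubleton, with no length points. The hand representation (four strings, one per suit, "x" for a spot card, "T" for a ten) is just an assumption made for the example.

```python
BUMRAP = {"A": 4.5, "K": 3.0, "Q": 1.5, "J": 0.75, "T": 0.25}
SHORTNESS_531 = {0: 5, 1: 3, 2: 1}  # void, singleton, doubleton


def bumrap_531(hand):
    """Score a hand given as four suit holdings, e.g. ["AKxxx", "x", "QTx", "Jxxx"]."""
    honors = sum(BUMRAP.get(card, 0.0) for suit in hand for card in suit)
    shortness = sum(SHORTNESS_531.get(len(suit), 0) for suit in hand)
    return honors + shortness


print(bumrap_531(["AKxxx", "x", "QTx", "Jxxx"]))  # 10.0 in honors + 3 for the singleton
```

Adding length points on top of this would, per the post above, give something closer to the TSP distribution count rather than BUMRAP + 531.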

On the naming point, Tysen's posts have always confused me. His evaluation method seems slippery to me (at least when I read it), because it is always changing. I did a TSP study and he then changed the number of "points" needed for different levels. It was like 39 or something. He has changed from pushing Binky, to bumrap, to bumrap 321, to bumrap +531, and his TSP thing. He has been talking exclusively lately about bumrap 531... and richard calls for bumrap 531 as well, so bumrap I used...

This is why I spelled out what my criteria were, so let's see if I can clear up some of the problems...

Problem 1: why did I add points for both long and short suits? Because my reading of Tysen's posts suggested this is what he is calling bumrap 531. Probably you are right: BUMRAP + 5-3-1 plus points for cards over 4 in a suit is probably his TSP (or maybe his TSP version 9.32). This is why I spelled out exactly what criteria were being used to calculate the numbers. Notice the difference here: because I said how I calculated everything, you could say "WAIT.. that is not right...". This is different from how Tysen calculated ZAR points in a number of his studies. He showed the results in this thread and said "see, ZAR doesn't work" when in fact he failed to do the calculations as described by ZAR. I will repost the data with BUMRAP 5+3+1 without the addition for long suits (which would be TSP I guess)...

Problem one, where did I get 2.5 I divided by. Fair question. Tysen argues that 25 points is enough for game with goren points, 25 points divided by ten tricks is 2.5 points per trick. Tysen also argues that his problem with Zar has been that it's slightly more complicated to calculate than BUM+531 and it's not any more accurate. Plus it uses a completely different "scale" which some people don't want to use and which makes it more difficult to explain to opponents. So since Tysen's argument is that BUMRAP uses the same scale, 2.5 points per trick seems not arbitrary at all. As for ZAR, Zar states clearly in his documents that he uses 5 points per level, which explains the division by five in his case.

Here is the correct BUMRAP 531 with no correction for "long suits". As you can see, Bumrap wins on 4432 (by 0.02) and on the 5332 distribution (by 0.14). ZAR is more "accurate" on all the other distributions, and often by a large margin. This time bumrap 531 underestimates, rather than overestimating as in the last post, where I followed what I thought Tysen was doing by ADDING long points too. Since Tysen does add long points (must be with TSP as you suggest), if you use the 4.5 scale for an ACE with TSP, you should be back to the "normal" scale where you can divide by 2.5 again, but that isn't clear... But clearly ZAR is doing a better job here by these methods (pseudo TSP if that is what it was in my last post, or fair BUMRAP 5+3+1 here). Also note, when you actually start really counting distribution (who gets excited about 4432 or 5332?), BUMRAP 531 not only loses to ZAR; it also underestimates the value of the distribution for EACH of the distributions.

So, it seems to me, as it always has, that ZAR BASIC points give you a good evaluation to start off (demonstrably better here than bumrap 531, or than what I now think Tysen might call TSP). I am not sure what pluses and minuses people take with fits, but the hands where ZAR is "low" are ones where ZAR can quickly go high if a fit is found. The half a point by which ZAR is low on these hands (like 5440) is no doubt covered by the 3 points that are automatic if a fit is found.

Pattern   Trick   +TR    ZAR   Ztrick   Bum   Btrick
4-3-3-3   7.80    0.00    8    0.0      0     0.0
4-4-3-2   8.09    0.29   10    0.4      1     0.2
5-3-3-2   8.14    0.34   11    0.6      1     0.2
5-4-2-2   8.41    0.61   12    0.8      2     0.4
6-3-2-2   8.51    0.71   13    1.0      2     0.4
4-4-4-1   8.62    0.82   12    0.8      3     0.6
5-4-3-1   8.69    0.89   13    1.0      3     0.6
6-3-3-1   8.78    0.98   14    1.2      3     0.6
7-2-2-2   8.91    1.11   14    1.2      3     0.6
6-4-2-1   9.02    1.22   15    1.4      4     0.8
5-5-2-1   9.03    1.23   14    1.2      3     0.6
7-3-2-1   9.14    1.34   16    1.6      4     0.8
5-4-4-0   9.38    1.58   14    1.2      5     1.0
5-5-3-0   9.51    1.71   15    1.4      5     1.0
6-4-3-0   9.51    1.71   16    1.6      5     1.0
8-2-2-1   9.57    1.77   17    1.8      5     1.0
6-5-1-1   9.61    1.81   16    1.6      6     1.2
7-3-3-0   9.65    1.85   17    1.8      5     1.0
7-4-1-1   9.67    1.87   17    1.8      6     1.2
8-3-1-1   9.83    2.03   18    2.0      6     1.2
6-5-2-0   9.88    2.08   17    1.8      6     1.2
7-4-2-0   9.89    2.09   18    2.0      6     1.2
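For anyone who wants to check the comparison, here is a rough sketch of how the two trick columns in the table above can be reproduced and compared against the extra tricks over 4-3-3-3. The rows are a sample copied from the table. The baseline of 8 and the divisor of 5 are read off the table itself, not taken from any published method, and the names are invented for the example.

```python
# (pattern, extra tricks over 4-3-3-3, ZAR distribution points, 5-3-1 points)
ROWS = [
    ("4-4-3-2", 0.29, 10, 1),
    ("5-4-2-2", 0.61, 12, 2),
    ("6-3-3-1", 0.98, 14, 3),
    ("5-4-4-0", 1.58, 14, 5),
    ("7-4-2-0", 2.09, 18, 6),
]
ZAR_FLAT = 8          # ZAR value listed for 4-3-3-3
POINTS_PER_TRICK = 5  # divisor that reproduces both trick columns in the table


def trick_errors(rows):
    """Return (pattern, ZAR error, 5-3-1 error); negative means underestimate."""
    errors = []
    for pattern, extra_tricks, zar, bum in rows:
        zar_tricks = (zar - ZAR_FLAT) / POINTS_PER_TRICK
        bum_tricks = bum / POINTS_PER_TRICK
        errors.append((
            pattern,
            round(zar_tricks - extra_tricks, 2),
            round(bum_tricks - extra_tricks, 2),
        ))
    return errors


for row in trick_errors(ROWS):
    print(row)
```

Changing `POINTS_PER_TRICK` for one method only is an easy way to see how sensitive the comparison is to the choice of scale.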

September 8, 2005

Just curious, do any of you actually use these methods at the table, or do you just use your judgement?

Zar · September 8, 2005

>

Okay, then is this 2 or 4 superfit points if there is no opener?

xxxxx

x

xxx

xxxx

xxxxx

xx

xxx

<

You say 2 or 4 ... well, it is 3 actually. WHY? Because each partner calculates his values INDEPENDENTLY. You don’t have to exchange any information to do that. So N has 1 additional trump on top of the promised 4 for either opening or responding and S has one by himself regardless of opening or responding. N has a singleton so HIS supertrump is valued at 2 and S has a doubleton so HIS value of the supertrump is 1. And 2 + 1 = 3 easily :-)
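The independence argument can be sketched in a few lines. The values for a singleton (2) and a doubleton (1) and the promised length of 4 come from the explanation above. The value of 3 for a void, and multiplying by the number of extra trumps when there is more than one, are assumptions made for this example only.

```python
PROMISED_TRUMPS = 4  # trump length already promised by opening or responding

# Value of each extra trump by shortest side suit.
# Singleton = 2 and doubleton = 1 are from the post; void = 3 is an assumption.
SHORTNESS_VALUE = {0: 3, 1: 2, 2: 1}


def superfit_points(trump_length, shortest_side_suit):
    """Points one partner adds on his own, without knowing partner's hand."""
    extra_trumps = max(0, trump_length - PROMISED_TRUMPS)
    return extra_trumps * SHORTNESS_VALUE.get(shortest_side_suit, 0)


north = superfit_points(trump_length=5, shortest_side_suit=1)  # singleton
south = superfit_points(trump_length=5, shortest_side_suit=2)  # doubleton
print(north, south, north + south)  # 2 1 3
```

Each call uses only one hand's information, which is the whole point: the partnership total is just the sum of two independent calculations.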

>

The idea of such a measure is that you can look at your hand and compute some number. Your partner looks at his hand and computes some number.

<

Not every method provides that simplicity though. In Lawrence you have to EXCHANGE information about your shortest suits so the partners can actually be able to calculate the critical value of (13 – d1 – d2) AFTER exchanging these d1 and d2 values with special bids.

That’s why lots of experts consider Lawrence Points to be of theoretical value only. For the purposes of the different comparisons though, I ignore this issue completely – I just assume that you somehow know.

In Zar Points everything is independent so it is easy for at-the-table use.

>

By combining these two numbers together, without any other information about the hands, we can decide whether we have game, whether we have slam, and so forth with a reasonable degree of accuracy.

<

IF the method provides such independency – see above.

>

The goal in bridge is to find your best contract. The problem is, you don't have enough bidding space to exactly describe every card to partner.

<

Not literally true.

One of my partners has proven mathematically that ALL the 52 cards of the 4 players can be known 100% by the “6 Clubs” bib-pip IF all the 4 players cooperate. That means if they bid via a specially-designed bidding system that cooperates among all the 4 players towards the common goal of revealing ALL the 52 cards.

Unfortunately, in bridge the nature of the game is not cooperation between the opponents, but rather tough battle for capturing the bidding space.

>

Zar has designed a method of hand evaluation which he believes to be good.

<

We all try to believe in what we are doing :-)

>

This method counts as follows: Start with your standard 4-3-2-1 point count. Add two points for each ace and one for each king (controls). Now add the sum of the lengths of your two longest suits. Now add the difference of your longest suit length and shortest suit length. This gives you the "number" described above.

<

It is actually essential to realize that this (a – d) difference between your longest and your shortest suit does NOT come out of the blue but is rather the SUM of all 3 differences of the 4 ordered-by-length suits: (a – b) + (b – c) + (c – d). This will make it easier to understand the logic behind it.
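A quick way to convince yourself of the identity is to compute the distribution part both ways. This is a minimal sketch; the function name is made up, and the shapes are just a few examples from the tables in this thread.

```python
def zar_distribution(shape):
    """Two longest suits plus the three adjacent differences of the sorted lengths."""
    a, b, c, d = sorted(shape, reverse=True)
    two_longest = a + b
    differences = (a - b) + (b - c) + (c - d)  # the b and c terms cancel
    assert differences == a - d
    return two_longest + differences


for shape in [(4, 3, 3, 3), (5, 4, 2, 2), (6, 3, 3, 1), (5, 4, 4, 0)]:
    print(shape, zar_distribution(shape))
```

The middle terms cancel in pairs, so only the longest and the shortest suit survive in the sum.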

>

Zar claims that you should bid game if your number plus your partner's is 52.

<

Actually it’s Culbertson rather than Zar that makes that claim.

His rule states that “Two opening hands make a Game” (IF they have a fit). And an opening hand contains 26 Zar Points (now THAT’s the claim of Zar Points :-)

As I mentioned once before, Zar Points encapsulated the COMBINED REQUIREMENTS of WBF for an opening hand:

- The Rule of 18;

- The Rule of the Queen.

That’s something no other method comes close to and that’s the reason why Zar Points constitute the absolute minimum for a “Legally”- opening hand.
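Putting the counting rules quoted earlier in this post together with the 26 and 52 thresholds gives roughly the following. It is a sketch of the basic count only, with no fit or misfit adjustments. The hand format (four strings, "x" for a spot card) and all names are assumptions for the example.

```python
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}  # standard 4-3-2-1 count
CONTROLS = {"A": 2, "K": 1}             # two per ace, one per king
OPENING = 26                            # Zar Points for an opening hand
GAME = 52                               # two opening hands, given a fit


def zar_points(hand):
    """Basic Zar Points for a hand given as four suit holdings."""
    cards = "".join(hand)
    high_cards = sum(HCP.get(card, 0) for card in cards)
    controls = sum(CONTROLS.get(card, 0) for card in cards)
    a, b, _, d = sorted((len(suit) for suit in hand), reverse=True)
    return high_cards + controls + (a + b) + (a - d)


hand = ["AKxxx", "x", "QTx", "Jxxx"]
total = zar_points(hand)
print(total, total >= OPENING)  # 10 HCP + 3 controls + 13 for the 5-4-3-1 shape
print(total + total >= GAME)    # two such hands reach the game threshold
```

The example hand is chosen to land exactly on the opening threshold, so two of them add up to exactly the game threshold.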

>

(1) Zar's method of counting distribution by adding/subtracting suit lengths is actually less accurate than adding points for singletons and voids.

<

You are free to believe that :-) Some people still believe that the Earth is flat :-)

>

(2) Do the additional fit/misfit points Zar adds when distribution is known accurately reflect these sorts of features? Here less seems to be known; Tysen points out that Zar's scheme of adding fit/misfit points seem to weigh things differently depending on who opens (with two identical hands) which seems kind of odd.

<

See above comments for that INDEPENDENCE issue.

As for the accuracy, there are already SEVERAL different approaches to that and the results are pretty much in line.

>

(3) There is also some debate about what is simple to compute at the table, and what is reasonably simple to explain to opponents.

<

I think that what you presented would be a good summary of what this discussion thread is about indeed.

ZAR

P.S I’ll catch-up with the rest of the replies later today.

awm · September 8, 2005

Just curious, do any of you actually use these methods at the table, or do you just use your judgement?

Most of the time, nope, don't use them. But occasionally with a "borderline" hand I will evaluate ZAR points, or I will consider some of the other factors from these threads. Some of the things I've learned from this discussion:

(1) I've tried passing more 4333 twelves, particularly those poor in controls. Usually I have obtained good results from this when I do it.

(2) I've been opening more aggressively with length in the majors even on balanced hands. So I open a lot of 11-counts with 4-4-3-2 and not many 11-counts with 3-2-4-4. This has also worked out pretty well.

I should note that a lot of the discussion here actually justifies methods that I've been playing since before I ever read about ZAR points etc.

hrothgar · September 8, 2005

Just curious, do any of you actually use these methods at the table, or do you just use your judgement?

In all honesty, I make the most use of evaluators like this when I am developing scripts to simulate bidding systems. The metric that I use most often for scripting is K+R.

When I am playing at the table or writing system notes I prefer to rely on "judgement". Case in point: The most recent version of my MOSCITO discusses the minimum strength necessary for a "constructive" opening. After going back and forth for a while, I felt that the most accurate way to go was to provide a set of 5 minimum strength unbalanced hands and 5 minimum strength balanced hands that looked to be right on the "edge" of a constructive opening where I'd prefer not to pass...

With this said and done, it is necessary to have some way to communicate judgement to third parties, which is where "simple" metrics like BUMRAP + 5/3/1 shine.

tysen2k · September 8, 2005

Problem one, where did I get 2.5 I divided by. Fair question. Tysen argues that 25 points is enough for game with goren points, 25 points divided by ten tricks is 2.5 points per trick.

Okay, I can see where you got this, but there is no need for 0 points to equal 0 tricks. Using 2.5 points per trick would also predict that you would need only 30 points for slam and 32.5 for a grand. That's obviously off. The real value should be slightly bigger than 3 I'd say. And the scale for TSP (which uses the length points too) is 5 points per trick.
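The arithmetic behind this can be sketched as a straight line anchored at the game level instead of at zero. The 25-points-for-10-tricks anchor is the Goren figure quoted above. Using exactly 3 for "slightly bigger than 3" is a simplification, and the function names are invented for the example.

```python
def points_needed(tricks, points_per_trick, game_points=25, game_tricks=10):
    """Points needed for a trick target on a line anchored at game, not at zero."""
    return game_points + (tricks - game_tricks) * points_per_trick


def tricks_at_zero_points(points_per_trick, game_points=25, game_tricks=10):
    """Where the same line crosses zero points."""
    return game_tricks - game_points / points_per_trick


print(points_needed(12, 2.5), points_needed(13, 2.5))  # 30.0 32.5
print(points_needed(12, 3), points_needed(13, 3))      # 31 34
print(round(tricks_at_zero_points(3), 2))              # 1.67
```

With 2.5 points per trick the line happens to pass through zero, which is where the 30 and 32.5 figures come from. With a steeper slope, zero points no longer corresponds to zero tricks.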

Also, Ben, you've recreated a table I used way back on page 5 of this super-long thread. Same metrics using tricks.

inquiry · September 8, 2005

Classifying a distribution by ZAR points as "14" and then classifying by the % chance that random hands make game is an interesting exercise, but not for the reason Tysen supposes. Let's take the four ZAR 14 point distributions that Tysen quoted as an example.

6331

7222

5521

5440

Each of these represents 14 ZAR distributional points. But the potential of these hands is quite markedly different using ZAR points. With 6331, your potential upside once a fit is found is fairly low. If partner fits your six card suit, you might get +2 more ZARs for 16 with a fit. If either of your three card suits fits a long suit of partner's, you will get +2 as well. It is fairly likely you will find a fit of sorts, so this is 14 with a fair chance of 16.

Take the second one. Here partner needs just one card in your seven card suit to have an eight card fit, and there is a good chance he will have two. So you can in theory take a point for each doubleton in this case. So this is 14 with a reasonable chance to go up to 17.

Take the third one. Now, if you find a fit in either of the 5-5 suits, it might be a superfit (if it is partner's suit) or a regular fit. For your singleton you get two points, for your doubleton you get one point, but with five in the fit suit, there is a chance you could get two points for the doubleton and four points for the singleton. So this hand is 14 and can be as high as 20. Note also that if you DON'T fit either of the five-card suits, this can become a misfit, and the 14 will actually plummet in value (say partner is 5521 but the suits don't match). Now instead of being worth 14, this is worth only 5 pts or so. The odds obviously favor finding an acceptable fit, but the range here is (5)14-20.

The last one is 14, but you have wonderful chances of finding a fit. Now the void can be worth from 3 points to 9 points. There is much less chance you will have a total misfit, but if you do, it is a whopper, and can drop your DP to ZERO. The rareness of the huge misfit makes this (0)14-21.

At this point someone good at distributions and stats could, if they are able, calculate the probabilities of fit, no fit, and superfit facing each of these and statistically figure out how much better the last hand (14 points with a small chance of a downside misfit and huge pluses when a fit is found) is than the other hands. But with ZAR, these 14 points are not created equal. There is an upside to each distribution as given above, an upside that is obvious both in their "trick taking potential" when opened and in their potential to make game.

HAND   +TR      Game%
6331    0.98     34%
7222    1.11     34%
5521    1.23     38%
5440    1.58     45%

To use ZAR (or any evaluation method) without "re-evaluating" as the auction develops is simply not a good idea. My argument against Tysen's approach was and always has been that he didn't do the re-evaluation (Fit and Misfit). As you can see from this illustration, the "potential" of these hands is quite different, and that potential is realized if a fit is found. And finding a fit increases the ZAR count, which not surprisingly increases the tricks and games.

The mistake Tysen, and many people, make in evaluating ZAR is not realizing this potential and not investigating the accuracy of the adjustments. They look and say "14", next please. You can do the same with other ZAR hands.. which hand would you think has more "potential" to find a fit and, if so, be upgraded among these 15 point ZAR hands? 6-4-2-1 or 5-5-3-0?

How about among these 12 point ZAR hands? 4-4-4-1 or 5-4-2-2?

Zar says somewhere something like "it is the difference that creates the potential". Bridge is a game of statistics and chance.. like what are the odds you find a missing honor when you finesse, and in these cases, what are the odds you will find a fit with these patterns, and finding one, how much does your hand up-evaluate. I never had a problem determining if a hand was an opening bid.. I did that very well before I ever heard of ZAR. Where I hope ZAR helps is in how much to re-evaluate the hand as the bidding continues. Any method that says, this is what a hand is worth before the bidding starts and that is it, will not work for me.

My assumption is that TSP takes a randomization approach to the evaluation of hands assuming a fit will be found (more or less). Thus it was better than straight ZAR because Tysen added "fit points" (counting long and short suits from the beginning). Zar counts long suits (531 does it by short suits), but upon finding a fit, ZAR then adds points for short suits (albeit with some fancy count-the-short-suit-points-if...xyz rules). Now ZAR has further refined this calculation by a method that is often not practical (use the unrefined one when not practical). This is what needs to be evaluated: the static evaluation of bum531, of tsp, of goren, and the fluid one of ZAR. Maybe some zar-type ideas can be added to some other methods making them less static, maybe we can find more features that are worth pluses and minuses. That seems to be a hard concept for some to grasp....

tysen2k · September 8, 2005

Just curious, do any of you actually use these methods at the table, or do you just use your judgement?

I don't really either (is that sacrilegious?) :rolleyes:

Believe it or not, I'm not really interested in finding the perfect evaluator. What I am interested in is how evaluations change as you gather information from the bidding. So I need an accurate evaluator as a beginning so that I can measure changes.

I've found lots of interesting stuff in my studies and I've posted some of it here and on RGB. Stuff like:

How much better is Qxx in partner's suit than Qxx in a side suit? Does Axx improve by the same amount or a different one?
How much should we adjust the value of Kxx if RHO opens the suit? If LHO opens it?
How the relative weight of high cards to shape changes. If partner is balanced, high cards gain more weight and shape loses importance. If partner is unbalanced, high cards lose weight and shape gains. The same is true if the opps are balanced/unbalanced, and to a different extent.

Tysen

tysen2k · September 8, 2005

My assumption is that TSP takes a randomization approach to the evaluation of hands assuming a fit will be found (more or less).

No, it just evaluates the average value opposite all the potential hands partner could have.

Here's the problem I have with your argument that my comparison doesn't take into account the potential of adjusting once a fit is found/not found. You can use the same adjusting principle with any evaluation scheme, not just Zar. No matter what system you use, you can adjust up and down the same way in order to improve it once you have more information.

The point of judging the initial values is to get as close as you can to begin with so that the amount you have to adjust is kept to a minimum. Sometimes the opps interfere and you can't exchange all the info you'd like. So try to make the best guess you can now instead of relying on info you might never get.

Tysen

inquiry · September 8, 2005

Problem one, where did I get 2.5 I divided by. Fair question. Tysen argues that 25 points is enough for game with goren points, 25 points divided by ten tricks is 2.5 points per trick.
Okay, I can see where you got this, but there is no need for 0 points to equal 0 tricks. Using 2.5 points per trick would also predict that you would need only 30 points for slam and 32.5 for a grand. That's obviously off. The real value should be slightly bigger than 3 I'd say. And the scale for TSP (which uses the length points too) is 5 points per trick.

Also, Ben, you've recreated a table I used way back on page 5 of this super-long thread. Same metrics using tricks.

Well.. you took ZAR to task not once, but several times for using 26 points for game. You suggested he use 25 or maybe even 24 points for game. Now you are suggesting 3 points per trick, that would mean for 10 tricks 30 points (4 more than ZAR used and you complained about).

Now tell me, are you really being serious? BTW, BUMrap 531 already suffers badly using 2.5 points per trick by underestimating the distributional strength; if I divide by 3, the number of tricks taken would plummet, so this would clearly be even worse for that metric.

hrothgar · September 8, 2005

The mistake Tysen, and many people, make in evaluating ZAR is not realizing this potential and not investigating the accuracy of the adjustments. They look and say "14", next please. You can do the same with other ZAR hands.. which hand would you think has more "potential" to find a fit and, if so, be upgraded among these 15 point ZAR hands? 6-4-2-1 or 5-5-3-0?

Maybe some zar-type ideas can be added to some other methods making them less static, maybe we can find more features that are worth pluses and minuses. That seems to be a hard concept for some to grasp....

Ben, the problem is not that your "point" is in any way difficult to grasp. Unfortunately, you're wrong and you have a really annoying blind spot about it.

Fit points, super fit points and the like are all well and good. However, NONE of this can be applied before the partnership learns whether or not they have a fit. Last I checked, Zar points were used to determine whether or not a pair should open...

My main point (and I suspect that Tysen would agree) is that we should start by using as accurate an evaluator as possible. Once this has been identified you can start adding complexity and evaluating different types of plastic evaluation.

inquiry · September 8, 2005

My assumption is that TSP takes a randomization approach to the evaluation of hands assuming a fit will be found (more or less).
No, it just evaluates the average value opposite all the potential hands partner could have.

Here's the problem I have with your argument that my comparison doesn't take into account the potential of adjusting once a fit is found/not found. You can use the same adjusting principle with any evaluation scheme, not just Zar. No matter what system you use, you can adjust up and down the same way in order to improve it once you have more information.

The point of judging the initial values is to get as close as you can to begin with so that the amount you have to adjust is kept to a minimum. Sometimes the opps interfere and you can't exchange all the info you'd like. So try to make the best guess you can now instead of relying on info you might never get.

Tysen

Oh, I completely agree that hands should be adjusted. What I like about ZAR is that he tries to stick a quantification on his. We all do adjustments. We discount stiff honors somewhat, we devalue honors in a suit bid behind us, we get excited with big fits and "bid one more for the road". What justin calls "judgement". Judgement comes from experience. But sometimes there is an unusual collection of several small pluses, or some pluses and some minuses on a hand. How do you judge which is better, or by how much the pluses push you forward? We all know a collection of Tens will help with opening 1NT maybe shaded a point. We all know we like our hcp in our long suits.

The question is, can someone tell how much extra these small pluses are worth? I am fully aware that applying judgement (or ZAR's or anyone else's) correction factors can be done on any hand. Here is the problem yet again.. ZAR tells us how to do this, yet when running simulations, you don't, and then you tell us how much better some other method is. Apply the rules fairly (anyone) and I will believe the data (as long as you make it publicly available so I can spot test it).

Can you apply ZAR correction factors to your methods? Sure. Might Bumrap 531 with zar corrections beat ZAR base points with zar corrections? Maybe. But someone, anyone (this includes ZAR), do the darn test with the corrections applied. I lack the computer skills to do this... but it must be doable (Zar has never subtracted misfit points in his large analysis, for example).

Ben

Al_U_Card · September 8, 2005

Just curious, do any of you actually use these methods at the table, or do you just use your judgement?

If they do, you have to wonder about slow play penalties.....lol
