Zar vs. TSP

tysen2k · June 9, 2004

Okay, prepare yourself for another massive data dump. The table below takes the 13,094 hands posted on my Yahoo group and compares the Zar point total with the number of tricks actually taken. I’ve also been able to isolate the situations where the points predict slam but the partnership is missing some top tricks, so that Blackwood, etc. would enable you to stay out of slam.

              ACTUAL TRICKS TAKEN
Zar+fit       9   10   11   12  13   Score  Ave
40            1    0    0    0   0     140  140
41            6    0    0    0   0     840  140
42           10    0    0    0   0    1400  140
43           22    0    0    0   0    3080  140
44           36    1    0    0   0    5210  141
45           87    5    0    0   0   13030  142
46          146   12    0    0   0   22480  142
47          195   29    1    0   0   32430  144
48          285   44    5    0   0   48380  145
49          387   91    7    0   0   71050  146
50          473  134   12    0   0   91400  148
51          478  232   34    2   0  113620  152
52          495  297   43    5   0  121740  145
53          526  341   89    5   1  159880  166
54          419  409  111   18   0  209420  219
55          381  404  140   14   0  220350  235
56          286  418  192   22   1  258730  282
57          235  408  255   45   1  296470  314
58          149  312  255   66   2  271040  346
59          105  275  282   86   6  281490  373
60           67  190  256   83   3  233020  389
61           48  199  226  107  15  241890  407
             
62+ no cntl  58  208  335    1*  0  123430  205
             
62            4   46  116  111  11   98290  341
63            2   23   99   98  11   94300  405
64            5   19   65   96  18  100810  497
65            0    8   64   97  18  107640  576
66            0    9   28   79  16   89480  678
              
67+ no cntl   8   20  118  159   0  139920  459
             
67            0    2    1   28  26   36560  641
68            0    1    5   10  17   23170  702
69            0    1    3    6  26   37560  1043
70            0    0    2   12  14   19940  712
71            0    0    2    6  11   15710  827
72            0    0    1    3  13   19180  1128
73            0    0    1    1   6    8710  1089
74            0    0    0    1   3    4480  1120
75            0    0    0    0   6    9060  1510
76+           0    0    0    2   4    5940  990
             
                            Total  3631270  275

These hands include only those that can take at least 9 tricks; there would be many more hands taking 8 or fewer if those were allowed. SCORE is the sum of the points that would be won on these hands if bid to the level predicted by the points (not vul). AVE is the average score per hand. For example, the 52-point Zar hands score 495*(-50) + 297*420 + 43*450 + 5*480 = 121740 points, an average of 145 points per hand. Note that if I had allowed hands that can take only 7 or 8 tricks, this score would be even lower, since it would include hands that go down multiple tricks. I was also generous in allowing the 57-61 point hands to bid only 4M and never 5M. The hands separated out under “62+ no cntl” are those with 2+ top tricks missing, where I assume the pair could stop at 5M. There is one hand in the bunch (marked *) that has 2 top tricks missing, but the slam still makes since the defense can’t cash them due to blockage.
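The row arithmetic can be checked with a short Python sketch. The function names are made up for illustration, and the penalties for two and three down (-300 and -500) are an assumption: the post only states -50 for one down, but those values happen to reproduce the published totals for the rows tried here.

```python
def major_score(level, tricks):
    """Not-vul score for a major-suit contract at `level` that takes `tricks`."""
    need = level + 6
    if tricks < need:
        # -50 is stated in the post; -300 and -500 are inferred, not stated.
        return {1: -50, 2: -300, 3: -500}[need - tricks]
    trick_points = 30 * level
    bonus = 300 if trick_points >= 100 else 50   # game or partscore bonus
    if level == 6:
        bonus += 500                              # small slam bonus
    elif level == 7:
        bonus += 1000                             # grand slam bonus
    return trick_points + bonus + 30 * (tricks - need)


def row_total(level, counts):
    """`counts` = number of hands taking 9, 10, 11, 12 and 13 tricks."""
    return sum(n * major_score(level, t)
               for t, n in zip(range(9, 14), counts) if n)


zar_52 = row_total(4, [495, 297, 43, 5, 0])      # 52 Zar row, bid to 4M
print(zar_52, round(zar_52 / 840))               # 121740 145
```

The same function gives 113620 for the 51 row scored at 3M and 98290 for the 62 row scored at 6M, matching the table.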

Now let’s look at the numbers for TSP:

              ACTUAL TRICKS TAKEN
TSP+fit       9   10   11   12  13   Score  Ave
26            3    0    0    0   0     420  140
27            8    0    0    0   0    1120  140
28           18    0    0    0   0    2520  140
29           39    1    0    0   0    5630  141
30           81    1    0    0   0   11510  140
31          142    8    0    0   0   21240  142
32          226   22    1    0   0   35580  143
33          356   50    2    0   0   58740  144
34          401   82    8    0   0   71680  146
35          502  129    9    0   0   94010  147
36          498  227   22    0   0  112710  151
37          543  308   35    3   0  136070  153
38          523  398   75    5   0  157030  157
39          421  417  104   11   0  206170  216
40          375  460  150   12   0  247710  248
41          240  420  224   19   1  274830  304
42          191  383  229   40   2  274580  325
43          118  340  278   70   1  296110  367
44           81  280  290   70   4  279690  386
45           57  195  275  107   3  255690  401
46           29  142  228   94   5  208460  419
47           22   99  212  105  13  192910  428
48           14   68  168  107   9  159410  436
               
49+ no cntl  13   51  173    0   0   71400  301
               
49            0   12   61  110  28  129430  613
50            3   11   50   99  15  104870  589
51            1   10   34   78  23   94470  647
52            2    1   20   71  27   94550  781
53            0    2   10   54  18   70000  833
               
54+ no cntl   0    5   30   78   0   73440  650
               
54            0    0    2   12  16   22960  765
55            0    0    1    7  20   29550  1055
56            0    0    1    4  19   28190  1175
57            0    1    0    2   7    9970  997
58            0    0    0    1   7   10520  1315
59            0    0    0    2   3    4430  886
60            0    0    1    0   3    4230  1058
61            0    0    0    2   2    2920  730
62+           0    0    0    0   3    4530  1510
               
                            Total  3859280  294

TSP scores about 19 points per hand more than Zar. Most of these points come from Zar overbidding on many of the games and slams, even when all the controls are there. I limited the hands (by request) to 9+ tricks because that makes Zar look better. If we lower that requirement Zar looks even worse when it goes down multiple tricks.

Tysen

tysen2k · June 9, 2004

I'm just taking a closer look at the data...

Looks like others were right when they guessed TSP might be too conservative. Bumping the requirement for game down to 38 TSP gives some more points, bringing the average points up to 21 per hand better than Zar. There aren't enough hands to say what the slam ranges should be; I'll have to look at the larger database.
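As a rough check on the effect of moving the game threshold, here is a sketch that re-scores only the 38 TSP row, once at 3M and once at 4M. The counts come from the table above; the per-result scores are the standard not-vul values, and the variable names are just for illustration.

```python
counts = [523, 398, 75, 5]         # 38 TSP hands taking 9, 10, 11, 12 tricks
score_3m = [140, 170, 200, 230]    # not vul, contract 3M
score_4m = [-50, 420, 450, 480]    # not vul, contract 4M

total_3m = sum(n * s for n, s in zip(counts, score_3m))
total_4m = sum(n * s for n, s in zip(counts, score_4m))
print(total_3m, total_4m, total_4m - total_3m)   # 157030 177160 20130
```

The 3M figure matches the table's score for that row, and bidding game on the same hands comes out 20130 total points higher.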

MickyB · June 9, 2004

This is great, Tysen, just the sort of data I've been waiting to see. With regard to required point counts for game and slam, double dummy analysis cannot be relied on too heavily anyway.

inquiry · June 10, 2004

Tysen, this is nice data. I wonder if you could edit your data on the yahoo group site to reflect the ZAR fit and TSP fit points you used.

I will point out a flaw in your data, however. You assumed all the contracts were not vul; of course, aggressive bidding pays off most at IMPs when vul, not when not vul. If you change the assumption so that they were all vul, ZAR points wins. If you change the assumption so that half the contracts are "vul" and the other half not vul, the two systems are in a virtual tie. To do this, for making game I used 520 pts rather than 420 (the average of 50% vul, 50% not vul). And for down one, rather than minus 50 you can use minus 75, etc.
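A minimal sketch of that blended metric, using the standard scores for 4M making exactly and for one down undoubled (the dictionary names are only for illustration):

```python
NOT_VUL = {"game_made": 420, "down_one": -50}
VUL = {"game_made": 620, "down_one": -100}

# Average each result over a 50/50 mix of vulnerable and not vulnerable.
blended = {k: (NOT_VUL[k] + VUL[k]) / 2 for k in NOT_VUL}
print(blended)   # {'game_made': 520.0, 'down_one': -75.0}
```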

If you take the data you posted (which I did) and apply that metric, the difference between TSP + fit and Zar + fit is 1.45 points per board. That is not IMPs, that is points per board. Now, TSP was still ahead, and maybe deservedly so. But I would like to examine the hands and the way you calculated ZAR and TSP points... and updating your file is the way to allow us to do that. For example, I would like to look at, say, all the 52 or 53 ZAR point hands to see why your data is in such disagreement with ZAR's.

Ben

MickyB · June 10, 2004

Interesting, Ben. What happens if the target point count for both evaluators is optimised based on this data?

inquiry · June 10, 2004

I am not sure. The first problem is to see if Tysen applied Zar + fit points correctly. I am willing to make the assumption that he handled his TSP points correctly. So I want to spot check some hands to see how Tysen handled the fit points. If after checking 20 or 40 hands, the fit points are calculated correctly, I am prepared to accept the data as presented, and then try to figure out the ramifications. Oh, and for my recalculation of the data, I used the older 39 TSP for game, not the 38.

Ben

tysen2k · June 10, 2004

I just used the simple fit method that Zar uses on his tests. +3 for each extra trump. Maybe extreme, but I wanted to use the same thing that Zar uses.

With regard to the vul/not thing. Yes I agree. The best thing to do would actually be to have different point requirements for bidding game depending on vulnerability (say 52 when vul, but 53 when not). However, it also depends on the scoring system since it doesn't pay to be as aggressive at matchpoints. I used total points here since it's something that everyone can actually see. IMPs and MP involve comparisons and can't easily be posted.
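One way to see why the game requirement might differ by vulnerability at total points is a break-even sketch. It assumes the hand takes exactly 9 or 10 tricks, nobody doubles, and the choice is only between 3M and 4M; the function name and parameters are hypothetical.

```python
def break_even(make_game, down_one, part_ten=170, part_nine=140):
    """Chance of ten tricks at which bidding 4M scores the same as 3M."""
    gain = make_game - part_ten     # extra points when ten tricks are there
    loss = part_nine - down_one     # points given up when only nine are there
    return loss / (gain + loss)


print(round(break_even(420, -50), 3))    # not vul: 0.432
print(round(break_even(620, -100), 3))   # vul: 0.348
```

Under these assumptions a vulnerable game needs to make noticeably less often to be worth bidding, which points the same way as a lower threshold when vul.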

I pointed out on my original TSP post that the requirements for game, slam, etc. should be modified depending on the vul & scoring system.

inquiry · June 10, 2004

Well, your Zar Fit points are wrong then. Zar only counts plus three points for fit if the hand that is doing the counting also has a VOID. You get plus 2 points if the hand doing the counting has a singleton, and plus 1 point if it has a doubleton. This is explained clearly in ZAR's document. So, for example, if your partner opens 1S and you hold the following hand patterns...

xxxx xxx xxx xxx

You get no distributional fit points for the "ninth" spade

xxxx xx xxxx xxx

You get one point

xxxxx xx xxx xxx

You get two points (one for the fourth and one for the fifth spade, because you have a doubleton).

xxxx x xxxx xxxx

You get two zar fit points.

It sounds like the way you did it was to give 3 points for example 1 (instead of 0), 3 for example 2 (instead of 1), 6 for example 3 (instead of 2), and 3 for the last one (instead of 2). If so, you are overestimating a fairly large number of ZAR hands. Essentially you are only RIGHT when the hand has a void.
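The two ways of counting can be put side by side in a small sketch. It assumes partner's 1S shows five spades and that an "extra" trump is any card beyond an 8-card fit; the function names are illustrative, not from Zar's document.

```python
# Points per extra trump when the shortest side suit is a void/singleton/doubleton.
RUFFING_VALUE = {0: 3, 1: 2, 2: 1}


def extra_trumps(shape, partner_trumps=5):
    # shape = (trump length, then the three side-suit lengths)
    return max(0, shape[0] + partner_trumps - 8)


def fit_points_by_shortness(shape, partner_trumps=5):
    shortest = min(shape[1:])
    return extra_trumps(shape, partner_trumps) * RUFFING_VALUE.get(shortest, 0)


def fit_points_simplified(shape, partner_trumps=5):
    return 3 * extra_trumps(shape, partner_trumps)


for shape in [(4, 3, 3, 3), (4, 2, 4, 3), (5, 2, 3, 3), (4, 1, 4, 4)]:
    print(shape, fit_points_by_shortness(shape), fit_points_simplified(shape))
```

For the four example patterns this gives 0, 1, 2 and 2 by shortness, against 3, 3, 6 and 3 with a flat +3 per extra trump.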

Ben

tysen2k · June 10, 2004

I know this is the way that Zar fit points are usually counted. I have read the articles. But in all of Zar's computerized tests he uses the "simplified" Zar fit count of a straight +3 per trump. So I'm just doing it the same way he does so we can compare. There is also no bonus for honors in partner's suit, etc.

mikestar · June 10, 2004

Tysen is correct. In all of Zar's articles "Zar+Fit" is based on +3 per extra trump. The presumably more accurate count where the addition varies with the length of the short suit is consistently called "Zar Ruffing Power" and he has not used it in his computer studies.

inquiry · June 11, 2004

Well, Tysen may be right, but that matters not a whit to me. When I do Zar points, I count fit honors and fit points as he suggested in his article, and misfit points the way Eric and he agreed here. I add to that the two quick trick check (absolutely necessary with Zar points, which can soar to unbelievable heights). This is the way I check it, and I do the same with TSP. Comparisons that test x versus y are valid only if you use the system being advocated.

What good is it to simulate something that is not what is being advocated for use? This is directed to both Tysen and Zar. Tysen has done a lot of good work here in eliminating the hands with two quick losers for slam and one quick loser for grand slam. This is more than ZAR did, I think. Just a little more work, and maybe he can have a test bed that really gets somewhere. I can tell from a quick screening of the hands with 9 tricks and 52 and 53 Zar points counted his way (3 pts for each extra trump) that many of these come in at well under 52. Also, a lot of these are misfit hands, where no fit exists but high ZAR totals show up. But if you start subtracting for a less than adequate trump fit, the totals drop just as fast.

I guess my liking ZAR is because I use real world hands, and I apply the count the way I think it is advocated. Testing something else with Zar or TSP seems not a realistic approach. But as I have said many times, I am not a programmer... so maybe it is just too hard to do it right.

Ben

tysen2k · June 11, 2004

I did subtract 3 points for the Zar hands if there was no 8-card fit. I'm not sure if Zar did this for his studies or not.

inquiry · June 11, 2004

That's another good start. Could you take down the old evaluator comparison zip file (which lacks, for instance, TSP), and repost the data with the hands the current Zar+fit and TSP+fit are based upon?

Ben

Zar · June 11, 2004

Sorry I was away for awhile ...

I have no idea how we ended up calling Richard Pavlicek's points TSN or TSP or ...

Richard adds 1 pt for every suit length above 4 and the 1-3-5 points for shortness, which is exactly what we discuss here (besides the HP + CTRL or 6-4-2-1 for honors).

You can download the Richard Pavlicek article from his website www.rpbridge.com and the article has some coded name of “7Z70.pdf”.

The problem with the Pavlicek points is that it doesn't reflect the impotence of the flat 4333 pattern. Even Goren subtracts 1 pt for the 4333 shape, making the difference between the 4432 pattern and the 4333 be 2 points, while the difference between 5332 and 4432 is “virtually” non existent, each valued at 1 pt (no pattern actually gets 0 points and everything after that is linear).

If you say that the great Goren lived in different times and you happen to rely on the so called trick-taking potential, you'll end up in the same mud-spot:

4333 - 7.76

4432 + 0.31

5332 + 0.06

meaning that the first difference is 0.31 while the second difference is 0.06, highlighting a similar "drop" in value for the 4333.

In Zar Points the flat pattern is 2 points below the linear “pack” with 1-point difference which follows after that.
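The three patterns can be compared under both shape counts with a short sketch. The Zar distribution formula used here (two longest suits, plus longest minus shortest) is the standard Zar Points definition and is not spelled out in this thread. The second function follows the description above of 1 pt per card over four plus 1-3-5 for shortness, reading 1-3-5 as doubleton, singleton, void, which is an assumption.

```python
def zar_distribution(shape):
    a, b, _, d = sorted(shape, reverse=True)
    return (a + b) + (a - d)


def length_plus_shortness(shape):
    length = sum(max(0, n - 4) for n in shape)                    # 1 per card over four
    shortness = sum({2: 1, 1: 3, 0: 5}.get(n, 0) for n in shape)  # assumed 1-3-5 reading
    return length + shortness


for shape in [(4, 3, 3, 3), (4, 4, 3, 2), (5, 3, 3, 2)]:
    print(shape, zar_distribution(shape), length_plus_shortness(shape))
```

This prints 8, 10, 11 for the Zar distribution count (a 2-point step, then a 1-point step) and 0, 1, 2 for the length-plus-shortness count (evenly spaced).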

Make it a great day:

ZAR

tysen2k · June 11, 2004

I had no idea this was the case. I derived mine on my own.

Zar, do you have a direct link to that article? Pavlicek's website is www.rpbridge.net, (not .com) and I've searched the whole site, but I can't find it. There is a "pavlicek count" mentioned on his site, but that uses shortness points only, not length.

tysen2k · June 11, 2004

inquiry wrote: "That's another good start. Could you take down the old evaluator comparison zip file (which lacks, for instance, TSP), and repost the data with the hands the current Zar+fit and TSP+fit are based upon?"

I will, but not for a few days at least. I just inherited a new project, so I'm pretty busy. Work has to come first sometimes... B)

inquiry · June 11, 2004

http://www.rpbridge.net/p/7z70.pdf ...

Hey, I am not computer literate, but found it from filename with google, no problems. 3 page document.

Ben

MickyB · June 12, 2004

Richard's count is the same as Tysen's shape count, except hands with two singletons/voids count a point less.

tysen2k · June 14, 2004

inquiry wrote: http://www.rpbridge.net/p/7z70.pdf ...

Ah, I have seen this before. But knowing Richard, this is just what the article says it is: a way to quantify freakness. It's not meant to be a point count. It was never meant to be added to HCP or any derivative of HCP to approximate hand strength.
