Evaluating ZAR points

hotShot · February 19, 2005

Hi!

I'm just programming a simulation to test ZAR points. I'm using a double-dummy solver to check the result. If you think I should modify something, speak up now.

Here is what I'll do:

Dealer is always N. If the North hand is a ZAR opening and South holds a response, I'll analyse the hand; otherwise it will be counted as a dropout.

To analyse it, I look for the longest fit available. If it is 8+ cards, the hands will be reevaluated using the fit information. The sum of both players' points is used to find the proper level. Then the double-dummy solver will try to make the contract. If the contract is made, I increment the good results; otherwise I increment the bad ones.

Sections are levels 2,3,4 and 5.

There is an extra section for 6/7, which is satisfied when 11 tricks can be made. 11 is enough, because I expect any decent partnership to check for missing keycards before bidding a slam. To do this, the 5 level must be safe.

If the best fit has only 7 cards, the contract will be:

2 some suit, if the level is 2.

NT if the level is 3+. Since 3NT needs the same points as game in a major, I put (n NT) at the (n+1) level. So if the level says 3, the contract will be 2NT.

Every contract is played by south!
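The selection rules above could be sketched roughly as follows. This is only an illustration, not hotShot's actual code: the function names and the `(level, strain)` return shape are made up, the 5-point Zar bands (42-46 for level 2 up to 67+ for level 7) are the ones reported with the 2000-board results later in the thread, and treating anything below 42 as level 1 is an assumption, since no range is given for that level.

```python
def zar_level(combined_zar: int) -> int:
    """Map combined Zar points to a level using 5-point bands.

    Assumption: 42-46 -> 2, 47-51 -> 3, ..., 67+ -> 7, and anything
    below 42 is treated as level 1.
    """
    if combined_zar < 42:
        return 1
    return min(7, 2 + (combined_zar - 42) // 5)


def choose_contract(combined_zar: int, best_fit_len: int) -> tuple[int, str]:
    """Return (level, strain), where strain is "suit" or "NT"."""
    level = zar_level(combined_zar)
    if best_fit_len >= 8 or level <= 2:
        return level, "suit"
    # Only a 7-card fit at level 3+: play no-trump one level lower.
    return level - 1, "NT"


def is_good(level: int, dd_tricks: int) -> bool:
    """A contract counts as good if the double-dummy solver makes it.

    Levels 6 and 7 count as good once 11 tricks are available, because
    the partnership is assumed to check keycards at the 5 level.
    """
    needed = min(level, 5) + 6
    return dd_tricks >= needed
```

For example, `choose_contract(48, 7)` gives `(2, "NT")`, matching the "level says 3, contract is 2NT" rule, while `choose_contract(48, 8)` gives `(3, "suit")`.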

(1) Some strange settings are due to the fact that I don't want to write a bidding engine. People have invested much more brain and time in that than I'm willing to put into this project. And I'm not convinced by most of the results.

(2) If I implemented a bidding engine, the results would depend on the given bidding system.

(3) EW cards or possible bids are not taken into account.

hrothgar · February 19, 2005

Hi HotShot...

If you're willing to go to this much trouble, then I strongly suggest that you look at some of the earlier posts in which Tysen and I suggested various methodologies for testing the accuracy of different hand evaluation metrics. This is a complex subject, and if you design an inappropriate test then there is a very real possibility that you'll waste a lot of time...

As I noted in the past, I'd recommend an approach like the following:

1. Generate 1000 hands using any one of a variety of Dealer programs

2. Define a set of 13 buckets. Each bucket defines the maximum number of tricks that can be taken on a double dummy basis.

3. Sort the hands into buckets

4. For each hand in a given bucket, Let X = the sum of the Zar points for Declarer and Dummy

5. Calculate the Mean and Standard Deviation

The relative accuracy of different metrics can be determined from these two statistics, so a "complete" analysis would need to compare Zar Points to an alternative schema like Bum Rap.

If you prefer, you could invert this entire procedure. Your initial buckets would measure the combined Zar Points of the two hands. You could then calculate the average number of tricks taken for two hands with X combined Zar points.
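Both directions of this bucketing can be sketched in a few lines. This is a minimal illustration only: the function names and the `(tricks, points)` pair layout are assumptions, the sample data is made up, and the population standard deviation (`pstdev`) is used so that a bucket with a single deal does not raise an error.

```python
from collections import defaultdict
from statistics import mean, pstdev


def stats_by_tricks(deals):
    """Group deals by double-dummy tricks; return {tricks: (mean, stdev)} of points."""
    buckets = defaultdict(list)
    for tricks, points in deals:
        buckets[tricks].append(points)
    return {t: (mean(p), pstdev(p)) for t, p in sorted(buckets.items())}


def mean_tricks_by_points(deals):
    """Inverted procedure: group by combined points; return {points: mean tricks}."""
    buckets = defaultdict(list)
    for tricks, points in deals:
        buckets[points].append(tricks)
    return {p: mean(t) for p, t in sorted(buckets.items())}


# Made-up (tricks, combined points) pairs, purely for illustration.
sample = [(9, 48), (9, 50), (10, 52), (10, 56), (10, 54), (9, 52)]
print(stats_by_tricks(sample))
print(mean_tricks_by_points(sample))
```

Running the same functions over the same deals with a second metric would give the side-by-side spread per bucket that the comparison needs.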

Guest Jlall · February 19, 2005

Using a double dummy analyzer wouldn't help too much.

inquiry · February 19, 2005

I would suggest that before you undertake this, you state here how you are going to count ZAR points, Zar fit points, and anti-Zar points (points off for short support, and for singleton honors), so others can offer suggestions to help you get Zar points correct before running whatever test you decide to try.

hotShot · February 20, 2005

Well here is my first impression.

I've made 2 test runs so far. One counting K, KQ, Qx, Jxx, QJ with their full load.

There were more bads than goods, as expected. In the second run I put all those to 0 (except KQ = 3+1).

This is the result of the second run.

Dropped: 207
Level: 1 Good: 1 Bad: 0
Level: 2 Good: 13 Bad: 3
Level: 3 Good: 22 Bad: 10
Level: 4 Good: 20 Bad: 3
Level: 5 Good: 17 Bad: 3
Level: 6 Good: 6 Bad: 0
Level: 7 Good: 2 Bad: 0

There are problems with non-fit hands: most of the bad 3-level contracts are misfit NTs. Although I treat the misfit NTs as one level lower, they still go down.

Up to now I only use HCP + ControlPoints + 2*longest suit + 2nd longest suit - shortest suit.
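As a rough sketch of that basic count (not hotShot's code): the hand layout as a dict of suit strings, the function name, and the example hand are all made up for illustration, and the control values A=2, K=1 are the usual control count, assumed here to be what "ControlPoints" means.

```python
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
CONTROLS = {"A": 2, "K": 1}  # assumed meaning of "ControlPoints"


def basic_zar(hand: dict[str, str]) -> int:
    """Basic count: HCP + controls + 2*longest + 2nd longest - shortest.

    `hand` must list all four suits; a void is an empty string.
    """
    cards = "".join(hand.values())
    hcp = sum(HCP.get(c, 0) for c in cards)
    controls = sum(CONTROLS.get(c, 0) for c in cards)
    lengths = sorted((len(s) for s in hand.values()), reverse=True)
    return hcp + controls + 2 * lengths[0] + lengths[1] - lengths[-1]


# Hypothetical example hand: AKQ32 / K543 / 32 / A2
example = {"S": "AKQ32", "H": "K543", "D": "32", "C": "A2"}
print(basic_zar(example))  # 16 HCP + 6 controls + 12 distribution = 34
```

None of the planned upgrades, downgrades or fit adjustments are included here.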

I'm going to implement the following extras:

+1 if 15+ hcp concentrated in 3 suits or +1 if 12+ hcp in 2 suits

KQ, QJ, K, Q, J each -1 for unsafe honors

For the fit reevaluation I intend to implement:

+1 for each trump honor (incl. T) with a maximum of 2 (both sides ?)

I'll look for the combined shortest suit and downgrade honors by one

Since I have no bidding taking place, I'm still thinking about the second suit.

So I'm not sure if and how I will implement the extra points for the second suit.

Additionally, I don't know "how many trumps" were promised, because I counted the combined length, and I must decide when to add the 3 HC for additional trump length.

Since it's middle of the night here, I'll take a break now.

hotShot · February 20, 2005

Using a double dummy analyzer wouldn't help too much.

Maybe so, but it can analyse a thousand boards much faster than I could.

You will usually not play that well, but on the other hand you won't get the perfect defence either.

hotShot · February 21, 2005

I've been working on the downgrade of "disability combination" of honors.

Here's my list:

-4 K

-3 QJ

-2 AQ, AJ, KJ, Q, Qx

-1 A, AKJ, KQJ, Jx, Jxx

Upgrades:

11-14 HCP with more than 11 in 2 suits: +1

15+ HCP with more than 15 in 3 suits: +1

So I think I have the pre-bidding evaluation done.

Anyone interested can get CSV files, readable with Excel or OpenOffice, containing a list of deals, the Zar points for each hand, the selected fit and the number of tricks the double-dummy solver made.

Gerben47 · February 21, 2005

So what question is it you are answering?

Is it: "If I add the Zar points of the hands and select a contract, the contract will make" ?

I'm very interested in these results, if you could send me the files I'd be very grateful.

Email: gerben AT t-online DOT de

To save time you might want to use the deals from the GIB Double Dummy library (see the GIB research page).

Gerben

hotShot · February 21, 2005

So what question is it you are answering?

Is it: "If I add the Zar points of the hands and select a contract, the contract will make" ?

This is one of the questions, the others are:

How good is the prediction between 3/4M?

As we know, vul at IMPs you start gaining if your game/down ratio is better than 38%.

How good is the prediction for 3m? If it is accurate, 5m may be a good defence.

Weak Zar openings need controls to open; are they worth 2 defence tricks?
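One common way to arrive at a figure near the 38% quoted above is to compare bidding a vulnerable major-suit game with stopping in three, assuming the hand takes either ten or nine tricks. The sketch below does that arithmetic with the standard IMP scale; the function name and the two-outcome model are assumptions for illustration, not necessarily how hotShot derived the number.

```python
from bisect import bisect_left

# Upper bounds of the score-difference bands in the standard IMP scale.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590,
              740, 890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990,
              3490, 3990]


def imps(diff: int) -> int:
    """Convert a score difference to IMPs (sign follows the difference)."""
    size = bisect_left(IMP_BOUNDS, abs(diff))
    return size if diff >= 0 else -size


# Vulnerable major-suit game versus stopping in three.
gain = imps(620 - 170)     # ten tricks: 4M making against 3M with an overtrick
loss = -imps(-100 - 140)   # nine tricks: 4M down one against 3M making
break_even = loss / (gain + loss)
print(gain, loss, break_even)  # 10 6 0.375
```

Under these assumptions the game has to make a little over a third of the time, which rounds to the 38% in the post.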

If I find time again, I'll try with other evaluation methods, too.

hrothgar · February 21, 2005

Using a double dummy analyzer wouldn't help too much.

I'd be very interested to know what this assertion is based on?

"Everyone" knows that double dummy analyzers do not provide a perfect approximation of single dummy play, let alone the behaviour of "fallible" wetware systems like the human brain.

With this said and done, double dummy solvers are orders of magnitude faster than alternative approaches, and there is an awful lot to be said for substituting brute force and massive numbers of repetitions for elegance. As an analogy, consider the way that high-end pharmaceutical scales are now developed. The circuits built into high-end scales are actually quite inaccurate. The scales achieve their accuracy by weighing a sample tens of thousands of times and then averaging the results. Since the "noise" is randomly distributed, it will cancel itself out.

From my perspective, a similar approach is more than appropriate in measuring the accuracy of hand evaluation systems.
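The averaging argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: the noise is taken to be independent, zero-mean and Gaussian, and the names, the seed and the numbers are invented for the example.

```python
import random
from statistics import mean


def noisy_reading(true_value: float, sigma: float, rng: random.Random) -> float:
    """One measurement with zero-mean Gaussian noise added."""
    return true_value + rng.gauss(0.0, sigma)


rng = random.Random(2005)  # fixed seed so the run is repeatable
true_weight = 10.0
single = noisy_reading(true_weight, 1.0, rng)
averaged = mean(noisy_reading(true_weight, 1.0, rng) for _ in range(10_000))

# Error of one reading versus error of the average of 10,000 readings.
print(abs(single - true_weight), abs(averaged - true_weight))
```

For independent noise the error of the average shrinks roughly with the square root of the number of readings, which is why many imperfect double-dummy results can still give a stable estimate. It does nothing against a bias that is the same in every reading.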

It should be noted that there can be problems with this approach, most notably if the double dummy analyzer introduces systemic bias. For example, assume that the double dummy analyzer was biased in favor of declarer and that this bias was a function of the algorithm being evaluated... In this case it would be extremely difficult to differentiate the two error sources.

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.

hotShot · February 21, 2005

2000 Boards:

Dropped: 492 = no opening at N or S

Misfit: 224 = no 8+ Fit (might be source of bad results)

Level: 1 Good: 73 Bad: 65
Level: 2 Good: 153 Bad: 125 42-46 ZAR
Level: 3 Good: 283 Bad: 168 47-51 ZAR
Level: 4 Good: 247 Bad: 130 52-56 ZAR
Level: 5 Good: 134 Bad: 52 57-61 ZAR
Level: 6 Good: 40 Bad: 28 62-66 ZAR
Level: 7 Good: 5 Bad: 5 67+ ZAR

NT contracts are shifted one level e.g.: 3NT = 52-56.
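To make the counts above easier to compare, they can be turned into success rates per level. The snippet is just a convenience sketch; the dictionary layout and function name are made up, and the numbers are copied from the post.

```python
# (good, bad) counts per level, copied from the 2000-board run above.
results = {1: (73, 65), 2: (153, 125), 3: (283, 168), 4: (247, 130),
           5: (134, 52), 6: (40, 28), 7: (5, 5)}


def success_rate(good: int, bad: int) -> float:
    """Fraction of analysed contracts that the solver made."""
    return good / (good + bad)


for level, (good, bad) in results.items():
    print(f"Level {level}: {success_rate(good, bad):.1%} of {good + bad}")
```

The per-level counts add up to 1508 boards, which together with the 492 dropped boards accounts for all 2000.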

cherdano · February 21, 2005

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.

I think I have seen statistical analysis of World Championship hands that showed that declarers there would on average get more tricks than they should on a double dummy basis. The deviation was something like a third or half a trick.

Sounds plausible to me, given how many tricks are lost on the opening lead alone.

Arend

mikestar · February 21, 2005

Double dummy solvers would tend to have a bias that varies with level. Take a grand slam that depends on a two-way finesse for the Queen of trumps. The DD declarer will never get it wrong and the DD defender gains no benefit whatever. On the other hand, DD defense will always find the opening-lead ruff to set a grand. But on the whole, the stronger the declaring side's hands, the more likely it is that DD information won't help the defense because they have little or no control of the play.

My guess is that this pro-declarer bias at higher levels helps bring DD results closer to table results. DD may be a bit unfair to Zarpoints at the partscore level--when the strength is fairly equally divided, DD info will be useful to both sides and that will be a gain for the defense vs. table results.

By the way, Zar points could be quite useful for suit contracts while being worthless for NT (compare the LTC) so the NT results will be of limited utility.

hrothgar · February 21, 2005

To date, I've never seen a good analysis that suggests that double dummy analyzers introduce systemic bias. I'd be interested in seeing anything to the contrary.
I think I have seen statistical analysis of World Championship hands that showed that declarers there would on average get more tricks than they should on a double dummy basis. The deviation was something like a third or half a trick.

Sounds plausible to me, given how many tricks are lost on the opening lead alone.

Arend

Thanks for the data point. One additional "quick" comment.

It's still unclear to what extent any such bias would impact the analysis in question.

Assume for the moment that Single Dummy play is .3456 tricks "better" than double dummy play. Furthermore, assume that this bias is the same regardless of the relative strength of the hands in question.

In this case, the bias would adjust the mean number of tricks taken but would NOT affect the relative variance. And, since the accuracy of the hand evaluation technique depends on the variance, this really doesn't affect the methodology...
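The point about a constant bias is easy to check numerically: adding the same offset to every result moves the mean by that offset and leaves the variance alone. The trick counts below are made up purely for illustration; only the .3456 offset comes from the post.

```python
from statistics import mean, pvariance

# Made-up double-dummy trick counts, purely for illustration.
dd_tricks = [8, 9, 9, 10, 10, 10, 11, 12]
bias = 0.3456  # the illustrative constant offset from the post
table_tricks = [t + bias for t in dd_tricks]

print(mean(table_tricks) - mean(dd_tricks))            # shifts by the bias
print(pvariance(table_tricks), pvariance(dd_tricks))   # equal up to rounding
```

If the offset varied with the strength of the hands, as suggested elsewhere in the thread, this equality would no longer hold.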

tysen2k · February 22, 2005

As was said earlier, it's probably best for you to review what has already been done. Here and here are the best places to start. No point in reinventing the wheel.

Also about the accuracy of DD data compared to real world declarers. Peter Cheung did an extensive study of 383,000 okbridge hands (25 million plays) and found that on average there is only 0.1 tricks difference. A DD declarer has the advantage in slam contracts, but the DD defenders have the advantage at partscores. Around game, DD is very accurate.

Tysen

hotShot · February 22, 2005

Thanks Tysen,

those are interesting links.

But there is something about wheels: some have spokes, some have rims, and if they don't match in form or size, you need to get your own.

hotShot
