New hand evaluation method

tnevolin · August 11, 2016

To all forum buddies. I need your expert opinion.

m1cha proposed shifting the scale of the trump model down so that the critical contract requirements match the popular values. After some thought I agreed with him: even though such a shift distorts the exact trick estimate, bidders are much more often interested in the critical contract condition than in a generic trick estimate. The convenience is large enough to outweigh the inaccuracy, so I made the change and updated the evaluation document accordingly. Thanks again to m1cha for pointing this out.

With this done, I kept thinking about further improvements in the same direction, to make the transition from points to contract requirements even more convenient. Here is the background. I have analyzed the standard HCP evaluation model and found that the 3NT contract requirement is actually 23 points, not 25. That number marks the break-even point where declaring the contract becomes profitable on average, not the point where the contract will be made 100% of the time. This is an interesting finding, and I believe it is correct because my recent NT model calculations suggest adding 2 points as a constant to the combined strength for a better trick count estimate in NT contracts. My critical contract strength table lists 25 points for 3NT (IMP). With the 2-point constant in mind, this is equivalent to a 23-point 3NT requirement in the HCP model. So far the math matches.

Now back to convenience. This 2-point constant is practically the only difference between the Evolin NT model and HCP; the long and strong suit rule applies quite rarely. So if we removed the constant, the two models would match exactly 99% of the time, and in the remaining cases they would differ by 1 point at most. That would be a huge in-game convenience. The only consequence of this change would be shifting the NT critical contract requirements down by 2 points. I understand that this would drift away from the popular values. However, let me reiterate that the HCP 3NT requirement of 25/26 points is incorrect; the correct one is 23/24 points anyway. Please let me know which approach would be more convenient in your opinion.

If we decide to shift the NT model scale, the same should be done with the trump model to keep the critical contract requirements in sync. The trump model shift doesn't present a challenge, though: I would just add or subtract the value for the number of trumps. The trump model is already quite complicated, so such a shift doesn't make it any more or less complicated.

jogs · August 11, 2016

With this done, I kept thinking about further improvements in the same direction, to make the transition from points to contract requirements even more convenient. Here is the background. I have analyzed the standard HCP evaluation model and found that the 3NT contract requirement is actually 23 points, not 25. That number marks the break-even point where declaring the contract becomes profitable on average, not the point where the contract will be made 100% of the time. This is an interesting finding, and I believe it is correct because my recent NT model calculations suggest adding 2 points as a constant to the combined strength for a better trick count estimate in NT contracts. My critical contract strength table lists 25 points for 3NT (IMP). With the 2-point constant in mind, this is equivalent to a 23-point 3NT requirement in the HCP model.

You'll need to show proof. Defenders with 17 HCP and the opening lead can't find 5 tricks quicker than declarer finds 9 tricks? I've actually made my own studies. Don't agree with your conclusion. Sometimes declarer has long suits as a source of tricks. It is easier to make 9 tricks while in 1NT than in 3NT. In 3NT the defenders know their goal is to win 5 tricks.

We all know Meckwell often bid 3NT with 24 HCP or less. But does anyone have a study of their boards? Have they won or lost imps on these shaky 3NTs?

Stefan_O · August 11, 2016

Have to agree with Jogs here...

Could it also be that your deals/scoreboard data-files come from a too weak field of players?

I might expect weak/inexperienced players would be quicker to pick up how to develop the tricks they need as declarer, than how to defend reasonably...

Not that you should blindly accept any old rules-of-thumb, of course, but this specific point of how many HCP you need for 3NT is probably the one that has been by far the most thoroughly examined in past hand-eval studies...

And if it was that wrong, it would certainly have been refuted by good players a long time ago, because it is so easy to test/experience in every-day actual play.

tnevolin · August 11, 2016

Have to agree with Jogs here...

Could it also be that your deals/scoreboard data-files come from a too weak field of players?

I might expect weak/inexperienced players would be quicker to pick up how to develop the tricks they need as declarer, than how to defend reasonably...

Not that you should blindly accept any old rules-of-thumb, of course, but this specific point of
how many HCP you need for 3NT, is probably the one that has been by far most thoroughly examined in past hand-eval studies...

And if it was that wrong, it would certainly have been refuted by good players long time ago, because it is so easy to test/experience in every-day actual play.

OK. Let me try to find another game source and see if it holds. On the other hand, 2 points is a huge difference that would be very difficult to explain by some constant dumbness of players across thousands of games.

By the way, can you point me to where it was "most thoroughly examined in past hand-eval studies..."? I never found one after an extensive search over a few years. That might resolve many of the questions right away.

Stephen Tu · August 11, 2016

You might look at

http://bridge.thomasoandrews.com/valuations/

This was based on double-dummy simulations. But other studies have compared single dummy to double dummy and found that actual results are fairly close to DD, declarers tend to do slightly better than DD at lower levels and somewhat worse at slam level. At 3nt I've seen studies seeing declarers do somewhere between 0.1-0.2 tricks better than DD on average.

I'd be pretty surprised to see that 23 is enough to bid 3nt.

tnevolin · August 11, 2016

You might look at

http://bridge.thomasoandrews.com/valuations/

This was based on double-dummy simulations. But other studies have compared single dummy to double dummy and found that actual results are fairly close to DD, declarers tend to do slightly better than DD at lower levels and somewhat worse at slam level. At 3nt I've seen studies seeing declarers do somewhere between 0.1-0.2 tricks better than DD on average.

I'd be pretty surprised to see that 23 is enough to bid 3nt.

Nice link. Let me study it. Maybe I indeed have some statistical error inside.

jogs · August 11, 2016

OK. Let me try to find another game source and see if it holds. On the other hand, 2 points is a huge difference that would be very difficult to explain by some constant dumbness of players across thousands of games.

For you to conclude that it is right to be in 3NT with 12 opposite 11 your sample can only contain boards where declarer is in 3NT.

If declarer is in 1NT, defenders sometimes don't cash their 5th trick. Their goal was to win 7 tricks.

tnevolin · August 11, 2016

For you to conclude that it is right to be in 3NT with 12 opposite 11 your sample can only contain boards where declarer is in 3NT.
If declarer is in 1NT, defenders sometimes don't cash their 5th trick. Their goal was to win 7 tricks.

You are right, jogs. This is an inherent skew that all analysts point out. The number of tricks depends on the contract because both declarer's and defenders' tactics are driven by it. Unfortunately, I don't know a cure. I also haven't seen anybody post a good idea for how to filter this skewness out in analysis.

jogs · August 11, 2016

You are right, jogs. This is an inherent skew that all analysts point out. The number of tricks depends on the contract because both declarer's and defenders' tactics are driven by it. Unfortunately, I don't know a cure. I also haven't seen anybody post a good idea for how to filter this skewness out in analysis.

I have no idea how much you know about statistics. This is hypothesis testing. You have the cure. Large data base. Only include observations which fit a strict criteria. Offense must bid 3NT. Defenders must know their goal is to win 5+ tricks.

Do not include observations where declarer is in 1NT. Defender didn't know they were expected to cash their 5th trick.

tnevolin · August 11, 2016

I have no idea how much you know about statistics. This is hypothesis testing. You have the cure. Large data base. Only include observations which fit a strict criteria. Offense must bid 3NT. Defenders must know their goal is to win 5+ tricks.
Do not include observations where declarer is in 1NT. Defender didn't know they were expected to cash their 5th trick.

I know some. The problem here is that there is a correlation between the number of tricks and the contract level. If someone bids 3NT, that means they have around 9 tricks. Maybe a little more or less, but not a whole lot. So if I restrict the whole set to only those bidding 3NT, I will get a conditional result that in simple human words would read: "If we bid 3NT and have so many points, how many tricks do we have?" When you rephrase it like this, it is obvious that you don't even need statistical analysis to know that the answer will be somewhere in 8.5 - 9.5.

I actually did what you proposed and tried to narrow down to the interval of interest. If I wanted to predict 3NT games better and didn't care about 1NT, I would just cut out the interval where people bid (or get) 8-10 tricks. It sounds promising, but it didn't work. The coefficient values become ridiculous: I would get one huge constant of 9 tricks plus some very small coefficients for the other features to account for the slight variation around 9 tricks. And the overall prediction around 3NT got worse compared with the case where I considered the whole range of games.
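The effect described here (a constant near 9 tricks and tiny coefficients after cutting the data to an 8-10 trick window) can be reproduced on synthetic data. The sketch below is only an illustration: the linear model, its 0.45 tricks-per-point slope, the noise level, and the point range are all made-up assumptions, not values from the real dataset or from the Evolin model.

```python
import random

def ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate(n=20000, seed=0):
    """Synthetic deals from a made-up model: tricks = 9 + 0.45 * (points - 25) + noise."""
    rng = random.Random(seed)
    points = [rng.randint(15, 30) for _ in range(n)]
    tricks = [9 + 0.45 * (p - 25) + rng.gauss(0, 1.0) for p in points]
    return points, tricks

points, tricks = simulate()
full_intercept, full_slope = ols(points, tricks)

# Keep only the deals whose result fell inside the 8-10 trick window.
kept = [(p, t) for p, t in zip(points, tricks) if 8 <= t <= 10]
cut_intercept, cut_slope = ols([p for p, _ in kept], [t for _, t in kept])

print(f"full sample: tricks = {full_intercept:.2f} + {full_slope:.3f} * points")
print(f"8-10 window: tricks = {cut_intercept:.2f} + {cut_slope:.3f} * points")
```

Selecting on the outcome squeezes the fitted slope toward zero and pushes the fit toward a flat line around 9 tricks, which is consistent with the "ridiculous" coefficients reported above.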

jogs · August 11, 2016

I actually did what you proposed and tried to narrow down to the interval of interest. If I wanted to predict 3NT games better and didn't care about 1NT, I would just cut out the interval where people bid (or get) 8-10 tricks. It sounds promising, but it didn't work. The coefficient values become ridiculous: I would get one huge constant of 9 tricks plus some very small coefficients for the other features to account for the slight variation around 9 tricks. And the overall prediction around 3NT got worse compared with the case where I considered the whole range of games.

You should care about 1NT. With the same cards defenders best line of defense is dependent on whether they are defending 1NT or 3NT. Have you studied game theory?

Stefan_O · August 12, 2016

Hi Tim,

I had to see for myself, and wrote a little program here to do some number-crunching on your datafile ;)

Here's what I found:

Turns out, on all the deals, when people end up in 3NT:

- if they have 25 HCP, they make the contract in 62.9% of the cases (119566 samples)

- if they have 24 HCP, they make the contract in 51.0% of the cases (98613 samples)

- if they have 23 HCP, they make the contract in 41.1% of the cases (62852 samples)

- if they have 22 HCP, they make the contract in 33.0% of the cases (36243 samples)

Now, it's quite a different animal when you have a normal, well-controlled auction like 1NT-3NT vs. very unbalanced hands, where you end up in 3NT just because you find no fit at all, or one of the players has a 6- or 7+-card minor, etc.

So I was thinking, perhaps the classic ~25 HCP guideline mostly applies when we have two balanced hands, and made an additional check filtering on this extra restriction, too.

Indeed, the percentages for 24HCP or less then go down, but perhaps still not as much as one might expect....

Turns out, on the deals where both hands are 4333/4432/5332 and people bid 3NT:

- if they have 25 HCP, they make the contract in 63.0% of the cases (99079 samples)

- if they have 24 HCP, they make the contract in 50.0% of the cases (75201 samples)

- if they have 23 HCP, they make the contract in 37.4% of the cases (39524 samples)

- if they have 22 HCP, they make the contract in 25.1% of the cases (15859 samples)
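A tabulation like the two above could be sketched roughly as follows. The record layout (the `contract`, `hcp`, `tricks`, and `shapes` keys) is a hypothetical assumption made for this sketch and is not the format of the actual datafile; the four sample deals are hand-made and only show the call shape.

```python
from collections import defaultdict

# Balanced patterns named in the post: 4333, 4432, 5332.
BALANCED = {(4, 3, 3, 3), (4, 4, 3, 2), (5, 3, 3, 2)}

def is_balanced(suit_lengths):
    """True if the four suit lengths form a 4333, 4432 or 5332 pattern."""
    return tuple(sorted(suit_lengths, reverse=True)) in BALANCED

def make_rate_by_hcp(deals, balanced_only=False):
    """Return {combined_hcp: (make_rate, samples)} for deals played in 3NT."""
    made = defaultdict(int)
    total = defaultdict(int)
    for deal in deals:
        if deal["contract"] != "3NT":
            continue
        if balanced_only and not all(is_balanced(s) for s in deal["shapes"]):
            continue
        hcp = deal["hcp"]
        total[hcp] += 1
        if deal["tricks"] >= 9:
            made[hcp] += 1
    return {hcp: (made[hcp] / total[hcp], total[hcp]) for hcp in sorted(total)}

# Hypothetical records: declaring side's combined HCP, tricks taken, both hand shapes.
sample = [
    {"contract": "3NT", "hcp": 24, "tricks": 9, "shapes": [(4, 3, 3, 3), (4, 4, 3, 2)]},
    {"contract": "3NT", "hcp": 24, "tricks": 8, "shapes": [(5, 3, 3, 2), (3, 3, 4, 3)]},
    {"contract": "3NT", "hcp": 24, "tricks": 10, "shapes": [(6, 3, 2, 2), (4, 4, 3, 2)]},
    {"contract": "1NT", "hcp": 24, "tricks": 9, "shapes": [(4, 3, 3, 3), (4, 3, 3, 3)]},
]
print(make_rate_by_hcp(sample))
print(make_rate_by_hcp(sample, balanced_only=True))
```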

At MPs, I would think I want at least a 50% chance when bidding 3NT,

so it seems to make sense to at least not go below 24HCP.

But at IMPs... well...

Anyways, perhaps, this is where your Evolin points come in handy...

To distinguish between 24 good HCPs, and 24 (or even 25?) bad HCPs, to always land us in the optimal contract :)

Definitely an interesting exercise, this stuff! :)

Stefan_O · August 12, 2016

Some other thoughts ---

1. This might well be the most important factor skewing the data: I believe temporary partnerships are clearly much more common when playing online, which should give the average declarer's one-man show a distinct advantage online (= fewer HCP needed on average to make the contract).

2. How well do defenders play online vs. face-to-face? Myself, I definitely feel more concentrated when playing IRL. For whatever reason, it is hard to keep the same focus and "mental sharpness" when playing in front of the computer, perhaps "just killing time"... Dunno if this is just a personal thing? People might also, of course, have more distractions in their environment that you don't have at IRL competitions, where everything is set up for the game.

3. It also raises a possible issue: how much information does a face-to-face competition actually contain when it comes to defenders' "table presence", etc., or all the "signals" of sighs and breathing, frowning and facial expressions, "body language", etc.? Even if it is of course "subconscious" and not intentional in the majority of cases, there is evidently much less of this component online, which might then also add to the defenders' "disadvantage"?

jogs · August 12, 2016

But at IMPs... well...

Anyways, perhaps, this is where your Evolin points come in handy...
To distinguish between 24 good HCPs, and 24 (or even 25?) bad HCPs, to always land us in the optimal contract :)

Definitely an interesting exercise, this stuff! :)

Saw it somewhere:

12 oppo 12 plays the best.

13 oppo 11

14 oppo 10

15 oppo 9

The bigger the difference between the two hands the lower the expected tricks.

tnevolin · August 12, 2016

You should care about 1NT. With the same cards defenders best line of defense is dependent on whether they are defending 1NT or 3NT. Have you studied game theory?

You lost me. Isn't it what we started with?

tnevolin · August 12, 2016

Turns out, on the deals where both hands are 4333/4432/5332 and people bid 3NT:
- if they have 25 HCP, they make the contract in 63.0% of the cases (99079 samples)
- if they have 24 HCP, they make the contract in 50.0% of the cases (75201 samples)
- if they have 23 HCP, they make the contract in 37.4% of the cases (39524 samples)
- if they have 22 HCP, they make the contract in 25.1% of the cases (15859 samples)

At MPs, I would think I want at least a 50% chance when bidding 3NT,
so it seems to make sense to at least not go below 24HCP.

But at IMPs... well...

Your results and conclusion correspond to my findings exactly. My 3NT contract requirements are 25 for IMP and 26 for MP; I just often replace 25/26 with 25 for the sake of simplicity. If we then shift it 2 points down, as I explained in my initial post, we get the results for Evolin points without the constant, which correspond almost exactly to HCP. And here you get your 23 (IMP) / 24 (MP) borderlines! If you use the pure score and calculate what probability of making 3NT you need to still profit on average, you get 40% (nv) and 33% (v). Jamming v and nv together, it'll be somewhere around 37%. Of course, game points are not equivalent to IMPs, but they are a good approximation.
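The 40% / 33% figures can be checked with a short break-even calculation. The sketch below rests on simplifying assumptions that are not spelled out in the post: the contract is undoubled, a failing 3NT goes exactly one down (8 tricks), and the alternative is an NT partscore (2NT is used here; 1NT gives the same scores for 8 and 9 tricks) that takes the same number of tricks. The function names are made up for illustration.

```python
def nt_score(level, tricks, vulnerable):
    """Duplicate score for an undoubled NT contract below slam level."""
    needed = level + 6
    if tricks < needed:
        # Undoubled undertricks: 50 each non-vulnerable, 100 each vulnerable.
        return -(needed - tricks) * (100 if vulnerable else 50)
    trick_points = 40 + 30 * (level - 1)
    if trick_points >= 100:
        bonus = 500 if vulnerable else 300  # game bonus
    else:
        bonus = 50  # partscore bonus
    return trick_points + bonus + 30 * (tricks - needed)

def break_even_probability(vulnerable):
    """Probability of 9 tricks at which bidding 3NT and stopping in 2NT score the same on average."""
    game_made = nt_score(3, 9, vulnerable)
    game_down = nt_score(3, 8, vulnerable)
    part_nine = nt_score(2, 9, vulnerable)
    part_eight = nt_score(2, 8, vulnerable)
    # Solve p*game_made + (1-p)*game_down = p*part_nine + (1-p)*part_eight for p.
    return (part_eight - game_down) / ((game_made - game_down) - (part_nine - part_eight))

nv = break_even_probability(vulnerable=False)
v = break_even_probability(vulnerable=True)
print(f"non-vulnerable: {nv:.3f}, vulnerable: {v:.3f}, average: {(nv + v) / 2:.3f}")
```

Under these assumptions the break-even points come out at about 0.405 non-vulnerable and 0.328 vulnerable, averaging roughly 0.37.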

tnevolin · August 12, 2016

Anyways, perhaps, this is where your Evolin points come in handy...
To distinguish between 24 good HCPs, and 24 (or even 25?) bad HCPs, to always land us in the optimal contract :)

Yep, the idea exactly. Help people make the most difficult and important decision in the game.

tnevolin · August 12, 2016

Saw it somewhere:
12 oppo 12 plays the best.
13 oppo 11
14 oppo 10
15 oppo 9

The bigger the difference between the two hands the lower the expected tricks.

I know this research. This is especially true for NT contracts. It could be explained by a lack of entries to the weak hand and, therefore, a lack of maneuver. I tried to factor it into the calculation. It didn't work well, for two reasons. First, the effect is very subtle and there are not many hands with a large skew like 24-0; a larger dataset is probably needed to catch it. Second, and more important, this is a second-iteration factor, i.e. you need to evaluate your hands first and then do a second iteration for this adjustment. I decided not to include it for now, as it complicates the evaluation rules tremendously.

jogs · August 12, 2016

You should care about 1NT. With the same cards defenders best line of defense is dependent on whether they are defending 1NT or 3NT.

You lost me. Isn't it what we started with?

Hasn't this ever happened to you? 1NT+2. But if you were in 3NT, it would be 3NT-1. Defenders would cash the setting trick.

tnevolin · August 12, 2016

You might look at

http://bridge.thomasoandrews.com/valuations/

This was based on double-dummy simulations. But other studies have compared single dummy to double dummy and found that actual results are fairly close to DD, declarers tend to do slightly better than DD at lower levels and somewhat worse at slam level. At 3nt I've seen studies seeing declarers do somewhere between 0.1-0.2 tricks better than DD on average.

I'd be pretty surprised to see that 23 is enough to bid 3nt.

I like the link and the study. I even sent the man a message, but he hasn't replied yet. Do you know him personally?

Stefan_O · August 12, 2016

Saw it somewhere:
12 oppo 12 plays the best.
13 oppo 11
14 oppo 10
15 oppo 9

The bigger the difference between the two hands the lower the expected tricks.

I ran another crunch on the same ~400K deals, to examine this relation.

Outcome:

When a pair has a combined 24 HCP and both hands are 4333/4432/5332, and they bid 3NT:

- if they have 24 HCP (12 vs 12), they make the contract in 50.10% of the cases (6271 samples)

- if they have 24 HCP (13 vs 11), they make the contract in 49.44% of the cases (9975 samples)

- if they have 24 HCP (14 vs 10), they make the contract in 51.25% of the cases (7005 samples)

- if they have 24 HCP (15 vs 9), they make the contract in 50.85% of the cases (8752 samples)

- if they have 24 HCP (16 vs 8), they make the contract in 48.11% of the cases (7151 samples)

- if they have 24 HCP (17 vs 7), they make the contract in 47.41% of the cases (3227 samples)

- if they have 24 HCP (18 vs 6), they make the contract in 47.19% of the cases (2115 samples)

- if they have 24 HCP (19 vs 5), they make the contract in 46.03% of the cases (1336 samples)

- if they have 24 HCP (20 vs 4), they make the contract in 44.39% of the cases (865 samples)

- if they have 24 HCP (21 vs 3), they make the contract in 41.47% of the cases (340 samples)

- if they have 24 HCP (22 vs 2), they make the contract in 47.32% of the cases (298 samples)

- if they have 24 HCP (23 vs 1), they make the contract in 28.57% of the cases (91 samples)

- if they have 24 HCP (24 vs 0), they make the contract in 47.30% of the cases (74 samples)

So, as long as the stronger hand has no more than 15 HCP, this data does not indicate any negative impact on your chances.

But stronger than that, it seems they start to fall off a bit...
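How much weight the small-sample rows can carry may be judged from their binomial standard errors. The sketch below uses a few (rate, samples) pairs copied from the list above; the normal-approximation 95% interval is a rough assumption and is least reliable for the smallest samples.

```python
import math

# (HCP split, make rate, samples) copied from the tabulation above.
rows = [
    ("12 vs 12", 0.5010, 6271),
    ("16 vs 8", 0.4811, 7151),
    ("20 vs 4", 0.4439, 865),
    ("23 vs 1", 0.2857, 91),
    ("24 vs 0", 0.4730, 74),
]

def standard_error(rate, samples):
    """Standard error of a binomial proportion."""
    return math.sqrt(rate * (1 - rate) / samples)

for split, rate, samples in rows:
    half_width = 1.96 * standard_error(rate, samples)  # approximate 95% interval
    print(f"{split}: {rate:.1%} +/- {half_width:.1%} ({samples} samples)")
```

With only 91 or 74 samples the interval is several times wider than for the rows with thousands of samples, so the rows for the most lopsided splits say little on their own.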

Stefan_O · August 12, 2016

Hasn't this ever happened to you? 1NT+2. But you were in 3NT, it would be 3NT-1. Defenders would cash the setting trick.

Hmm... but this might also work in the opposite direction...

When declarer knows he needs 9 tricks, he is of course eager to take a chance/risk to make his contract, that he might not risk if he played 1NT or 2NT, thus giving a higher percentage for 9 tricks in 3NT.

Guess I'll do another run to check this... ;)

Stefan_O · August 13, 2016

Hasn't this ever happened to you? 1NT+2. But you were in 3NT, it would be 3NT-1. Defenders would cash the setting trick.

Yes... the datafile indeed does indicate the opposite... i.e. with a given combined HCP strength [22-25], you are more likely to score 9+ tricks if you bid 3NT than if you do not.

Outcome:

When both hands are 4333/4432/5332 and the pair ends up in an NT-contract:

- if they have 25 HCP and the contract is 1NT, they score at least 9 tricks in 57.20% of the cases (6913 samples)

- if they have 25 HCP and the contract is 2NT, they score at least 9 tricks in 55.48% of the cases (5056 samples)

- if they have 25 HCP and the contract is 3NT, they score at least 9 tricks in 62.28% of the cases (61179 samples)

- if they have 25 HCP and the contract is 4+NT, they score at least 9 tricks in 56.75% of the cases (289 samples)

- if they have 24 HCP and the contract is 1NT, they score at least 9 tricks in 42.87% of the cases (16882 samples)

- if they have 24 HCP and the contract is 2NT, they score at least 9 tricks in 43.10% of the cases (12421 samples)

- if they have 24 HCP and the contract is 3NT, they score at least 9 tricks in 49.31% of the cases (47500 samples)

- if they have 24 HCP and the contract is 4+NT, they score at least 9 tricks in 46.95% of the cases (164 samples)

- if they have 23 HCP and the contract is 1NT, they score at least 9 tricks in 29.67% of the cases (29047 samples)

- if they have 23 HCP and the contract is 2NT, they score at least 9 tricks in 30.60% of the cases (15530 samples)

- if they have 23 HCP and the contract is 3NT, they score at least 9 tricks in 36.44% of the cases (25448 samples)

- if they have 23 HCP and the contract is 4+NT, they score at least 9 tricks in 29.17% of the cases (96 samples)

- if they have 22 HCP and the contract is 1NT, they score at least 9 tricks in 18.35% of the cases (35317 samples)

- if they have 22 HCP and the contract is 2NT, they score at least 9 tricks in 21.02% of the cases (10307 samples)

- if they have 22 HCP and the contract is 3NT, they score at least 9 tricks in 24.61% of the cases (10829 samples)

- if they have 22 HCP and the contract is 4+NT, they score at least 9 tricks in 18.52% of the cases (54 samples)

(4+NT means 4NT or higher NT-contract)

Not very surprising, really, since overall, declarer has a much better picture of the deal and his risks/chances and knows what he is doing, than defenders.

Another point of interest, perhaps, is that in this data sample, with 24 HCP and two balanced hands, the players are more likely to end up in 3NT than in an NT partscore (47500 instances vs 29303 = 62% vs 38%).
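Whether the gap between 3NT and 1NT at the same combined HCP is larger than sampling noise can be checked with a pooled two-proportion z-test. The sketch below uses the 24 HCP rows from the list above (49.31% of 47500 in 3NT vs 42.87% of 16882 in 1NT). It is a standard textbook test applied as an assumption, not something run in the original analysis, and it cannot say whether the gap comes from the play itself or from which pairs chose to bid 3NT.

```python
import math

def two_proportion_z(rate_a, n_a, rate_b, n_b):
    """z statistic for the difference of two proportions, using the pooled rate."""
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (rate_a - rate_b) / se

# 24 HCP, both hands balanced: 9+ tricks in 3NT vs in 1NT.
z_24 = two_proportion_z(0.4931, 47500, 0.4287, 16882)
print(f"z = {z_24:.1f}")
```

A z value this far above 2 means the difference is very unlikely to be a sampling accident, which leaves declarer effort, defensive goals, and selection of better 24-point hands into 3NT as the candidate explanations.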

jogs · August 13, 2016

How much can binky points improve our tricks estimates over work count?

Introducing a new parameter can improve estimates by leaps and bounds.

Lawrence's short suit total vastly improves estimates. SST is interdependent of trumps.

http://jogsbridge.weebly.com/uploads/1/8/0/2/1802582/554030.jpg?617

The double fit is another tremendous source of tricks. Nearly 15% of the time our side has 17 or more cards in two suits.

tnevolin · August 13, 2016

Hm, last month the 53rd European Team Championships took place in Budapest, the results are here:
http://www.eurobridge.org/repository/competitions/16budapest/microsite/Results.htm

and if you click on a round and then on a table you get the individual scores such as here:
http://www.eurobridge.org/repository/competitions/16budapest/microsite/Asp/BoardDetails.asp?qmatchid=34985

This is high-class bridge, definitely good data but probably not sufficient quantity for you, I am afraid, even if you check for earlier years.

Another possibility perhaps if you write to the BBO people, they might give you the accumulated data of these new "Free Daylong tournaments". That is ~ 10000 participants @ 8 boards each EVERY DAY. Worth writing another crawler for it, I guess ;) . Not all of this is high-class bridge though. (Edit: and three of the players are always robots.)

I finally downloaded them! Thanks for the interesting source. The sad thing is that there are only 10k boards there. A tiny amount. :(

Does anybody know of similar sources?
