whereagles Posted November 5, 2014
Prove what? Significance levels are used when you are interested in rejecting a null hypothesis. Here the null hypothesis might be that the LOTT is accurate, but we know that isn't true. If you want to prove that the parabola is "correct" then you could postulate a noninferiority hypothesis, i.e. the hypothesis that the optimal parabola is not so much worse than the optimal generic model that it matters for practical purposes.
It's not proving that LOTT is accurate (i.e. tricks = trumps). It's rejecting H0: E(tricks) = trumps. I think you can do it with a chi-square test.
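For what it's worth, a one-sample t test on the per-deal difference (tricks − trumps) is perhaps the most direct way to reject H0: E(tricks) = trumps. A minimal sketch in Python, on simulated stand-in data rather than Ginsberg's deals (the offset and spread below are illustrative assumptions):

```python
import math
import random
import statistics

# Illustrative only: simulate per-deal (total tricks - total trumps)
# differences. The distribution here is an assumption, not Ginsberg's data.
random.seed(1)
diffs = [random.gauss(0.3, 1.1) for _ in range(2000)]  # hypothetical offset

n = len(diffs)
mean = statistics.fmean(diffs)
se = statistics.stdev(diffs) / math.sqrt(n)
t_stat = mean / se  # one-sample t statistic for H0: E(tricks - trumps) = 0

# With n = 2000 the t distribution is effectively normal, so |t| > 1.96
# rejects H0 at the 5% level.
print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > 1.96}")
```

A chi-square goodness-of-fit test on binned trick counts would work too; the t test on the differences just needs less bookkeeping.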
barmar Posted November 5, 2014
It's not clear to me how he's calculated these numbers. Assuming the ± bit gives a confidence interval for the expected number of tricks, the greater uncertainty for the 21+ range would just be a consequence of the low sample size; it doesn't mean there is a higher variance in the number of tricks made. However, the actual numbers given make this interpretation a bit implausible. Since he probably calculated them using either GIB or a DD solver, he could presumably generate enough hands to avoid sampling error. But high total trump hands don't come up so often in the real world, so the accuracy of LOTT is not so critical for them in practical terms.
jogs Posted November 5, 2014 Author
Eh? New prescription drugs get 0.1% or less, don't they?
Don't know. Only know that it sometimes costs nearly $1B to get a drug past the FDA.
helene_t Posted November 5, 2014
Eh? New prescription drugs get 0.1% or less, don't they?
The 95% significance level is quite widely accepted AFAIK, but the FDA often requires two trials, so if both pass at 95% you have 99.875% combined. Anyway, for a superiority trial it is not enough to reject the hypothesis of zero superiority.
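The 99.875% figure can be reproduced with a one-line calculation, assuming two independent trials, each two-sided at the 5% level, that must both come out significant in the same direction:

```python
# Reproducing the 99.875% figure: two independent trials, each two-sided
# at the 5% level. Under the null, each tail has probability 0.025, and
# both trials landing significant in the *same* direction happens with
# probability 2 * 0.025**2 (either both high or both low).
alpha_each_tail = 0.025
p_false_positive = 2 * alpha_each_tail ** 2   # 0.00125
combined_level = 1 - p_false_positive
print(combined_level)  # 0.99875
```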
campboy Posted November 5, 2014
The same holds for Ginsberg's experiment. There are deals with more tricks than trumps and deals with fewer tricks than trumps, just as there are heavy apples and light apples. The deviation in the number of total tricks is a property of the total trick distribution of bridge deals. Had he used more deals, he would have been able to determine the average number of tricks, as well as the standard deviation, more accurately, but there is no reason why the value of the standard deviation would decrease or increase. From the start, the reported standard deviation has been the best estimate of the standard deviation of the total trick distribution. More measurements will lead to a better estimate of this standard deviation, but we cannot say whether it will be higher or lower.
Ah, ok. It just didn't occur to me that the second number was supposed to be the standard deviation, because the way it is written with +/- makes it sound like a confidence interval. But of course the actual numbers are much more consistent with the former interpretation.
jogs Posted November 5, 2014 Author
I had forgotten about the Ginsberg article posted by Yunling. The info was wedged into the back of my mind. LoTT has its limitations. We know our own trumps. Often we can deduce partner's trumps. How often can we decipher the opponents' trumps? Using total trumps, 3/4ths of our estimator is unknown. http://i61.tinypic.com/5p0x9t.png
Your chart looks better than mine. Look at the Ginsberg chart. As trumps increase, the std dev increases. That means that as trumps increase, trumps play a smaller role in the estimates. At the table it is really easier to think in terms of our tricks. Total tricks is too much beyond our control. There are 26 cards in our partnership hands, 13 in each hand. All 26 cards play a role in determining the number of tricks our partnership can make. It is my suit pattern's fit with my partner's suit pattern. The independent random variable is our joint pattern pair. Trumps is the primary component of pattern. With many trumps, flat patterns become less likely. Lawrence/Wirgren's short suit totals play a larger role as trumps increase. SST is another component of pattern.
1. Well, HCP count works fine until a fit is found. That's why people teach it :) After a fit is found, HCP needs corrections (points for singletons, voids, etc). In fact, it is much like the LOTT + corrections. 2. They did present an alternative: the SST/WP stuff. Just that it's a bit too complicated to use at the table. But yeah, I tend to agree that LOTT + corrections, while not ideal, should be good enough for most practical cases.
Use our tricks: E(tricks) = trumps + (HCP - 20)/3. Notice that with 9 trumps, one can bid game with 22 points. An additional trump is worth an additional trick. E(tricks) = trumps + (HCP - 20)/3 + SST. You can use SST to adjust the estimates both up and down. You know your hand's contribution to SST. Sometimes you can deduce partner's.
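As a sketch, the "our tricks" estimator above can be written as a small function; the function name, signature, and the way the SST term is passed are my own framing of what the post states:

```python
def expected_tricks(trumps: int, hcp: int, sst_adjust: float = 0.0) -> float:
    """jogs' 'our tricks' estimate: trumps + (HCP - 20)/3, plus an
    optional SST adjustment term as stated in the post. The function
    name and signature are mine, not from the thread."""
    return trumps + (hcp - 20) / 3 + sst_adjust

# The post's example: 9 trumps and a combined 22 HCP.
# 9 + (22 - 20)/3 = 9.67, which the post treats as close enough to
# the 10 tricks needed for a major-suit game.
print(expected_tricks(9, 22))
```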
navahak Posted November 5, 2014
LoTT has its limitations. We know our own trumps. Often we can deduce partner's trumps. How often can we decipher the opponents' trumps? Using total trumps, 3/4ths of our estimator is unknown.
Our trump length correlates with the opponents' trump length. Basically, if we have a 9-card fit, the opponents have one too, or they have two 8-card fits.
mikeh Posted November 5, 2014
Use our tricks: E(tricks) = trumps + (HCP - 20)/3. Notice that with 9 trumps, one can bid game with 22 points. An additional trump is worth an additional trick. E(tricks) = trumps + (HCP - 20)/3 + SST. You can use SST to adjust the estimates both up and down. You know your hand's contribution to SST. Sometimes you can deduce partner's.
This is just silly. Yes, there are hands on which we can, should, and do bid games on 22 hcp (or less; haven't we all bid and made slams on 15 counts or less... admittedly usually as a save that happens to make!), but that doesn't mean that we should always be bidding major suit games on 9-card fits every time we hold a combined 22 count. Bridge valuation is a fuzzy process. I accept that in theory it ought to be possible to come up with a mathematical approach that does better than the best judgment of the best players in the world. After all, in chess it is now accepted that the best software running on the best computer will trounce the best human players. The problem is that we humans are not capable of applying such a 'perfect' or 'near-perfect' mathematical model. We can't hold the parameters or equations in our heads (if one ever figured out what they would be) and we can't crunch the numbers unaided, or in a realistic playing time. Meanwhile, focusing on some bits of what an ultimate theory would encompass, while ignoring other just as important bits, is an exercise in futility. When we ascertain, or assume, that we have 22 combined hcp and a 9-card fit, we look at where the cards are, and what they are. We upgrade for the presence of 9's and 10's in our long suits. We upgrade for honours in our long suits, but downgrade the J in the 9+ suit, since it may be redundant. We downgrade Queens and Jacks in short suits. We look kindly on Aces, somewhat also on Kings. We pay attention to the bidding by the opps, including passes on occasion.
We weigh the merits of exploratory bidding, which informs the opps as well as partner, against the merits of blasting (or passing low). We weigh our partnership style. We weigh the state of the match, and the relative strengths of our teams, if playing a head-to-head team game. We look at the vulnerability. Even if we could build an effective set of equations that would include all of these factors, and correctly assign weights to them, which probably vary between every hand and every match, we couldn't possibly come up with a method that a human could play at the table. Which means that what we need to develop is a series of methods that are relatively easy to use, and offer reasonable approximations, and then synthesize, through experience, discussion with better players, and so on, what will largely be an unconscious analytical approach. By spending so much attention on narrow aspects of this process you run the risk of being able to count the leaves on an individual tree while having very little idea of what the forest looks like... when it is the forest that concerns you, not the tree you are staring at so intently. I can tell you, for example, that when I am playing well, what I note mostly about my successful aggressive auctions and decisions is that I 'like' or 'dislike' my hand. Of course, I will think carefully about the various factors I listed above, but the single most important criterion is how I feel about the hand... do I like it or dislike it? In my opinion, that feeling, when I am 'on', is the result of an unconscious synthesis of a lot of little bits of information. It's called judgment, and we all have it to some degree.
jogs Posted November 5, 2014 Author
This is just silly. Yes, there are hands on which we can, should, and do bid games on 22 hcp (or less, haven't we all bid and made slams on 15 counts or less... admittedly usually as a save that happens to make!) but that doesn't mean that we should always be bidding major suit games on 9-card fits every time we hold a combined 22 count.
18 trumps, 20-20 HCP for each partnership. Both sides are willing to bid to the 3 level now. 3 HCP is a king, worth about a trick. Bergen tells people to bid 4M after partner opens 1M. That doesn't work when neither partner has a singleton (or void) in a 5-4 fit. If there is a singleton with the 4-card suit, yes, you should always bid 4. Also, it doesn't always make. At IMPs, you are supposed to bid every 45% game. Some bid lower-percentage games.
jogs Posted November 5, 2014 Author
On the contrary, I think we should expect more convincing proof in this case than for a medical study. It should be trivial to get huge amounts of data on the LoTT, just running random hands through a double-dummy analyser. Even with the resources of a massive multinational company, you just can't do medical trials on enough people to compete.
Bridge is a probabilistic game with high variance. Regardless of the size of your study, you cannot lower the true population variance. Large sample sizes can only improve the estimates of the means.
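A quick simulation illustrates the point: growing the sample shrinks the standard error of the mean toward zero, while the sample standard deviation keeps estimating the same, fixed population spread. The distribution parameters below are illustrative assumptions, not bridge data:

```python
import math
import random
import statistics

# Simulated "total tricks" with a fixed population spread (sigma = 1.2,
# an arbitrary illustrative value). Growing the sample shrinks the
# standard error of the mean, but the sample s.d. keeps estimating the
# same population sigma -- it does not go to zero.
random.seed(7)
sigma = 1.2
for n in (100, 10_000):
    sample = [random.gauss(18.0, sigma) for _ in range(n)]
    sd = statistics.stdev(sample)
    se = sd / math.sqrt(n)
    print(f"n={n:6d}  sample s.d.={sd:.3f}  s.e. of mean={se:.4f}")
```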
jogs Posted November 5, 2014 Author
It's not proving that LOTT is accurate (i.e. tricks = trumps). It's rejecting H0: E(tricks) = trumps. I think you can do it with a chi-square test.
H0: E(tricks) = trumps
H1: E(tricks) = c + c1·trumps + c2·trumps²
H1 represents LoTT better than H0.
yunling Posted November 6, 2014
It's not clear to me how he's calculated these numbers. Assuming the ± bit gives a confidence interval for the expected number of tricks, the greater uncertainty for the 21+ range would just be a consequence of the low sample size; it doesn't mean there is a higher variance in the number of tricks made. However, the actual numbers given make this interpretation a bit implausible.
No, it is not a confidence interval at some %; it is just the sample standard deviation. By running DD analyses, he gets a distribution of the number of total tricks given the total number of trumps. x±y means that the mean of the distribution is x and the standard deviation is y. I agree with jogs that it shows trump number plays a smaller role as it increases.
jdeegan Posted November 6, 2014
When LC published his first book, called, as I recall, To Bid or Not to Bid, Bob Hamman's comment was reportedly "Bid!". The fundamental problem is that with a blind opening lead and with two of the hands concealed, figuring out the number of total tricks in a given hand is not even achievable. Bridge is not played double dummy. Still, LOTT is a considerable aid. Later, LOTT 2.0 with adjustment factors was an improvement. Imho, IFTL is a better tool than even LOTT 2.0, but it has its limits. The main thing I got out of it was that honors in the opponents' suits are deadly in a competitive auction. Of course, this is exactly what my mentors were telling me 50 years ago when I was first learning to play. Many trumps are good. More trumps even better. "Purity" of the hand good. 1098762A109A3232 is a vastly better weak 2 opener than Q87542872K76K4. Bottom line is that Bridge is a game based on incomplete information.
campboy Posted November 6, 2014
No, it is not a confidence interval at some %; it is just the sample standard deviation. By running DD analyses, he gets a distribution of the number of total tricks given the total number of trumps. x±y means that the mean of the distribution is x and the standard deviation is y. I agree with jogs that it shows trump number plays a smaller role as it increases.
It would, if that is what he means by "x±y" and the figures are accurate. But I tried doing the same thing (with Thomas Andrews' Deal program), and got completely different values for the sample standard deviation. I used 1000 samples for each number of total trumps; I'll try again with bigger samples if I have time.

trumps  sample mean  sample s.d.
14      13.82        0.866
15      14.873       0.909
16      16.113       1.010
17      17.032       0.999
18      17.947       1.095
19      18.783       1.131
20      19.55        1.186
21      20.138       1.208
22      20.679       1.192
23      21.192       1.195
24      21.675       1.111
whereagles Posted November 6, 2014
H0: E(tricks) = trumps
H1: E(tricks) = c + c1·trumps + c2·trumps²
Actually, you can fit a non-linear regression model to Ginsberg's data and run significance tests on c, c1, c2. I might set c = 0 from the start, though. I'll try and do it this week-end... gotta read an MSc thesis right now, and see if I can dig up a few embarrassing questions to the candidate lol.
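As a sketch of that fit, here is a plain least-squares quadratic fitted to campboy's sample means from the post above (stand-in data, not Ginsberg's raw deals; Ginsberg's table would slot in the same way). No external libraries, just the normal equations:

```python
# Least-squares fit of E(tricks) = c + c1*trumps + c2*trumps**2, using
# campboy's sample means (posted above) as stand-in data -- not
# Ginsberg's raw deals. Plain normal equations, no external libraries.
trumps = list(range(14, 25))
mean_tricks = [13.82, 14.873, 16.113, 17.032, 17.947, 18.783,
               19.55, 20.138, 20.679, 21.192, 21.675]

# Build the 3x3 normal equations A @ coeffs = b for the basis (1, x, x^2).
S = [sum(x ** k for x in trumps) for k in range(5)]
A = [[S[i + j] for j in range(3)] for i in range(3)]
b = [sum(y * x ** i for x, y in zip(trumps, mean_tricks)) for i in range(3)]

# Gaussian elimination with partial pivoting on the 3x3 system.
for col in range(3):
    pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
    A[col], A[pivot] = A[pivot], A[col]
    b[col], b[pivot] = b[pivot], b[col]
    for row in range(col + 1, 3):
        f = A[row][col] / A[col][col]
        for k in range(col, 3):
            A[row][k] -= f * A[col][k]
        b[row] -= f * b[col]

# Back-substitution.
coeffs = [0.0, 0.0, 0.0]
for row in (2, 1, 0):
    s = b[row] - sum(A[row][k] * coeffs[k] for k in range(row + 1, 3))
    coeffs[row] = s / A[row][row]

c, c1, c2 = coeffs
print(f"c={c:.3f}  c1={c1:.3f}  c2={c2:.4f}")  # c2 < 0: the curve flattens
```

On these means the fitted c2 comes out negative, which is the "flattening parabola" shape the thread is arguing about.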
helene_t Posted November 6, 2014
Have Campboy and Ginsberg defined the problem in the exact same way? If a pair has two 8-card fits you could count the number of tricks they take in whichever suit gives the highest number of tricks, or you could take one suit at random. If you take the average of the two you will underestimate the variance and if you count both with full weight you will overestimate the mean. A similar issue relates to choice of declarer.
campboy Posted November 6, 2014
Have Campboy and Ginsberg defined the problem in the exact same way? If a pair has two 8-card fits you could count the number of tricks they take in whichever suit gives the highest number of tricks, or you could take one suit at random. If you take the average of the two you will underestimate the variance and if you count both with full weight you will overestimate the mean. A similar issue relates to choice of declarer.
Good question. Where there are two fits of the same length my script uses the higher-ranking suit, so essentially picks one at random. It does assume that contracts are right-sided, where relevant.
[edit] Based on the mean, it seems Ginsberg is also right-siding the contract but either picking the suit at random or averaging the two suits where there is a double fit. For 14 total trumps (where there is always a double fit), on a sample of 1000 (different to the previous sample but the same one for each calculation) I got the following means:
average over both suit and declarer: 13.763
average over suits of better declarer: 13.848
better suit and better declarer: 14.426
better suit; average of declarers: 14.341
(Ginsberg's mean: 13.85)
The mean alone can't tell me whether he is averaging over the two suits or picking a random one; the fact that his standard deviation is lower suggests the former, but making that change would only push the standard deviations in my previous post down, and the most striking difference is that the deviations for the higher numbers are already much lower than his. I also think picking a random suit is a better way to estimate the standard deviation, since that is more like what happens at the table (whereas I expect contracts are right-sided most of the time in practice).
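A toy simulation shows how much "better suit" selection alone can inflate the mean. X and Y below stand in for the trick totals of the two equal-length fits; the normal distributions are illustrative assumptions, not real-deal data:

```python
import random
import statistics

# Toy model of the selection effect: when a deal has two fits of equal
# length, scoring the *better* suit inflates the mean relative to
# picking a suit at random. X and Y stand in for the two suits'
# trick totals (distributions are illustrative, not from real deals).
random.seed(42)
pairs = [(random.gauss(13.8, 1.0), random.gauss(13.8, 1.0))
         for _ in range(20_000)]

mean_random = statistics.fmean(random.choice(p) for p in pairs)
mean_better = statistics.fmean(max(p) for p in pairs)

print(f"random suit: {mean_random:.3f}  better suit: {mean_better:.3f}")
```

With these assumptions the "better suit" mean sits roughly half a trick above the single-suit mean, the same order of gap as between the 14.4-ish and 13.8-ish figures above.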
helene_t Posted November 6, 2014
Actually, you can fit a non-linear regression model to Ginsberg's data and run significance tests on c, c1, c2. I might set c = 0 from the start, though. I'll try and do it this week-end... gotta read an MSc thesis right now, and see if I can dig up a few embarrassing questions to the candidate lol.
I would call this a linear regression model. That one of the covariates happens to be a transform of something doesn't make it nonlinear. A nonlinear model would be something that could not be rewritten as a linear model, for example E(Tricks) = a*x + b*x^c, where a, b and c are parameters to be estimated.
jdeegan Posted November 6, 2014
Big problem. When you abstract a situation to a level where you can easily model it, you can easily create a situation that is so different from the reality you were originally concerned with as to be uninteresting. For example: with two 8-card fits, if one is 4-4 and is trumps, that's 5+ possible tricks. What about the other one, a side suit? If it is also 4-4, not so hot. If it is 6-2 and solid, six tricks. Over the years I have modeled many things using statistical methods: credit card default rates, various financial markets, parts of the US economy, optimal fast food joint locations, drilling rig utilization, you name it. I have taught more courses in statistics and econometrics than I can remember. Modeling Bridge is really, really tough, and I have little confidence anyone can get very far with it. Again, there is the underlying problem that Bridge is not played double dummy.
whereagles Posted November 6, 2014
I would call this a linear regression model. That one of the covariates happens to be a transform of something doesn't make it nonlinear. A nonlinear model would be something that could not be rewritten as a linear model, for example E(Tricks) = a*x + b*x^c, where a, b and c are parameters to be estimated.
hmmm you might be right. I'd have to check my definitions. As posed, it's certainly linearized (at least).
jdeegan Posted November 6, 2014
hmmm you might be right. I'd have to check my definitions. As posed, it's certainly linearized (at least).
News flash! If your equation can be transformed into something linear, you can use least squares. You might want to consider what a transformation does to your error term, though few actually worry about it.
helene_t Posted November 6, 2014
The transformation applies to a covariate so it doesn't do anything to your error term.
jogs Posted November 6, 2014 Author
Have Campboy and Ginsberg defined the problem in the exact same way? If a pair has two 8-card fits you could count the number of tricks they take in whichever suit gives the highest number of tricks, or you could take one suit at random. If you take the average of the two you will underestimate the variance and if you count both with full weight you will overestimate the mean. A similar issue relates to choice of declarer.
From the Ginsberg TBW article, Nov 1996, page 9: "The program selected randomly among trump suits of equal length but always assumed that the contract was played from declarer's best side."
jogs Posted November 6, 2014 Author
Actually, you can fit a non-linear regression model to Ginsberg's data and run significance tests on c, c1, c2. I might set c = 0 from the start, though. I'll try and do it this week-end... gotta read an MSc thesis right now, and see if I can dig up a few embarrassing questions to the candidate lol.
Do you have a copy of the Ginsberg article? There's a chart of the data in its raw form on page 10. The tricks are occasionally +/- 4 from trumps.
campboy Posted November 6, 2014
From the Ginsberg TBW article, Nov 1996, page 9: "The program selected randomly among trump suits of equal length but always assumed that the contract was played from declarer's best side."
Thanks. The answer to Helene's question is "yes", then.