
Bidding system designed by computer



If you are going to use a genetic algorithm, why not let it determine the ranges and card counts, as an alternative to starting with a decision tree?

The problem is that there are millions of decision trees (one for each possible situation that could arise during the auction), so you can't optimize individual ranges.

 

You need some general principles that guide the construction of the decision trees. Those general principles can have parameters that can be optimized. For example, the utility of knowing that partner has 3+ cards in a major when you have four cards yourself could be a parameter.
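To illustrate in R (all names and values here are invented), the mutable "genome" could then be just a short parameter vector rather than millions of individual ranges:

# hypothetical principle parameters for a GA to optimize (values made up)
params <- c(
  util_partner_3plus_major = 0.30,  # utility of knowing partner has 3+ cards in my 4-card major
  util_known_shortage      = 0.15,  # utility of knowing about a shortage
  cost_wide_hcp_range      = -0.05  # cost per extra HCP of ambiguity in a bid's range
)
# the GA would mutate/recombine params and score the decision trees built from them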

 

what form will your cost function take?

I was thinking of using a very crude heuristic, say tricks = trumps + (hcp - 20)*0.4, where trumps is 6.5 for NT contracts, then IMPing the achieved result against PAR, or maybe just counting total points. Obviously something a lot better could be devised, but I think this is a minor issue compared with all the other problems this project will/would face.
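In R, a rough sketch of that heuristic (assuming hcp and trumps are the partnership's combined counts):

# crude trick heuristic: trumps + 0.4 tricks per combined HCP above 20
expected_tricks <- function(hcp, trumps = 6.5) {  # trumps = 6.5 for NT contracts
  trumps + (hcp - 20) * 0.4
}
expected_tricks(26, trumps = 8)  # about 10.4 tricks with an 8-card fit and 26 HCP
expected_tricks(25)              # about 8.5 tricks in NT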

 

How big is your sample size likely to be?

I dunno, what would you suggest? Say I have computer resources for 100,000 auctions, would something like 3333 generations of 30 individuals be reasonable?


I would like to point out that the suggested approach will lead to a "computer-optimized" system and not to a "computer-generated" system.

 

The parameterization is in fact the core of a system; what would be left to the computer would be to optimize some ranges.

The same can be said about fitness functions.

 

Of course computers can't generate systems yet, and I doubt that they will be able to do anything like that in the near future, so a computer-optimized system is the only realistic target to reach.

 

I would suggest starting with a reduced decision tree / sample set, e.g.

1M openings and the follow-up bidding with a fit.


- double dummy results

- actual results by experts

- something else?

Strictly speaking, single dummy. Double dummy results tend to overestimate slam potential on strong hands and underestimate partscore hands (in the former case, the declaring side has most of the top cards, is in control, and will always drop/finesse/squeeze correctly; in the latter case the defence has enough cards to do some damage and will always get off to the right lead, etc.).


Hi Helene

 

Haven't given all that much thought to this; however, this sounds a lot more like a genetic algorithm than a decision tree...

 

From my perspective, the right starting point is defining how you plan to measure fitness, after which you can start considering initial populations, followed by mutations and recombinations.


Hi Richard,

 

I think those issues you mention are the relatively easy ones.

 

What I find difficult is the "physiology" of the bidding system: how to map its "genes" (mutable characteristics) to decision-making in concrete situations.

 

When I design conventions and systems "by hand", I do it quite differently from what I have sketched above. The manual method is simpler in that I will assume certain symmetries (for example, the follow-ups after a natural 1♥ opening are similar to those after a 1♠ opening). But it is more complex in that I will not sequentially optimize the meaning of the first step, then the next step, etc., nor will I separate the informational part (which information to convey) from the semantic part (how to encode that information).

 

I wonder if it would be better to let the computer-based system mimic the manual procedure more closely.


Without thinking this through to the end, I suggest something like this:

 

The way I understand this, the "DNA" needs to be the full decision tree, i.e. the whole set of bidding sequences.

So you automatically generate a starting generation of decision trees that contain (random) combinations of limit raises, preemptive raises, strong raises, Bergen, Jacoby, Mini-Splinter, Splinter, ..., passing if a termination rule is true.

For each decision tree, generate several individuals with different HCP / suit-length intervals for each decision.

 

As fitness function, take the percentage of test-set boards for which difference_in_total_points(decision-tree result, par result) < (something like 199).
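As a rough R sketch (the helper names and the threshold are placeholders):

# fitness = fraction of test-set boards whose total-point result is close to par
fitness <- function(tree_points, par_points, threshold = 199) {
  mean(abs(tree_points - par_points) < threshold)
}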

 

To build the new generation, take the average/combined ranges of identical decisions in the tree. (A problem that can/will occur is that the decision tree will not have disjoint decisions!)

Or perhaps transfer complete nodes.


- double dummy results

    - actual results by experts

    - something else?

Strictly speaking, single dummy. Double dummy results tend to overestimate slam potential on strong hands and underestimate partscore hands (in the former case, the declaring side has most of the top cards, is in control, and will always drop/finesse/squeeze correctly; in the latter case the defence has enough cards to do some damage and will always get off to the right lead, etc.).

Is there a database containing hands and single dummy results available?


If you are going to use a genetic algorithm, why not let it determine the ranges and card counts, as an alternative to starting with a decision tree?

The problem is that there are millions of decision trees (one for each possible situation that could arise during the auction), so you can't optimize individual ranges.

 

You need some general principles that guide the construction of the decision trees. Those general principles can have parameters that can be optimized. For example, the utility of knowing that partner has 3+ cards in a major when you have four cards yourself could be a parameter.

 

What if the problem were separated into a number of developmental stages? For example, instead of looking at HCP and distribution, why not initially look at just the distribution (ignoring HCP), to ensure that the algorithm can find the optimal suit fit or determine whether no trumps is the better option.

 

This may result in a set of possible solutions that could be used as a basis for the genetic algorithm when both HCP and distribution are taken into account.


I don't know squat about programming, but as a human I have been down the evolutionary road in system design, wherein marginal changes are trialled in order to ascertain whether the result is superior. This approach suffers from one major drawback: you might conclude that you have optimised the system at the point when you are unable to make a small change that provides an overall benefit. The problem is that the changes may not be large enough to identify a benefit that could be available by making a much larger or more fundamental change.

 

Visualise your achievement in system design as graphically presented on the face of a mountain. The higher up the mountain you climb, the better the system. Being blind, you feel about with your foot and notice that (say) travelling North leads uphill, so you move in that direction until travelling in any direction leads downhill. You conclude that you have reached the summit, and indeed so you have. But then if you take off the blindfold, in the distance you see a higher mountain altogether. If your baby steps had been a couple of miles apart you might have detected it. Translating this effect into a self-programming computer sounds to me like a monumental task.


Yeah, your sample size is too small. And if you try the new system out for years to obtain a good sample size, you will get confounding due to changing fields, changing partners, or improvement of your own game irrespective of system.

 

My feeling is that to make sense of real-life experience with bidding methods, you need to assess not only how often it gave good results / bad results, but also why it gave those good results. Most good results may turn out not to be evidence for the strength of the system but could have been obtained with the reference system as well. Or maybe a flaw in the system took you to an inferior spot which happened to give a good result due to an unlikely split. But I may be wrong about this, since it is very hard to make all these assessments objectively. Probably the only way to avoid an emotional bias is simply to compare the MPs or IMPs you get with one system to what you get with the other system.

 

However, experience can be combined with theoretical assessment. Say you change to a strong club system and from a theoretical POV there could be an issue with opps preempting over your strong club. You can identify some situations that are likely to take you to a bad spot, find out how often they come up either by back-of-the-envelope probability calculations or by simulations, and then inspect simulation results manually to get an idea of how often you would actually end up in a wrong spot.

 

All this knowledge can be augmented by your experience when playing the system at the club: maybe we have a great defense against enemy preempts in theory, but in practice it often leads to bad results because we have difficulties dealing with opps with unknown preempt style. Maybe lots of boards that should have given us a bad result give us a good result because opps preempt too much, too little, or are not on the same wavelength.

 

And then there are issues that can hardly be dealt with theoretically. Such as: Do we feel comfortable playing this new style? Maybe it turns out that having to memorize complex conventions and explain them to annoyed opps takes too much focus away from the really important things, like judgement.


- double dummy results

    - actual results by experts

    - something else?

Strictly speaking, single dummy. Double dummy results tend to overestimate slam potential on strong hands and underestimate partscore hands (in the former case, the declaring side has most of the top cards, is in control, and will always drop/finesse/squeeze correctly; in the latter case the defence has enough cards to do some damage and will always get off to the right lead, etc.).

Is there a database containing hands and single dummy results available?

There are a couple of articles here (linked about two-thirds of the way down the page) about the perils of both double- and single-dummy data.

 

I'd suggest (could be wrong) that to get a worthwhile database of single-dummy data, you need to generate a large quantity of results (a six-figure number of hands at least) from relatively decent bots playing against one another.

 

Dunno exactly how you do that - the organisers of the computer world championships specify a standard interface so that bots can communicate over a network - and the makers of said bots ought, in theory, to be interested in helping to produce such a database (if they don't have one already!) provided they were given a copy of it - I would have thought...

 

Nick


Just making a little experiment with a neural network instead, to get something running quickly.

 

After 300 generations it still bids horribly inefficiently. Decisions to leave in partner's double are generally reasonable, but the constructive bidding is ridiculous, and they have a bad habit of redoubling everything without partner rescuing.

 

They almost never open above the one level, and you end up with bizarre systems like

pass=basically 0-7 or 18+

1=8-14 points with 4-6 clubs and either 1- hearts or exactly 4 hearts.

Maybe something like that could be effective if the follow-ups worked. Would be interesting to see.

 

Now I'm having a go with 100,000 generations; I wonder if something sensible will materialize. I am hoping for something really bizarre with many multi-bids, forcing passes, etc. :P


pass=basically 0-7 or 18+

That's also featured in human-designed systems.

 

1=8-14 points with 4-6 clubs and either 1- hearts or exactly 4 hearts.

Even that is vaguely similar to the TRS major-suit openings.

It would be highly amusing if the neural network produced something akin to the Vulcan Variable Pass.


pass=basically 0-7 or 18+

That's also featured in human-designed systems.

 

1=8-14 points with 4-6 clubs and either 1- hearts or exactly 4 hearts.

Even that is vaguely similar to the TRS major-suit openings.

It would be highly amusing if the neural network produced something akin to the Vulcan Variable Pass.

Actually I played that system, and I would again, but it is illegal pretty much everywhere.


This doesn't work. Lots of deals are passed out, and lots end up in nonsense five-level contracts. There seems to be very little association between the hands and the calls they make, except that in 3rd/4th seat a pass tends to be weaker than an opening bid. Yet even that is not certain: it will sometimes pass with 25 HCP in 3rd seat.

 

I must be doing something wrong. Now trying with uncontested auctions only, which may be simpler. Part of the problem with contested auctions is that there is a lot of noise in the scores, as you get rewarded for your opponents' foolishness.

 

Neural network: A system comprises two different networks, one for 1st/2nd-hand bidding and one for 3rd/4th-hand bidding. The synapses that connect the hidden layer to the output layer are shared, though.

Input layer: binary indicators for whether each legal call has been made or not, for example one indicator for the event that LHO made a particular bid at his first turn, one for the event that I bid 1NT at my second turn, etc. Also 5 HCP indicators, and 4 suit-length indicators for each suit, based on 5 (resp. 4) different inverse-logistic functions with a range of 0 to 100 instead of 0 to 1. This is because my experience was that without that amplification, bidding decisions would initially (with random synapse weights) be insensitive to the hand. In the end this doesn't matter, as synapse weights can adjust to achieve the needed amplification; I just made that decision to get some variance to select on in the beginning.

Also, I have amplified the output neurons by a factor of 1.3^k, where k is the step (0 for pass, 1 for double, etc.). I found 1.3 to give a reasonable degree of aggressiveness in the initial networks. Without such an amplification, most auctions would initially end up in 5xx.

20 neurons in the hidden layer.
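In R, the two amplifications look roughly like this (my rendering; the centres of the logistic steps are invented):

# hand features: logistic steps rescaled to 0..100 instead of 0..1
hcp_feature <- function(hcp, centre) 100 / (1 + exp(-(hcp - centre)))
sapply(c(6, 10, 13, 16, 20), function(ctr) hcp_feature(12, ctr))  # five HCP indicators

# output amplification: neuron k (0 = pass, 1 = dbl, 2 = rdbl, ...) scaled by 1.3^k
amplify_outputs <- function(raw) raw * 1.3^(seq_along(raw) - 1)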

 

Evolution: Each generation is a single-board round where each system supplies one EW pair and one NS pair. Results are based on a simple heuristic that predicts the number of tricks probabilistically from the number of HCP and the number of trumps, and then computes mean rewards for the contracts reached.

Results are matchpointed, and fitness is updated as 0.1*matchpoints + 0.9*previous fitness, where the previous fitness is set to the mean of the field for new systems entering the population.

After each generation, the 2N (N may be 10 or 20 or so) worst systems are killed and replaced with:

- Mutant copies of the N best systems, in which each synapse gets a Gaussian mutation with a level of 1/400 times the variance in the seeding population.

- Hybrids between the best system and the 2nd..(N+1)th best, in which half the synapse weights are inherited from each parent.
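In R, one generation's update is roughly this (a sketch: the population is a list of synapse-weight vectors, and mut_sd stands in for the 1/400-of-seeding-variance mutation level):

update_fitness <- function(prev, matchpoints) 0.1 * matchpoints + 0.9 * prev

next_generation <- function(pop, fitness, N = 10, mut_sd = 0.05) {
  ord  <- order(fitness, decreasing = TRUE)
  best <- pop[ord[1:N]]
  # mutant copies of the N best: Gaussian noise on every synapse
  mutants <- lapply(best, function(w) w + rnorm(length(w), sd = mut_sd))
  # hybrids of the best with the 2nd..(N+1)th best: half the weights from each
  top <- pop[[ord[1]]]
  hybrids <- lapply(pop[ord[2:(N + 1)]], function(w) {
    pick <- runif(length(w)) < 0.5
    ifelse(pick, top, w)
  })
  # the 2N worst are killed and replaced
  c(pop[ord[1:(length(pop) - 2 * N)]], mutants, hybrids)
}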


This doesn't work...

 

...20 neurons in the hidden layer...

20 neurons in the middle layer is minuscule.

 

300 generations is minuscule for evolving something as complex as a bidding system.

 

One of the most successful applications of neural nets to game programming is backgammon. I believe (iirc) the best open-source neural-net program is GNU Backgammon - I think it has 128 neurons in the middle layer - further, I think it has 2, maybe 3, different neural nets (for different situations) plus a bear-off database. Its training dataset is in the order of millions of positions - I think.

 

I don't want to put a downer on what you're doing - coz I find it extremely interesting - but you're at the stage where you're playing with the problem.

 

If I've understood you correctly, you're presenting the bidding so far and the hand to the net and asking it what the best call is. Another, simpler way of doing it is presenting it with the same input plus one alternative call at a time and asking it (via one single output) how it rates each theoretically possible call. This saves the net having to work out that illegal calls should have a low score, coz you never present illegal calls to the network in the first place - the business of deciding what is legal or not can stay in a procedural part of the program.
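In R-ish pseudocode (all names invented), that loop might look like:

all_calls   <- c("pass", "dbl", "rdbl",
                 paste0(rep(1:7, each = 5), c("C", "D", "H", "S", "NT")))
encode_call <- function(call) as.numeric(all_calls == call)  # one-hot encoding
rate_call   <- function(w, x) sum(w * x)  # stand-in for the net's single output
choose_call <- function(w, base_input, legal_calls) {
  scores <- vapply(legal_calls,
                   function(cl) rate_call(w, c(base_input, encode_call(cl))),
                   numeric(1))
  legal_calls[which.max(scores)]  # only legal calls were ever presented
}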

 

Also - do you need to tell it what the hcp is - let it learn what cards are good - ok it blows up your input layer - so what.

 

Also - not sure that you want to concentrate on constructive auctions only - unless you have two different nets, one for the constructive auction and one for where there is interference. I am reminded of the case of the US army training a neural net to recognise friendly and Soviet tanks - in testing it performed wonderfully - with live data it was awful - it turned out that the pictures of US tanks in the training data had all been taken on a fine day and the Soviet ones on an overcast day. In other words, they had produced a neural net that was very good at recognising what the weather was like!!!

 

Nick


Thanks, Nick. I will try with more neurons in the middle layer. I think this is more complex than backgammon, so I might need several hundred.

 

In the meantime I have done 100,000 generations with constructive auctions. It is not very good, but at least better than random. 1000 deals, results compared to PAR:

            Neural Network
PAR        PO    PS     G
PO          3    71    88
PS          2   135   278
G           2   157   264

(Rows are the PAR result, columns what the network bid: PO = passout, PS = partscore, G = game. For example, on 71 boards PAR was a passout but it bid a partscore; on 264 boards PAR was game and it bid game.) With respect to denomination it is only slightly better than random.

 

the business of deciding what is legal or not can stay in a procedural part of the program.
It is, sorry if I didn't make this clear.

 

Also - do you need to tell it what the hcp is - let it learn what cards are good - ok it blows up your input layer - so what.

Yeah, but it already appears to be too difficult, so I don't want to add to the complexity.

 

Maybe it would make sense first to train one neural network to predict rewards looking at both hands, and then somehow use that network when constructing networks for bidding.

 

not sure that you want to concentrate on constructive auctions only

Yes - conceptually, contested auctions are simpler because you don't have the issue of which hands to include: you just include all of them. Also, in terms of programming and computing time it doesn't make much of a difference. It's just that the results from the contested auctions were pretty useless, so I thought maybe it would be easier if EW were made silent.

 

Of the final population (100 systems), here is a randomly chosen system's 3rd seat opening scheme (the initial pass could be anything):

Pass: 0-8 points, 5- spades

1♣: could be anything

1♦: 6-18 points, 4- diamonds, 5- hearts

1♥: 7-8 points, 2-4 clubs, 3- diamonds, 5-6 hearts, 3-4 spades

1♠: 5-19 points, 2+ clubs, 5- diamonds, 1-6 spades

1NT: 5+ points, 5- clubs, 2+ spades

2♣: could be anything

2♦: 5+ points

2♥: 10+ points, 3-suited short in spades

 

Obviously there is a lot of overlap here, so apparently most openings (except for the well-defined 1♥ and 2♥ openings) must be some kind of multi-bids. Not sure how to make sense of those - any ideas? I tried a classification-tree algorithm, leading to

Pass: 0-6 HCPs, 3+ clubs, 4+ hearts

1♣: 0-9 HCPs, 3+ clubs, 4+ diamonds
or 10+ HCPs, 3- clubs, 2- spades
or 10+ HCPs, 4+ clubs, 1- spades

1♦: 10+ HCPs, 5+ clubs, 4+ spades

1♥: (undefined)

1♠: 7-9 HCPs, 3+ clubs, 5+ hearts, 2 spades
or 10-12 HCPs, 4+ clubs, 5+ hearts, 2-3 spades

1NT: 10+ HCPs, 2- clubs, 3+ spades
or 12+ HCPs, 4+ clubs, 2-3 spades

2♣: 0-6 HCPs, 3+ clubs, 3- hearts
or 0-6 HCPs, 3+ clubs, 2- spades
or 0-9 points, 2- clubs, 3- diamonds
or 10-11 HCPs, 4+ clubs, 2-3 spades

2♦: 7-9 HCPs, 3+ clubs, 3+ spades
or 10 HCPs, 3 clubs, 3+ spades
or 10+ HCPs, 4+ clubs, 4+ spades

 

Sorry to bother you with negative findings; this obviously doesn't work.

 

I'll try with some more hidden neurons. Any other ideas are very welcome. Also, I am happy to share code if anyone wants to have a go.


...Any other ideas are very welcome. Also, I am happy to share code if anyone wants to have a go.

Did you say what you've written it in? I can do Java and C - suppose I could figure out C++ if you've done it in that - I *might* have some time. Anyway, the email is:

nickwarren at tinyworld dot co dot uk if you think some help would be useful.

 

Nick


Sorry to bother you with negative findings; this obviously doesn't work.

 

I'll try with some more hidden neurons. Any other ideas are very welcome. Also, I am happy to share code if anyone wants to have a go.

Very interesting, and worth trying. Do you have any results for the overcalls or responses to the openings?

 

The negative results could also give insight into how bidding systems are developed. I would be interested in them.

 

I am also interested in the code.

 

In your notation, what is the difference between 3+ and 3- ?


3+ means three or more, 3- three or less.

 

I write in R. Maybe I should change to Java or C++.

 

Cascade suggested making random calls and then, as the experience accumulates, choosing the calls that have led to good results in the past with similar hands in similar situations.

 

I thought of something else. It would rely on a reference DB, which is simply a list of some 1000 hands (single hands, not deals) that broadly covers the HCP/distribution combinations that can come up. Rather than letting the neural network use the auction so far as input, I would let it use the subset of the reference DB that partner can have, as well as the subset of the reference DB that it could have itself. This would lead to a two-stage neural network:

 

The secondary stage uses the HCPs and suit lengths of the hand, plus the subset of hands in the reference DB that partner could have, plus the subset it could have itself, as input, and legal calls as output. This would lead to some 23 output neurons and (with a reference DB of size 1000) 2000+ input neurons, so 100,000+ synapses with a hidden layer of size 50.

 

The primary stage would take the same subsets of the reference DB as input, plus some essential information about the auction (what was the last bid, by which side, is it doubled/redoubled, does partner get another turn if I pass), and it would produce the synapse weights for the secondary stage as output. So 2000+ inputs and 100,000+ outputs.

 

This will be very computationally expensive, not only because of the size of the primary-stage network, but also because the subsets of the reference DB have to be constructed all the time.

 

But the advantage would be that the secondary stage does not have to rely explicitly on the auction (which it won't know how to interpret initially; also, the interpretation of a given auction changes in the course of the evolution, so what was optimal in the past may not be optimal now). Rather, it relies on the information conveyed by partner, and by itself, encoded as subsets of the reference DB.
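Roughly, in R (the random subsets are placeholders for the real consistency checks):

n_ref <- 1000
hand_features  <- runif(5)               # own HCP and four suit lengths, rescaled
partner_subset <- rbinom(n_ref, 1, 0.3)  # which reference hands partner can still hold
own_subset     <- rbinom(n_ref, 1, 0.3)  # which reference hands fit my own calls so far
input <- c(hand_features, partner_subset, own_subset)
length(input)                            # 2005: the 2000+ secondary-stage inputs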


I've made a very small start on writing a computer program that would build a bidding system. The idea is to start with something very simple (preferably nothing) and to apply evolution to it.
AFAIR, Matthew Ginsberg's bridge program was intended to use a bidding system optimised for computers.
At every stage we make many small changes to the existing systems, and then let the new programs bid against each other. The programs that do well multiply, those that do badly die off, and again small changes are made to the new programs.
Claude Shannon's chess program and Arthur Samuel's checkers program altered the weights of assessment criteria by playing programs against each other, using genetic algorithms. AFAIR, they discovered that if you make only small changes, the evolution may suffer from a plateau effect: you find the summit of a local hill while ignoring a neighboring mountain.

 

In theory, a computer program has access to a large database and never suffers from memory problems, so it has no need for simple or consistent treatments (i.e. "system on"). In practice, the onus of disclosure limits the amount of complexity. If it has to compete in the real world, it may also have to conform to illogical and constantly changing local system regulations.


OK. Brain dump...

 

1. I think hrothgar's comments about termination conditions are important. A useful bot has to either have (be programmed with) or be able to evolve some criteria for when to stop, at what level, and in what denomination.

My experience in doing this is somewhat double dummy. Some years ago I corresponded with Zar Petkov quite a bit and was both somewhat inspired by and suspicious of his results, so I decided to check it out in a practical way - but I also realised that doing this by real-life experience would be inadequate (especially as I wasn't playing any real-life bridge at the time!). So what I did was build a db of double dummy results. One bot played perfectly - it had access to the double dummy results - my only concession to real life being that it would only double you if you were two off (the other, test bot also had access to the doubling function). Naturally the double dummy bot always won, but the test bot could lose by less depending on how good it was.

The test bot had access to both its side's hands and had ways of counting points (by whatever point-count system I cared to use): adding extras for shape and longer fits; deducting points for 7-card fits, shortages, and stiff and doubleton honours; a different point-counting algorithm for assessing prospects in NT was allowed as well. It also had an array of the numbers of points equivalent to bidding to a certain level. Both the algorithms and the all-important arrays (there was one for NT, one for major contracts and one for minors*) were all tweaked by me - though there is no reason why you couldn't have a program do its own tweaking.

 

(*You might ask why different arrays were required for majors and minors - surely the trick-taking power of a hand is independent of some arbitrary classification of suits into categories. Well yes - but the scoring system isn't. I was scoring boards by IMPs, and it turns out that the bot, just like in real life, needed to push for major games and tended to steer clear of 5M and 4m contracts, whereas some push towards 5m is desirable. Anyway - it turned out that 6-4-2-1 was a perfectly good way to program a bot to bid suit contracts [it beat 4-3-2-1 hands down, so Petkov was on the right track] - but it was for the birds when it came to NT.)
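In R-ish pseudocode, the array lookup was something like this (the thresholds here are invented, not the ones I actually used):

nt_array    <- c(19, 23, 25, 29, 33, 37)  # min points to bid to the 1..6 level in NT
major_array <- c(16, 19, 22, 26, 31, 35)  # ditto for a major
minor_array <- c(16, 19, 22, 27, 31, 35)  # ditto for a minor
safe_level  <- function(points, arr) sum(points >= arr)
safe_level(26, major_array)               # 4: worth pushing to game in a major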

 

Naturally, the above is not of much direct use to you, in that you're doing this single dummy so to speak and trying to evolve a system - which I certainly wasn't doing, merely testing hand-evaluation criteria. Nevertheless, an auction has to be able to stop - and stop at a sensible point. Also, for the results to be useful to a human - or even human-interpretable - one needs bid definitions that show something, or show a range of options. Further, I know it is getting a little bit ahead of ourselves at this stage, but there is the matter of disclosure to the opps as well - this also needs some way of communicating what your bids show.

 

2. I understand the matter of evolving a decision tree, which you and others alluded to earlier in the thread - even for uncontested auctions the tree would need to be huge. Possibly that is where things inevitably have to end up, but it is a gargantuan ask for a bot to evolve a system from scratch - or so it seems to me.

 

3. Your (it seems to me) compromise idea above, about starting with a database of hands, seems a possible way to go - the database is of manageable size, yet its use gives partner access to a subset of hands that would match a certain sequence of bids. 1000 hands is plenty big enough - I think - for openings and first responses, but may start to look very small by the 3rd round of bidding... So it possibly needs to be bigger.

 

4. Binary indexes take some time to build and occupy quite a bit of memory once you have a lot of them, but they can be enormously quick for searching - just AND the indexes together to get references to the hands that match - that is, of course, if you have a definition of what a bid shows!
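For example, in R (placeholders throughout):

n_hands  <- 100000
idx_open <- rbinom(n_hands, 1, 0.10) == 1  # hands matching the opening bid's definition
idx_resp <- rbinom(n_hands, 1, 0.20) == 1  # hands matching the response's definition
which(idx_open & idx_resp)                 # hands consistent with the whole sequence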

 

5. A solution doesn't need to be entirely neural-net based. I've followed the evolution of the bots that the University of Alberta has been producing over the last decade or so to play Holdem. One of their earlier successes, Pokibot, (iirc) had a neural net - and other algorithms - with actions determined by a voting system among the algorithms. Years on, Poki is still one of the best bots there is for the multi-player limit game (they've made much more sophisticated ones for the heads-up game, but that is such a restricted subset of the larger problem).

 

.... The Mrs - ye gods - has put the footie on the goggle box ... can't concentrate anymore.... the Seth Efricans have scored against the French - Yes!!!!!! And a Froggie has been sent off!!!!

 

Nick

