Leaderboard
Popular Content
Showing content with the highest reputation on 06/24/2022 in Posts
-
With yesterday's and today's Supreme Court decisions, we're seeing just how much the US has been Trumped for decades to come (with a little help from Mitch McConnell).
3 points
-
It is extremely well-known and has been stated many times over (by myself included) that GIB works by performing a very basic algorithm:

1) Generate a bunch of deals roughly consistent with the auction.
2) Calculate the double dummy result for each card you can play.
3) Pick the card that averages the best score.

(On top of this, when advanced GIB is declaring, there is a single-dummy algorithm that kicks in after trick 2 that tries to make a plan to avoid putting off guesses. But this is not relevant to defending, or the cases discussed in this post.)

There is of course a lot of complexity in step 1 - how does it find hands that match the auction? What if the auction is impossible, or so rare it can't be simulated? If it allows for some variation - as has been stated by barmar in the past - how does it know if a deal is 'close' to matching? But steps 2 and 3 are trivial.

5 years ago gwnn posted a hand where GIB plays a card at trick 10 that is guaranteed to be strictly worse than any other card, if any nonzero number of hands is simulated:
[hv=http://www.bridgebase.com/tools/handviewer.html?sn=gwnn&s=SAT4HK964DAKC9872&wn=Roboter&w=S983HAJT75DJ4CKT5&nn=Roboter&n=SKQJ65HQ82DQ852CA&en=Roboter&e=S72H3DT9763CQJ643&d=s&v=b&b=7&a=1N(notrump%20opener.%20Could%20have%205M.%20--%202-5%20%21C)P2H!(Jacoby%20transfer%20--%205+%20%21S)P2S(Transfer%20completed%20to%20S%20--%202-5%20%21C%3B%202-5%20%21)P3D(New%20suit%20--%204+%20%21D%3B%205+%20%21S%3B%2010+%20total%20points)P3S(Support%203rd%20S.%20No%204th%20D%20--%202-5%20%21C%3B%202-3%20%21)P4C(Cue%20bid%20--%204+%20%21D%3B%205+%20%21S%3B%20%21CA%3B%2013+%20total%20points)P4S(2-5%20%21C%3B%202-3%20%21D%3B%202-5%20%21H%3B%203%20%21S%3B%2017%20HCP%3B%2018)P4N(Blackwood%20%5BS%5D%20--%204+%20%21D%3B%205+%20%21S%3B%20%21CA%3B%2015+%20total%20points)P5H(Two%20or%20five%20key%20cards%3B%20no%20queen%20--%202-5%20%21)D(6+%20HCP%3B%20rebiddable%20%21H%3B%20%21HKQ%3B%2020-%20total%20points)6S(4+%20%21D%3B%205+%20%21S%3B%20%21CA%3B%2015+%20total%20points)PPP&p=S3S5S2S4CAC3C2C5D2D3DAD4DKDJD5D7C7CTS6C6D8D9SAH5STS8SKS7SQC4C8S9DQDTC9H7H2H3HKHJH4HTHQD6H8CJH6HACKSJCQH9]400|300[/hv]

That blew the usual 'maybe you just got a very very unlucky set of sims' excuse out the window. It has been bugging me ever since - to be honest, it's very rare that a week has gone past over the last several years where I don't think "gah, I wish I knew what GIB was doing". While my attempts to get access to the source code have failed, I can finally announce I know why it made (and continues to make) these mistakes. And that is because, rather shockingly, GIB does not perform step 2 as everyone believes it does.

--

A couple of weeks ago, Lorand Dali posted about his new AI bridge project. Very interesting stuff and worth a read. I was slightly disappointed to find out it was entirely reliant on having a pre-existing robot - learning to bid from a huge sample of hands generated by the GIB robot.
Until I discovered how he generated the hands - not via online GIB hand records as I had expected, but by piping them into the bridge.exe program that comes free with the Windows downloadable version of BBO.

Wait, there's a free command line version of GIB? Yes - though of course, it's a version from 2012, so extremely out of date in terms of the bidding. If you think GIB has bidding flaws now, BBO did an amazing job of improving it from where it started while they were still working on it.

How does this help? Because Matt Ginsberg added some debugging flags, documented in an archived version of his website. While the BBO version appears to have been altered somewhat, with some of the flags not working and some workarounds needed, there's still one available which outputs a trace of all of the simulated hands, how GIB scored them, and how that averaged out to its choice of play.

For example, when trying to decide what to lead to trick 1 in gwnn's example hand, it generates 100 hands (this was another flag; I used 100, but BBO will use even fewer) and displays each of them in this format:

deal 76:
              S A Q 7 5 4
              H 9
              D A K 7 2
              C J 7 6
S 9 8 3                     S J T
H A J T 7 5                 H K 6 4 3 2
D J 4                       D Q T 6 5
C K T 5                     C Q 4
              S K 6 2
              H Q 8
              D 9 8 3
              C A 9 8 3 2
West to lead; S trumps
mismatch 32.00
CK: 100 CT: 100 C5: 200 DJ: 300 D4: 300 HA: 200 HJ: 200 H7: 200 H5: 200 S9: 200 S3: 200

The first bunch of deals don't include the 'mismatch' line - the last group has increasing mismatch scores, which presumably means it is widening the range of hands it considers acceptable to include in the simulation.
And then at the end, a conclusion that averages the double dummy results over all of the deals:

S3: -24.70 -> 2.45
DJ: -27.70 -> 2.34
S9: -24.70 -> 2.15
D4: -26.70 -> 2.07
HA: -119.20 -> 0.95
HJ: -270.80 -> -0.83
C5: -275.70 -> -0.99
H5: -289.10 -> -1.10
H7: -289.10 -> -1.10
CT: -280.70 -> -1.45
CK: -339.70 -> -2.39
I play S3

I expect the DJ got a slight boost due to signalling, but so far it's all making sense. There's just one small catch. Some of the earlier deals in the set have question marks after some of the double dummy results:

deal 0:
              S A Q T 7 2
              H 4 2
              D Q 9 3
              C 9 7 2
S 9 8 3                     S 5
H A J T 7 5                 H 9 6 3
D J 4                       D K T 8 7 6 2
C K T 5                     C A J 6
              S K J 6 4
              H K Q 8
              D A 5
              C Q 8 4 3
West to lead; S trumps
CK: 300? CT: 400? C5: 400? DJ: 400? D4: 400? HA: 300? HJ: 400? H7: 400? H5: 400? S9: 400? S3: 400?

In this case, the first 32 deals have ? after all results, and the remaining 68 have none - though on other occasions, some deals have ? for some cards and not for others. And most importantly - some of the scores with ? are incorrect.

Look at what happens when we get to the crucial card at trick 10 in gwnn's case. The first simulated deal:

deal 0:
              S J
              H Q 8
              D ---
              C ---
S ---                       S ---
H A J T                     H 4
D ---                       D 6
C K                         C Q
              S ---
              H 9 6
              D ---
              C J
West to play to H2, H3, HK; S trumps
N/S have taken 9 tricks
HA: -1430? HJ: 100?

The question-marked figures say that playing the heart Ace will allow the slam to make - and the J will cause it to go down! In fact, the first 44 simulated hands all have the same conclusion. On deal #44 (0-indexed!), it gets it right for the first time:

deal 44:
              S J
              H Q 8
              D ---
              C ---
S ---                       S ---
H A J T                     H 9
D ---                       D 6
C K                         C Q
              S ---
              H 6 4
              D ---
              C J
West to play to H2, H3, HK; S trumps
N/S have taken 9 tricks
mismatch 16.00
HA: 100 HJ: -1430

Note that there is nothing special about this deal that separates it from the others - the exact same hand with East left holding 9-6-Q appeared several times in the first 44.
On deal 45 it's also correct, but on 8 of the next 19 hands it has the incorrect figures, after which all the others are correct. So as the final result, on 52 of the 100 hands, it believes ducking is required to beat the contract - when it isn't true once. When it combines 52*100 and 48*-1430, you get -63440 - which, divided by the 100 deals, is what it provides as its final output:

HJ: -634.40 -> 0.68
HA: -695.60 -> -0.68
I play HJ

Oops.

I took a second example, posted by bixby a few months ago, where GIB throws away its high card on trick 10 in a no-win, rarely-tie, mostly-lose scenario.

[hv=https://www.bridgebase.com/tools/handviewer.html?lin=st||pn|bixby,~Mwest,~Mnorth,~Meast|md|2SAQ62H53DQTCAJT92,SJ854HDKJ7654C764,SKT9HAQ76DA832C53,S73HKJT9842D9CKQ8|sv|b|rh||ah|Board%204|mb|P|mb|1D|an|Minor%20suit%20opening%20--%203+%20!D;%2011-21%20HCP;%2012-22%20total%20points|mb|2H|an|Aggressive%20weak%20jump%20overcall%20--%206+%20!H;%204-10%20HCP|mb|D|an|Negative%20double%20--%204+%20!S;%207+%20HCP;%208+%20total%20points|mb|P|mb|2N|an|4+%20!D;%203-%20!S;%2011-14%20HCP;%2012+%20total%20points;%20stop%20in%20!H|mb|P|mb|3N|an|5-%20!H;%204-5%20!S;%2014-21%20HCP|mb|P|mb|P|mb|P|pc|S7|pc|S2|pc|SJ|pc|SK|pc|C5|pc|CQ|pc|CA|pc|C7|pc|CJ|pc|C6|pc|C3|pc|CK|pc|H4|pc|H3|pc|D7|pc|H6|pc|ST|pc|S3|pc|S6|pc|S8|pc|DA|pc|D9|pc|DT|pc|D4|pc|S9|pc|H2|pc|SQ|pc|S5|pc|SA|pc|S4|pc|H7|pc|C8|pc|CT|pc|C4|pc|D2|pc|HJ|pc|C9|pc|D6|pc|D3|pc|HK|mc|12|]400|300[/hv]

Is this because it was unlucky and every hand it simulated resulted in the equals case? On my run, it found the equals case just 5 times - no question marks:

HK: -690
HT: -690

On 19 occasions, it had a definitive value for the heart T, but thought - with a question mark - that throwing the king would get a *better* score:

HK: -630?
HT: -660

On the other 76, it came up with the right values:

HK: -690
HT: -660

In this case, that was enough to weight it towards making the correct play (it chooses the 8 among equals after the analysis):

HT: -661.50 -> 0.57
HK: -678.60 -> -0.57
I play H8

But given it's capable of including completely wrong double dummy analysis scores in its calculations, it's no longer surprising that with a smaller / different set of hands the incorrect ones could end up biasing the results enough to play the wrong card.

--

Note that it's quite possible that BBO have improved the play engine of GIB since v21, which is the one tested here, though all reports have been that they haven't touched it other than forcing it to lead an ace against 7NT.

Conclusion: I don't know why GIB's "double dummy" analysis causes it to give correct scores for some cards, and incorrect scores for others. Clearly, this is a deliberate part of the program, given that it is marking potentially wrong figures with a question mark (they're not all incorrect) - not that it is intentionally making mistakes, but I assume it is running some sort of optimization that speeds up the double dummy calculations rather than guaranteeing correct output. If this is required for some reason, why it does not at least switch to guaranteed correct output later in the play, when this should be fast, I also don't know. But at least I know more than I did.
1 point
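The averaging figures quoted in the post above can be reproduced with a few lines of Python. This is only a sketch of step 3 (average the per-deal scores, pick the best card); it takes the per-deal double dummy scores as given, and the names `average_scores` and `pick_card` and the grouped input format are made up for illustration. It is not GIB's code.

```python
# Sketch of "step 3" only: average each card's score over the simulated
# deals and pick the card with the best average. Names and data layout
# are hypothetical; the numbers are the ones quoted in the post.

def average_scores(groups):
    """groups: list of (number_of_deals, {card: score}) -> {card: mean score}."""
    total_deals = sum(count for count, _ in groups)
    sums = {}
    for count, scores in groups:
        for card, score in scores.items():
            sums[card] = sums.get(card, 0) + count * score
    return {card: total / total_deals for card, total in sums.items()}

def pick_card(groups):
    """Return the card with the highest average score."""
    averages = average_scores(groups)
    return max(averages, key=averages.get)

# gwnn's trick 10: 52 deals carried the wrong figures, 48 the right ones.
gwnn_trace = [
    (52, {"HA": -1430, "HJ": 100}),
    (48, {"HA": 100, "HJ": -1430}),
]
# The same position if every deal had been scored correctly.
gwnn_correct = [(100, {"HA": 100, "HJ": -1430})]

# bixby's trick 10: 5 ties, 19 deals with a wrong "?" score for HK, 76 correct.
bixby_trace = [
    (5, {"HK": -690, "HT": -690}),
    (19, {"HK": -630, "HT": -660}),
    (76, {"HK": -690, "HT": -660}),
]

for name, trace in [("gwnn", gwnn_trace), ("gwnn, all correct", gwnn_correct),
                    ("bixby", bixby_trace)]:
    averages = average_scores(trace)
    summary = ", ".join(f"{card}: {value:.2f}" for card, value in averages.items())
    print(f"{name}: {summary} -> play {pick_card(trace)}")
```

With the 52/48 split this gives HJ: -634.40 and HA: -695.60, so the jack is chosen; with all 100 deals scored correctly the ace wins easily. For bixby's hand the 19 wrong scores shift the HK average to -678.60 but not far enough to overtake HT at -661.50.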
-
Thanks Mikeh for an interesting improvement. That point about different 4-1 trump breaks and their effects on the odds is interesting. I didn't think about the vacant spaces issue, which is probably an indication of the gap between a decent player and a real expert.

Here are some simple probabilities for the basic line (with no allowance for vacant spaces or the absence of opposition bidding which could indicate against extreme distribution): Works where Spades 3-2 (68%) or 4-1 (28%) and Hearts 3-3 (36%) or 4-2 with Jack dropping (16%) or singleton Jack (3%). That's 96% x 55% = 52.8%
1 point
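The arithmetic in the post above can be checked in a few lines of Python. This is just a sketch using the percentages as quoted; like the post, it treats the spade and heart breaks as independent, which is a simplification (no vacant-spaces adjustment). The variable names are made up for illustration.

```python
# Percentages as quoted in the post, kept as whole numbers to avoid float noise.
spades_ok = 68 + 28        # spades 3-2 or 4-1
hearts_ok = 36 + 16 + 3    # hearts 3-3, 4-2 with the jack dropping, or singleton jack

# Assumes the two suits break independently, as the post's calculation does.
combined = spades_ok * hearts_ok / 100

print(f"{spades_ok}% x {hearts_ok}% = {combined:.1f}%")  # 96% x 55% = 52.8%
```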
-
I’m not able to calculate the odds, at least not without far more time and/or pen and paper, and I always try to ‘solve’ these play problems within the time I’d be able to take at the table. It being mps makes that challenging: at imps in a long match one can usually afford to play important hands slowly and catch up on the easier ones.

I’d be reluctant to play on diamonds. Sure, it could well work and, as I say, I haven’t crunched the numbers. I’d be concerned that I’m going to be in trouble if diamonds sit badly and anyone but especially east has 4 trump. Say I duck the lead, win the next club, assuming they continue the suit, cross in spades, lead the diamond 9. East pops and plays a club. I have to ruff. Say I play another diamond…to the 10. West wins. If he has KJxxx or the more likely AJxxx (given that east popping is more suggestive of Kx than Ax) he should win the king, not the jack. Now he plays another diamond…a decent opponent by this time knows almost precisely what’s going on.

There are various other risks with playing on diamonds. If I duck the club, the odds are they continue, but that’s not a sure thing…maybe opening leader can usefully play on diamonds. But let’s assume clubs are continued.

Firstly, while it’s normal to duck from Axx, it’s not clearly correct here (though that’s not the same as saying it’s wrong)…again, if diamonds are splitting badly and trump are 4-1, they may be able to hurt us with a diamond switch.

Also, simply knowing that the opps are ‘decent’ is less than I might know at the table. Some opps, even some experts, love to give count. Against them I’d consider winning the club and playing a heart to the Queen, as if I’m taking a finesse. Frequent count givers may tell me how hearts are breaking. Then I draw two rounds of trump, ending in dummy. If trump are 3-2, I play on hearts, succeeding with 3-3 or Jx or Jxxx holding the third spade.
If trump are 4-1, and east has 4, I play a heart to my 9 unless I believe they gave honest count in hearts and that hearts are 3-3. If west has 4 trump, I need to hope hearts come home.
1 point
-
You appear to have joined the "Prime" community. If you don't want the cool purple badge go back to the Prime area and cancel the subscription. Your normal sallow complexion will be restored when the subscription expires.
1 point
-
"Single mode of evaluation" is bad. Mollo mocks it with the Walrus; it is screamingly clear that pure and only LTC is also (usually) awful. That's why nobody past novice does that. We just assume this when having Walrus count conversations. Why does nobody allow this for LTC, but just say that "KQxxxx xxxx xx x and AJ9xxx AT9 A98 Q are considered the same, and that's stupid"? Of course it's stupid. That's why nobody past low intermediates and LTC zealots (but I repeat myself) does that. As I have said repeatedly - including here - LTC is a great "substitute for judgement" *on top of* HCP. It's a great way of explaining your judgement to those who don't have it yet - MikeH's "a 6-loser 18 count is very strongly on the weak side of 18s, I would downgrade out of GF" - makes sense to people in one sentence that wouldn't understand the 3 paragraphs of *actual judgement* Mike actually does. I think - but have not tried - LTC + cover cards could play well; "an Ace is equal to a Queen" is fine if partner's cards are therefore more likely to be Aces than Queens because you have them. After all, that's what we do with control cue bids (for AK) and spiral scan (for AKQ). I'm at the point where my judgement is good enough for me that relearning how to think isn't worth it, but I think a good, solid, detailed version of this (likely with some sort of spiral scan) would work at least as well as HCP + adjustments. But no single mode of evaluation will be better than good judgement. It's just that all the people here have judgement for HCP "automatically", and laugh at LTC without judgement applied (partly because it's easy, partly because they don't "automatically" have the LTC-based judgement to apply).1 point
