tysen2k Posted September 8, 2005

> The question is can someone tell by how much extra these small pluses are?

Yes! Right here:

Improving Hand Evaluation Part 1: http://tinyurl.com/25huc
Improving Hand Evaluation Part 2: http://tinyurl.com/383e6

Everything is in terms of tricks, so there is no discussion of Zar, TSP, BUMRAP, or anything else.

Tysen
Zar Posted September 8, 2005

>He proposes his own evaluation method for NT (A=4, K=2.8, Q=1.8, J=1, T=0.4)<

That’s the problem with all this Binki, Kinki, Rum etc. stuff: some people think these methods are great and “PRECISE” because they use “PRECISE” numbers like 0.75, 0.15, 1.8 etc. If using such tiny fractions is what constitutes “precision” for you, hey – go ahead :-)

>The first set of arrows shows hands that a 5-3-1 system counts as "equivalent." The second set of arrows shows some hands that Zar counts as equivalent. Which set looks more tightly clustered to you?<

Here is what’s tightly clustered, so you understand once and forever.

Dist.  531  Zar
4441    3   11
5431    3   13
6331    3   14

All of these are the SAME in 531 as they are in 321 and 741. Are they tightly clustered enough when they are ALL equal to 3? Is it just fine with you to pull out 2 CARDS from your longest suit and say “Sorry, your longest suit will be 2 cards shorter, but I know it’s OK with you since for you it doesn’t matter if you have a 6-card suit as your longest or a 4-card suit as your longest suit”.

Dist.  531  Zar
5440    5   14
6430    5   16
7330    5   17

All of these are the SAME in 531 as they are in 321 and 741. Are they tightly clustered enough when they are ALL equal to 5? Is it just fine with you to pull out 2 CARDS from your longest suit and say “Sorry, your longest suit will be 2 cards shorter, but I know it’s OK with you since for you it doesn’t matter if you have a 7-card suit as your longest or a 5-card suit as your longest suit”.

I know your “logic” that it doesn’t matter since “when I am short my partner will be long” but ... how to tell you ... I guess you just bid 4S when the bidding comes to you since “your partner will cover for your shortness in Spades” :-)

>I'm still a bit confused regarding the accuracy of the Goren 4/3/2/1 point count... When Tysen provided standard error calculations for a variety of hand evaluation metrics he posted the following data:

            R2    Standard Error
Zar + fit   0.74  1.05
HCP         0.65  1.21

It might be worthwhile to try to reconcile the difference... <

I have posted the FORMULAS I am using, took you by the hand and walked you through, right? And all the data is available on the site. Nobody has a clue what Tysen and you are doing, so I just cannot judge. And you see from his previous posting that he doesn’t even know how fit is calculated so ... PLUS I use only contracts in a Major, as you well know.

ZAR
awm Posted September 8, 2005

Zar points seem to place a lot of emphasis on having long suits, with the idea that "if you have a long suit, you're likely to take more tricks." For example, according to Zar points before you know partner's distribution:

6331 (14 zar) is much better than 4441 (11 zar)
7222 (14 zar) is roughly equivalent to 5440 (14 zar)

Now obviously these will be adjusted up or down when you find (or fail to find) a fit. But it seems like the initial evaluation should roughly measure the average or expected value of the hands. If some particular pattern almost always finds a fit and always gets huge upgrades after the fit is found, it seems reasonable to assume that the initial valuation of that hand is probably too low.

The funny thing is, 5440 distribution seems to be very powerful. From Tysen's data, it seems like 5440 is much more likely to find a game than 7222 or 6331. Shouldn't the initial evaluation reflect this?
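For anyone who wants to check the figures quoted above, here is a minimal sketch. It assumes the published Zar distribution formula, (a + b) + (a - d) with a and b the two longest suits and d the shortest, which nobody spells out in this thread, and the usual 5-3-1 shortness count (void 5, singleton 3, doubleton 1). The function names are made up for illustration.

```python
def zar_distribution(shape):
    """Zar distribution points for a shape such as (5, 4, 4, 0).

    Assumed formula (not stated in the thread): sum of the two longest
    suits plus the difference between the longest and the shortest suit.
    """
    a, b, _, d = sorted(shape, reverse=True)
    return (a + b) + (a - d)


def shortness_531(shape):
    """5-3-1 count: 5 per void, 3 per singleton, 1 per doubleton."""
    values = {0: 5, 1: 3, 2: 1}
    return sum(values.get(length, 0) for length in shape)


# Shapes mentioned in the posts above: (shape, 5-3-1 count, Zar count).
for shape in [(4, 4, 4, 1), (5, 4, 3, 1), (6, 3, 3, 1), (7, 2, 2, 2), (5, 4, 4, 0)]:
    print(shape, shortness_531(shape), zar_distribution(shape))
```

Under these assumptions 6331, 7222 and 5440 all come out at 14 Zar distribution points, while 5-3-1 rates them 3, 3 and 5, which is the disagreement the posts are arguing about.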
inquiry (Author) Posted September 8, 2005

> The funny thing is, 5440 distribution seems to be very powerful. From Tysen's data, it seems like 5440 is much more likely to find a game than 7222 or 6331. Shouldn't the initial evaluation reflect this?

Thank you... exactly... Do you know I open 5440 hands with five controls 2♣ (forcing)? This is in part because the presumed fit provides me with a lot of safety, and the playing strength provides a lot of upside potential.
cherdano Posted September 8, 2005

Ben, sorry, but you have gotten it wrong another time. And that after mikestar already gave a correct table.

If you use TSP distribution points (531 + length points), divide by 5, and stop calling it BUMRAP.

If you use 531, divide by 3 (and remember it is only about differences, the absolute value has no meaning). And even though it combines well with BUMRAP, you shouldn't call this BUMRAP either.

If you do this, you will realize that TSP distribution points and Zar points are extremely close, maybe that TSP distribution points are slightly more accurate, and possibly you might also realize that mikestar has already done exactly this. Look on page 18 of this thread.

One thing of interest in that table confirms what I thought, namely that Zar points undervalue 4-3-3-3 distributions.

Arend
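The conversion described above (divide by a points-per-trick factor and look only at differences) can be sketched as follows. The helper name is hypothetical; the divisor 3 for 5-3-1 is the one given in the post, and the divisor 5 for Zar is an assumption taken from the "1 level = 5 Zar points" figure quoted elsewhere in this thread.

```python
def tricks_relative_to_flat(points, flat_points, points_per_trick):
    """Convert a raw distribution count into tricks above the 4-3-3-3 shape.

    Only the difference from the flat hand matters, so the flat hand's
    own count is subtracted before scaling.
    """
    return (points - flat_points) / points_per_trick


# 5-3-1 count: 5-4-4-0 scores 5, 4-3-3-3 scores 0, 3 points per trick.
print(round(tricks_relative_to_flat(5, 0, 3), 2))   # 1.67
# Zar distribution: 5-4-4-0 = 14, 4-3-3-3 = 8, assumed 5 points per trick.
print(round(tricks_relative_to_flat(14, 8, 5), 2))  # 1.2
```

Because the result is a difference, adding the same constant to every shape's count changes nothing, which is why the absolute value of a count has no meaning here.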
kfgauss Posted September 8, 2005

> Zar points seem to place a lot of emphasis on having long suits, with the idea that "if you have a long suit, you're likely to take more tricks." For example, according to Zar points before you know partner's distribution:
> 6331 (14 zar) is much better than 4441 (11 zar)
> 7222 (14 zar) is roughly equivalent to 5440 (14 zar)
> Now obviously these will be adjusted up or down when you find (or fail to find) a fit. But it seems like the initial evaluation should roughly measure the average or expected value of the hands. If some particular pattern almost always finds a fit and always gets huge upgrades after the fit is found, it seems reasonable to assume that the initial valuation of that hand is probably too low.
> The funny thing is, 5440 distribution seems to be very powerful. From Tysen's data, it seems like 5440 is much more likely to find a game than 7222 or 6331. Shouldn't the initial evaluation reflect this?

The data are of course very interesting, but things are a bit more subtle. The 7222 has a higher degree of safety than the 5440 hand (and can thus open lighter in the "how often do we make game" metric). After you find a fit, everything changes, but I'd suggest that (to some extent) upgrading when you find a fit is better than having to downgrade when you don't find one.

Not that "safety" should be the sole factor either -- what I'm suggesting is that both "how often game" and "safety" factor into your openings (read: initial valuation?).

Andy
Al_U_Card Posted September 8, 2005

I have a question. Since "support" (either as trump or for helping to establish winners in declarer's hand) is important, which combination of hands is better for 3NT; 5332 opposite 2335 or 4333 opposite any 3 (334,343,433)?
Zar Posted September 8, 2005

>I have a question. Since "support" (either as trump or for helping to establish winners in declarer's hand) is important, which combination of hands is better for 3NT; 5332 opposite 2335 or 4333 opposite any 3 (334,343,433)? <

This is something I haven't actually studied - but I will (have to figure out the exact restrictions needed). It is closely related to a question the book covers well, with exact numbers: 5:3 is better for NT than for Trump, while 4:4 is better for Trump than for NT. Also, 5:2 is better for NT and 4:3 is better for Trump in general (obviously not for 4333 vs. 4333).

I'll let you know when I do the run. I assume you understand that your question is about a "choice" that you don't actually have at the table, since it compares hands playing the SAME contract of 3NT (rather than choosing between 3NT and 4S, for example, based on the fit you have).

ZAR
Zar Posted September 8, 2005

>If you do this, you will realize that TSP distribution points and Zar points are extremely close, maybe that TSP distribution points are slightly more accurate, and possibly you might also realize that mikestar has already done exactly this. Look on page 18 of this thread.<

We are back to these TSP which we determined are the Richard Pavlicek points. And we also determined that they are lower in the runs, not "maybe" or "probably" or "eventually" or "supposedly" :-)

We start running in circles I guess :-)

ZAR
tysen2k Posted September 8, 2005

> We are back to these TSP which we determined are the Richard Pavlicek points.

And then we determined that they actually were not Pavlicek points...
mikestar Posted September 9, 2005

> I have a question. Since "support" (either as trump or for helping to establish winners in declarer's hand) is important, which combination of hands is better for 3NT; 5332 opposite 2335 or 4333 opposite any 3 (334,343,433)?

Definitely the 5332's. Odds are that one of the long suits will break 3-3 and the other won't. Then the 4333's take 1 length trick and the 5332's take at least 2 length tricks (if the suit that doesn't break is 4-2, then there may be a third trick if you can develop it in time).
tysen2k Posted September 9, 2005

Okay, I think I have an idea of how to solve the "distribution accuracy" question and get over the whole upgrading/downgrading issue. I haven't run this yet since I'm at home now and all of my bridge stuff is on my work laptop. Ben, let me know if you think this is a fair test.

We'll look at the average number of tricks that each shape takes like we did before, except that I'll limit the hands to be only those where our longest fit is exactly 8 cards. Since every distribution method counts an 8-card fit as "normal" with no adjustments, it should be fair to all unadjusted counts.

Sound good?
inquiry (Author) Posted September 9, 2005

> Okay, I think I have an idea of how to solve the "distribution accuracy" question and get over the whole upgrading/downgrading issue. I haven't run this yet since I'm at home now and all of my bridge stuff is on my work laptop. Ben, let me know if you think this is a fair test.
> We'll look at the average number of tricks that each shape takes like we did before, except that I'll limit the hands to be only those where our longest fit is exactly 8 cards. Since every distribution method counts an 8-card fit as "normal" with no adjustments, it should be fair to all unadjusted counts.
> Sound good?

not sure, why limit to 8 card fits? Sometimes you have no 8 card fit, sometimes, 9, 10, 11, etc. It is the Adjustments I am interested in. So while the study would be interesting on its own, it would hardly answer the basic question.
tysen2k Posted September 9, 2005

> not sure, why limit to 8 card fits? Sometimes you have no 8 card fit, sometimes, 9, 10, 11, etc. It is the Adjustments I am interested in. So while the study would be interesting on its own, it would hardly answer the basic question.

Did you read the two long articles I referred to earlier on adjustments?
tysen2k Posted September 9, 2005

> not sure, why limit to 8 card fits? Sometimes you have no 8 card fit, sometimes, 9, 10, 11, etc. It is the Adjustments I am interested in. So while the study would be interesting on its own, it would hardly answer the basic question.

Also, in order to look at adjustments, you have to have a baseline, right? I'm going to do this anyway so that we can compare baselines. For each evaluation scheme, the baseline has always been set as its value in an 8-card fit.
tysen2k Posted September 9, 2005

Was Goren right after all?

So I decided to look at the cases where we have exactly an 8-card fit. I looked at the trick-taking potential of each shape.

                  Tricks  Tricks  Tricks  Tricks  Tricks   Error  Error  Error  Error
Shape     Count   Real    Zar     531     TSP     Goren    Zar    531    TSP    Goren
4-3-3-3   10.54    0.00   -0.29   -0.22   -0.14   -0.08    0.90   0.49   0.20   0.08
4-4-3-2   21.55    0.22    0.11    0.12    0.06    0.25    0.30   0.25   0.56   0.01
5-3-3-2   15.52    0.23    0.31    0.12    0.26    0.25    0.10   0.19   0.02   0.01
5-4-2-2   10.58    0.43    0.51    0.45    0.46    0.58    0.07   0.01   0.01   0.25
6-3-2-2    5.64    0.46    0.71    0.45    0.66    0.58    0.33   0.00   0.22   0.08
6-3-3-1    3.45    0.65    0.91    0.78    0.86    0.58    0.22   0.06   0.15   0.02
5-4-3-1   12.93    0.66    0.71    0.78    0.66    0.58    0.03   0.20   0.00   0.08
4-4-4-1    2.99    0.68    0.31    0.78    0.46    0.58    0.41   0.03   0.13   0.03
7-2-2-2    0.51    0.82    0.91    0.78    1.06    0.92    0.00   0.00   0.03   0.00
6-4-2-1    4.70    0.85    1.11    1.12    1.06    0.92    0.30   0.33   0.21   0.02
5-5-2-1    3.17    0.91    0.91    1.12    1.06    0.92    0.00   0.14   0.08   0.00
7-3-2-1    1.88    0.91    1.31    1.12    1.26    0.92    0.30   0.08   0.24   0.00
5-4-4-0    1.24    1.16    0.91    1.45    1.06    0.92    0.08   0.11   0.01   0.07
5-5-3-0    0.90    1.16    1.11    1.45    1.26    0.92    0.00   0.07   0.01   0.05
6-4-3-0    1.33    1.19    1.31    1.45    1.26    0.92    0.02   0.09   0.01   0.10
6-5-1-1    0.71    1.29    1.31    1.78    1.66    1.25    0.00   0.17   0.10   0.00
6-5-2-0    0.65    1.47    1.51    1.78    1.66    1.25    0.00   0.06   0.02   0.03
Totals                                                     3.06   2.29   2.00   0.83

Count is the % frequency of that shape.
Tricks Real is the number of tricks that each shape really takes beyond the 4333 shape.
Tricks for each evaluation scheme is the number of tricks predicted by that count system. I allowed everything to be shifted by a constant so that you won't have a problem if the 4333 shape is off. This helps Zar's performance by a lot. It would be worse without it.
Error is the square of the difference between the real and predicted numbers, multiplied by the Count.
Total at the bottom is the sum of the errors.

It looks like most of the systems are predicting more tricks than are really available. Good old Goren is the closest!

Implications? Distribution when we only have an 8-card fit maybe doesn't have as much weight as a lot of us were thinking. But since it has much more weight on average over all possible fits, that must mean that we really have to increase it a lot when we have a superfit.

So maybe the best solution is for all methods to tone down a bit closer to Goren as the baseline, but once a superfit is found, pump it up even more than Zar or TSP ever did before?

I'd like to hear other people's thoughts.
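To make the Error and Totals columns easy to re-check, here is a small sketch that recomputes them from the rounded figures in the table above. The names (`ROWS`, `weighted_squared_error`, `best_shift`) are invented for this example, and treating the "shift by a constant" as the count-weighted mean residual is an assumption about how the shift was chosen. Because the inputs are rounded to two decimals, the recomputed totals only approximately match the posted ones.

```python
# shape, count (% frequency), real tricks, then predicted tricks for
# Zar, 5-3-1, TSP and Goren, copied from the table in the post above.
ROWS = [
    ("4-3-3-3", 10.54, 0.00, -0.29, -0.22, -0.14, -0.08),
    ("4-4-3-2", 21.55, 0.22, 0.11, 0.12, 0.06, 0.25),
    ("5-3-3-2", 15.52, 0.23, 0.31, 0.12, 0.26, 0.25),
    ("5-4-2-2", 10.58, 0.43, 0.51, 0.45, 0.46, 0.58),
    ("6-3-2-2", 5.64, 0.46, 0.71, 0.45, 0.66, 0.58),
    ("6-3-3-1", 3.45, 0.65, 0.91, 0.78, 0.86, 0.58),
    ("5-4-3-1", 12.93, 0.66, 0.71, 0.78, 0.66, 0.58),
    ("4-4-4-1", 2.99, 0.68, 0.31, 0.78, 0.46, 0.58),
    ("7-2-2-2", 0.51, 0.82, 0.91, 0.78, 1.06, 0.92),
    ("6-4-2-1", 4.70, 0.85, 1.11, 1.12, 1.06, 0.92),
    ("5-5-2-1", 3.17, 0.91, 0.91, 1.12, 1.06, 0.92),
    ("7-3-2-1", 1.88, 0.91, 1.31, 1.12, 1.26, 0.92),
    ("5-4-4-0", 1.24, 1.16, 0.91, 1.45, 1.06, 0.92),
    ("5-5-3-0", 0.90, 1.16, 1.11, 1.45, 1.26, 0.92),
    ("6-4-3-0", 1.33, 1.19, 1.31, 1.45, 1.26, 0.92),
    ("6-5-1-1", 0.71, 1.29, 1.31, 1.78, 1.66, 1.25),
    ("6-5-2-0", 0.65, 1.47, 1.51, 1.78, 1.66, 1.25),
]
METHODS = ["Zar", "531", "TSP", "Goren"]


def weighted_squared_error(rows, column):
    """Sum over shapes of count * (real - predicted)^2 for one method."""
    return sum(row[1] * (row[2] - row[3 + column]) ** 2 for row in rows)


def best_shift(rows, column):
    """Constant that, added to every prediction, minimises the error.

    For a count-weighted squared error this is the count-weighted mean
    of (real - predicted); it should be near zero if the predictions
    were already shifted this way.
    """
    weight = sum(row[1] for row in rows)
    return sum(row[1] * (row[2] - row[3 + column]) for row in rows) / weight


totals = {name: weighted_squared_error(ROWS, i) for i, name in enumerate(METHODS)}
for i, name in enumerate(METHODS):
    print(f"{name:6s} total error {totals[name]:.2f}  residual shift {best_shift(ROWS, i):+.3f}")
```

The ordering of the recomputed totals is the same as in the post: Goren lowest, then TSP, then 5-3-1, then Zar.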
Guest Jlall Posted September 9, 2005

nice work dude. a lot of it is what i thought, mainly that zar overestimates shape when no fit is found yet.
awm Posted September 9, 2005

Definitely interesting, but it's not clear to me that we should assume no fit, or only an eight card fit, until more is known about the hands. For example, holding a 7-card suit, it seems like a 9-card fit is odds on.

It might also be interesting to see data on the probability of various fits given various shapes. We could then assume the "average" fit, or perhaps be a bit pessimistic until we have more information.

Certainly assuming we will only have an 8-card fit when we have an 8-card suit is awfully pessimistic!
tysen2k Posted September 9, 2005

> Definitely interesting, but it's not clear to me that we should assume no fit, or only an eight card fit, until more is known about the hands. For example, holding a 7-card suit, it seems like a 9-card fit is odds on.
> It might also be interesting to see data on the probability of various fits given various shapes. We could then assume the "average" fit, or perhaps be a bit pessimistic until we have more information.

The average is given previously in this thread (page 5). The whole purpose of limiting it to 8 cards was to test the baseline of point methods where we don't add anything for extra trumps.

> Certainly assuming we will only have an 8-card fit when we have an 8-card suit is awfully pessimistic!

But sometimes we know for a fact that we do only have 8. How much should the 8-card suit be worth opposite a void? If partner has something, then we add more points.
Zar Posted September 9, 2005

>zar overestimates shape when no fit is found yet. <

Zar Points ASSUME that there is a fit of 8+ cards (85% of all boards). For the rest of the cases (15% overall) Zar Points have a deduction of 1 Level for NO-FIT (that’s 5 Zar Points). That is what you have been missing.

>it's not clear to me that we should assume no fit<

If it is OK with you to assume that 15% happens more often than 85%, then you can assume that there is no fit in general ... :-) I assume the 85% and make a deduction (penalty) of 5 Zar Points or 1 Playing Level IF in reality we fall into the 15% chance of having no 8+ card fit. Simple stuff.

>It might also be interesting to see data on the probability of various fits given various shapes. <

You haven’t read the Zar Points Bidding Backbone book then. BOTH fit AND Double-fit are discussed with all the numbers.

>Certainly assuming we will only have an 8-card fit when we have an 8-card suit is awfully pessimistic!<

Depends on where you look at it from :-)

ZAR
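The 85% / 15% split quoted above can be sanity-checked with a short exact calculation. This is only a sketch: it considers the 26 cards held by one partnership with nothing known about either hand, and the function name is made up for the example. It says nothing about the probability once you have seen your own shape.

```python
from itertools import product
from math import comb


def longest_fit_distribution():
    """Exact probability of each 'longest combined suit' length.

    A partnership holds 26 of the 52 cards. A split (a, b, c, d) of those
    cards across the four suits can occur in
    C(13,a) * C(13,b) * C(13,c) * C(13,d) ways out of C(52,26).
    """
    total = comb(52, 26)
    dist = {}
    for split in product(range(14), repeat=4):
        if sum(split) != 26:
            continue
        ways = 1
        for n in split:
            ways *= comb(13, n)
        longest = max(split)
        dist[longest] = dist.get(longest, 0.0) + ways / total
    return dist


dist = longest_fit_distribution()
p_fit = sum(p for length, p in dist.items() if length >= 8)
print(f"P(at least an 8-card fit) = {p_fit:.3f}")
for length in sorted(dist):
    print(f"longest fit {length:2d}: {dist[length]:.4f}")
```

The result comes out at roughly 84%, close to the rounded 85% figure, with the remaining cases being the 7-7-6-6 and 7-7-7-5 combined patterns.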
Guest Jlall Posted September 9, 2005

uhh I know it deducts but sometimes you find out about misfits too late, especially when you open 5-5 8 counts routinely. Partner will immediately know there's no fit? interesting. Especially when auctions get competitive, you cannot know your degree of fit sometimes until it is too late. That is what you have been missing.
tysen2k Posted September 9, 2005

> uhh I know it deducts but sometimes you find out about misfits too late, especially when you open 5-5 8 counts routinely. Partner will immediately know theres no fit? interesting. Especially when auctions get competitive, you cannot know your degree of fit sometimes until it is too late. That is what you have been missing.

Exactly. I do open 5-5 8-counts at the 1-level, because my system allows for it. But it's not because I think it's as strong as a balanced 13. It's because I know the auction is going to be competitive and so I need to speak up early. Your "strength" shouldn't be the main/only reason why you choose to open or not.
Zar Posted September 9, 2005

>uhh I know it deducts but sometimes you find out about misfits too late, especially when you open 5-5 8 counts routinely. Partner will immediately know there is no fit? <

As immediately as with any other system actually :-)

>interesting. Especially when auctions get competitive, you cannot know your degree of fit sometimes until it is too late. That is what you have been missing. <

Thanx for opening my eyes :-) I just don’t see how this is related to your idea of assuming that the 15% is what “usually” happens, really. It’s also a good idea to read about the Zar Misfit Points, which are directly related to your concern.

>Exactly. I do open 5-5 8-counts at the 1-level, because my system allows for it. <

Do you play Zar Points?

>But it's not because I think it's as strong as a balanced 13. <

You have a good judgment then.

>It's because I know the auction is going to be competitive as so I need to speak up early. Your "strength" shouldn't be the main/only reason why you choose to open or not.<

I often miss your philosophy indeed. My “strength” ... like measured in what? And how much?
inquiry (Author) Posted September 12, 2005

Somewhere ZAR writes about the 75% principle. I think it occurred on hands where they preempt and you have modest support for partner and honors in the other two suits. ZAR adds 75% of their "Fit value". So if you had two honors in the two side suits, he argued for adding +3 fit points for the honors.

I think what the double dummy solver shows is the "potential" power of fits. If you are 5440, the odds are great that you have a fit, so rather than a "mere" 14 ZAR points, this is worth closer to 17 and maybe up to 20. This is why some distributions are frequently stronger than the initial ZAR count might otherwise indicate.

Just food for thought.
Zar Posted September 12, 2005

I pushed the research in the Optimization area actually – and the results are really good! The result is Zar Points Optimized (ZPO) where all the extra fit points are calculated to bring the minimal Standard Deviation – both for the primary and the secondary fit.

The experiments with ZPR0023 (0 points for 4333, 0 for doubleton, 2 for singleton, and 3 for void) resulted in STD = 0.94 which is worse than the initial Ruffing Power result of STD = 0.93.

The experiments with ZPR0013 (0 points for 4333, 0 for doubleton, 1 for singleton, and 3 for void) resulted in STD = 0.91 which is worse than the results for ZPR0012. Meaning that when I assigned 0 for super-trump to both 4333 and doubleton, 1 point for Singleton, and 2 points for Void, the STD dropped to 0.90 !!!

Now, with that value I proceeded to iterations regarding the SECONDARY fit. There if we assign:

- 0 points for having an 8-card side-suit fit,
- 1 point for having a 9-card side-suit fit,
- 2 points for having a 10+ card side-suit fit,

the Standard Deviation for the First Time is brought below 0.90 – it is 0.89!

For the other experiments the results were:

- STD of 0.90 for 123 points assigned to secondary fit;
- STD of 0.90 for 124 points assigned to secondary fit;

So we finally were able to reach the numbers for the Zar Points Optimized with

- Super-fit points assigned according to the 0012 scale (2 for void, 1 for singleton);
- Side-fit points assigned according to the 012 scale (2 for 10+, 1 for 9-cards);

Here is how the new ZPO fares against the old scores that you already know about:

ZPO  0.89
ZPR  0.93
ZPB  0.94
GP   0.96
BP   0.96
ZP3  0.98
LP   1.05
WTC  1.09
LTC  1.22
LTM  1.23

Now I’ll run BACK the IMP match between the 10 participants to see how the new Zar Points Optimized scores.

ZAR