1eyedjack Posted October 1, 2013

I would quite like to see some empirical data on just how much better Pro GIB is than Basic GIB, beyond the bald assertion that it must be much better because it uses more intelligence.

I think we could get this data by setting up a team game, one team comprising entirely basic bots and the other comprising entirely pro bots. Make it a nice long match, say 1,000 hands. At their rate of play it would be over in a twinkling (though the software may cap the number of hands that can be played in a single team game).

I suspect the software might require each team to have a human player, but then you may be able to make it a team of five, with the human captain simply putting his bots in to bat.

I don't think I can do this all by myself. Although I have not tried it, I doubt that I could rent the pro bots for $1.00 for a day, simultaneously rent the basic bots for $0.25 for a week, and then also get to choose which type of bot sits (I assume it would choose the pro bot each time, even if it allowed me to rent both types). But if I rented the $1 bots and someone else rented the $0.25 bots, maybe it would be achievable? Perhaps if Uday is reading this he could say whether it is workable.
uday Posted October 1, 2013

There are four versions of GIB floating out there:

a. The GIB product on a CD. This is a standalone desktop program and hasn't been updated in a while.
b. The GIB distributed with the older Windows version of BBO. Likewise; though it is somewhat more up to date, it isn't "current".
c. The "advanced" bot that we use in tournaments and bot rentals, and when you rent a bot as a partner in a pair game (there, I like to call it the "pro bot" for no good reason).
d. The "basic" bot that we use in MSN Bridge, partnership bidding, bot rentals, the solitaire games (web4 and Video Bridge) and their standalone web counterparts on our home page, as well as in the mobile version when you play anonymously.

1eyedjack asked about the difference between the basic bot and the advanced bot. We're just wrapping up a test that might answer that.

I pulled out 10,000 hands from ACBL Robot Duplicate tournaments that met the following criteria:

- 15+ comparisons (no assigned averages)
- Matchpoints
- Best hand
- Human declares when dummy

We tested each bot against this set of hands like this:

- The bot being tested sat South, just like the human.
- West, North and East are the production Smartbots, just like Robot Duplicate.
- I discarded contracts where North was declarer, out of sheer laziness, not wanting to code "human declares" in this environment.

For each result, I matchpointed the bot's score against the original at-the-table scores.

The first test, with the basic bot, generated a list of matchpoint scores that averaged out to 51.69% over 7,948 boards. The second test, with the advanced bot, is not quite done, but is averaging 57.95% over 7,158 boards.

The bots used when the hands were first played might have been the current production version, or the one just before that.

This isn't, to me, about comparing bots to humans so much as trying to answer the same question 1eyedjack asked. It is really useful to me to be able to judge the impact of increased resources, or of code tweaks, on the performance of a bot, and this seems as good a way as any to measure that.
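To make the scoring step concrete: for each replayed board, the bot's result is compared with the scores the humans originally recorded at the table, using the standard matchpoint rule (one point for every score beaten, half a point for every tie, expressed as a percentage of the comparisons). Below is a minimal Python sketch of that calculation; the function and the sample scores are illustrative assumptions, not BBO's actual code.

    # Illustrative only: matchpoint one replayed result against the
    # original at-the-table scores on the same board.
    def matchpoint_percentage(bot_score, original_scores):
        """Standard matchpointing: 1 per score beaten, 0.5 per tie,
        expressed as a percentage of the available comparisons."""
        beaten = sum(1.0 for s in original_scores if bot_score > s)
        tied = sum(0.5 for s in original_scores if bot_score == s)
        return 100.0 * (beaten + tied) / len(original_scores)

    # Hypothetical board: the bot replays the deal for +450 while the
    # original human Souths recorded these results.
    field = [420, 450, 450, -50, 420, 480, 420, 170]
    print(round(matchpoint_percentage(450, field), 2))  # 75.0

Averaging this percentage over every board in a run is what produces figures like the 51.69% and 57.95% quoted above.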
nige1 Posted October 1, 2013

uday, on October 1, 2013, said:
"We tested each bot against this set of hands ... The first test, with the basic bot, generated a list of matchpoint scores that averaged out to 51.69% over 7,948 boards. The second test, with the advanced bot, is not quite done, but is averaging 57.95% over 7,158 boards."

Thank you, 1eyedjack and Uday. Uday is comparing humans to bots even if that isn't the purpose of his experiment. In future, perhaps, humans will be less abusive of their betters :)
1eyedjack Posted October 3, 2013

What Uday is doing sounds more useful than my proposal, being a three-way comparison. I have some concern that the robots in the other seats in the original tourneys may have been earlier versions than those in the re-run. That would not invalidate the pro-bot versus basic-bot comparison, but it might taint the bot-versus-human comparison, much akin to the criticisms levelled at the new "instant" tourneys.

Are there any plans to replicate the exercise with IMP-scored tourneys? Possibly of at least equal interest would be the results of pitting the current pro bots against an earlier version of the pro bots (rather than against basic bots).

I still think that my original experiment suggestion has legs, not as a substitute but as an addition.
uday Posted October 3, 2013

I'll doubtless run an IMP match over the next couple of days. I can eliminate the stale-version-robot issue by using only hands from tourneys that used the current version of GIB, though that reduces the number of hands available. I can also use hands from the (non-ACBL) Robot Duplicate.

We already run bot-vs-bot team games (those are easier to run, since we can use random hands); we don't usually expect big changes from version to version, though. The matches help us find glitches created by our changes.

This recent exercise does make me wonder about how strong the bots need to be. Maybe they're strong enough already, and just have to be less flighty.
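For the IMP-scored comparison, the natural design is the head-to-head team game uday describes: the same deal is played at two tables with the bot versions swapped between the seats, and the difference between the two raw scores is converted to IMPs on the standard scale. The sketch below illustrates only that scoring step; the deal and scores are invented, and this is not BBO's code.

    # Illustrative only: IMP scoring for a bot-vs-bot team match.
    # Each entry is (minimum score difference, IMPs) from the standard scale.
    IMP_SCALE = [
        (20, 1), (50, 2), (90, 3), (130, 4), (170, 5), (220, 6),
        (270, 7), (320, 8), (370, 9), (430, 10), (500, 11), (600, 12),
        (750, 13), (900, 14), (1100, 15), (1300, 16), (1500, 17),
        (1750, 18), (2000, 19), (2250, 20), (2500, 21), (3000, 22),
        (3500, 23), (4000, 24),
    ]

    def imps(diff):
        """Convert a raw score difference into IMPs (sign follows diff)."""
        magnitude, result = abs(diff), 0
        for threshold, value in IMP_SCALE:
            if magnitude >= threshold:
                result = value
        return result if diff >= 0 else -result

    # Hypothetical board: advanced bots sit N/S at table 1, basic bots
    # sit N/S at table 2, and both tables play the same vulnerable deal.
    table1_ns = 620   # advanced bots bid and make the heart game
    table2_ns = 170   # basic bots stop in a partscore, also taking ten tricks
    print(imps(table1_ns - table2_ns))  # +10 IMPs to the advanced team

Summing those swings across the match gives one version's net margin over the other, with no human field involved.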
Bbradley62 Posted October 3, 2013

uday, on October 3, 2013, said:
"This recent exercise does make me wonder about how strong the bots need to be. Maybe they're strong enough already, and just have to be less flighty."

This, of course, depends on your objectives. If GIB's primary purpose is to provide practice for the vast majority of BBO members, GIB is certainly already strong enough. If you want GIB to compete in the World Computer Bridge Championship, maybe not.
1eyedjack Posted October 3, 2013

There will certainly come a point at which the resources consumed to improve GIB outweigh the potential marginal benefit. Whether that point has already been reached is not for me to say, although speaking personally I would find it very frustrating to learn that development is to be scaled back dramatically from this point forward on the grounds that GIB is "good enough" as it stands.

I personally think it is still a bit on the woeful side, but as you acknowledge this may be down to flightiness that is still being corrected. At present it is rare for a single 8-board robot tourney to complete without at least one hand cropping up that I feel obliged to report. To my mind that is still too high a hit rate.

One thing surprises me about the stats so far. On average I do pretty nicely, thank you, when playing in robot tourneys. Lately I have been trying the effect of selecting a robot partner to play in random free human pairs tourneys, and I have not been doing so well at all. Assuming that I am playing with a similar level of (in)competence in both events, the logical conclusion is that the robot is largely to blame for the difference in results, which would indicate that the robot is still rather sub-human in capability, a conclusion that seems at odds with Uday's results. Of course, so far I have only played in a few human tourneys with a robot, nothing like the size of Uday's sample.
Bbradley62 Posted October 3, 2013

1eyedjack, on October 3, 2013, said:
"Assuming that I am playing with a similar level of (in)competence in both events, the logical conclusion is that the robot is largely to blame for the difference in results..."

There may be several reasons for this beyond overall level of play. The most obvious is that GIB can only assume that everyone else is playing his system, which is unlikely to be true in a human tournament. If an opponent opens a strong 1♣, for example, GIB may never figure out what is actually going on in that hand.
1eyedjack Posted May 2, 2014

nige1, on October 1, 2013, said:
"Uday is comparing humans to bots even if that isn't the purpose of his experiment."

I don't understand this comment. Maybe I am being dense.

Suppose that Uday were to replay a tournament fielding a table containing four identical but completely incompetent robots (version 1.0, say) in all seats. Were it not for the "best hand South" distortion, I would expect each side to score about 50% on average. If GIB is generally better at declarer play than defence, I would expect the N/S GIBs to average slightly better than 50% because of that distortion. This is a far cry from saying that the robots would fare anything like as well had there been a human at the table (even a human who played the GIB system without psyching). How, therefore, do you suggest that ANY comparison with humans is being achieved by this experiment?

I agree with Uday that the test is useful for illustrating marginal improvements of one version of GIB over its predecessor, which I believe to be the limit of Uday's intention, but I cannot see how you can extrapolate any other conclusion than that.

On first reading this thread I jumped to the unwarranted conclusion that GIB actually plays a 57% game (and I may have said as much in some subsequent threads). I think I should retract that conclusion. It may be accurate by coincidence (in fact I doubt it, but not based on any evidence that would stand scrutiny).
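The 50% expectation in that thought experiment is just symmetry: if identical bots fill every seat at every table, each table's result is a draw from the same distribution, so any one table's expected matchpoint score is 50% no matter how weak the bot is. The small simulation below, with invented score values, illustrates the point; it reuses the same standard matchpoint rule sketched earlier and is not based on any real GIB data.

    # Illustrative only: with the same bot everywhere, every table's result
    # comes from the same distribution, so the expected matchpoint score is
    # 50% regardless of the bot's absolute strength.
    import random

    def matchpoint_percentage(score, others):
        beaten = sum(1.0 for s in others if score > s)
        tied = sum(0.5 for s in others if score == s)
        return 100.0 * (beaten + tied) / len(others)

    random.seed(1)
    possible_results = [-100, -50, 140, 170, 420, 450]  # invented spread
    percentages = []
    for _ in range(20000):
        field = [random.choice(possible_results) for _ in range(16)]
        percentages.append(matchpoint_percentage(field[0], field[1:]))
    print(round(sum(percentages) / len(percentages), 1))  # close to 50.0

This is the sense in which relative scores within an all-bot field say nothing by themselves about how the bots would fare against humans.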
Bbradley62 Posted May 2, 2014

uday, on October 1, 2013, said:
"The bot being tested sat South, just like the human ... West, North and East are the production Smartbots, just like Robot Duplicate ... I matchpointed the bot's score against the original at-the-table scores. The first test, with the basic bot, generated a list of matchpoint scores that averaged out to 51.69% over 7,948 boards ... The second test, with the advanced bot, is not quite done, but is averaging 57.95% over 7,158 boards."

1eyedjack, on May 2, 2014, said:
"I don't understand this comment. Maybe I am being dense..."

Uday started with results from Robot Duplicate tournaments, where humans sat South and advanced bots sat in the other three seats. Then he sat basic bots in the "human" seat, and they scored 51.69% against the humans' original results. Then he sat advanced bots in the "human" seat, and they scored 57.95% against those same results. So Uday determined that both types of bot scored better than the humans they replaced, but that the advanced bots did so by a greater margin.

Nige is observing that Uday could have run a teams-style event, with one team of four basic bots and the other of four advanced bots, which would have removed humans from the equation entirely. I think it's nice to have the comparison to the human field, but either way would work, although one would measure IMP play and the other matchpoints.
1eyedjack Posted May 3, 2014

Thanks. Blind spot. Right first time.
scarletv Posted May 4, 2014

uday, on October 1, 2013, said:
"There are four versions of GIB floating out there ... c. The 'advanced' bot that we use in tournaments and bot rentals, and when you rent a bot as a partner in a pair game ..."

What kind of GIB will be in use when I buy the advanced GIB and open a teaching or bidding table?