Eric Farmer code: wrong split EV estimates ?

assume_R · Jun 23, 2011

London Colin said:
That's surely the point. The shoe state is not the same. In the example we know it is depleted by two non-eights, compared to if we were evaluating a single hand totalling 21.

Eric does the same optimization as you do, reusing the dealer probs whenever possible. But because of the particular splitting algorithm he is using, he is also reusing them where it is not strictly possible.

That being said, it might well be possible to replace the splitting routine with an entiely new one, without having to vastly modify the existing infrastructure, so that the shoe state is indeed always the same whenever the dealer probs are called upon.

I found that a hash table will take care of this for you. You don't have to worry about, and hard-code, in each of your algorithms whether or not you'll need to reuse the dealer probabilities.

It's simple: You have the following remaining cards (10x1 vector for each rank)

1, 4, 2, 6, 5, 2, 3, 8, 2, 8.

That hashes to a value, let's say 13621445.

You have a function which calculates the dealer probabilities. Before it does any calculations, though, it just check the hash table, see if 13621445 has already been calculated, and if it has return a 7x1 array of dealer probabilities.

I use boost's built in hash table which is actually implemented as a tree, and works extremely fast. It has a built in algorithm for hashing vectors or arrays, and for searching the tree. No coding on your part.

This way, you are always calling your CalculateProbabilities function, which decides itself whether or not it has to actually compute anything, or use an already-computed result. Very elegant and simple to code.

London Colin · Jun 23, 2011

assume_R said:
I use boost's built in hash table which is actually implemented as a tree, and works extremely fast. It has a built in algorithm for hashing vectors or arrays, and for searching the tree. No coding on your part.

This way, you are always calling your CalculateProbabilities function, which decides itself whether or not it has to actually compute anything, or use an already-computed result. Very elegant and simple to code.

One of the attractions of starting a brand new CA (not just amending or even replacing Eric's splitting code) would be the ability to used things like boost and the STL to come up with code that is easier to follow and maintain.

Of course the main drawback of starting a brand new CA is that it seems like such a colossal undertaking!

Eric himself has said that he was learning C++ as he went along to some degree, and if he were to start again he'd do a lot of things differentlly.

But the particular issue here isn't the mechanism of dealer-prob re-use. That's working fine. It's all about the splitting algorithm. It would require dealer probabilities not just in the context of a particular set of known cards having being dealt, but some unknown ones too! We have partial information, a number of cards have been dealt which are not of one particular denomination, but could be any other.

An alternative splitting algorithm, not suffering from this problem, might possibly slot in as a replacement.

MGP · Jun 23, 2011

London Colin said:
That's surely the point. The shoe state is not the same. In the [4,4,4,5] example, having split to {4,N 4,N 4,5}, we know it is depleted by two non-fours, compared to if we were evaluating a single hand of 21.

You're killing me London. I think I know how my CA works and it's already been established that my values are exact and not estimates. YOU DO NOT NEED TO RECALCULATE THE DEALER PROBS.

London Colin · Jun 23, 2011

MGP said:
You're killing me London. I think I know how my CA works and it's already been established that my values are exact and not estimates. YOU DO NOT NEED TO RECALCULATE THE DEALER PROBS.

I'm agreeing with you, if only you would take the time to notice.

iCountNTrack · Jun 23, 2011

MGP said:
You're killing me London. I think I know how my CA works and it's already been established that my values are exact and not estimates. YOU DO NOT NEED TO RECALCULATE THE DEALER PROBS.

Let's all kiss and make up, and use my algorithm where you don't even need to calculate the dealer probs to get player's evs.

assume_R · Jun 23, 2011

London Colin said:
One of the attractions of starting a brand new CA (not just amending or even replacing Eric's splitting code) would be the ability to used things like boost and the STL to come up with code that is easier to follow and maintain.

Of course the main drawback of starting a brand new CA is that it seems like such a colossal undertaking!

Eric himself has said that he was learning C++ as he went along to some degree, and if he were to start again he'd do a lot of things differentlly.

But the particular issue here isn't the mechanism of dealer-prob re-use. That's working fine. It's all about the splitting algorithm. It would require dealer probabilities not just in the context of a particular set of known cards having being dealt, but some unknown ones too! We have partial information, a number of cards have been dealt which are not of one particular denomination, but could be any other.

An alternative splitting algorithm, not suffering from this problem, might possibly slot in as a replacement.

Yeah, boost and stl really helped me in my CA, especially with hash maps, and boost also has a built in "serialize" library which can save any c++ object to a file. I use this to save strategies and game rules and bonuses to an xml file. Also as an aside, if you do ever make one, I recommend wxWidgets as a cross-platform GUI, which I use and my CA runs on both Windows and Linux.

If you want to try my CA out, PM me and I'll send you a link with binaries as well as the source code and a wiki help website.

My splitting algorithm is coded recursively (my entire CA is coded the same way: recursively), as I described to MGP in a previous post, and except for that darned 10,10 vs 6 resplit 3 times my answers exactly match all of his for every case in his screenshot. So in mine partial information never comes into play. I never looked at Eric's code, so unfortunately I can't help you with that!

MangoJ · Jun 23, 2011

London Colin said:
That's surely the point. The shoe state is not the same. In the [4,4,4,5] example, having split to {4,N 4,N 4,5}, we know it is depleted by two non-fours, compared to if we were evaluating a single hand of 21.

Eric does the same optimization as you do, reusing the dealer probs whenever possible. But because of the particular splitting algorithm he is using, he is also reusing them where it is not strictly possible.

That being said, it might well be possible to replace the splitting routine with an entirely new one, without having to vastly modify the existing infrastructure, so that the shoe state is indeed always the same whenever the dealer probs are called upon.

Ok, let's simplify notation a bit. Say that [a,b,c,d,...] is a set of dealer probabilities, where the shoe lacks of the cards a,b,c,d,...

If I read your question right, you say that you need the probability [4,4,4,5,N,M] where N and M is a non-4. All you notice in Erics code is the usage of [4,4,4,5] (and possibly [4,4,4,4,5] and [4,4,4,4,4,5] for deeper resplits.)

My statement was, you can indeed calculate [4,4,4,5,N,M] (the non-fours) by re-using the already known [4,4,..,4,5]'s.

Example: Let's calculate X = [4,4,4,5,N] the dealer probabilities with a non-four N. The naive approach for Single deck would be to calculate
X = 4/47 [4,4,4,5,A] + 4/47 [4,4,4,5,2x] + 4/47 [4,4,4,5,3] +3/47 [4,4,4,5,5] + 4/47 [4,4,4,5,6] + 4/47 [4,4,4,5,7] + 4/47 [4,4,4,5,8] +4/47 [4,4,4,5,9] + 16/47 [4,4,4,5,T]
where the addition has statistical weights accordingly to the correct probability of drawing the next card (provided it is a non-four).
There is a better approach, which doesn't involve the calculation of 9 dealer probabilities:
First we notice, that
4/48 [4,4,4,5,A] + 4/48 [4,4,4,5,2] + 4/48 [4,4,4,5,3] +1/48 [4,4,4,5,4] 1/48 [4,4,4,5,5] + 4/48 [4,4,4,5,6] + 4/48 [4,4,4,5,7] + 4/48 [4,4,4,5,8] +4/48 [4,4,4,5,9] + 16/48 [4,4,4,5,T]
= [4,4,4,5]
since the last card drawn (including 4), averaged over the drawing probabilitites (including 4) is just an additional burn card - which even if known to the dealer - does not change dealer "strategy".

We can use that relation to calculate X:
X = 48/47 ([4,4,4,5] - 1/48 [4,4,4,5,4])

Since [4,4,4,5,4] = [4,4,4,4,5] is known from the resplit algorithm beforehand, there is no need to calculate other probabilities besides the set of [4,4,...,4,5].

The same can be done for [4,4,4,5,N,M]. Formula will be more complex, but nothing impossible.

I don't know if Eric uses such an algorithm for the shifted dealer probabilities. But if I had to do it, I would do it this way as it doesn't cost anything (just computing the coefficients, no new dealer probabilities are needed).

@Caches:
For my CA (2nd version) I use a precalculated database of dealer probabilities including total-dependent stand, hit and double values (for players!), which is about 360MB (per dealer upcard). It has a depth of 14 cards (more is possible of course) and fits nicely into memory. From there, traveling through nested hands is extremely fast.
That means, I can handle multi-hand splits reaching 14 cards total entirely from database (where the last=deepest split-hand's hit/stand/double doesn't need computational time). Of course 14 is not much of a limit, but 20 is also doable.

London Colin · Jun 23, 2011

Mango,

I may still be misunderstanding you, but you seem to be suggesting that it should be possible to infer the probability of the dealer drawing to a particular total (say 17) with the shoe in one particular state, from knowledge of that probability with the shoe in other states, by a process of averaging ??

That can't be right, can it? If we could do that then we wouldn't have to calculate so many of the darned things in the first place.

caveat: once again I feel the need to repeat my "I am not a mathematician" mantra.

MangoJ · Jun 23, 2011

London Colin said:
Mango,

I may still be misunderstanding you, but you seem to be suggesting that it should be possible to infer the probability of the dealer drawing to a particular total (say 17) with the shoe in one particular state, from knowledge of that probability with the shoe in other states, by a process of averaging ??

Not for general shoe states. But for the simplest state of a "non-four".

Thats conditional analysis:
[1] p(X) = sum_N p(X|YN) p(YN)

If X: "Dealer draws to 17 from the deck*", YN: "N removed from the deck*" (where deck* is a Single deck with 4,4,4,5 already removed by simplicity), then

p(X) is the stand17 entry of [4,4,4,5]
p(X̣|YN) is the stand17 entry of [4,4,4,5,N]
and p(YN) the probability of drawing card N from deck*

If you are looking for "probability of standing on 17 with 4,4,4,5 and a non-four removed", let's write it as p(X|Y~4), you can rewrite relation [1] into
[2] p(X) = p(X|Y~4) p(Y~4) + p(X|Y4) p(Y4)

And hence you can solve for p(X|Y~4) by computing
p(X) from your stand17-entry of [4,4,4,5],
p(X|Y4) from your stand17-entry of [4,4,4,4,5]
and p(Y4) from the deck state itself (= 1/48),
and p(Y~4) = 1 - p(Y4).

to get
[4,4,4,5,non-4] = ([4,4,4,5] - 1/48 [4,4,4,4,5]) / (47/48)
= 48/47 [4,4,4,5] - 1/47 [4,4,4,4,5]
for each entry (so stand17-stand21, BJ and bust)

The reason it works for the dealer, and not for the player: Dealers "strategy" is to draw to 17, and does not depend on the cards already dealt, nor the players total.

MangoJ · Jun 23, 2011

Sorry for double post.

London Colin said:
If we could do that then we wouldn't have to calculate so many of the darned things in the first place.

One does not save much computation time by this relation.
If you would have [a,b,c,A], [a,b,c,2],...,[a,b,c,T], then you can indeed infer
[a,b,c] from that by weighting probabilities. So if you know all N-card states, you can calculate all smaller M-card states by that.
This would be a leaf-to-root approach in the drawing tree for dealer probabilities. However it is not very efficient, for a single state you would need to know 10 more larger states. So this algorithm doesn't make sense unless you already know the larger states by some other means.
It may however save ~11-20% (just a guess) of time if you need to fill a database of dealer probabilities. Start with the leafs, and work up your way.

In the split case the scenario is slightly different, you would need the state including a non-card. Here you have the option of calculating 9 different states (for the 9 options of a non-card) vs. 2 states (without the card, and with the card). Even better, those 2 states are already known from previous calculation.

As a defense, I must state that I currently don't perform resplit calculations in my CA, hence this drawing talk is purely theoretical :laugh: for me^^

London Colin · Jun 24, 2011

Thanks Mango.

My intuition was that such averaging calculations were not possible, which is why I wanted to double check that you were saying what I thought you were saying.

Just goes to show, I shouldn't trust my intuition!

MGP · Jun 24, 2011

MangoJ said:
Sorry for double post.

One does not save much computation time by this relation.
If you would have [a,b,c,A], [a,b,c,2],...,[a,b,c,T], then you can indeed infer
[a,b,c] from that by weighting probabilities. So if you know all N-card states, you can calculate all smaller M-card states by that.
This would be a leaf-to-root approach in the drawing tree for dealer probabilities. However it is not very efficient, for a single state you would need to know 10 more larger states. So this algorithm doesn't make sense unless you already know the larger states by some other means.
It may however save ~11-20% (just a guess) of time if you need to fill a database of dealer probabilities. Start with the leafs, and work up your way.

In the split case the scenario is slightly different, you would need the state including a non-card. Here you have the option of calculating 9 different states (for the 9 options of a non-card) vs. 2 states (without the card, and with the card). Even better, those 2 states are already known from previous calculation.

As a defense, I must state that I currently don't perform resplit calculations in my CA, hence this drawing talk is purely theoretical :laugh: for me^^

See my previous post.

k_c · Jun 24, 2011

iCountNTrack said:
I realized that my CA doesn't record spl2 and spl3 EVs. I tried to tinker with my code to get those values but i think the combination of sun and alcohol has made my mind slower . Anyway the EV of the first split of 9,9 vs 6 for 1D split 3 times DAS calculated optimally is +41.53314%. This value includes the
possibilty of a second and third splits (whenever they are optimal).

However as i have mentioned earlier i am not sure how useful my figure is, because it is like comparing apples to oranges, besides i feel it is always tedious to analyze somebody's else code especially when you dont fully understand the algorithm. We need to standardize the comparison (fixed, optimal, total...)

The rules determine how splitting is limited.

spl1 means that 1 split is all that is allowed by the rules.
spl2 means that 2 splits are all that are allowed by the rules.
spl3 means that 3 splits are all that are allowed by the rules.
.
.
spln means that n splits are all that are allowed by the rules.

Programs generally compute for limitations of 1, 2, and 3 splits.

iCountNTrack · Jun 24, 2011

k_c said:
The rules determine how splitting is limited.

spl1 means that 1 split is all that is allowed by the rules.
spl2 means that 2 splits are all that are allowed by the rules.
spl3 means that 3 splits are all that are allowed by the rules.
.
.
spln means that n splits are all that are allowed by the rules.

Programs generally compute for limitations of 1, 2, and 3 splits.

iCountNTrack said:
First split means the overall ev of the hand. That is because the first split will include all the subsequent group of optimal decisions(hit, stand, double, resplit(hit, stand,double, resplit(hit, stand, double) ) taken based on all the cards seen. I actually thought about it a little more and figured out why I decided not to include split2 and split 3 that is because when we have an optimal changing strategy, you cannot have ONE value for the ev of split2 and split3, that is because the ev of split2 will depend one the cards in hand 1 and the ev of split 3 will depend on the cards in hand 1 and hand 2.
There is of course also the possibility of resplitting on the first hand which most CAs report in general but again that will be only reporting one of the many many possible ev values for a second split and third split.

See above

London Colin · Jun 24, 2011

iCountNTrack said:
First split means the overall ev of the hand. That is because the first split will include all the subsequent group of optimal decisions(hit, stand, double, resplit(hit, stand,double, resplit(hit, stand, double) ) taken based on all the cards seen. I actually thought about it a little more and figured out why I decided not to include split2 and split 3 that is because when we have an optimal changing strategy, you cannot have ONE value for the ev of split2 and split3, that is because the ev of split2 will depend one the cards in hand 1 and the ev of split 3 will depend on the cards in hand 1 and hand 2.
There is of course also the possibility of resplitting on the first hand which most CAs report in general but again that will be only reporting one of the many many possible ev values for a second split and third split.

Click to expand...

See above

I think maybe the issue here is whether SPL1, SPL2, SPL3 values mean the EV of a fixed strategy which will split that number of times, or the EV available under those splitting rules when following any given strategy.

I suppose those two things will usually be identical, but if you are generating optimal strategy, then for example if your SPL3 shows a gain in EV compared to SPL2 that just shows that splitting a third time is sometimes the best choice. The EV reported will be a composite of sometimes splitting, sometimes not.

Does that make sense?

MangoJ · Jun 24, 2011

London Colin said:
I think maybe the issue here is whether SPL1, SPL2, SPL3 values mean the EV of a fixed strategy which will split that number of times, or the EV available under those splitting rules when following any given strategy.

I suppose those two things will usually be identical, but if you are generating optimal strategy, then for example if your SPL3 shows a gain in EV compared to SPL2 that just shows that splitting a third time is sometimes the best choice. The EV reported will be a composite of sometimes splitting, sometimes not.

In that case, for a fixed strategy you would resplit as deep as possible, and one could simply define SPL3 to split exactly 3 times.
For optimum play in SPL3 ruleset, one would just calculate SPL1, SPL2, and SPL3 values, and take the maximum for EV. This way, you keep the option of not resplitting even if allowed.

London Colin · Jun 24, 2011

MangoJ said:
In that case, for a fixed strategy you would resplit as deep as possible, and one could simply define SPL3 to split exactly 3 times.

Although a fixed strategy might in principle involve resplitting less than is allowed by the rules. It's a question I flirted with in an earlier post - what is the precise definition of a CDP strategy? It allows for post-split hands to be played differently from pre-split, but does that include potentially playing the pairs differently?

If it did, and there ever was a situation in which resplitting should not be carried out as often as is allowed, then that must mean you get the same EV reported for SPLn+1 as for SPLn.

MangoJ said:
For optimum play in SPL3 ruleset, one would just calculate SPL1, SPL2, and SPL3 values, and take the maximum for EV. This way, you keep the option of not resplitting even if allowed.

That's what I mean; you extract the maximum EV at every decision point, and the net result will be different depending on the SPLn rules, so there is something meaningful to be logged against the headings SPL1, SPL2, etc.

iCountNTrack · Jun 24, 2011

London Colin said:
I think maybe the issue here is whether SPL1, SPL2, SPL3 values mean the EV of a fixed strategy which will split that number of times, or the EV available under those splitting rules when following any given strategy.

I suppose those two things will usually be identical, but if you are generating optimal strategy, then for example if your SPL3 shows a gain in EV compared to SPL2 that just shows that splitting a third time is sometimes the best choice. The EV reported will be a composite of sometimes splitting, sometimes not.

Does that make sense?

MangoJ said:
In that case, for a fixed strategy you would resplit as deep as possible, and one could simply define SPL3 to split exactly 3 times.
For optimum play in SPL3 ruleset, one would just calculate SPL1, SPL2, and SPL3 values, and take the maximum for EV. This way, you keep the option of not resplitting even if allowed.

That is not how optimum post-split strategy actually works. As i have mentioned there is not one ev value for split2 and split3 when finding post split optimal strategy that is because you have gazillion possibilities. Let me give an example

so we have a 8,8 vs 9.
so we split the hand,

we get a deuce on the first eight,

we have an 8,2 , now we need to find the optimal playing decision (hit, stand, double (if DAS is allowed)), taking into account the 8 on the hand as well as dealer's 9, optimal decision in this case is to double, so we double and we get a 9 ,first hand is now 8,2,9.
Now we have to play the second hand, we have an 8, and we get another 8
so we now 8,8 on second hand. We need to find the optimal strategy for second hand (hit, stand, double (if DAS is allowed), split (ie split 2)) taking into account the cards seen from first hand (8,2,9) as well as dealer's 9. Splitting (resplitting , split2, whatever you want to call it) would be the optimal decision however the VALUE of EV will not be unique because it will depend on the cards dealt to the first hand, so for instant had we gotten 8,2,2 on the first hand, we would have had a different value for split2 EV.

With fixed strategies you wouldn't have this problem and you only get one value for split2 and one value for split3. However as shown in the above split2 and split3 for post-split optimal strategy is ill-defined without knowing the composition of the split hands meaning. In other words, if i have 8,8 vs 9
I cannot determine a single value for split2, however if we have
8,2,8 on first split hand and 8,8 on second split hand vs a dealer's 9 . I CAN calculate a unique split2 EV.

Also one last word about the first split EV with post-split optimal strategy. The first split EV would represent the overall EV of the round if one decides to split, that is because this calculated EV is the collection of all the optimal decisions(hit, stand, double, split2(hit, stand, double, split3(hit, stand, double))) on the child sequences.
A perfect analogy is when one calculates the EV of hitting, which includes the collection of optimal decisions(hit, stand) on the 10 child sequences.

assume_R · Jun 24, 2011

iCountNTrack said:
That is not how optimum post-split strategy actually works. As i have mentioned there is not one ev value for split2 and split3 when finding post split optimal strategy that is because you have gazillion possibilities. Let me give an example

...

With fixed strategies you wouldn't have this problem and you only get one value for split2 and one value for split3. However as shown in the above split2 and split3 for post-split optimal strategy is ill-defined without knowing the composition of the split hands meaning. In other words, if i have 8,8 vs 9
I cannot determine a single value for split2, however if we have
8,2,8 on first split hand and 8,8 on second split hand vs a dealer's 9 . I CAN calculate a unique split2 EV.

...

For this exact reason I chose to make my CA use a fixed strategy. I don't understand the whole point of calculating optimal strategies if they won't ever be used. I know you said once it's about knowing the "optimal EV" of a game, but again, shouldn't our CA help us in reality in some way?

I can see if a person is willing to learn 2CD strategies, then maybe that could be implemented.... but I don't use 2CD strategies, and I don't know anybody else who does. Everybody I know uses fixed TD strategies, or count-based TD strategies (i.e. indices), or n-card-TD strategies (which my CA handles for Spanish).

So why go through all the trouble of programming it otherwise? Is it just an exercise in theory with no practical application? Or is there some other end-goal which possibly require the use of optimal strategies?

MangoJ · Jun 24, 2011

iCountNTrack said:
That is not how optimum post-split strategy actually works. As i have mentioned there is not one ev value for split2 and split3 when finding post split optimal strategy that is because you have gazillion possibilities. Let me give an example

so we have a 8,8 vs 9.
so we split the hand,

we get a deuce on the first eight,

we have an 8,2 , now we need to find the optimal playing decision (hit, stand, double (if DAS is allowed)), taking into account the 8 on the hand as well as dealer's 9, optimal decision in this case is to double, so we double and we get a 9 ,first hand is now 8,2,9.
Now we have to play the second hand, we have an 8, and we get another 8
so we now 8,8 on second hand. We need to find the optimal strategy for second hand (hit, stand, double (if DAS is allowed), split (ie split 2)) taking into account the cards seen from first hand (8,2,9) as well as dealer's 9. Splitting (resplitting , split2, whatever you want to call it) would be the optimal decision however the VALUE of EV will not be unique because it will depend on the cards dealt to the first hand, so for instant had we gotten 8,2,2 on the first hand, we would have had a different value for split2 EV.

This is correct, but not much of a "problem" in terms of concept, but only from computational effort and time. You "just" cycle recursively (or by any other method) through all first hands, and for evaluation of "stand" you don't convolute against dealer probabilities, but you just cycle then through the next hand (for each standing first hand, taking into account cards consumed by first hand). If the computational effort for a single hand (with no split allowed) is N, then for no resplit (split1) it is N². Likewise for any deeper levelit is N³ (split2) and N^4 (split3).

This is the thing to do if you want optimum play EV. With proper caching technique not only on dealer, but also on player hands, the deepest hand can be precomputated, which reduces the problem to N³, which is doable nowadays.

I don't think optimal strategy is in any way ill-defined. Sure it is costly to compute, but far from impossible.

Eric Farmer code: wrong split EV estimates ?

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member