Your algorithm seems to be working fine. As you mentioned later in this thread, my latest version will compute both CDP and CDP1 for comparison.
I think this highlights the challenge we encounter when trying to *optimize* a strategy, as opposed to the relative ease with which we can *evaluate* some particular strategies that are easy to specify.
Although it is often (almost always?) the case, it is *not* true in general that the split EV for CDP is >= that for CDP1. This may seem counter-intuitive, especially when you consider that the "component" expected values that make up the split EV *do* satisfy CDP >= CDP1. (By "component" EVs I mean, for example, EV[X;a,0] in my earlier notation: the expected value of playing out a single hand, without resplitting, given that *a* additional pair cards have been removed from the deck.)
The problem is that the overall split EV is just some linear combination of these component EVs... and the coefficients of those components are not necessarily positive. For example, in the 10-10 vs. 7 case above (actually, vs. any non-10 up card) for SPL3, the overall split EV may be expressed as:
Code:
- 10/9*EVp[0, 0] - 20/17*EVp[1, 0] + 15/17*EVp[2, 0] - 7/51*EVp[3, 0]
+ 2*EVx[0, 0] + 20/9*EVx[1, 0] + 40/17*EVx[2, 0] - 30/17*EVx[3, 0]
+ 14/51*EVx[4, 0]
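To make the sign structure concrete, here is a small Python sketch that evaluates this particular linear combination exactly, using the standard library's `Fraction`. The coefficients are copied from the expression above; the component values passed in are placeholders for whatever the combinatorial analysis produces, so the all-ones call at the end is only a sanity check on the coefficients themselves.

```python
from fractions import Fraction as F

# Coefficients of the component EVs in the overall SPL3 split EV
# for 10-10 vs. a non-10 up card, copied from the expression above.
# Note that several of them are negative.
COEF_P = [F(-10, 9), F(-20, 17), F(15, 17), F(-7, 51)]       # EVp[a, 0], a = 0..3
COEF_X = [F(2), F(20, 9), F(40, 17), F(-30, 17), F(14, 51)]  # EVx[a, 0], a = 0..4

def split_ev(evp, evx):
    """Overall split EV as a linear combination of the component EVs."""
    assert len(evp) == len(COEF_P) and len(evx) == len(COEF_X)
    return (sum(c * v for c, v in zip(COEF_P, evp))
            + sum(c * v for c, v in zip(COEF_X, evx)))

# Sanity check: with every component equal to 1, the result is just
# the sum of the coefficients.
print(split_ev([1] * 4, [1] * 5))  # -> 542/153, about 3.54
```

Because some coefficients are negative, making one component larger (as CDP does relative to CDP1) can make the value returned by `split_ev` smaller.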
In this case (indeed, in most cases, based on a few spot checks using the formula in the paper), EVx[3,0] is the culprit: although this component value is greater for CDP (0.36037129537129536) than for CDP1 (0.35433011433011435), its coefficient in the expression above is negative (-30/17), so the larger value actually *hurts* the overall split EV.
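A quick back-of-the-envelope check in Python, using the -30/17 coefficient of EVx[3, 0] from the expression above and the two component values just quoted, shows how much this one term costs CDP relative to CDP1:

```python
from fractions import Fraction

coef = Fraction(-30, 17)          # coefficient of EVx[3, 0] in the SPL3 expression
evx3_cdp  = 0.36037129537129536   # component value under CDP
evx3_cdp1 = 0.35433011433011435   # component value under CDP1

# CDP's component is larger...
assert evx3_cdp > evx3_cdp1

# ...but because the coefficient is negative, this term contributes
# *less* to CDP's overall split EV than to CDP1's.
delta = float(coef) * (evx3_cdp - evx3_cdp1)
print(delta)  # about -0.0107 per unit wagered, via this term alone
```

Of course the other components differ between the two strategies as well, so this is only the contribution of the one term, not the full gap between the two overall split EVs.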
This is not confined to these small, "pathological" shoes. The same phenomenon occurs for 10-10 vs. 7 in single deck; it's just not as pronounced.
As I said at the start, this may seem counter-intuitive, since we are "using more information" in CDP1. But we are only using more information to optimize individual "components" of expected value; it is the combination of those components that yields the *overall* value, which we unjustifiably expect to be optimized as well.