Software simulator and unexpected results

London Colin

Well-Known Member
#21
k_c said:
I think the algorithm is OK. The assumption, of course, is that random numbers are generated by rand() in C++.
Did you look at this part of the Wikipedia article? ...
Care needs to be taken in implementing the Knuth shuffle; even slight deviations from the correct algorithm will produce biased shuffles. For example, working your way through the pack swapping each card in turn with a random card from any part of the pack is an algorithm with n^n different possible execution paths, yet there are only n! permutations, and a counting argument based on using the pigeonhole principle on assignments of execution paths to outcomes shows that this algorithm cannot produce an unbiased shuffle, unlike the true Knuth shuffle, which has n! execution paths which match up one-to-one with the possible permutations.

For an example of how this pigeonhole argument works on the n^n shuffle, consider the simple case of shuffling only three cards, so that n=3. Then there are n^n = 27 different possible execution paths (these are the 27 things to put in the pigeonholes). But there are only 3! = 6 ways to permute three cards (these are the 6 pigeonholes to put the things in). 27 is odd, and 6 is even, so 27 things cannot be divided evenly into 6 pigeonholes. Therefore, if we assume that the execution paths are equiprobable, then the outcomes cannot be, since at least two of the different outcomes must have different probabilities. Therefore the n^n shuffle cannot be unbiased for n=3. Similar reasoning can be applied for other values of n > 2.

k_c said:
The other problem is that even if it is guaranteed that the numbers are random, if the number of possible random numbers is not a multiple of 52 then a bias will exist. For example if we wanted to generate a random number in the range 0 to 51 and we are given a guaranteed random number from the range 0 to 77 instead of 0 to 32767:

Define r() = random number 0 to 77

num = r() % 52 is always a number 0-51

Here the bias is more obvious though. Since r() is in the range 0-77, num is twice as likely to be 0-25 when compared to 26-51. The problem would be fixed if we could somehow redefine r() to be a guaranteed random number in the range 0-51, but that would solve the problem without having to do anything else. :laugh:
Indeed. The purpose of my randint(n) function is to produce an unbiased, random number in the range 0..n-1. This is needed for use by the Knuth (aka Fisher-Yates) shuffle ...

For a 52-card deck, cards[0] to cards[51] -
  • swap cards[51] with a random card from cards[0 to 51]
  • swap cards[50] with a random card from cards[0 to 50]
  • ....
  • swap cards[01] with a random card from cards[0 to 01]
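A minimal sketch of what that could look like in C++ (the name randint follows the post above, but this is illustrative code, not Colin's actual implementation; the rejection loop is one standard way to remove the modulo bias k_c described, assuming rand() as the underlying source):

```cpp
#include <algorithm>  // std::swap
#include <cstdlib>    // std::rand, RAND_MAX

// Unbiased random integer in 0..n-1 built on rand(): rejection
// sampling discards draws from the incomplete "tail" of rand()'s
// range, so every residue 0..n-1 stays equally likely.
int randint(int n)
{
    const int limit = (RAND_MAX / n) * n;  // largest accepted bound, a multiple of n
    int r;
    do {
        r = std::rand();
    } while (r >= limit);                  // reject the biased tail
    return r % n;
}

// Knuth / Fisher-Yates shuffle: exactly the swap sequence listed above.
void shuffle_deck(int cards[], int size)
{
    for (int i = size - 1; i > 0; --i) {
        int j = randint(i + 1);            // 0 <= j <= i
        std::swap(cards[i], cards[j]);
    }
}
```

With n = 52 and RAND_MAX = 32767, for instance, the loop rejects only the top 8 values (32760-32767), so the cost of the rejection is negligible.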
 

sagefr0g

Well-Known Member
#22
fabietto said:
Hi Sage,

thanks for your interest. Please find the sources attached to this message. Unfortunately I haven't found the time tonight to clean up the code, which is a little bit messy. I'll keep working on it over the next few days, implementing new functions and making it more user-friendly. Even as it is, I think it should be a good starting point.

To date I've tested it only on Mac OS X 10.6, compiled with g++ 4.2.1. Since it doesn't rely on any particular library or OS-specific function, in principle it should compile smoothly on any platform.

All the parameters you can play around with are stored in the defaultParameters.h file; set them before compiling.

Any kind of feedback or advice about possible improvements would be greatly appreciated.
well, thank you very much fabietto!
i'm probably not competent to give any feedback but i believe there are plenty of others on this forum who are. i've long forgotten the c and basic i once had some limited ability with, but i'll probably be able to stumble my way through most of your code. currently i'm trying to hack together a simulator in excel. lmao, it once took me ten days to figure out a fix for one problem.
it's good sharing ideas around though, i think. thanks again.
great way to learn. the posts of k_c's and colin's were certainly illuminating.
 

ExhibitCAA

Well-Known Member
#23
Skimming over the thread, here are the various issues that popped out in my mind:

1. It seems that you are allowing the player to hit multiple times and even double down on split Aces. This is NOT typical in modern blackjack. Most casinos restrict split Aces to receiving one and only one card on each. So, if a player split AA, resulting in A5 and AK, he would have a total of 16 on one hand and a total of 21 (NOT a natural blackjack) on the other. The player is stuck with the A5, and is not allowed to hit or double down on it. Since most published numbers on the edge for various sets of rules impose this restriction without explicitly mentioning it, your numbers would not match, and you might infer that you have a bug. When they do mention this rule, it is acronymized as SPA1 (SPlit Aces receive only 1 card each).

2. As Ken Smith pointed out, you must make sure that if the player splits Aces and receives a Ten, the resulting hand is only a regular 21, not a natural blackjack. The regular 21 will only push (tie) if the dealer draws a 21. Though you say you check for BJs first, that is not the issue. It is correct to check for dealer BJ and player BJ first; if there are none, then the code should not allow any subsequent payoffs of 3:2, NOR SHOULD IT allow a split-created 21 (AA ---> A5 and AK) to beat a dealer's hit-created 21. More on this bug can be found at (Dead link: http://www.beyondcounting.com/bb/viewtopic.php?t=150)

3. You should check for dealer BJ (as well as player BJ) before playing out any hand. If you fail to check for dealer BJ right away, you open up the door for a bug allowing the player's hit-created 21 to push the dealer BJ, which is wrong.

4. Don't pay busted hands!
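For what it's worth, points 1-4 can be folded into the settlement logic roughly like this (Hand, is_natural and settle are hypothetical names for illustration only, not code from the simulator under discussion):

```cpp
// Hypothetical hand record: the final total plus how it was made.
struct Hand {
    int  total;
    bool two_cards;   // exactly two cards
    bool from_split;  // hand created by splitting a pair
};

// A natural blackjack is a two-card 21 on an un-split hand only;
// A-T after splitting Aces is just a regular 21 (point 2).
bool is_natural(const Hand& h)
{
    return h.total == 21 && h.two_cards && !h.from_split;
}

// Settlement per unit bet, called only after both players' naturals
// have been resolved up front (point 3). A split-created 21 then
// merely pushes a dealer's drawn 21, and busts always lose (point 4).
double settle(const Hand& player, const Hand& dealer)
{
    if (player.total > 21) return -1.0;  // never pay a busted hand
    if (dealer.total > 21) return  1.0;
    if (player.total > dealer.total) return  1.0;
    if (player.total < dealer.total) return -1.0;
    return 0.0;                          // push, including 21 vs 21
}
```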
 

sagefr0g

Well-Known Member
#24
ExhibitCAA said:
Skimming over the thread, here are the various issues that popped out in my mind:

1. It seems that you are allowing the player to hit multiple times and even double down on split Aces. This is NOT typical in modern blackjack. Most casinos restrict split Aces to receiving one and only one card on each. So, if a player split AA, resulting in A5 and AK, he would have a total of 16 on one hand and a total of 21 (NOT a natural blackjack) on the other. The player is stuck with the A5, and is not allowed to hit or double down on it. Since most published numbers on the edge for various sets of rules impose this restriction without explicitly mentioning it, your numbers would not match, and you might infer that you have a bug. When they do mention this rule, it is acronymized as SPA1 (SPlit Aces receive only 1 card each).

2. As Ken Smith pointed out, you must make sure that if the player splits Aces and receives a Ten, the resulting hand is only a regular 21, not a natural blackjack. The regular 21 will only push (tie) if the dealer draws a 21. Though you say you check for BJs first, that is not the issue. It is correct to check for dealer BJ and player BJ first; if there are none, then the code should not allow any subsequent payoffs of 3:2, NOR SHOULD IT allow a split-created 21 (AA ---> A5 and AK) to beat a dealer's hit-created 21. More on this bug can be found at (Dead link: http://www.beyondcounting.com/bb/viewtopic.php?t=150)

3. You should check for dealer BJ (as well as player BJ) before playing out any hand. If you fail to check for dealer BJ right away, you open up the door for a bug allowing the player's hit-created 21 to push the dealer BJ, which is wrong.

4. Don't pay busted hands!
great heads-up points imho.
i'd also like the program to recognize player bj & dealer bj pushes.
 

k_c

Well-Known Member
#25
London Colin said:
Did you look at this part of the Wikipedia article? ...




Indeed. The purpose of my randint(n) function is to produce an unbiased, random number in the range 0..n-1. This is needed for use by the Knuth (aka Fisher-Yates) shuffle ...

For a 52-card deck, cards[0] to cards[51] -
  • swap cards[51] with a random card from cards[0 to 51]
  • swap cards[50] with a random card from cards[0 to 50]
  • ....
  • swap cards[01] with a random card from cards[0 to 01]

A few years ago I took a few classes in C++ programming. One of the textbooks I had was C++ How To Program by Deitel and Deitel (mine was the Fourth Edition). In my opinion it was a very good treatment of C++, and it included algorithms for various problems, among them a shuffling algorithm. To me it makes sense: you just go through the shoe and swap the card in each slot with a random card. It's not a physical shuffle but a computer-generated random shuffle. I think for all intents and purposes it is good enough to deal a random simulation, as long as the random numbers generated are representative of the true probabilities. I trust that the C++ rand() function does that.

On one of the pages you provided links for was this:
Code:
Implementation of Durstenfeld's algorithm in ObjectiveC is:

        int n = [someMutableArray count];
        while (n > 1) {
                int rnd = arc4random() % n;
                int i = n - 1;
                [someMutableArray exchangeObjectAtIndex:i withObjectAtIndex:rnd];
                n--;
        }
Maybe this is a better algorithm :confused:. In the first iteration the last card can be swapped with any card, including itself. In the next iteration the next-to-last card can be swapped with any card including itself, but excluding the last card. Each succeeding iteration excludes more and more cards, until the initial card is reached, at which point the only possibility is that it is swapped with itself.

In the algorithm I gave:
In the first iteration the first card can be swapped with any card, including itself. In each successive iteration, a card from any position can be swapped into the current position, and the current position may also be swapped with itself. In my algorithm the whole shoe is always open to be swapped with the current position, whereas in the algorithm above, once a position is filled, that card is what will be there at the end of the shuffle; it can't be moved by later iterations of the algorithm.

I think this is what you wanted to point out. I don't know why more possible swaps would make for a less random shuffle, though.
 

QFIT

Well-Known Member
#26
k_c said:
I think for all intents and purposes it is good enough to use to deal a random simulation as long as the random numbers generated are representative of the true probabilities. I trust that the C++ rand() function does that.
Whoa. You won't see me often use acronyms like ROTFLOL. But where did you get that impression? Compiler RNGs are famously awful. I mean truly horrid. Good for maybe 30,000 numbers, not billions.
 

k_c

Well-Known Member
#27
QFIT said:
Whoa. You won't see me often use acronyms like ROTFLOL. But where did you get that impression? Compiler RNGs are famously awful. I mean truly horrid. Good for maybe 30,000 numbers, not billions.
Glad I could give you a good laugh. :grin:

Anyway, what I'm working to simulate would be exact calculations for random shoe compositions, so I wouldn't need an overwhelming number of trials.
 

QFIT

Well-Known Member
#28
k_c said:
Glad I could give you a good laugh. :grin:

Anyway, what I'm working to simulate would be exact calculations for random shoe compositions, so I wouldn't need an overwhelming number of trials.
If you mean representative compositions, this gives inaccurate results.
 

London Colin

Well-Known Member
#29
k_c said:
I think this is what you wanted to point out. I don't know why more possible swaps would make for a less random shuffle, though.
The example of a deck of just three cards makes it clear.

Say we have three cards labelled A, B and C; our deck starts in sorted order : ABC. After we shuffle it, there are 6 possible orderings which we want to be equiprobable : ABC ACB BAC BCA CAB CBA.

As stated in Wikipedia, your algorithm gives 27 possible execution paths. That's just about small enough to enumerate them all, in order to demonstrate the problem ...

I'll number the positions 1,2,3; so that the starting point is 1=A, 2=B, 3=C. And I'll use notation such as (2,3) = ACB to mean swap position 2 with position 3 to give the new order : ACB.

Thus the 27 possible outcomes are -
Code:
[B]Swap #1        Swap #2         Swap #3    Final Ordering[/B]
(1,1) = ABC    (2,1) = BAC     (3,1) =    CAB
                               (3,2) =    BCA
                               (3,3) =    BAC
               (2,2) = ABC     (3,1) =    CBA
                               (3,2) =    ACB
                               (3,3) =    ABC
               (2,3) = ACB     (3,1) =    BCA
                               (3,2) =    ABC
                               (3,3) =    ACB

(1,2) = BAC    (2,1) = ABC     (3,1) =    CBA
                               (3,2) =    ACB
                               (3,3) =    ABC
               (2,2) = BAC     (3,1) =    CAB
                               (3,2) =    BCA
                               (3,3) =    BAC
               (2,3) = BCA     (3,1) =    ACB
                               (3,2) =    BAC
                               (3,3) =    BCA

(1,3) = CBA    (2,1) = BCA     (3,1) =    ACB
                               (3,2) =    BAC
                               (3,3) =    BCA
               (2,2) = CBA     (3,1) =    ABC
                               (3,2) =    CAB
                               (3,3) =    CBA
               (2,3) = CAB     (3,1) =    BAC
                               (3,2) =    CBA
                               (3,3) =    CAB
As was stated, the fact that 6 doesn't divide into 27 tells us that there has to be a bias somewhere; now we can add up the numbers for each final permutation and see where that bias falls -

ABC: 4, ACB: 5, BAC: 5, BCA: 5, CAB: 4, CBA: 4

So, even with a perfect RNG, the algorithm makes three of the possible results occur with a probability of 4/27, and the other three 5/27.


The Knuth shuffle, on the other hand, has 6 execution paths, corresponding to the 6 possible results -
Code:
(3,1) = CBA    (2,1) = BCA
               (2,2) = CBA

(3,2) = ACB    (2,1) = CAB
               (2,2) = ACB

(3,3) = ABC    (2,1) = BAC
               (2,2) = ABC
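The tallies above can be double-checked by brute force. A small sketch (illustrative code, not from any simulator in this thread) that walks all 27 execution paths of the n^n shuffle on ABC and counts the final orderings:

```cpp
#include <map>
#include <string>
#include <utility>  // std::swap

// Enumerate every execution path of the naive swap-with-anywhere
// shuffle on three cards: swap #k exchanges position k with a
// freely chosen position. Returns final-ordering -> path count.
std::map<std::string, int> naive_shuffle_counts()
{
    std::map<std::string, int> counts;
    for (int a = 0; a < 3; ++a)
        for (int b = 0; b < 3; ++b)
            for (int c = 0; c < 3; ++c) {
                std::string deck = "ABC";
                std::swap(deck[0], deck[a]);  // swap #1: position 1
                std::swap(deck[1], deck[b]);  // swap #2: position 2
                std::swap(deck[2], deck[c]);  // swap #3: position 3
                ++counts[deck];
            }
    return counts;  // ABC:4 ACB:5 BAC:5 BCA:5 CAB:4 CBA:4
}
```

The six counts sum to 27 and reproduce the 4/27 vs 5/27 split tabulated above.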
 

k_c

Well-Known Member
#30
London Colin said:
The example of a deck of just three cards makes it clear.

Say we have three cards labelled A, B and C; our deck starts in sorted order : ABC. After we shuffle it, there are 6 possible orderings which we want to be equiprobable : ABC ACB BAC BCA CAB CBA.

As stated in Wikipedia, your algorithm gives 27 possible execution paths. That's just about small enough to enumerate them all, in order to demonstrate the problem ...

I'll number the positions 1,2,3; so that the starting point is 1=A, 2=B, 3=C. And I'll use notation such as (2,3) = ACB to mean swap position 2 with position 3 to give the new order : ACB.

Thus the 27 possible outcomes are -
Code:
[B]Swap #1        Swap #2         Swap #3    Final Ordering[/B]
(1,1) = ABC    (2,1) = BAC     (3,1) =    CAB
                               (3,2) =    BCA
                               (3,3) =    BAC
               (2,2) = ABC     (3,1) =    CBA
                               (3,2) =    ACB
                               (3,3) =    ABC
               (2,3) = ACB     (3,1) =    BCA
                               (3,2) =    ABC
                               (3,3) =    ACB

(1,2) = BAC    (2,1) = ABC     (3,1) =    CBA
                               (3,2) =    ACB
                               (3,3) =    ABC
               (2,2) = BAC     (3,1) =    CAB
                               (3,2) =    BCA
                               (3,3) =    BAC
               (2,3) = BCA     (3,1) =    ACB
                               (3,2) =    BAC
                               (3,3) =    BCA

(1,3) = CBA    (2,1) = BCA     (3,1) =    ACB
                               (3,2) =    BAC
                               (3,3) =    BCA
               (2,2) = CBA     (3,1) =    ABC
                               (3,2) =    CAB
                               (3,3) =    CBA
               (2,3) = CAB     (3,1) =    BAC
                               (3,2) =    CBA
                               (3,3) =    CAB
As was stated, the fact that 6 doesn't divide into 27 tells us that there has to be a bias somewhere; now we can add up the numbers for each final permutation and see where that bias falls -

ABC: 4, ACB: 5, BAC: 5, BCA: 5, CAB: 4, CBA: 4

So, even with a perfect RNG, the algorithm makes three of the possible results occur with a probability of 4/27, and the other three 5/27.


The Knuth shuffle, on the other hand, has 6 execution paths, corresponding to the 6 possible results -
Code:
(3,1) = CBA    (2,1) = BCA
               (2,2) = CBA

(3,2) = ACB    (2,1) = CAB
               (2,2) = ACB

(3,3) = ABC    (2,1) = BAC
               (2,2) = ABC
You have me convinced. There are definitely 27 possibilities in your 3-card example, but only 6 equally likely orders are needed. It's impossible for 27 equally likely possibilities to map evenly onto 6 equally likely orders.

When I get time I am going to switch my algorithm in the programs that do shuffling. Thank you for pointing this out. :)

That said, I think it has been shown by others that even non-random shuffles, such as those encountered in brick-and-mortar casinos, yield virtually identical results to random shuffles in the long run, provided there is no intrinsic bias in the non-random shuffle.

My original point in this thread was to suggest the possibility of a shuffling bias to the original poster, given his statement that he restarted his sim 10,000 times. Both my algorithm and the Knuth algorithm would be susceptible to this bias if only the rand() function and modulus division are used each time the shoe is reordered, for the reasons I stated.
 

k_c

Well-Known Member
#31
QFIT said:
If you mean representative compositions, this gives inaccurate results.
Well, my programs can compute EVs for any composition, so all I want is a reasonably random shuffle so I can record valid data for a lot of data points.
 
#32
ExhibitCAA said:
Skimming over the thread, here are the various issues that popped out in my mind:
Hi mate... let's see then... :)

ExhibitCAA said:
1. It seems that you are allowing the player to hit multiple times and even double down on split Aces. This is NOT typical in modern blackjack. Most casinos restrict split Aces to receiving one and only one card on each. So, if a player split AA, resulting in A5 and AK, he would have a total of 16 on one hand and a total of 21 (NOT a natural blackjack) on the other. The player is stuck with the A5, and is not allowed to hit or double down on it. Since most published numbers on the edge for various sets of rules impose this restriction without explicitly mentioning it, your numbers would not match, and you might infer that you have a bug. When they do mention this rule, it is acronymized as SPA1 (SPlit Aces receive only 1 card each).
Thanks for the hint. As I mentioned before, I'm not an expert on the game, so this kind of information about how "typical" games are played is very important to me. I've modified the simulator, letting the user decide how to manage the splitting of Aces (treating it like any other kind of split, or ending the player's turn). I've also added a maximum number of splits allowed (4). I guess the likelihood of splitting to more than 4 hands is very low anyway. Of course it will happen in the long run, but its effect should be marginal at best.

ExhibitCAA said:
2. As Ken Smith pointed out, you must make sure that if the player splits Aces and receives a Ten, the resulting hand is only a regular 21, not a natural blackjack. The regular 21 will only push (tie) if the dealer draws a 21. Though you say you check for BJs first, that is not the issue. It is correct to check for dealer BJ and player BJ first; if there are none, then the code should not allow any subsequent payoffs of 3:2, NOR SHOULD IT allow a split-created 21 (AA ---> A5 and AK) to beat a dealer's hit-created 21. More on this bug can be found at (Dead link: http://www.beyondcounting.com/bb/viewtopic.php?t=150)

3. You should check for dealer BJ (as well as player BJ) before playing out any hand. If you fail to check for dealer BJ right away, you open up the door for a bug allowing the player's hit-created 21 to push the dealer BJ, which is wrong.
Yes, that's what I'm doing. I check for naturals (and pay them 3-to-2 where due) before the hand is played, so there's no risk that a 21 made with more than two cards would be paid 3-to-2. The only thing I'm not sure about (and was discussing with John) is which to check first. In the current version of the simulator I check first for a blackjack dealt to the player, then for one dealt to the dealer. I don't know if this might be an issue. If I'm not wrong, the order should not significantly affect the game (if the dealer has a natural and the player doesn't, the hand ends anyway; the same if the player has a natural and the dealer doesn't).

ExhibitCAA said:
4. Don't pay busted hands!
Lol... it would be funny to do that from time to time... I could pretend to be taking dealers' mistakes into account... :D
 

QFIT

Well-Known Member
#33
fabietto said:
In the current version of the simulator I check first for a blackjack dealt to the player, then for one dealt to the dealer. I don't know if this might be an issue. If I'm not wrong, the order should not significantly affect the game (if the dealer has a natural and the player doesn't, the hand ends anyway; the same if the player has a natural and the dealer doesn't).
It affects the cards consumed in a round, as more cards are used if the dealer does not peek. This can affect the outcome.
 
#34
QFIT said:
It affects the cards consumed in a round, as more cards are used if the dealer does not peek. This can affect the outcome.
Yes, of course. But this doesn't look to me like a systematic bias favouring the player or the dealer. Or is it?
 
#36
London Colin said:
Incidentally, the C++ standard library has a random_shuffle() function which can be used. By default it handles the random number generation internally.
That's the function I'm using in my software. According to the documentation (http://stdcxx.apache.org/doc/stdlibref/random-shuffle.html), the elements in the array are shuffled according to a uniform distribution. It seems to me a good and easy way to obtain a proper shuffle. It might be interesting to see whether calling the std::rand() function multiple times (i.e., at the beginning of each trial) makes any difference compared to calling it just once at the beginning of a simulation.
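One caveat worth noting: random_shuffle() implementations typically fall back on rand(), so by default they inherit rand()'s weaknesses over very long runs. A sketch of an alternative using the standard Mersenne Twister engine via C++11's std::shuffle (all names here are standard library, nothing invented):

```cpp
#include <algorithm>  // std::shuffle
#include <numeric>    // std::iota
#include <random>     // std::mt19937
#include <vector>

// Shuffle a 52-card deck with std::shuffle driven by a Mersenne
// Twister, a far longer-period generator than a typical rand().
std::vector<int> shuffled_deck(unsigned seed)
{
    std::vector<int> cards(52);
    std::iota(cards.begin(), cards.end(), 0);  // fill with 0..51
    std::mt19937 rng(seed);                    // explicit, reproducible seed
    std::shuffle(cards.begin(), cards.end(), rng);
    return cards;
}
```

Passing the seed explicitly also makes a simulation run reproducible, which is handy when chasing down bugs like the ones discussed above.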
 

iCountNTrack

Well-Known Member
#37
Some reflections

To further clarify what has been correctly mentioned here:

The expectation value calculated for a basic strategist/counter depends only on the deck/shoe composition and the house rules. The sequence of the cards does not really matter per se, because the EV of a particular composition is an average value over millions of sequences of that composition. However, for the results of a simulation to match those of a combinatorial analysis (for CA, as k_c mentions, only ONE shuffle/sequence is needed), millions of shuffles and an ideal RNG are a must.
To illustrate this point, the EV of a 6D DAS S17 game with optimum play (BS) using CA is -0.402%.

The following image shows two sims of the same game, except that the first uses the RNG of CVData to generate random shuffles and the second uses a very non-random shuffle in which two 156-card grabs are riffled once. Please note how close the EV (IBA) of the first sim (-0.398%) is to the value obtained from CA, and how far off the EV obtained from the second sim (-0.560%) is.

 

QFIT

Well-Known Member
#38
The neat thing about combinatorial analysis is that the answer is exact, assuming a random shuffle and a fixed number of rounds. The problem is that CA must assume a random shuffle and a fixed number of rounds. Few games have a fixed number of rounds, and as we all know, EV is worse with a cut-card.
 

sagefr0g

Well-Known Member
#39
iCountNTrack said:
To further clarify what has been correctly mentioned here:

The expectation value calculated for a basic strategist/counter depends only on the deck/shoe composition and the house rules. The sequence of the cards does not really matter per se, because the EV of a particular composition is an average value over millions of sequences of that composition. However, for the results of a simulation to match those of a combinatorial analysis (for CA, as k_c mentions, only ONE shuffle/sequence is needed), millions of shuffles and an ideal RNG are a must.
To illustrate this point, the EV of a 6D DAS S17 game with optimum play (BS) using CA is -0.402%.

The following image shows two sims of the same game, except that the first uses the RNG of CVData to generate random shuffles and the second uses a very non-random shuffle in which two 156-card grabs are riffled once. Please note how close the EV (IBA) of the first sim (-0.398%) is to the value obtained from CA, and how far off the EV obtained from the second sim (-0.560%) is.

ok icnt, i see the difference. :)

but what's the why of it?
or at least the how of it?
i mean what's the source and reason for these differences, errhh ehhmm, i mean the mechanics of what's going on. :confused::whip:

just i would imagine that the mechanics of that 156 card grab and riffle is unrealistic for the long run sort of thing.:rolleyes::whip: but it could happen.

probably what i'm asking is what's it all mean, or i'd just like to hear some rationale with respect to what you laid out there.

part of it is i guess you're showing maybe a random shuffle is unbiased while a non-random, repeated-the-same-way shuffle could be biased over the long run? the implication being the closer to random the better for sims?:confused::whip:
 

QFIT

Well-Known Member
#40
Short run and long run are the same thing as far as EV is concerned.

It simply means that a very poor shuffle has a substantial effect. Not a problem since CVData can sim shuffles.
 