04.06.06

Random Diversions on Shuffling Cards

Posted in Programming and People at 1:45 pm by Brooks

I’ve been thinking a lot about the idea of practicing programming lately, after reading Steve Yegge’s “rant” on the subject. His claim is that programming is a lot like playing guitar – just doing a lot of it doesn’t make you any better; you have to specifically do things that will improve them.

Dave Thomas suggests in his Code Kata post that one way of practicing programming is by solving small-but-complex algorithmic problems, and he has a list of suggestions.

In the shower this morning, I was thinking about a problem that would fit nicely in that list. I’m writing a complicated bit of code that takes the triangles in a triangular mesh and loads them into a data structure. It would be nice to test it by taking a given mesh and loading the triangles in various random orders, and that immediately leads to the classic card-shuffling problem: given a list of objects, how does one efficiently sort them into a random order?

With cards, this is relatively easy at first glance: Put the cards in a stack (i.e., a linked list). Pick a random card from the current stack, and use it to start a new stack. Pick another random card, and put it on the stack. Repeat until the original stack is empty.

In practice, this doesn’t work so well: people aren’t very good at picking really random cards. Or anything else random; this is why random play doesn’t win rock-scissors-paper contests. But that’s a digression; we’ll assume that we have a “good enough” random number generator to use, just like a computer would.

Having a random number generator doesn’t solve the problem, though. It picks numbers, not cards. There’s still the problem of mapping the number to the card.

The trivial solution is to number the cards in the stack from 1 to 52 (thereby making it an indexed array), get a random number from 1 to 52, and pull out the corresponding card. That works nicely for the first card, but then we’ve got a stack of cards that are numbered from 1 to 52, but one of them’s missing. Now what?

We could renumber the cards after the one we pulled out, reducing each one’s number by one so that the deck is numbered from 1 to 51, but that’s still a lot of work – for the whole deck, it’s O(n2).

(Now that I’m writing this out, I see an obvious solution here that I didn’t see earlier. I’ll get to that later.)

Alternately, we could give up on renumbering the cards, and just leave them in a linked list. Then, when the random number generator gives us a number, we count down that many cards from the top of the deck. Again, for the whole deck, we end up counting O(n2) cards.

So, consider an alternate approach: Take the top card off the stack, and put it in a random place in the new stack. This looks like the same problem – how is locating a random place easier than locating a random card – but there’s a twist. Because we’re doing this in a computer, we can keep the cards in two orders at once. We can store them in one order that makes them easy to find, and simultaneously in another order that’s randomized.

First, consider the new deck as a linked list; we pick a card randomly, and insert the new card after it. On a pointer level, this means that the new card points to what the old card had pointed to, and the old card now points at the new card. The trick is that, when picking a random card, we don’t have any reason to care what order they’re in in the linked list. So, if we’re actually storing the cards in an indexed array, and putting each new card on the end of the array, then all we need to do is pick a random index and go straight to that card. Once all the cards are in the new deck, then we can traverse the linked list and pull them out in the shuffled order.

There’s a bit of nice optimization, too; we could pull cards off the bottom of the old deck just as easily as off the top, and since we’re putting them on the top of the new deck (insofar as actual storage location), there’s no need to actually move the cards at all; everything below the current index is the new deck, and above it is the old deck. If the cards are just numbers, there’s no need to even store the cards at all; we just store the pointers.

So, how does that actually get implemented? Start at the beginning of an indexed array of pointers (which are really just numbers). Increment the index. Pick a random bin below the current index. Take the number from the random bin and put it in the bin corresponding to the current index. Put the current index as a pointer in the random bin. Repeat until done.

So, back to that parenthetical note from earlier. There’s really no need to renumber all the cards, because they don’t have to stay in the same order. We could just take last card on the deck, and renumber it to fit in the gap, and be done. So, we take a random card out of the deck, put it at the first spot of the new deck, and then move the last card from the deck into the hole that got left. Or, to do this in the same space, we could just pick a card, set it aside, take the first card from the deck and set it in the hole, and put the picked card in the now-empty first spot. Now, the “new deck” is the first card, and the “old deck” is the rest, and so forth as we go along.

How does that get implemented?

Start at the beginning of an indexed array of card numbers (which are essentially pointers). Increment the index. Pick a random bin above the current index. Take the number from the random bin and put it in the bin corresponding to the current index (i.e., the end of the new deck). Take the current index (i.e., the card from the end of the old deck) and put it in the random bin.

An interesting similarity, isn’t it?

Leave a Comment