CS W3139-02 Data Structures and Algorithms -- Spring 1999

Recitation 10 - Radix Sort

Radix Sort

Radix Sort

The Radix Sort algorithm sorts a series of keys by sorting on the corresponding digits of the keys, beginning with the least significant digit up to the most significant digit.
We think of a key as an n-digit number (i.e. 365 is a 3-digit number).
In the past, Radix Sort was used on electro-mechanical, punched-card sorting equipment.
To understand the basic idea of how Radix Sort works, lets examine how this used to be done.
Suppose we have a set of 3-digit numerical keys. We will assume we are sorting a deck of cards, each card having 3-digit key on it.
The deck of cards is placed into a card-sorter machine, and cards to be sorted are fed from the deck.
We can set the sorting machine to sort either the first digit, the middle digit, or the last digit, of the 3-digit key.
Suppose we have a card that is currently being read by the sorting machine.
If the digit currently selected in the card is 0, then the card is placed in a bin containing a pile of all the cards having 0's digits. This is called the 0's pile.
Similarly, if the card's currently selected digit, d, is 1 through 9, it is placed in the respective bin numbered d, which contains the respective digit-d-pile.
When the entire deck is processed, the piles of cards in the digit bins are stacked to re-constitute a new deck, with the 9's pile on the bottom, the 8's pile next to the bottom, and so on, until the 0's pile is placed on the top of the deck.
This deck is then fed through the sorting machine another time, with a the digit of greater significance selected for sorting.
We can sort the entire deck in this manner, making three passes.
Lets look at an example of Radix Sort where we use 3-digit numbers for keys.

So Radix Sort is going to make 3 passes through this list. One pass for each of the 3 digits.
On the first pass through the list, the third (or least significant) digit is selected for sorting.
Taking keys one at a time from the list, you can see that the keys will end up either in the 6's pile or the 3's pile, since the keys end in either 3 or 6.
```
     223       
     323         516
     413         416
     723         626
     813         616
  3's pile    6's pile
```
After the entire deck is processed, the keys in the 3's pile are placed on top of the keys in the 6's pile, to give the new list of numbers shown below.
```
     223
     323
     413
     723
     813
     516
     416
     626
     616
```

On the second pass, the middle digit is selected, and the keys are sorted so that the keys with a middle 1's digit are on top of those with a middle 2's digit.

                                  413      
			          813
			          516
			          416
     413		          616
     813         223	          223
     516	 323	          323
     416	 723	          723
     616	 626	          626
  1's pile    2's pile     new resulting list
                             after 2nd pass

On the third and final pass, the first digit (most significant) is selected for sorting, and the keys are sorted in the order of their first digit. The result from this last pass is shown below, after the separate digit piles are reassembled into the final sorted list.
```
                 223
                 323
		 413
		 416
		 516
		 616
		 626
		 723
		 813
         resulting list after 
         final pass on first 
       (most significant) digit
```
You can see that after sorting on the most significant digit, the resulting list is in sorted order.
Radix Sort is essentially a O(n) sorting process because is makes exactly k linear passes through the list on n keys when the keys have k digits.
In Homework 4 you are required to implement the Radix Sort algorithm to sort various sets of integers. Lets discuss how to go about doing this.
The first thing we note is that the keys we are sorting are integers.
In Java, 32-bit integers range from -2147483648 to 2147483647. The assignment tells us we can assume nothing about the size of the integers (except that they are positive, why?), so we will have to implement Radix Sort with 10-digit keys because that is the maximum number of digits an integer will have.
So for this implementation of Radix Sort we will have to make ten passes for each of the 10-digits in the key (integer), and once again we will have 10 piles (or buckets) for each digit 0-9.
To represent the 10 piles, we can use an array of 10 queues. This way each array index will be a queue that represents a pile (or bucket) for one of the 10 digits (0 through 9). Feel free to utilize the queue implementation code from the Weiss book. NOTE: We would like to utilize a regular queue, not a priority queue. Why?
When you first generate the N integers you will be storing them in an array. Each number will initially be sorted into its proper queue (i.e. pile) on its least significant key digit. This corresponds to the initial pass in Radix Sort.
After this initial pass, you will have an array of 10 queues with integers sorted by the least significant digit. So for example, all integers that end in 0 will be in the queue at index 0. All integers that end in 7 with be in the queue at index 7.
It is possible to do a radix sort with only two such arrays of queues. This is done by moving the keys (the integers) from one array of queues to the other array of queues for each digit pass. Note: the source queues should to be re-initialized so that they can be reused as the destination queues on the next pass.
We will now make a second pass on the next least significant digit (the second digit). We must start with the pile of integers in the queue at location 0 in the array in order to maintain the proper ordering. We will move through each pile of integers, removing them from the top of the queue until the queue is empty and then moving to the queue at the next array index. Each integer should be sorted into our destination array of queues on the value of the second digit. (i.e. after this pass the integer 4393821 will be in the queue at array location 2).
Notice that by moving through the array of queues in this order, we are essentially making passes through the list of integers as if they had been reassembled as in the example above.
We will make a total of 10 passes, the 10th pass being the final pass on the most significant digit.
Say we wanted to write the sorted list to a file. Then after this pass we can move through are queues, once again beginning in the queue at index 0, and write the integers out to a file because they will be in sorted order.
Now consider the fact that we will undoubtedly be sorting integers of different size. In other words, not all of our keys will be ten digits in length.
Since we are sorting integers this is not a problem. Conceptually, we can think of the solution as padding the integers of length less than 10 with leading zeros. However, as we will see, it won't be necessary to actually alter the integers in any way.
Say the numbers 256 and 1024 are being sorted on the 4th digit. Using our padding analogy, the 4th digit of 256 is 0 and the 4th digit of 1024 is 1. So clearly 256 will be sorted as being less than 1024.

Here is an outline of our Radix Sort algorithm:

// read N integers from an array (initial pass)
for i=0 to N-1
   calculate index i for integer corresponding to least significant digit
   // this array of queues will be the original source array of queues 
   insert integer into queue at array index i
                                        
// make a pass for each digit d starting from 2nd least significant 
for each digit d in integer (2nd digit to 10th)
   // radix sort function 
   for each index i in the array (go from 0 to 9)
       while source queue at index i is not empty
          remove integer from queue
          calculate index i of integer key (based on the value of digit d)
          insert into destination array of queues at index i
       // when source queue is empty, initialize it so it can 
                     // reused as destination queue next time 
       initialize source queue at index i 
    switch destination and source array of queues for next pass

In the algorithm above, we will need to calculate the proper index for a integer before inserting it into the array of queues.
The index will be based on the value of the digit d (i.e. assuming we start counting from d=0, if d=2, the array index (i.e. pile) of digit d for the integer 3156 is 1). Remember that each queue represents a pile (or bucket).
Assuming we pass the integer being sorted as the key parameter and the digit d that we are sorting on (NOTE: the method assumes d ranges from 0 to 9), we can calculate the proper index of an number with the following method:
```
// return the appropriate digit d
public static int getDigit(int key, int d)
{
  return ( ( key / (int)Math.pow(10, d) ) % 10 );
}
```
This function divides the key by the appropriate power of 10 (10 raised to the dth power) and then mods then result by 10 to obtain the dth digit's value.
Now we can see why we don't need to be concerned with the fact that not all integers are 10 digits long. Any time we attempt to compute a digit greater than the number of digits in the key (the integer), our method will return 0.
For example, suppose we attempt to compute the 4th digit of the number 256. We call getDigit(256, 3) and the function returns 0 as we would like (i.e. ((256/10^3) % 10) = 0).

[CS3139 Home]

CS W3139-02 Data Structures and Algorithms -- Spring 1999

Recitation 10 - Radix Sort

Contents

Radix Sort