- The Radix Sort algorithm sorts a series of keys by sorting on the corresponding digits of the keys, beginning with the least significant digit up to the most significant digit.
- We think of a key as an n-digit number (i.e. 365 is a 3-digit number).
- In the past, Radix Sort was used on electro-mechanical, punched-card sorting equipment.
- To understand the basic idea of how Radix Sort works, lets examine how this used to be done.
- Suppose we have a set of 3-digit numerical keys. We will assume we are sorting a deck of cards, each card having 3-digit key on it.
- The deck of cards is placed into a card-sorter machine, and cards to be sorted are fed from the deck.
- We can set the sorting machine to sort either the first digit, the middle digit, or the last digit, of the 3-digit key.
- Suppose we have a card that is currently being read by the sorting machine.
- If the digit currently selected in the card is 0, then the card is placed in a bin containing a pile of all the cards having 0's digits. This is called the 0's pile.
- Similarly, if the card's currently selected digit, d, is 1 through 9, it is placed in the respective bin numbered d, which contains the respective digit-d-pile.
- When the entire deck is processed, the piles of cards in the digit bins are stacked to re-constitute a new deck, with the 9's pile on the bottom, the 8's pile next to the bottom, and so on, until the 0's pile is placed on the top of the deck.
- This deck is then fed through the sorting machine another time, with a the digit of greater significance selected for sorting.
- We can sort the entire deck in this manner, making three passes.
- Lets look at an example of Radix Sort where we use 3-digit numbers for keys.
- Here is our initial unsorted list:
516
223
323
413
416
723
813
626
616
- So Radix Sort is going to make 3 passes through this list. One pass for each of the 3 digits.
- On the first pass through the list, the third (or least significant) digit is selected for sorting.
- Taking keys one at a time from the list, you can see that the keys will end up either in the 6's pile or the 3's pile, since the keys end in either 3 or 6.
223
323 516
413 416
723 626
813 616
3's pile 6's pile
- After the entire deck is processed, the keys in the 3's pile are placed on top of the keys in the 6's pile, to give the new list of numbers shown below.
223
323
413
723
813
516
416
626
616
- On the second pass, the middle digit is selected, and the keys are sorted so that the keys with a middle 1's digit are on top of those with a middle 2's digit.
413
813
516
416
413 616
813 223 223
516 323 323
416 723 723
616 626 626
1's pile 2's pile new resulting list
after 2nd pass
- On the third and final pass, the first digit (most significant) is selected for sorting, and the keys are sorted in the order of their first digit. The result from this last pass is shown below, after the separate digit piles are reassembled into the final sorted list.
223
323
413
416
516
616
626
723
813
resulting list after
final pass on first
(most significant) digit
- You can see that after sorting on the most significant digit, the resulting list is in sorted order.
- Radix Sort is essentially a O(n) sorting process because is makes exactly k linear passes through the list on n keys when the keys have k digits.
- In Homework 4 you are required to implement the Radix Sort algorithm to
sort various sets of integers. Lets discuss how to go about doing this.
- The first thing we note is that the keys we are sorting are integers.
- In Java, 32-bit
integers range from -2147483648 to 2147483647. The assignment tells us we can
assume nothing about the size of the integers (except that they are positive,
why?), so we will have to implement Radix Sort with 10-digit
keys because that is the maximum number of digits an integer will have.
- So for this implementation of Radix Sort we will have to make ten passes for
each of the 10-digits in the key (integer), and once again we will have 10
piles (or buckets) for each digit 0-9.
- To represent the 10 piles, we can use an array of 10 queues. This way each
array index will be a queue that represents a pile (or bucket) for one of the
10 digits (0 through 9). Feel free to utilize the queue implementation code
from the Weiss book.
NOTE: We would like to utilize a regular queue, not a
priority queue. Why?
- When you first generate the N integers you will be storing them in an array.
Each number will initially be sorted into its proper queue (i.e. pile)
on its least
significant key digit. This corresponds to the initial pass in Radix Sort.
- After this initial pass, you will have an array of 10 queues with integers
sorted by the least significant digit. So for example, all integers that end in
0 will be in the queue at index 0. All integers that end in 7 with be in the
queue at index 7.
- It is possible to do a radix sort with only two such arrays of queues. This
is done by moving the keys (the integers) from one array of queues to the other
array of queues for each digit pass. Note: the source queues should to be
re-initialized so that they can be reused as the destination queues on the next
pass.
- We will now make a second pass on the next least significant digit (the
second digit). We must start with the pile of integers in the queue at location
0 in the array in order to maintain the proper ordering. We will move through
each pile of integers, removing them from the top of the queue until the queue
is empty and then moving to the queue at the next array index. Each integer
should be sorted into our destination array of queues on the value of the second
digit. (i.e. after this pass the integer 4393821 will be in the queue at array
location 2).
- Notice that by moving through the array of queues in this order, we are
essentially making passes through the list of integers as if they had been
reassembled as in the example above.
- We will make a total of 10 passes, the 10th pass being the final pass on the
most significant digit.
- Say we wanted to write the sorted list to a file. Then after this pass we
can move through are queues, once
again beginning in the queue at index 0, and write the integers out
to a file because they will be in sorted order.
- Now consider the fact that we will undoubtedly be sorting integers of
different size. In other words, not all of our keys will be ten digits in length.
- Since we are sorting integers this is not a problem. Conceptually, we
can think of the solution as padding the integers of length less than
10 with leading zeros. However, as we will see, it won't be necessary to actually
alter the integers in any way.
- Say the numbers 256 and 1024 are being sorted on the 4th digit. Using our
padding analogy, the 4th digit of 256 is 0 and the 4th digit of 1024 is 1. So
clearly 256 will be sorted as being less than 1024.
- Here is an outline of our Radix Sort algorithm:
// read N integers from an array (initial pass)
for i=0 to N-1
calculate index i for integer corresponding to least significant digit
// this array of queues will be the original source array of queues
insert integer into queue at array index i
// make a pass for each digit d starting from 2nd least significant
for each digit d in integer (2nd digit to 10th)
// radix sort function
for each index i in the array (go from 0 to 9)
while source queue at index i is not empty
remove integer from queue
calculate index i of integer key (based on the value of digit d)
insert into destination array of queues at index i
// when source queue is empty, initialize it so it can
// reused as destination queue next time
initialize source queue at index i
switch destination and source array of queues for next pass
- In the algorithm above, we will need to calculate the proper index for a
integer before inserting it into the array of queues.
- The index will be based
on the value of the digit d (i.e. assuming we start counting from d=0,
if d=2, the array index
(i.e. pile) of digit d for the integer 3156 is 1). Remember that each
queue represents a pile (or bucket).
- Assuming we pass the integer being sorted as the
key
parameter
and the digit d
that we are sorting on (NOTE: the method assumes
d
ranges from 0 to 9), we can calculate the proper index of an number
with the following method:
// return the appropriate digit d
public static int getDigit(int key, int d)
{
return ( ( key / (int)Math.pow(10, d) ) % 10 );
}
- This function divides the key by the appropriate power of 10 (10 raised to
the dth power) and then mods then result by 10 to obtain the
dth digit's value.
- Now we can see why we don't need to be concerned with the fact that not
all integers are 10 digits long. Any time we attempt to compute a digit greater
than the number of digits in the key (the integer), our method will return 0.
- For example, suppose we attempt to compute the 4th digit of the number
256. We call
getDigit(256, 3)
and the function returns 0 as we
would like (i.e. ((256/10^3) % 10) = 0).