In this class, if you aren't writing code you are playing cards. That's because most of these algorithms, even the complex ones, can be seen at work, half-consciously, in everyday life. Computers are not the only ones who would like to finish their work early!
Consider sorting half a deck of cards: the 26 cards of the red diamond suit and the black club suit. Let's order them as follows: any red card comes before any black card, the ace counts as one, and the picture cards Jack, Queen, King count as 11, 12, 13, as usual.
The way I would sort these cards is to first make two piles, one of red and one of black; sort each pile individually; and finish by placing the pile of sorted red cards on top of the pile of sorted black cards. Let's look mathematically at what has happened.
Sorting all 26 cards with an n²/2 sort would cost 26²/2 = 338 units of activity. The procedure above instead costs about 196 units: a 26-unit pass to split the deck, plus two sorts of 13 cards at roughly 13²/2 ≈ 85 units each. That is a significant reduction in work. The reason is that a square function is convex, so cutting the size of the problem in half reduces the amount of work by strictly more than half: it takes less effort to solve two problems of half the size than to solve the whole problem at once. But are two separate small solutions good enough? In our case they were, because first we had a cheap method to separate the deck into two subproblems, and second we had a cheap method to join the two partial solutions into a solution of the original problem.
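Here is a minimal sketch of that procedure in code; the card representation and the helper name insertion_sort are my own, just for illustration.

    import random

    def insertion_sort(cards):
        # A simple quadratic sort: roughly n²/2 comparisons on n cards.
        cards = list(cards)
        for i in range(1, len(cards)):
            j = i
            while j > 0 and cards[j - 1][1] > cards[j][1]:
                cards[j - 1], cards[j] = cards[j], cards[j - 1]
                j -= 1
        return cards

    def sort_half_deck(cards):
        # One linear pass (26 units) splits the deck into the two piles.
        reds = [c for c in cards if c[0] == "red"]
        blacks = [c for c in cards if c[0] == "black"]
        # Sort each pile of 13 (about 85 units each), then stack red on
        # black: the join costs essentially nothing.
        return insertion_sort(reds) + insertion_sort(blacks)

    deck = [("red", r) for r in range(1, 14)] + [("black", r) for r in range(1, 14)]
    random.shuffle(deck)
    print(sort_half_deck(deck))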
[Figure: the deck at each stage: Shuffle, Split, Sort red, Sort black, Finished]
The principle is: two problems of half size are, even in total, easier than the one original problem; and the two half solutions can be joined into a full solution quickly enough that the savings are not all spent away.
If splitting once is good, then perhaps splitting again, and again, is better. In the recursive n log n sorts, each subproblem is split into further subproblems until a subproblem is small enough to be sorted with very little effort, a constant amount of effort. Often the splitting continues until there is only one card to sort.
So how quick is quick when it comes to the job of separating a problem into subproblems and joining the subproblems back into a full solution? Work proportional to the size of the problem will do. In our example, the separation was achieved by passing through the n cards and performing a constant amount of work on each card, for total work proportional to n. The job of bringing the piles back together was almost no work at all. In mergesort, joining the subproblems will cost work proportional to n, where n is the number of cards. The question then becomes: which algorithms benefit from being divided in this way?
Quicksort and mergesort are sorts that work according to this cheaper-by-halves principle. In quicksort, the dataset is split into small and large values, and each part is sorted separately; the whole is then sorted by simply concatenating the two sorted parts. In our card example we set the reds (small) apart from the blacks (large). In quicksort, the place to split is not as clear: a pivot item is selected at random from the set of values to sort, and its value becomes the boundary between the values considered large and those considered small. The split is then often not exactly in the middle, since the pivot need not be the median value of the set.
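A minimal quicksort sketch along those lines; plain integers stand in for the cards, and the names are my own.

    import random

    def quicksort(values):
        if len(values) <= 1:
            return values  # small enough to be sorted with no effort
        pivot = random.choice(values)
        # One linear pass splits the set around the pivot's value.
        small = [v for v in values if v < pivot]
        equal = [v for v in values if v == pivot]
        large = [v for v in values if v > pivot]
        # The join is mere concatenation, like stacking red on black.
        return quicksort(small) + equal + quicksort(large)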
Mergesort operates by splitting the deck into two halves of equal size, sorting each half, and then merging the two sorted halves: repeatedly compare the top cards of the two piles and take the smaller of the two.
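A matching mergesort sketch, again with names of my own choosing; note that the merge is the linear-cost join described above.

    def mergesort(values):
        if len(values) <= 1:
            return values  # a single card is already sorted
        mid = len(values) // 2
        left = mergesort(values[:mid])
        right = mergesort(values[mid:])
        # Merge: one pass, repeatedly taking the smaller front item,
        # so the join costs work proportional to n.
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]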
Added August, 2009.
What sort of algorithm benefits from dividing the problem into several smaller fragments? It must be one that gains more by solving the two halves than it loses to the business of dividing into and recombining them.
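In the simplest case, take the n²/2 method from the card example; the comparison, sketched here under that same cost model, runs:

    whole problem:       n²/2
    two half problems:   2 · (n/2)²/2 = n²/4

Solving the halves saves n²/4 units, so any divide-and-recombine overhead below n²/4 (and linear work certainly is) leaves a net gain.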
How about n log n? We know, more or less, that n log n is very good for a sorting algorithm: we are presented with such algorithms, and nobody goes on to do better. What would dividing into smaller problems do for a program whose work grows as n log n?
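A quick check suggests the answer; this is a sketch, taking logarithms base 2:

    two half problems:   2 · (n/2) · log(n/2) = n · (log n - 1) = n log n - n

Splitting saves only about n units, linear work, which is the same order as the cost of dividing and recombining; the savings are spent as fast as they are earned, so halving buys essentially nothing for an n log n method.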