Saturday, October 1, 2016

Big data -- sorting this and that

Running a database project? Want to be agile and try different things quickly? Here's something you can use, and it takes only 6 minutes to learn a few things.

It's all about sorting "big data", to wit: millions of records. We've all sorted data on a spreadsheet -- a few thousand records at most -- but that happens in the blink of an eye. Big data takes a bit longer.

Here's a graphic video demonstration of 15 different sorting algorithms, shown one at a time (and set to music). The name of each method appears in small type at the upper left of the video screen.
The video sorts random shuffles of integers, with both the speed and the number of items adapted to each algorithm's complexity.
The algorithms are:
  • selection sort, insertion sort, quick sort,
  • merge sort, heap sort, radix sort (LSD),
  • radix sort (MSD), std::sort (intro sort), std::stable_sort (adaptive merge sort),
  • shell sort, bubble sort, cocktail shaker sort,
  • gnome sort, bitonic sort and
  • bogo sort (30 seconds of it).
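
If you'd like to see the difference in numbers rather than pictures, here is a minimal Python sketch (mine, not taken from the video) that counts the comparisons made by two of the listed methods, insertion sort and merge sort, on the same random shuffle of integers. The function names, the seed, and the item count are all assumptions chosen for illustration.

```python
import random


def insertion_sort(items):
    """Sort a copy of items; return (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        # Shift larger items one slot right until key's position is found.
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key
    return data, comparisons


def merge_sort(items):
    """Sort a copy of items; return (sorted_list, comparison_count)."""
    data = list(items)
    if len(data) <= 1:
        return data, 0
    mid = len(data) // 2
    left, left_count = merge_sort(data[:mid])
    right, right_count = merge_sort(data[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    # Repeatedly take the smaller head item from the two sorted halves.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


def compare_sorts(n, seed=2016):
    """Shuffle the integers 0..n-1 and run both sorts on the same shuffle."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    values = list(range(n))
    rng.shuffle(values)
    ins_sorted, ins_count = insertion_sort(values)
    mrg_sorted, mrg_count = merge_sort(values)
    return ins_sorted, ins_count, mrg_sorted, mrg_count


if __name__ == "__main__":
    _, ins_count, _, mrg_count = compare_sorts(2000)
    print("insertion sort comparisons:", ins_count)
    print("merge sort comparisons:", mrg_count)
```

On a random shuffle, insertion sort's comparison count grows roughly with the square of the number of items, while merge sort's grows roughly with n log n, so the gap widens quickly as the data gets bigger.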

It's fun to watch -- but there's information here also. Take note of how fast one method works compared to another, and note the intermediate data patterns that emerge.

It will be obvious when each method completes: a big green triangle appears. You have to see it to know what I'm talking about.

Anyway, it's a fun 6 minutes:
