=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Andrew Tridgell CSLab, Research School of Physical Sciences Andrew.Tridgell@anu.edu.au Australian National University (x3064) =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- This is a preliminary release of a parallel sorting algorithm written by Andrew Tridgell. A technical report discussing the algorithm has also been released and is in postscript form in this directory. Basic features: ============== - comparison based (user supplied comparison function) - requires O(sqrt(N/P)) storage - high parallel efficiency Contact: Andrew.Tridgell@anu.edu.au Usage: ===== The file main.c provides an example for using this code for sorting of integers and strings. In the Makefiles note that the define STATS controls the output of timing statistics on each stage of the sorting process. Also note that the define INTEGER controls whether the code should be inlined to be specific to integer sorting. Porting: ======= It should be a simple matter to port this code to machines that support a general message passing model. To do this you should only need to change the Makefile and the header file mimd.h. If you do port the code to any other system or make any significant improvements to the code then please notify the author. The code is ANSI C so you can't use a plain K&R C compiler like cc on SunOS. Use acc or gcc. Speedups: ======== In the version as released here the gnu_qsort replacement for qsort is not used. Neither is the amemmove sparc assembly memcpy() used. This has been done to make the code both portable. These two changes make a considerable difference to the performance of the code so if you want to re-instate them you can. Defines: ======= There are a number of #defines and #ifdefs in the code which control the way sorting is done. In particular: In Makefile.*: -DINTEGER=1 will turn on the inline integer code. This will mean you can only sort integers. -DSTATS=1 will make the code record timing information about each phase of the algorithm and print it when finished. In main.c: #define SAVE_SORT 1 will cause the main program to dump a before and after copy of the elements to disk in files "before" and "after" #define SORT_TEST 1 will cause the sorting to be verified by sorting on one node - if it has the memory! #define BALANCED 1 will cause each node to initially have the same number of elements. #define CHECKSUM 1 will enable a primative checksum to test the sort is behaving correctly. In particular it detects "lost" elements from bounds errors. #define STRING 1 will make the code sort strings instead of integers. Change the string length in the line "typedef string char[16]". In par_sort.c: #define GUARANTEE_SORT 1 will turn on a paranoid final sorting phase. This is not needed if the sort is working. #define HYPER_SORT 1 turns on the primary merge phase. This is not strictly needed but tends to make the sort much faster. #define UNBALANCED_MERGE 1 will turn on the special case unbalanced merge. This gives a performance boost. #define COMPLETE_BALANCE 1 will turn on the complete_balance phase that simplifies the merging procedure. This is not a complete list. The others should probably be left alone unless you want to break the algorithm. Environment Variables: ===================== There are three environment variables that control the sorting. Two are specific to the AP1000. NOKILL: On the AP1000 if the environment variable NOKILL is not set to anything then the program will kill itself after 5 minutes. This is useful as the AP1000 is a single user machine. NCELLS: On the AP1000 this will control the number of cells to be used in the program. There is no equivalent control on the CM5. SEED: sets the seed for random number generation. This is useful for reproducing runs. If unset it will use a random seed from the pid and time.