CPSC 330
Fall 2007
Project 3

Due by exam time (1:00 pm) on Tuesday, December 11

Write a short (three to five page) report on the design and use of the
Sun T1000 (Niagara). Include actual benchmark runs.

On-line resources:

  Sun Fire T1000 Server
  http://www.sun.com/servers/coolthreads/t1000/
  look through the tabs:
    * Overview
    * Features
    * Tech Specs
    * Perspectives (link to CoolThreads Interactive Tour video)

  Sun T1000/T2000 Architecture White Paper
  http://www.sun.com/servers/coolthreads/t1000-2000-architecture-wp.pdf

  Improving Application Efficiency Through Chip Multi-Threading
  http://developers.sun.com/solaris/articles/chip_multi_thread.html

  Darryl Gove, Coding for Multiple Threads on a CMT System
  http://www.cs.clemson.edu/~mark/330/sun_gove_slides.pdf

For example, run the array program below on niagara.cs.clemson.edu and
see if you get the same type of throughput increase as graphed in Gove's
slides on page 15 of the pdf.

(For best results, make sure you are the only user on the system when
you run your benchmarks. If anyone else is on the system, wait and try
again after a few minutes.)

(Be sure to include any sources you use in a bibliography. Include the
URL for any graphic that you get from a paper or the web and use in your
paper; put the URL in a caption underneath the graphic.)

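/* example multithreaded program using Solaris threads (see "man threads")
 *
 * compile this using 'gcc -lthread'
 */

#include <stdio.h>      /* printf */
#include <stdlib.h>     /* atoi, exit */
#include <thread.h>     /* thr_create, thr_join, thread_t */
#include <sys/time.h>   /* gethrtime, hrtime_t */

#define N 10000000      /* number of array elements */
#define T 100           /* maximum number of threads */

int a[N];                       /* global, so zero-initialized */
int partition_length;           /* elements summed by each thread */
int partial_sum_results[T];     /* one slot per thread, no locking needed */

/* each thread sums its own contiguous partition of a[] */
void *thread_code(void *thread_id){
  int i,tid,local_sum;

  tid = *((int *) thread_id);
  local_sum = 0;
  for( i = tid*partition_length; i < (tid+1)*partition_length; i++ ){
    local_sum = local_sum + a[i];
  }
  partial_sum_results[tid] = local_sum;
  return NULL;
}

int main( int argc, char * argv[] ){
  thread_t t[T];
  int tid[T];
  int i,n;
  int final_sum;
  hrtime_t t_start,t_end;

  if( argc < 2 ){
    printf("usage: %s <n> where 1 <= n <= %d\n", argv[0], T);
    exit(1);
  }
  n = atoi( argv[1] );
  if( ( n < 1 ) || ( n > T ) ){
    printf("usage: %s <n> where 1 <= n <= %d\n", argv[0], T);
    exit(1);
  }

  partition_length = N/n;
  if( ( N - (n*partition_length) ) != 0 ){
    /* the run continues, but the last N % n elements are not summed */
    printf("number of threads doesn't divide evenly\n");
  }else{
    printf("number of threads is %d, partition length is %d\n",
      n,partition_length);
  }

  t_start = gethrtime();

  /* tid[i] must outlive the thread, so each thread gets its own slot */
  for( i = 0; i < n; i++ ){
    tid[i] = i;
    thr_create(NULL, 0, thread_code, (void *) &tid[i], (long) 0, &t[i]);
  }

  /* a thread id of 0 waits for any thread; loop until none remain */
  while( thr_join(0,NULL,NULL) == 0 );

  final_sum = 0;
  for( i = 0; i < n; i++ ){
    final_sum = final_sum + partial_sum_results[i];
  }

  t_end = gethrtime();

  /* gethrtime() reports nanoseconds */
  printf("program with %d threads runs %7.2f secs, sum is %d\n",n,
    (t_end-t_start)/1000000000.0,final_sum);
  return 0;
}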
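
The program above depends on Solaris threads and gethrtime(), so it only builds on a Solaris system such as niagara. To try the same partitioning idea on another machine first, the following is a minimal sketch that swaps in POSIX threads (pthreads) for the Solaris thread calls. It is not part of the original handout: the names sum_task, sum_slice, parallel_sum and MAX_THREADS are hypothetical, and unlike the program above it gives the leftover elements to the first few threads when the thread count does not divide the array length evenly.

```c
/* Sketch only: partitioned array sum using POSIX threads instead of
 * Solaris threads. All names below are hypothetical, chosen for
 * illustration; they do not come from the course handout. */
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 100

struct sum_task {
    const int *data;    /* start of this thread's slice */
    size_t     count;   /* number of elements in the slice */
    long long  result;  /* partial sum written by the thread */
};

/* Thread body: sum one contiguous slice into task->result. */
static void *sum_slice(void *arg)
{
    struct sum_task *task = arg;
    long long local = 0;

    for (size_t i = 0; i < task->count; i++)
        local += task->data[i];
    task->result = local;
    return NULL;
}

/* Sum data[0..n-1] with n_threads threads.
 * Returns 0 and stores the sum in *total on success; returns -1 on
 * bad arguments or if a thread could not be created. */
int parallel_sum(const int *data, size_t n, int n_threads, long long *total)
{
    pthread_t threads[MAX_THREADS];
    struct sum_task tasks[MAX_THREADS];
    size_t base, extra, offset = 0;
    int created = 0, status = 0;

    if (n_threads < 1 || n_threads > MAX_THREADS || total == NULL)
        return -1;

    base  = n / (size_t)n_threads;
    extra = n % (size_t)n_threads;  /* first `extra` threads take one more */

    for (int i = 0; i < n_threads; i++) {
        tasks[i].data   = data + offset;
        tasks[i].count  = base + ((size_t)i < extra ? 1 : 0);
        tasks[i].result = 0;
        offset += tasks[i].count;
        if (pthread_create(&threads[i], NULL, sum_slice, &tasks[i]) != 0) {
            status = -1;
            break;
        }
        created++;
    }

    /* Join every thread that was started, then combine partial sums. */
    *total = 0;
    for (int i = 0; i < created; i++) {
        pthread_join(threads[i], NULL);
        *total += tasks[i].result;
    }
    return status;
}
```

A caller would fill an int array, call parallel_sum(array, length, n, &total) for each thread count n of interest, and time each call with whatever clock the platform provides. On most systems the sketch is compiled with 'gcc -pthread'. Timings taken this way on other hardware are not comparable to the T1000 runs the report asks for.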