*****************
* E X P R E S S *
*****************

In 1982 a team of researchers at the California Institute of
Technology built the original "Cosmic Cube" - the world's first
practical parallel computer system. Shortly afterwards the same group
began building a software development package that culminated in the
"Crystalline Operating System". In 1987 members of the original
research group started ParaSoft Corporation, a company whose goal was
to make the lessons learned from this research available to all
members of the parallel and distributed computing worlds.

The result is Express: an integrated software package that addresses
all phases of the application development cycle in an architecture
independent manner. Parallel or distributed programs developed under
Express run on all supported platforms, which range from
supercomputers from Cray, IBM, Alliant, INTEL and nCUBE through
networks of personal workstations all the way down to PCs.

All Express programs are source code compatible across all platforms
irrespective of architecture. The same Express program will run on
distributed and shared memory multi-processors, vector supercomputers
and workstation networks. This gives the freedom to choose a
development platform for convenience and then migrate the code to a
target machine for performance.

Over 1000 sites world-wide use Express. Alliant, INTEL and nCUBE all
ship portions of Express as "standard" development tools on their
platforms. The applications developed range from research topics such
as quantum chemistry and high energy physics to commercial products
such as oil reservoir simulation, fluid dynamics and the tracking of
stolen cars in Los Angeles.

The Express package is the only parallel processing system which
addresses all four aspects of parallel program development: algorithm
design, implementation, debugging and performance analysis.

- Algorithm design
  Express supports all the common parallel and distributed
  programming models including data and functional decompositions,
  client/server, distributed database, etc. It includes libraries of
  utilities to support each paradigm.

- Implementation
  Express contains tools to visualize and analyze sequential and
  parallel code and a parallelizer that automatically creates
  distributed programs from sequential C and Fortran.

- Debugging
  Express provides NDB, a powerful distributed debugger.

- Performance analysis
  Four different types of performance analysis are built into the
  system, allowing the behavior of the parallel or distributed
  program to be analyzed, visualized and optimized.

In addition to these features Express 4.0 contains utilities to
support static and dynamic load balancing, fault tolerance and
heterogeneous systems built from multiple hardware types! While other
toolsets address some of these issues, Express is the only one that
has all of these items in a single tightly integrated package.

***********************************
*    The E X P R E S S Toolkit    *
***********************************

The four stages of the program development cycle are addressed with
different portions of the Express toolkit. Some of the highlights are
described below.

Before working on the parallel version of a program it is necessary
to understand what the sequential version does.

- INSIGHT
  INSIGHT allows the progress of sequential algorithms to be
  displayed in a dynamic graphical manner. Updates and references to
  individual data structures can be displayed to explicitly
  demonstrate algorithm structure and provide the detailed knowledge
  necessary for parallelization.

- FTOOL
  FTOOL provides in-depth analysis of a program including variable
  use analysis, flow structure and feedback regarding potential
  parallelization. FTOOL operates on both sequential and parallel
  versions of an application.
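
As an illustration of the "data decomposition" model mentioned under
algorithm design, the following minimal Python sketch splits a data
set into one contiguous block per node. It is not Express code and
does not use the Express libraries; the function names `block_range`
and `decompose` are hypothetical and chosen only for this example.

```python
def block_range(n_items, n_nodes, node):
    """Return the half-open index range [start, stop) owned by `node`."""
    base, extra = divmod(n_items, n_nodes)
    # The first `extra` nodes each take one additional item, so block
    # sizes differ by at most one.
    start = node * base + min(node, extra)
    stop = start + base + (1 if node < extra else 0)
    return start, stop


def decompose(data, n_nodes):
    """Split `data` into one contiguous block per node."""
    blocks = []
    for node in range(n_nodes):
        start, stop = block_range(len(data), n_nodes, node)
        blocks.append(data[start:stop])
    return blocks


if __name__ == "__main__":
    # Ten items shared among four nodes.
    for node, block in enumerate(decompose(list(range(10)), 4)):
        print(node, block)
```

A functional decomposition would instead assign different tasks,
rather than different slices of the same data, to each node.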

The design phase of parallel programming is not effective unless
tools exist to implement the selected strategy.

- ASPAR
  An automated parallelizer that converts sequential C and FORTRAN
  programs for parallel or distributed execution using the Express
  programming models.

- COMMUNICATION LIBRARIES
  Express provides a large library of interprocessor communication
  functions to simplify the programming task. The layered system
  supports several different models including:

  - point-to-point communication through direct message passing.

  - "global communication" primitives to perform operations on
    distributed data in a collective fashion.

  - "distribution" functions which allow data to be moved without
    explicit message passing. Simple instructions distribute and
    re-distribute data in any configuration.

- PARALLEL I/O
  Normal sequential Input/Output operations are not adequate and need
  to be modified to fit within a parallel environment. Express is the
  only software system which addresses this fact with a runtime I/O
  system which extends the functionality of the standard C and
  FORTRAN libraries to include useful multiplexing and parallel
  interfaces that make distributed programs more intuitive and easier
  to develop, use, and maintain.

- PARALLEL GRAPHICS
  Linking the X library into a parallel program does not ensure that
  sensible graphical output will appear from the program. Express
  contains a simple but effective graphics library that supports the
  same type of extensions as the I/O system and displays on most
  types of devices including the X window system, PCs (including
  Windows), Macintoshes, PostScript and many others.

In the early days of program development, debugging was regarded as a
"black art". With the advent of high level interactive debuggers,
developers now take debugging for granted. Until the Express parallel
debugger (NDB) came along, parallel debugging was also regarded as a
"black art".

- NDB
  NDB is a parallel debugger that goes much further than simply
  providing "a window for each node". Commands, which are based on
  the popular "dbx" interface, can be issued to single processors or
  to groups of nodes simultaneously. Programs can be debugged after
  they are distributed over networks, even inhomogeneous networks.

Performance analysis is probably the most underrated area of parallel
and distributed processing. With the practice of parallel/distributed
programming still in its infancy, developers may not understand the
issues that affect the performance of an application.

- CTOOL
  Analyzes high level overhead issues such as the relative amount of
  time spent computing, performing I/O and in interprocessor
  communication.

- ETOOL
  Shows the relationships between various computing elements. Used to
  understand "why" the overheads observed by CTOOL occur by examining
  the cause and effect relationships between actions in different
  processors.

- XTOOL
  Used to understand the way that CPU time is spent on each
  processor. Supports analysis at the subroutine level or down to
  individual source lines or machine instructions.

- DTOOL
  Coming in version 4.0 is a "distribution" animation package that
  makes it possible to watch the data distributed by the program move
  through the nodes. This is the key to understanding the
  relationship between the parallel and sequential versions of an
  application and also to optimizing the distributed code at the
  algorithmic level.

****************************************
* E X P R E S S  I N F O R M A T I O N *
****************************************

1) INSTALLATION, TRAINING, and USE:

Express installation varies depending on the specific platform type.

- On the WORKSTATION, network installation is very simple: copy the
  object code from tape to the directories, execute the "exsetup"
  program, and set the environment variable which points to the
  directory where Express is configured.
  This process has three easy steps and does not need to be performed
  by a system administrator; any user can usually do it.

- On SUPERCOMPUTERS, installation is even simpler. It has only two
  steps: copy the files from the tape and set the environment
  variables which point to the Express version.

- On TRANSPUTER and i860 NETWORKS, where Express also works,
  installation may require installing a UNIX driver to support the
  processors attached to the machine.

TRAINING on usage of Express can be done in several steps:

- TUTORIAL - Each Express documentation set contains a tutorial
  manual. This manual contains 10 different programs describing
  different programming models and constructs. It is available to
  every Express user.

- CLASSES - In addition to this tutorial, ParaSoft provides three
  levels of Express and parallel programming classes. The classes are
  taught either at customer locations or at ParaSoft headquarters. At
  least once a quarter we teach a class at ParaSoft. The first level
  class lasts 2 days. The second and third levels are each 3 day
  classes. We charge $500.00 per student for the first level and
  $700.00 per student for the second and third levels, with a minimum
  class size of 10 students. If the class is taught at the customer's
  location, we charge $5,000.00 for the first level and $7,000.00 for
  second and third level classes. The number of students is not
  limited; however, to obtain full benefit a class should not be
  larger than 20-30 persons. The class is a real programming class,
  with hands-on experience and a lot of materials.

EASE OF USE

Writing programs in Express requires only six (6) basic system calls.
At the simplest level these cover: starting programs, sending and
receiving messages, and reading system configuration information.
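
The three basic operations named above (starting programs, sending
and receiving messages, and reading configuration information) can be
illustrated with a small simulation. The Python sketch below is only
a model of the message passing style: it uses threads and queues from
the Python standard library, and the names `Node`, `start_program`
and `ring_worker` are hypothetical. They are not the Express system
calls, whose names and arguments are given in the Reference Guide.

```python
import queue
import threading


class Node:
    """One simulated processor with its own incoming message queue."""

    def __init__(self, node_id, n_nodes, mailboxes):
        self.node_id = node_id    # configuration: which node am I?
        self.n_nodes = n_nodes    # configuration: how many nodes exist?
        self._mailboxes = mailboxes

    def send(self, dest, payload):
        """Deliver `payload` to node `dest`, tagged with the sender id."""
        self._mailboxes[dest].put((self.node_id, payload))

    def receive(self):
        """Block until a message arrives; return (source, payload)."""
        return self._mailboxes[self.node_id].get()


def start_program(worker, n_nodes):
    """Run `worker(node)` once per simulated node; collect the results."""
    mailboxes = [queue.Queue() for _ in range(n_nodes)]
    results = [None] * n_nodes

    def run(node_id):
        results[node_id] = worker(Node(node_id, n_nodes, mailboxes))

    threads = [threading.Thread(target=run, args=(i,))
               for i in range(n_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def ring_worker(node):
    """Send this node's id to its right-hand neighbour in a ring."""
    right = (node.node_id + 1) % node.n_nodes
    node.send(right, node.node_id)
    _source, payload = node.receive()
    return payload


if __name__ == "__main__":
    # Each node ends up holding the id of its left-hand neighbour.
    print(start_program(ring_worker, 4))
```

The same worker function runs unchanged for any number of nodes
because it reads its position and the node count from the
configuration instead of hard-coding them.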

While most systems stop there, Express goes much further and provides
many other needed utilities: global communication, global data
movement operations, parallel I/O, parallel graphics, automatic
decomposition routines, automatic dynamic load-balancing tools, and
much more. For instance, Express can detect that one of the
workstations in the user's setup has just started to be used and can
move some work, on the fly, to other workstations so that the user's
program executes as efficiently as possible.

2) MODEL INTEGRATION and PORTABILITY:

Express can be used on any machine we support. Express is the only
commercially available system at present which supports such a wide
range of hardware. Express versions can even be mixed on the same
machine. For instance, on a SUN machine which is a front end for an
nCUBE, the same user can use Express both as a workstation version
and as an nCUBE front-end. To accomplish this, the environment
variable simply needs to be switched to point to the desired version
of Express.

Express programs are completely portable between different versions.
This means that not a single line of source code needs to be changed
when moving a program from a network of workstations to a CRAY, iPSC,
nCUBE, transputer network, etc. The only step required is to
recompile, and even the recompilation procedure is the same. We
provide drivers for the compilers on all these machines, so even the
Makefile does not need to change.

Express is a large system but has been very carefully designed so
that all of it is portable. This is very important. A lot of systems,
PVM for instance, are not so designed and cannot support portable
programming between supercomputers and workstations.

3) INTERACTION WITH DISTRIBUTED COMPUTING ENVIRONMENTS/
   HETEROGENEOUS COMPUTING ENVIRONMENTS.

Express is designed for, and works in, both homogeneous and
non-homogeneous parallel and distributed computing environments.
Express Runtime is a library that is linked to user programs.
Express programs are normal programs which are built on specific
machines. There are no restrictions on the use of other systems. For
instance, Express programs on the iPSC can call all NX routines. In
the same fashion, Express programs on a network of workstations can
call all RPC routines. Furthermore, PVM programs can be compiled and
run under Express: we can provide an interface library which has all
the PVM calls.

Express by itself supports heterogeneous programming. We have
programs which run on a CRAY and a DELTA at the same time. At this
stage, we are not aware of anybody else who can do this. Our programs
run in parallel on the CRAY and inside the DELTA at the same time and
communicate through regular Express calls. Of course workstation
types can be mixed and matched. Express and XDR functions can also be
called. Express also provides a large library to convert data types.

In-house we have a tool which can automatically convert a regular
parallel program which does not have any heterogeneous constructs in
it into a program which can run on heterogeneous types of machines.
This is a new technology and we have not yet made a formal production
release.

4) MODULARITY/ EASE OF ENHANCEMENT

Express is designed to be, and is, very modular. It has basically
three layers: the I/O layer, the common Express layer and the
architecture specific layer. All this is described in our source code
manual. If Express were not designed this way it would be very
difficult to run on all the machines which we support now and will
support in the future.

The other important feature is specific to the network version of
Express. This version has extra flexibility: it allows users to write
their own Express drivers for new and diverse hardware types.
Normally the network version of Express uses the TCP/IP and UDP/IP
protocols. However, there is a trend to support other ways of
connecting machines. Express has this ability and currently we
support V7, Bit3, and Junior networks.
These networks allow for a dramatic reduction in latency and an
increase in transmission rate. They are supported through the driver
mechanism mentioned above. Normally a driver is a piece of code on
the order of 200 lines of C, which is linked to the Express code. The
driver runs at the user level. It allows Express users to have
diverse networks running in a matter of hours. For instance, it is
very simple to get HIPPI running in a bare protocol through these
drivers. Again, the documentation we provide for our customers is a
short manual, very much like a "cook-book", that describes how to
write these drivers. Of course Express supports ULTRANET, token ring,
FDDI and all other networks which support TCP/IP by default.

5) LANGUAGES, MACHINES, NETWORKS SUPPORTED

Express 3.2.5 is available in both C and Fortran, with ADA under
development.

PLATFORMS SUPPORTED are:

  INTEL        - iPSC2, iPSC/i860, iWARP, DELTA
  CRAY         - X-MP, Y-MP
  IBM          - 370/3090, ES9000
  nCUBE        - nCUBE2, 2E and 2S
  Workstations - HP9000/700, IBM RS/6000, SGI, SUN and DEC

We are currently in the process of porting to Paragon, CM5, KSR-1,
Convex MPP, Cray's MPP, and several non-domestic supercomputing
platforms. Recently completed ports also include IBM's PVS (Power
Visualization System) and the HPSSL Multiprocessor.

Another feature which Express provides is message forwarding, and
with it the ability to support inhomogeneous communication hardware.
It is possible to run programs on Express and have part of the system
communicate through TCP/IP and the other part communicate through a
V7 network at the same time. The same holds for HIPPI. Express is the
only parallel system that we know of which can do this now!

6) NETWORK INTELLIGENCE

- Express automatically detects the network interface on the system.
  It knows whether to use TCP/IP, V7, Bit3 or HIPPI without human
  intervention!

- Express also has knowledge about network performance encoded in it.
  When it generates network traffic it is careful not to generate
  traffic which will degrade the performance of the program.

- Express also has the ability to detect whether the machine on which
  it runs is overloaded and to move work smoothly to other machines
  without any user intervention. No other parallel system we are
  aware of provides this level of technology.

7) DOCUMENTATION

Express documentation is very extensive. The basic manual set is
provided in two volumes: User Guide (approximately 300 pages) and
Reference Guide (also approximately 300 pages). Each language has its
own set of manuals, which means that there is a separate User Guide
and Reference Guide for C and for Fortran. The same set is in
preparation for ADA. Express has a tutorial of about 50 pages, which
is also provided as separate manuals for C and Fortran. In addition
there are other manuals which are common to all Express systems. Each
particular implementation has an Introductory Guide of about 30
pages. Each of the manuals described has an extensive, common index.

We provide the manuals in hard copy and we also ship PostScript files
of the manuals with each system. This allows users to print
additional copies of the manuals on their PostScript printers. In
addition each system comes with the Reference Guide on-line in the
form of manual pages. As mentioned before, Express has a Source Code
manual, a Drivers manual, and some additional smaller manuals.

8) PRICING

The cost of the Express system depends on several factors including
the hardware type and whether it is for commercial, government or
academic use. There are several very attractive pricing programs for
each.

WORKSTATIONS:

- UNIVERSITY - We now offer a site license for ALL workstation
  versions (DEC, HP, IBM, SGI and SUN at this time) with unlimited
  usage (users, machines, and types) for an annual fee of $2,000.
  ($3,000 overseas.) This program includes: support, updates,
  manuals, and both the C and FORTRAN versions of Express on Sun, IBM
  RS/6000, HP, SGI and DEC machines, or any combination of these
  running as a heterogeneous environment.

- GOVERNMENT - A similar program for these agencies is being
  considered for a somewhat higher annual fee, which will be
  announced in '93.

- COMMERCIAL - The regular price of Express for workstations starts
  at just $1,500.00 per language per architecture (4 workstations and
  a single language for a single architecture). For a version of
  Express with two languages, the price is $3,000.00 for up to 4
  workstations. To add an architecture, add the price of the C and/or
  FORTRAN licenses ($1,500.00 or $3,000.00). If more than 4
  workstations are being used, additional workstation licenses may be
  added at a cost of $500.00 per additional workstation.

SUPERCOMPUTERS - For supercomputers, Express is priced differently.
The prices start at only $3,600.00 per language per machine. For
instance the price of Express on an Intel machine for both languages
is $7,200.00.

SOURCE CODE - The runtime source code is available for $5,000.00.
This includes a generic version of the source code and documentation
ready to lead the user through a port. The source code is sold in
such a form that it is ready to be ported to new architectures.
However, the source license does not entitle the user to sell the
ported Express version; this needs to be negotiated. Runtime source
code purchases also do not include the source for the debugger and
performance monitor tools. These are separate source codes.

9) MAINTENANCE/SUPPORT

Each Express user is entitled to buy a maintenance contract which is
priced at 20% of the current price of the system. The contract is for
one year and can be renewed annually. During the contract the
customer receives updates, phone/fax/email support, bug fixes, and
help in parallelizing code.
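
The commercial workstation prices in section 8 and the 20%
maintenance rate above can be combined in a short calculation. The
Python sketch below is an illustration only, not an official quote.
It assumes that the $500.00 charge applies once per workstation
beyond the first four, regardless of how many languages or
architectures are licensed; that reading, and the function names,
are assumptions made for this example.

```python
BASE_PER_LANGUAGE = 1500       # dollars per language per architecture
INCLUDED_WORKSTATIONS = 4      # workstations covered by the base price
EXTRA_WORKSTATION = 500        # dollars per additional workstation
MAINTENANCE_PERCENT = 20       # yearly contract, percent of system price


def workstation_price(languages, architectures, workstations):
    """Commercial workstation price in dollars under the stated assumptions."""
    base = BASE_PER_LANGUAGE * languages * architectures
    # Assumption: extra workstations are charged once each, not per language.
    extra = max(0, workstations - INCLUDED_WORKSTATIONS) * EXTRA_WORKSTATION
    return base + extra


def maintenance_fee(system_price):
    """One year of maintenance at 20% of the current system price."""
    return system_price * MAINTENANCE_PERCENT // 100


if __name__ == "__main__":
    # Two languages (C and FORTRAN), one architecture, six workstations.
    price = workstation_price(languages=2, architectures=1, workstations=6)
    print("system price:", price)
    print("yearly maintenance:", maintenance_fee(price))
```

For the configuration in the example the base price is $3,000.00, the
two additional workstations add $1,000.00, and one year of
maintenance on the resulting $4,000.00 system is $800.00.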

Any customer who has not been on a maintenance contract can purchase
an upgrade of the system for 30% of the current cost of the system.
Universities and Government agencies have the option to be on annual
contracts which provide all the services described. Our support
policy is to respond to customer problems within 24 hours.

For further information about the Express software development
package contact:

  Arthur Hicken
  ParaSoft Corporation                  Phone : (818) 792-9941
  2500, E. Foothill Blvd., Suite 205    FAX   : (818) 792-0819
  Pasadena, CA 91107                    E-mail: ahicken@parasoft.com