FORGE 90 (as it is now called) is a product of Applied Parallel Research (main office: 550 Main St., Placerville, CA 95667; 916 621-1600). It is an interactive vectorizer/parallelizer for both shared- and distributed-memory systems. FORGE was originally developed at Pacific-Sierra Research; APR was formed by FORGE's developers as a new startup in January. APR has offices in Placerville, Berkeley, and Topanga, CA.

FORGE 90 is a layered product that runs on most Unix workstations under X Windows, SunView, OpenWindows, etc. The Baseline level of FORGE 90 is a global program explorer that provides a number of detailed views of a complete Fortran program, including:

 o call tree
 o COMMON block grid (routine vs. block vs. variable)
 o data and control flow analysis (how can I get to this statement;
   where is this variable set before it is used at this statement; etc.)
 o performance analyzer down to the DO-loop level
 o source code reformatter (tidy)

The view is global through subprogram calls and variable aliasing because FORGE 90 is based on a database representation of the entire program as a complete entity. Add-on features include:

 o a vectorizer for Cray-type vector processors
 o a DO-loop parallelizer for shared-memory systems
 o a DO-loop parallelizer for distributed memory (previously called "MIMDizer")

The vectorizer and parallelizers can handle loops that contain CALLs because their analysis is global. FORGE 90 generates compiler directives to accomplish parallelization on a number of systems. The shared-memory parallelizer analyzes DO-loop array and scalar dependencies across subprogram calls, automatically scopes the variables in a loop as PRIVATE or SHARED and as LOCAL or GLOBAL, and displays the results of its analysis to the user. Both ordered and critical regions are displayed, along with the array references that cause them. FORGE 90 inserts the appropriate compiler directives to assign variable scope and synchronize regions.
FORGE 90's distributed-memory parallelizer module allows the user to choose cyclic, block, or replicated decomposition schemes for arrays. FORGE 90 generates a parallelized version of the user program to run on every node. Spread DO loops are modified in accordance with the chosen decomposition, with all node communications inserted automatically. Programs parallelized for distributed-memory systems interface with APR's own run-time library, which in turn interfaces with any of the standard message-passing libraries: PVM, Express, and even IBM's new EUI library.

New features announced at Supercomputing '92 include a parallel profiler and a performance predictor, which can gather performance statistics at run time and predict performance on other systems.

For more information, email me or forge@netcom.com

--
/\=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=/\
\/ Richard Friedman   (510) 528-7055    | rchrd@netcom.com    \/
/\ Applied Parallel Research (Berkeley) |                     /\
\/=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\/

Newsgroups: comp.parallel
Path: kale.cs.uiuc.edu!kale
From: kale@cs.uiuc.edu
Subject: CHARM : more information
Sender: news@cs.uiuc.edu
Organization: Dept. of Computer Sci - University of Illinois
Date: Mon, 4 Jan 1993 22:23:28 GMT
Lines: 173
Apparently-To: comp-parallel@uunet.uu.net

Thanks for the response to our announcement of the CHARM (v3.2) parallel programming system. This posting includes:

1. A correction to the previous note: the name of the directory containing CHARM at a.cs.uiuc.edu is pub/CHARM (and not pub/charm as posted earlier).
2. Information about a new mailing list for CHARM-related discussions and bug reports.
3. A brief description of some of CHARM's features, as some of you requested.

--------------------------------------------------------------

A mailing list for CHARM-related discussions has been created.
To become a member of this list, send email to listserv@cs.uiuc.edu containing the following line in the body of the message:

  subscribe charm

To send mail to everyone on the list, send it to charm@cs.uiuc.edu. Bug reports should be sent to charmbugs@cs.uiuc.edu.

--------------------------------------------------------------

Brief description of CHARM features:

CHARM is a machine-independent parallel programming system. Programs written using this system run unchanged on MIMD machines with or without shared memory. It provides high-level mechanisms and strategies to facilitate the development of even highly complex parallel applications. Programs are written in C with a few syntactic extensions; it is possible to interface to other languages such as FORTRAN using the foreign language interface that C provides.

Platforms: The system currently runs on Intel's iPSC/860 and iPSC/2, the NCUBE, Encore Multimax, Sequent Symmetry, Alliant FX/8, single-processor UNIX machines, and networks of workstations. It is being ported to the CM-5, Parsytec GC-el, and Alliant FX/2800. We plan to port it to other parallel machines as they become available.

The design of the system is based on the following tenets:

1. Efficient Portability: Locality of data is the cost measure that applies across all MIMD machines. The system design induces better data locality; this is how it can support machine independence without losing efficiency.

2. Latency Tolerance: The latency of communication - the fact that remote data takes longer to access - is a significant issue common to most MIMD platforms. Message-driven execution, supported in CHARM, is a very useful mechanism for tolerating or hiding this latency. In message-driven execution (which is distinct from mere message passing), a processor is allocated to a process only when a message for that process is received. This means that when a process blocks waiting for a message, another process may execute on the processor.
It also means that a single process may block for any number of distinct messages and will be awakened when any of them arrives. Thus it forms an effective way of scheduling a processor in the presence of potentially large latencies.

3. Dynamic Load Balancing: Dynamic creation of work is necessary in many application programs. CHARM supports this by providing dynamic (as well as static) load balancing strategies.

4. Specific Information Sharing Modes: A major activity in a parallel computation is the creation and sharing of information, and information is shared in many specific modes. The system provides six information-sharing modes, each of which may be implemented differently and efficiently on different machines.

5. Reuse and Modularity: It should be possible to develop parallel software by reusing existing parallel software. CHARM supports this with a well-developed ``module'' construct and associated mechanisms. These mechanisms allow composition of modules without sacrificing latency tolerance; with them, two modules, each spread over hundreds of processors, may exchange data in a distributed fashion.

The Programming Model: Programs consist of potentially small-grained processes (called chares) and a special type of replicated processes. These processes interact with each other via messages and any of the other information-sharing abstractions. There may be thousands of small-grained processes on each processor, or just a few, depending on the application. The replicated processes can also be used to implement novel information-sharing abstractions, distributed data structures, and intermodule interfaces. The system can be considered a concurrent object-oriented system with a clear separation between sequential and parallel objects.

Libraries: The modularity-related features make the system very attractive for building library modules that are highly reusable because they can be used with a variety of data distributions.
We have just begun the process of building such libraries and have a small collection of library modules. However, we expect such libraries, contributed by us and by other users, to be one of the most significant aspects of the system.

Regular and Irregular Computations: For regular computations, the system is useful because it provides portability, static load balancing, and latency tolerance via message-driven execution, and it facilitates the construction and flexible reuse of libraries. The system is unique in the extensive support it provides for highly irregular computations. This includes management of many small-grained processes, support for prioritization, dynamic load balancing strategies, handling of dynamic data structures such as lists and graphs, etc. The specific information-sharing modes are especially useful for such computations.

The system has been used for many applications. For example, many VLSI-CAD applications (including test-pattern generation, circuit extraction, etc.) were implemented by B. Ramkumar and P. Banerjee using CHARM. An implementation of the Actors language was carried out by Chris Houck under the direction of Gul Agha. In addition, several CFD algorithms, AI algorithms, a molecular dynamics program, etc. have been written using CHARM.

Distribution: The system is available (for research use and evaluation) by anonymous ftp from a.cs.uiuc.edu, under the directory pub/CK. The files there include a manual, installation scripts, a README file that explains how to install the system on your machine, and the requisite source/object files.

Associated Tools:

1. Dagger: allows specification of dependences between messages and sub-computations within a single process, provides a pictorial view of this dependence graph, and simplifies management of message-driven execution.

2. Projections: a performance visualization and feedback tool. The system has a much more refined understanding of the user's computation than is possible in traditional tools.
Thus it can provide much more specific feedback. Future versions will be able to provide recommendations and suggestions for improving performance.

Future Plans: Some of the ongoing projects based on CHARM include:

1. DP-Charm: a data-parallel language providing a subset of HPF features.

2. Other base languages: the basic ideas and techniques used in CHARM are independent of the base language. We plan to incorporate them into other languages, including FORTRAN, C++, and Scheme.

3. Debugging Tools: based on the specificity of information available to the runtime system, we plan to develop both trace-based and run-time debugging tools for CHARM.

Contact:
L.V. Kale                                    kale@cs.uiuc.edu
Department of Computer Science               (217) 244-0094
University of Illinois at Urbana-Champaign
1304 W. Springfield Ave., Urbana, IL 61801