Technical Program

Monday, 6/21

9:30 a.m. -- 9:45 a.m.
Conference Opening:
Theodore Papatheodorou, University of Patras

9:45 a.m. -- 10:00 a.m.
Program Committee Address:
Constantine D. Polychronopoulos, University of Illinois

10:00 a.m. -- 11:00 a.m.
High Performance Computing Using Standard Building Blocks
Richard Wirt, Intel

11:00 a.m. -- 11:30 a.m.
Coffee Break

11:30 a.m. -- 1:00 p.m.
Session 1: Fine-grain Parallelism
Session Chair: Arvind, MIT
Adding a Vector Unit to a Superscalar Processor
Francisca Quintana, Jesus Corbal, Roger Espasa, Mateo Valero, Universitat Politecnica de Catalunya

Exploiting SIMD Parallelism in DSP and Multimedia Algorithms Using the AltiVec Technology
Huy Nguyen, Lizy Kurian John, University of Texas at Austin

Improving the Performance of Speculatively Parallel Applications on the Hydra CMP
Kunle Olukotun, Lance Hammond, Mark Willey, Stanford University

1:00 p.m. -- 2:30 p.m.
Lunch Break

2:30 p.m. -- 4:00 p.m.
Session 2A: Cache Memories
Session Chair: Mario Furnari, CNR
The Pool of Subsectors Cache Design
Jeffrey B. Rothman, Alan Jay Smith, University of California, Berkeley

Symmetry and Performance in Consistency Protocols
Peter J. Keleher, University of Maryland

A Locality Sensitive Multi-Module Cache with Explicit Management
Jesus Sanchez, Antonio Gonzalez, Universitat Politecnica de Catalunya

Session 2B: Scheduling & Communication
Session Chair: Vivek Sarkar, IBM
A New "Quad-Tree-Based" Sub-System Allocation Technique for Mesh-connected Parallel Machines
Jeeraporn Srisawat, Nikitas A. Alexandridis, George Washington University

On the Complexity of List Scheduling Algorithms for Distributed-Memory Systems
Andrei Radulescu, Arjan J.C. van Gemund, Delft University of Technology

Communication Conscious Radix Sort
Daniel Jimenez-Gonzalez, Josep-L. Larriba-Pey, Juan Navarro, Universitat Politecnica de Catalunya

4:00 p.m. -- 4:30 p.m.
Coffee Break

4:30 p.m. -- 6:00 p.m.
Session 3A: OS & Runtime Support
Session Chair: Efthymios Housos, University of Patras
Eliminating Synchronization Bottlenecks in Object-Based Programs Using Adaptive Replication
Martin Rinard, MIT, Pedro Diniz, University of Southern California

Mechanisms and Policies for Supporting Fine-Grained Cycle Stealing
Kyung Dong Ryu, Jeffrey K. Hollingsworth, Peter J. Keleher, University of Maryland

Responsiveness without Interrupts
Dejan Perkovic, Peter J. Keleher, University of Maryland

Session 3B: Branch and Value Prediction Techniques
Session Chair: Milind Girkar, Intel
Reducing Branch Misprediction Penalties Via Dynamic Control Independence Detection
Yuan Chou, Jason Fung, John Paul Shen, Carnegie Mellon University

Software Trace Cache
Alex Ramirez, Josep-L. Larriba-Pey, Carlos Navarro, Universitat Politecnica de Catalunya, Josep Torrellas, University of Illinois at Urbana-Champaign, Mateo Valero, Universitat Politecnica de Catalunya

Cyclic Dependence Based Data Reference Prediction
Chi-Hung Chi, Chin-Ming Cheung, Jun-Li Yuan, National University of Singapore

7:00 p.m. --
Conference Reception

Tuesday, 6/22

8:30 a.m. -- 9:30 a.m.
Invited Talk:
Instruction Level Distributed Processing
James Smith, University of Wisconsin, Madison

9:30 a.m. -- 11:00 a.m.
Session 4: Adaptive Caches & Optimization
Session Chair: Gurindar Sohi, University of Wisconsin, Madison
CACHET: An Adaptive Cache Coherence Protocol for Distributed Shared-Memory Systems
Xiaowei Shen, Arvind, Larry Rudolph, MIT

Adapting Cache Line Size to Application Behavior
Alexander V. Veidenbaum, Weiyu Tang, Rajesh Gupta, Alexandru Nicolau, Xiaomei Ji, University of California, Irvine

Reducing Cache Misses Using Hardware and Software Page Placement
Timothy Sherwood, Brad Calder, University of California, San Diego, Joel Emer, Compaq Computer Corporation

11:00 a.m. -- 11:30 a.m.
Coffee Break

11:30 a.m. -- 1:00 p.m.
Session 5: Shared Virtual Memory Systems
Session Chair: Masaru Kitsuregawa, University of Tokyo
Application Scaling under Shared Virtual Memory on a Cluster of SMPs
Dongming Jiang, Brian O'Kelley, Xiang Yu, Sanjeev Kumar, Princeton University, Angelos Bilas, University of Toronto, Jaswinder Pal Singh, Princeton University

Shared Virtual Memory with Automatic Update Support
Liviu Iftode, Rutgers University, Matthias Blumrich, IBM T.J. Watson Research Center, Cezary Dubnicki, NEC Research Institute, David L. Oppenheimer, University of California, Berkeley, Jaswinder Pal Singh, Kai Li, Princeton University

Realizing the Performance Potential of the Virtual Interface Architecture
Evan Speight, Hazim Abdel-Shafi, John K. Bennett, Rice University

1:00 p.m. -- 2:30 p.m.
Lunch Break

2:30 p.m. -- 4:00 p.m.
Session 6A: Interconnection Networks
Session Chair: Christos Nicolaou, University of Crete
Low-level Router Design and its Impact on Supercomputer System Performance
Valentin Puente, Jose A. Gregorio, Universidad de Cantabria, Cruz Izu, University of Adelaide, Ramon Beivide, Fernando Vallejo, Universidad de Cantabria

Improving the Performance of Bristled CC-NUMA Systems Using Virtual Channels and Adaptivity
Jose F. Martinez, Josep Torrellas, University of Illinois at Urbana-Champaign, Jose Duato, Universidad Politecnica de Valencia

A New Method to Make Communication Latency Uniform: Distributed Routing Balancing
Daniel Franco, Indhira Garcés, Emilio Luque, Universitat Autonoma de Barcelona

Session 6B: Parallelizing Compilers
Session Chair: Utpal Banerjee, Intel
New Shape Analysis Techniques for Automatic Parallelization of C Codes
Francisco Corbera, Rafael Asenjo, Emilio L. Zapata, University of Malaga

An Affine Partitioning Algorithm to Maximize Parallelism and Minimize Communication
Amy W. Lim, Gerald I. Cheong, Monica S. Lam, Stanford University

A Graphic Parallelizing Environment for User-Compiler Interaction
C. R. Calidonna, M. Giordano, Mario Mango Furnari, Istituto di Cibernetica C.N.R.

4:00 p.m. -- 4:30 p.m.
Coffee Break

4:30 p.m. -- 6:00 p.m.
Session 7A: Distributed & Cluster Computing
Session Chair: Kai Li, Princeton University
Dynamic Remote Memory Acquisition for Parallel Data Mining on ATM-Connected PC Cluster
Masato Oguchi, University of Tokyo / Aachen University of Technology, Masaru Kitsuregawa, University of Tokyo

Parallel I/O for Scientific Applications on Heterogeneous Clusters: A Resource-utilization Approach
Yong E. Cho, Marianne Winslett, Szu-wen Kuo, Jonghyun Lee, University of Illinois, Ying Chen, IBM Almaden Research Center

The Design and Evaluation of High Performance Communication using a Gigabit Ethernet
Shinji Sumimoto, Hiroshi Tezuka, Atsushi Hori, Hiroshi Harada, Toshiyuki Takahashi, Yutaka Ishikawa, Real World Computing Partnership

Session 7B: Parallel Systems: Modeling, Design & Performance
Session Chair: Monica Lam, Stanford University
The Scalability of Multigrain Systems
Donald Yeung, University of Maryland

A Comparative Analysis of Four Parallelisation Schemes
Nandini Mukherjee, John R. Gurd, University of Manchester

A Design Analysis of a Hybrid Technology Multithreaded Architecture for Petaflops Scale Computation
Thomas Sterling, Larry Bergman, Jet Propulsion Laboratory, California Institute of Technology

7:00 p.m. --
Piano Concert at the Palace of the Knights

Wednesday, 6/23

8:30 a.m. -- 9:30 a.m.
Invited Talk:

9:30 a.m. -- 11:00 a.m.
Session 8: Parallel Programming Models
Session Chair: James Browne, University of Texas at Austin
Thread Fork/Join Techniques for Multi-level Parallelism Exploitation in NUMA Multiprocessors
Xavier Martorell, Eduard Ayguade, Nacho Navarro, Julita Corbalan, Marc Gonzalez, Jesus Labarta, Universitat Politecnica de Catalunya

SMARTS: Exploiting Temporal Locality and Parallelism through Vertical Execution
Suvas Vajracharya, Steve Karmesin, Peter Beckman, James Crotinger, Los Alamos National Laboratory, Allen Malony, Sameer Shende, University of Oregon, Rod Oldehoeft, Stephen Smith, Los Alamos National Laboratory

Problem Space Promotion and Its Evaluation as a Technique for Efficient Parallel Computation
Bradford Chamberlain, E Christopher Lewis, Lawrence Snyder, University of Washington

11:00 a.m. -- 11:30 a.m.
Coffee Break

11:30 a.m. -- 1:00 p.m.
Session 9: Performance Evaluation
Session Chair: Elias Houstis, Purdue University
A Quantitative Architectural Evaluation of Synchronization Algorithms and Disciplines on ccNUMA Systems: The Case of the SGI Origin2000
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, University of Patras

A Comparison of MPI, SHMEM and Cache-coherent Shared Address Space Programming Models on the SGI Origin2000
Hongzhang Shan, Jaswinder Pal Singh, Princeton University

Comparing the Memory System Performance of the HP V-Class and SGI Origin 2000 Multiprocessors using Microbenchmarks and Scientific Applications
Ravi Iyer, Nancy Amato, Lawrence Rauchwerger, Laxmi Bhuyan, Texas A&M University

1:00 p.m. -- 2:30 p.m.
Lunch Break

2:30 p.m. -- 7:00 p.m.
Excursion to Rhodes' historical sites

7:00 p.m. --
Conference Banquet

Thursday, 6/24

8:30 a.m. -- 9:30 a.m.
Invited Talk:

9:30 a.m. -- 11:00 a.m.
Session 10: Multithreaded & ILP Architectures
Session Chair: Kemal Ebcioglu, IBM
Increasing Effective IPC by Exploiting Distant Parallelism
Ivan Martel, Daniel Ortega, Eduard Ayguade, Mateo Valero, Universitat Politecnica de Catalunya

Improving Virtual Function Call Target Prediction via Dependence-Based Pre-Computation
Amir Roth, Andreas Moshovos, Gurindar S. Sohi, University of Wisconsin-Madison

Clustered Speculative Multithreaded Processors
Pedro Marcuello, Antonio Gonzalez, Universitat Politecnica de Catalunya

11:00 a.m. -- 11:30 a.m.
Coffee Break

11:30 a.m. -- 1:00 p.m.
Session 11: Cluster Computing
Session Chair: Yoichi Muraoka, Waseda University
Fast Cluster Failover Using Virtual Memory-Mapped Communication
Yuanyuan Zhou, Princeton University, Peter Chen, University of Michigan, Kai Li, Princeton University

Performance Impact of Proxies in Data Intensive Client-Server Applications
Michael D. Beynon, Alan Sussman, Joel Saltz, University of Maryland

A Comparison of Two Approaches for Independent Scaling Up of Processing and Communication Capacities in Multicomputer Networks
Adolfo Ferre-Vilaplana, Jose M. Bernabeu-Auban, Universidad Politecnica de Valencia

1:00 p.m. -- 2:30 p.m.
Lunch Break

2:30 p.m. -- 4:00 p.m.
Session 12A: Instruction-Level Parallelism
Session Chair: Nick Carter, University of Illinois
Classifying Load and Store Instructions for Memory Renaming
Glenn Reinman, Brad Calder, Dean Tullsen, University of California, San Diego, Gary Tyson, University of Michigan, Todd Austin, Intel Corporation

Reorganizing Global Schedules for Register Allocation
Gang Chen, GTE Laboratories, Inc., Michael D. Smith, Harvard University

Resource Usage Models for Instruction Scheduling: Two New Models and a Classification
V. Janaki Ramanan, R. Govindarajan, Indian Institute of Science

Session 12B: Memory Hierarchies
Session Chair: Wolfgang Nagel, Dresden University of Technology
Improving Memory Hierarchy Performance for Irregular Applications
John Mellor-Crummey, Rice University, David Whalley, Florida State University, Ken Kennedy, Rice University

High-Level Semantic Optimization of Numerical Codes
Vijay Menon, Keshav Pingali, Cornell University

Nonlinear Array Layouts for Hierarchical Memory Systems
Siddhartha Chatterjee, Vibhor V. Jain, University of North Carolina, Alvin R. Lebeck, Duke University, Shyam Mundhra, University of North Carolina, Mithuna Thottethodi, Duke University

4:00 p.m. -- 4:30 p.m.
Coffee Break

4:30 p.m. -- 6:00 p.m.
Session 13A: Memory & Processor Architectures
Session Chair: Alex Veidenbaum, University of California, Irvine
Microservers: A New Memory Semantics for Massively Parallel Computing
Jay B. Brockman, Peter M. Kogge, Vincent W. Freeh, Shannon K. Kuntz, University of Notre Dame, Thomas L. Sterling, California Institute of Technology

Efficient Management of Memory Hierarchies in Embedded DRAM Systems
Ashley Saulsbury, Su-Jaen Huang, Sun Microsystems Laboratories, Fredrik Dahlgren, Chalmers University of Technology

Dynamic Removal of Redundant Computations
Carlos Molina, Antonio Gonzalez, Jordi Tubella, Universitat Politecnica de Catalunya

Session 13B: Memory Hierarchy Optimizations
Session Chair: Josep Torrellas, University of Illinois at Urbana-Champaign
An Experimental Evaluation of Tiling and Shackling for Memory Hierarchy Management
Induprakas Kodukula, Keshav Pingali, Cornell University, Robert Cox, Dror Maydan, Silicon Graphics Inc.

A Tile Selection Algorithm for Data Locality and Cache Interference
Jacqueline Chame, Sungdo Moon, University of Southern California

An Integer Linear Programming Approach for Optimizing Cache Locality
Mahmut Kandemir, Prith Banerjee, Alok Choudhary, Northwestern University, Jagannathan Ramanujam, Louisiana State University, Eduard Ayguade, Universitat Politecnica de Catalunya