Richard Brent

Oxford University

*Computational Number Theory*

Number theory provides an infinite supply of computational problems. Some of these are of great practical interest because they are relevant to computer security, authentication, public-key cryptography, etc. For example, a popular public-key algorithm, the RSA algorithm, uses exponentiation modulo a large composite number. To speed up encryption and decryption, the exponentiation should be performed as efficiently as possible. The size of the composite number depends on what level of security is desired. It should not be possible for someone to find the prime factors of this number, either now or in the near future, using a fast computer with the best known algorithms. In practice this means that the composite number should have at least 150 decimal digits. Alternatives to the RSA algorithm involve the discrete logarithm problem. In this talk I will discuss the integer factorization and discrete logarithm problems and their applications, and outline the current state of algorithms for their solution.

Paul Coddington

University of Adelaide

*Improved Algorithms and Tests for Random Number Generators*

Developing fast, high-quality, general-purpose random number generators is a notoriously difficult task, and over the years many widely-used algorithms for generating random numbers have proven to be inadequate for use with more powerful computers and new applications. Developing stringent tests for random number generators is also a challenge. Recently some random number generators have been shown to be inadequate for a number of computational science applications, even though they pass standard statistical tests. This has led to the development of empirical tests based on these applications. In this talk we discuss recent progress in the development of new algorithms and tests for random number generators, particularly on parallel computers, which is an even more challenging problem.

Berwin Turlach

ANU

*Applications of constrained optimisation in statistics*

Tibshirani [1996, JRSS B, Vol. 58, No. 1: 267--288] proposes the ``Least Absolute Shrinkage and Selection Operator'' (lasso). This methods seems to be a very powerful variable selection tool. Essentially, to obtain the estimate we have to solve a least squares problem while putting a constraint onto the L1-norm of the vector of parameters. In this talk we shall present an algorithm that allows to calculate the lasso efficiently. If time allows we shall illustrate the application of this methodology to the problem of knot selection for regression splines.

Steve Roberts

ANU

*Finite Element Thin Plate Splines for Data Mining Applications*

Thin plate splines have been used successfully to model curves and surfaces. A new application is in data mining where they are used to model interaction terms. These interaction splines break the ``curse of dimensionality'' by reducing the high-dimensional nonparametric regression problem to the determination of a set of interdependent surfaces. However, the determination of the corresponding thin plate splines requires the solution of a dense linear system of equations of order $n$ where $n$ is the number of observations. For data mining applications $n$ can be in the millions, and so standard thin plate splines, even using fast algorithms may not be practical. A finite element approximation of the thin plate splines will be described. The method uses $H^1$ elements in a formulation which only needs first order derivatives. The resolution of the method is chosen independently of the number of observations which only need to be read from secondary storage once and do not need to be stored in memory. The formulation leads to a saddle point problem. Convergence and solution of the method and its relationship to the standard thin plate splines will be discussed.

Roger Sidje

University of Queensland

*Alternatives for Parallel Krylov Basis Computation*

Numerical methods related to Krylov subspaces are widely used in large sparse numerical linear algebra. Vectors in these subspaces are manipulated via their representation onto orthonormal bases. Nowadays, on serial computers, the method of Arnoldi is considered as a reliable technique for constructing such bases. However, although easily parallelisable, this technique is not as scalable as expected for communications. In this work we examine alternative methods aimed at overcoming this drawback. Since they retrieve upon completion the same information as the Arnoldi's algorithm does, they enable us to design a wide family of stable and scalable Krylov approximation methods for various parallel environments. We present timing results obtained from their implementation on two distributed-memory multiprocessor supercomputers: the Intel Paragon and the IBM Scalable POWERparallel SP2.

Lutz Grosz

ANU

*Black-Box Implementation of the Algebraic Multilevel Iteration*

When solving a system of linear equation Ax=b with large scale and sparse coefficient matrix A a preconditioner is used to accelerate the convergence of an iterative solver and to improve its robustness. The talk discusses the implementation of an algebraic multi-level preconditioner for in the framework of a numerical software library on a vector and parallel-vector computer using a data-parallel programming language. Especially the case of an M-matrix stored in a diagonal storage scheme (e.g. arising from the finite difference discretisation on a rectangular grid) is considered. We will discuss the calculation of the coarse level matrices by an alternating direction technique and the independent selection of the optimal number of levels to get a minimal computing time. Some examples will show the flexibility, robustness and efficiency of the method.

Markus Hegland

ANU

*Fast Fourier Transforms*

Fast Fourier transforms are one of the "staple tools" of computational science and engineering. They are relevant to weather forecast, quantum chemistry, electric and flow field computations and image and signal processing. Many of these applications, like weather forecast, are time critical and so computational performance of FFT algorithms is of prime importance. The FFT performance is one of the indicators used, for example in the NAS kernels, to benchmark supercomputer performance. In order to achieve high performance, FFT algorithms need to carefully consider the hardware architecture, floating point operation counts, rounding errors and data movement. In this talk I will discuss some of these issues and recent developments of FFT algorithms and implementations.

Michael Stewart

ANU

*Stability Issues for Large Toeplitz Systems*

Toeplitz systems of equations arise in many problems in systems theory and signal and image processing. There are several different approaches for exploiting the structure of such systems to obtain fast algorithms. In recent years, the stability properties of these algorithms have come to be well understood with the exception of the fastest $O(n \log^2(n))$ direct methods. In this work, we examine the stability properties of these algorithms and their potential applicability to large, ill-conditioned problems arising in image and signal processing.

David Abramson

*
Very High Performance Parametric Modelling on the Global Grid
*

This talk will discuss the design and implementation of a system which allows users to pose very large parametric modelling experiments, and execute these on a global meta-computer. The system, called Nimrod/G, is based on a commercially available tool, Clustor, and is implemented on top of the Globus tool kit. The talk will give some case studies of how Clustor is used, followed by a discussion of scheduling issues on a global metacomputer using Globus. More information on Nimrod/G is available from http://www.dgs.monash.edu.au/~davida/nimrod.html, Clustor from http://www.activetools.com, and globus from http://www.globus.org. The Nimrod/G project is supported by the Distributed Systems Technology CRC.