I'm hanging out at Argonne this month so I won't be able to make the
basin meeting. However, I've been checking out basin-related things
while I've been here.
1. I read about pMatlab:
@techreport{bliss-2006,
  author = {Bliss, Nadya and Kepner, Jeremy},
  year   = {2006},
  title  = {pMatlab Parallel Matlab Library},
  url    = {http://www.citebase.org/abstract?id=oai:arXiv.org:astro-ph/0606464}
}
I wasn't aware that Jeremy Kepner started out as an astrophysics student
at Princeton before he got involved in high-performance Matlab at Lincoln
Labs. It sounds like his original motivations for the work were similar
to those of basin, though.
At any rate, the pMatlab system is interesting to me because it's another
example of trying to make HPC programmers more productive by giving them
an interpreter with access to parallel computation on the back end. Like
basin, parallel global arrays are a key way to handle the distributed
computation. However, they take a logical further step: the global arrays
can be set up through programming done in the interpreter, using "maps"
that describe the data distribution, and the various interpreter-level
mathematical functions can take a map as an additional parameter. Another
feature I liked is that they make it fairly easy to switch back and forth
between distributed/parallel and sequential/non-distributed execution,
which makes debugging easier (see the sketch below).
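To make the map idea concrete, here is a minimal sketch. It is in Python
rather than Matlab (since basin is Python-based), and the names Map,
dzeros, and PARALLEL are my own inventions for illustration, not pMatlab's
or basin's actual API:

    import numpy as np

    PARALLEL = True   # flip to False to fall back to plain sequential arrays

    class Map:
        """A data-distribution descriptor in the spirit of pMatlab's maps:
        a processor grid plus the ranks that own the blocks."""
        def __init__(self, grid, ranks):
            self.grid = grid          # e.g. (2, 2) -> 2x2 block decomposition
            self.ranks = list(ranks)

        def local_slice(self, shape, rank):
            """The block of the global array that `rank` owns."""
            pr, pc = self.grid
            r, c = divmod(self.ranks.index(rank), pc)
            rows, cols = shape[0] // pr, shape[1] // pc
            return (slice(r * rows, (r + 1) * rows),
                    slice(c * cols, (c + 1) * cols))

    def dzeros(shape, dist_map=None, rank=0):
        """Allocate zeros, honoring the map only when running in parallel."""
        if dist_map is None or not PARALLEL:
            return np.zeros(shape)               # ordinary local array
        rs, cs = dist_map.local_slice(shape, rank)
        return np.zeros((rs.stop - rs.start,     # just this rank's block
                         cs.stop - cs.start))

    m = Map(grid=(2, 2), ranks=range(4))
    A = dzeros((8, 8), dist_map=m, rank=0)       # rank 0 holds a 4x4 block
    print(A.shape)                               # -> (4, 4)

The point of the sketch is the shape of the API: the distribution
descriptor is an ordinary interpreter-level object passed as an extra
parameter, and a single switch recovers sequential execution for debugging.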
They tout the fact that their user base was able to set up parallel
versions of their codes very quickly once the sequential versions were
running in Matlab. Performance compared to C+MPI, though, was not
necessarily so wonderful.
Is this the kind of case that we want to make for basin-remote?
2. I also had time to look at the "high performance python" paper I
mentioned a few days ago:
@article{PLW2007,
  author  = {Luszczek, Piotr and Dongarra, Jack},
  year    = {2007},
  month   = {Summer},
  title   = {High Performance Development for High End Computing with Python Language Wrapper (PLW)},
  journal = {The International Journal of High Performance Computing Applications},
  volume  = {21},
  number  = {2},
  url     = {http://www.cs.utk.edu/~luszczek/pubs/plw-200605.pdf}
}
The paper, like basin, touts Python as a rapid way of developing HPC
code, but tries to push the high-performance aspect a bit further. It
says "if you are satisfied with performance in Python (with calls to
C/C++), you can leave things as you wish, but if not you should consider
compiling the Python program into C". The advantages are (a) better
performance and (b) portability to HPC platforms where Python does not run.
For example, they say that BlueGene/L does not support dynamic
libraries, while Cray XT3 has a lightweight OS kernel missing features
that Python assumes exist in the OS. So the compilation to C completely
removes Python from the picture.
To get compilation to happen, they annotate Python methods with type
declarations (e.g., real and float). The annotations are just Python
comments, used in a fashion similar to #pragma directives in the C/C++
world, as in the sketch below.
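Since the annotations are plain comments, the code stays runnable under
the ordinary interpreter. The "#plw:" form here is my own stand-in for
illustration; PLW's actual annotation syntax may differ:

    #plw: saxpy(a: float, x: float[], y: float[]) -> float[]
    def saxpy(a, x, y):
        # To the ordinary Python interpreter the line above is just a
        # comment, so this runs unmodified; a source-to-source compiler
        # could parse it and emit a fully typed C version instead.
        return [a * xi + yi for xi, yi in zip(x, y)]

    print(saxpy(2.0, [1.0, 2.0], [3.0, 4.0]))   # [5.0, 8.0]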
While compilation into C and user-level global arrays are two different
things, I think they come from the same underlying concern: if using the
interpreter is such a productivity aid, users want to deal with it more
and do separate C++ programming less. What is the basin philosophy about
the role of the interpreter in "doing science with basin"?