SourceForge



USERS GUIDE

Introduction

This package is a set of Fortran 90 routines designed to enable Adaptive Mesh Refinement(AMR) for a parallel distributed memory machine.

The basic AMR strategy used in this package was described in the introduction. To recap briefly, the computational domain is covered with a hierarchy of numerical sub-grids. These sub-grids which form the nodes of a tree data-structure (quad-tree in 2D) and (oct-tree in 3D) are distributed amongst the processors. When more spatial resolution is required at some location, the highest resolution sub-grid covering that point spawns child sub-grids (2 in 1D, 4 in 2D and 8 in 3D), which together cover the line, area or volume of their parent, but now with twice its spatial resolution. We refer to the block with the highest refinement level at a given physical location as a `leaf block'.

All sub-grids have identical logical structure (ie the same number of grid points in each dimension, the same aspect ratios, the same number of guard cells, etc ). They differ from each other in their sizes and locations, and in their list of neighbors, parents and children.

The package assumes that the application will use logically cartesian grids (ie grids that can be indexed i, j, k). It works for 1D, 2D and 3D.

The package provides routines which perform all the communication required between sub-grids and between processors. It manages the refinement and de-refinement processes, and can balance the workload by reordering the distribution of blocks amongst the processors. It also provides routines to enforce conservation laws at the interfaces between grid blocks at different refinement levels.

The objective is to provide the user with an environment which manages the parallelism and AMR burden. The user is required to provide code to advance the solution on a generic grid block.

In this guide we discuss in much more detail the structure of the package and the various steps involved in using it to construct an application.

The following topics are covered in the remainder of this users guide :

We will use capital letters when referring to Fortran or Pre-Processor variables, blue for fortran and green for pre-processor.

Basic Steps in Developing an Application.

The programming task facing the user can be broken down into a series of straightforward steps.

Step 1.
Edit the header or input files which define the models dimensionality, the properties of a typical grid block, the storage limits of the block-tree, and the number of data words required in each grid cell for each of the packages data-structures. The header files are extensively commented to make this step easy. Wherever the user is required to choose a value for a parameter or preprocessor variable the character string `---< USER EDIT >---' appears on the line above.

Step 2.
Construct a main program. Most time-dependent fluid models will be able to use the example ( 1D hydrodynamics code) as a template for their main program. This can be easily modified to suit the users requirements. The sequence of calls to the `upper level' routines in the amr package should not need to be altered. The user will need to customize the construction of an initial grid, establish a valid initial solution on this grid, set the number of timesteps and limits on the allowed range of refinement, and add any I/O required. Sample code for all these tasks is illustrated in this guide or in the accompanying example code.

Step 3.
Provide a routine which advances the model solution on all the required grid blocks through a timestep (or iteration). This step is much simpler than it appears. The routine can be constructed by taking the equivalent code from the user's existing application which advances the solution on a single grid, and inserting it inside a loop over the selected blocks on the local processor. Inside this loop, the solution data for the current grid block must be copied from the package's data-structures into the equivalent local variables in the user's code segment. Then the user's code segment executes to update the solution on that block. Finally the solution is copied back from the user variables to the PARAMESH data-structures, before the loop moves on to repeat the same sequence for the next block. If conservation constraints must be satisfied, a few extra lines must be added inside this routine to capture fluxes (or flux densities) and/or cell edge data at block boundaries.

Step 4.
Provide a routine to compute the model's timestep. Again this can be easily constructed from an existing template by inserting the appropriate code segment from the user's existing application, in the manner described in step 3. The existing timestep routine template has all the control and inter-processor communications required to properly compute the global minimum timestep or to enable longer timesteps on coarser blocks if the user chooses that option.

Step 5.
Provide a routine to establish the initial state on the initial grid. A template has been provided which can be easily tailored to suit the user's model.

Step 6.
Provide a routine to set data values in guard cells at physical boundaries in order to implement the user's choices of boundary conditions. Once again a template routine has been provided which can be very easily modified.

Step 7.
Provide a function to test a single block to determine if any refinement or de-refinement is appropriate. This function is called during the refinement testing operation. Again, a template exists which can be easily modified by the user.

More detailed `How To' instruction and illustration is provided below on all these steps. However, in preparation for that, it is necessary to describe the package's grid and data-structures, and the sequence of data movements that are required.

Grid Structure

All the grid blocks have an identical logical structure. They are assumed to be logically cartesian (or structured). By this we mean that the grid cells can be indexed as though they were cartesian. If a cell's first dimension index is i, then it lies between cells i-1 and i+1. The actual physical grid geometry can be cartesian, cylindrical, spherical, polar(in 2D), or any other metric which enables the physical grid to be mapped to a cartesian grid. The metric coefficients which define quantities such as cell volumes are assumed to be built into the user's algorithm. PARAMESH does not need to know these.

Each grid block has (NXB,NYB,NZB) interior cells, and there are NGUARD guard cells at each grid block boundary. As a result the cell centered data is indexed from 1 : NXB+2*N_GUARD in the x direction. The values of NXB,NYB,NZB and NGUARD are defined as parameters by the user, when they set values for the pre-processor variables NX_B, NY_B, NZ_B in the header file PARAMESH_PREPROCESSOR.FH. The values of NXB, NYB and NZB must be even. Note, NGUARD should be chosen without regard to the issue of whether grid blocks have memory permanently allocated for guard cell data storage (see section Guard cell Storage).  If the LIBRARY flag is defined in PARAMESH_PREPROCESSOR.FH the values for NXB, NYB, NZB and NGUARD are set in the input file called 'amr_runtime_parameters'.  For more information on running PARAMESH as a LIBRARY see How to Install and Compile PARAMESH as LIBRARY.

The physical dimensionality of the model is set by defining the value of the pre-processor variable N_DIM in PARAMESH_PREPROCESSOR.FH. If N_DIM is 1 then a 1D model is set up and NYB and NZB are automatically set to 1. If N_DIM = 2 then a 2D model is set up if L_2P5DIM = 0 , and a 2.5D model if L_2P5DIM = 1, and NZB is automatically set to 1. Finally if N_DIM = 3 a 3D model is set up.  Again, if the LIBRARY flag is set all the above parameters are set in the amr_runtime_parameters file.

The location and size of a grid block are specified in two different ways. The block location is stored in the array COORD. For grid block LB, say, on a particular processor, COORD(1,LB) is the location of the mid-point of the block along the first coordinate axis. Similarly COORD(2,LB) and COORD(3,LB) define the position along the other coordinate axes. A similar convention applies to the array BSIZE which stores the coordinate interval spanned by the grid block along each axis. Note that this is not always equivalent to a physical length. The second way we store this information is in the form of bounding box coordinates. The array BND_BOX stores these. The lower and upper coordinates of the block boundaries along the IC-th coordinate axis of block LB are stored as BND_BOX(1,IC,LB) and BND_BOX(2,IC,LB) respectively. The reason we choose to store this information in two different and seemingly redundant ways is that we would like child blocks to have exact copies of those bounding box coordinates which they may share with their parent and siblings. If these quantities were computed as needed they would suffer roundoff error and slight differences between the coordinates assigned to the same physical location on different blocks. Because the appropriate data in the arrays COORD, BSIZE and BND_BOX are inherited rather than recomputed when children are created, we do not suffer these inconsistencies.

All the grid blocks are related to one another as the nodes of a tree. When a leaf block is designated for refinement, it spawns 2 child blocks in 1D, 4 child blocks in 2D or 8 child blocks in 3D. These child blocks cover the same physical line, area or volume as their parent but with twice the spatial resolution. The J-th child of parent block LB is located at block CHILD(1,J,LB) on processor CHILD(2,J,LB). The children of a parent are numbered according to the Fortran array ordering convention, ie child 1 is at the lower x, y and z corner of the parent, child 2 at the higher x coordinate but lower y and z, child 3 at lower x, higher y and lower z, child 4 at higher x and y and lower z, and so on. The parent of block LB is located at PARENT(1:2,LB), where, as before the first data word is the local block number and the second word is the processor number.

The locations of a block's neighbors are stored in the array NEIGH. The neighbor on the lower x face of block LB is at NEIGH(1:2,1,LB), the neighbor on the upper x face at NEIGH(1:2,2,LB), the lower y face at NEIGH(1:2,3,LB), the upper y face at NEIGH(1:2,4,LB), the lower z face at NEIGH(1:2,5,LB) and the upper z face at NEIGH(1:2,6,LB). If any of these values are set to -1 or lower, there is no neighbor to this block at this refinement level. However there may be a neighbor to this block's parent. If the value is -20 or lower then this face represents an external boundary, and the user is required to apply some boundary condition on this face. The exact value below -20 can be used to distinguish between the different boundary conditions which the user may wish to implement.

The grid refinement process is constrained to guarantee that no leaf blocks ever vary by more than one refinement level from any of their neighboring leaf blocks. In this respect, a block's neighbors also includes leaf blocks which touch it only along a single edge or at a single corner.

Data Structures

There are five data structures which are set up and maintained by the package. These are described in detail in the reference guide. These data structures are declared in the modules PHYSICALDATA.F, TREE.F, and WORKSPACE.F in the headers subdirectory. The user sets the properties of these data structures by assigning appropriate values to preprocessor variables in PARAMESH_PREPROCESSOR.FH or in the file amr_runtime_parameters (if the LIBRARY flag is set in PARAMESH_PREPROCESSOR.FH see 'How to install and compile PARAMESH as a LIBRARY').

The Solution Data-Structure

The data which constitutes the solution can include data located at The array UNK stores the cell centered data. The face centered data is stored in FACEVARX for faces perpendicular to the x-axis, FACEVARY for faces perpendicular to the y-axis, and FACEVARZ for faces perpendicular to the z-axis. The edge centered data is stored in UNK_E_X for the edge aligned along the x-axis, UNK_E_Y for the edge aligned along the y-axis and UNK_E_Z for the edge aligned along the z-axis. Finally the corner data is stored in UNK_N. This is shown below for cell (I,J,K) in block LB, for the default case in which IFACE_OFF=0.

IFACE_OFF is a user-set parameter which defines how the indeces associated with cell centered and non-cell-centered data are related. For example, the default value (0) means that FACEVARX(:,I,:,:,:) and FACEVARX(:,I+1,:,:,:) bound UNK(:,I,:,:,:). If IFACE_OFF=-1 then FACEVARX(:,I-1,:,:,:) and FACEVARX(:,I,:,:,:) bound UNK(:,I,:,:,:).

The leading dimension of each of these arrays is used to store the different data words of that type within the grid cell. The number of words of each type in each grid cell is defined in PARAMESH_PREPROCESSOR.FH by the variables N_VAR for the cell centered data, N_FACEVAR for the face-centered data, N_VAR_EDGE for edge centered and N_VAR_CORN for corner data. Here are some examples of how various models might be set up.  These variables can also be set in the file amr_runtime_parameters file if the LIBRARY flag is set (see 'How to install and compile PARAMESH as a LIBRARY').

Example 1. A 3D hydro code with mass density, momentum and energy density all specified at grid cell center :

                           N_VAR      == 5
                           N_FACEVAR  == 0
                           N_VAR_EDGE == 0
                           N_VAR_CORN == 0 

                     unk(1,:,:,:,:) - mass density
                     unk(2,:,:,:,:) - x-component of momentum
                     unk(3,:,:,:,:) - y-component of momentum
                     unk(4,:,:,:,:) - z-component of momentum
                     unk(5,:,:,:,:) - energy density
Example 2. A 3D hydro code with mass density and energy density specified at grid cell center, and the x-component of momentum on the x-face, y-component of momentum on the y-face and z-component of momentum on the z-face :
                           N_VAR      == 2
                           N_FACEVAR  == 1
                           N_VAR_EDGE == 0
                           N_VAR_CORN == 0 

                     unk(1,:,:,:,:)      - mass density
                     unk(2,:,:,:,:)      - energy density
                     facevarx(1,:,:,:,:) - x-component of momentum
                     facevary(1,:,:,:,:) - y-component of momentum
                     facevarz(1,:,:,:,:) - z-component of momentum
Example 3. A 3D MHD code with mass density, momentum and energy density all specified at grid cell center, and with the x-component of magnetic field on the x-face, y-component of magnetic field on the y-face and z-component of magnetic field on the z-face : :
                           N_VAR      == 5
                           N_FACEVAR  == 1
                           N_VAR_EDGE == 0
                           N_VAR_CORN == 0 

                     unk(1,:,:,:,:) - mass density
                     unk(2,:,:,:,:) - x-component of momentum
                     unk(3,:,:,:,:) - y-component of momentum
                     unk(4,:,:,:,:) - z-component of momentum
                     unk(5,:,:,:,:) - energy density
                     facevarx(1,:,:,:,:) - x-component of magnetic field
                     facevary(1,:,:,:,:) - y-component of magnetic field
                     facevarz(1,:,:,:,:) - z-component of magnetic field
Example 4. A 2.5D MHD code with mass density, x and y momentum and energy density all specified at grid cell center, and with the x-component of magnetic field on the x-face, y-component of magnetic field on the y-face and z-component of magnetic field at cell center :
                           N_VAR      == 5
                           N_FACEVAR  == 1
                           N_VAR_EDGE == 0
                           N_VAR_CORN == 0 

                     unk(1,:,:,1,:) - mass density
                     unk(2,:,:,1,:) - x-component of momentum
                     unk(3,:,:,1,:) - y-component of momentum
                     unk(4,:,:,1,:) - energy density
                     unk(5,:,:,1,:) - z-component of magnetic field
                     facevarx(1,:,:,1,:) - x-component of magnetic field
                     facevary(1,:,:,1,:) - y-component of magnetic field
In this case the parameter L_2P5DIM in PARAMESH_PREPROCESSOR.FH (or in amr_runtime_parameters if you are using PARAMESH in LIBRARY mode) must be set to 1.

There are two ways that the user can establish the connection between his variables and the packages data-structures. The first is by explicitly copying them. The second is through the use of Fortran 90 pointers since all PARAMESH arrays which hold user data are declared with the Fortran 90 'target' attribute.

Multi-level Time Integration
Time integration schemes which simultaneously require knowledge of the solution at two or more levels are not uncommon. For example, many hydro algorithms achieve second order accurate time integration using a simple predictor-corrector strategy. They advance the solution from time t to t + dt/2,recompute the source terms at t + dt/2 and use these for a time-centered advance of the original solution from time t to t + dt. These schemes need storage space in which to save the solution at the beginning of the timestep, otherwise the predictor step overwrites this data which is needed at the start of the corrector step.

There are two ways to address this problem. The most general solution is to increase the number of variables used (ie N_VAR, N_FACEVAR, etc) to include the additional time levels. In our predictor-corrector problem above, if we had 5 cell-centered variables ( density, momentum 3-vector and energy density) in a 3D hydro code, we would set N_VAR=10. Then UNK(1:5,....) could store the working solution and UNK(6:10,....) could save the solution at the beginning of the timestep. Clearly this approach can be generalised to accomodate more than 2 time levels. It has the advantage that the guard cell filling operation can provide guard cell data for the solution at any of the time levels. It has the disadvantage that the guard cell filling operation will provide guard cell data for the solution at all time levels even if you only need it for one of them.

For the 2 level predictor-corrector problem, there is an alternative, but only if you are allocating permanent guard cell storage space. You can define the preprocessor variable PRED_CORR, which causes a second copy of the basic data-structure, T_UNK, T_FACEVARX, etc, to be set up. This can then be used in place of doubling the number of variables.

Finally we should mention that PARAMESH does permit different timesteps to be used on blocks with different spatial resolution, subject to certain constraints. Coarser blocks can use longer timesteps than finer blocks, provided that all blocks at a given resolution use the same timestep, the timestep is a monotonic increasing function of the grid spacing, and that all the timesteps are a power of 2 multiples of the timestep being used on the finest blocks. This feature makes use of the T_UNK, T_FACEVARX, etc arrays. To enable it you need to define the preprocessor variable VAR_DT in PARAMESH_PREPROCESSOR.FH (or in the file amr_runtime_parameters if you are using LIBRARY mode see 'How to install and compile PARAMESH as a LIBRARY').

Be warned! The flowchart for a variable timestep code is quite complicated. This is discussed in more detail with the example we have provided. Example.

Workspace Arrays

Some applications will need to define a large number of variables in each grid cell. However there may be times when a computation is needed on just one of those variables, but the computation needs guard cell data. To fill the guard cells for all the variables would result in a lot of unnecessary data communication. To avoid this problem we have included a workspace data-structure which enables the PARAMESH data movement operations (guard cell filling, prolongation, restriction) to be applied to a subset of the variables. The idea is that you would copy only the data variable(s) of interest into WORK, and then apply the data movement to WORK.

The array WORK is similar in many ways to UNK. It is a 5 dimensional array which is assumed to store only cell-centered quantities. However it differs from UNK in the following ways:

Typically, the PARAMESH routines which can operate on the array, WORK, have an argument, IOPT. If this is 1, then it is assumed that the routine will act on UNK, FACEVARX, etc. If IOPT is 2 or higher it will operate on WORK. The part of WORK which is modified is the layer with the fifth dimension index set to IOPT-1.

WORK is configured by assigning appropriate values for N_GUARD_CELLS_WORK and N_VAR_WORK in the header file PARAMESH_PREPROCESSOR.FH (or in the amr_runtime_parameters file if using the LIBRARY mode).

Arrays for Enforcement of Conservation

A number of arrays are setup which can store fluxes or quantities defined on cell edges, at the boundaries of grid blocks. These are made available for use in enforcing conservation laws, or solenoidal constraints such as occur in MHD models. Detailed discussion of this topic is postponed to the section on conservation.

The Grid Block Tree

This is defined in headers/tree.F90.

The Internal Working Block

A separate datastructure was been added as of version 2.0, which represents the solution for the `current working block'. The user can construct their integration algorithm to integrate this working block. Then, by cycling over the list of blocks in turn, copying the current solution for the current block into this working block, advancing the solution, and then copying the new solution back to permanent storage before proceeding to the next block, the whole solution domain can be integrated forward in time. The big advantage of this approach is that guardcell memory allocation is only required for this working block, not the complete solution datastructure. This working block is declared in the module headers/physicaldata.F90. The user is not required to make any modifications in this file.

Guard cell Storage

PARAMESH can operate in either of two modes.

In the first mode (Strategy I), memory is permanently allocated for storage of guard cell data for all grid blocks. In the second mode (Strategy II) memory is not allocated for guard cell storage.

The user is required to choose which of these strategies they will use. The choice has significant implications for the flowchart of their application. Strategy I is easier to understand and program, and so, is the approach we recommend to new PARAMESH users. Strategy II uses memory more efficiently.

Our recommendation is that you begin by developing your application using strategy I. Then when you are confident that you have a clear picture of how PARAMESH operates, you can `reshape' your code to use strategy II, if efficient memory use is a priority.

Strategy II acquires guard cell data only when it is needed. The idea is that a `working block' is established, and only this `working block' has memory allocated for guard cell data storage. To advance the solution, we loop over grid blocks, in turn copying each block into the `working block', filling the `working block' guard cells with the appropriate data, advancing the `working block' solution, and finally copying this solution without guard cells back to the block's original location.

The memory inefficiency in strategy I is easy to see. Consider an application in 3D which will use grid blocks with

nxb = nyb = nzb = 4
nguard = 2
The ratio of interior data words to total data words for this block is
nxb x nyb x nzb / ( (nxb + 2 nguard) x (nyb + 2 nguard) x (nzb + 2 nguard) ) 
= ( 4/8 )**3
= 1/8
If memory is allocated for guard cell storage at the boundary of every grid block, then 7/8 ths of the basic data-structure is reserved for guard cell storage. This is a severe penalty to pay.

If we choose not to reserve this guard cell memory we avoid this problem. However there is a trade-off!

If we reserve the guard cell storage space then we can make a single subroutine call

         call amr_guardcell(.....)

to perform a global guardcell filling operation. Then we can advance the solution by looping over grid blocks, with no further inter-block communication required inside this loop. In pseudo-code the algorithm looks like this,
      ! do a global guardcell filling operation
            call amr_guardcell(.....)

      ! advance the solution on all grid blocks
            call advance_global_solution(.....)
and inside advance_global_solution,
            do lb = 1, no_of_blocks

              advance_this_block

            end do
For strategy II however, the equivalent pseudo-code looks like this.
      ! make a backup copy of the complete solution
            call amr_1blk_copy_soln (.....)

      ! advance the solution on all grid blocks
            call advance_global_solution(.....)
but now inside advance_global_solution,

            ! call PARAMESH pre-communications routine
            call mpi_amr_comm_setup(....)

            do lb = 1, no_of_blocks

      ! copy block lb into the working block and fill it's guard cells
              call amr_1blk_guardcell(..,lb,..)

              advance_working_block_solution

      ! save the updated solution back to lb's permanent storage
              call amr_1blk_to_perm(..,lb,..)

            end do
Strategy II is not that much more complicated than strategy I. However, there are some subtle issues which are hidden beneath the pseudo-code above. Strategy II uses an additional data-structure, (UNK1, FACEVARX1, etc) which mimics the `master' data-structure (UNK, FACEVARX, etc), except that it is only defined for the working block, and has guardcell storage space. The routine `advance_working_block_solution' updates this data-structure, not the `master' data-structure.

Selective Guardcell Filling

You can be selective about the variables which are included in a guardcell filling operation. The masking arrays GCELL_ON_CC, GCELL_ON_FC, GCELL_ON_EC, GCELL_ON_NC are used to make your selections. For example, suppose you have 4 cell centered variables stored in UNK(1:4, :,:,:,:). Then, if you set GCELL_ON_CC as follows:


            gcell_on_cc(1) = .true.
            gcell_on_cc(2) = .false.
            gcell_on_cc(3) = .false.
            gcell_on_cc(4) = .true.

you will include the first and fourth variable but exclude the second and third. Default behavior is to include all variables, but if the user modifies this behavior by setting any of these masking arrays, those selections are applied to all subsequent guardcell operations, until the user next modifies these masks. (If NO_PERMANENT_GUARDCELLS is defined, then you must not modify these arrays between the call to MPI_AMR_COMM_SETUP which precedes guardcell filling and the subsequent calls to AMR_1BLK_GUARDCELL - see MPI Pre-fetching for more details.) Face centered, edge centered and corner data can be selectively included in similar fashion using the masks in GCELL_ON_FC, GCELL_ON_EC, GCELL_ON_NC .

Coordinate Systems

A number of different curvilinear coordinate systems can be used. These include cartesian, cylindrical, spherical, and 2D polar. To enable a non-cartesian coordinate system, define the preprocessor variable CURVILINEAR in PARAMESH_PREPROCESSOR.FH, and then define one of the variables, CARTESIAN, CYLINDRICAL, SPHERICAL, or POLAR., Note if CURVILINEAR is not defined, the default coordinate system is cartesian.  Alternatively, these variables can be set in the file amr_runtime_parameters and are defined as logical variables if you are using the LIBRARY mode.

NOTE: You get cartesian coordinates by default whether you set CURVILINEAR or not.  The extra CARTESIAN flag is included mainly for testing purposes to ensure that all the logic of setting curvilinear coordinates works properly.

The variables for each type of coordinate system are mapped as follows to the PARAMESH coordinate arrays:

CARTESIAN:
    variable 1: x
    variable 2: y
    variable 3: z

CYLINDRICAL (Note: this choice supports a 2-d, polar coordinate system which is symmetric about the z axis):
    variable 1: r
    variable 2: z
    variable 3: theta

POLAR (Note: choosing this coordinate system means implicitly that you are using 2-D.  If you wish to use 3D polar coordinates then choose CYLINDRICAL above.):
    variable 1: r
    variable 2: theta
    variable 3: z ( Each block has only 1 grid cell in this direction)

SPHERICAL:
    variable 1: r
    variable 2: theta   (altitudinal coordinate from 0 to pi)
    variable 3: phi     (azimuthal coordinate from 0 to 2 pi)

You can also select an additional varaible called CURVILINEAR_CONSERVE.  If this is selected then PARAMESH will guarantee that the prolongation and restricition operations are conservative.  Currently, this means that the prolongation operation will switch to a first order, direct injection scheme.  So, using this flag might decrease the order of accuracy of your algorithm.

By setting these variables you enable PARAMESH to correctly compute cell volumes, areas and lengths and apply these to the prolongation, restriction and flux conservation routines, as appropriate.

Data Movement

One of the most important functions that PARAMESH provides is in moving data between grid blocks and processors whenever necessary. These data movement patterns can be quite complex. The good news however is that the user needs to do very little to manage this process. Nevertheless we believe that it is important that the user have a clear overview of the data movement that is happening beneath the high level amr calls, so that they will have more confidence in building their application.

The significant data movement tasks which PARAMESH performs are

Guard Cell Filling

Guard cell filling at all internal (ie between grid blocks) and external (ie physical) boundaries is handled by the routine AMR_1BLK_GUARDCELL. This treats a single selected block. The selected grid block is identified by passing its address to this routine in the argument list when the routine is called.

How you use AMR_1BLK_GUARDCELL depends on whether or not you allocate permanent storage space for guardcell data.

If you do not, then you must call AMR_1BLK_GUARDCELL directly for any block, when it needs guardcell data.  In this case you must also directly call the routine MPI_AMR_COMM_SETUP which prefetches required off-processor data in a few large messages.  See MPI Pre-fetching for a more detailed disussion of this.

If you do allocate permanent storage space for guardcell data then you can perform a global guardcell filling operation by calling AMR_GUARDCELL. This fills guardcell data on all blocks. It actually works by cycling through the complete list of blocks, calling AMR_1BLK_GUARDCELL for each block in turn and storing the data in the appropriate place.

When a grid block requires guard cell data from a neighboring block which is less refined, some interpolation is required. The interpolation function used here is the same function used during the prolongation operation.

The guard cell filling routines support numerical stencils which contain diagonal elements. A diagonal element is a neighboring cell whose index address differs from that of the curent cell in more than one spatial index. Examples of diagonal stencils are shown below. Algorithms which use diagonal elements require additional data at block boundaries. If your algorithm requires diagonal elements then define the preprocessor variable DIAGONALS in PARAMESH_PREPROCESSOR.FH (or in amr_runtime_parameters if using LIBRARY mode), to ensure that this extra guard cell data is provided.

Prolongation and Restriction

The second of the fundamental data movement tasks is prolongation. When a block is refined it's children must be filled with valid data. This is done by interpolating from the parent's data. PARAMESH will use the default linear interpolation to do this unless the user chooses one of the alternate functions provided or replaces the interpolation function with their own function. AMR_PROLONG is the high level routine called to do prolongation. This is a global operation.

The inverse operation is restriction, when a child block passes data back to its parent. Restriction is performed with a call to AMR_RESTRICT.

Data Movement During Tree Building

Finally we should mention AMR_REDIST_BLK. It redistributes the solution data amongst the processors when morton ordering and the load balancing routines demand that the list of grid blocks be rearranged. Again this routine will probably never be called directly by the user. Nevertheless, there is a very good reason why the user should know about it and the way it operates. One of the more difficult aspects of constructing an amr application is controlling automatic refinement and de-refinement without exceeding the machines memory limit. If the user's amr application should ask for too many blocks, it is inside AMR_REDIST_BLK that the code will hit the wall.

The refinement process occurs in 3 steps. First the tree is rebuilt, then a new ordering of the grid blocks is computed and the tree information redistributed. Next, in AMR_REDIST_BLK, the solution data in blocks which existed before this refinement step began, is moved to its new location. Finally AMR_PROLONG is called to fill new children in their new locations.

Their are 2 different limits to the number of blocks that are allowed during this whole process. The ultimate limit is MAX_BLOCKS, which is the maximum number of grid blocks allowed on any one processor. The storage allocated for the solution arrays is declared using this limit. However the tree data-structure works with a limit MAX_BLOCKS_TR which must be greater than or equal to MAX_BLOCKS, and will normally be set to 2 or 4 times MAX_BLOCKS. When a parent is refined, is children are initially added to the list of blocks on the same processor as the parent. If enough blocks on a particular processor are refined then the list of blocks may exceed MAX_BLOCKS. This may often happen even when the average number of blocks per processor for the whole processor array would be less than MAX_BLOCKS, and we would have plenty of storage space once the load balancing routines kicked in. Therefore to avoid this restrictive limit we can allow the tree rebuild to proceed with a larger block limit, MAX_BLOCKS_TR, in the hope that after redistribution, no processors will still exceed the MAX_BLOCKS limit.

There is one more aspect of AMR_REDIST_BLK that we wish to highlight. When it moves the solution data on different blocks to their new locations, it needs some empty space at the end of the list of blocks which acts as `elbow room'. The basic problem is that when a grid block is to move to some new location, that location may have data for another block which in turn needs to be moved to a new location of its own. The first block cannot move until the second gets out of the way, and so on. If there are less than MAX_BLOCKS blocks on a processor, then the empty slots at the end of the list of blocks can be used as storage buffers to provide AMR_REDIST_BLK with the room that it needs to manoeuvre. The first thing AMR_REDIST_BLK does is to scan the list of blocks on the local processor, and tries to move any block data which is marked for relocation into one of the empty slots at the end of the list of blocks on that processor. When this is done the original location is marked dead, and is then free to accept any data which is to be relocated there. The next stage is to loop over all the preexisting blocks and move their data if their destination slot is now available. Some blocks may be able to move now, while some may not. Those that can, do. The slots they vacate are marked dead and provide additional `elbow room' if there are blocks that still need to move, in which case the movement cycle is repeated. The movement cycle will repeat up to 10 times if necessary. If there are still blocks to be moved at that point the cycle is halted and an error message is issued. In many cases this impasse will have occurred because the list of blocks is too full and there is simply not enough `elbow room'. This may be overcome by using a larger value of MAX_BLOCKS, or by increasing the number of processors, or by altering the refinement criteria to avoid asking for so many blocks.

Setting up the Header Files (or runtime parameter file)

There are 2 header files which may require user modification. These are called PARAMESH_PREPROCESSOR.FH and TREE.F.  Equivalently, much of what is described below applies to the case where PARAMESH is compiled as a LIBRARY, In this case the variables discussed below are set in the file called 'amr_runtime_parameters' and are read in at runtime, rather than at compile time.  For a detailed discussion on how to use PARAMESH in LIBRARY mode click see 'How to install and compile PARAMESH as a LIBRARY'.

PARAMESH_PREPROCESSOR.FH

This file defines the dimensionality of the model, the properties of a grid block, the number of solution data words of each type required in each grid cell and a number of preprocessor variables which describe properties of the algorithm of which the amr package must be aware.

Some of the preprocessor variables defined here have numeric values which are then used to set fortran parameters with similar names. The correspondence between preprocessor variable and fortran variable is as follows:
Preprocessor Variable Fortran Variable
N_DIM  ndim 
NFIELD_DIVF  nfield_divf 
L_2P5DIM  l2p5d 
NX_B  nxb 
NY_B  nyb 
NZ_B  nzb 
MAX_BLOCKS  maxblocks 
N_GUARD_CELLS  nguard 
N_GUARD_CELLS_WORK  nguard_work 
N_VAR  nvar 
N_FACEVAR  nfacevar 
N_VAR_EDGE  nvaredge 
N_VAR_CORN  nvarcorn 
N_VAR_WORK  nvar_work 
N_FLUX_VAR  nfluxvar 
N_EDGE_VAR  nedgevar 
FACE_INDEX_OFFSET  iface_off 
MFLAGS  mflags 
NBOUNDARIES  nboundaries 

If you wish to use double precision then you must define REAL8 so that paramesh knows which library routines to use when communicating real variables.

Some automated error checking is enabled if AMR_ERROR_CHECKING is defined.

To use different timesteps on different grid blocks define VAR_DT . More details of the constraints on timestep control are given in Timestep Control.

The model dimension is set by defining the value for N_DIM, ie for 1D define N_DIM == 1, for 2D define N_DIM == 2 and for 3D define N_DIM == 3. If you wish to set up a 2.5D model, define N_DIM == 2 and set L_2P5DIM to 1. Otherwise set L_2P5DIM to 0.

To use curvilinear coordinates define CURVILINEAR, and then choose the particular coordinate type you want for the list provided ( CARTESIAN,CYLINDRICAL,SPHERICAL,POLAR ).

If you choose not to allocate permanent storage space for the guardcells of all blocks then define NO_PERMANENT_GUARDCELLS.

If you wish to advance the solution at all refinement levels then define ADVANCE_ALL_LEVELS. Otherwise the solution will be advanced on leaf nodes only. (Note, if VAR_DT is defined above then you must define ADVANCE_ALL_LEVELS.)

When conservation is a required property of the solution, the routine AMR_FLUX_CONSERVE will probably need to be called. This is discussed in detail later. At the heart of this routine is a function which updates fluxes (or flux densities) where refinement jumps occur. In these cases the fluxes (or flux densities) at the block boundary facing a finer block are replaced by the sum of the fluxes (or average of the flux densities) in the corresponding cells on the finer block. They indicate this choice to PARAMESH by defining one, and only one, of the preprocessor variables, CONSV_FLUXES or CONSV_FLUX_DENSITIES.

You must choose one, and only one of these choices. If you are using curvilinear coords you must assume CONSV_FLUXES, and it will automatically be selected for you here. If you do not intend calling the routine AMR_FLUX_CONSERVE, it does not matter which choice you make.

The preprocessor variable CONSERVE is available to force the standard linear interpolation routine to conserve the interpolated variable. This is described in detail in Conserve Flag.

If you have a field stored in the FACEVAR datastructure which is subject to a divergence free constraint, then you may need to define some variables here to enable routines which support this.

If you will have divergence free fields in your problem, then define NFIELD_DIVF to be the number of such fields. There are three different algorithms available for enforcing divergence-free prolongation of these fields. One of the options requires that you define the variable DIVERGENCE_FREE, which causes the routine amr_prolong_fc_divbconsist to be called during prolongation of FACEVAR.

To ensure consistency of circulation integral it may be necessary for you to store information temporarily about edge variables at block boundaries. You can choose to store this information in the form of the edge value times the edge length or just as the edge values. If you choose to use

You must choose one, and only one of these choices. If you are using curvilinear coords you must assume EDGE_VALUE_INTEG, and it will automatically be selected for you here. If you do not intend calling the routine AMR_EDGE_AVERAGE, it does not matter which choice you make.

Over time, roundoff error can introduce inconsistencies in face-centered, edge-centered and node data across block interfaces between blocks of the same refinement level. If roundoff error can introduce inconsistencies in these data values, which, in principle should remain identical, and these inconsistencies can grow because of the nature of your algorithm, you may wish to force them to be made consistent. If so, then define the pre-processor variable FORCE_CONSISTENCY_AT_SRL_INTERFACES. Currently this only works for face-centered data.

The variables NX_B, NY_B and NZ_B define the number of interior grid points (ie excluding guard cells) on a grid block. If N_DIM is less than 3 the unused higher dimensions have there values automatically set to 1 (ie if N_DIM == 2, then NZ_B will automatically be set to 1). The parameters NX_B, NY_B and NZ_B must be even valued, or set to 1. Note NX_B must be an even number greater than 1.

The number of guard cells at each block boundary is set by N_GUARD_CELLS.

The WORK datastructure has N_GUARD_CELLS_WORK guardcells which does not have to be the same as N_GUARD_CELLS.

The variable MAX_BLOCKS is the maximum number of grid blocks which are to be put on an individual processor.

The parameter N_VAR sets the number of cell-centered data words in the solution for an individual grid cell. The parameter N_FACEVAR sets the equivalent number for the cell-face centered variables. Note that if there is to be 1 data word on each of the x, y and z faces of a grid cell then N_FACEVAR should be set to 1, not 3. The parameter N_VAR_EDGE sets the number of cell-edge-centered data words and N_VAR_CORN sets the number of cell-cornered data words in each grid cell.

The number of variables per grid cell which the workspace array called WORK can handle is defined by N_VAR_WORK.

Many applications will require support for the enforcement of conservation laws. PARAMESH supplies routines which ensure that consistent fluxes and/or circulation integrals are applied on each side of a jump in spatial resolution. It uses a number of arrays to store these fluxes at block boundaries. These arrays are allocated in this header file. The user must set two parameters which define the number of variables for which flux management is required. The variable N_FLUX_VAR sets the number of cell-centered variables which require flux management. The variable N_EDGE_VAR sets the number of cell-face-centered variables which need consistent circulation integrals. These issues are discussed in detail later when we discuss conservation.

The parameter IFACE_OFF defines the indexing convention used to relate the cell-centered and face-centered data associated with a given grid cell. The most common setup with IFACE_OFF=0 would imply that the cell-centered data UNK(:,I,J,K,:) in cell (I,J,K) is physically located between the face-centered components FACEVARX(:,I,J,K,:) and FACEVARX(:,I+1,J,K,:), between FACEVARY(:,I,J,K,:) and FACEVARY(:,I,J+1,K,:), and in a 3D model between FACEVARZ(:,I,J,K,:) and FACEVARZ(:,I,J,K+1,:). If instead IFACE_OFF=-1, then UNK(:,I,J,K,:) will be located between FACEVARX(:,I-1,J,K,:) and FACEVARX(:,I,J,K,:), between FACEVARY(:,I,J-1,K,:) and FACEVARY(:,I,J,K,:), and in the 3D case between FACEVARZ(:,I,J,K-1,:) and FACEVARZ(:,I,J,K,:).

A number of preprocessor variables can be set which define properties of the user's algorithm. If the finite-difference stencil to be used requires diagonal elements, for example an update at (i,j,k) depends on data at (i+1,j+1,k+1) or (i-1,j,k+1) and so on, then the preprocessor variable DIAGONALS must be defined. If the user's algorithm has a predictor-corrector character then the variable PRED_CORR must be defined. If different timesteps are to be permitted on blocks at different refinement levels then the variable VAR_DT must be defined. Finally if interior boundaries will be used the variable EMPTY_CELLS must be defined.

WORKSPACE.F90

The basic solution data-structure, ie UNK, FACEVARX, etc, can have many data words at each grid location. Sometimes however it may be necessary to compute with only one of these data words, but guard cell data may be required. It would be wasteful to have to fill guard cell data for all data words at a grid point, when only one is really needed. For this reason PARAMESH has an array called WORK declared in the module WORKSPACE.F90. WORK has five dimensions. The first three are the spatial dimensions. The fourth is the grid block number, and the fifth is a variable number which in effect allows us to have multiple data words at each grid location. WORK can be modified by the basic data movement operations, ie guard cell filling, prolongation and restriction. There are two parameters in this file which the user can modify. The variable N_GUARD_CELLS_WORK defines the number of guard cells which the WORK array will have. The variable N_VAR_WORK establishes the number of work data variables which can be used.

TREE.F90

This is the module file which allocates the variables used in the grid block tree. It has only one parameter which the user may need to modify. A variable called BFLAGS is available to flag data blocks. This enables the user to control execution of their application with a marker which can be inherited from parents to children. A parameter MFLAGS defines the number of flags which can be associated with each block.

How to Construct the Main Program

The simplest way to construct the main program is to modify the example ( 1D hydrodynamics code). The design of this program will vary slightly depending on our choice of strategy with regard to guardcell memory allocation. We describe the main program for Strategy I here, and then point out the minor modifications which make it suitable for strategy II.

Guardcell Strategy I (Permanent Guard cells)

This begins with a sequence of include statements, which make the tree and solution data-structures visible to the main program. The mpif.h file must be included if any MPI calls are made to be made from any routine which you write.
! include file to define physical qualities of the model and mesh
      use paramesh_dimensions
      use physicaldata

! include file defining the tree
      use tree

! IMPORTANT: Expose the PARAMESH interface blocks to your code
use paramesh_interfaces
use paramesh_mpi_interfaces

! include file required for mpi library.

        include 'mpif.h'
Any local variables needed in the main routine can be declared now. The next step is to initialize the amr package.
! initialize package 
! amr package initialization - initializes all model independent stuff

        call amr_initialize


! Set values of nprocs and mype with appropriate mpi calls
call MPI_COMM_RANK(MPI_COMM_WORLD, mype, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
 
We are now ready to generate the initial grid. The simplest way to do this is to place a single grid block with refinement level 1 covering the entire computational domain. The boundary condition flags are set on this block and it is marked for refinement. Then we refine this block using the machinery of the amr package until we have a mesh which has an acceptable refinement pattern for the initial solution. It is important that you set values for the variables LREFINE_MAX and LREFINE_MIN, which establish limits on the refinement levels which can be included in your grid during the course of your calculation.
! Setup up initial grid-block tree 


! set limits on the range of refinement levels to be allowed.
! level 1 is a single block covering the entire domain, level 2 is
! refined by a factor 2, level 3 by a factor 4, etc.
        lrefine_max = 10                ! finest refinement level allowed
        lrefine_min = 6                 ! coarsest refinement level allowed


! set the no of blocks required initially to cover the computational domain
        no_of_blocks = 2**(lrefine_min-1)

! initialize the counter for the number of blocks currently on this processor
        lnblocks = 0

! begin by setting up a single block on processor 0, covering the whole domain
        if(mype.eq.0.) then

                bnd_box(1,1,1) = 0.       ! lower x bound of block 1
                bnd_box(2,1,1) = 1.       ! upper x bound of block 1

                bsize(1,1) = bnd_box(2,1,1) - bnd_box(1,1,1)
                coord(1,1) = .5*(bnd_box(2,1,1) - bnd_box(1,1,1))

                nodetype(1) = 1           ! identify block 1 as a leaf block
                lrefine(1) = 1            ! set refinement level of block 1

                neigh(1:2,1,1) = -21      ! left boundary condition
                neigh(1:2,2,1) = -21      ! right boundary condition

                refine(1)=.true.          ! mark this block for refinement

                lnblocks = 1              ! no. of blocks on this processor
        endif

        call MPI_BARRIER(MPI_COMM_WORLD,ierr)          ! parallel synchronization


! Set boundary_box and boundary_index information
! (NOTE: if you are using periodic boundary conditions, then the arrays boundary_box
! and boundary_index can be set to zero).

call mpi_amr_global_domain_boundaries() ! This routine sets grid_xmax, grid_xmin, ...

! x boundaries
boundary_box(1,2:3,1:2) = -1.e30 ! effectively this is -infinity
boundary_box(2,2:3,1:2) = 1.e30
boundary_box(1,1,1) = -1.e30
boundary_box(2,1,2) = grid_xmin ! min dimension of the comp. domain
boundary_box(1,1,2) = grid_xmax ! max " " " " "
boundary_box(2,1,2) = 1.e30

boundary_index(1) = -21
boundary_index(2) = -21

! y boundaries
if (ndim > 2) then
boundary_box(1,1,3:4) = -1.e30
boundary_box(2,1,3:4) = 1.e30
boundary_box(1,3,3:4) = -1.e30
boundary_box(2,3,3:4) = 1.e30
boundary_box(1,2,3) = -1.e30
boundary_box(2,2,3) = grid_ymin
boundary_box(1,2,4) = grid_ymax
boundary_box(2,2,4) = 1.e30

boundary_index(3) = -21
boundary_index(4) = -21
end if

! z boundaries
.
.
insert similar code as above for x and y boundaries

At this point we have a single block covering the whole domain. In this 1D example the block size is set to 1.0 and the block center is located at 0.5. BND_BOX defines the bounding box for this block. The block size and the coordinates of the block center are computed from BND_BOX and stored in BSIZE and COORD respectively. In a multidimensional case, similar initializations would be required for the other components of BND_BOX, BSIZE and COORD. This first block is placed on processor 0, and LNBLOCKS which stores the number of blocks on the local processor is set to 1 on processor 0. This first block is assigned NODETYPE = 1 which indicates that at this point it is recognized as a leaf block. Also note that the addresses of this blocks neighbors are all set to -21. Setting elements of the NEIGH array to values of -20 or less is the way external boundaries are identified to the package. Also, the arrays BOUNDARY_BOX and BOUNDARY_INDEX are set.  Setting boundary conditions are discussed in more detail below.

Now we are ready to refine this first block.

! Now cycle over blocks until `no_of_blocks' leaf blocks have been created.
        loop_count=0
        loop_count_max = int(log(real(no_of_blocks))/log(2.)+.1)
        do while(loop_count.lt.loop_count_max)

! mark ALL the blocks for refinement
                do l=1,lnblocks
                        refine(l)=.true.
                enddo


! refine grid and apply morton reordering to grid blocks if necessary
                call amr_refine_derefine

                loop_count=loop_count+1

        enddo
        call MPI_BARRIER(MPI_COMM_WORLD, ierr)
In this example we loop over the list of blocks marking all the existing blocks for refinement and then implementing the refinement with a call to AMR_REFINE_DEREFINE. We continue looping until we have reached a pre-selected refinement level. The routine AMR_REFINE_DEREFINE creates the necessary child blocks, identifying them as leaf blocks and modifying the nodetype of their parents to indicate that they are no longer leaf blocks. It also manages the inheritance of the neighbor addresses, which in this case means that the correct boundary conditions will be applied to children which are next to the external boundaries of the computational domain. This simple example sets up an initial grid-block tree which covers the computational domain with uniform refinement. However it is easy to see how the process can be modified to create more complex initial grids. This topic is discussed in more detail below.

Now we need to set up the initial solution on these initial grid blocks.

! set up initial solution on the grid blocks 
        time = 0.               
        call initial_soln(mype)


! exchange guardcell information - the call to guardcell also causes the
! guard cells at external boundaries to be filled using the user defined
! boundary conditions which the user must code into the routine amr_1blk_bcset.
        iopt = 1
        nlayers = nguard
        call amr_guardcell(mype,iopt,nlayers)


This is done
here in a user supplied routine which we have called INITIAL_SOLN. This
routine sets the initial solution on the interior grid
cells on the leaf blocks. Then the call to AMR_GUARDCELL causes NLAYERS of guard cells at
each block boundary (except
external boundaries) to be filled with the correct data from their
neighboring blocks. The call to AMR_GUARDCELL also
causes the guard cells at external boundaries to be filled according to
the boundary conditions which the user has
coded into the routine AMR_1BLK_BCSET. A template for AMR_1BLK_BCSET is
provided with the package.

Finally we are ready to advance the solution in time. The loop over timesteps is very simple. First we call the user's routine which advances the solution. In this case a MUSCL algorithm is being used to solve the fluid equations in 1D. The routine ADVECT_MUSCL includes the computation of the timestep as well as the integration of the fluid equations. We will discuss the inner workings of the users routine for advancing the solution in more detail shortly. When the solution has been advanced it is tested to see if the current refinement pattern is still appropriate. The arguments LREFINE_MAX and LREFINE_MIN to AMR_TEST_REFINEMENT set bounds on the refinement levels which the code is permitted to use. Any refinements or de-refinements which have been requested are then executed by the call to AMR_REFINE_DEREFINE. The call to AMR_PROLONG fills any newly created child blocks with data by interpolating from their parents. Finally the guard cells are updated and boundary conditions reestablished.
! set the no of timesteps
        ntsteps = 1001


! Begin time integration 
        do loop=1,ntsteps


! perform a single timestep integration
        call advect_muscl(mype,dt,time,nprocs,loop)


! test to see if additional refinement or de-refinement is necessary
! note - a call to guardcell must come before this call to ensure that
! the refinement test can be done on parents of leafs also. This avoids
! a potential refinement/de-refinement flip-flop happening on successive
! timesteps.
        call amr_test_refinement(mype,lrefine_min,lrefine_max)


! refine grid and apply morton reordering to grid blocks if necessary
        call amr_refine_derefine



! prolong solution to any new leaf blocks if necessary
        call amr_prolong(mype,iopt,nlayers)


! exchange guardcell information and boundary conditions.
        call amr_guardcell(mype,iopt,nlayers)


        enddo
! end timestep integration loop
Before we exit, we must close the amr package. The principal task performed here is to properly close the mpi package, if the application has been built to use mpi.
! Now close the amr package
        call amr_close()
That is the basic structure of the main program for a typical fluid code. It should be obvious that this requires little modification to apply in 2 or 3D. It is also essentially independent of the algorithm which is being used, since these details are submerged in the routine which the user supplies to advance the solution through a single timestep. It should also be clear that this design for the main program could be used for any calculation on this type of grid which approaches a solution through an iterative loop - it need not necessarily be a time dependent fluid code. Finally we should note that a working code would have some I/O routines, which we have left out here in the interests of simplicity.

Guardcell Strategy II (No Permanent Guard cells)

The main program is the same as for strategy I, above, except that we delete both calls to AMR_GUARDCELL.

A global guardcell filling cannot be done in this case since there is no memory allocated to store this data on each grid block. Guardcell filling must now be done inside the routines ADVECT_MUSCL and AMR_TEST_REFINEMENT, for each block in turn, as their guard cells are needed.

How to Construct a Routine to Advance the Solution

This routine should consist of a loop, or a sequence of loops, over all blocks on the local processor ADVANCE_ALL_LEVELS is defined, and over leaf blocks only if it is not defined. locks on the local processor. Inside these loops the user inserts the serial code needed to advance the solution on a typical block. At the beginning of each loop iteration, the variables in the user's code segments must be filled with data for the current block, copied from the package's main data-structures. At the end of each loop iteration, these same variables must return their updated values to the package's main data-structures.

Depending on the complexity of the algorithm, and on the guardcell strategy being used,it may be necessary to call AMR_GUARDCELL or AMR_1BLK_GUARDCELL one or more times during this sequence to fill guard cells, and it may also be necessary to capture data values near block boundaries, in order to guarantee conservation properties.

Let us now work our way through a particular example. In this case we will construct a routine to advance the fluid equations of a 3D hydro code. We will call our routine HYDRO_3D. The solution data will consist of the mass, momentum and energy densities, and we assume that these are all cell-centered quantities.

We illustrate this for each guardcell strategy, in turn.

Guardcell Strategy I (Permanent Guard cells)

Step 1 is to incorporate the necessary amr header files, include any header files required by the user's code segments, and declare any local variables needed in this routine.
        subroutine hydro_3d (mype,dt)

! Include the amr package's header files.

      use paramesh_dimensions
      use physicaldata

! include file defining the tree
      use tree

! IMPORTANT: Expose the PARAMESH interface blocks to your code
use paramesh_interfaces
use paramesh_mpi_interfaces

! include file required for mpi library.

        include 'mpi.f'


! Include header files for user's code.
        include 'user_header_file.fh'


! Declare any local variables required.
        real :: dt,dthalf

        real :: rho (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vx  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vy  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vz  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: ener(1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)

        real :: xfluxes(5,1:nxb+2*nguard+1,1:nyb+2*nguard  ,1:nzb+2*nguard)
        real :: yfluxes(5,1:nxb+2*nguard  ,1:nyb+2*nguard+1,1:nzb+2*nguard)
        real :: zfluxes(5,1:nxb+2*nguard  ,1:nyb+2*nguard  ,1:nzb+2*nguard+1)

        real :: src(5,1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)


        integer :: mype,iopt,nlayers,jface,lb
Notice that the file mpif.h must be included here.

We are going to use a predictor-corrector type time integration. The solution will be advanced through a half timestep using source terms computed at the beginning of the timestep. Then this `prediction' of the solution at the half timestep is used to re-compute the source terms at the half-timestep, which can then be used to advance the solution through the full timestep, but now using time-centered source terms. This is a very common second order accurate time integration technique.

In this illustration, the source terms will be computed in a routine called SOURCE, which is fed the current solution data as input, and the integration of the equations is performed by a routine called ADVANCE which updates the solution in place. Notice that ADVANCE has arguments called XFLUXES, YFLUXES and ZFLUXES which contain the fluxes at grid cell boundaries that are used to compute the updated solution.

We are now ready for step 2 which is the solution advance through a half-timestep. We could use Fortran 90 pointers to establish the connection between the user's variable names and our own, but in this instance we have chosen to do this with explicit copying.

! Make a temporary copy of the solution at the beginning of the timestep.
        t_unk(:,:,:,:,:) = unk(:,:,:,:,:)


! Integrate forward through a half timestep

        dthalf = dt*.5


! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then

! copy solution data for this block from PARAMESH data-structure to user's 
! local variables
                rho(:,:,:)  = unk(1,:,:,:,lb)
                vx(:,:,:)   = unk(2,:,:,:,lb)
                vy(:,:,:)   = unk(3,:,:,:,lb)
                vz(:,:,:)   = unk(4,:,:,:,lb)
                ener(:,:,:) = unk(5,:,:,:,lb)

! Compute required source terms on this block
                call source(src,rho,vx,vy,vz,ener)


! Advance the solution on this block through dt/2
                call advance(dthalf,rho,vx,vy,vz,ener, &
                            xfluxes,yfluxes,zfluxes,src)

        endif
        enddo
        endif
We can make this example a little more challenging by assuming that the calculation of source terms requires up to date guard cell data. The next segment does this, both at internal block boundaries, and at external boundaries. AMR_GUARDCELL handles external boundaries by applying the user supplied boundary condition routine, AMR_1BLK_BCSET. We discuss how to construct AMR_1BLK_BCSET and setting up boundary conditions in much more detail later in this guide.
! Fill the guard cells.
        iopt = 1
        nlayers = nguard
        call amr_guardcell(mype,iopt,nlayers)     
We now repeat step 2, but with a few modifications. This time we integrate the equations through a full timestep, but we reset the solution arrays to their original values between calling SOURCE and ADVANCE.
! Integrate forward through a full timestep


! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then

! copy solution data for this block from amr data-structure to user's local
! variables
                rho(:,:,:)  = unk(1,:,:,:,lb)
                vx(:,:,:)   = unk(2,:,:,:,lb)
                vy(:,:,:)   = unk(3,:,:,:,lb)
                vz(:,:,:)   = unk(4,:,:,:,lb)
                ener(:,:,:) = unk(5,:,:,:,lb)

! Compute required source terms on this block
                call source(src,rho,vx,vy,vz,ener)

! Restore original solution 
                unk(1,:,:,:,lb) = t_unk(1,:,:,:,lb)
                unk(2,:,:,:,lb) = t_unk(2,:,:,:,lb)
                unk(3,:,:,:,lb) = t_unk(3,:,:,:,lb)
                unk(4,:,:,:,lb) = t_unk(4,:,:,:,lb)
                unk(5,:,:,:,lb) = t_unk(5,:,:,:,lb)

! Advance the solution on this block through dt/2
                call advance(dt,rho,vx,vy,vz,ener, &
                            xfluxes,yfluxes,zfluxes,src)

! Capture fluxes at block boundaries
                flux_x(:,1,:,:,lb) = xfluxes(:,nguard,:,:)
                flux_x(:,2,:,:,lb) = xfluxes(:,nxb+nguard+1,:,:)
                flux_y(:,:,1,:,lb) = yfluxes(:,:,nguard,:)
                flux_y(:,:,2,:,lb) = yfluxes(:,:,nyb+nguard+1,:)
                flux_z(:,:,:,1,lb) = zfluxes(:,:,:,nguard)
                flux_z(:,:,:,2,lb) = zfluxes(:,:,:,nzb+nguard+1)

                
        endif
        enddo
        endif
       
The final step is to ensure conservation. The fluxes on a common block boundary between two blocks at different refinement level are not likely to be consistent. We assume that the fluxes on the more refined block are more accurate and so we must modify the appropriate fluxes on the coarse neighbor to make them consistent. This is done by calling AMR_FLUX_CONSERVE, which updates FLUX_X, FLUX_Y and FLUX_Z. However since the original fluxes were already used in ADVANCE to update the solution on the entire block, once we modify the fluxes at the boundary on the coarse block, we must also modify the solution for the layers of cells immediately inside that boundary to reflect this modification to the boundary fluxes.
! Modify the boundary fluxes
!
! note: the first thing that happens inside amr_flux_conserve is that
! the arrays flux_x, flux_y and flux_z are copied into the temporary
! storage arrays tflux_x, tflux_y and tflux_z respectively.
        nsub = 1
        call amr_flux_conserve(mype,nsub)


! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then

            dx = bsize(1,lb)/real(nxb)
            dy = bsize(2,lb)/real(nyb)
            dz = bsize(3,lb)/real(nzb)
            area_yz = dy*dz
            area_xz = dx*dz
            area_xy = dx*dy
            vol = dx*dy*dz
            rvol = 1./vol


! Update the solution immediately inside the block boundaries to reflect this
! modification of the fluxes. Note it is assumed here that the arrays flux_x,
! tflux_x, etc are fluxes per unit area.
            unk(:,1+nguard,:,:,lb) = unk(:,1+nguard,:,:,lb) + &
               (flux_x(:,1,:,:,lb) - tflux_x(:,1,:,:,lb))*area_yz*rvol
            unk(:,nxb+nguard,:,:,lb) = unk(:,nxb+nguard,:,:,lb) + &
               (tflux_x(:,2,:,:,lb) - flux_x(:,2,:,:,lb))*area_yz*rvol
            unk(:,:,1+nguard,:,lb) = unk(:,:,1+nguard,:,lb) + &
               (flux_y(:,:,1,:,lb) - tflux_y(:,:,1,:,lb))*area_xz*rvol
            unk(:,nyb+nguard,:,lb) = unk(:,:,nyb+nguard,:,lb) + &
               (tflux_y(:,:,2,:,lb) - flux_y(:,:,2,:,lb))*area_xz*rvol
            unk(:,:,:,1+nguard,lb) = unk(:,:,:,1+nguard,lb) + &
               (flux_z(:,:,:,1,lb) - tflux_z(:,:,:,1,lb))*area_xy*rvol
            unk(:,:,:,nzb+nguard,lb) = unk(:,:,:,nzb+nguard,lb) + &
               (tflux_z(:,:,:,2,lb) - flux_z(:,:,:,2,lb))*area_xy*rvol


        endif
        enddo
        endif


        return
        end
This example illustrates most of the basic features that a user might require in a routine to advance the solution. There are of course many other bells and whistles which a user may need to support inside their algorithm, but it should be clear from this example how these might be handled. The coding listed above was designed for clarity of expression rather than efficiency or elegance. Obviously some of its details are dictated by the input and output requirements of the user routines (SOURCE and ADVANCE) which are called, and by details of the user's data-structure (USERS_HEADER_FILE.FH) if needed. For completeness sake we have fleshed out the internal structure of both SOURCE and ADVANCE.

Guardcell Strategy II (No Permanent Guard cells)

The first step is unchanged.
        subroutine hydro_3d (mype,dt)

! Include the amr package's header files.

      use paramesh_dimensions
      use physicaldata

! include file defining the tree
      use tree

! IMPORTANT: Expose the PARAMESH interface blocks to your code
use paramesh_interfaces
use paramesh_mpi_interfaces

! include file required for mpi library.

        include 'mpif.h'


! Include header files for user's code.
        include 'user_header_file.fh'


! Declare any local variables required.
        real :: dt,dthalf

        real :: rho (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vx  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vy  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: vz  (1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)
        real :: ener(1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)

        real :: xfluxes(5,1:nxb+2*nguard+1,1:nyb+2*nguard  ,1:nzb+2*nguard)
        real :: yfluxes(5,1:nxb+2*nguard  ,1:nyb+2*nguard+1,1:nzb+2*nguard)
        real :: zfluxes(5,1:nxb+2*nguard  ,1:nyb+2*nguard  ,1:nzb+2*nguard+1)

        real :: src(5,1:nxb+2*nguard,1:nyb+2*nguard,1:nzb+2*nguard)


        integer :: mype,iopt,nlayers,jface,lb
As before, step 2 is the solution advance through a half-timestep. This has a few significant modifications. The first is that we change N_VAR from 5 to 10, so that we can use UNK(6:10,:,:,:,:) to store a second copy of the solution. N_VAR is set inside PARAMESH_PREPROCESSOR.FH. The reason for this extra copy will be apparent shortly. When we loop over blocks we must import the current block's data into the interior of the `working block' data-structure, UNK1, and then fill UNK1's guard cells. Both these tasks are accomplished with a call to AMR_1BLK_GUARDCELL. Note, for internal PARAMESH reasons, UNK1 has 2 data layers. The user can assume, unless explicitly told otherwise, that they will use layer 1, which is selected by setting IDEST=1. We then copy the solution from layer 1 of the working block data UNK1 into the user's arrays (RHO, VX, etc). We call SOURCE and ADVANCE as before, and then copy the solution back from the user's variables to the working block data. Finally the call to AMR_1BLK_TO_PERM stores the interior of the working block in the permanent data-structure, UNK.
! Make a temporary copy of the solution at the beginning of the timestep.
        unk(6:10,:,:,:,:) = unk(1:5,:,:,:,:)


! Integrate forward through a half timestep

        dthalf = dt*.5

! Restrict the solution to parent blocks. This is necessary to ensure that
! parent blocks have valid data which can be used during guardcell filling.
! This step is not necessary if advance_all_levels is set to .true. or if
! you are using permanent guardcell storage.
call amr_restrict(mype, 1, 0, filling_guardcells=.true.)

! Store a copy of the current solution in gt_unk. This is the copy
! which amr_1blk_guardcell uses when satisfying requests for guardcell data.
! (ie copy from UNK to GT_UNK)
        call amr_1blk_copy_soln(-1)

! Perform the data communication step BEFORE entering the loop over blocks
! The logical flags are set such that the data necessary for guardcell filling
! is communicated

lcc = .true.
lfc = .false.
lec = .false.
lnc = .false.

lguard = .true.
lprolong = .false.
lflux = .false.
ledge = .false.
lrestrict = .false.

tag_offset = 100

call mpi_amr_comm_setup(mype,nprocs, &
lguard,lprolong, &
lflux,ledge,lrestrict, &
iopt,lcc,lfc,lec,lnc,tag_offset)


  
! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then

! Copy data from current block into working block and fill its guard cells
! (ie GT_UNK to UNK1)
                idest = 1
                iopt = 1
                nlayers = nguard
                lcc = .true.
                lfc = .false.
                lec = .false.
                lnc = .false.
                l_srl_only = .false.
                icoord = 0
                ldiag = .true.
                call amr_1blk_guardcell(mype,iopt,nlayers,lb,mype, &
                                       lcc,lfc,lec,lnc, &
                                       l_srl_only,icoord,ldiag)


! copy solution data for this block from PARAMESH data-structure to user's 
! local variables

                rho(:,:,:)  = unk1(1,:,:,:,idest)
                vx(:,:,:)   = unk1(2,:,:,:,idest)
                vy(:,:,:)   = unk1(3,:,:,:,idest)
                vz(:,:,:)   = unk1(4,:,:,:,idest)
                ener(:,:,:) = unk1(5,:,:,:,idest)

! Compute required source terms on this block
                call source(src,rho,vx,vy,vz,ener)


! Advance the solution on this block through dt/2
                call advance(dthalf,rho,vx,vy,vz,ener, &
                               xfluxes,yfluxes,zfluxes,src)

! Copy updated solution back from user variables to working block variables
                unk1(1,:,:,:,idest) = rho(:,:,:)
                unk1(2,:,:,:,idest) = vx(:,:,:)
                unk1(3,:,:,:,idest) = vy(:,:,:)
                unk1(4,:,:,:,idest) = vz(:,:,:)
                unk1(5,:,:,:,idest) = ener(:,:,:)


! Store new solution for this block in permanent data-structure UNK
! (ie copy from UNK1 to UNK)
                call amr_1blk_to_perm( lcc,lfc,lec,lnc,lb,iopt,idest)

        endif
        enddo
        endif
The full timestep update is very similar to the half timestep stage. There are two significant differences. The first is that the updated solution is copied into the user arrays (RHO, VX, etc), for use in SOURCE, but then the user arrays are overwritten with the original solution from the beginning of the timestep before calling ADVANCE. The second is that we capture fluxes at the block boundaries, exactly as we did in the Strategy I case.
! Integrate forward through a full timestep

! Restrict the solution to parent blocks. This is necessary to ensure that
! parent blocks have valid data which can be used during guardcell filling.
! This step is not necessary if advance_all_levels is set to .true. or if
! you are using permanent guardcell storage.
call amr_restrict(mype, 1, 0, filling_guardcells=.true.)

! Store a copy of the current solution in gt_unk. This is the copy
! which amr_1blk_guardcell uses when satisfying requests for guardcell data.
! (ie copy from UNK to GT_UNK)
        call amr_1blk_copy_soln(-1)

! Perform the data communication step BEFORE entering the loop over blocks
! The logical flags are set such that the data necessary for guardcell filling
! is communicated

lcc = .true.
lfc = .false.
lec = .false.
lnc = .false.

lguard = .true.
lprolong = .false.
lflux = .false.
ledge = .false.
lrestrict = .false.

tag_offset = 100

call mpi_amr_comm_setup(mype,nprocs, &
lguard,lprolong, &
lflux,ledge,lrestrict, &
iopt,lcc,lfc,lec,lnc,tag_offset)


! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then


! Copy data from current block into working block and fill its guard cells
                idest = 1
                iopt = 1
                nlayers = nguard
                lcc = .true.
                lfc = .false.
                lec = .false.
                lnc = .false.
                l_srl_only = .false.
                icoord = 0
                ldiag = .true.
                call amr_1blk_guardcell(mype,iopt,nlayers,lb,mype, &
                                       lcc,lfc,lec,lnc, &
                                       l_srl_only,icoord,ldiag)


! copy solution data for this block from PARAMESH data-structure to user's 
! local variables

                rho(:,:,:)  = unk1(1,:,:,:,idest)
                vx(:,:,:)   = unk1(2,:,:,:,idest)
                vy(:,:,:)   = unk1(3,:,:,:,idest)
                vz(:,:,:)   = unk1(4,:,:,:,idest)
                ener(:,:,:) = unk1(5,:,:,:,idest)


! Compute required source terms on this block
                call source(src,rho,vx,vy,vz,ener)

! Restore original solution 
                rho(:,:,:)  = unk1(6,:,:,:,idest)
                vx(:,:,:)   = unk1(7,:,:,:,idest)
                vy(:,:,:)   = unk1(8,:,:,:,idest)
                vz(:,:,:)   = unk1(9,:,:,:,idest)
                ener(:,:,:) = unk1(10,:,:,:,idest)

! Advance the solution on this block through dt/2
                call advance(dt,rho,vx,vy,vz,ener, &
                            xfluxes,yfluxes,zfluxes,src)

! Capture fluxes at block boundaries
                flux_x(:,1,:,:,lb) = xfluxes(:,nguard,:,:)
                flux_x(:,2,:,:,lb) = xfluxes(:,nxb+nguard+1,:,:)
                flux_y(:,:,1,:,lb) = yfluxes(:,:,nguard,:)
                flux_y(:,:,2,:,lb) = yfluxes(:,:,nyb+nguard+1,:)
                flux_z(:,:,:,1,lb) = zfluxes(:,:,:,nguard)
                flux_z(:,:,:,2,lb) = zfluxes(:,:,:,nzb+nguard+1)

! Store new solution for this block in permanent data-structure UNK
                call amr_1blk_to_perm( lcc,lfc,lec,lnc,lb,iopt,idest)

                
        endif
        enddo
        endif
As before the final step is to ensure conservation. This step is identical to that of Strategy I, except that we drop NGUARD when setting the indeces for UNK.
! Modify the boundary fluxes
!
! note: the first thing that happens inside amr_flux_conserve is that
! the arrays flux_x, flux_y and flux_z are copied into the temporary
! storage arrays tflux_x, tflux_y and tflux_z respectively.
! Store the boundary fluxes
        nsub = 1
        call amr_flux_conserve(mype,nsub)


! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do lb=1,lnblocks
        if(nodetype(lb).eq.1) then

            dx = bsize(1,lb)/real(nxb)
            dy = bsize(2,lb)/real(nyb)
            dz = bsize(3,lb)/real(nzb)
            area_yz = dy*dz
            area_xz = dx*dz
            area_xy = dx*dy
            vol = dx*dy*dz
            rvol = 1./vol

! Update the solution immediately inside the block boundaries to reflect this
! modification of the fluxes. Note it is assumed here that the arrays flux_x,
! tflux_x, etc are fluxes per unit area.
            unk(:,1,:,:,lb) = unk(:,1,:,:,lb) + &
               (flux_x(:,1,:,:,lb) - tflux_x(:,1,:,:,lb))*area_yz*rvol
            unk(:,nxb,:,:,lb) = unk(:,nxb,:,:,lb) + &
               (tflux_x(:,2,:,:,lb) - flux_x(:,2,:,:,lb))*area_yz*rvol
            unk(:,:,1,:,lb) = unk(:,:,1,:,lb) + &
               (flux_y(:,:,1,:,lb) - tflux_y(:,:,1,:,lb))*area_xz*rvol
            unk(:,nyb,:,lb) = unk(:,:,nyb,:,lb) + &
               (tflux_y(:,:,2,:,lb) - flux_y(:,:,2,:,lb))*area_xz*rvol
            unk(:,:,:,1,lb) = unk(:,:,:,1,lb) + &
               (flux_z(:,:,:,1,lb) - tflux_z(:,:,:,1,lb))*area_xy*rvol
            unk(:,:,:,nzb,lb) = unk(:,:,:,nzb,lb) + &
               (tflux_z(:,:,:,2,lb) - flux_z(:,:,:,2,lb))*area_xy*rvol


        endif
        enddo
        endif


        return
        end
It is also possible to use local timesteps which vary with the refinement level. A variable timestep example is provided.

How to Construct a Timestep Routine

In a parallel amr code, there are a number of extra complications in computing an integration timestep which do not exist in a serial non-amr code. The processors must compute local timesteps for each of the leaf blocks which they store, and then share information about the shortest timestep required at any point in the mesh.

If the same timestep is to be used everywhere, regardless of the local refinement level, then it must be this global minimum timestep. If the timestep is permitted to vary depending on the local refinement level, then this package imposes a number of constraints on how the timestep may vary. It must be a monotonic function of refinement level. The shortest timestep must be used on the finest leaf blocks, and the longest on the coarsest leaf blocks, and all timesteps must be integer multiples of the finest timestep.

The timestep template routine (AMR_TIMESTEP_TEMPLATE.F90) is already set up to takes care of all these issues. All the user need add is a code segment which computes a local timestep on a single generic block. Comment lines in the template routine indicate where this code segment should be inserted. It should return the timestep it computes in a local variable called DTL, as shown in the example here, taken from a 3D fluid code.

! loop over leaf blocks on this processor
        if(lnblocks.gt.0) then
        do l=1,lnblocks


      if(nodetype(l).eq.1 .or. advance_all_levels) then


                dx = bsize(1,l)/(real(nxb/2)*2)   ! set typical grid cell size

                rho => unk(1,:,:,:,l)
                vx => unk(2,:,:,:,l)
                vy => unk(3,:,:,:,l)
                vz => unk(4,:,:,:,l)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! users local timestep calculation

                speed2(:,:,:) = (vx(:,:,:)*vx(:,:,:)+
     .                  vy(:,:,:)*vy(:,:,:)+
     .                  vz(:,:,:)*vz(:,:,:))/
     .                  (rho(:,:,:)*rho(:,:,:)+1.e-10)


                maxspeed=0.
                do k=1+nguard,nzb+nguard
                do j=1+nguard,nyb+nguard
                do i=1+nguard,nxb+nguard
                        maxspeed=max(maxspeed,speed2(i,j,k))
                enddo
                enddo
                enddo

! compute timestep using Courant condition
                dtl = courant*dx/sqrt(maxspeed+1.e-10)

! end of users local timestep calculation
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                dtlevel(lrefine(l)) = min(dtlevel(lrefine(l)),dtl)


      endif


        enddo
        endif
       
Notice, that in this example Fortran 90 pointers were used to establish the connection between the users variable names and the amr package's data-stru