Domain Decomposition — 2D Poisson Equation

Elliptic Equation — 2D Poisson Equation

The Model

We will solve the equation

\[ \nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial x^2} = S(x,y) \]

We solve for the function $u(x,y)$ in a 2D domain, $x \in [0,1]$ and $y \in [0,1]$. The boundary conditions are assumed of Dirichlet type, namely$ u(0,y)$, $u(1,y)$, $u(x,0)$, and $u(x,1)$ are given. In addition, the function $u(x,y)$ can also be specified on specific points in the interior domain.

In the following discussion, the source $S(x,y)$ is taken to be a simple displaced Gaussian for simplicity.

The domain is discretized on a 2D numerical lattice, with $x_i = (i-1) h$, $y_j = (j-1)h$, and $h = 1.0/(N-1)$. $N$ is the number of grid points in both directions. The Poisson equation is re-written in finite difference form:

\[ \frac{1}{h^2} \left( u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4 u_{i,j} \right) = S_{i,j} \]

This leads to a large set of linear equations to solve for the field values $u_{i,j}$ on the grid within the domain, keeping $u_{i,j}$ fixed at the boundary. This set of equations can be solved by an iterative method, whereby

\[ u_{i,j}^{k+1} = \frac{1}{4} \left( u_{i+1,j}^k + u_{i-1,j}^k + u_{i,j+1}^k + u_{i,j-1}^k - h^2 S_{i,j} \right) \]

is iterated in place until the changes in $u_{i,j}$

\[ diff = Max \| u_{i,j}^{k+1} - u_{i,j}^{k} \| \]

become less than some predefined tolerance criteria. An arbitrary guess for $u_{i,j}$ is assumed to start with. The method is stable, i.e., the solution will converge.

The old and new values of $u_{i,j}$ are mixed to accelerate the convergence, leading to the Successive Over Relaxation scheme.

\[ \begin{aligned} u_{i,j}^{k+\frac{1}{2}} = \frac{1}{4} \left( u_{i+1,j}^k + u_{i-1,j}^{k+\frac{1}{2}} + u_{i,j+1}^k + u_{i,j-1}^{k+\frac{1}{2}} - h^2 S_{i,j} \right) \\ u_{i,j}^{k+1} = \omega u_{i,j}^{k+\frac{1}{2}} - ( 1-\omega) u_{i,j}^k \end{aligned} \]

The accuracy of the solution remains second order in $h$.

Serial Implementation

The code poisson_2d.c implements the above scheme. The steps in the code are:

Initialize the numerical grid.
Provide an initial guess for the solution.
Set the boundary values & source term.
Iterate the solution until convergence.
Output the solution for plotting.

Compile and execute with

gcc poisson_2d.c -lm -o poisson_2d
./poisson_2d N Nx Ny | ./plot_image.py -s Nx,Ny -t "2d Poisson"

where N, Nx and Ny are the (arbitrary) maximum number of iterations and number of grid points (image size) respectively. Good numbers to use could be 10000, 300, 300. We introduced plot_image.py in Programming Strategies: Color Images.

The code stops once good convergence (parameter EPS) is obtained.

Note:

The chi parameter which implements the over-relaxation approach is chosen arbitrarily for simplicity (see Numerical Receipes for an optimum choice).
The code is written in the simplest possible way.
The code outputs the field values to pipe into plot_image.py for field display.

Parallel Implementation

Domain decomposition consists in dividing the original grid among the processes with an equal number of grid points in each process for good load balance. We use here the simplest possible division, i.e., in simple stripes. This results in the following arrangement on each processor.

If the number of grid points does not divide equally among the processes, the extra grid points are distributed one each to a subset of processors as indicated schematically in the following figure.

The domain is divided in stripes which correspond to a split of the first index in the two dimensional arrays containing the fields. This scheme is adopted to facilitate the transmission of the GHOST lines of points. Recall that C stores two dimensional arrays in row fashion; therefore transmitting row ip of the array phi containing jj numbers simply requires

MPI_Ssend( &phi[ip][0] , jj, ... )

with no further packing. This is efficient and easy to implement. Note that a FORTRAN programmer would divide the domain in stripes along the second index of the array since FORTRAN stores two dimensional arrays in columns.

The code should follow very much the approach of the 1D code. Some functions, like decompose_domain(), could even be borrowed as is since the stripe decomposition is essentially a 1D domain decomposition. The flowchart below shows the structure that the code should have:

Red and Black algorithm

The parallel and serial implementations of the Poisson equation solver in 2D as implemented above yield the same results to within the tolerance only. Yet these two implementations are not strictly identical. This stems from the handling of the GHOST lines in the parallel implementation.

Consider the update of the field at the location marked 0 below. The Gauss-Siedel solution calls for an in-place update of the field via the finite difference formula in a systematic sweep of the numerical lattice. This is done via the 2D finite difference formula. The values of the field at the points 2 and 3 have already been updated if the sweep originates from the upper-left corner of the lattice, while the field at the point 1 and 4 are yet to be updated.

The sub-domain decomposition requires the transmission of the GHOST line of values at each iteration. This is normally done before each Gauss-Siedel step. The values of the field in these GHOST lines are those of the previous Gauss-Siedel iteration. The value of the field at point "2" below is no longer the updated one. Therefore the parallel implementation is strictly not Gauss-Siedel.

The solution to this problem is to modify the Gauss-Siedel algorithm into the the Red and Black algorithm. In this approach the grid is scanned alternatively on the "red" then on the "black" dots. The name is reminiscent of a checker board.

It is clear from the picture that the needed values of the field at points 1, 2, 3, and 4 are the old field values (black) when updating the field at point 0, and vice-versa for any red point. The algorithm then calls fro two subsequent sweep of the lattice alternately on the red and black points. In a parallel implementation, the GHOST lines are exchanged twice at each iteration:

Exchange the GHOST lines.
Update the red points on the lattice.
Exchange the GHOST lines.
Update the black points on the lattice.

This algorithm yields numerically equivalent solutions between the serial and parallel implementations, but might be somewhat slower than the strict Gauss-Siedel.