Gerris in parallel

The principle is relatively simple: each GfsBox can take a pid argument which defines the rank of the process on which the solution for this GfsBox will be computed. If you take the "half cylinder" example and do something like:

4 3 GfsSimulation GfsBox GfsGEdge {} {
  Time { end = 10 }
  Refine 6
  GtsSurfaceFile half-cylinder.gts
  Init {} { U = 1 }
  OutputProjectionStats { step = 0.02 } stderr
  OutputSimulation { step = 1 } simulation-%3.1f
  OutputTiming { start = end } stderr
}
# each GfsBox is assigned to a different process through its pid
GfsBox { pid = 0 left = BoundaryInflowConstant 1 }
GfsBox { pid = 1 }
GfsBox { pid = 2 }
GfsBox { pid = 3 right = BoundaryOutflow }
# connectivity between the boxes (indices refer to their order of definition, starting from 1)
1 2 right
2 3 right
3 4 right

if you run this using

% gerris2D half-cylinder.gfs

it will run on one processor. If you now do

% mpirun -np 4 gerris2D half-cylinder.gfs

it will run on 4 processors with each of the GfsBoxes assigned to a different processor. Gerris takes care of the communications necessary at the boundaries between GfsBoxes on different processors.

Any Gerris parameter file can be manually "parallelised" as explained above. Gerris also includes command-line options designed to create "parallelised" simulation files. A short description of these options is given when typing:

% gerris2D -h
 -s N   --split=N     splits the domain N times and returns
                      the corresponding simulation
 -i     --pid         keep box pids when splitting
 -p N   --partition=N partition the domain in 2^N subdomains and returns
                      the corresponding simulation
 -b N   --bubble=N    partition the domain in N subdomains and returns
                      the corresponding simulation

The -s option is used to split the domain, which creates more GfsBoxes: it takes the existing domain, splits it N times (in 2D each split replaces every GfsBox with four smaller GfsBoxes, in 3D with eight) and attributes a new pid to each resulting GfsBox (unless the -i option is specified).
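
For example, applied to the 4-box half-cylinder file above, a single split would replace each 2D GfsBox with four smaller GfsBoxes, giving 4*4 = 16 GfsBoxes, each with a new pid (the output file name is only illustrative):

% gerris2D -s 1 half-cylinder.gfs > half-cylinder-split.gfs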

The -p and -b options are used to "parallelise" a Gerris file that already contains enough GfsBoxes (at least 2^N for the -p option and N for the -b option). They group the GfsBoxes together in order to obtain 2^N (or N) subdomains, and each of these subdomains is attributed a different pid. The difference between the -p and -b options is the algorithm used to perform the graph partitioning. The -b option uses a simple and fast bubble partitioning algorithm which will not necessarily yield well-balanced subdomains. The -p option uses a more complex and slower recursive bisection algorithm which is optimised to yield well-balanced subdomains.
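
For example, the 4-box half-cylinder file shown at the top of this page already contains enough GfsBoxes to be partitioned directly into 2^1 = 2 subdomains of 2 GfsBoxes each (the output file name is only illustrative):

% gerris2D -p 1 half-cylinder.gfs > half-cylinder-parallel.gfs

The resulting file is then run with mpirun -np 2, exactly as before.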

Example: a 2D GfsSimulation made of 3 GfsBoxes

1- If only one processor is to be used, no parallelisation is required and the usual command line is enough:

% gerris2D simulation.gfs

2- If 2 processors are to be used: since the domain is made of 3 GfsBoxes, two solutions can be considered:

  • Either the 3 GfsBoxes can be redistributed into 2 subdomains, which are bound to be one subdomain of 1 GfsBox and one of 2 GfsBoxes. This can be done with:
% gerris2D -b 2 simulation.gfs > parallelsimulation.gfs

then the simulation can be started using:

% mpirun -np 2 gerris2D parallelsimulation.gfs

  • If we want a better balance between the sizes of the 2 subdomains, it is possible to split the simulation once and then repartition it, as follows.

The 3 GfsBoxes can be split once, which creates 3*4 = 12 GfsBoxes:

% gerris2D -s 1 simulation.gfs > splitsimulation.gfs

then the 12-GfsBox simulation can be partitioned into 2 groups of 6 GfsBoxes, where the same pid is given to all the GfsBoxes of the same subdomain:

% gerris2D -b 2 splitsimulation.gfs > parallelsimulation.gfs

The simulation is still started in the same way:

% mpirun -np 2 gerris2D parallelsimulation.gfs

3- If 4 processors are to be used, then the domain has to be split anyway, since 3 GfsBoxes cannot be distributed over 4 processors.

The 3 GfsBoxes can be split once, which creates 3*4 = 12 GfsBoxes:

% gerris2D -s 1 simulation.gfs > splitsimulation.gfs

then the 12-GfsBox simulation can be partitioned into 4 groups of 3 GfsBoxes, where the same pid is given to all the GfsBoxes of the same subdomain:

% gerris2D -b 4 splitsimulation.gfs > parallelsimulation.gfs

The simulation is still started in the same way:

% mpirun -np 4 gerris2D parallelsimulation.gfs
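
In the resulting parallelsimulation.gfs, the parallelisation is again only visible through the pid entries of the GfsBox definitions; depending on how the partitioning algorithm groups the boxes, the file will contain lines similar to this sketch:

GfsBox { pid = 0 }
GfsBox { pid = 0 }
GfsBox { pid = 0 }
GfsBox { pid = 1 }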

Dynamic load-balancing

When adaptive mesh refinement is used, the number of cells in each subdomain changes during the course of the simulation. If the size of the subdomains is not changed, some processors will end up working much harder than others, which leads to inefficient parallelisation. It is then necessary to "rebalance" the simulation, which is done using the GfsEventBalance object. Note that in this case the quality of the initial partition does not matter much, as it will be rebalanced regularly anyway, so using the simpler and faster -b option to create the initial partition is adequate.
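
For example, the following line could be added to the GfsSimulation block of the parameter file; with these (purely illustrative) values it asks Gerris to check the load balance every 100 timesteps and to tolerate at most a 10% imbalance between subdomains before redistributing the GfsBoxes:

GfsEventBalance { istep = 100 } 0.1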

Known issues in parallel

See known issues in parallel.
