Introduction to Gerris Programming
From Gerris
Contents |
Introduction
This course is about how Gerris is programmed. It aims to teach the style of programming used in Gerris to an audience with limited experience of programming. The objective of the course is to understand the Gerris source code, gain the ability to program in this particular style, and adapt Gerris to one's needs.
This course is not about using Gerris or the numerical methods in Gerris (however it is necessary for students of this course to learn about them, for instance in the Journal of Computational Physics article).
Gerris is closely linked to GTS, the GNU triangulated surface library, also written by Stephane Popinet, and the programming styles of Gerris and GTS are analogous.
Prerequisites, references and supplementary reading material
- We assume that you have good knowledge of Gerris as a user. If however this was not the case, it is imperative that you master the usage of Gerris before starting to take this programming course. Read the Gerris Tutorial thoroughly and try the examples there, until Section 4. Then look at the Examples and Test suite pages, and try to adapt some of these, perhaps as suits your interests in fluid mechanics. When you get stuck, read the FAQ and then the archive of the Gerris mailing list. (Of course you have subscribed to the mailing list long ago.)
- This course is meant to serve, among other things, as an introduction to the Gerris/GTS object system.
- You may possibly want to know more about "object systems" written in C, although this is not strictly necessary for a first understanding of Gerris. You will find an introduction and in-depth discussion of the Gobject system in http://en.wikipedia.org/wiki/GObject.
- Introduction to C: it is required to have a good knowledge of C to understand Gerris. The course will point to what you may not know, but you will need more detailed presentations, in textbook form, of the language features you do not already know. No preferences here as to the best textbook. Use a recent edition of a good text and/or manual. Read the parts about pointers to functions, structures, and pointers to structures. We assume you know about types, precedence, arrays, pointers, functions and macros.
About the difficulty to read the code
Perhaps, as I am, you are used to code that is transparent: any time you read it, you understand quickly and exactly what it does. You can debug it just by looking at it. To be fair, this mostly happens with your code, and less frequently with the codes of others ... Then you plunge into Gerris and you are depressed.
There are five reasons you may get stuck when reading the code.
- You do not know the underlying concepts.
- In the case of the user interface, these concepts are the Objects used in the parameter files, such as GfsBox or GfsOutputSimulation. You learn about these concepts by learning to use the code, see the previous Section.
- For numerical schemes, the concepts are those described in the basic bibliography, for instance the numerical scheme for advection is described in the Journal of Computational Physics article.
- There are also useful concepts from computer science, such as heaps, piles, hash tables etc... See the bibliography on the GTS site.
- You do not know C well enough. See the previous Section.
- You miss the description of some functions or macros, or you do not understand what they are used for. You can find many useful descriptions and comments in the Reference Manual.
- You miss the meaning of some variables, structures or structure members. We will describe ways in which you can figure them out.
- You lack a general familiarity with the code and its style of programming. This course is supposed to provide you with it.
How (not) to plunge into the code
- The best place to get an idea about the code is to discover the most basic components of it. Arguably, the most basic component of Gerris is the FTT tree. When studying a component, it is best to start from the header functions, which give an idea about the structures and functions used. However there are two important things you should know:
- First of all you should distinguish between public objects and private ones. Public objects a kind of higher-level stuff, which is meant to be used in several parts of the code. Private objects are meant to stay hidden from the eyes of the programmer who works at the higher level. You do not need to know about private objects at this stage. This is an important distinction that is not easy to grasp immediately.
- Second, there are parts more basic and easier to understand than others. However, at the beginning, you shall need some guidance which this course is meant to provide. In this course, we try to lead you to the easier parts first.
- A mistake that beginners frequently make is to try to look at the part of the code where their simulation failed. For instance if your simulation failed when reading some line of the tutorial's vorticity.gfs file, you will look at where
(* GTS_OBJECT_CLASS (gfs_domain_class ())->parent_class->read) (o, fp);
is called. It is not a good idea: the read
and write
functions are not easy to understand without going through some preliminary concepts as in the first two sessions of this course.
- Another mistake is to read the code starting from
int main(int argc, char * argv[])
. This is a mistake because the beginning of the code is full of bookkeeping stuff and is not the most interesting from the point of view of understanding the general structure and peculiarities of Gerris/GTS.
- At this stage it is also not a good idea to read the Gerris reference manual in a sequential manner, chapter after chapter. The manual is indeed for Reference, when you want to find out about the definition of a function or class. Read the manual as you discover the basic functions of one component of gerris. We will point you to it when it becomes useful for you to look up the definitions of some objects.
To conclude, this course introduces you step by step to the most used programming constructs of Gerris. After working (not just reading) the course you should be able to go back to the code and understand large portions of it just by reading the code.
Keep essential information nearby
- Keep a good C text nearby; for authoritative detail see The GNU C Reference Manual.
- A lot of macros and functions such
g_assert
come from the Glib, so keep a bookmark to the Glib documentation.
C basics
Pointers
We assume you are familiar with pointers. In modern computer science, pointers are considered a technique to reference data. A reference is an object containing information which refers to data stored elsewhere, as opposed to containing the data itself. Accessing the value referred to by a reference is called dereferencing it.
Introduction to Structures
This part of the course is intended for those who have no experience programming with structures. A very simple example:
struct Point {
char name;
double x, y;
};
what we have written is a structure template named point.
An example of usage
struct Point {
char name;
double x, y;
};
main()
{
struct Point my_point; /* declaration */
my_point.x = 0. ;
my_point.y = 1.;
my_point.name = ‘A’;
}
The variable my_point is sometimes also called a structure. However we should distinguish between the structure template called Point and the variable my_point. In this example, my_point is a variable of type struct, with template named Point and whose name is my_point, and Point is the tag or name of a structure template.
Programmers often operate on variables that are pointers to structure variables instead of structure variables. There are several reasons for that. One is that it is more efficient to pass a pointer to a structure in a function: it avoids the need to make a copy of the entire structure data to pass it to the function. Thus the following example:
struct Point symmetric_point(struct Point * my_point)
{
struct Point new_point;
new_point.x = - my_point->x;
new_point.y = - my_point->y;
new_point.name = my_point->name;
return new_point;
}
main()
{
struct Point my_point;
my_point.x = 0. ;
my_point.y = 1.;
my_point.name = ‘A’;
struct Point A_point = symmetric_point(&my_point);
}
In that case, the members of the structure are indicated by an arrow sign "->".
There are similar structures in Gerris . In ftt.h :
struct _FttVector {
gdouble x, y, z;
};
Note that the gerris code uses the type gdouble from the glib library.
A similar structure exists in the gts library. In gts-stable/src/gts.h :
struct _GtsPoint {
...
gdouble x, y, z;
};
The dots indicate extra code that need not concern us at present.
Operations with structures
Below one defines several structures defining geometric figures.
typedef struct _Quadrangle Quadrangle;
typedef struct _Square Square;
typedef struct _Triangle Triangle;
struct _Quadrangle {
struct Point A, B, C, D;
};
struct _Square {
struct Point A, B, C, D;
};
struct _Triangle {
struct Point A, B, C;
};
Notice that we use typedef: this creates an "alias" for the structure name. This allows to define Quadrangle without typing "struct". There are other advantages of the use of typedef that we will discover later.
Structures often need to be initialized. For instance we could initialize a point as follows
struct Point * init_Point()
{
struct Point * Point_ptr
= (struct Point *) malloc (sizeof (struct Point));
Point_ptr->x = 0.;
Point_ptr->y = 0.;
Point_ptr->name = 'A';
return Point_ptr;
}
This allocates dynamically the memory space needed to hold the data in the variable pointed to by Point_ptr, and gives some value to its members. Of course these values are somewhat arbitrary and it would probably be more convenient to be able to specify the values in the argument list of the function init_Point_ptr:
struct Point * init_Point(double x, double y, char c)
{
struct Point * ptr
= (struct Point *) malloc (sizeof (struct Point));
ptr->x = x;
ptr->y = y;
ptr->name = c;
return ptr;
}
Our other structures will also have similar initialization functions, for instance
Quadrangle * init_Quadrangle(struct Point A, struct Point B, struct Point C, struct Point D)
{
Quadrangle * ptr
= (Quadrangle *) malloc (sizeof (Quadrangle));
ptr->A = A;
ptr->B = A;
ptr->C = C;
ptr->D = D;
return ptr;
}
It seems reasonable to assume that all the structures we use will need some kind of initialisation. So there will be an init_Square, init_Triangle and so on.
The init functions are only the beginning. We may have for instance area functions that compute the area of geometrical figures, so we would have an area_Quadrangle, area_Square and area_Triangle.
double area_Triangle(Triangle * T)
{
double x1, y1, x2, y2;
x1 = T->A.x - T->B.x;
y1 = T->A.y - T->B.y;
x2 = T->A.x - T->C.x;
y2 = T->A.y - T->C.y;
return ABS( x1*y2 - y1*x2 )/2.;
}
Public and Private parts
We could conceivably work with our geometrical structures above in the following way: we would do all we have to do (creating figures, transforming them, drawing them etc...) with functions such as init_triangle, area_triangle etc... without ever accessing the member points or coordinates in any other way.
Only the functions would access the inside of the structures.
This is one of the characteristics of object-oriented programming: there is a public part, the tags of the structure template, the calling sequence of the various functions, and a private part.
If you look at the source code, you will often find things like /*< private >*/
struct _FttCell {
/*< public >*/
guint flags;
gpointer data;
/*< private >*/
struct _FttOct * parent, * children;
};
One way to look at it is to view the code as having two levels: a high-level part, in which we use functions that operate on structures such as init_xxx and the <public>
members of the structures, and the low-level part, in which operations on the private members are performed. In many cases in Gerris, the entire membership of a structure template is meant to be private. It is important to understand this concept. Some aspects of the English language may help us. In English, private part has a dual meaning. It is not socially acceptable (in most circumstances) to touch someone else's "private parts". Similarly, the programmer of an object-oriented language does not want you to access the members of its structures ...
One of the motivations for the various levels is that one may decide to change the way in which the various objects are represented inside the structures. If such a change is performed but only high-level functions are used in a piece of code, this piece of code will continue working even with the new implementation. For instance we may decide to represent points with polar instead of cartesian coordinates. This would affect all operation which rely on the structure to have members "x" and "y", but not those which use the functions area_xxx etc...
In languages such as C++ the access to private members of classes is guarded by the language. This is called encapsulation, or hiding implementation . In C this is not provided, but there is a way to achieve this effect, as shown in the next section.
Inheritance
So far, we have seen that our structures have public and private parts and that they have a list of functions associated to them. These functions are often similar to each other: for instance all the init functions are similar, and the area functions could also probably be made to derive from a single expression. These functions are the methods associated with this particular structure template. Methods are a concept of Object-Oriented programming, so we could say we have an elementary object system with elementary methods.
To further improve our object system, we could imagine having a Polygon structure template, so that all the structure templates would somehow be built on top of it. This is the parent/child or inheritance structure of Object-Oriented languages.
typedef struct _Polygon Polygon;
typedef struct Point Point_t;
struct _Polygon
{
int number_of_vertices;
Point_t * vertices;
};
More about pointers
As a short aside, we notice that we refer to the vertices by a pointer to a type Point_t . We can also refer in structures to pointers to structures without defining the structure previously:
struct _Polygon
{
int number_of_vertices;
struct Point * vertices;
};
struct Point
{
double x, y;
char name;
};
If instead we referred to a type which is struct and not a pointer to a struct, we would need to define it before. The following does not work:
struct _Polygon
{
int number_of_vertices;
struct Point * vertices;
struct Point useful_point; /* error: incomplete type */
};
struct Point
{
double x, y;
char name;
};
This is useful to understand the use of pointers! In fact, most of the variables passed to functions and macros in Gerris are pointers.
Elementary inheritance system
Obviously Triangle will share many properties of Polygon. We say it inherits these properties, and that Polygon is the parent of Triangle. How do we make Triangle a child of Polygon? We can do it as follows:
struct _Triangle
{
Polygon parent;
};
Actually our "child" structure is trivial: it need not hold any extra data in addition to the parent data. However, we may have extra methods: only the triangle has an incircle (it is a circle inside the triangle and tangent to all its faces), the general polygon does not. So we would have a new method
Point_t incircle_triangle(Triangle * t)
{
...
}
An example of child with extra data, we may want to have an colored polygon. Suppose we have defined a type (or structure) Color. (We could implement it as RGB values for example.)
struct _Colored_Polygon
Polygon parent;
Color color;
};
Type cast
Suppose we want to write a general area function
double area(Polygon * p)
{
...
}
How can we apply it to a Triangle t? We cannot do
Triangle * t = init_triangle();
double A = area( t ) ;
because t is of type Triangle and area expects an object of type Polygon. What we can do instead is
Triangle * t = init_triangle();
Polygon * p = (Polygon *) ( t );
double A = area( p );
we have used a very useful feature of C: we have cast the Triangle variable t to a Polygon variable p. The cast operation does not rearrange the data in the memory allocated to the variable. So the area function will expect to find it where it is in the Polygon template, for instance, number of vertices first. Thus it is important that the parent is the first member of the structure:
struct _Colored_Polygon
Polygon parent;
/* The parent should be the first object in the structure.*/
/* The whole object-inheritance mechanism in C relies on this data alignment. */
/* Extra data are here */
Color color;
};
We have been able to build an elementary object system in C, with encapsulation (public and private parts are distinguished), methods and inheritance. Inheritance, in particular, uses a specific cast operation. However, the use of these concepts is not yet systematic. We shall see later how Gerris/GTS organizes these concepts and operations. For a first look at Gerris, our current knowledge of structures is sufficient.
Types and "typedef"
This section is more advanced and may be omitted on first reading.
We shall see that typedef is very frequently used in Gerris. This is not only to save us the work of typing struct before each structure type. Typedef may be used to provide an incomplete definition of a structure or union type, providing the type name but not the contents. Such an incomplete type cannot be directly used, but pointers to such a type can be used (but not dereferenced). You can use this to implement a kind of encapsulation, that is, to hide implementation. Suppose the following is in a file Triangle.h :
/* Hiding implementation in C. */
/* Triangle: Header */
typedef struct _Triangle_hidden Triangle;
Triangle * init_Triangle();
double area_Triangle(Triangle * T);
and the following in another file Triangle.c:
/* Triangle: Object */
struct _Triangle_hidden {
struct Point A, B, C; /* hidden template for Triangle */
};
Triangle * init_Triangle()
{
Triangle * ptr = (Triangle *) malloc( sizeof( Triangle) );
ptr->A = *init_point(0.,0.,'A');
ptr->B = *init_point(0.,1.,'B');
ptr->C = *(init_point(1.,0.,'C'));
return ptr;
}
#define ABS(x) ((x) > 0 ? (x) : - (x))
double area_Triangle(Triangle * T)
{
double x1, y1, x2, y2;
x1 = T->A.x - T->B.x;
y1 = T->A.y - T->B.y;
x2 = T->A.x - T->C.x;
y2 = T->A.y - T->C.y;
return ABS( x1*y2 - y1*x2 )/2.;
}
Notice the Header and Object comments that we shall find in Gerris and GTS as well. Notice that the header Triangle.h declares the type Triangle without describing the contents of this structure.
Code including the Triangle.h header above can manipulate objects of type Triangle only through pointers (This is similar to the restriction on incomplete types, see previous section) and only using the provided functions. Code using this header cannot get access to the actual declaration of the _Triangle_hidden structure. In fact, if Triangle.c is compiled into a library and distributed that way, the programmer can't even cheat and read the code.
See IBM developer world article Everything you ever wanted to know about C types, Part 1 for more details.
This type of encapsulation is seldom used in Gerris, but it helps to reflect on the distinction between public and private parts.
Reading assignments and exercises
- Check that you have all the prerequisites that this course demands. For instance, read the Journal of Computational Physics article if you have not yet done so.
- Write a short program that implements the
Triangle
andPoint
objects, initializes them and prints the area of aTriangle
objectt
.
- Write a function that initializes the
Triangle
in the case where it has the parentPolygon
.
- Many object systems implement a
destroy
operation for objects. In the case ofTriangle
, what would that be?
- (Advanced question) How could we implement a function that checks that an object variable
t
is of typeTriangle
? (If you can answer that question you can probably teach the class.)
⇑ up: Gerris Flow Solver Programming Course for Dummies, ⇒ next: The Fully Threaded Tree.