Some details about functions
C is a procedural programming language. In C, functions act as procedures, and they are building blocks of a C program. So, it is important to know what they are, how they behave, and what is happening when you enter or leave a function. In general, functions (or procedures) are analogous to ordinary variables that store algorithms instead of values. By putting variables and functions together into a new type, we can store relevant values and algorithms under the same concept. This is what we do in object-oriented programming, and it will be covered in the third part of the book, Object Orientation. In this section, we want to explore functions and discuss their properties in C.
Anatomy of a function
In this section, we want to recap everything about a C function in a single place. If you feel this is familiar to you, you can simply skip this section.
A function is a box of logic that has a name, a list of input parameters, and a list of output results. In C and many other programming languages that are influenced by C, functions return only one value. In object-oriented languages such as C++ and Java, functions (which are usually called methods) can also throw an exception, which is not the case for C. Functions are invoked by a function call, which is simply using the name of the function to execute its logic. A correct function call should pass all required arguments to the function and wait for its execution. Note that functions are always blocking in C. This means that the caller has to wait for the called function to finish and only then can it collect the returned result.
In contrast to a blocking function, there are non-blocking functions. When calling a non-blocking function, the caller doesn't wait for the function to finish and can continue its own execution. In this scheme, there is usually a callback mechanism that is triggered when the called function (the callee) has finished. A non-blocking function can also be referred to as an asynchronous function or simply an async function. Since we don't have async functions in C, we need to implement them using multithreading solutions. We will explain these concepts in more detail in the fifth part of the book, Concurrency.
It is interesting to add that nowadays, there is a growing interest in using non-blocking functions over blocking functions. This approach is usually referred to as event-oriented programming. Non-blocking functions are central to it, and most of the functions written in this style are non-blocking.
In event-oriented programming, actual function calls happen inside an event loop, and proper callbacks are triggered upon the occurrence of an event. Frameworks such as libuv and libev promote this way of coding, and they allow you to design your software around one or several event loops.
Importance in design
Functions are fundamental building blocks of procedural programming. Ever since programming languages began to support them, they have had a huge impact on the way we write code. Using functions, we can store logic in semi-variable entities and summon them whenever and wherever they are needed. With them, we can write a piece of logic once and use it multiple times in various places.
In addition, functions allow us to hide a piece of logic from other existing logic. In other words, they introduce a level of abstraction between various logical components. To give an example, suppose that you have a function, avg, which calculates the average of an input array. And you have another function, main, which calls the function avg. We say that the logic inside the avg function is hidden from the logic inside the main function.
Therefore, if you want to change the logic inside avg, you don't need to change the logic inside the main function. That's because the main function only depends on the name and the availability of the avg function. This is a great achievement, at least for those years when we had to use punched cards to write and execute programs!
We are still using this feature in designing libraries written in C or even higher-level programming languages such as C++ and Java.
Stack management
If you look at the memory layout of a process running in a Unix-like operating system, you will notice that all processes share a similar layout. We will discuss this layout in more detail in Chapter 4, Process Memory Structure, but for now, we want to introduce one of its segments: the Stack segment. The Stack segment is the default memory location from which all local variables, arrays, and structures are allocated. So, when you declare a local variable in a function, it is allocated from the Stack segment. This allocation always happens on top of the Stack segment.
Notice the term stack in the name of the segment. It means that this segment behaves like a stack. The variables and arrays are always allocated on top of it, and those at the top are the first variables to get removed. Remember this analogy with the stack concept. We will return to this in the next paragraph.
The Stack segment is also used for function calls. When you call a function, a stack frame containing the return address and all of the passed arguments is put on top of the Stack segment, and only then is the function logic executed. (On many platforms, the calling convention passes some arguments in registers rather than on the stack, but the model stays the same.) When returning from the function, the stack frame is popped, and execution continues at the instruction addressed by the return address, which normally resumes the caller function.
All local variables declared in the function body are put on top of the Stack segment. So, when leaving the function, all of its Stack variables are freed. That is why we call them local variables and that is why a function cannot access the variables in another function. This mechanism also explains why local variables do not exist before entering a function or after leaving it.
Understanding the Stack segment and the way it works is crucial to writing correct and meaningful code. It also helps prevent common memory bugs. It is also a reminder that you cannot create a variable of any size you like on the Stack. The Stack is a limited portion of memory, and you could fill it up and potentially receive a stack overflow error. This usually happens when too many nested function calls use up the whole Stack segment with their stack frames. It is very common with recursive functions, when a function calls itself without any break condition or limit.
Pass-by-value versus pass-by-reference
In most computer programming books, there is a section dedicated to pass-by-value and pass-by-reference regarding the arguments passed to a function. Fortunately, or unfortunately, we have only pass-by-value in C.
There are no references in C, so there is no pass-by-reference either. Every argument is copied into the function's parameters, which are local variables, and you cannot read or modify those copies after leaving the function.
Despite the many examples that seem to demonstrate pass-by-reference function calls, I should say that passing by reference is an illusion in C. In the rest of this section, we want to uncover this illusion and convince you that those examples are also pass-by-value. The following example will demonstrate this:
    #include <stdio.h>

    void func(int a) {
      a = 5;
    }

    int main(int argc, char** argv) {
      int x = 3;
      printf("Before function call: %d\n", x);
      func(x);
      printf("After function call: %d\n", x);
      return 0;
    }
Code Box 1-29 [ExtremeC_examples_chapter1_17.c]: An example of a pass-by-value function call
It is easy to predict the output. Nothing changes about the x variable because it is passed by value. The following shell box shows the output of example 1.17 and confirms our prediction:
    $ gcc ExtremeC_examples_chapter1_17.c
    $ ./a.out
    Before function call: 3
    After function call: 3
    $
Shell Box 1-10: Output of example 1.17
The following example, example 1.18, demonstrates that passing by reference doesn't exist in C:
    #include <stdio.h>

    void func(int* a) {
      int b = 9;
      *a = 5;
      a = &b;
    }

    int main(int argc, char** argv) {
      int x = 3;
      int* xptr = &x;
      printf("Value before call: %d\n", x);
      printf("Pointer before function call: %p\n", (void*)xptr);
      func(xptr);
      printf("Value after call: %d\n", x);
      printf("Pointer after function call: %p\n", (void*)xptr);
      return 0;
    }
Code Box 1-30 [ExtremeC_examples_chapter1_18.c]: An example of a pass-by-pointer function call, which differs from pass-by-reference
And this is the output:
    $ gcc ExtremeC_examples_chapter1_18.c
    $ ./a.out
    Value before call: 3
    Pointer before function call: 0x7ffee99a88ec
    Value after call: 5
    Pointer after function call: 0x7ffee99a88ec
    $
Shell Box 1-11: Output of example 1.18
As you see, the value of the pointer is not changed after the function call. This means that the pointer itself is passed by value. Dereferencing the pointer inside the func function allows access to the variable that the pointer points to. But changing the value of the pointer parameter inside the function doesn't change its counterpart argument in the caller function. During a function call in C, all arguments are passed by value, and dereferencing pointers is what allows the modification of the caller function's variables.
It is worth adding that the above example demonstrates pass-by-pointer, in which we pass pointers to variables instead of passing the variables directly. It is usually recommended to use pointers as arguments instead of passing big objects to a function. Why? Copying a pointer argument, which is 8 bytes on a typical 64-bit platform, is much more efficient than copying hundreds of bytes of a big object.
Surprisingly, passing the pointer is not more efficient in the above example! That's because the int type is typically 4 bytes, and copying it is cheaper than copying the 8 bytes of its pointer. But this is not the case for large structures. Passing a structure by value copies every byte of it, so it is usually better to pass a pointer instead. Arrays are a special case: an array argument is never copied, because it is converted to a pointer to its first element.
Now that we've covered some details regarding the functions in C, let's talk about function pointers.