For me, the most interesting function in the glibc library is fork().
What does it do? A simple thing: it makes a copy of the process that called it.
The result is two essentially identical processes. They have identical copies of the data, identical code, and access to all previously opened files ... And here lies the beauty and magic of the fork() function.
It is a bit like in science fiction books. After a failed teleportation we have two identical astronauts: one appeared on a planet in a distant galaxy, but due to a hardware failure, he did not disappear from Earth.
They both have the same knowledge, experience, first name, surname, and a common past up to the moment of teleportation.
They will both do what they planned to do after teleporting. Each considers himself the one and only original, not a clone.
How does it work? Let's write a simple program:
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    printf("START!\n");
    fork();   /* from here on, two processes execute the same code */
    /* both processes print this line, each with its own PID */
    printf("I am the parent process, my PID is %i\n", getpid());
    return 0;
}
What will we see on the screen after compiling and running it? Well, something like this:
root@mobile3:/home/rcp# ./test_fork
START!
I am the parent process, my PID is 9364
I am the parent process, my PID is 9365
The "START!" message is displayed only once, but "I am the parent process" twice!
Our process displayed "START!" before fork() was executed.
The message "I am the parent process ..." was displayed after fork(). Two processes were already running, and both displayed it!
They even informed us about their PIDs, 9364 and 9365.
Yet there is little benefit from such cloning, even in science fiction books. After fork(), each process needs to know whether it is the parent or the child.
Then the parent process can continue its main task, for example checking the temperature sensors in the reactors, while the child process takes care of the time-consuming job of adding components to a reactor that has reached the correct temperature.
Fortunately, it is easy to achieve.
fork() returns 0 in the child process and the PID of the child in the parent process (to be precise, it can also return -1 in the parent if it failed, and then no child is created).
Let's do some minor modifications to our test program:
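int main(void)
{
    int m;
    if (! (m = fork()) )
    {
        /* fork() returned 0: this is the child */
        printf("I am the child process, my PID is %i\n", getpid());
    }
    else
    {
        /* fork() returned the child's PID: this is the parent
           (a return value of -1, meaning failure, is not handled here) */
        printf("I am the parent process, I created the child process with PID %i\n", m);
    }
    return 0;
}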
After compiling and running the program, we can see that the processes figured out who is who:
root@mobile3:/home/rcp# ./test_fork
I am the parent process, I created the child process with PID 10212
I am the child process, my PID is 10212
In the child process fork() returned 0; in the parent it returned 10212 (we kept this value in the m variable). This is how fork() was used to take pictures in the example camera server.
To prove that there are two processes, we will modify our test program again:
int main(void)
{
    printf("I am the parent process, I will do sleep for 10 seconds\n");
    sleep(10);
    int m;
    if (! (m = fork()) )
    {
        printf("I am the child process, my PID is %i\n", getpid());
        printf("I am the child process, I will do sleep for 10 seconds\n");
        fflush(0);
        sleep(10);
    }
    else
    {
        printf("I am the parent process, I created the child process with PID %i\n", m);
        printf("I am the parent process, I will do sleep for 10 seconds\n");
        fflush(0);
        sleep(10);
    }
    return 0;
}
We have added 10 seconds of sleep before and after fork(). This will give us time to stop the process (processes) and check that they are there (CTRL+Z from the keyboard and ^Z displayed on the screen).
root@mobile3:/home/rcp# ./test_fork
I am the parent process, I will do sleep for 10 seconds
^Z
[1]+ Stopped ./test_fork
root@mobile3:/home/rcp# ps ax|grep test_fork
10952 pts/0 T 0:00 ./test_fork
10958 pts/0 S+ 0:00 grep test_fork
root@mobile3:/home/rcp# fg
./test_fork
I am the parent process, I created the child process with PID 10960
I am the parent process, I will do sleep for 10 seconds
I am the child process, my PID is 10960
I am the child process, I will do sleep for 10 seconds
^Z
[1]+ Stopped ./test_fork
root@mobile3:/home/rcp# ps ax|grep test_fork
10952 pts/0 T 0:00 ./test_fork
10960 pts/0 T 0:00 ./test_fork
10963 pts/0 S+ 0:00 grep test_fork
We clearly see one test_fork process during the first sleep and two during the second.
Another experiment. This time, the child process will terminate immediately after displaying the message:
int main(void)
{
    int m;
    if (! (m = fork()) )
    {
        /* the child prints its message and ends right away */
        printf("I am the child process, I am terminating\n");
    }
    else
    {
        printf("I am the parent process, I will do sleep for 10 seconds\n");
        fflush(0);
        sleep(10);
    }
    return 0;
}
And what processes are we going to see now, while the parent process sleeps?
root@mobile3:/home/rcp# ./test_fork
I am the parent process, I will do sleep for 10 seconds
I am the child process, I am terminating
^Z
[1]+ Stopped ./test_fork
root@mobile3:/home/rcp# ps ax|grep fork
11534 pts/0 T 0:00 ./test_fork
11535 pts/0 Z 0:00 [test_fork] <defunct>
11538 pts/0 S+ 0:00 grep fork
Still two! And yet the child process had ended long ago!
Pay attention to the letter in the 3rd column of the message displayed by the ps ax command.
This is the process state, T means that the process is stopped.
This is correct, we stopped it by pressing CTRL+Z.
But next to the second process, the child process, which was supposed to no longer exist, we have the Z letter.
It is a "zombie" process. It has finished but still exists in the system, because the parent process may want to know with what result it ended its work.
After the parent process terminates, the zombie will disappear, but we do not always want to wait that long.
The parent process may run for many years, creating several hundred child processes per second ...
How do we deal with these zombies? It is easy: the parent just has to take care of its child.
When the child process finishes, the system sends a SIGCHLD signal to the parent process. It is enough to handle this signal and, in the handler, collect the child's status with wait().
Another minor upgrade of our program:
#include <signal.h>

void catch_CHLD(int signal_num)
{
    /* printf() is not async-signal-safe; it is used here only for the demonstration */
    printf("I am handling the signal %i\n", signal_num);
    fflush(0);
    /* collect the status of the finished child; this is what removes the zombie */
    wait(NULL);
}

int main(void)
{
    signal(SIGCHLD, catch_CHLD);
    int m;
    if (! (m = fork()) )
    {
        printf("I am the child process, I am terminating\n");
    }
    else
    {
        printf("I am the parent process, I will do sleep for 10 seconds\n");
        fflush(0);
        sleep(10);
    }
    return 0;
}
When the SIGCHLD signal is received, the process interrupts what it is doing and executes the catch_CHLD function.
The handler can of course do something more useful than displaying a message that the signal has arrived. What makes the child disappear from the system's process list is the wait() call: catching SIGCHLD alone is not enough, the parent has to collect the child's status with wait() or waitpid(). Once it has done so, the child is no longer a zombie.
On the screen we will see:
root@mobile3:/home/rcp# ./test_fork
I am the parent process, I will do sleep for 10 seconds
I am the child process, I am terminating
I am handling the signal 17
If you compile and run the program with the catch_CHLD handler, you will probably notice something strange.
The 10 seconds during which the parent process was supposed to sleep have disappeared. The process finishes almost immediately.
Well, this is how the sleep() function works. The process remains inactive for the given period of time, but if a signal arrives, the sleep is interrupted. This is clearly described in the documentation for the GNU C library.
However, sleep() returns the number of seconds that were left to sleep. If precise timing is not very important for us, and the interrupts do not come very often, we can use this information:
int main(void)
{
    signal(SIGCHLD, catch_CHLD);
    int m;
    unsigned int sleep_time;   /* sleep() takes and returns unsigned int */
    if (! (m = fork()) )
    {
        printf("I am the child process, I am terminating\n");
    }
    else
    {
        printf("I am the parent process, I will do sleep for 10 seconds\n");
        fflush(0);
        sleep_time = 10;
        /* if a signal interrupts sleep(), sleep again for the time that was left */
        while (sleep_time > 0)
            sleep_time = sleep (sleep_time);
    }
    return 0;
}
If sleep() returns a value greater than 0, it will be called again for the time remaining until the end of the originally indicated time. Unfortunately, the value of time remaining returned by sleep() is expressed in seconds. 7 seconds might mean 6.65 or 7.28 seconds remaining. With a higher number of interrupts, these errors can accumulate. In this case, it is better to measure the time using the system clock, as in the example from the camera server program, or use the much more accurate nanosleep() function.
The child process receives its own copy of the data area and code of the parent process, as well as copies of open file descriptors and network connections.
It is not able to destroy the data of the parent program because it runs on its own copy.
The child process can, however, change the contents of a file opened by the parent process, and the programmer should take this into account.
In the case of open files and network connections, the situation is no longer so clear.
For example:
mysql_init (&mysql);
mysql_real_connect (&mysql, "localhost", "rcp", "Akuku", "rcp", 0, NULL, 0);
if (! (m = fork()) )
{
    // the child process, it does something, and after, it sends a query to the database
    sprintf(buf, "Select * from departments where idw > 100");
    mysql_query (&mysql, &buf[0]);
}
else
{
    // the parent process, it does something, and after, also sends a query to the database
    sprintf(buf, "Select id_door from doors where nr_door=23");
    mysql_query (&mysql, &buf[0]);
}
If one process sends a query to the database while the other process's query is still being performed, nothing good will come of it.
Both processes use the same connection: each has its own copy of the MYSQL structure, but both copies refer to the same socket, and our program will 'explode'. The database engine does not expect two queries on the same connection at the same time.
If the queries do not happen at the same time, everything will be OK.
To avoid a technically difficult synchronization of queries from several processes, it is best to open a new connection with the database in the child process:
mysql_init (&mysql);
mysql_real_connect (&mysql, "localhost", "rcp", "Akuku", "rcp", 0, NULL, 0);
if (! (m = fork()) )
{
    // the child process, it does something, and after, sends a query to the database
    // it opens a new connection to the database:
    mysql_init (&mysql_1);
    mysql_real_connect (&mysql_1, "localhost", "rcp", "Akuku", "rcp", 0, NULL, 0);
    sprintf(buf, "Select * from departments where idd > 100");
    mysql_query (&mysql_1, &buf[0]);
}
else
{
    // the parent process, it does something, and after, also sends a query to the database
    sprintf(buf, "Select id_door from doors where nr_door=23");
    mysql_query (&mysql, &buf[0]);
}