The basics of multi-process application development

1. To gain hands-on experience with fork(), exec(), and wait() system calls.
2. To master the basics of multi-process application development.
3. To appreciate the performance and fault-tolerance benefits of multi-process applications.
4. To implement a multi-process downloader application.

File downloaders are programs used for downloading files from the Internet. In this assignment you will implement two different types of multi-process downloaders (i.e., file downloaders that comprise multiple processes):
1. a serial file downloader, which downloads files one by one.
2. a parallel file downloader, which downloads multiple files in parallel.
You will then compare the performance of the two types of downloaders.
Both downloaders will use the Linux wget program to perform the actual downloading.
The usage of wget is simple: wget <FILE URL>. For example, running the following command from the command line:
wget
will download the Ubuntu Linux ISO image to the current directory. Before proceeding with the assignment, you may want to take a moment to experiment with the wget command.
In your program, the parent process shall first read the file urls.txt, which contains the URLs of the files to be downloaded. urls.txt shall have the following format:
For example:
Next, the parent process shall fork the child processes. Each created child process shall use the execlp() system call to replace its executable image with that of the wget program. The two types of downloaders are described in detail below.
The two downloaders shall be implemented as separate programs. The serial downloader program shall be called serial.c (or serial.cpp if you use C++). The parallel downloader program shall be called parallel.c (or parallel.cpp if you use C++).
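
As a starting point for the first step (the parent reading urls.txt), here is a minimal sketch. It assumes that urls.txt lists one URL per line; that format, the helper name read_urls, and the limits MAX_URLS and MAX_URL_LEN are assumptions made for illustration and are not part of the assignment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_URLS 64      /* hypothetical limit, not from the assignment */
#define MAX_URL_LEN 2048 /* hypothetical limit, not from the assignment */

/* Reads up to max URLs from the file at path, assuming one URL per line.
 * Blank lines are skipped. Returns the number of URLs stored in urls,
 * or -1 if the file cannot be opened. */
int read_urls(const char *path, char urls[][MAX_URL_LEN], int max)
{
    assert(path != NULL && urls != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror("fopen");
        return -1;
    }

    int count = 0;
    while (count < max && fgets(urls[count], MAX_URL_LEN, fp) != NULL) {
        /* Cut the line at the first carriage return or newline. */
        urls[count][strcspn(urls[count], "\r\n")] = '\0';
        if (urls[count][0] != '\0')
            count++; /* keep this slot only if the line was not blank */
    }

    fclose(fp);
    return count;
}
```

A caller would typically declare a static array such as char urls[MAX_URLS][MAX_URL_LEN], call read_urls("urls.txt", urls, MAX_URLS), and exit if the result is -1. Lines longer than MAX_URL_LEN - 1 characters would be split across slots, so a real program may want a larger limit or dynamic allocation.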

Serial Downloader

The serial downloader shall download files one by one. After the parent process has read and parsed the urls.txt file, it shall proceed as follows:
1. The parent process forks off a child process.
2. The child uses the execlp("/usr/bin/wget", "wget", <URL STRING1>, NULL) system call to replace its program with the wget program, which downloads the first file in urls.txt (i.e., the file at URL <URL STRING1>).
3. The parent calls wait() to block until the child exits.
4. The parent forks off another child process, which downloads the next file specified in urls.txt.
5. Repeat the above steps until all files are downloaded.
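
The steps above can be sketched as a fork/exec/wait loop. This is one possible shape, not a required design: the function name run_serial and its prog parameter are hypothetical, introduced so the loop can be tried with a harmless program before pointing it at wget. Passing "/usr/bin/wget" as prog gives a call equivalent in effect to the execlp() call in step 2, except that argv[0] is the full path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "prog args[i]" for i = 0..n-1, one child at a time.
 * Returns how many children exited with status 0. */
int run_serial(const char *prog, const char *const args[], int n)
{
    assert(prog != NULL && args != NULL);

    int ok = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {
            /* Child: replace this process image with prog. */
            execlp(prog, prog, args[i], (char *)NULL);
            perror("execlp"); /* reached only if execlp() failed */
            _exit(EXIT_FAILURE);
        }

        /* Parent: block until this child exits before forking the next. */
        int status;
        if (wait(&status) < 0) {
            perror("wait");
            exit(EXIT_FAILURE);
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Because the parent waits inside the loop, at most one child exists at any time, which is what makes this downloader serial.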

Parallel Downloader

1. The parent forks off n children, where n is the number of URLs in urls.txt.
2. Each child executes the execlp("/usr/bin/wget", "wget", <URL STRING>, NULL) system call, where each <URL STRING> is a distinct URL in urls.txt.
3. The parent calls wait() (n times in a row) and waits for all children to terminate.
4. The parent exits.
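
A minimal sketch of the parallel steps follows. As with any illustration here, the function name run_parallel and the prog parameter are hypothetical; they let the structure be exercised with a harmless program before substituting "/usr/bin/wget". The only structural difference from a serial loop is that all children are forked before the parent calls wait().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts "prog args[i]" for every i at once, then waits for all of them.
 * Returns how many children exited with status 0. */
int run_parallel(const char *prog, const char *const args[], int n)
{
    assert(prog != NULL && args != NULL);

    /* Steps 1 and 2: fork n children; each one execs immediately. */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {
            execlp(prog, prog, args[i], (char *)NULL);
            perror("execlp"); /* reached only if execlp() failed */
            _exit(EXIT_FAILURE);
        }
    }

    /* Step 3: call wait() n times, once per child, in any finish order. */
    int ok = 0;
    for (int i = 0; i < n; i++) {
        int status;
        if (wait(&status) < 0) {
            perror("wait");
            exit(EXIT_FAILURE);
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

wait() returns as soon as any child terminates, so the second loop does not need to know which child finishes first.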

Please note:
- While the parallel downloader executes, the outputs from different children may intermingle. This is acceptable.
- The fork.c file posted on Titanium provides an example of using the fork(), execlp(), and wait() system calls. Please feel free to modify it in order to complete the above tasks.
- Please make sure to error-check all system calls. This is very important in practice and can also save you hours of debugging frustration. fork(), execlp(), and wait() return -1 on error, so always check their return values and terminate your program if the return value is -1. For example:
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        exit(-1);
    }

The perror() function above will print out "fork" followed by an explanation of the error.

Performance Comparison

Use the time program to measure the execution time for the two downloaders. For example:
time ./serial
real 0m10.009s
user 0m0.008s
sys 0m0.000s

The line labeled real gives the elapsed (wall-clock) execution time in minutes and seconds. Please get the execution times for both downloaders using the following urls.txt file:
Your execution times should be submitted along with your code (see the section titled "Submission Guidelines").
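
The three figures reported by time can also be collected from inside a C program, which may help when experimenting. The sketch below is one way to do so under stated assumptions: the struct timing type and the time_child function are hypothetical names, elapsed time is taken with clock_gettime(), and the CPU time of the child is taken as the change in getrusage(RUSAGE_CHILDREN) across the call. It is not how the time program itself is implemented.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

struct timing { double real, user, sys; }; /* all in seconds */

static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Runs "prog arg" in a child, waits for it, and fills *out with the
 * elapsed time and the CPU time charged to the child.
 * Returns 0 on success, -1 if a system call in the parent fails. */
int time_child(const char *prog, const char *arg, struct timing *out)
{
    assert(prog != NULL && arg != NULL && out != NULL);

    struct timespec t0, t1;
    struct rusage r0, r1;
    if (getrusage(RUSAGE_CHILDREN, &r0) < 0 ||
        clock_gettime(CLOCK_MONOTONIC, &t0) < 0) {
        perror("getrusage/clock_gettime");
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execlp(prog, prog, arg, (char *)NULL);
        perror("execlp"); /* reached only if execlp() failed */
        _exit(EXIT_FAILURE);
    }
    if (wait(NULL) < 0) {
        perror("wait");
        return -1;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &t1) < 0 ||
        getrusage(RUSAGE_CHILDREN, &r1) < 0) {
        perror("clock_gettime/getrusage");
        return -1;
    }

    out->real = (double)(t1.tv_sec - t0.tv_sec) +
                (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    out->user = tv_seconds(r1.ru_utime) - tv_seconds(r0.ru_utime);
    out->sys  = tv_seconds(r1.ru_stime) - tv_seconds(r0.ru_stime);
    return 0;
}
```

For instance, calling time_child("sleep", "1", &t) and printing the three fields of t gives figures that can be compared against what time reports for the same command.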

In your submission, please include the answers to the following questions (you may need to do some research):
1. In the output of time, what is the difference between real, user, and sys times?
2. Which is longer: user time or sys time? Use your knowledge to explain why.
3. When downloading the files above, which downloader finishes faster? Why do you think that is?
4. Repeat the experiment for 10 files (any reasonably large files, e.g., 100 MB, will do). Is the downloader in the previous question still faster? If not, why do you think that is?

Technical Details

The program shall be run using the following command line:
where <FILE NAME> is the name of the file containing the strings, <NUMBER OF PROCESSES> is the number of child processes, and <KEY> is the string to search for. For example, ./multi-search strings.txt abcd 10 tells the program to split the task of searching for the string abcd in the file strings.txt among 10 child processes.

- This assignment MUST be completed using C or C++ on Linux.
- Please hand in your source code electronically.
- You must make sure that the code compiles and runs correctly.
- Write a README file (a text file; do not submit a .doc file) that contains:
    - Your name and email address.
    - The programming language you used (i.e., C or C++).
    - How to execute your program.
    - The execution times for both downloaders.
    - The answers to all questions above.
    - Whether you implemented the extra credit.
    - Anything special about your submission that we should take note of.
- Place all your files under one directory with a unique name (such as p1-[userid] for assignment 1, e.g., p1-m1).
- Tar the contents of this directory using the following command: tar -cvf [directory name].tar [directory name]. E.g., tar -cvf p1-m1.tar p1-m1/

