(Warm Up) Project I: System Calls and Error Checking

CSE 34341 - Silicon Valley - Spring 2018

First, read the general instructions for assignments. Make sure you are using the correct compiler on the correct machine.

This project is a warm up assignment for the course. The basic concept is very simple: to write a program that copies a file from one place to another. However, the main challenge of engineering operating systems is dealing with errors and unexpected conditions. Thus, the main focus of this assignment will be on the correct handling of errors. The goals of this project are:

  • To review your knowledge of basic C programming.
  • To learn the most essential Unix system calls.
  • To gain experience in rigorous error handling.
  • Essential Requirements

    You will write a program called copyit which simply copies a file from one place to another. The program will be invoked as follows:
    copyit SourceFile TargetFile
    
    copyit must create an exact copy of SourceFile under the new name TargetFile. Upon successful completion, copyit should report the total number of bytes copied and exit with result zero. For example:
    copyit: Copied 38475 bytes from file foobar to bizbaz.
    
    If the copy takes longer than one second, then every second the program will emit a short message:
    copyit: still copying...
    copyit: still copying...
    copyit: still copying...
    
    If copyit encounters eny kind of error or user mistake, it must immediately stop and emit a message that states the program name, the failing operation, and the reason for the failure, and then exit with result 1. For example:
    copyit: Couldn't open file foobar: Permission Denied.
    copyit: Couldn't write to file bizbaz: Disk Full.
    

    If the program is invoked incorrectly, then it should immediately exit with a helpful message:

    copyit: Too many arguments!
    usage: copyit <sourcefile> <targetfile>
    
    In short, there should be no possible way to invoke the program that results in a segmentation fault, a cryptic message, or a silent failure. Either the program runs successfully to completion, or it emits a helpful and detailed error message to the user.

    System Calls

    To carry out this assignment, you will need to learn about the following system calls:
    open, creat, read, write, close, strerror, errno, exit, signal, alarm
    
    Manual pages ("man pages") provide the complete reference documentation for system calls. They are available on any Linux machine by typing man with the section number and topic name. Section 1 is for programs, section 2 for system calls, section 3 for C library functions. For example man 2 open gives you the man page for open. You can also use this online service which has the same information.

    As you probably already know, man pages are a little difficult to digest, because they give complete information about one call, but not how they all fit together. However, with a little practice, you can become an expert with man pages. Consider the man page for open. At the top, it tells you what include files are needed in order to use open:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    
    It also gives the the prototype for several variations of the system call:
    int open(const char *pathname, int flags);
    int open(const char *pathname, int flags, mode_t mode);
    int creat(const char *pathname, mode_t mode);
    
    The documentation contains many more details than you need, but it is usually a good bet to read the first paragraph, and then skim ahead until you find what you are looking for. It is also always fruitful to read the RETURN VALUE section, and to use the SEE ALSO section to look up related system calls.

    signal in particular is a little confusing, so it's easiest if I give an example. signal tells the operating system that you want a particular function to be called whenever a signal arrives. We are going to use the SIGALRM alarm signal to trigger the "still copying" message each second. So, you need to write a short function that displays a message when the signal arrives:

    void display_message( int s )
    {
    	printf("copyit: still copying...\n");
    }
    
    Then, inside of main, use signal to connect the function to SIGALRM:
    #include <signal.h>
    signal(SIGALRM,display_message);
    
    Now, read the man page for alarm, which arranges for the signal to be delivered in a certain amount of time. Note that you will need to call alarm more than once in your program...

    Handling Errors

    The basic outline of your program will look like this:
    open the source file
    create the target file
    loop {
            read a bit of data from the source file
            write a bit of data to the target file
    }
    close both files
    print success message
    
    In practice, it is going to be more complicated.

    An operating system (and any program that deals with it) must be vigilant in dealing with errors and unexpected conditions. End users constantly attempt operations that are illegal or make no sense. Digital devices behave in unpredictable ways. So, when you write operating system code, you must always check return codes and take an appropriate action on failure.

    How do you know if a system call succeeded or failed? In general read the RETURN VALUE section of the manual page, and it will tell you exactly how success or failure is indicated. However, nearly all Unix system calls that return an integer (open, read, write, close, etc.) have the following convention:

  • If the call succeeded, it returns an integer greater than or equal to zero.
  • If the call failed, it returns an integer less than zero, and sets the global variable errno to reflect the reason for the error.
  • And, nearly all C library calls that return a pointer (malloc, fopen, etc.) have a slightly different convention:
  • If the call succeeded, it returns a non-null pointer.
  • If the call failed, it returns null, and sets the global variable errno to reflect the reason for the error.
  • The errno variable is simply an integer that explains the reason behind an errno. Each integer value is given a macro name that makes it easy to remember, like EPERM for permission denied. All of the error types are defined in the file /usr/include/asm/errno.h. Here are a few of them:
    #define EPERM            1      /* Operation not permitted */
    #define ENOENT           2      /* No such file or directory */
    #define ESRCH            3      /* No such process */
    #define EINTR            4      /* Interrupted system call */
    ...
    
    You can check for specific kinds of errors like this:
    fd = open(filename,O_RDONLY,0);
    if(fd<0) {
    	 if(errno==EPERM) {
    		printf("Error: Permission Denied\n");
    	} else {
    		printf("Some other error.\n");
    		...
    	}
    	exit(1);
    }
    
    This would get rather tedious with 129 different error messages. Instead, use the strerror routine to convert errno into a string, and print out a descriptive message like this:
    fd = open(filename,O_RDONLY,0);
    if(fd<0) {
    	 printf("Unable to open %s: %s\n",filename,strerror(errno));
    	 exit(1);
    }
    
    With read and write, you must be especially careful. If you request to read count bytes of data like this:
    int result = read(fd,buffer,count);
    
    There are several possible outcomes. If read was able to access some data, it will return the number of bytes actually read. The number might be as high as count, but it could also be smaller. For example, if you request to read 4096 bytes, but there are only 40 bytes remaining in the file, read will return 40. If there is nothing more left in the file, read will return zero. If there is an error, the result will be less than zero, as above. write operates in a very similar way.

    There is still one more wrinkle. If your read or write operation is interrupted by a signal (e.g. the "still copying" message), the operating system will abort it prematurely. The call will return failure and set errno to EINTR, indicating that it was interrupted. In this case, the call should simply be tried again without changing any thing.

    So, your program should look like this overall:

    set up the periodic message
    
    open the source file or exit with an error
    create the target file or exit with an error
    
    loop over {
         read a bit of data from the source file
         if the read was interrupted, try it again
         if there was an error reading, exit with an error
         if no data left, end the loop
    
         write a bit of data to the target file
         if the write was interrupted, try it again
         if not all the data was written, exit with an error
    }
    close both files
    print success message
    

    Testing

    Make sure to test your program on a wide variety of conditions. Test small files and big files. Make sure to copy a really big file to make sure that your "still copying" message works and the copy recovers correctly from EINTR errors. How do you know if the file copy worked correctly? Use the program md5sum to take the checksum of both files, and double check that it matches:

    % md5sum /tmp/SourceFile
    b92891465b9617ae76dfff2f1096fc97  /tmp/SourceFile
    % md5sum /tmp/TargetFile
    b92891465b9617ae76dfff2f1096fc97  /tmp/TargetFile
    

    Grading

    Your grade will be based on:
  • Correct functioning of the file copy. (50%)
  • Correct handling of all error conditions. (40%)
  • Good coding style, such as clear formatting, sensible variable names, and useful comments. (10%)
  • Turning In

    Please review the general instructions for turning in.

    This assignment is due at 11:59PM on Thursday, 25 January. Late assignments are not accepted.

    Extra Credit

    For ten percent extra credit, write a second version of copyit that copies directories recursively. Turn in this version as copyit_extracredit.c.