A server listens on a fixed port. (Port 80 is the standard, but your web server will have to choose a different port.) A client connects to that TCP port, and the server must accept the connection. The client sends an HTTP request stating what file it wishes to retrieve, along with the version of the protocol that it understands.
    GET /index.html HTTP/1.0
The server examines this request, and then sends a response header:
    HTTP/1.0 200 OK
    Date: Tue, 11 Jan 2005 21:31:45 GMT
    Server: Apache/1.3.27
    Connection: close
    Content-Type: text/html
...followed by an extra newline and the actual data of the file in question. After sending the file data, the server closes the connection.
If you are curious, you can speak to web servers directly without an intervening browser by using the telnet tool. Try this to see the raw output of a web server:
    % telnet www.cse.nd.edu 80
    GET /index.html HTTP/1.0
    (type return one more time)
Most HTTP requests to the CSE web server are for static (unchanging) content stored in plain files. However, a URL can just as easily refer to dynamic (changing) content. The "file" portion of a URL can refer to a program that must be run to generate a web page on the fly. This would be common in a web server found at an online auction site. The web server might run a program that queries the auction database to determine the state of a sale and produce the appropriate web page. Most real-world web servers have a mix of static and dynamic content.
Note that there are several other ways in which a server can respond. If the client requests a file that does not exist, the server will respond:
    HTTP/1.0 404 Not Found
Or, if the client does not have access:
    HTTP/1.0 403 Forbidden
Or, if the server wants to redirect the client elsewhere, it says:
    HTTP/1.0 307 Temporary Redirect
    Location: http://other.server.edu/path
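The responses above all share one shape: a status line, optional header lines, and an empty line. A minimal sketch of a helper that formats them follows. The name format_header and its parameters are hypothetical and not part of the provided tcp module; snprintf is used instead of sprintf so the buffer cannot overflow; and each line ends in \r\n, which is what the HTTP specification asks for (many clients also tolerate a bare newline).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write a response header into buf (capacity cap bytes).
 * content_type may be NULL, in which case no Content-Type line is sent.
 * Returns the number of bytes written (not counting the NUL),
 * or -1 if the header did not fit.
 * Hypothetical helper for illustration only.
 */
int format_header(char *buf, size_t cap, int code,
                  const char *reason, const char *content_type)
{
    int n;

    if (content_type != NULL) {
        n = snprintf(buf, cap,
                     "HTTP/1.0 %d %s\r\n"
                     "Connection: close\r\n"
                     "Content-Type: %s\r\n"
                     "\r\n",
                     code, reason, content_type);
    } else {
        n = snprintf(buf, cap,
                     "HTTP/1.0 %d %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     code, reason);
    }

    /* snprintf reports the length it wanted; reject truncated output. */
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    return n;
}
```

A call such as format_header(buf, sizeof buf, 404, "Not Found", NULL) would produce the 404 response shown above, ready to be written to the connection.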
The main program must call tcp_listen to listen on a particular port number, and then in a loop, accept a connection with tcp_accept, handle the request, and then drop the connection with tcp_close. To handle each request, you must read the request line with tcp_readline, open the proper file, transmit it to the client, then close the file.
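Once the request line has been read, sscanf can split it into its three fields. The sketch below assumes the line is already in a NUL-terminated string (the exact signature of tcp_readline is documented in tcp.h and is not reproduced here). The function name parse_request_line and the 256-byte path limit are hypothetical choices for illustration.

```c
#include <stdio.h>
#include <string.h>

#define PATH_MAX_LEN 256 /* hypothetical limit chosen for this sketch */

/*
 * Parse a request line such as "GET /index.html HTTP/1.0".
 * On success, store the requested path in path (which must hold
 * PATH_MAX_LEN bytes) and return 0. Return -1 if the line is
 * malformed or the method is not GET.
 */
int parse_request_line(const char *line, char *path)
{
    char method[8];
    char version[16];

    /* Field widths leave room for the terminating NUL in each buffer. */
    if (sscanf(line, "%7s %255s %15s", method, path, version) != 3) {
        return -1;
    }
    if (strcmp(method, "GET") != 0) {
        return -1;
    }
    if (strncmp(version, "HTTP/", 5) != 0) {
        return -1;
    }
    return 0;
}
```

Because %s stops at whitespace, a trailing carriage return or newline on the request line does not end up in the version field.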
In single mode, the server will simply handle one request at a time, then close the connection, and wait for another connection to be accepted. In fork mode, the server should call fork to create a new child process to handle the request. The parent should then immediately close its copy of the connection and attempt to accept a new one.
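The fork-mode pattern can be sketched as follows. This is an illustration under assumptions, not the required design: the connection is represented here as a plain file descriptor, whereas the provided tcp module may use its own connection type, and the names serve_in_child, handler_fn, and say_hello are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handler type: does all the work for one connection. */
typedef void (*handler_fn)(int conn);

/* Example handler for this sketch: sends a fixed message. */
void say_hello(int conn)
{
    static const char msg[] = "hello\n";
    if (write(conn, msg, sizeof msg - 1) < 0) {
        perror("write");
    }
}

/*
 * Fork a child that runs handle(conn) and then exits; the child never
 * returns to the caller. In the parent, return the child's pid, or -1
 * if fork failed. The parent still holds its own copy of conn and is
 * responsible for closing it.
 */
pid_t serve_in_child(int conn, handler_fn handle)
{
    pid_t pid;

    /* Flush pending log output so the child does not print it again. */
    fflush(stdout);

    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        handle(conn);
        close(conn);
        exit(0);
    }
    return pid;
}
```

In the accept loop, the parent would call serve_in_child, close its own copy of the connection, and go straight back to accepting. Note that finished children remain as zombie processes until the parent waits for them, so a long-running server also needs some way to reap them.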
Note that different files need to be handled in different ways. Your server should accept requests for files ending in html or gif, and transmit them to the browser with a Content-Type of text/html or image/gif, respectively. If the user requests a file ending in cgi, then your server must instead execute the program and return its output with a Content-Type of text/plain. (This is easier than it sounds: look up the popen function.) If the web browser requests a file with any other extension, the web server should respond with a well-formed 403 Forbidden code. This is important, because it will prevent other people from using your web server to view your source code!
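The extension rules and the popen hint can be sketched as two small helpers. Both names (content_type_for and run_program) are hypothetical, and run_program collects output into a fixed buffer only to keep the example short; a real server would more likely forward each chunk to the client as it is read.

```c
#define _POSIX_C_SOURCE 200809L /* for popen and pclose */
#include <stdio.h>
#include <string.h>

/*
 * Return the Content-Type for a requested path, or NULL if the
 * extension is not one the server should serve (answer 403).
 */
const char *content_type_for(const char *path)
{
    const char *dot = strrchr(path, '.');

    if (dot == NULL) return NULL;
    if (strcmp(dot, ".html") == 0) return "text/html";
    if (strcmp(dot, ".gif") == 0) return "image/gif";
    if (strcmp(dot, ".cgi") == 0) return "text/plain";
    return NULL;
}

/*
 * Run command with popen and keep the first cap - 1 bytes of its
 * output in buf, NUL-terminated. Any further output is read and
 * discarded. Return the number of bytes kept, or -1 if the command
 * could not be started or did not exit successfully.
 */
long run_program(const char *command, char *buf, size_t cap)
{
    char chunk[512];
    size_t total = 0;
    size_t n;
    FILE *fp;

    if (cap == 0) return -1;

    fp = popen(command, "r");
    if (fp == NULL) {
        perror("popen");
        return -1;
    }
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        size_t room = cap - 1 - total;
        size_t take = n < room ? n : room;
        memcpy(buf + total, chunk, take);
        total += take;
    }
    buf[total] = '\0';

    if (pclose(fp) != 0) return -1;
    return (long)total;
}
```

A request for /webserver.c makes content_type_for return NULL, which is the cue to send 403 Forbidden rather than the file.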
In a distributed system, it is absolutely vital that you detect and respond to errors. Your server must check the result of every operation that it attempts, and return an appropriate error message. For example, if the server cannot listen on the desired port, it should print a message to the standard output and exit. Or, if the server cannot provide the browser with the desired file, then it must return an appropriate error code to the web browser. There are other error conditions, and it is your job to identify and handle them all correctly!
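As one example of checking a result and turning the failure into a response, the sketch below opens the requested file and maps the reason for a failed fopen to a status code. The function names are hypothetical, and the fallback to 500 (Internal Server Error, a standard HTTP status not discussed above) is an assumption about how to treat unexpected failures.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map an errno value from a failed fopen to an HTTP status code. */
int status_for_errno(int err)
{
    switch (err) {
    case ENOENT:
        return 404; /* no such file */
    case EACCES:
        return 403; /* permission denied */
    default:
        return 500; /* assumption: anything else is a server error */
    }
}

/*
 * Try to open path for reading. On success, return the stream and set
 * *status to 200. On failure, log the reason, set *status to the
 * matching error code, and return NULL.
 */
FILE *open_for_request(const char *path, int *status)
{
    FILE *fp = fopen(path, "rb");

    if (fp == NULL) {
        int err = errno; /* save it: printf may change errno */
        printf("cannot open %s: %s\n", path, strerror(err));
        *status = status_for_errno(err);
        return NULL;
    }
    *status = 200;
    return fp;
}
```

The same habit applies to every other call in the server: test the return value, log what went wrong, and decide what the browser should be told.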
I will provide you with a module for managing TCP connections. Brief documentation is given in the header file tcp.h, and we will discuss more of the details in class. Download the following files to get started:
printf, sprintf, sscanf, fopen, fread, fclose, popen, pclose, atoi, rand, srand, fork, exit
I recommend that you use extensive logging to the console with printf, so that you can see exactly what the browser sends to the server. The logging has no effect on the web browser, so you can leave it in your server permanently.
Turn in three files (webserver.c, webmux.c, and Makefile) to the dropbox directory:
    /afs/nd.edu/courses/cse/cse40771.01/dropbox/YOURNAME/a3
Your grade will be based on the following: