Project 03: Message Queue

Overview

The third project is to build a message queue client that interacts with a rudimentary pub/sub system using POSIX threads and network sockets via a RESTful API. In a pub/sub system there is usually one server and multiple clients:

As shown above, a typical pub/sub system has a server that maintains a collection of topics, which serve as endpoints for messages, and queues, which store the messages corresponding to individual clients or groups. Clients in the pub/sub system connect to this server and perform the following operations:

(1) SUBSCRIBE: This associates a queue to a particular topic.

In the example above, Client A sends an HTTP PUT command to subscribe the "Spidey" queue to the "Marvel" topic. This means that any messages sent to the "Marvel" topic will be automatically forwarded to the "Spidey" queue.

Note that clients can subscribe to as many topics as they wish. However, they will only receive messages published after they have subscribed; any messages sent to the topic before they subscribe will not be accessible.

(2) PUBLISH: This posts a message to a particular topic.

In the example above, Client B sends an HTTP PUT command to publish a message to the "Marvel" topic with the message body: "With great power, comes great responsibility". Internally, the pub/sub server will see that "Spidey" is subscribed to the "Marvel" topic, and thus it will forward the message to the "Spidey" queue.

(3) RETRIEVE: This fetches one message in the queue.

In the example above, Client A sends an HTTP GET command to retrieve a message from the "Spidey" queue. Internally, the pub/sub server will fetch one message from the "Spidey" queue and return it as the response to the HTTP request.

Note that when a client tries to retrieve a message from an empty queue, the pub/sub server will delay responding to the request until there is something in the queue. This means that a retrieve operation is a blocking action for the client.

Working in groups of one, two, or three people, you are to create a library that utilizes POSIX threads and concurrent data structures to implement a client for the described pub/sub system by midnight on Monday, October 8, 2018. Additionally, you are to utilize this library to build and demonstrate a distributed and parallel application of your choosing by Friday, October 12, 2018.

More details about this project and your deliverables are described below.

Publisher / Subscriber

As mentioned in class, pub/sub systems are used on many real-world platforms such as Google Cloud and Amazon Web Services. These systems allow developers to construct distributed and parallel applications that operate concurrently by utilizing both message passing and event-driven programming paradigms.

Protocol

The communication between the client and server utilizes HTTP to perform RESTful operations:

Operation	Description	Request	Response
Subscribe	Associates a `$QUEUE` with a particular `$TOPIC`.	`PUT /subscription/$QUEUE/$TOPIC`	If `$QUEUE` does not exist: `404` with the message `There is no queue named: $QUEUE`. Otherwise: `200` with the message `Subscribed queue ($QUEUE) to topic ($TOPIC)`.
Unsubscribe	Disassociates a `$QUEUE` from a particular `$TOPIC`.	`DELETE /subscription/$QUEUE/$TOPIC`	If `$QUEUE` does not exist: `404` with the message `There is no queue named: $QUEUE`. Otherwise: `200` with the message `Unsubscribed queue ($QUEUE) from topic ($TOPIC)`.
Publish	Posts a message `$BODY` to a particular `$TOPIC`.	`PUT /topic/$TOPIC` with the body `$BODY`	If there are no subscribers to `$TOPIC`: `404` with the message `There are no subscribers for topic: $TOPIC`. Otherwise: `200` with the message `Published message ($BYTES bytes) to $SUBSCRIBERS subscribers of $TOPICS`.
Retrieve	Fetches a message `$BODY` from a particular `$QUEUE`.	`GET /queue/$QUEUE`	If there is no `$QUEUE`: `404` with the message `There is no queue named: $QUEUE`. Otherwise: `200` with the message `$BODY`.
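
Since every operation in the table reports its outcome through the HTTP status code, a client has to pull that code out of the first line of the response. The following is a minimal sketch of one way to do that, assuming the server answers with a standard status line such as `HTTP/1.0 200 OK`. The function name `parse_status_code` is hypothetical and is not part of the provided headers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the project skeleton):
 * extract the numeric status code from an HTTP status line such as
 * "HTTP/1.0 200 OK\r\n". Returns the code, or -1 if the line does
 * not look like a status line. */
int parse_status_code(const char *line) {
    int major = 0;
    int minor = 0;
    int code  = 0;

    assert(line != NULL);

    /* The status line has the form: HTTP/<major>.<minor> <code> <reason> */
    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3) {
        return -1;
    }
    return code;
}
```

A caller could then treat `200` as success and `404` as the "no such queue" or "no subscribers" case described in the table, and handle anything else as an unexpected response.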

Some notes about this protocol:

Communication should be done via streaming network sockets.
Each transaction is an HTTP request in the form:
```
$METHOD $URI HTTP/1.0\r\n
Content-Length: $BYTES\r\n
\r\n
$BODY
```
Most of the RESTful operations above only need the first line, followed by the blank line that ends the headers. For instance, to subscribe the "Spidey" queue to the "Marvel" topic, it is sufficient to send:
```
PUT /subscription/Spidey/Marvel HTTP/1.0\r\n
\r\n
```
However, to publish a message, we must send the full HTTP request, including the Content-Length header and the body:
```
PUT /topic/Marvel HTTP/1.0\r\n
Content-Length: 44\r\n
\r\n
With great power, comes great responsibility
```
In keeping with the RESTful programming paradigm, the client will need to open a new connection to the server for each operation.
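
The request format in the notes above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper name `format_request` and its buffer-based signature are hypothetical (the real `request_write` in the skeleton may look quite different), and the body is treated as a NUL-terminated string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not the skeleton's request_write):
 * format an HTTP/1.0 request into buf. When body is NULL or empty,
 * only the request line and the terminating blank line are written.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * buf is too small. */
int format_request(char *buf, size_t size, const char *method,
                   const char *uri, const char *body) {
    int n;

    assert(buf != NULL && method != NULL && uri != NULL);

    if (body != NULL && body[0] != '\0') {
        /* Content-Length counts the bytes of the body only. */
        n = snprintf(buf, size,
                     "%s %s HTTP/1.0\r\nContent-Length: %zu\r\n\r\n%s",
                     method, uri, strlen(body), body);
    } else {
        n = snprintf(buf, size, "%s %s HTTP/1.0\r\n\r\n", method, uri);
    }

    /* snprintf reports the length it wanted; detect truncation. */
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    return n;
}
```

For example, calling `format_request(buf, sizeof(buf), "PUT", "/topic/Marvel", "With great power, comes great responsibility")` produces the publish request shown above, with a `Content-Length` of 44.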

Feel free to use curl or nc to play around with either the client or the server.

Server

Due to the limited time frame, a Python server is provided to you in the Project 03 repository:

# Usage
$ ./bin/mq_server.py --help
...

# Start Server on port 9123
$ ./bin/mq_server.py --port=9123

If you take a look at the server source code, you will see that it uses the Tornado framework, which provides event-based concurrency for overlapping compute and I/O. This means that the server will handle as many clients as system resources allow and will continuously process requests until the client disconnects (implicitly or explicitly).

Client

The main goal of this project is to create a client library (i.e. lib/libmq_client.a) that communicates with a pub/sub server as described above. This library will be used by a variety of test programs and a user application of your own design.

Concurrency

As shown above, the client library should utilize multiple POSIX threads to enable concurrent publishing and retrieving of messages. This means that at any one time, the library should be able to do the following things concurrently:

Publishing: The library should be able to publish any messages that have been queued up in an outgoing queue. That is, any PUBLISH or SUBSCRIBE requests should go to the outgoing queue rather than directly to the pub/sub server. It will be the job of a pusher thread to send the messages of the outgoing queue to the server.
Retrieving: The library should be able to retrieve any messages that the server has available to the client and place them into an incoming queue. That is, a puller thread should continuously retrieve messages from the pub/sub server and place them in the incoming queue.

Because of this, the library should have at least two POSIX threads. To coordinate data access between these threads, you should utilize concurrent data structures. This architecture will allow the client library to operate asynchronously and allow the user to set up their own event loop or to simply overlap compute and I/O in any fashion they want.

Back to the Queue

Hide most of the complexity of multi-threaded programming by using concurrent data structures such as the queue we created in Lecture 11: Condition Variables. Note, you can use any combination of locks, condition variables, and semaphores to synchronize the threads in your client library.
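
As a rough illustration of the idea, here is a minimal sketch of a blocking queue built from one mutex and one condition variable. It is not the queue from lecture or the one declared in include/mq/queue.h: the `SketchQueue` type and the `sketch_queue_*` names are hypothetical, and semaphores would work just as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical types and names, for illustration only. */
typedef struct SketchNode {
    void              *data;
    struct SketchNode *next;
} SketchNode;

typedef struct {
    SketchNode     *head;       /* Oldest element (next to pop)     */
    SketchNode     *tail;       /* Newest element (last pushed)     */
    size_t          size;
    pthread_mutex_t lock;       /* Protects head, tail, and size    */
    pthread_cond_t  not_empty;  /* Signaled whenever data is pushed */
} SketchQueue;

SketchQueue *sketch_queue_create(void) {
    SketchQueue *q = calloc(1, sizeof(SketchQueue));
    if (q == NULL) {
        return NULL;
    }
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    return q;
}

/* Append data to the back of the queue; returns 0 on success. */
int sketch_queue_push(SketchQueue *q, void *data) {
    assert(q != NULL);
    SketchNode *node = malloc(sizeof(SketchNode));
    if (node == NULL) {
        return -1;
    }
    node->data = data;
    node->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) {
        q->tail->next = node;
    } else {
        q->head = node;
    }
    q->tail = node;
    q->size++;
    pthread_cond_signal(&q->not_empty);     /* Wake one waiting popper */
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Remove and return the front element, blocking while the queue is empty. */
void *sketch_queue_pop(SketchQueue *q) {
    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL) {               /* Loop guards against spurious wakeups */
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    SketchNode *node = q->head;
    q->head = node->next;
    if (q->head == NULL) {
        q->tail = NULL;
    }
    q->size--;
    pthread_mutex_unlock(&q->lock);

    void *data = node->data;
    free(node);
    return data;
}

/* Free the queue and any remaining nodes (but not their payloads). */
void sketch_queue_delete(SketchQueue *q) {
    if (q == NULL) {
        return;
    }
    while (q->head != NULL) {
        SketchNode *next = q->head->next;
        free(q->head);
        q->head = next;
    }
    pthread_mutex_destroy(&q->lock);
    pthread_cond_destroy(&q->not_empty);
    free(q);
}
```

Because all locking happens inside push and pop, the threads that use the queue never touch the mutex directly, which is what keeps the rest of the library simple.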

Tests

To test the client library, we have provided a variety of tests:

$ make test
Compiling src/client/socket.o
Compiling src/client/request.o
Compiling src/client/queue.o
Compiling src/client/client.o
Linking   lib/libmq_client.a
Linking   bin/test_queue_functional
Linking   bin/test_queue_unit
Linking   bin/test_echo_client
Testing   Queue (Unit)
Testing   Queue (Functional)
Testing   Echo Client

The test_queue_unit program is a unit test for the concurrent queue you will be implementing. It is accompanied by a functional test (i.e. test_queue_functional) that exercises the queue with multiple threads. Finally, we include a basic echo client test (i.e. test_echo_client) that will use your message queue library to perform basic operations.

Feel free to augment or add to these tests to help verify the correctness of your library.

User Application

Once you have implemented the client library and are confident in your testing, you must create a distributed and parallel application that utilizes the pub/sub system. Here are some possible applications:

Real-time message service (e.g. chat)
System monitoring service (e.g. nagios)
Parallel data processing service (e.g. filtering, aggregation)
Internet-of-Things messaging (e.g. sending messages between some Raspberry Pis)
Web crawler.

The exact user application is up to you. However, the application must utilize your client library and must utilize multiple threads correctly.

Deliverables

As noted above, you are to work in groups of one or two (three is permitted, but discouraged) to implement libmq_client.a. You must use C99 (not C++) as the implementation language. Any test scripts or auxiliary tools can be written in any reasonable scripting language.

Timeline

Here is a timeline of events related to this project:

Date	Event
Wednesday, September 26	Project description and repository are available.
Monday, October 8	Client library is due (pushed to GitLab to `master` branch).
Friday, October 12	Demonstrations of user application must be completed.

Repository

To start this project, one group member must fork the Project 03 repository on GitLab:

https://gitlab.com/nd-cse-30341-fa18/cse-30341-fa18-project03

Once this repository has been forked, follow the instructions from Reading 00 to:

Make the repository private.
Configure access to the repository.

Make sure you add all the members of the team in addition to the instructional staff.

Source Code

As you can see, the base Project 03 repository contains a README.md file and the following folder hierarchy:

project03
    \_  Makefile        # This is the project Makefile
    \_  bin             # This contains the executables
    \_  include
        \_  mq          # This contains the mq client header files
    \_  lib             # This contains the mq client library
    \_  src
        \_  client      # This contains the mq client source code
        \_  tests       # This contains any test source code / scripts

You must maintain this folder structure for your project and place files in their appropriate place.

Compiling

To help you get started, we have provided you with a Makefile with all the necessary targets:

$ make                  # Builds lib/libmq_client.a
Compiling src/client/socket.o
Compiling src/client/request.o
Compiling src/client/queue.o
Compiling src/client/client.o
Linking   lib/libmq_client.a

$ make test             # Builds and runs test programs
Compiling src/tests/test_queue_functional.o
Linking   bin/test_queue_functional
Compiling src/tests/test_queue_unit.o
Linking   bin/test_queue_unit
Compiling src/tests/test_echo_client.o
Linking   bin/test_echo_client
Testing   Queue (Functional)
Testing   Queue (Unit)
Testing   Echo Client

$ make clean            # Removes all targets and intermediate objects
Removing  objects
Removing  libraries
Removing  test programs

Note, you will need to modify this Makefile to support your user application.

K.I.S.S.

While the exact organization of the project code is up to you, keep in mind that you will be graded in part on coding style, cleanliness, and organization. This means your code should be consistently formatted, not contain any dead code, have reasonable comments, and use appropriate naming, among other things:

Break long functions into smaller functions.
Make sure each function does one thing and does it well.
Abstract, but don't overdo it.

Please refer to these Coding Style slides for some tips and guidelines on coding style expectations.

Running

As noted above, you are provided with a Python implementation of the pub/sub server. To run it, you just specify the port you want to use (by default it is 9620):

$ ./bin/mq_server.py --port=9456    # Start server on port 9456

Since you are only writing a library, we have provided a simple test client application called test_echo_client that uses your library. Once the server is up and running, you can use the test program by doing the following:

$ ./bin/test_echo_client localhost 9456 # Contact localhost on port 9456

Implementation

All of the C99 header files are in the include/mq folder while the C99 source code for the client library is in the src/client directory. To help you get started, parts of the project are already implemented:

[~] include/mq/client.h     # MQ client header (mostly implemented)
[x] include/mq/logging.h    # MQ logging header (implemented)
[~] include/mq/queue.h      # MQ queue header (mostly implemented)
[x] include/mq/request.h    # MQ request header (implemented)
[x] include/mq/socket.h     # MQ socket header (implemented)
[x] include/mq/string.h     # MQ string header (implemented)
[x] include/mq/thread.h     # MQ thread header (implemented)
[ ] src/client/client.c     # MQ client implementation (not implemented)
[ ] src/client/queue.c      # MQ queue implementation (not implemented)
[ ] src/client/request.c    # MQ request implementation (not implemented)
[x] src/client/socket.c     # MQ socket implementation (implemented)

Basically, the socket code along with the basic code skeleton is provided to you. However, you must implement the client, queue, and request structures and functionality. Each of the functions in the incomplete files above has comments that describe what needs to be done.

You will need to examine these source files and complete the implementation of the message queue client library. To do so, you will first need to implement a basic concurrent Queue structure and utilize this in your MessageQueue client structure:

include/mq/client.h, include/mq/queue.h: While most of the headers are complete, these two are considered only mostly implemented because they lack any mutexes, condition variables, or semaphores. That is, they do not currently utilize any synchronization primitives. You will need to determine which ones you wish to use and how.

To help simplify your code a bit, we have provided include/mq/thread.h, which contains macros and type definitions for working with POSIX threads. Feel free to either use these or ignore them.

src/client/request.c: This file contains the implementation of a Request structure, which records the basic components of an HTTP request:

a. method: This is the HTTP method to perform (i.e. GET, PUT, DELETE).

b. uri: This is the resource to access (e.g. /topic/$TOPIC or /queue/$QUEUE).

c. body: This is the body of the HTTP message.

src/client/queue.c: This file contains the implementation of a concurrent Queue structure, which implements a basic monitor for synchronized access to the Queue via push and pop operations. You will need to think carefully about how and when to use your synchronization primitives.

src/client/client.c: This file contains the implementation of the MessageQueue client structure, which is the object the user will interface with. This is where you will define functions that wrap and implement the RESTful API described above. Likewise, this is where you will need to implement the POSIX threads that run in the background (i.e. pusher and puller).
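
To make the background-thread idea more concrete, here is a minimal sketch of a pusher thread that drains an outgoing buffer and stops when it sees a sentinel. Everything in it is an assumption made for illustration: the `Outbox` type, the `outbox_*` and `run_pusher_demo` names, the fixed-size ring buffer, and the use of a NULL message as the shutdown signal are not taken from the project skeleton. The actual network send is replaced by a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define OUTBOX_SLOTS 8   /* Hypothetical capacity of the outgoing buffer */

/* Hypothetical bounded outgoing buffer shared by the user and the pusher. */
typedef struct {
    const char     *items[OUTBOX_SLOTS];
    size_t          head;        /* Index of the oldest message          */
    size_t          count;       /* Number of messages currently stored  */
    size_t          sent;        /* Messages handled by the pusher       */
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    pthread_cond_t  not_full;
} Outbox;

void outbox_init(Outbox *o) {
    o->head = o->count = o->sent = 0;
    pthread_mutex_init(&o->lock, NULL);
    pthread_cond_init(&o->not_empty, NULL);
    pthread_cond_init(&o->not_full, NULL);
}

/* Add a message, blocking while the buffer is full. */
void outbox_put(Outbox *o, const char *msg) {
    pthread_mutex_lock(&o->lock);
    while (o->count == OUTBOX_SLOTS) {
        pthread_cond_wait(&o->not_full, &o->lock);
    }
    o->items[(o->head + o->count) % OUTBOX_SLOTS] = msg;
    o->count++;
    pthread_cond_signal(&o->not_empty);
    pthread_mutex_unlock(&o->lock);
}

/* Remove the oldest message, blocking while the buffer is empty. */
const char *outbox_take(Outbox *o) {
    pthread_mutex_lock(&o->lock);
    while (o->count == 0) {
        pthread_cond_wait(&o->not_empty, &o->lock);
    }
    const char *msg = o->items[o->head];
    o->head = (o->head + 1) % OUTBOX_SLOTS;
    o->count--;
    pthread_cond_signal(&o->not_full);
    pthread_mutex_unlock(&o->lock);
    return msg;
}

/* Pusher thread: handle messages until the NULL sentinel arrives. */
void *pusher_thread(void *arg) {
    Outbox     *o = arg;
    const char *msg;

    while ((msg = outbox_take(o)) != NULL) {
        /* A real pusher would connect to the server and write the
         * HTTP request here; this sketch only counts the message. */
        o->sent++;
    }
    return NULL;
}

/* Start a pusher, feed it n messages, shut it down, and report the count. */
size_t run_pusher_demo(const char **msgs, size_t n) {
    Outbox    o;
    pthread_t thread;
    size_t    sent = 0;

    assert(msgs != NULL || n == 0);
    outbox_init(&o);
    if (pthread_create(&thread, NULL, pusher_thread, &o) == 0) {
        for (size_t i = 0; i < n; i++) {
            outbox_put(&o, msgs[i]);
        }
        outbox_put(&o, NULL);           /* Sentinel: ask the pusher to exit */
        pthread_join(thread, NULL);     /* After join, reading sent is safe */
        sent = o.sent;
    }
    pthread_mutex_destroy(&o.lock);
    pthread_cond_destroy(&o.not_empty);
    pthread_cond_destroy(&o.not_full);
    return sent;
}
```

Because the sentinel travels through the same buffer as ordinary messages, everything queued before shutdown is still handled. The puller is harder to stop cleanly, since a retrieve blocks until the server has a message, and that case is not covered by this sketch.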

Demonstration

As part of your grade, you will need to present your user application to a TA where you will demonstrate the correctness of your pub/sub client library and the implementation of your user application.

Presentation

As part of your demonstration, you must provide a Google Drive presentation (between 5 and 10 slides) with the following content:

Design: An overview of the design of your user application.
Implementation: A summary of how you implemented the user application and utilized the pub/sub system.
Testing: A discussion on how you tested your client library and your end user application.
Paradigms: A discussion of how you used three different concurrent programming paradigms (message passing, threading, and events) and what their advantages and disadvantages are, based on your project experience.
Summary: A summary of what you learned.

Note, you must incorporate images, graphs, diagrams and other visual elements as part of your presentation where reasonable.

Be prepared to be asked about different aspects of your project, as the TA may ask you questions to probe your understanding of the material and your work.

To arrange your demonstration time, please complete the form below:

Documentation

As noted above, the Project 03 repository includes a README.md file with the following sections:

Members: This should be a list of the project members.
Demonstration: This is where you should provide a Google Drive link to your demonstration slides.
Errata: This is a section where you can describe any deficiencies or known problems with your implementation.

You must complete this document as part of your project.

Extra credit

Once you have completed your project, you may extend your implementation of libmq_client.a by performing either (or both) of the following modifications:

Graphical User Interface: Create a graphical application using a toolkit such as Qt that utilizes your libmq_client.a.
Scripting Interface: Create a language binding to your libmq_client.a so that you can utilize the library from a scripting language such as Python to replicate the test_echo_client program.

Each of these modifications is worth 1 point of extra credit.

Grading

Your project will be graded on the following metrics:

Metric	Points
Source Code	20.0
General	*3.0*
Builds and cleans without warnings or errors	0.5
Uses system calls appropriately	0.5
Manages resources such as memory and files appropriately	0.5
Is consistent, readable, and organized	0.5
On-time code submission	1.0
Request	*2.0*
Implements `request_create` appropriately	0.5
Implements `request_delete` appropriately	0.5
Implements `request_write` appropriately	1.0
Queue	*5.0*
Implements `queue_create` appropriately	0.5
Implements `queue_delete` appropriately	0.5
Implements `queue_push` appropriately	2.0
Implements `queue_pop` appropriately	2.0
Client	*7.0*
Performs `SUBSCRIBE` properly	0.5
Performs `UNSUBSCRIBE` properly	0.5
Performs `PUBLISH` properly	0.5
Performs `RETRIEVE` properly	0.5
Shuts down properly	1.0
Utilizes multiple threads for sending and receiving properly	2.0
Utilizes concurrent data structures properly	1.0
Free of concurrency bugs (race conditions, deadlocks, etc.)	1.0
User Application	*3.0*
Utilizes multiple threads properly	1.0
Utilizes client library for communication properly	1.0
Free of concurrency bugs (race conditions, deadlocks, etc.)	1.0
Demonstration	3.5
Organization, Punctuality	1.0
Design	0.5
Implementation	0.5
Testing	0.5
Paradigms	1.0
Documentation	0.5
`README.md`	0.5
