Overview

The first project is to build a file monitoring application called rorschach, which scans a root directory for any changes to files underneath this folder and executes user-specified actions on the files based on pattern matching rules.

For instance, suppose rorschach was monitoring the root folder /folder/to/monitor and a file named /folder/to/monitor/Foster the People/Helena Beat.mp3 was created. A scan of the root folder would detect that this file was created and check if there is a pattern rule that matches the name of the file.

For example, suppose there is a pattern rule that looks like this: CREATE *.mp3 mpv ${FULLPATH}. When the scan detects that /folder/to/monitor/Foster the People/Helena Beat.mp3 has been created, it will see that this file matches the *.mp3 pattern and will execute the mpv ${FULLPATH} command (i.e. it will play the song).

Working in groups of one, two, or three people, you are to create this rorschach application by noon on Saturday, September 9. More details about this project and your deliverables are described below.

Who Watches the Watchmen?

This type of utility is called a file watching service and is quite useful. Many people use such applications to trigger actions (e.g. rebuild project, deploy code, etc.) based on different events (file modification, creation, or removal). For instance, Facebook has their own Watchman utility, while there is also fswatch and inotifywatch.

Scanning

To begin scanning, a user would start rorschach in the following manner:

# Scan current directory every 5 seconds
$ ./rorschach .
Monitoring /home/pbui/src/teaching/cse.30341.fa17/project01

This will start the rorschach program such that it scans the current directory every 5 seconds. The user may specify a different scanning frequency via the -t command-line option:

# Scan current directory every 60 seconds
$ ./rorschach -t 60 .
Monitoring /home/pbui/src/teaching/cse.30341.fa17/project01

Events

During the scan, rorschach will examine all the files under the root directory and any sub-directories (i.e. nested directories) to determine if any of the following file system events occurred:

  1. CREATE: A file has been created under the monitored directory.

  2. MODIFY: A file has been modified under the monitored directory.

  3. DELETE: A file has been deleted under the monitored directory.

Note: On the initial scan, no action should be taken since rorschach doesn't know about any of the files yet, so it doesn't know if the file was created, modified, or deleted.
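One way to decide which event applies to a path is to compare what the previous scan recorded with what the current scan finds. The sketch below assumes modification is detected by comparing modification times (for example, st_mtime from stat); the handout does not prescribe this, and the Event type and classify_event function are hypothetical names.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical event codes mirroring the three events listed above. */
typedef enum { EVENT_NONE, EVENT_CREATE, EVENT_MODIFY, EVENT_DELETE } Event;

/*
 * Classify what happened to one path between two scans.
 *   was_known : the path was recorded by the previous scan
 *   old_mtime : modification time recorded by the previous scan
 *   exists    : the path was found by the current scan
 *   new_mtime : modification time seen by the current scan
 */
Event classify_event(bool was_known, time_t old_mtime,
                     bool exists, time_t new_mtime) {
    if (!was_known && exists)
        return EVENT_CREATE;
    if (was_known && !exists)
        return EVENT_DELETE;
    if (was_known && exists && new_mtime != old_mtime)
        return EVENT_MODIFY;
    return EVENT_NONE;
}
```

On the very first scan nothing is known yet, so the caller would need to record each file without executing any actions, as the note above requires.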

Rules

By default, rorschach loads its pattern rules from a file named rules unless another file is specified with the -f flag, as shown below:

# Scan current directory every 5 seconds with custom.rules
$ ./rorschach -f custom.rules .
Monitoring /home/pbui/src/teaching/cse.30341.fa17/project01

This rules file contains a series of rules (one per line) specified in the following format:

EVENT   PATTERN     ACTION

Any empty lines or lines that begin with a # in the rules file should be ignored. Any invalid line (i.e. one that does not match the format above) should cause rorschach to display an error message such as "Invalid rule: ..." and quit.
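The format above could be parsed one line at a time along the following lines. This is only a sketch: the Rule struct, its buffer sizes, and the parse_rule return codes are assumptions made for illustration, and it treats the first two whitespace-delimited fields as EVENT and PATTERN and the rest of the line as ACTION.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed rule. */
typedef struct {
    char event[16];
    char pattern[256];
    char action[1024];
} Rule;

/*
 * Parse one line of a rules file.
 * Returns 1 if a rule was parsed, 0 if the line is blank or a comment,
 * and -1 if the line is invalid.
 */
int parse_rule(const char *line, Rule *rule) {
    /* Skip leading whitespace (a small leniency for indented comments). */
    while (isspace((unsigned char)*line))
        line++;
    if (*line == '\0' || *line == '#')
        return 0;

    /* EVENT and PATTERN are single tokens; ACTION is the rest of the line. */
    if (sscanf(line, "%15s %255s %1023[^\n]",
               rule->event, rule->pattern, rule->action) != 3)
        return -1;

    /* Only the three events described in this handout are accepted. */
    if (strcmp(rule->event, "CREATE") != 0 &&
        strcmp(rule->event, "MODIFY") != 0 &&
        strcmp(rule->event, "DELETE") != 0)
        return -1;

    return 1;
}
```

A caller would read the file with fgets or getline, call parse_rule on each line, and print "Invalid rule: ..." and exit when it returns -1.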

For example, to automatically compile a C program when it has been modified, you can have the following rule:

# Compilation rule

MODIFY  *.c         cc -o ${BASEPATH} ${FULLPATH}

This rule means that when a MODIFY event is detected on a file that matches the pattern *.c, rorschach should execute the command cc -o ${BASEPATH} ${FULLPATH}.

Note: The pattern should be checked against both the full path of the file in question and its basename. If either the full path or the basename matches the pattern, then the rule is considered a match and the action should be executed.
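A minimal sketch of this check, assuming shell-style wildcard patterns are matched with the POSIX fnmatch function (the handout does not name a matching function, and rule_matches is a hypothetical helper):

```c
#define _POSIX_C_SOURCE 200809L
#include <fnmatch.h>
#include <string.h>

/* Return 1 if the pattern matches either the full path or its basename. */
int rule_matches(const char *pattern, const char *fullpath) {
    /* The basename is whatever follows the last '/', if there is one. */
    const char *slash = strrchr(fullpath, '/');
    const char *base  = slash ? slash + 1 : fullpath;

    /* With flags set to 0, '*' is allowed to match '/' in the full path. */
    return fnmatch(pattern, fullpath, 0) == 0 ||
           fnmatch(pattern, base, 0) == 0;
}
```

The basename is computed with strrchr rather than basename(3) because some implementations of basename modify their argument.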

If a user were to modify, say, hello.c in the root directory, then rorschach would detect this file modification, match the rule above, print out the messages below, and execute the corresponding action:

# Scan current directory every 5 seconds
$ ./rorschach .
Monitoring /home/pbui/src/teaching/cse.30341.fa17/project01
Detected "CREATE" event on "hello.c"        # Detect existence of hello.c

# Modify hello.c in another terminal
Detected "MODIFY" event on "hello.c"        # Detect modification
Matched "*.c" pattern on "hello.c"          # Detect pattern match
Executing action "cc -o ${BASEPATH} ${FULLPATH}" on "hello.c"

Note: You should include logging messages as shown above as part of the normal operation of rorschach.

Process System Calls

For this assignment, you must use low-level process system calls such as fork, exec, and wait. You cannot use functions such as system or popen.
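The fork, exec, and wait pattern might look like the sketch below. It assumes the action string is handed to /bin/sh -c so that the shell expands variables such as ${FULLPATH} from the environment; that choice, and the run_action name, are assumptions rather than requirements of the handout.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run an action string in a child process and wait for it to finish.
 * Returns the child's exit status, or -1 on failure or abnormal exit.
 */
int run_action(const char *action) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace this process with a shell running the action. */
        execl("/bin/sh", "sh", "-c", action, (char *)NULL);
        perror("execl");    /* Only reached if exec failed */
        _exit(127);
    }

    /* Parent: wait for the child so that it does not become a zombie. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit after a failed exec so that it does not flush the parent's buffered output a second time.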

Environment Variables

To allow actions to use information about the corresponding event, rorschach must pass the following environment variables to the specified command:

  1. BASEPATH: This is the base path of the file (i.e. without any trailing extensions).

  2. FULLPATH: This is the full path of the file.

  3. EVENT: This is the type of event detected.

  4. TIMESTAMP: This is the current timestamp.

In the example above, the ${BASEPATH} and ${FULLPATH} variables will be expanded to hello and /home/pbui/src/teaching/cse.30341.fa17/project01/hello.c respectively.
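The four variables could be exported with setenv before the action is executed, as in the sketch below. Several details here are assumptions: BASEPATH is taken to be the file name with its directory and final extension removed, TIMESTAMP is taken to be seconds since the Unix epoch, and make_basepath and export_event are hypothetical helpers. Check the reference implementation for the exact values it produces.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Copy the file name from fullpath into out, dropping the final extension. */
void make_basepath(const char *fullpath, char *out, size_t size) {
    const char *slash = strrchr(fullpath, '/');
    snprintf(out, size, "%s", slash ? slash + 1 : fullpath);

    /* Leave names such as ".bashrc" alone: their only dot is the first byte. */
    char *dot = strrchr(out, '.');
    if (dot && dot != out)
        *dot = '\0';
}

/* Export the event variables; returns 0 on success, -1 on failure. */
int export_event(const char *event, const char *fullpath) {
    char basepath[4096];
    char timestamp[32];

    make_basepath(fullpath, basepath, sizeof(basepath));
    snprintf(timestamp, sizeof(timestamp), "%lld", (long long)time(NULL));

    if (setenv("BASEPATH", basepath, 1) != 0)   return -1;
    if (setenv("FULLPATH", fullpath, 1) != 0)   return -1;
    if (setenv("EVENT", event, 1) != 0)         return -1;
    if (setenv("TIMESTAMP", timestamp, 1) != 0) return -1;
    return 0;
}
```

Calling export_event in the child process between fork and exec keeps the variables out of the environment of the long-running parent.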

SIGINT

Upon receiving the SIGINT signal, rorschach must clean up any allocated resources (where possible) and exit gracefully:

$ ./rorschach .
Monitoring /home/pbui/src/teaching/cse.30341.fa17/project01

# Send Control-C
Cleaning up
Bye!
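A common way to get this behavior is to have the signal handler set a flag that the main loop checks, so the cleanup itself runs outside the handler. The sketch below assumes that approach; the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler and polled by the main scanning loop. */
static volatile sig_atomic_t interrupted = 0;

/* Handler: only record that SIGINT arrived (async-signal-safe). */
static void on_sigint(int signum) {
    (void)signum;
    interrupted = 1;
}

/* Install the SIGINT handler; returns 0 on success, -1 on failure. */
int install_sigint_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Return nonzero once SIGINT has been received. */
int was_interrupted(void) {
    return interrupted != 0;
}
```

The main loop would then run while was_interrupted() is false and, once it becomes true, print "Cleaning up", release its resources, print "Bye!", and return. Because SA_RESTART is not set, a sleep between scans is cut short when the signal arrives.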

Usage

The full set of rorschach command-line options is shown below:

$ ./rorschach -h
Usage: rorschach [options] ROOT

Options:
    -h          Print this help message
    -f RULES    Load rules from this file (default is rules)
    -t SECONDS  Time between scans (default is 5 seconds)
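These options map naturally onto getopt. The sketch below uses the defaults from the usage message above (rules file named rules, 5 seconds between scans); the Options struct and the return codes of parse_options are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for the parsed command line. */
typedef struct {
    const char *root;       /* ROOT directory to monitor */
    const char *rules;      /* -f RULES   */
    int         seconds;    /* -t SECONDS */
} Options;

/* Returns 0 on success, 1 if -h was given, and -1 on a usage error. */
int parse_options(int argc, char *argv[], Options *opts) {
    opts->root    = NULL;
    opts->rules   = "rules";
    opts->seconds = 5;

    optind = 1;     /* Start from the first argument */
    int c;
    while ((c = getopt(argc, argv, "hf:t:")) != -1) {
        switch (c) {
        case 'h': return 1;
        case 'f': opts->rules = optarg; break;
        case 't': opts->seconds = atoi(optarg); break;
        default:  return -1;    /* getopt has already printed a message */
        }
    }

    /* ROOT is required, and the scan interval must be positive. */
    if (optind >= argc || opts->seconds <= 0)
        return -1;
    opts->root = argv[optind];
    return 0;
}
```

A main function would print the usage message and exit with success when parse_options returns 1, and print it and exit with failure when it returns -1.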

Reference Implementation

A reference implementation of rorschach can be found on AFS:

/afs/nd.edu/user15/pbui/pub/bin/rorschach

Feel free to play around with this in order to explore how rorschach should behave.

Deliverables

As noted above, you are to work in groups of one, two, or three to implement rorschach. You may use either C or C++ as the implementation language. Any test scripts or auxiliary tools can be written in any reasonable scripting language.

Repository

To start this project, one group member must fork the Project 01 repository on GitLab:

https://gitlab.com/nd-cse-30341-fa17/cse-30341-fa17-project01

Once this repository has been forked, follow the instructions from Reading 00 to:

  1. Make the repository private

  2. Configure access to the repository

    Make sure you add all the members of the team in addition to the instructional staff.

Source Code

As you can see, the base Project 01 repository only contains a README.md file. Unlike in CSE.20289.SP17, you are to write all the source code for this project yourself.

A Find Solution

This project has many similarities with Project 01: Find from last semester's CSE.20289.SP17 class. You may wish to use this as a source of inspiration for this project.

That said, you must include a Makefile that builds and cleans up the project (and all its components):

$ make          # Builds rorschach

$ make clean    # Remove rorschach and any intermediate files

Keep in mind that while the exact organization of the code is up to you, you will be graded in part on coding style, cleanliness, and organization. This means your code should be consistently formatted, contain no dead code, have reasonable comments, and use appropriate naming, among other things.

Demonstration

As part of your grade, you will need to present your project face-to-face to a TA, where you will demonstrate the correctness of your implementation and discuss its efficiency. Additionally, the TA may ask you questions to probe your understanding of the material and your work.

The purpose of this demonstration is to provide you with better feedback and to help you grow as a programmer and computer scientist.

This presentation should happen within a week of the project deadline.

Documentation

As noted above, the Project 01 repository includes a README.md file with the following sections:

  1. Members: This should be a list of the project members.

  2. Design: This is a list of design questions that you should answer before you do any coding as they will guide you towards the resources you need.

  3. Testing: This is where you should briefly describe how your group tested and verified your project works.

  4. Analysis: These are a set of questions you should answer after the project is completed and which explore your implementation.

  5. Errata: This is a section where you can describe any deficiencies or known problems with your implementation.

  6. Extra Credit: This is for describing any extra credit you attempted.

You must complete this document as part of your project.

Test Scripts

Test scripts will be available from the TAs beginning next week. In order to receive these test scripts, you must show the TAs your responses to the Design questions. If the TAs find your responses reasonable, they will provide you with test scripts that you can add to your project repository and use to verify your implementation.

Extra Credit

Once you have completed your project, you may extend your implementation of rorschach by performing either (or both) of the following modifications:

  1. Concurrency: Currently, we use a single process to scan the directory and then execute the rules whenever we detect an event. Add some concurrency by splitting these two roles into separate processes: the parent process should scan the directory and then report any events to a child process via a pipe that will be in charge of executing the rules.

  2. Inotify: Rather than perform busy waiting, utilize inotify(7) to have Linux notify you when a file has changed.
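As a starting point for the Inotify extension, the sketch below watches a single directory and translates inotify masks into the event names used in this handout. It is not recursive (each sub-directory would need its own watch), and the WatchEvent struct and both function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical record for one translated event. */
typedef struct {
    const char *event;      /* "CREATE", "MODIFY", or "DELETE" */
    char        name[256];  /* File name relative to the watched directory */
} WatchEvent;

/* Start watching one directory; returns an inotify descriptor or -1. */
int watch_directory(const char *dir) {
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, dir, IN_CREATE | IN_MODIFY | IN_DELETE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Block until events arrive, then store up to max of them in out.
 * Returns the number stored, or -1 on error. Events beyond max that
 * arrived in the same read are dropped in this sketch.
 */
int read_events(int fd, WatchEvent *out, int max) {
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len = read(fd, buf, sizeof(buf));
    if (len <= 0)
        return -1;

    int count = 0;
    for (char *p = buf; p < buf + len && count < max; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        const char *kind = (ev->mask & IN_CREATE) ? "CREATE"
                         : (ev->mask & IN_MODIFY) ? "MODIFY"
                         : (ev->mask & IN_DELETE) ? "DELETE" : NULL;
        if (kind) {
            out[count].event = kind;
            snprintf(out[count].name, sizeof(out[count].name), "%s",
                     ev->len ? ev->name : "");
            count++;
        }
        /* Each record is a fixed header followed by ev->len name bytes. */
        p += sizeof(struct inotify_event) + ev->len;
    }
    return count;
}
```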

Each of these modifications is worth 1 Point of extra credit.
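For the Concurrency extension, the parent and child need some agreed-upon way to send events through the pipe. The sketch below assumes a simple text protocol with one "EVENT<TAB>PATH" record per line; the protocol and all three function names are hypothetical, and a path containing a newline would break it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Scanner side: write one "EVENT<TAB>PATH\n" record; 0 on success. */
int send_event(int fd, const char *event, const char *path) {
    char record[4352];
    int len = snprintf(record, sizeof(record), "%s\t%s\n", event, path);
    if (len < 0 || (size_t)len >= sizeof(record))
        return -1;
    /* A short write is treated as an error in this sketch. */
    return write(fd, record, (size_t)len) == (ssize_t)len ? 0 : -1;
}

/* Executor side: read one record; returns 1, or 0 on EOF or a bad record. */
int recv_event(FILE *stream, char *event, size_t esize,
               char *path, size_t psize) {
    char record[4352];
    if (!fgets(record, sizeof(record), stream))
        return 0;
    record[strcspn(record, "\n")] = '\0';

    char *tab = strchr(record, '\t');
    if (!tab)
        return 0;
    *tab = '\0';
    snprintf(event, esize, "%s", record);
    snprintf(path, psize, "%s", tab + 1);
    return 1;
}

/*
 * Fork the executor child, which calls handle() for every record it reads.
 * Returns the write end of the pipe for the scanner, or -1 on failure.
 */
int start_executor(void (*handle)(const char *event, const char *path)) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[1]);      /* Child only reads */
        FILE *in = fdopen(fds[0], "r");
        char event[16], path[4096];
        while (in && recv_event(in, event, sizeof(event), path, sizeof(path)))
            handle(event, path);
        _exit(0);
    }
    close(fds[0]);          /* Parent only writes */
    return fds[1];
}
```

When the parent closes the descriptor returned by start_executor, the child sees end-of-file and exits, at which point the parent can wait for it.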

Rubric

Your project will be scored on the following metrics:

Metric                                                  Points
Source Code                                             16.0
  1. Correctness                                        11.0
    • Builds and cleans without warnings or errors      0.5
    • Handles command-line arguments                    0.5
    • Logs operations, events, and actions              0.5
    • Periodically scans root directory                 1.0
    • Parses rules                                      1.0
    • Detects file creation                             0.5
    • Detects file modification                         0.5
    • Detects file deletion                             1.0
    • Matches patterns                                  0.5
    • Passes environment variables                      1.0
    • Executes actions                                  1.0
    • Cleans up on SIGINT                               1.0
    • Uses appropriate system calls                     1.0
    • Contains no detectable memory errors              1.0
  2. Efficiency                                         1.0
  3. Testing                                            2.0
  4. Demonstration                                      2.0
Documentation                                           5.0
  1. Design                                             2.0
  2. Analysis                                           2.0
  3. Testing & Errata                                   1.0
Miscellaneous                                           3.0
  1. Style                                              1.0
  2. Contribution                                       2.0

Memory Correctness

The correctness metric includes memory correctness along with meeting the requirements of the project (as described above). That is, you should not have any memory leaks or invalid accesses as would be detected by Valgrind.