Readings

The readings for Monday, November 14 are:

  1. The Google File System

  2. MapReduce: Simplified Data Processing on Large Clusters

TL;DR

The readings focus on the Google File System and Map-Reduce.

Questions

Once you have completed the readings, answer the following questions in the reading11/README.md file in your assignments repository:

  1. What problem is the Google File System trying to solve?

    • How does it solve this problem?

    • Describe how data is stored in Google File System.

    • Based on what you read, describe some data structures or algorithms that would be used in Google File System.

  2. What problem is Map-Reduce trying to solve?

    • How does it solve this problem?

    • Describe the three phases of a typical Map-Reduce workflow.

    • Based on what you read, describe some data structures or algorithms that would be used in Map-Reduce.
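As a warm-up for question 2, the word-count example from the MapReduce paper can be sketched on a single machine. This is only an illustration, not how Google's implementation works: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this sketch, the grouping step is labeled "shuffle" following common usage, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence (hypothetical single-process driver)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == '__main__':
    print(word_count(['the quick fox', 'the lazy dog', 'The end']))
```

In a real deployment the map and reduce calls would run on many machines at once, with the intermediate pairs partitioned by key between them. The sketch leaves that out so the data flow stays visible.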

Submission

To submit your solution, you must initiate a Merge Request in your private assignments repository and assign it to the appropriate TA from the Reading 11 - TA assignment list.

Development Branch

To facilitate the Merge Request workflow, you must do your development in its own branch:

$ cd path/to/repo                 # Go to your repository
$ git checkout master             # Make sure we are on master branch first
$ git pull                        # Make sure we have changes from GitLab
$ git checkout -b reading11       # Create reading11 branch
...                               # Do your work
$ git add reading11/README.md     # Stage your work
$ git commit                      # Commit your work (can do this multiple times)
$ git push -u origin reading11    # Push branch to GitLab