Overview

The fourth project is to build your own implementation of malloc and free. That is, you will need to implement a library that interacts with the operating system to perform heap management on behalf of a user process as demonstrated in class. Using the Lecture 15: Example Code as a starting point, you will need to add the following features:

  1. Implement splitting and coalescing of free blocks.

  2. Implement additional heap management strategies: Next Fit, Best Fit, and Worst Fit (First Fit has already been implemented for you).

    To support different strategies in the same source file, we will use C's #if directive to select different blocks of code at compile time.

  3. Add counters for tracking of the following events:

    • Number of times the user calls malloc successfully
    • Number of times the user calls free successfully
    • Number of times we reuse an existing block
    • Number of times we request a new block
    • Number of times we split a block
    • Number of times we coalesce blocks
    • Number blocks in free list
    • Total amount of memory requested
    • Maximum size of the heap

    This information should be reported whenever the program exits (hint: atexit) and should look like this:

    mallocs:   6
    frees:     6
    reuses:    2
    grows:     4
    splits:    1
    coalesces: 1
    blocks:    4
    requested: 6144
    max heap:  4096
    
  4. Create tests to check the correctness of your implementations.

  5. Create benchmarks to analyze the metrics above for the different strategies.

Note: We will not consider thread-safety or support multiple threads at all for this project.

Working in groups of one or two people, you are to create a library that implements malloc and free by midnight on Thursday, November 9, 2017. Additionally, you are benchmark and analyze your library by Tuesday, November 14, 2017.

More details about this project and your deliverables are described below.

Heap Management

Implementing your own heap management library is more common than you think. For instance, Firefox uses jemalloc, while GNU uses dlmalloc, and Google provides tcmalloc. As hardware and operating systems evolve, the optimal heap management strategies change as well, and so developers continue to advance the state of the art. For this project, we will focus on understanding the basics and getting something that works (good enough).

For a reference on how to create a simple heap management library, I recommend: A quick tutorial on implementing and debugging malloc, free, calloc, and realloc. This is what I used as the basis for the Lecture 15: Example Code.

Deliverables

As noted above, you are to work in groups of one or two (three is permitted, but discouraged) to implement your heap management library. You must use C (not C++) as the implementation language. Any test scripts or auxillary tools can be written in any reasonable scripting language.

Timeline

Here is a timeline of events related to this project:

Date Event
Sunday, October 29 Project description and repository are available.
Thursday, November 2 Test scripts are available from the TA (once you check off your design).
Thursday, November 9 Library is due (pushed to GitLab).
Tuesday, November 14 Demonstrations of library must be completed.

Repository

To start this project, one group member must fork the Project 04 repository on GitLab:

https://gitlab.com/nd-cse-30341-fa17/cse-30341-fa17-project04

Once this repository has been forked, follow the instructions from Reading 00 to:

  1. Make the repository private

  2. Configure access to the repository

    Make sure you add all the members of the team in addition to the instructional staff.

Source Code

As you can see, the base Project 04 repository contains a README.md file and the following folder hierarchy:

project04
    \_  Makefile        # This is the project Makefile
    \_  bin             # This contains the binary executables and scripts
    \_  lib             # This contains the malloc library when built
    \_  src             # This contains the malloc source code
        \_  malloc.c    # This contains the malloc implementation from Lecture 15

You must maintain this folder structure for your project and place files in their appropriate place.

To help you get started, we have provided a Makefile that will build the First Fit library for you:

# Build First First version of malloc library
$ make lib/libmalloc-ff.so
gcc -shared -fPIC -g -gdwarf-2 -std=gnu99 -Wall -DFIT=0 -o lib/libmalloc-ff.so src/malloc.c

Once you have the library, you can use it to override the existing malloc by using the LD_PRELOAD trick:

# Use our implementation of library
$ env LD_PRELOAD=lib/libmalloc-ff.so cat README.md

This will load our library first and thus link any calls to malloc and free in cat to our library rather than the standard C library. To make this more convenient, you should consider creating a bin/run.sh script that sets the environment variable for you and then runs the specified command.

In addition to the Makefile, we have also provided the initial implementation of malloc and free from the Lecture 15: Example Code. It contains TODOs to mark where you should consider modifying the library. Of course, you are free to do as you wish, as long as you implement the malloc and free API.

K.I.S.S.

While the exact organization of the project code is up to you, keep in mind that you will be graded in part on coding style, cleaniness, and organization. This means your code should be consistently formatted, not contain any dead code, have reasonable comments, and appropriate naming among other things:

Please refer to these Coding Style slides for some tips and guidelines on coding style expectations.

Testing

As you develop your library, you should create functional or unit tests that ensure your heap management library works correctly. These tests could take any of the following forms:

You must have both functional and unit tests for full credit.

Benchmarks

Once you have working implementations of the four heap management stratgies and you have added the ability to track and report the information described above, you should benchmark your library to answer the following questions:

  1. Which heap management strategy does the best job of reusing free blocks? Which one is the worst?

  2. Which heap management strategy requires the least amount of heap space? Which one is the worst?

  3. Which heap management strategy allows for the most splits? most coalescing? least?

  4. Which heap management strategy was the fastest? slowest?

  5. Consider the following applications: cat, md5sum, sort, du, and find.

    • Which one requires the most mallocs? frees?

    • Which one requests the most amount of space?

    • Which one requires the largest heap?

You must create scripts or additional programs that perform these benchmarks and collect the data.

Demonstration

As part of your grade, you will need to present your library to a TA where you will demonstrate the correctness of your malloc and free along with an analysis of the heap management strategies.

Presentation

As part of your demonstration, you must provide a presentation (between 5 - 10 slides) with the following content:

  1. Design: An overview of your library implementation.

  2. Testing: A discussion on how you tested your library.

  3. Benchmarks: A discussion on how you benchmarked your library and what the results were. You should address each of the questions above and have data and diagrams to backup your presentation.

  4. Analysis:: A discussion of the following questions:

    • Which heap management strategy suffers the most from fragmentation (what type)?

    • Which heap management strategy is the best?

    You should provide evidence for your choices.

  5. Summary: A summary of what you learned about memory management.

Note, you must incorporate images, graphs, diagrams and other visual elements as part of your presentation where reasonable.

Please upload these slides to Google Drive and place a link in your README.md.

Be prepared to be asked about different aspects of your project, as the TA may ask you questions to probe your understanding of the material and your work.

Documentation

As noted above, the Project 04 repository includes a README.md file with the following sections:

  1. Members: This should be a list of the project members.

  2. Design: This is a list of design questions that you should answer before you do any coding as they will guide you towards the resources you need.

  3. Demonstration: This is where you should provide a Google Drive link to your demonstration slides.

  4. Errata: This is a section where you can describe any deficiencies or known problems with your implementation.

You must complete this document report as part of your project.

Design and Implementation

You should look at the design questions in the README.md file of the Project 04 repository. The questions there will help you design and plan your process queue implementation.

Feel free to ask the TAs or the instructor for some guidance on these questions before attempting to implement this project.

Extra credit

Once you have completed your project, you may extend your implementation of malloc and free by performing either (or both) of the following modifications:

  1. Implement realloc and calloc, as some programs such as curl require those functions.

  2. Improve your coalescing mechanism so that it combines surrounding blocks rather than just the next block (when possible).

  3. Add the ability to shrink the heap when possible.

Each modification is worth 0.5 points each.

Grading

Your project will be graded on the following metrics:

Metric Points
Source Code
  1. General
    • Builds and cleans without warnings or errors
    • Manages resources such as memory and files appropriately
    • Uses system calls appropriately
    • Is consistent, readable, and organized
  2. Library
    • Implements split properly
    • Implements coalescing properly
    • Implements Next Fit properly
    • Implements Best Fit properly
    • Implements Worst Fit properly
    • Implements malloc tracker properly
    • Implements free tracker properly
    • Implements split tracker properly
    • Implements coalesce tracker properly
    • Implements existing block reuse tracker properly
    • Implements new block request tracker properly
    • Implements total amount of memory tracker properly
    • Implements maximum heap size tracker properly
    • Reports tracking information at end of program
  3. Testing
    • Provides functional tests
    • Provides unit tests
  4. Benchmarks
    • Provides scripts that automate data generation
    • Provides scripts that automate data collection
16.0
  1. 3.5
    • 1.0
    • 0.5
    • 1.0
    • 1.0
  2. 7.5
    • 1.0
    • 1.0
    • 1.0
    • 1.0
    • 1.0
    • 0.25
    • 0.25
    • 0.25
    • 0.25
    • 0.25
    • 0.25
    • 0.25
    • 0.25
    • 0.5
  3. 3.0
    • 1.5
    • 1.5
  4. 2.0
    • 1.0
    • 1.0
Demonstration
  1. Slides
  2. Testing
  3. Benchmarks
  4. Analysis
  5. Summary
6.0
  1. 1.0
  2. 1.0
  3. 2.0
  4. 1.0
  5. 1.0
Documentation
  1. Design
2.0
  1. 2.0