The goal of this homework assignment is to allow you to practice using dynamic memory in C. The first activity involves building a singly-linked list library, while the second activity requires you to use this library to build and benchmark a `lsort` utility similar to sort.

For this assignment, record your source code and any responses to the following activities in the `homework07` folder of your assignments GitLab repository and push your work by 11:59 PM Friday, March 31, 2017.

## Activity 1: Singly-Linked List Library (12 Points)

As discussed in class, arrays in C are fast for random access, but have of the limitation of being fixed sized. When we need a data structure that can grow organically at run-time, we can turn to the singly-linked list.

For this first activity, you are to create singly-linked list library, `liblist.a`, which contains functions for creating a list, deleting a list, adding elements to a list, dumping a list, converting a list to an array, sorting a list, and reversing a list.

As explained in your data structures class, a singly-linked list is a data structure which consists of linked nodes. That is, it is a sequence container of node structures where each node consists of a value and a reference to the next node in the sequence as shown below:

For our library, we will be using a singly-linked list with a tail pointer and a stored size:

• As with a normal singly-linked list, we will have a series of individual node structures which consist of a value and a reference to the next node.

In the diagram above, we have three individual node structures holding the values `5`, `7`, and `4`. The `5` node has a reference to the `7` node, while the `7` node has a reference to the `4` node. Because there is nothing after the `4` node, the `4` node's next node reference is simply the `NULL` pointer.

This is useful because it allows us to iterate through our sequence using the following pseudo-code:

```for (curr = head; curr != NULL; curr = curr.next)
visit(curr)
```
• To keep track of our nodes, we have a list structure that tracks the `head` of the list, the `tail` of the list, and the `size` of the list.

In the diagram above, `head` references the first node structure, which is the `5` node. Similarly, `tail` references the last node structure, which is the `4` node. Finally, `size` has the value of `3` since the sequence container has `3` elements.

• Because our singly-linked list has both a `head` and `tail` pointer, we can insert new node structures to either the front or back of the sequence container in constant time (ie. `O(1)`).

• Because we maintain the `size` of the list in the list structure, we do not need to iterate through the sequence container to compute this value and thus can report the number of elements in constant time.

For this first activity, you are to implement this version of a singly-linked list.

#### Starter Code

To help you get started, the instructor has provided you with the following starter code:

```# Go to assignments repository
\$ cd path/to/assignments/repository

\$ curl -O http://www3.nd.edu/~pbui/teaching/cse.20289.sp17/static/tar/homework07.tar.gz

# Extract starter code tarball
\$ tar xzvf homework07.tar.gz
```

Once downloaded and extracted, you should see the following files in your `homework07` directory:

```homework07
\_ Makefile        # This is the Makefile for building all the project artifacts
\_ list.c          # This is the C99 implementation file for the list structure
\_ list.h          # This is the C99 header file for the list structure
\_ lsort.c         # This is the C99 implementation file for the list sorting utility
\_ node.c          # This is the C99 implementation file for the node structure
\_ node.h          # This is the C99 header file for the node structure
\_ test_list.c     # This is the C99 implementation file for the list structure test program
\_ test_lsort.sh   # This is the shell script for testing the list sorting utility
\_ test_node.c     # This is the C99 implementation file for the node structure test program
```

The details on what you need to implement are described in the following sections.

### Task 0: `node.h` and `list.h`

The `node.h` file is the header file for the node structure used in our implementation of a singly-linked list. Besides a few function prototypes, this header defines a `struct node` as:

```struct node {
char  *string;
int    number;
struct node *next;
};
```

This means that each node consists of a string value , its corresponding integer value, and a reference to the next node.

It also contains a type definition `node_compare` that defines a type for comparison functions on `struct nodes`:

```typedef int   (*node_compare)(const void *a, const void *b);
```

This will allow us to define functions which accept as node comparison functions such as `node_compare_string` and `node_compare_number` as arguments.

The `list.h` file is the header file for the list structure used in our implementation of a singly-linked list. Besides a few function prototypes, this header defines a `struct list` as:

```struct list {
struct node *tail;
size_t       size;
};
```

This reflects the design describe above. Each `struct list` has a **reference to the `head` of the list (ie. first element), the `tail` of the list (ie. last element), and the `size` of the list (ie. number of elements).

For this task, you do not need to modify these header files. Instead, you should review them and ensure you understand the provided code.

### Task 1: `Makefile`

Once again, the `Makefile` contains all the rules or recipes for building the project artifacts (e.g. `liblist.a`, `lsort`, etc.). Although the provided `Makefile` contains most of the variable definitions and `test` recipes, you must add the appropriate rules for `liblist.a`, `test_node`, `test_list`, `lsort`, and any intermediate objects. The dependencies for these targets are shown in the DAG below:

Note: Although we are producing a static library in the form of `liblist.a`, we will not be statically linking our executables. Instead, we will use the `liblist.a` as another object file when we link our executables.

#### Makefile Variables

You must use the `CC`, `CFLAGS`, `LD`, `LDFLAGS`, `AR`, and `ARFLAGS` variables when appropriate in your rules.

Once you have a working `Makefile`, you should be able to run the following commands:

```# Build all TARGETS
\$ make
Compiling lsort.o...
Compiling node.o...
Compiling list.o...
list.c:151:15: warning: 'merge' defined but not used [-Wunused-function]
struct node * merge(struct node *left, struct node *right, node_compare f) {
^~~~~
list.c:138:7: warning: 'split' defined but not used [-Wunused-function]
void  split(struct node *head, struct node **left, struct node **right) {
^~~~~
list.c:125:15: warning: 'msort' defined but not used [-Wunused-function]
struct node * msort(struct node *head, node_compare f) {
^~~~~
list.c:102:14: warning: 'reverse' defined but not used [-Wunused-function]
struct node* reverse(struct node *curr, struct node *prev) {
^~~~~~~

# Run all tests
\$ make test
Compiling test_node.o...
Testing node...
Segmentation fault
...
```

Depending on your compiler, you may see some warnings with the initial starter code. Likewise, the test programs will all fail in some fashion.

Note: To display messages rather than showing the full command you are executing for each recipe, you can use the following approach:

```target:   source
@echo "Action \$@..."
@command \$@ \$<
```

Prefixing a command in the action portion of a recipe with `@` will suppress displaying the command (only the output will be displayed). If you wish to actually see the commands, you can temporarily display them by running `make` with the `-n` flag:

```# Simulate build with tracing output
\$ make -n
echo Compiling lsort.o...
gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o lsort.o lsort.c
echo Compiling node.o...
gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o node.o node.c
echo Compiling list.o...
gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o list.o list.c
ar rcs liblist.a node.o list.o
gcc -L. -o lsort lsort.o liblist.a
```

Note: the `-n` has make simulate a build and display the actions it would take. It does not actually perform the actions.

### Task 2: `node.c`

The `node.c` file contains the C99 implementation for the node structure. For this task, you will need to implement the following functions:

1. `struct node * node_create(char *string, struct node *next)`

This function allocates a new `struct node`, sets its `string`, `number`, and `next` fields, and returns the allocated structure.

Notes:

• You must check if memory allocation fails.

• You must allocate and copy the `string` (consider strdup).

• The `number` value comes from converting the `string` to an integer (consider use either atoi or strtol).

2. `struct node * node_delete(struct node *n, bool recursive)`

This function deallocates the given `struct node` (including the `string` field). If `recursive` is `true` this function will also deallocate the `next` `struct node` and any of its successors.

Notes: Althought the `recursive` parameter implies you should use recursion, you may wish to consider using an iterative implementation instead to avoid stack overflow.

3. `void node_dump(struct node *n, FILE *stream)`

This function prints the `struct node` information to the `stream` in the following format: "Node{string, number, next}".

4. `int node_compare_number(const void *a, const void *b)`

This function is used by sorting functions such as qsort to compare two `struct nodes` by their `number` fields.

Note: Although the parameters are `const void *`, they are really `struct node**`. Since C lacks templates, it has to use `void *` to represent any type. Sorting implementations such as qsort will pass in `&element` to the comparison function, meaning each parameter is really a `struct node**` since each element is originally a `struce node*`.

Hint: You can cast the `void *` parameters to an appropriate type.

5. `int node_compare_string(const void *a, const void *b)`

This function is used by sorting functions such as qsort to compare two `struct nodes` by their `string` fields.

As you implement `node.c`, you can test it by running the `test-node` target:

```# Run test-node target
\$ make test-node
Compiling test_node.o...
Compiling node.o...
Compiling list.o...
Testing node...

# View log
\$ cat test_node.log

# Run test_node manually
\$ ./test_node
```

These tests run on the principle of silence is golden, which means you will only get error messages when something fails (it will not say `success` or `ok`).

Feel free to use valgrind to gdb to track down any memory errors.

### Task 3: `list.c`

The `list.c` file contains the C99 implementation for the list structure. For this task, you will need to implement the following functions:

1. `struct list * list_create()`

This function allocates a new `struct list`, initializes its fields, and returns the allocate structure.

2. `struct list * list_delete(struct list *l)`

This function deallocates the given `struct list`.

3. `void list_push_front(struct list *l, char *s)`

This function adds a new `struct node` with the given `s` string value to the front of the `struct list`.

Notes: Be sure to update the `head` and `tail` of the `struct list` appropriately. Also, you should use the `node_create` function.

4. `void list_push_back(struct list *l, char *s)`

This function adds a new `struct node` with the given `s` string value to the back of the `struct list`.

Notes: Be sure to update the `head` and `tail` of the `struct list` appropriately. Also, you should use the `node_create` function.

5. `void list_dump(struct list *l, FILE *stream)`

This function iterates through the `struct list` and calls `node_dump` on each element.

6. `struct node ** list_to_array(struct list *l)`

This function converts the `struct list` into an array of `struct node` pointers.

Notes: You should not create new nodes; rather, you should simply have the pointers in the array reference the existing nodes in the `struct list`.

7. `void list_qsort(struct list *l, node_compare f)`

This function uses the qsort function to sort the `struct list`.

Hints:

• You should use the `list_to_array` function to create an array.

• Remember to update the `next` fields of each element after sorting.

• Remember to update the `head` and `tail` fields of the `struct list`.

8. `void list_reverse(struct list *l)`

This function reverses the `struct list` recursively using the internal `reverse` function.

Hints:

• Draw a picture on a piece of paper and do this by hand first.

• `reverse` takes in the current node and what its previous node is. Because we are reversing the list, we want to end up saying the the `next` field of the current node is the previous node.

• Remember to update the `head` and `tail` fields of the `struct list`.

9. `void list_msort(struct list *l, node_compare f)`

This function performs merge sort, which is a recursive divide and conquer sorting algorithm, on the `struct list` using the internal `msort`, `split`, and `merge` functions.

Hints:

• Draw a picture on a piece of paper and do this by hand first.

• The `msort` function has the following pseudo-code:

```# Divide

# Conquer
left  = msort(left, f)
right = msort(right, f)

# Combine
```
• The `split` function divides the provided list into `left` and `right` sublists by using the fast and slow pointer trick. Be sure to `NULL` terminate your sublists.

• The `merge` function combines two sublists into a single ordered list.

• Remember to update the `head` and `tail` fields of the `struct list`.

#### Online Code

There are many examples of merge sort on linked lists. For instance, GeeksForGeeks has an article about Merge Sort for Linked Lists. You should resist the temptation to look at such sources.

If you do look at those examples, remember that any code you submit must be your own and that you should understand everything you submit.

Moreover, in this case, the example from GeeksForGeeks is suboptimal as it uses recursion to perform the `merge`. You will need an iterative version of `merge`, otherwise, you run the risk of overflowing the stack on large input sizes.

As you implement `list.c`, you can test it by running the `test-list` target:

```# Run test-list target
\$ make test-list
Compiling test_list.o...
Testing list...

# View log
\$ cat test_list.log

# Run test_list manually
\$ ./test_list
Push front...
Node{superman, 0, 0x13e4500}
Node{wolverine, 0, 0x13e44c0}
Node{batman, 0, 0x13e4480}
Node{spider-man, 0, 0x0}
Push back...
Node{spider-man, 0, 0x13e4500}
Node{batman, 0, 0x13e4480}
Node{wolverine, 0, 0x13e4440}
Node{superman, 0, 0x0}
QSort...
Node{batman, 0, 0x13e4500}
Node{spider-man, 0, 0x13e4440}
Node{superman, 0, 0x13e4480}
Node{wolverine, 0, 0x0}
Reversing...
Node{wolverine, 0, 0x13e4440}
Node{superman, 0, 0x13e4540}
Node{spider-man, 0, 0x13e4500}
Node{batman, 0, 0x0}
MSort...
Node{batman, 0, 0x13e4500}
Node{spider-man, 0, 0x13e4440}
Node{superman, 0, 0x13e4480}
Node{wolverine, 0, 0x0}
```

### Task 4: `README.md`

In your `README.md`, respond to the following prompts:

1. Describe what memory you allocated in `node_create` and what memory you deallocated in `node_delete`. How did you handle the `recursive` flag?

2. Describe what memory you allocated in `list_create` and what memory you deallocated in `list_delete`. How did you handle internal `struct nodes`?

3. Describe how your `list_qsort` function works and analyze its time complexity and space complexity in terms of Big-O notation (both average and worst case).

4. Describe how your `list_reverse` function works and analyze its time complexity and space complexity in terms of Big-O notation (both average and worst case).

5. Describe how your `list_msort` function works and analyze its time complexity and space complexity in terms of Big-O notation (both average and worst case).

## Activity 2: List Sort (3 Points)

Once you have a working list library, you are to complete the `lsort.c` program:

```# Display usage
\$ ./lsort -h
Usage: ./lsort
-n   Numerical sort
-q   Quick sort

# Perform string sort using merge sort
\$ shuf -i1-10 -n 10 | ./lsort
1
10
2
3
4
5
6
7
8
9

# Perform numeric sort using merge sort
\$ shuf -i1-10 -n 10 | ./lsort -n
1
2
3
4
5
6
7
8
9
10

# Perform string sort using quick sort
\$ shuf -i1-10 -n 10 | ./lsort -q
1
10
2
3
4
5
6
7
8
9

# Perform numeric sort using quick sort
\$ shuf -i1-10 -n 10 | ./lsort -n -q
1
2
3
4
5
6
7
8
9
10
```

The `lsort` program reads in data from standard input one line at a time, builds a singly-linked list, and then uses either qsort or your custom merge sort to order the data. The user may specify either a string or numerical based ordering based on the the `-n` flag.

### Task 1: `lsort.c`

The `lsort.c` file is contains the C99 implementation of the list sorting tool described above.

In addition to implementing command line parsing in the `main` function and you will need to implement the `lsort` function:

1. `void lsort(FILE *stream, bool numeric, bool quicksort)`

This function reads one line at a time from the `stream` and builds a singly-linked list. It then sorts the list using `list_qsort` if `quicksort` is `true`, otherwise it uses the `list_msort`. To compare elements, it uses `node_compare_number` if `numeric` is `true`, otherwise it uses `node_compare_string`. Once the list is sorted, this function will print the `string` field of each element in the list.

### Task 2: `test_lsort.sh`

As you implement `lsort.c`, you can test it by running the `test-list` target:

```# Run test-lsort target
\$ make test-lsort
make test-lsort
Compiling lsort.o...
Compiling node.o...
Compiling list.o...
Testing lsort...
Testing lsort output (strings & msort)...
Testing lsort output (numbers & msort)...
Testing lsort output (strings & qsort)...
Testing lsort output (numbers & qsort)...
Testing lsort memory (strings & msort)...
Testing lsort memory (numbers & msort)...
Testing lsort memory (strings & qsort)...
Testing lsort memory (numbers & qsort)...

# View log
\$ cat test_lsort.log

# Run test_lsort.sh manually
\$ ./test_lsort.sh
Testing lsort output (strings & msort)...
Testing lsort output (numbers & msort)...
Testing lsort output (strings & qsort)...
Testing lsort output (numbers & qsort)...
Testing lsort memory (strings & msort)...
Testing lsort memory (numbers & msort)...
Testing lsort memory (strings & qsort)...
Testing lsort memory (numbers & qsort)...
```

### Task 3: `benchmark.{sh,py}`

Once you have a working `lsort` list sorting utility, you are to write a benchmark script in either shell or Python that measures the amount of time and space it takes to numerically sort the following number of items:

```1, 10, 100, 1000, 10000, 100000, 1000000, 10000000
```

To help you measure the time and space a program uses, you can utilize the measure.c program written by David Chiang. Once you have downloaded and compiled this program, you can use it in the following manner:

```\$ shuf -i1-100000000 -n 10000000 | ./measure ./lsort -n > /dev/null
11.706000 seconds       611.625000 Mbytes
```

This means that the `lsort` utility performed a numerical merge sort on `10000000` numbers in `11.7` seconds and used `611.6` megabytes of memory.

Your script, either `benchmarch.sh` or `benchmark.py`, should automate the timing and measurement of these experiments to produce a markdown table looks like the following:

```| NITEMS   | SORT     | TIME      | SPACE     |
| -------- | -------- |-----------|-----------|
| 1        | Merge    | 0.0       | 1.0       |
| 1        | Quick    | ...       | ...       |
| ...      | ...      | ...       | ...       |
| 10000000 | Merge    | 11.7      | 611.6     |
| 10000000 | Quick    | ...       | ...       |
```

### Task 4: `README.md`

In your `README.md`, include markdown table your benchmark script generated and then answer the following questions:

1. Based on your experiments, what is the overall trend in terms of time and space as the number of items increases for both of your sorting methods?

Are the results surprising to you? Explain why or why not.

2. Given what you said about the space complexity and time complexity of `list_msort` and `list_qsort` in Activity 1 and the experimental results in Activity 2, what can you say is the relationship between theoretical complexity and real world performance? Explain.

## Guru Point: Hacker Rank (1 Point)

A great way to practice your data structures knowledge (say for preparing for a technical interview) is to work on some programming challenges. Hacker Rank is a website that allows users to practice all sorts of different types of programming concepts including data structures and algorithms.

For extra credit, you are to signup for Hacker Rank and solve at least three Linked Lists Challenges.

Afterwards, simply show the instructor or TA the Hacker Rank dashboard showing you successfully solved at least three challenges to receive credit.

## Submission

To submit your assignment, please commit your work to the `homework07` folder in your assignments GitLab repository. Your `homework07` folder should only contain at the following files:

• `Makefile`
• `README.md`
• `benchmark.sh` or `benchmark.py`
• `list.c`
• `list.h`
• `lsort.c`
• `node.c`
• `node.h`
• `test_list.c`
• `test_lsort.sh`
• `test_node.c`