The goal of this homework assignment is to allow you to practice using
dynamic memory in C. The first activity involves building a
singly-linked list library, while the second activity requires you to use
this library to build and benchmark a lsort
utility similar to sort.
For this assignment, record your source code and any responses to the
following activities in the homework07
folder of your assignments
GitLab repository and push your work by 11:59 PM Friday, March 31, 2017.
As discussed in class, arrays in C are fast for random access, but have of the limitation of being fixed sized. When we need a data structure that can grow organically at run-time, we can turn to the singly-linked list.
For this first activity, you are to create singly-linked list library,
liblist.a
, which contains functions for creating a list, deleting a list,
adding elements to a list, dumping a list, converting a list to an array,
sorting a list, and reversing a list.
As explained in your data structures class, a singly-linked list is a data structure which consists of linked nodes. That is, it is a sequence container of node structures where each node consists of a value and a reference to the next node in the sequence as shown below:
For our library, we will be using a singly-linked list with a tail pointer and a stored size:
As with a normal singly-linked list, we will have a series of individual node structures which consist of a value and a reference to the next node.
In the diagram above, we have three individual node structures
holding the values 5
, 7
, and 4
. The 5
node has a reference to
the 7
node, while the 7
node has a reference to the 4
node.
Because there is nothing after the 4
node, the 4
node's next node
reference is simply the NULL
pointer.
This is useful because it allows us to iterate through our sequence using the following pseudo-code:
for (curr = head; curr != NULL; curr = curr.next) visit(curr)
To keep track of our nodes, we have a list structure that tracks the
head
of the list, the tail
of the list, and the size
of the list.
In the diagram above, head
references the first node structure,
which is the 5
node. Similarly, tail
references the last node
structure, which is the 4
node. Finally, size
has the value of 3
since the sequence container has 3
elements.
Because our singly-linked list has both a head
and tail
pointer, we
can insert new node structures to either the front or back of the
sequence container in constant time (ie. O(1)
).
Because we maintain the size
of the list in the list structure, we do
not need to iterate through the sequence container to compute this value
and thus can report the number of elements in constant time.
For this first activity, you are to implement this version of a singly-linked list.
To help you get started, the instructor has provided you with the following starter code:
# Go to assignments repository $ cd path/to/assignments/repository # Download starter code tarball $ curl -O http://www3.nd.edu/~pbui/teaching/cse.20289.sp17/static/tar/homework07.tar.gz # Extract starter code tarball $ tar xzvf homework07.tar.gz
Once downloaded and extracted, you should see the following files in your
homework07
directory:
homework07 \_ Makefile # This is the Makefile for building all the project artifacts \_ README.md # This is the README file for recording your responses \_ list.c # This is the C99 implementation file for the list structure \_ list.h # This is the C99 header file for the list structure \_ lsort.c # This is the C99 implementation file for the list sorting utility \_ node.c # This is the C99 implementation file for the node structure \_ node.h # This is the C99 header file for the node structure \_ test_list.c # This is the C99 implementation file for the list structure test program \_ test_lsort.sh # This is the shell script for testing the list sorting utility \_ test_node.c # This is the C99 implementation file for the node structure test program
The details on what you need to implement are described in the following sections.
node.h
and list.h
The node.h
file is the header file for the node structure used in our
implementation of a singly-linked list. Besides a few function prototypes,
this header defines a struct node
as:
struct node { char *string; int number; struct node *next; };
This means that each node consists of a string value , its corresponding integer value, and a reference to the next node.
It also contains a type definition node_compare
that defines a type for
comparison functions on struct nodes
:
typedef int (*node_compare)(const void *a, const void *b);
This will allow us to define functions which accept as node comparison
functions such as node_compare_string
and node_compare_number
as
arguments.
The list.h
file is the header file for the list structure used in our
implementation of a singly-linked list. Besides a few function prototypes,
this header defines a struct list
as:
struct list { struct node *head; struct node *tail; size_t size; };
This reflects the design describe above. Each struct list
has a
**reference to the head
of the list (ie. first element), the tail
of the
list (ie. last element), and the size
of the list (ie. number of elements).
For this task, you do not need to modify these header files. Instead, you should review them and ensure you understand the provided code.
Makefile
Once again, the Makefile
contains all the rules or recipes for building the
project artifacts (e.g. liblist.a
, lsort
, etc.). Although the provided
Makefile
contains most of the variable definitions and test
recipes, you
must add the appropriate rules for liblist.a
, test_node
, test_list
,
lsort
, and any intermediate objects. The dependencies for these targets
are shown in the DAG below:
Note: Although we are producing a static library in the form of
liblist.a
, we will not be statically linking our executables. Instead,
we will use the liblist.a
as another object file when we link our
executables.
You must use the CC
, CFLAGS
, LD
, LDFLAGS
, AR
, and ARFLAGS
variables when appropriate in your rules.
Once you have a working Makefile
, you should be able to run the following
commands:
# Build all TARGETS $ make Compiling lsort.o... Compiling node.o... Compiling list.o... list.c:151:15: warning: 'merge' defined but not used [-Wunused-function] struct node * merge(struct node *left, struct node *right, node_compare f) { ^~~~~ list.c:138:7: warning: 'split' defined but not used [-Wunused-function] void split(struct node *head, struct node **left, struct node **right) { ^~~~~ list.c:125:15: warning: 'msort' defined but not used [-Wunused-function] struct node * msort(struct node *head, node_compare f) { ^~~~~ list.c:102:14: warning: 'reverse' defined but not used [-Wunused-function] struct node* reverse(struct node *curr, struct node *prev) { ^~~~~~~ Linking liblist.a... Linking lsort... # Run all tests $ make test Compiling test_node.o... Linking test_node... Testing node... Segmentation fault ...
Depending on your compiler, you may see some warnings with the initial starter code. Likewise, the test programs will all fail in some fashion.
Note: To display messages rather than showing the full command you are executing for each recipe, you can use the following approach:
target: source @echo "Action $@..." @command $@ $<
Prefixing a command in the action portion of a recipe with @
will
suppress displaying the command (only the output will be displayed). If
you wish to actually see the commands, you can temporarily display them
by running make
with the -n
flag:
# Simulate build with tracing output $ make -n echo Compiling lsort.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o lsort.o lsort.c echo Compiling node.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o node.o node.c echo Compiling list.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o list.o list.c echo Linking liblist.a... ar rcs liblist.a node.o list.o echo Linking lsort... gcc -L. -o lsort lsort.o liblist.a
Note: the -n
has make simulate a build and display the actions it
would take. It does not actually perform the actions.
node.c
The node.c
file contains the C99 implementation for the node
structure. For this task, you will need to implement the following
functions:
struct node * node_create(char *string, struct node *next)
This function allocates a new
struct node
, sets itsstring
,number
, andnext
fields, and returns the allocated structure.
Notes:
struct node * node_delete(struct node *n, bool recursive)
This function deallocates the given
struct node
(including thestring
field). Ifrecursive
istrue
this function will also deallocate thenext
struct node
and any of its successors.
Notes: Althought the recursive
parameter implies you should use
recursion, you may wish to consider using an iterative implementation
instead to avoid stack overflow.
void node_dump(struct node *n, FILE *stream)
This function prints the
struct node
information to thestream
in the following format: "Node{string, number, next}".
int node_compare_number(const void *a, const void *b)
This function is used by sorting functions such as qsort to compare two
struct nodes
by theirnumber
fields.
Note: Although the parameters are const void *
, they are really
struct node**
. Since C lacks templates, it has to use void *
to
represent any type. Sorting implementations such as qsort will pass in
&element
to the comparison function, meaning each parameter is really a
struct node**
since each element is originally a struce node*
.
Hint: You can cast the void *
parameters to an appropriate type.
int node_compare_string(const void *a, const void *b)
This function is used by sorting functions such as qsort to compare two
struct nodes
by theirstring
fields.
As you implement node.c
, you can test it by running the test-node
target:
# Run test-node target $ make test-node Compiling test_node.o... Compiling node.o... Compiling list.o... Linking liblist.a... Linking test_node... Testing node... # View log $ cat test_node.log # Run test_node manually $ ./test_node
These tests run on the principle of silence is golden, which means you will
only get error messages when something fails (it will not say success
or
ok
).
Feel free to use valgrind to gdb to track down any memory errors.
list.c
The list.c
file contains the C99 implementation for the list
structure. For this task, you will need to implement the following
functions:
struct list * list_create()
This function allocates a new
struct list
, initializes its fields, and returns the allocate structure.
struct list * list_delete(struct list *l)
This function deallocates the given
struct list
.
void list_push_front(struct list *l, char *s)
This function adds a new
struct node
with the givens
string value to the front of thestruct list
.
Notes: Be sure to update the head
and tail
of the struct list
appropriately. Also, you should use the node_create
function.
void list_push_back(struct list *l, char *s)
This function adds a new
struct node
with the givens
string value to the back of thestruct list
.
Notes: Be sure to update the head
and tail
of the struct list
appropriately. Also, you should use the node_create
function.
void list_dump(struct list *l, FILE *stream)
This function iterates through the
struct list
and callsnode_dump
on each element.
struct node ** list_to_array(struct list *l)
This function converts the
struct list
into an array ofstruct node
pointers.
Notes: You should not create new nodes; rather, you should simply
have the pointers in the array reference the existing nodes in the
struct list
.
void list_qsort(struct list *l, node_compare f)
This function uses the qsort function to sort the
struct list
.
Hints:
You should use the list_to_array
function to create an array.
Remember to update the next
fields of each element after sorting.
Remember to update the head
and tail
fields of the struct list
.
void list_reverse(struct list *l)
This function reverses the
struct list
recursively using the internalreverse
function.
Hints:
Draw a picture on a piece of paper and do this by hand first.
reverse
takes in the current node and what its previous node is.
Because we are reversing the list, we want to end up saying the the
next
field of the current node is the previous node.
Remember to update the head
and tail
fields of the struct list
.
void list_msort(struct list *l, node_compare f)
This function performs merge sort, which is a recursive divide and conquer sorting algorithm, on the
struct list
using the internalmsort
,split
, andmerge
functions.
Hints:
Draw a picture on a piece of paper and do this by hand first.
The msort
function has the following pseudo-code:
# Divide split(head, &left, &right) # Conquer left = msort(left, f) right = msort(right, f) # Combine head = merge(left, right, f);
The split
function divides the provided list into left
and right
sublists by using the fast and slow pointer trick. Be sure to NULL
terminate your sublists.
The merge
function combines two sublists into a single ordered
list.
Remember to update the head
and tail
fields of the struct list
.
There are many examples of merge sort on linked lists. For instance, GeeksForGeeks has an article about Merge Sort for Linked Lists. You should resist the temptation to look at such sources.
If you do look at those examples, remember that any code you submit must be your own and that you should understand everything you submit.
Moreover, in this case, the example from
GeeksForGeeks is suboptimal as it
uses recursion to perform the merge
. You will need an iterative
version of merge
, otherwise, you run the risk of overflowing the
stack on large input sizes.
As you implement list.c
, you can test it by running the test-list
target:
# Run test-list target $ make test-list Compiling test_list.o... Linking test_list... Testing list... # View log $ cat test_list.log # Run test_list manually $ ./test_list Push front... Node{superman, 0, 0x13e4500} Node{wolverine, 0, 0x13e44c0} Node{batman, 0, 0x13e4480} Node{deadpool, 0, 0x13e4440} Node{spider-man, 0, 0x0} Push back... Node{spider-man, 0, 0x13e4500} Node{deadpool, 0, 0x13e44c0} Node{batman, 0, 0x13e4480} Node{wolverine, 0, 0x13e4440} Node{superman, 0, 0x0} QSort... Node{batman, 0, 0x13e4500} Node{deadpool, 0, 0x13e4540} Node{spider-man, 0, 0x13e4440} Node{superman, 0, 0x13e4480} Node{wolverine, 0, 0x0} Reversing... Node{wolverine, 0, 0x13e4440} Node{superman, 0, 0x13e4540} Node{spider-man, 0, 0x13e4500} Node{deadpool, 0, 0x13e44c0} Node{batman, 0, 0x0} MSort... Node{batman, 0, 0x13e4500} Node{deadpool, 0, 0x13e4540} Node{spider-man, 0, 0x13e4440} Node{superman, 0, 0x13e4480} Node{wolverine, 0, 0x0}
README.md
In your README.md
, respond to the following prompts:
Describe what memory you allocated in node_create
and what memory you
deallocated in node_delete
. How did you handle the recursive
flag?
Describe what memory you allocated in list_create
and what memory you
deallocated in list_delete
. How did you handle internal struct nodes
?
Describe how your list_qsort
function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Describe how your list_reverse
function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Describe how your list_msort
function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Once you have a working list library, you are to complete the lsort.c
program:
# Display usage $ ./lsort -h Usage: ./lsort -n Numerical sort -q Quick sort # Perform string sort using merge sort $ shuf -i1-10 -n 10 | ./lsort 1 10 2 3 4 5 6 7 8 9 # Perform numeric sort using merge sort $ shuf -i1-10 -n 10 | ./lsort -n 1 2 3 4 5 6 7 8 9 10 # Perform string sort using quick sort $ shuf -i1-10 -n 10 | ./lsort -q 1 10 2 3 4 5 6 7 8 9 # Perform numeric sort using quick sort $ shuf -i1-10 -n 10 | ./lsort -n -q 1 2 3 4 5 6 7 8 9 10
The lsort
program reads in data from standard input one line at a time,
builds a singly-linked list, and then uses either qsort or your custom
merge sort to order the data. The user may specify either a string or
numerical based ordering based on the the -n
flag.
lsort.c
The lsort.c
file is contains the C99 implementation of the list sorting
tool described above.
In addition to implementing command line parsing in the main
function and
you will need to implement the lsort
function:
void lsort(FILE *stream, bool numeric, bool quicksort)
This function reads one line at a time from the
stream
and builds a singly-linked list. It then sorts the list usinglist_qsort
ifquicksort
istrue
, otherwise it uses thelist_msort
. To compare elements, it usesnode_compare_number
ifnumeric
istrue
, otherwise it usesnode_compare_string
. Once the list is sorted, this function will print thestring
field of each element in the list.
test_lsort.sh
As you implement lsort.c
, you can test it by running the test-list
target:
# Run test-lsort target $ make test-lsort make test-lsort Compiling lsort.o... Compiling node.o... Compiling list.o... Linking liblist.a... Linking lsort... Testing lsort... Testing lsort output (strings & msort)... Testing lsort output (numbers & msort)... Testing lsort output (strings & qsort)... Testing lsort output (numbers & qsort)... Testing lsort memory (strings & msort)... Testing lsort memory (numbers & msort)... Testing lsort memory (strings & qsort)... Testing lsort memory (numbers & qsort)... # View log $ cat test_lsort.log # Run test_lsort.sh manually $ ./test_lsort.sh Testing lsort output (strings & msort)... Testing lsort output (numbers & msort)... Testing lsort output (strings & qsort)... Testing lsort output (numbers & qsort)... Testing lsort memory (strings & msort)... Testing lsort memory (numbers & msort)... Testing lsort memory (strings & qsort)... Testing lsort memory (numbers & qsort)...
benchmark.{sh,py}
Once you have a working lsort
list sorting utility, you are to write a
benchmark script in either shell or Python that measures the amount of
time and space it takes to numerically sort the following number
of items:
1, 10, 100, 1000, 10000, 100000, 1000000, 10000000
To help you measure the time and space a program uses, you can utilize the measure.c program written by David Chiang. Once you have downloaded and compiled this program, you can use it in the following manner:
$ shuf -i1-100000000 -n 10000000 | ./measure ./lsort -n > /dev/null
11.706000 seconds 611.625000 Mbytes
This means that the lsort
utility performed a numerical merge sort on
10000000
numbers in 11.7
seconds and used 611.6
megabytes of memory.
Your script, either benchmarch.sh
or benchmark.py
, should automate the
timing and measurement of these experiments to produce a markdown table
looks like the following:
| NITEMS | SORT | TIME | SPACE | | -------- | -------- |-----------|-----------| | 1 | Merge | 0.0 | 1.0 | | 1 | Quick | ... | ... | | ... | ... | ... | ... | | 10000000 | Merge | 11.7 | 611.6 | | 10000000 | Quick | ... | ... |
README.md
In your README.md
, include markdown table your benchmark script
generated and then answer the following questions:
Based on your experiments, what is the overall trend in terms of time and space as the number of items increases for both of your sorting methods?
Are the results surprising to you? Explain why or why not.
Given what you said about the space complexity and time complexity of
list_msort
and list_qsort
in Activity 1 and the experimental results
in Activity 2, what can you say is the relationship between theoretical
complexity and real world performance? Explain.
A great way to practice your data structures knowledge (say for preparing for a technical interview) is to work on some programming challenges. Hacker Rank is a website that allows users to practice all sorts of different types of programming concepts including data structures and algorithms.
For extra credit, you are to signup for Hacker Rank and solve at least three Linked Lists Challenges.
Afterwards, simply show the instructor or TA the Hacker Rank dashboard showing you successfully solved at least three challenges to receive credit.
To submit your assignment, please commit your work to the homework07
folder
in your assignments GitLab repository. Your homework07
folder should
only contain at the following files:
Makefile
README.md
benchmark.sh
or benchmark.py
list.c
list.h
lsort.c
node.c
node.h
test_list.c
test_lsort.sh
test_node.c