The goal of this homework assignment is to allow you to practice using
dynamic memory in C. The first activity involves building a
singly-linked list library, while the second activity requires you to use
this library to build and benchmark a lsort utility similar to sort.
For this assignment, record your source code and any responses to the
following activities in the homework07 folder of your assignments
GitLab repository and push your work by 11:59 PM Friday, March 31, 2017.
As discussed in class, arrays in C are fast for random access, but have of the limitation of being fixed sized. When we need a data structure that can grow organically at run-time, we can turn to the singly-linked list.
For this first activity, you are to create singly-linked list library,
liblist.a, which contains functions for creating a list, deleting a list,
adding elements to a list, dumping a list, converting a list to an array,
sorting a list, and reversing a list.
As explained in your data structures class, a singly-linked list is a data structure which consists of linked nodes. That is, it is a sequence container of node structures where each node consists of a value and a reference to the next node in the sequence as shown below:
For our library, we will be using a singly-linked list with a tail pointer and a stored size:
As with a normal singly-linked list, we will have a series of individual node structures which consist of a value and a reference to the next node.
In the diagram above, we have three individual node structures
holding the values 5, 7, and 4. The 5 node has a reference to
the 7 node, while the 7 node has a reference to the 4 node.
Because there is nothing after the 4 node, the 4 node's next node
reference is simply the NULL pointer.
This is useful because it allows us to iterate through our sequence using the following pseudo-code:
for (curr = head; curr != NULL; curr = curr.next) visit(curr)
To keep track of our nodes, we have a list structure that tracks the
head of the list, the tail of the list, and the size of the list.
In the diagram above, head references the first node structure,
which is the 5 node. Similarly, tail references the last node
structure, which is the 4 node. Finally, size has the value of 3
since the sequence container has 3 elements.
Because our singly-linked list has both a head and tail pointer, we
can insert new node structures to either the front or back of the
sequence container in constant time (ie. O(1)).
Because we maintain the size of the list in the list structure, we do
not need to iterate through the sequence container to compute this value
and thus can report the number of elements in constant time.
For this first activity, you are to implement this version of a singly-linked list.
To help you get started, the instructor has provided you with the following starter code:
# Go to assignments repository $ cd path/to/assignments/repository # Download starter code tarball $ curl -O http://www3.nd.edu/~pbui/teaching/cse.20289.sp17/static/tar/homework07.tar.gz # Extract starter code tarball $ tar xzvf homework07.tar.gz
Once downloaded and extracted, you should see the following files in your
homework07 directory:
homework07
\_ Makefile # This is the Makefile for building all the project artifacts
\_ README.md # This is the README file for recording your responses
\_ list.c # This is the C99 implementation file for the list structure
\_ list.h # This is the C99 header file for the list structure
\_ lsort.c # This is the C99 implementation file for the list sorting utility
\_ node.c # This is the C99 implementation file for the node structure
\_ node.h # This is the C99 header file for the node structure
\_ test_list.c # This is the C99 implementation file for the list structure test program
\_ test_lsort.sh # This is the shell script for testing the list sorting utility
\_ test_node.c # This is the C99 implementation file for the node structure test program
The details on what you need to implement are described in the following sections.
node.h and list.hThe node.h file is the header file for the node structure used in our
implementation of a singly-linked list. Besides a few function prototypes,
this header defines a struct node as:
struct node { char *string; int number; struct node *next; };
This means that each node consists of a string value , its corresponding integer value, and a reference to the next node.
It also contains a type definition node_compare that defines a type for
comparison functions on struct nodes:
typedef int (*node_compare)(const void *a, const void *b);
This will allow us to define functions which accept as node comparison
functions such as node_compare_string and node_compare_number as
arguments.
The list.h file is the header file for the list structure used in our
implementation of a singly-linked list. Besides a few function prototypes,
this header defines a struct list as:
struct list { struct node *head; struct node *tail; size_t size; };
This reflects the design describe above. Each struct list has a
**reference to the head of the list (ie. first element), the tail of the
list (ie. last element), and the size of the list (ie. number of elements).
For this task, you do not need to modify these header files. Instead, you should review them and ensure you understand the provided code.
MakefileOnce again, the Makefile contains all the rules or recipes for building the
project artifacts (e.g. liblist.a, lsort, etc.). Although the provided
Makefile contains most of the variable definitions and test recipes, you
must add the appropriate rules for liblist.a, test_node, test_list,
lsort, and any intermediate objects. The dependencies for these targets
are shown in the DAG below:
Note: Although we are producing a static library in the form of
liblist.a, we will not be statically linking our executables. Instead,
we will use the liblist.a as another object file when we link our
executables.
You must use the CC, CFLAGS, LD, LDFLAGS, AR, and ARFLAGS
variables when appropriate in your rules.
Once you have a working Makefile, you should be able to run the following
commands:
# Build all TARGETS $ make Compiling lsort.o... Compiling node.o... Compiling list.o... list.c:151:15: warning: 'merge' defined but not used [-Wunused-function] struct node * merge(struct node *left, struct node *right, node_compare f) { ^~~~~ list.c:138:7: warning: 'split' defined but not used [-Wunused-function] void split(struct node *head, struct node **left, struct node **right) { ^~~~~ list.c:125:15: warning: 'msort' defined but not used [-Wunused-function] struct node * msort(struct node *head, node_compare f) { ^~~~~ list.c:102:14: warning: 'reverse' defined but not used [-Wunused-function] struct node* reverse(struct node *curr, struct node *prev) { ^~~~~~~ Linking liblist.a... Linking lsort... # Run all tests $ make test Compiling test_node.o... Linking test_node... Testing node... Segmentation fault ...
Depending on your compiler, you may see some warnings with the initial starter code. Likewise, the test programs will all fail in some fashion.
Note: To display messages rather than showing the full command you are executing for each recipe, you can use the following approach:
target: source @echo "Action $@..." @command $@ $<
Prefixing a command in the action portion of a recipe with @ will
suppress displaying the command (only the output will be displayed). If
you wish to actually see the commands, you can temporarily display them
by running make with the -n flag:
# Simulate build with tracing output $ make -n echo Compiling lsort.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o lsort.o lsort.c echo Compiling node.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o node.o node.c echo Compiling list.o... gcc -g -gdwarf-2 -Wall -std=gnu99 -c -o list.o list.c echo Linking liblist.a... ar rcs liblist.a node.o list.o echo Linking lsort... gcc -L. -o lsort lsort.o liblist.a
Note: the -n has make simulate a build and display the actions it
would take. It does not actually perform the actions.
node.cThe node.c file contains the C99 implementation for the node
structure. For this task, you will need to implement the following
functions:
struct node * node_create(char *string, struct node *next)
This function allocates a new
struct node, sets itsstring,number, andnextfields, and returns the allocated structure.
Notes:
struct node * node_delete(struct node *n, bool recursive)
This function deallocates the given
struct node(including thestringfield). Ifrecursiveistruethis function will also deallocate thenextstruct nodeand any of its successors.
Notes: Althought the recursive parameter implies you should use
recursion, you may wish to consider using an iterative implementation
instead to avoid stack overflow.
void node_dump(struct node *n, FILE *stream)
This function prints the
struct nodeinformation to thestreamin the following format: "Node{string, number, next}".
int node_compare_number(const void *a, const void *b)
This function is used by sorting functions such as qsort to compare two
struct nodesby theirnumberfields.
Note: Although the parameters are const void *, they are really
struct node**. Since C lacks templates, it has to use void * to
represent any type. Sorting implementations such as qsort will pass in
&element to the comparison function, meaning each parameter is really a
struct node** since each element is originally a struce node*.
Hint: You can cast the void * parameters to an appropriate type.
int node_compare_string(const void *a, const void *b)
This function is used by sorting functions such as qsort to compare two
struct nodesby theirstringfields.
As you implement node.c, you can test it by running the test-node target:
# Run test-node target $ make test-node Compiling test_node.o... Compiling node.o... Compiling list.o... Linking liblist.a... Linking test_node... Testing node... # View log $ cat test_node.log # Run test_node manually $ ./test_node
These tests run on the principle of silence is golden, which means you will
only get error messages when something fails (it will not say success or
ok).
Feel free to use valgrind to gdb to track down any memory errors.
list.cThe list.c file contains the C99 implementation for the list
structure. For this task, you will need to implement the following
functions:
struct list * list_create()
This function allocates a new
struct list, initializes its fields, and returns the allocate structure.
struct list * list_delete(struct list *l)
This function deallocates the given
struct list.
void list_push_front(struct list *l, char *s)
This function adds a new
struct nodewith the givensstring value to the front of thestruct list.
Notes: Be sure to update the head and tail of the struct list
appropriately. Also, you should use the node_create function.
void list_push_back(struct list *l, char *s)
This function adds a new
struct nodewith the givensstring value to the back of thestruct list.
Notes: Be sure to update the head and tail of the struct list
appropriately. Also, you should use the node_create function.
void list_dump(struct list *l, FILE *stream)
This function iterates through the
struct listand callsnode_dumpon each element.
struct node ** list_to_array(struct list *l)
This function converts the
struct listinto an array ofstruct nodepointers.
Notes: You should not create new nodes; rather, you should simply
have the pointers in the array reference the existing nodes in the
struct list.
void list_qsort(struct list *l, node_compare f)
This function uses the qsort function to sort the
struct list.
Hints:
You should use the list_to_array function to create an array.
Remember to update the next fields of each element after sorting.
Remember to update the head and tail fields of the struct list.
void list_reverse(struct list *l)
This function reverses the
struct listrecursively using the internalreversefunction.
Hints:
Draw a picture on a piece of paper and do this by hand first.
reverse takes in the current node and what its previous node is.
Because we are reversing the list, we want to end up saying the the
next field of the current node is the previous node.
Remember to update the head and tail fields of the struct list.
void list_msort(struct list *l, node_compare f)
This function performs merge sort, which is a recursive divide and conquer sorting algorithm, on the
struct listusing the internalmsort,split, andmergefunctions.
Hints:
Draw a picture on a piece of paper and do this by hand first.
The msort function has the following pseudo-code:
# Divide split(head, &left, &right) # Conquer left = msort(left, f) right = msort(right, f) # Combine head = merge(left, right, f);
The split function divides the provided list into left and right
sublists by using the fast and slow pointer trick. Be sure to NULL
terminate your sublists.
The merge function combines two sublists into a single ordered
list.
Remember to update the head and tail fields of the struct list.
There are many examples of merge sort on linked lists. For instance, GeeksForGeeks has an article about Merge Sort for Linked Lists. You should resist the temptation to look at such sources.
If you do look at those examples, remember that any code you submit must be your own and that you should understand everything you submit.
Moreover, in this case, the example from
GeeksForGeeks is suboptimal as it
uses recursion to perform the merge. You will need an iterative
version of merge, otherwise, you run the risk of overflowing the
stack on large input sizes.
As you implement list.c, you can test it by running the test-list target:
# Run test-list target $ make test-list Compiling test_list.o... Linking test_list... Testing list... # View log $ cat test_list.log # Run test_list manually $ ./test_list Push front... Node{superman, 0, 0x13e4500} Node{wolverine, 0, 0x13e44c0} Node{batman, 0, 0x13e4480} Node{deadpool, 0, 0x13e4440} Node{spider-man, 0, 0x0} Push back... Node{spider-man, 0, 0x13e4500} Node{deadpool, 0, 0x13e44c0} Node{batman, 0, 0x13e4480} Node{wolverine, 0, 0x13e4440} Node{superman, 0, 0x0} QSort... Node{batman, 0, 0x13e4500} Node{deadpool, 0, 0x13e4540} Node{spider-man, 0, 0x13e4440} Node{superman, 0, 0x13e4480} Node{wolverine, 0, 0x0} Reversing... Node{wolverine, 0, 0x13e4440} Node{superman, 0, 0x13e4540} Node{spider-man, 0, 0x13e4500} Node{deadpool, 0, 0x13e44c0} Node{batman, 0, 0x0} MSort... Node{batman, 0, 0x13e4500} Node{deadpool, 0, 0x13e4540} Node{spider-man, 0, 0x13e4440} Node{superman, 0, 0x13e4480} Node{wolverine, 0, 0x0}
README.mdIn your README.md, respond to the following prompts:
Describe what memory you allocated in node_create and what memory you
deallocated in node_delete. How did you handle the recursive flag?
Describe what memory you allocated in list_create and what memory you
deallocated in list_delete. How did you handle internal struct nodes?
Describe how your list_qsort function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Describe how your list_reverse function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Describe how your list_msort function works and analyze its time
complexity and space complexity in terms of Big-O notation (both average
and worst case).
Once you have a working list library, you are to complete the lsort.c
program:
# Display usage $ ./lsort -h Usage: ./lsort -n Numerical sort -q Quick sort # Perform string sort using merge sort $ shuf -i1-10 -n 10 | ./lsort 1 10 2 3 4 5 6 7 8 9 # Perform numeric sort using merge sort $ shuf -i1-10 -n 10 | ./lsort -n 1 2 3 4 5 6 7 8 9 10 # Perform string sort using quick sort $ shuf -i1-10 -n 10 | ./lsort -q 1 10 2 3 4 5 6 7 8 9 # Perform numeric sort using quick sort $ shuf -i1-10 -n 10 | ./lsort -n -q 1 2 3 4 5 6 7 8 9 10
The lsort program reads in data from standard input one line at a time,
builds a singly-linked list, and then uses either qsort or your custom
merge sort to order the data. The user may specify either a string or
numerical based ordering based on the the -n flag.
lsort.cThe lsort.c file is contains the C99 implementation of the list sorting
tool described above.
In addition to implementing command line parsing in the main function and
you will need to implement the lsort function:
void lsort(FILE *stream, bool numeric, bool quicksort)
This function reads one line at a time from the
streamand builds a singly-linked list. It then sorts the list usinglist_qsortifquicksortistrue, otherwise it uses thelist_msort. To compare elements, it usesnode_compare_numberifnumericistrue, otherwise it usesnode_compare_string. Once the list is sorted, this function will print thestringfield of each element in the list.
test_lsort.shAs you implement lsort.c, you can test it by running the test-list target:
# Run test-lsort target $ make test-lsort make test-lsort Compiling lsort.o... Compiling node.o... Compiling list.o... Linking liblist.a... Linking lsort... Testing lsort... Testing lsort output (strings & msort)... Testing lsort output (numbers & msort)... Testing lsort output (strings & qsort)... Testing lsort output (numbers & qsort)... Testing lsort memory (strings & msort)... Testing lsort memory (numbers & msort)... Testing lsort memory (strings & qsort)... Testing lsort memory (numbers & qsort)... # View log $ cat test_lsort.log # Run test_lsort.sh manually $ ./test_lsort.sh Testing lsort output (strings & msort)... Testing lsort output (numbers & msort)... Testing lsort output (strings & qsort)... Testing lsort output (numbers & qsort)... Testing lsort memory (strings & msort)... Testing lsort memory (numbers & msort)... Testing lsort memory (strings & qsort)... Testing lsort memory (numbers & qsort)...
benchmark.{sh,py}Once you have a working lsort list sorting utility, you are to write a
benchmark script in either shell or Python that measures the amount of
time and space it takes to numerically sort the following number
of items:
1, 10, 100, 1000, 10000, 100000, 1000000, 10000000
To help you measure the time and space a program uses, you can utilize the measure.c program written by David Chiang. Once you have downloaded and compiled this program, you can use it in the following manner:
$ shuf -i1-100000000 -n 10000000 | ./measure ./lsort -n > /dev/null
11.706000 seconds 611.625000 Mbytes
This means that the lsort utility performed a numerical merge sort on
10000000 numbers in 11.7 seconds and used 611.6 megabytes of memory.
Your script, either benchmarch.sh or benchmark.py, should automate the
timing and measurement of these experiments to produce a markdown table
looks like the following:
| NITEMS | SORT | TIME | SPACE | | -------- | -------- |-----------|-----------| | 1 | Merge | 0.0 | 1.0 | | 1 | Quick | ... | ... | | ... | ... | ... | ... | | 10000000 | Merge | 11.7 | 611.6 | | 10000000 | Quick | ... | ... |
README.mdIn your README.md, include markdown table your benchmark script
generated and then answer the following questions:
Based on your experiments, what is the overall trend in terms of time and space as the number of items increases for both of your sorting methods?
Are the results surprising to you? Explain why or why not.
Given what you said about the space complexity and time complexity of
list_msort and list_qsort in Activity 1 and the experimental results
in Activity 2, what can you say is the relationship between theoretical
complexity and real world performance? Explain.
A great way to practice your data structures knowledge (say for preparing for a technical interview) is to work on some programming challenges. Hacker Rank is a website that allows users to practice all sorts of different types of programming concepts including data structures and algorithms.
For extra credit, you are to signup for Hacker Rank and solve at least three Linked Lists Challenges.
Afterwards, simply show the instructor or TA the Hacker Rank dashboard showing you successfully solved at least three challenges to receive credit.
To submit your assignment, please commit your work to the homework07 folder
in your assignments GitLab repository. Your homework07 folder should
only contain at the following files:
MakefileREADME.mdbenchmark.sh or benchmark.pylist.clist.hlsort.cnode.cnode.htest_list.ctest_lsort.shtest_node.c