Readings

This week's readings focus on awk, another filter we can use in shell scripts to slice and dice data.

The readings for Monday, February 29 are:

Recommended Reading

Optional Resources

Questions

In your reading07 folder, write the following shell scripts:

  1. head.sh: Use awk to implement your own version of the head Unix filter:

    # Print usage
    $ ./head.sh -h
    usage: head.sh
    
          -n N    Display the first N lines
    
    # Print first 10 lines
    $ ./head.sh < /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/usr/bin/nologin
    daemon:x:2:2:daemon:/:/usr/bin/nologin
    mail:x:8:12:mail:/var/spool/mail:/usr/bin/nologin
    ftp:x:14:11:ftp:/srv/ftp:/usr/bin/nologin
    http:x:33:33:http:/srv/http:/usr/bin/nologin
    uuidd:x:68:68:uuidd:/:/usr/bin/nologin
    dbus:x:81:81:dbus:/:/usr/bin/nologin
    nobody:x:99:99:nobody:/:/usr/bin/nologin
    systemd-journal-gateway:x:191:191:systemd-journal-gateway:/:/usr/bin/nologin
    
    # Print first 2 lines
    $ ./head.sh -n 2 < /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/usr/bin/nologin
    
    # Download test script
    $ curl -O http://www3.nd.edu/~pbui/teaching/cse.20189.sp16/static/sh/test_head.sh
    
    # Make script executable
    $ chmod +x test_head.sh
    
    # Run test script
    $ ./test_head.sh
    head.sh test succesful!
    

    Hints

    1. Use awk -v name=value to pass variables from the shell script to the awk program.
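As a minimal sketch of the -v mechanism (not a solution to head.sh), the hypothetical helper below tags each input line with a label chosen in the shell. The names tag_lines, label, and tag are made up for illustration.

```shell
# Hypothetical helper: prefix every input line with a label set in the shell.
tag_lines() {
    label=$1
    # -v copies the shell value into the awk variable "tag" before any input is read.
    awk -v tag="$label" '{ print tag ": " $0 }'
}

printf 'alpha\nbeta\n' | tag_lines note
```

Inside the awk program the value is used as a plain awk variable (tag), not as a shell expansion ($label), which keeps the program safely inside single quotes.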


  2. catalog_summary.sh: Write a script that uses awk to parse the contents of the CCL Catalog Server:

    $ curl -s http://catalog.cse.nd.edu:9097/query.text
    

    The script should print the total number of CPUs, the total number of unique machine names, and the most common service type, as shown below:

    # Fetch and summarize data from default URL
    $ ./catalog_summary.sh
            Total CPUs: 12272
        Total Machines: 853
    Most Prolific Type: chirp
    
    # Fetch and summarize data from testing URL
    $ ./catalog_summary.sh http://www3.nd.edu/~pbui/teaching/cse.20189.sp16/static/txt/test_catalog_summary.txt
            Total CPUs: 6330
        Total Machines: 457
    Most Prolific Type: bobbit
    
    # Download test script
    $ curl -O http://www3.nd.edu/~pbui/teaching/cse.20189.sp16/static/sh/test_catalog_summary.sh
    
    # Make script executable
    $ chmod +x test_catalog_summary.sh
    
    # Run test script
    $ ./test_catalog_summary.sh
    catalog_summary.sh test succesful!
    

    The script should take one optional argument, URL, which specifies where to fetch the catalog data.
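One common way to handle an optional argument is the shell's default-value expansion. The sketch below is only an illustration: pick_url is a hypothetical function name, and http://example.com/data.txt is a placeholder URL that does not come from the assignment.

```shell
# Hypothetical function: echo the first argument, or the default URL if none is given.
pick_url() {
    # ${1:-default} expands to $1 unless it is unset or empty.
    url=${1:-http://catalog.cse.nd.edu:9097/query.text}
    echo "$url"
}

pick_url                              # falls back to the default URL
pick_url http://example.com/data.txt  # uses the supplied URL
```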

    Live Data

    The default URL, http://catalog.cse.nd.edu:9097/query.text, returns live data. This means that subsequent runs of the catalog_summary.sh script may yield different output.

    To simplify testing, the testing URL, http://www3.nd.edu/~pbui/teaching/cse.20189.sp16/static/txt/test_catalog_summary.txt, is a fixed snapshot that always returns the same information.

    Hints

    1. Use parameter expansion to handle the optional argument.

    2. Use pattern matching to handle the three different cases:

      /pattern/   {   ACTION  }
      
    3. For cpus, you simply need to add to a counter.

    4. For machines and types, use an associative array to track the entries you have already seen, along with a counter for the number of unique entries. Note that the expression (key in array) evaluates to 0 if key is not in the array and to 1 if it is.

    5. Use an END block to print out the totals. You may need to do some processing with a loop to compute the most prolific type.

    6. Look in the Lecture 10 slides for examples of awk.
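The techniques in these hints can be sketched on toy data. The input format below (lines beginning with "qty" or "item") is invented for illustration and is not the catalog format; the names summarize, seen, unique, best, and top are likewise hypothetical.

```shell
# Hypothetical summary over made-up input; it does not parse the catalog data.
summarize() {
    awk '
        # Lines starting with "qty" add their second field to a running total.
        /^qty/  { total += $2 }

        # Lines starting with "item" are tallied by name; a name counts as
        # unique only the first time it appears.
        /^item/ {
            if (!($2 in seen)) unique++
            seen[$2]++
        }

        # After all input is read, loop over the tallies to find the most frequent name.
        END {
            for (name in seen)
                if (seen[name] > best) { best = seen[name]; top = name }
            printf "total=%d unique=%d top=%s\n", total, unique, top
        }
    '
}

printf 'item apple\nqty 3\nitem pear\nqty 4\nitem apple\n' | summarize
```

For the sample input this prints total=7 unique=2 top=apple. If two names tie for the highest count, the winner depends on awk's unspecified array traversal order.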

Commands

In the reading07 folder, write one summary page called awk.md that contains common uses of the command. Be sure to cover the following:

  1. Printing specific fields.

  2. Modifying FS to control input field separator.

  3. Using BEGIN and END.

  4. Using pattern matching.

  5. Using special variables such as NF and NR.

  6. Using associative arrays.
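As a starting point for the summary page, here is a small sketch of field printing, FS, BEGIN and END, and the NF and NR variables. The helper names first_and_last and describe and the sample records are made up for illustration.

```shell
# Hypothetical helper: print the first and last field of each colon-separated record.
first_and_last() {
    # -F: sets the field separator; $NF is the last field of the current record.
    awk -F: '{ print $1, $NF }'
}

# Hypothetical helper: set FS in BEGIN, describe each record, report a count in END.
describe() {
    awk 'BEGIN { FS = ":" }
         { print NR ": " $1 " has " NF " fields" }
         END { print NR " records" }'
}

printf 'alice:x:1000:/home/alice\nbob:x:1001:/home/bob\n' | first_and_last
printf 'alice:x:1000:/home/alice\nbob:x:1001:/home/bob\n' | describe
```

Setting FS in a BEGIN block and passing -F on the command line are two ways of doing the same thing; in the END block, NR holds the number of records read.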

Summaries

Note that your summaries should be in your own words, not simply copied and pasted from the manual pages. Keep them short and concise, and include only common use cases.

Feedback

If you have any questions, comments, or concerns regarding the course, please provide your feedback at the end of your response.

Submission

To submit your assignment, please commit your work to the reading07 folder in your Assignments Bitbucket repository by the beginning of class on Monday, February 29.