This is a general outline of the key concepts (arranged by topic) that you should know for Exam 02.


The exam will have the following format:

  1. Code Snippets: Write Python code snippets to perform certain tasks.

  2. Short Answers: Briefly answer questions about

  3. Translation: Convert Unix pipelines to Python code.

Parts 1 and 2 must be completed first, on paper. Once they are done, Part 3 can be completed with the aid of your laptop and the Internet (but not other people).

Representative, but not Exhaustive

This checklist is meant to be representative rather than exhaustive (i.e., the exam may include questions that are not shown below).

Generating Documents


  1. What are the advantages and disadvantages of using command line tools such as pdflatex and gnuplot to create documents?

  2. How would you use ImageMagick to:

    1. Convert a PNG image to JPG?

    2. Resize a PNG image?

    3. Blend two PNG images?

    4. Create a GIF from a series of PNG images?


  1. gnuplot

  2. pdflatex

  3. convert

  4. composite
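As a study aid for the ImageMagick question above, here is a minimal sketch of how the four tasks might be invoked from Python. All file names are hypothetical, the 50% resize and blend values and the 20-tick frame delay are arbitrary choices, and the sketch assumes the classic convert and composite commands are installed (newer ImageMagick releases may expose them through a single magick command instead).

```python
import subprocess


def imagemagick_commands():
    """Return example argument lists; every file name here is hypothetical."""
    return {
        # Convert a PNG image to JPG (the output extension selects the format).
        'to_jpg': ['convert', 'input.png', 'output.jpg'],
        # Resize a PNG image to half of its original dimensions.
        'resize': ['convert', 'input.png', '-resize', '50%', 'small.png'],
        # Blend two PNG images with equal weight.
        'blend': ['composite', '-blend', '50', 'a.png', 'b.png', 'blended.png'],
        # Create a looping GIF from a series of PNG frames.
        'gif': ['convert', '-delay', '20', '-loop', '0',
                'frame1.png', 'frame2.png', 'frame3.png', 'animation.gif'],
    }


def run(command):
    """Run one argument list; raises CalledProcessError if the tool fails."""
    subprocess.run(command, check=True)
```

Calling run(imagemagick_commands()['to_jpg']) would execute the first command, provided input.png exists in the current directory.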

Scraping The Web


  1. How is Python different from the Bourne shell? How is it similar?

  2. How do we manage control flow in Python? How do we utilize these constructs?

    • Conditionals

    • Loops

    • Exceptions

    • Functions

  3. What data structures do we have in Python? What are their basic operations?

    • Lists

    • Dictionaries

    • Sets

  4. How do we do the following in Python?

    • Process command-line arguments

    • Read and write files

    • Read standard input

    • Use regular expressions

    • Check if a file exists

    • Execute an external command

    • Fetch data from the web

    • Process JSON data from the web
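The topics above can be reviewed with a small sketch like the following. It is one possible set of answers rather than the expected ones: every function name is made up for illustration, and it uses only the standard library (urllib.request and json) for web access, which may differ from the modules used in class.

```python
import json
import os
import re
import subprocess
import urllib.request


def count_words(lines):
    """Loops, conditionals, and a dictionary: count each word."""
    counts = {}
    for line in lines:
        for word in line.split():
            if word in counts:
                counts[word] += 1
            else:
                counts[word] = 1
    return counts


def to_int(text, default=0):
    """Exceptions: fall back to a default when conversion fails."""
    try:
        return int(text)
    except ValueError:
        return default


def unique_sorted(items):
    """Sets and lists: drop duplicates, then sort."""
    return sorted(set(items))


def find_numbers(text):
    """Regular expressions: return every run of digits in the text."""
    return re.findall(r'[0-9]+', text)


def write_lines(path, lines):
    """Write each item to the file on its own line."""
    with open(path, 'w') as stream:
        for line in lines:
            stream.write(line + '\n')


def read_lines(path):
    """Check that the file exists, then read it without trailing newlines."""
    if not os.path.exists(path):
        return []
    with open(path) as stream:
        return [line.rstrip('\n') for line in stream]


def run_command(arguments):
    """Execute an external command and return its standard output."""
    result = subprocess.run(arguments, capture_output=True, text=True, check=True)
    return result.stdout


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))
```

In a script, command-line arguments are available as sys.argv[1:] and standard input can be iterated line by line as sys.stdin, so count_words(sys.stdin) would count the words piped into the program.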

Sample Questions


Given the following Unix pipelines, write Python code that accomplishes the same task.

Note: No credit will be given for simply calling os.system on the given pipeline.

  1. cat /etc/passwd | cut -d : -f 1 | grep d$ | wc -l

  2. cat /etc/passwd | cut -d : -f 3 | grep -E '^[0-9]{2}$' | sort | uniq

  3. curl -s | cut -d , -f 3 | grep -Eo '^[^aeoiu]*@.*'

  4. curl -s | grep -Eo '^B.*' | cut -d , -f 1 | sort

  5. who | sed -rn 's|.*\((.*)\).*|\1|p' | sort | uniq

  6. ls -l /etc | awk '{print $2}' | sort | uniq -c
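To show the kind of answer a translation question is looking for, here is a hedged sketch of pipelines 1 and 2. It is one possible translation, not the official solution; the function names are made up, and both functions accept any iterable of lines so they can be tried without reading /etc/passwd.

```python
import re


def count_users_ending_in_d(lines):
    """cut -d : -f 1 | grep d$ | wc -l"""
    # cut: keep the first colon-separated field of each line.
    users = (line.rstrip('\n').split(':')[0] for line in lines)
    # grep d$ | wc -l: count the fields that end with the letter d.
    return sum(1 for user in users if user.endswith('d'))


def two_digit_uids(lines):
    """cut -d : -f 3 | grep -E '^[0-9]{2}$' | sort | uniq"""
    uids = set()  # a set discards duplicates, like sort | uniq
    for line in lines:
        fields = line.rstrip('\n').split(':')
        # Assumption: every line has at least three fields, as in /etc/passwd;
        # shorter lines are skipped here.
        if len(fields) >= 3 and re.fullmatch(r'[0-9]{2}', fields[2]):
            uids.add(fields[2])
    return sorted(uids)


def main(path='/etc/passwd'):
    """Print both results for the given password file."""
    with open(path) as stream:
        lines = stream.readlines()
    print(count_users_ending_in_d(lines))
    for uid in two_digit_uids(lines):
        print(uid)
```

Calling main() plays the role of cat /etc/passwd at the front of each pipeline. The remaining pipelines follow the same pattern: replace each stage with the equivalent string, regular expression, or collection operation.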

Processing Data


  1. What is the difference between procedural and functional programming?

  2. How do we use map, filter, reduce, and lambda to do functional programming in Python?

  3. How do we use list comprehensions to do functional programming in Python?

  4. What is an iterator and how is it different from a list?

    • What are the advantages of using an iterator?

    • What are the disadvantages of using an iterator?

  5. What is a generator and how is it different from a list?

  6. What is the difference between concurrency and parallelism?

  7. What is MapReduce?

    • What problem is MapReduce trying to solve?

    • What are the three phases of a typical MapReduce workflow?

    • How would you implement a simple MapReduce application such as word count?
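The questions above can be explored with a short sketch such as the one below. It assumes that the three MapReduce phases in question are map, shuffle (group by key), and reduce, and it runs all three in a single process purely for illustration; the function names and sample values are hypothetical.

```python
from functools import reduce
from itertools import groupby

numbers = [1, 2, 3, 4, 5]

# map, filter, reduce, and lambda
squares = list(map(lambda n: n * n, numbers))          # [1, 4, 9, 16, 25]
evens = list(filter(lambda n: n % 2 == 0, numbers))    # [2, 4]
total = reduce(lambda a, b: a + b, numbers, 0)         # 15

# The same map-and-filter idea expressed as a list comprehension
even_squares = [n * n for n in numbers if n % 2 == 0]  # [4, 16]


def countdown(start):
    """Generator: produces values one at a time instead of building a list."""
    while start > 0:
        yield start
        start -= 1


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)


def shuffle_phase(pairs):
    """Shuffle: sort the pairs and group the counts that share a word."""
    for word, group in groupby(sorted(pairs), key=lambda pair: pair[0]):
        yield (word, [count for _, count in group])


def reduce_phase(grouped):
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped}


def word_count(lines):
    """Chain the three phases together."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

For example, word_count(['a b a', 'B c']) returns {'a': 2, 'b': 2, 'c': 1}. Because map_phase and shuffle_phase are generators, no intermediate list of pairs is built until sorted() needs one for grouping.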