Readings

This week we will be discussing Databases and ways to organize large collections of related data.

The readings for Tuesday, February 16 are:

  1. Chapter 14 Using databases and Structured Query Language (SQL)

  2. Introduction to Databases with Python

Optional resources:

  1. Python School - Databases

  2. A thorough guide to SQLite database operations in Python

  3. Python Advanced: Python and SQL

  4. sqlite3 - Embedded Relational Database

Questions

Once you have completed these readings, please answer the following questions:

  1. Define the following terms:

    a. Database

    b. Table

    c. Column

    d. Row

  2. What is SQL?

    Describe what the following four SQL verbs do:

    a. SELECT

    c. INSERT:

    b. UPDATE:

    d. DELETE:

  3. Download the following two files:

    1. cdt-30020-sp16.tsv

      This is a tab separated file that contains information about all the students in the class in the following format:

      NETID   CLASSIFICATION  FIRST_NAME  LAST_NAME
      
    2. students_database.py

      This is a script loads the file above and allows the user to search for information by field.

    To download the files via the command line, you can do the following:

    $ curl -O https://www3.nd.edu/~pbui/teaching/cdt.30020.sp16/static/tsv/cdt-30020-sp16.tsv
    $ curl -O https://www3.nd.edu/~pbui/teaching/cdt.30020.sp16/static/py/students_database.py
    

    To use the script, you to pass the following command line arguments:

    • TYPE: This should either be list or sql to choose between which database backend you wish to use.

    • TSV_FILE: This is the path to the cdt-30020-sp16.tsv file.

    • FIELD: This is one of the four fields to search (ie. netid, classification, first_name, or last_name).

    • QUERY: This what value to search for.

    For instance, to search for all students that are sophomores using the list database method you can do the following:

    $ python students_database.py list cdt-30020-sp16.tsv classification Sophomore
    

    Likewise, to search for all students with the first name Erica using the sql database method, you can do the following:

    $ python students_database.py sql cdt-30020-sp16.tsv first_name Erica
    

    Once you have the files downloaded, play around with the students_database.py script, read it, and then answer the following questions:

    a. Describe what the StudentListDatabase class does. How is the information stored? How does the search method work?

    b. Describe what the StudentSqliteDatabase class does. Explain how the information is stored by discussing the following blocks of code:

    # Block 1
    curs      = self.conn.cursor()
    statement = '''
    CREATE TABLE IF NOT EXISTS Students(
    netid           TEXT NOT NULL PRIMARY KEY,
    classification  TEXT NOT NULL,
    first_name      TEXT NOT NULL,
    last_name       TEXT NOT NULL
    )
    '''
    curs.execute(statement)
    
    # Block 2
    for line in open(path):
        netid, classification, first_name, last_name = line.strip().split('\t')
        statement = '''
        INSERT OR REPLACE INTO Students (netid, classification, first_name, last_name) VALUES (?, ?, ?, ?)
        '''
        curs.execute(statement, (netid, classification, first_name, last_name))
    

    Describe how the search method works by explaining the following block of code:

    # Block 3
    curs      = self.conn.cursor()
    statement = 'SELECT * FROM Students WHERE {}=?'.format(field)
    for row in curs.execute(statement, (query,)):
        matches.append(Student(row[0], row[1], row[2], row[3]))
    

    SQL Statements

    For each block of code, be sure to highlight the SQL statement and discuss what action it is performing.

Feedback

If you have any questions, comments, or concerns regarding the course, please provide your feedback at the end of your response.

Submission

To submit your response, please commit your work (responses and scripts) to the reading04 folder of your your Readings Bitbucket repository.