This week we will be discussing the relational nature of SQL. In particular, we will focus on SQL.
The readings for Thursday, February 25 are:
Once you have completed these readings, please complete the following tasks:
Download and extract the reading05.zip archive:
# Download archive $ curl -O https://www3.nd.edu/~pbui/teaching/cdt.30020.sp16/static/zip/reading05.zip # Extract archive $ unzip reading05.zip # Go to extract folder $ cd reading05
The archive contains the following files:
movies.dat
: This is the movies data from the MovieLens 1M dataset
that we have been using in class to explore databases.
movies.html
, groups.html
: These are the HTML templates for the
web application. They utilize some basic Bootstrap elements for
visual appeal. The theme is Paper from the Bootswatch collection.
movielens.py
: This is the Python web application. It utilizes both
SQLite and Tornado.
An example of the web application in action can be found at http://xavier.h4x0r.space:9999/.
You are to complete the eight TODOs
in the movielens.py
script.
These TODOs
primarily require you to complete the SQL statements. You
should not need to modify significant amounts of Python code or HTML
templates, though you may if you wish.
Here is a walk-through of what you need to complete:
# TODO 1: Connect to database, create tables, and load tables
For this
TODO
, you need to create a connection to the sqlite3 database and assign it to theconn
instance variable of theDatabase
class. Afterwards, you need to call the internal_create_tables
and_load_tables()
methods.
# TODO 2: Complete movies_sql and genres_sql
For this
TODO
, you write the SQL statements that create the Movies and Genres tables, which look like the following.
Movies: Columns: MovieID | Title | Year Primary Key: MovieID Genres: Columns: MovieID | Genre Primary Key: (MovieID, Genre) Foreign Key: MovieID
The Genres table is different from the previous SQL tables we have created. This table has a composite key as its primary key. That is, each row in Genres is uniquely identified by multiple columns: MovieID and Genre.
Additionally, the Genres has a foreign key that references another table in the database. In this case, Genres has MovieID which references the MovieID from the Movies table. For a reference on how to specify a foreign key, you can refer to Referential Integrity in SQLite or SQLite Foreign Key Support.
Be sure to consider the case where the tables are already in the database.
# TODO 3: Complete movie_sql and genre_sql
For this
TODO
, you need to write the SQL statements to insert data into the Movies and Genres table respectively.
Be sure to consider the case where the data is already in the database.
# TODO 4: Complete movie_sql
For this
TODO
, you need to write the SQL statements to select all the movies from the database that match (ie.LIKE
) the specifiedtitle
.
This SQL statement will require us to perform a JOIN, that is mix data from two different tables. Because this is a bit tricky, here is a sketch of what this SQL statement should look like:
SELECT Movies.MovieID, Title, Year, GROUP_CONCAT(Genres.Genre) AS Genres FROM ___________ JOIN ___________ ON ___________ = ___________ WHERE ___________ GROUP BY ___________ LIMIT ___________
As can be seen from the above sketch, we want to return a collection of
movies which consist of MovieID
, Title
, Year
, and Genres
.
To compute Genres
, we instruct SQLite to perform the GROUP_CONCAT
aggregate function. In conjunction with the GROUP BY, the
GROUP_CONCAT combines the column data from multiple related rows into
a single column entry (in this case, we are combining multiple
Genres.Genre values into a single Genres
value for each movie.
In order for us to associate each movie with its collection of
Genres
, we must perform a JOIN. When we perform this operation, we
must tell the database what table to JOIN and ON which column.
In this case, we know that that the MovieID from the Movies
table is associated with the MovieID from the Genres table.
# TODO 5: Complete genre_sql
For this
TODO
, you need to write the SQL statements to select all the movies from the database that have the specifiedgenre
.
This SQL statement will look like the previous movies_sql
except
the WHERE argument will be different and there should be no
LIMIT.
# TODO 6: Complete year_sql
For this
TODO
, you need to write the SQL statements to select all the movies from the database that have the specifiedyear
.
This SQL statement will look like the previous genre_sql
except
the WHERE argument will be different.
# TODO 7: Complete genres_sql
For this
TODO
, you need to write the SQL statements to select all the DISTINCT genres from the Genres table.
We can use the DISTINCT keyword with the SELECT statement to only retrieve the unique elements from a column. Be sure to order the results appropriately.
# TODO 8: Complete years_sql
For this
TODO
, you need to write the SQL statements to select all the DISTINCT years from the Movies table.
This SQL statement will look like the previous genres_sql
.
When you are finished with these tasks, please submit the complete movielens.py
:
$ git add movielens.py $ git commit -m "reading 05: completed" $ git push
If you have any questions, comments, or concerns regarding the course, please provide your feedback at the end of your response.
To submit your response, please commit your work (responses and scripts) to the reading05 folder of your your Readings Bitbucket repository.