Everyone:
Next week, we will have Exam 02, which will cover scripting in Python with a focus on data structures, functional programming, and concurrency and parallelism. This reading assignment, based on the items in Checklist 02, is meant to help you review for the exam.
The readings for Wednesday, March 7 are:
This week, the reading is split into two sections: the first part is a short dredd quiz, while the second part involves four short Python scripts: translate1.py, translate2.py, translate3.py, and translate4.py.
To test these scripts, you will need to download the Makefile and test scripts:
$ git checkout master         # Make sure we are in master branch
$ git pull --rebase           # Make sure we are up-to-date with GitLab
$ git checkout -b reading07   # Create reading07 branch and check it out
$ cd reading07                # Go into reading07 folder

# Download Reading 07 Makefile
$ curl -LO https://gitlab.com/nd-cse-20289-sp18/cse-20289-sp18-assignments/raw/master/reading07/Makefile

# Execute tests (and download them)
$ make
Record the answers to the following Reading 07 Quiz questions in your reading07 branch:
Given the following Unix pipelines, write Python scripts (i.e. translateX.py) that accomplish the same task.
translate1.py: grep -Po '9\d*9' /etc/passwd | wc -l
translate2.py: cat /etc/passwd | cut -d : -f 5 | grep -Po '[Uu]ser' | wc -l
translate3.py: curl -sL http://yld.me/raw/lmz | cut -d , -f 2 | grep -Eo '^B.*' | sort
translate4.py: /bin/ls -l /etc | awk '{print $2}' | sort | uniq -c
No credit will be given for simply calling os.system on the given pipeline.
Use functional programming whenever possible.
You do not need to do a literal translation (that is, you don't have to replicate each portion of the pipeline); you just need to accomplish the same overall task and emit the same output.
Most of the scripts should only be 5 - 10 lines long; a sketch of one possible approach follows these notes.
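For instance, here is a minimal sketch of one possible translate1.py (one approach among many; your structure may differ as long as the output matches):

#!/usr/bin/env python3

# Sketch for translate1.py: replicate grep -Po '9\d*9' /etc/passwd | wc -l
# Note: grep -o emits one line per match, so wc -l counts matches, not lines

import re

count = sum(len(re.findall(r'9\d*9', line)) for line in open('/etc/passwd'))
print(count)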
For extra credit, you can brute-force attack passwords larger than length 6 by utilizing Makeflow to coordinate an army of hulk.py processes. As
discussed in class, Makeflow is a workflow execution system that models
applications in terms of a DAG. During execution, this graph is traversed
and independent nodes are executed in parallel if there are enough resources.
Before we can use Makeflow, we must first create a workflow DAG. To help you get started, we have provided you with fury.py:
# Download fury
$ curl -LO https://gitlab.com/nd-cse-20289-sp18/cse-20289-sp18-assignments/raw/master/reading07/fury.py

# Make it executable
$ chmod +x fury.py
Internally, fury.py contains the following Python code:
#!/usr/bin/env python3

import hulk
import json

# Constants

HULK   = 'hulk.py'
HASHES = hulk.HASHES

# Makeflow Class

class Makeflow(object):
    def __init__(self):
        self.rules = []

    def add_rule(self, command, inputs, outputs, local=False):
        rule = {
            'command': command,
            'inputs' : inputs,
            'outputs': outputs,
        }
        if local:
            rule['local_job'] = True
        self.rules.append(rule)

    def __str__(self):
        return json.dumps({
            'rules' : self.rules,
        }, indent=4)

# Main execution

if __name__ == '__main__':
    makeflow = Makeflow()
    outputs  = []

    # Passwords of length 1 - 4
    for length in range(1, 5):  # TODO: do up through length 6
        output = 'p.{}'.format(length)
        makeflow.add_rule(
            './{} -l {} -s {} > {}'.format(HULK, length, HASHES, output),
            [HULK, HASHES],
            [output],
        )
        outputs.append(output)

    # Passwords of length 7, 8
    # TODO: Add rules for lengths 7 and 8 by taking advantage of prefix arguments

    # Merge all passwords
    makeflow.add_rule(
        'cat {} > passwords.txt'.format(' '.join(outputs)),
        outputs,
        ['passwords.txt'],
        True
    )

    print(makeflow)
As can be seen, fury.py defines a simple Makeflow class that allows you to add rules, each of which is a definition of a node in the graph (i.e. the command to run, a list of inputs, and a list of outputs).
Creating a workflow is just a matter of defining all the rules or commands
that need to be executed.
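For example, adding a single hypothetical rule (the command and filenames here are made up purely for illustration) and printing the result looks like this:

# Assumes the Makeflow class defined in fury.py above
flow = Makeflow()
flow.add_rule('gzip -k data.txt', ['data.txt'], ['data.txt.gz'])  # one node in the DAG
print(flow)  # emits the JSON document that makeflow will later consume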
In the given starter code, we have defined the rules for running hulk.py on passwords of length 1 - 4. Additionally, we have a final merge rule that combines the output of all the previous commands into a single passwords.txt file.
To use fury.py, you need to make sure you have a working hulk.py and that it is in the same directory as fury.py. You will also need a copy of hashes.txt from Homework 05. Once all these conditions have been met, you can generate a Makeflow file by running fury.py:
$ ./fury.py | tee Makeflow
{
    "rules": [
        {
            "command": "./hulk.py -l 1 -s hashes.txt > p.1",
            "inputs": [
                "hulk.py",
                "hashes.txt"
            ],
            "outputs": [
                "p.1"
            ]
        },
        {
            "command": "./hulk.py -l 2 -s hashes.txt > p.2",
            "inputs": [
                "hulk.py",
                "hashes.txt"
            ],
            "outputs": [
                "p.2"
            ]
        },
        {
            "command": "./hulk.py -l 3 -s hashes.txt > p.3",
            "inputs": [
                "hulk.py",
                "hashes.txt"
            ],
            "outputs": [
                "p.3"
            ]
        },
        {
            "command": "./hulk.py -l 4 -s hashes.txt > p.4",
            "inputs": [
                "hulk.py",
                "hashes.txt"
            ],
            "outputs": [
                "p.4"
            ]
        },
        {
            "command": "cat p.1 p.2 p.3 p.4 > passwords.txt",
            "inputs": [
                "p.1",
                "p.2",
                "p.3",
                "p.4"
            ],
            "outputs": [
                "passwords.txt"
            ],
            "local_job": true
        }
    ]
}
As you can see, fury.py generates a JSON document that contains the rules or commands that need to be run in the workflow.
To run Makeflow, you will first need to set some environment variables:
# In Bash
export PATH=~condor/software/sbin:$PATH
export PATH=~condor/software/bin:$PATH
export PATH=/afs/crc.nd.edu/group/ccl/software/x86_64/redhat6/cctools/current/bin:$PATH

# Check
$ which makeflow
/afs/crc.nd.edu/group/ccl/software/x86_64/redhat6/cctools/current/bin/makeflow
Next, you can run the Makeflow generated by fury.py on the local machine by doing the following:
$ makeflow --jx -T local
parsing ./Makeflow...
local resources: 12 cores, 11908 MB memory, 8789 MB disk
max running local jobs: 12
checking ./Makeflow for consistency...
./Makeflow has 5 rules.
starting workflow....
submitting job: ./hulk.py -l 4 -s hashes.txt > p.4
submitted job 31128
submitting job: ./hulk.py -l 3 -s hashes.txt > p.3
submitted job 31129
submitting job: ./hulk.py -l 2 -s hashes.txt > p.2
submitted job 31130
submitting job: ./hulk.py -l 1 -s hashes.txt > p.1
submitted job 31131
job 31131 completed
job 31130 completed
job 31129 completed
job 31128 completed
submitting job: cat p.1 p.2 p.3 p.4 > passwords.txt
submitted job 31170
job 31170 completed
nothing left to do
When using the local batch system (i.e. -T local), Makeflow automatically detects how many cores there are on the system and will execute up to that many processes at once. In this case, since we only have 4 hulk.py rules, each is run independently until the final merge occurs at the end of the workflow.
Now that you have an idea of what Makeflow does, you need to modify fury.py so that it generates rules for passwords of length 5, 6, 7, and 8. There are TODOs indicating where you should modify the code.
Passwords of length 5 and 6 are straightforward: just extend the range in the provided for loop, as sketched below.
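In other words, the loop in fury.py would become something like this (a sketch that simply widens the existing range):

    # Passwords of length 1 - 6 (was range(1, 5))
    for length in range(1, 7):
        output = 'p.{}'.format(length)
        makeflow.add_rule(
            './{} -l {} -s {} > {}'.format(HULK, length, HASHES, output),
            [HULK, HASHES],
            [output],
        )
        outputs.append(output)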
For passwords of length 7 and 8, you should take advantage of the prefix command-line argument of hulk.py. For instance, instead of a single rule for passwords of length 7, you should have multiple rules for passwords of length 6 and a unique prefix:
{ "command": "./hulk.py -c 2 -l 6 -s hashes.txt -p a > p.7a", "inputs": [ "hulk.py", "hashes.txt" ], "outputs": [ "p.7a" ] }, { "command": "./hulk.py -c 2 -l 6 -s hashes.txt -p b > p.7b", "inputs": [ "hulk.py", "hashes.txt" ], "outputs": [ "p.7b" ] }, ... { "command": "./hulk.py -c 2 -l 6 -s hashes.txt -p 9 > p.79", "inputs": [ "hulk.py", "hashes.txt" ], "outputs": [ "p.79" ] }, ...
The same thing applies for passwords of length 8, except you should have prefixes of length 2. In total, your resulting Makeflow should have 1339 rules or jobs.
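One way to generate these prefix rules is sketched below. It assumes hulk.py exposes its candidate character set as hulk.ALPHABET (an assumption; adjust to however your hulk.py names it) and slots into the main block of fury.py:

    import itertools  # move to the top of fury.py with the other imports

    # Passwords of length 7 and 8: one rule per prefix so the work parallelizes
    # NOTE: hulk.ALPHABET is assumed to be the 36-character set [a-z0-9]
    for length, prefix_length in ((7, 1), (8, 2)):
        for prefix in map(''.join, itertools.product(hulk.ALPHABET, repeat=prefix_length)):
            output = 'p.{}{}'.format(length, prefix)
            makeflow.add_rule(
                './{} -c 2 -l 6 -s {} -p {} > {}'.format(HULK, HASHES, prefix, output),
                [HULK, HASHES],
                [output],
            )
            outputs.append(output)

With a 36-character alphabet, this yields 36 + 1296 = 1332 prefix rules, which together with the six length rules and the final merge rule gives the expected 1339.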
Once you have a Makeflow with all the rules necessary to brute-force passwords of length 1 - 8, you can execute it on the Condor cluster using Work Queue. As discussed in class, Condor is a system for managing a large number of machine resources, while Work Queue is a framework for building large scale master-worker applications.
Using one of the student machines, you can start your Makeflow with the following command:
# Start Makeflow with Work Queue batch system
$ makeflow --jx -T wq -N fury-$NETID    # Replace $NETID with your NETID
This will start your Makeflow with the Work Queue engine. Unfortunately, nothing will happen until you submit workers to the Makeflow. To do this, you can run the following command (from another terminal or shell):
$ condor_submit_workers -N fury-$NETID 50    # Replace $NETID with your NETID
Creating worker submit scripts in /tmp/pbui-workers...
Submitting job(s)..................................................
50 job(s) submitted to cluster 647011.
This will submit 50 workers to the Condor pool. To check on the status of the workers, you can run the following command:
$ condor_q -submitter pbui

-- Submitter: pbui@nd.edu : <129.74.152.75:9618?... : student02.cse.nd.edu
 ID       OWNER   SUBMITTED     RUN_TIME   ST PRI SIZE CMD
 647011.0 pbui    2/26  21:05   0+00:00:00 I  0   1.0  work_queue_worker
 647011.1 pbui    2/26  21:05   0+00:00:00 I  0   1.0  work_queue_worker
 647011.2 pbui    2/26  21:05   0+00:00:00 I  0   1.0  work_queue_worker
 647011.3 pbui    2/26  21:05   0+00:00:00 I  0   1.0  work_queue_worker
...
To check on the status of the Work Queue system, you can use the following command:
$ work_queue_status
PROJECT              HOST                   PORT WAITING RUNNING COMPLETE WORKERS
dihedral             128.120.146.4          9251      70       0      303       0
dihedral             128.120.146.4          7325       0       6      802      83
forcebalance         chem9165.ucdavis.edu  50123     241      30      873      30
wq_test_shore        js-129-114-104-114.je  9155       0      34       62      15
wq_test_Pisa2dvru1   js-157-111.jetstream-  9155      22       0        0       0
forcebalance         skyholder.ucdavis.edu 50123     246      30      868      30
fury-pbui            student02.cse.nd.edu   9002     950      50        0      50
As can be seen in the display above, the fury-pbui project has 950 rules left to run. It is currently running 50 jobs on 50 workers (the ones we submitted to Condor earlier).
To start a local worker for testing purposes, you can use the following command:
$ work_queue_worker -d all -N fury-$NETID
If you have your own machines, you can download CCTools and run the work_queue_worker or work_queue_factory from your own machine. This will allow you to add additional resources to your Work Queue pool and thus help you complete the brute-force attack sooner.
Cracking all 10419 passwords took the instructor about six hours using 200+ workers.
Since this will take a while, you will probably want to run Makeflow inside of either a tmux or screen session:
# Start tmux
$ tmux
To make sure you have permissions to write your data, make sure you grab AFS tokens before you run the Makeflow:
# Grab AFS tokens
$ kinit -l30d
$ aklog
$ tokens    # Check if you have tokens

# Start Makeflow
$ makeflow --jx -T wq -N fury-$NETID    # Replace $NETID with your NETID
With this setup, you can then disconnect from student02. To return to your Makeflow, you can just do:
$ tmux attach
To monitor the Makeflow, you can either look at the output of work_queue_status or you can use the makeflow_monitor script:
$ makeflow_monitor Makeflow.makeflowlog
If your workflow fails or gets interrupted, you can always restart your Makeflow by using the command above and it should resume where it left off. That is, it will not repeat any tasks it has successfully completed.
If you modify the Makeflow file, however, you will need to remove Makeflow.makeflowlog and restart the whole process.
To get credit for this Guru Point, you must show a TA or the instructor your completed fury.py and get at least 10,000 passwords on the deadpool.
To submit your work, follow the same process outlined in Reading 01:
$ git checkout master                 # Make sure we are in master branch
$ git pull --rebase                   # Make sure we are up-to-date with GitLab
$ git checkout -b reading07           # Create reading07 branch and check it out
$ cd reading07                        # Go into reading07 folder

$ $EDITOR answers.json                # Edit your answers.json file

$ ../.scripts/submit.py               # Check reading07 quiz
Submitting reading07 assignment ...
Submitting reading07 quiz ...
     Q01 0.20
     Q02 0.20
     Q03 0.20
     Q04 0.20
     Q05 0.20
     Q06 0.20
     Q07 0.20
     Q08 0.20
     Q09 0.20
     Q10 0.20
   Score 2.00

$ git add answers.json                # Add answers.json to staging area
$ git commit -m "Reading 07: Quiz"    # Commit work

$ $EDITOR translate1.py               # Edit your translate1.py file
$ $EDITOR translate2.py               # Edit your translate2.py file
$ $EDITOR translate3.py               # Edit your translate3.py file
$ $EDITOR translate4.py               # Edit your translate4.py file

$ make                                # Test all scripts
Testing translations ...
 translate1.py ... Success
 translate2.py ... Success
 translate3.py ... Success
 translate4.py ... Success
   Score 2.00

$ git add Makefile                    # Add Makefile to staging area
$ git add translate1.py               # Add translate1.py to staging area
$ git add translate2.py               # Add translate2.py to staging area
$ git add translate3.py               # Add translate3.py to staging area
$ git add translate4.py               # Add translate4.py to staging area
$ git commit -m "Reading 07: Scripts" # Commit work
$ git push -u origin reading07        # Push branch to GitLab
Remember to create a merge request and assign the appropriate TA from the Reading 07 TA List.