Madeline Sample Experiment

TraceLab Components for Generating Speech Act Types in Developer Question/Answer Conversations

How to Reproduce the Experiment

This page contains the Madeline experiment with the ND components. To reproduce Madeline experiment in TraceLab, below are the steps to run the experiment:

Open VirtualBox and import the VirtualMachine(Debian) "madeline_tracelab.ova" found here. The successful import is shown in VM. Then run tracelab.
Open the terminal (highlighted in red) and type: tracelab.
If an error occurs with a message "canberra-gtk-module" (as highlighted in red), ignore it (i.e., the module is alreday installed) and re-run it again. The new tracelab screen opens (highlighted in orange)
The madeline_complete_exp.teml is the executable file that runs the madeline experiment. To locate this file, follow the steps below:

Open icon -- label (1)

Open everyman folder -- label (2)
Follow the directory path -- label (3).
Open madeline_complete_exp.teml file -- label (4).

This snapshot shows the ND Madeline sample experiment for TraceLab. All components available in this experiment were developed by the ND TraceLab.
To run the experiment: select the start node (1), then click the green run button at the top left corner (2).
The results are reported within the Tracelab both in workspace -- label (1) and log space -- label (2). Note that the output shows the generated speech acts (e.g., apiAnswer, clarificationQuestion, confirmation, etc) of different types and their corresponding performance metrics (ie., precision, recall, f-1-score and support).
The same output results also can be found in the directory: /home/everyman/Madeline/extension/results/experiment_1. To locate the output files, follow steps labeled (1) -- (5):

Open icon label (1)
Open everyman folder -- label (2)
Follow the directory path -- label (3).
Change default ''Experiment files (.teml)" to "All files" -- label (4)
Open output files:

nb_5_fold_cv_results.txt -- label (5)
lr_5_fold_cv_results.txt -- label (5)

Details about Data

Skype transcripts (found at: /home/everyman/Madeline/data/SkypeTranscripts). Skype transcripts are dialogues collected by virtual assistant and programmer. Programmer's role was to ask "virutal assistant" questions during the bug repair. Questions were of different type such as: implementation questions or clarification questions, .etc. The "virtual assistant's" role was to assist with hints to guide the programmer to repair the bug.

codes.csv -- is the file that contains the listing of all participants along with annotation labels (e.g, clarificationQuestion, implementationQuestion, etc).
participant1.txt. . .participant30.txt - are the files that provide conversations collected between participant and virtual assistant.
results.csv -- is the file that contains each line of text (i.e. speech act) that needs to be classified and its corresponding annotation label that is expected

Performance metrics calculated for each speech act type (found at: /home/everyman/Madeline/extension/results/).

nb_5_fold_cv_results.txt
lr_5_fold_cv_results.txt

Prediction where each speech act type is the text to be classified and the annotated label is computed indirectly through a python call via "ND Shallow Attribute Calculations Component".

shallow_attribute_calculations.py (found at: /home/everyman/Madeline/extension)

shallow_attribute_calculations.py - invokes:

experiment_handles.create_cv_experiment_function(experiment_classifiers) from:
function_handles.py (found at: /home/everyman/Madeline/experiment_handles/) . Finally the resulting list of classifiers are then tested through 5 fold cross-validation.

Further details about data interpretation are provided in the FSE'18 paper. Please, see section 9 (page 9 and 10), respectively, subsections: 9.1, 9.2 and 9.3.

Component Configurations

Here we show in detail the configurations (ie., input/output) of each component.

The first component "ND Madeline Transcript Conversation Reader Component" is selected in this picture. It requires as output the path to the folder of transcript reader script. For more details on the "ND Madeline Transcript Conversation Reader Component" click here.
The "ND Shallow Attribute Calculations Component" is selected in this picture. It requires as output the path to the folder of shallow attribute scripts. For more details on the "ND Shallow Attribute Calculations Component" click here.
The "ND MultiLabel SMOTE Component" is selected in this picture. It requires the user to enter which prediction model to run. In this case, if we choose to compute NB (i.e., Naive Bayes) model, then from "ND MultiLabel SMOTE Component" entry we type "NB Model". Otherwise, if we choose to compute LR (i.e., Logistic Regression) model, then from "ND MultiLabel SMOTE Component" entry, we type "LR Model". For more details on the "ND MultiLabel SMOTE Component" click here.
The "ND NB File Parser Component" is selected in this picture. It loads from the workspace the entry that was processed by the ND MultiLabel SMOTE Component. It requires the user to inform the path to directory in order to store NB results. For more details on the "ND NB File Parser Component" click here.
The "ND NB Printer Component" is selected in this picture. It will received from the workspace the contents of the srcml file that was processed by the ND NB File Parser Component. The ND NB Printer Component will print the contents of the of the workspace to TraceLab's log space. For more details on the "ND Printer Component" click here.
The "ND LR File Parser Component" is selected in this picture. It will receive from the workspace the entry that was processed by the ND MultiLabel SMOTE Component. It requires the user to inform the path to directory in order to print the LR results. The path where the results are stored are shown in the workspace of Tracelab. For more details on the "ND LR File Parser Component" click here.
The "ND LR Printer Component" is selected in this picture. It will received from the workspace the contents of the srcml file that was processed by the ND LR File Parser Component. The ND LR Printer Component will print the contents of the of the workspace to TraceLab's log space. For more details on the "ND LR Printer Component" click here.

Downloads

ND Madeline Transcript Conversation Reader-File size 3.3MB
ND Shallow Attribute Calculations-File size 3.3MB
ND MultiLabel SMOTE-File size 3.2MB
ND NB File Parser-File size 3.0MB
ND NB Printer-File size 4.1MB
ND LR File Parser-File size 3.0MB
ND LR Printer-File size 4.1MB
Virtual Machine (Debian)-File size 12.8GB

TraceLab Components for Generating Speech Act Types in Developer Question/Answer Conversations

How to Reproduce the Experiment

Details about Data

Component Configurations

Table of Contents

Downloads