Cataloguing User Error Types on EIWL Exercises
Author
Roel Fledderman
Title
Cataloguing User Error Types on EIWL Exercises
Description
Our study aims to find patterns of errors and error types within the available data, using machine learning and other methods.
Category
Essays, Posts & Presentations
Keywords
EIWL, parsing, data science, syntax errors, teaching programming
URL
http://www.notebookarchive.org/2021-07-6g4gm4e/
DOI
https://notebookarchive.org/2021-07-6g4gm4e
Date Added
2021-07-14
Date Last Modified
2021-07-14
File Size
1.15 megabytes
Supplements
Rights
Redistribution rights reserved



WOLFRAM SUMMER SCHOOL 2021
Cataloguing User Error Types on EIWL Exercises
Roel Fledderman
Users can check their answers to exercises from An Elementary Introduction to the Wolfram Language in the Wolfram Cloud. Our study aims to find patterns of errors and error types within the available data, using machine learning and other methods. Within this context, we also aim to identify additional pointers for addressing misunderstandings.
Intro
In addition to reading An Elementary Introduction to the Wolfram Language online, users can find additional exercises and check their solutions in the Wolfram Cloud.
Picture 1: Doing an exercise online.

At the time of writing, a user receives as feedback either CORRECT or TRY AGAIN. In order to teach the Wolfram Language optimally, additional methods could be implemented, such as autocompletion and other forms of automated feedback and assistance.

Initially we conceived of two main types of user error: semantic errors and syntactic errors (a third category could be labeled “wrong question”). There can be some overlap between the two, but in general a semantic error originates from a misunderstanding of the concept being taught. In other words, the user needs some additional reading or thinking, or concrete examples, to correct the error. A syntactic error, on the other hand, can be considered a mistake made while writing the code, much like a typo.

For a brand-new user going through the first few chapters, there may be no real difference yet between a semantic and a syntactic error. However, making syntactic errors can prevent grasping the main concept, as the code is not evaluated correctly and the user is merely prompted to try again. Fixing this layer of errors should enable faster understanding of the semantics at hand. In addition, detecting and correcting syntax errors involves a relatively small set of patterns and rules, so it may be easier to address automatically than conceptual errors: the low-hanging fruit, so to speak.

In this study we therefore focused primarily on the syntactic error type: first by summarizing a sample of user error data, and then by applying machine learning to gauge its practical potential.
Syntactic Errors
An Example
As an example, consider the following exercise and the erroneous submission:
Exercise 12.1: Generate the sequence of notes with pitches 0, 4 and 7.
Submission: Sound[{SoundNote[0],SoundNote[4],SoundNote[7]}
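The problem can be confirmed directly on the submission string with the built-in SyntaxQ:

SyntaxQ["Sound[{SoundNote[0],SoundNote[4],SoundNote[7]}"]
(* False: the closing bracket of Sound is missing *)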
Working from the Wolfram Cloud, the user would have seen an indication of their error while typing their code:
Picture 2: Emphasized error while typing.

Maybe the emphasis wasn’t obvious enough, or the user didn’t know how to correct it. Either way, the rest of the syntax looks essentially correct, and the user was probably able to submit correct code immediately upon trying again.
Dataset and Method
Our main dataset contained 5670 recent submissions that had been flagged by the scoring engine as containing a syntax error. The syntax errors vary from cases like the previous example to somewhat more severe ones.
We then ran CodeParser (available with Version 12.3) against those submissions and analyzed the resulting categories.
As a final step we applied machine learning, first to the general data (Classify) and then to a set of manually curated data (a transformer neural net).
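As a minimal sketch of this step (the submissions here are hypothetical stand-ins; CodeInspect and InspectionObject come from the CodeInspector paclet that accompanies CodeParser):

Needs["CodeInspector`"]

(* hypothetical stand-ins for the flagged submissions *)
submissions = {"Range[10", "Sound[{SoundNote[0],SoundNote[4],SoundNote[7]}"};

(* each submission yields a list of InspectionObject[tag, description, severity, data] *)
inspections = CodeInspect /@ submissions;

(* tally submissions by error tag *)
tally = Counts[First /@ Flatten[inspections]]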
Results
The following graph shows an overview of submissions per CodeParser syntax error type:
[Bar chart: number of submissions per syntax error type]
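A chart along these lines can be produced from the tally in the sketch above (assuming tally is the Association of tag -> count built there):

BarChart[Values[tally],
 ChartLabels -> Keys[tally],
 PlotLabel -> "Submissions per syntax error type"]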
The category answerEmpty has been removed from these graphics. These were instances where the user didn’t enter any code and then checked their answer. We suspect that in these cases users were testing the platform, perhaps looking for hints. Since they provide no code body for syntax analysis, they have been dropped from further investigation.
As explained in this video, parsing code can be a tricky business, which explains the parsingError category. In most of these cases, the parser returned multiple inspection objects. We’ll come back to these in a later section.
Of the resulting types, three stand out: UnterminatedGroup, OpenSquare and CommaTopLevel:
[Bar chart: submissions for the three main syntax error types]
Let’s have a closer look at each.
UnterminatedGroup
What we see in this example is a missing closing “]” at the end of the input:
[Example: a Join[...] submission missing its final closing bracket]
In other words, the user neglected to close the brackets for Join[], giving us this bracket pattern:
[[][[]]
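Such patterns can be surfaced by keeping only the bracketing characters of a submission (bracketPattern is a hypothetical helper, not part of the study’s code):

(* reduce a submission to its bracketing characters *)
bracketPattern[code_String] := StringJoin @ StringCases[code, "[" | "]" | "{" | "}" | "(" | ")"]

bracketPattern["Join[Range[4],Reverse[Range[4]]"]
(* "[[][[]]" *)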
Unterminated groups tend to occur at the end of a line, but not always. In some cases, especially in the first chapters, users also seem confused about which type of bracket to use. For example:
[Example submission mixing bracket types]
Relatively many examples occur in the first few chapters, which suggests users quickly learn to check their brackets.
OpenSquare
In the case below, the number of opening and closing brackets matches, but there appears to be a ‘headless’ list of arguments [__]:
[Example submission with a headless argument list]
Sometimes removing the opening bracket leads to improved syntax. But note that in the example above this would reveal a semantic error.
In other cases it’s unclear whether the user forgot to include a head:
[Another example of a headless argument list]
Other causes could be confusion about which bracket type to use, or missing the Shift key while typing.
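A rough heuristic for spotting a headless opening bracket (a sketch, not the parser’s actual logic; legitimate constructs such as Part’s double bracket would be flagged too):

(* an opening "[" at the start of the input, or preceded by a non-name character *)
headlessOpenSquareQ[code_String] :=
 StringContainsQ[code, StartOfString ~~ "["] ||
  StringContainsQ[code, Except[WordCharacter] ~~ "["]

headlessOpenSquareQ["[1, 2, 3]"]  (* True *)
headlessOpenSquareQ["Max[1, 2]"]  (* False *)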
CommaTopLevel
This error seems to be related to lists and listable functions, and how to use them properly:
[Example submission with a comma at top level]
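For instance (a constructed illustration, not a submission from the dataset), code like

Range[5], Reverse[Range[5]]

has a comma at top level and fails to parse; wrapping the sequence in list braces, {Range[5], Reverse[Range[5]]}, makes it valid syntax.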
Machine Learning
Training a Classifier
We trained a classifier on a set of 900 examples, 300 for each error type. The results are shown below; they didn’t seem discouraging, so we decided to run a quick pilot training a transformer neural net on a manually curated dataset.
[ClassifierMeasurements panel for the trained classifier]
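A minimal sketch of this setup (train and test are assumed to be lists of rules from submission string to error type, e.g. "Range[10" -> "UnterminatedGroup"):

c = Classify[train];                    (* train on string -> class rules *)
cm = ClassifierMeasurements[c, test];   (* evaluate on held-out examples *)
cm["Accuracy"]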
Training a Transformer Neural Net
Below you’ll find some examples of manually curated syntax. The idea was to fix the syntax only minimally, while trying to be conscious of the exercise the student was working on and the level of knowledge the student may have had. For example, in chapter one functions haven’t been introduced yet, so “1,2” is corrected to 1+2 instead of Plus[1,2].
For each example, the user’s syntax is listed on top and the improved syntax below it:
RandomInteger [Range,[10]]
RandomInteger [Range[10]]

Join[Reverse[Range[20],[Range[20]]]
Join[Reverse[Range[20]],[Range[20]]]

Table[i*[i+1],{i,1,1000}]
Table[i*(i+1),{i,1,1000}]
The results object from the training process, based on a set of ~1100 training examples:

[NetTrain results panel]
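Schematically, the training step looks as follows (net stands for the sequence-to-sequence transformer, which is not reproduced here; curatedPairs is the list of "user syntax" -> "improved syntax" rules):

(* the third argument All returns a NetTrainResultsObject *)
results = NetTrain[net, curatedPairs, All];
codeCorrector = results["TrainedNet"];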
What is fun to see is that the transformer is actually doing at least some transformations right, as illustrated below:
"ListPlot[Join]Range[100],Reverse[Range[100]]"
"ListPlot[Join[Range[100],Reverse[Range[100]]]]"
"ListPlot[Join[Range[100],Reverse[Range[100]]]]"
codeCorrector@"Join[[Range[20]],Reverse[Range[20]]]"
"Join[Range[20],Reverse[Range[20]]"
"Join[Range[20],Reverse[Range[20]]"
"StringTake[[StringJoin[Alphabet[]]],5]"
"Sort[Table[Style[5],List[]]]"
"Sort[Table[Style[5],List[]]]"
"WordCloud[#]&/@[\"apple\", \"peach\", \"pear\"]"
"Tolumn[Stylph\"Ule[\"UpharColurarte[\"g\", \", Uph\", \"], \"], \"]"
"Tolumn[Stylph\"Ule[\"UpharColurarte[\"g\", \", Uph\", \"], \"], \"]"
"Sound[{SoundNote[0],SoundNote[4],SoundNote[7]}"
"Sound[{Stylus[],Redoloundowandour[4,\"],\"]"
"Sound[{Stylus[],Redoloundowandour[4,\"],\"]"
"List[Range[RandomInteger[10]]"
"List[RandomInteger[10],RandomInteger[10]]"
"List[RandomInteger[10],RandomInteger[10]]"
"Column[ListPlot[Range[5],ListPlot[Range[5]]]"
"Column[ListPlot[Range[5]],ListPlot[Range[5]]]]"
"Column[ListPlot[Range[5]],ListPlot[Range[5]]]]"
"Times [6,8], Times [5,9] , Max []"
"Times[Plus[8, Times[5, 9]] Powes ] "
"Times[Plus[8, Times[5, 9]] Powes ] "
"Join[Range[4],Reverse[Range[4]]"
"Join[Range[4],Reverse[Range[4]]]"
"Join[Range[4],Reverse[Range[4]]]"
"Table[Part[Table[{Yellow,Red,Green},RandomInteger[{1,3}],{100}]"
"Table[Part[Table[{1,2,3],Reverse[RandomIntegerse[n,{n,0}]]]}]}"
"Table[Part[Table[{1,2,3],Reverse[RandomIntegerse[n,{n,0}]]]}]}"
"Table[i,{i,10]"
"Table[Table[n,{1}]]"
"Table[Table[n,{1}]]"
"Range[10"
"Range[10]"
"Range[10]"
"Times 2,Plus[3,4]=14"
"Times[2,Plus[3,4]]"
"Times[2,Plus[3,4]]"
Admittedly, some transformations are still plain wrong. Some improvements will be suggested in the next section.
Conclusions
Some patterns in user errors were observed, mainly among the different types of syntax errors.
The machine learning results seem promising, and it would be interesting to train a neural net on a bigger set of curated data (in addition to using a more standardized approach). Perhaps syntax errors could also be introduced manually or programmatically into submissions that are known to be correct, as has been done in other studies (e.g. the DrRepair work listed in the references).
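A toy version of such a corruption step (a sketch; corrupt is a hypothetical helper that drops one closing bracket from a known-correct submission):

(* randomly remove one closing square bracket *)
corrupt[code_String] := Module[{pos = StringPosition[code, "]"]},
  If[pos === {}, code, StringReplacePart[code, "", RandomChoice[pos]]]]

corrupt["Join[Range[4],Reverse[Range[4]]]"]
(* e.g. "Join[Range[4],Reverse[Range[4]]" *)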
The omitted data also deserves a closer look. Most of that syntax returned two inspection objects when parsed, which seemed too complicated to analyze when starting out (some returned up to seventeen inspection objects). Combining the parser results for all 700+ omissions reveals some additional syntax error classes and gives us this chart:
[Bar chart: syntax error classes among the omitted submissions]
Finally, most of the available data has remained largely untouched, such as code without errors. This could be of interest as well, for example to study the ratio of correct to incorrect submissions as a way to rank exercise difficulty. Identifying exit points (where users drop out) could reveal additional topics users struggle with. Another approach could be to filter all available correct syntax down to unique inputs and then use Nearest on faulty submissions to train a neural net.
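A sketch of that last idea (correctSubmissions is hypothetical; the nearest known-correct input to a faulty one could serve as its training target):

(* unique known-correct inputs *)
unique = DeleteDuplicates[correctSubmissions];

(* closest correct submission to a faulty one, by edit distance *)
Nearest[unique, "Join[Range[4],Reverse[Range[4]]", DistanceFunction -> EditDistance]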
Keywords
◼
EIWL, Elementary Introduction
◼
Teaching Wolfram Language Optimally
◼
Machine Learning
◼
Data Science
◼
Syntax Errors
◼
Parsing
◼
Misunderstanding
Acknowledgment
Mentor: Jofre Espigule-Pons
I’d like to thank Jofre for his mentorship, as he was kind enough to provide gentle feedback on my coding style, on specific functions and on project focus. Also, he provided input on machine learning and did some initial training on the curated pilot set.
Others that have contributed to this project through their comments or assistance: Richard Hennigan, Silvia Hao, Jesse Friedman, Brenton Bostick & Stephen Wolfram.
References
◼
An Elementary Introduction to the Wolfram Language, 2nd ed.: https://www.wolfram.com/language/elementary-introduction/2nd-ed/
◼
CodeParser: https://community.wolfram.com/groups/-/m/t/1931315
◼
Learning to Fix Programs from Error Messages: http://ai.stanford.edu/blog/DrRepair/


Cite this as: Roel Fledderman, "Cataloguing User Error Types on EIWL Exercises" from the Notebook Archive (2021), https://notebookarchive.org/2021-07-6g4gm4e


