PERSONALIZING EXPLANATIONS IN ONLINE PROBLEMS USING MULTI-ARMED CONTEXTUAL BANDITS

Abstract

Online educational resources are providing learning support to hundreds of millions of people; think of resources as diverse as Khan Academy, MOOCs, Duolingo, Canvas, and Moodle. However, these online resources, even if high-quality, may have limited effectiveness in promoting learning if they present a frozen curriculum that does not improve. For digital education to play a substantive role, data-driven methods for improvement are needed. How can educational resources like explanations or hints to problems be continually enhanced and personalized as more students use them?

In our approach, we will develop workflows for crowdsourcing instructional actions from instructors and students, test out these actions using randomized controlled trials, and use multi-armed bandit algorithms to analyze data about which actions are effective, in order to present effective actions more frequently to future learners.

In the crowdsourcing phase, we will examine the most effective way to crowdsource explanations through a series of experiments with three key sets of experimental manipulations. First, we will vary the questions used to elicit explanations. Second, we will vary the real-world impact of the explanations that students give (and explain these to students accordingly). Third, we will adapt methods from behavioral economics to encourage crowdsourced contributions.

We will then build a personalized learning system that self-learns the best explanations to show to students. Central to this system will be multi-armed contextual bandits. In our context, a multi-armed bandit is an algorithm that learns the best explanation to deliver to students over time. Explanations are typically shown at random initially, but as time passes, explanations that are evaluated to be better have a higher likelihood of being displayed. A multi-armed contextual bandit is more powerful in that it can take into account contextual factors (e.g. gender, age, and prior performance of the learner) in selecting explanations.

We will explore three sets of research questions regarding our contextual bandit algorithms. The first set of questions revolves around how best to handle the potentially large number of explanations submitted by learners, so that the contextual bandit can distinguish between them even with a relatively small pool of learners. For example, should we eliminate explanations where the explainer has poor knowledge about how to solve this type of problem, and only use explanations generated by learners with high prior knowledge? Or might the explanations from low prior knowledge learners be useful, because they try to communicate concepts with appropriately simple vocabulary, and avoid unnecessary complexity?

The second set of questions involves the exploration-exploitation tradeoff. We will examine questions such as: Is it better to completely explore for some time and then completely exploit thereafter, or is it better to reduce exploration gradually as time progresses? If the former, when should the cutoff be? If the latter, at what rate should one reduce exploration?

The third set of questions involves studying how best to elicit responses about explanation usefulness. For example, should we ask learners to rate explanation usefulness on a five-point scale, or a seven-point scale? Or should we ask learners to rank explanations against each other?

Our work can potentially impact broad segments of society. There is scope for making MOOCs, educational apps (e.g. Duolingo), university courses, and even traditional classrooms better. More importantly, our tool will benefit not only students but also teachers. Our work will likewise contribute to Department of Defense capabilities: a back-of-the-envelope calculation suggests that tens of thousands of students could save over a million man-hours.

The solutions we propose will result in enhancements to specific content, such as lessons or problems in a MOOC on Calculus. In addition, the method we outline can more generally be used to optimize any kind of educational resource.
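
As an illustration of the bandit behavior the abstract describes (explanations shown at random at first, then better-rated ones shown more often, with the choice depending on learner context), the following is a minimal Python sketch. It is not the project's implementation: the abstract does not name a specific algorithm, so the sketch assumes Beta-Bernoulli Thompson sampling with a separate posterior for each context bucket. The class name `ContextualThompsonBandit`, the function `simulate`, the context labels, and all helpfulness rates are hypothetical.

```python
import random
from collections import defaultdict


class ContextualThompsonBandit:
    """Beta-Bernoulli Thompson sampling, one posterior per (context, explanation).

    Assumption: feedback is binary (helpful / not helpful) and the context is
    a single discrete bucket, e.g. a coarse prior-knowledge level.
    """

    def __init__(self, explanations, seed=None):
        self.explanations = list(explanations)
        self.rng = random.Random(seed)
        # counts[(context, explanation)] = [helpful_count, unhelpful_count]
        self.counts = defaultdict(lambda: [0, 0])

    def select(self, context):
        """Draw a plausible helpfulness rate per explanation; show the highest."""
        def draw(explanation):
            helpful, unhelpful = self.counts[(context, explanation)]
            # Beta(1, 1) prior: with no data, every explanation is equally likely.
            return self.rng.betavariate(helpful + 1, unhelpful + 1)
        return max(self.explanations, key=draw)

    def update(self, context, explanation, was_helpful):
        """Record one learner's binary feedback for the explanation shown."""
        self.counts[(context, explanation)][0 if was_helpful else 1] += 1


def simulate(true_rates, rounds=2000, seed=0):
    """Run the bandit against simulated learners.

    true_rates maps context -> {explanation: probability of 'helpful'}.
    These probabilities are made-up inputs for the simulation, not data.
    Returns how often each (context, explanation) pair was shown.
    """
    rng = random.Random(seed)
    contexts = list(true_rates)
    explanations = list(next(iter(true_rates.values())))
    bandit = ContextualThompsonBandit(explanations, seed=seed)
    shown = defaultdict(int)
    for _ in range(rounds):
        context = rng.choice(contexts)           # a learner arrives
        explanation = bandit.select(context)     # pick an explanation to show
        was_helpful = rng.random() < true_rates[context][explanation]
        bandit.update(context, explanation, was_helpful)
        shown[(context, explanation)] += 1
    return shown


if __name__ == "__main__":
    # Hypothetical scenario: explanation "A" suits low-prior-knowledge
    # learners, explanation "B" suits high-prior-knowledge learners.
    rates = {
        "low_prior": {"A": 0.8, "B": 0.3},
        "high_prior": {"A": 0.3, "B": 0.8},
    }
    for key, count in sorted(simulate(rates).items()):
        print(key, count)
```

Under this assumption, exploration tapers off gradually as evidence accumulates rather than stopping at a fixed cutoff, which corresponds to only one of the two exploration strategies the abstract proposes to compare. Keeping a separate posterior per context bucket is the simplest way to make the choice context-dependent, but it shares no information across contexts, so it needs more learners per bucket than a model that pools them.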

Document Details

Document Type: DoD Grant Award
Publication Date: Sep 04, 2018
Source ID: N000141812755

Entities

People

Joseph Williams

Organizations

Office of Naval Research
United States Navy
University of Toronto
