
Replying to
I can approximate a continuous measure by looking at populations: “X% of users in situation Y remembered.” Compare that % for situations Y and Y’ to sorta measure an effect. This works reasonably well when many users are “just on the edge” of remembering, and poorly otherwise…
It’s a threshold function on the underlying distribution. Imagine that a person will remember something iff their depth-of-encoding (a hidden variable)—plus some random noise (situation)—is greater than some threshold. Our population measure can distinguish A vs A’, not B vs B’.
Image
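A minimal sketch of that threshold model, to make the A vs A' / B vs B' contrast concrete. Everything numeric here is an assumption for illustration: depth and noise are taken to be independent normals, the threshold sits at 0, and "A" is read as a population centered on the threshold while "B" is one sitting far above it (the original image isn't reproduced here, so that reading is a guess).

```python
from math import hypot
from statistics import NormalDist


def recall_rate(mean_depth, depth_sd=1.0, noise_sd=0.5, threshold=0.0):
    """Fraction of a population expected to remember.

    Assumes depth ~ Normal(mean_depth, depth_sd) and independent
    noise ~ Normal(0, noise_sd). A person remembers iff
    depth + noise > threshold. All parameters are hypothetical.
    """
    # The sum of two independent normals is normal; the variances add.
    total_sd = hypot(depth_sd, noise_sd)
    return 1.0 - NormalDist(mean_depth, total_sd).cdf(threshold)


SHIFT = 0.5  # the same boost in depth-of-encoding, applied to both populations

# "A vs A'": population centered on the threshold.
near_gain = recall_rate(0.0 + SHIFT) - recall_rate(0.0)

# "B vs B'": population already far above the threshold.
far_gain = recall_rate(3.0 + SHIFT) - recall_rate(3.0)

print(f"gain near threshold: {near_gain:.3f}")
print(f"gain far from threshold: {far_gain:.3f}")
```

The identical shift in the hidden variable moves the population measure a lot in the first case and barely at all in the second, which is the sense in which the measure can't distinguish B from B'.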
So it works pretty well initially, when the distribution’s spread out. e.g.: I’ve been running an RCT on retry mechanics. Of readers who forget an answer while reading an essay, about 20% more will succeed in their first review if the in-essay prompt gave them a chance to retry.
But it doesn’t work well when the distribution’s skewed to one side. e.g.: I’ve run RCTs manipulating schedules. You might think shortened intervals would help struggling readers, but it has little effect on the population measure; it just (likely) nudges some closer to the threshold.
Lack of a good continuous measure makes it hard to characterize the dynamics of what’s going on, which makes it hard to make iterative improvements. I’ll need to find some good solution here. Unfortunately, response times are (AFAICT) not a strong enough predictor to use.
Incidentally, this is part of why Ebbinghaus used nonsense syllables: he was memorizing sequences he’d *never* remember on the first try in subsequent tests. But it’d take less time to re-learn well-rehearsed sequences—time savings as a continuous proxy for depth of encoding.
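For reference, the savings measure is usually written as the fraction of the original learning time that re-learning saves. A small sketch, with made-up timings (the function name and the numbers are illustrative, not from any dataset):

```python
def savings_score(original_seconds, relearning_seconds):
    """Fraction of the original learning time saved on re-learning.

    0.0 means re-learning took just as long as the first time;
    values near 1.0 mean almost no re-learning was needed.
    """
    if original_seconds <= 0:
        raise ValueError("original_seconds must be positive")
    return (original_seconds - relearning_seconds) / original_seconds


# Hypothetical timings: 600 s to learn a list, 240 s to re-learn it later.
print(savings_score(600, 240))
```

Unlike a pass/fail recall test, this varies smoothly even when the learner would fail every first attempt.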
(Yes, I’m aware that some memory systems ask users to subjectively “grade” their memory 1-5, which would be slightly less discrete. I suspect it probably doesn’t add enough measurement resolution to be worth the user burden, but could be worth trying.)
The thing I have to keep reminding myself about a statement like this is that it does *not* mean that the mechanic causes a 20% increase in depth-of-encoding. It's more likely a fairly small increase for a large number of people right below the threshold.
Quote Tweet
So it works pretty well initially, when the distribution’s spread out. e.g.: I’ve been running an RCT on retry mechanics. Of readers who forget an answer while reading an essay, about 20% more will succeed in their first review if the in-essay prompt gave them a chance to retry.
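To put a rough size on "fairly small": under a normal threshold model, the shift in mean depth needed to move a success rate is just the difference of the two rates' probits. This sketch assumes a hypothetical 50% baseline and reads "20% more" as a relative increase (50% to 60%); neither number comes from the RCT itself.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, sd 1


def shift_needed(base_rate, new_rate):
    """Shift in mean depth, in units of the combined depth+noise SD,
    that moves the success rate from base_rate to new_rate.

    Assumes depth + noise is normally distributed (an assumption).
    """
    return STD_NORMAL.inv_cdf(new_rate) - STD_NORMAL.inv_cdf(base_rate)


def rate_after_shift(base_rate, shift):
    """Success rate after applying the same shift to another population."""
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(base_rate) + shift)


# Hypothetical: 50% -> 60% success, i.e. a 20% relative increase.
shift = shift_needed(0.50, 0.60)
print(f"shift needed: {shift:.2f} SD")

# The same shift applied to readers who already succeed 95% of the time.
print(f"95% baseline becomes: {rate_after_shift(0.95, shift):.3f}")
```

So a 20% jump in the population measure is compatible with about a quarter of a standard deviation of change in the hidden variable, and the same change would be nearly invisible in a population that's already far above the threshold.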
Replying to
You could potentially give them multiple-choice answers at different distances from the right answer, so that if they get it wrong you have an idea of how wrong they were, or how close they were to remembering properly.
Replying to
A tangential spanner you may choose to throw in your works: measure people’s ability to answer a question with notes, Google, etc. available. At some point, I get good at using a reference resource, and eventually do away with it in some cases.