I’ve been studying the dynamics of reader memory with the mnemonic medium, running experiments on interventions, etc. A big challenge has been that I’m trying to understand changes in a continuous value (depth of encoding) through discrete measurements (remembered / didn’t remember).
I can approximate a continuous measure by looking at populations: “X% of users in situation Y remembered.” Compare that % for situations Y and Y’ to sorta measure an effect. This works reasonably well when many users are “just on the edge” of remembering, and poorly otherwise…
It’s a threshold function on the underlying distribution. Imagine that a person will remember something iff their depth-of-encoding (a hidden variable)—plus some random noise (situation)—is greater than some threshold. Our population measure can distinguish A vs A’, not B vs B’.
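A minimal sketch of that threshold model (the threshold, spreads, and shift sizes below are all made up for illustration): recall happens iff depth-of-encoding plus situational noise clears a threshold, and the population % only registers shifts when the distribution sits near that threshold.

```python
import random

random.seed(0)
THRESHOLD = 0.0

def recall_rate(mean_depth, spread, noise_sd=1.0, n=100_000):
    """Fraction of simulated readers whose depth-of-encoding plus
    situational noise clears the recall threshold."""
    hits = 0
    for _ in range(n):
        depth = random.gauss(mean_depth, spread)
        noise = random.gauss(0.0, noise_sd)
        if depth + noise > THRESHOLD:
            hits += 1
    return hits / n

# A vs A': distribution centered near the threshold, so a small
# depth shift moves many readers across it and the % changes visibly.
print(recall_rate(0.0, 1.0), recall_rate(0.5, 1.0))

# B vs B': distribution far below the threshold, so the same-sized
# shift moves almost no one across and the % barely changes.
print(recall_rate(-4.0, 1.0), recall_rate(-3.5, 1.0))
```

The same 0.5 shift in underlying depth produces a double-digit change in recall % in the first case and a fraction of a percent in the second.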
So it works pretty well initially, when the distribution’s spread out. e.g.: I’ve been running an RCT on retry mechanics. Of readers who forget an answer while reading an essay, about 20% more will succeed in their first review if the in-essay prompt gave them a chance to retry.
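For comparisons like this, the standard machinery is a two-proportion test. A sketch with illustrative counts (these are not the actual experiment’s numbers):

```python
from math import sqrt

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two recall rates,
    using the pooled standard error."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical arms: 60% vs. 40% first-review success, 200 readers each.
z = two_proportion_z(120, 200, 80, 200)
print(round(z, 2))  # 4.0
```

Note that the z statistic only speaks to the difference in the population proportions, not to the size of the underlying shift in depth-of-encoding, which is exactly the limitation of this whole approach.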
But it doesn’t work well when the distribution’s skewed to one side. e.g.: I’ve run RCTs manipulating schedules. You might think shortened intervals would help struggling readers, but they have little effect on the population measure; likely they just nudge some readers closer to the threshold.
Lack of a good continuous measure makes it hard to characterize the dynamics of what’s going on, which makes it hard to make iterative improvements. I’ll need to find some good solution here. Unfortunately, response times are (AFAICT) not a strong enough predictor to use.
Incidentally, this is part of why Ebbinghaus used nonsense syllables: he was memorizing sequences he’d *never* remember on the first try in subsequent tests. But it’d take less time to re-learn well-rehearsed sequences—time savings as a continuous proxy for depth of encoding.
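Ebbinghaus’s “savings” measure can be written down directly; a sketch (the trial counts are invented, and the same formula works for time spent rather than trials):

```python
def savings(initial_trials, relearn_trials):
    """Ebbinghaus-style savings score: the fraction of the original
    learning effort saved when relearning the same material."""
    return (initial_trials - relearn_trials) / initial_trials

# A well-rehearsed list relearned in 3 trials instead of 20:
print(savings(20, 3))   # 0.85 -> strong residual encoding
# A barely-encoded list needing nearly as many trials as before:
print(savings(20, 18))  # 0.1 -> weak residual encoding
```

The appeal is that savings stays informative even at 0% first-try recall, where the remembered/didn’t measure is completely flat.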
(Yes, I’m aware that some memory systems ask users to subjectively “grade” their memory 1-5, which would be slightly less discrete. I suspect it probably doesn’t add enough measurement resolution to be worth the user burden, but could be worth trying.)
The thing I have to keep reminding myself about a result like the retry one above is that it does *not* mean the mechanic causes a 20% increase in depth-of-encoding. It’s more likely a fairly small increase for a large number of people right below the threshold.
Subjectively this significantly helps me! Really made a difference when you added it (I think!)
Yes, reader interviews were pretty positive about it! Though sometimes (as you’ve pointed out) it can perhaps be too pushy / literal. I’m mostly interested in it because it shows that “simple” mechanism changes can move the needle (so there may be more low-hanging fruit).