Do you have a question about memory, learning or SuperMemo? Write to Dr Wozniak
- Intervals used in SuperMemo are not optimum intervals!
- Higher grades can produce shorter intervals
- Even low forgetting index can produce long intervals
- First repetition does not have to take place on the next day
- Why isn't first repetition followed by interval=1?
- Intervals are slightly randomized
- The algorithm used in SuperMemo is not "fixed"
- The more time you give to SuperMemo, the closer it will approximate your memory needs
- SuperMemo contradicts some results reported by Tony Buzan
- Grades in final drill do not affect the interval
- Your response time does not matter
- You can compute your retention from the forgetting index
- Different intervals used in different SuperMemos
- Use Simulation to estimate workload
- SuperMemo ain't science! It's just ad junk
- SuperMemo is better than re-wise
- Mid-interval repetitions do not bias your measured forgetting index
- Simulation of learning process in SuperMemo may be inaccurate
- Short-term memory requires no spaced repetition
- Multitasking is not recommended in learning
- Repetition category is used to update optimization matrices
- First Grade vs. A-Factor graph data is kept as a collection of trailing averages
- The first experiments helped predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks
- Setting the forgetting index to 9% and later increasing it to 12% will likely produce a 2% drop in retention
- Why don't you continue your experiment with neural networks?
- Why does not SuperMemo experiment with shorter intervals?
- Comparing SuperMemo on paper with freeware applications
- OF matrix is 20x20
- After repetition, SuperMemo modifies OF matrix entry that was used in computing the interval
- Changing the OF matrix in SuperMemo
- How is the RF matrix computed?
- Licensing Algorithm SM-15
- Recursive interval function
- Only one forgetting curve is updated at each repetition
SuperMemo Algorithm
Intervals used in SuperMemo are not optimum intervals!
(Manfred Kremer, Germany, Sep 7, 1998)
Question:
I have noticed that the Optimum Interval shown in the Element Data window is frequently shorter than the last interval displayed as Interval. Is it a bug in SuperMemo?
Answer:
No. If your forgetting index is very low, e.g. 3%, SuperMemo will often conclude that you stand a 97% chance of remembering a given element only if the next interval is shorter than the one presently used. In such cases, it will not accept the new value, and the new interval will be at least 5% longer than the previous interval. Please note that a forgetting index of 3% should only be used for selected high-priority items. Keeping the forgetting index at this level throughout the collection will make repetitions annoyingly frequent and ineffective.
Higher grades can produce shorter intervals
(Mohamad Syafri, Aug 05, 2004, 16:36:09)
Question:
I think that lower grades, e.g. Pass (3), should produce shorter intervals than higher grades, e.g. Bright (5). This is not always so in SuperMemo. Why can I not see a correlation between intervals and the grades given in learning?
Answer:
This is not an error in the algorithm. In SuperMemo, lower grades may produce longer intervals for a number of reasons:
- Random dispersion: All intervals are always slightly dispersed around the optimum value. This results in more accurate plotting of the forgetting curves. In addition, interval dispersion prevents large variations in the number of repetitions executed on a single day. If you cancel the grade a few times, you will see that at each try SuperMemo provides a slightly different interval. Those intervals are normally distributed around the optimum. This means that most of the time the deviation is small, but occasionally SuperMemo will schedule the element at an interval that is quite different from the optimum interval. Even if none of the remaining reasons listed below apply, a lower grade may result in a longer interval by mere chance (esp. for lower forgetting index values, which may approach the boundary conditions on interval change)
- Incomplete information: Early in the learning process, SuperMemo gradually modifies the function of optimum intervals. The speed at which the change occurs depends on the number of repetitions falling into a given category. By the definition of SuperMemo, good grades are prevalent, and optimization for the corresponding A-Factors proceeds faster. It may happen that for a few weeks, for a range of A-Factors, lower grades will produce longer intervals
- Spacing effect: Grade Pass (3) may result in the enhancement of the so-called spacing effect, which may be less visible for Bright (5). The spacing effect says that longer intervals, and consequently greater recall efforts, produce more stable memory engrams. SuperMemo does not arbitrarily set the function of optimal intervals. It computes intervals which are most likely to result in the requested forgetting index
- First grade: When memorizing pending items, SuperMemo ignores grades, as it has no way of knowing how a given item found its way into the collection. For example, a good grade might result from the fact that the item has just been introduced to the collection
The impression that there is no correlation between grades and intervals is quite common among those who begin their work with SuperMemo. If you are not sure whether the reasons listed above are legitimate, you can monitor your retention with Tools : Statistics : Analysis : Use : Efficiency. This way you can be sure that SuperMemo keeps its promise of reaching your desired knowledge retention (assuming appropriate formulation of the learning material).
Even low forgetting index can produce long intervals
(Matt Cassidy, New Zealand, Sep 11, 1997)
Question:
Is it possible that with the forgetting index equal to 3% I get a first interval of 6 days?
Answer:
Yes, especially if the material you work with is relatively easy. You should also keep in mind the random dispersion of intervals. In isolated cases, dispersion might produce intervals substantially longer (or shorter) than the optimum interval. For more, read about the SuperMemo Algorithm
First repetition does not have to take place on the next day
(David Mckenzie, New Zealand, Apr 8, 1998)
Question:
Why doesn't the first repetition after forgetting take place on the day after the unsuccessful repetition (this is advised by Tony Buzan and others)?
Answer:
In SuperMemo, the length of the first interval is computed from the forgetting curve plotted in the course of repetitions. This is to make sure that a defined proportion of items is remembered (usually 80-97%). This proportion is programmed by means of the forgetting index (the requested forgetting index is the proportion of items that are not remembered at repetitions). Depending on the forgetting index, the length of the first interval may range from 1 to 20 days. It is not set arbitrarily: it is computed from the record of repetitions and determined by the desired forgetting index. While Buzan's recommendation is valid in many cases, you should not forget that SuperMemo computes intervals with a degree of accuracy that cannot otherwise be easily achieved.
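As a rough illustration of how a first interval can follow from a forgetting curve rather than from a fixed rule, the sketch below assumes a purely exponential curve R(t) = exp(-t / decay_days). The exponential shape, the function name first_interval and the parameter decay_days are assumptions made for this example only. SuperMemo itself derives the curve from recorded repetitions, and its actual procedure is not reproduced here.

```python
import math

def first_interval(decay_days: float, forgetting_index: float) -> float:
    """Days until recall probability falls to 1 - forgetting_index.

    Assumes (hypothetically) an exponential forgetting curve
    R(t) = exp(-t / decay_days).
    """
    if not 0.0 < forgetting_index < 1.0:
        raise ValueError("forgetting_index must be between 0 and 1")
    return -decay_days * math.log(1.0 - forgetting_index)

# With the same (made-up) decay constant, a lower forgetting index
# calls for a shorter first interval.
for fi in (0.03, 0.10, 0.20):
    print(f"forgetting index {fi:.0%}: {first_interval(40.0, fi):.1f} days")
```

Under this assumption the interval depends on both the learner's curve (decay_days) and the requested forgetting index, which is why no single value such as "the next day" fits all cases.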
Why isn't first repetition followed by interval=1?
(Rick Natt, South Africa, Jun 30, 2007, 13:02:38)
Question:
I use SuperMemo mainly for memorizing foreign languages and my mathematics notes. When I learn a new word, how come the first repetition is about 8 days from now? Surely the first repetition should be the next day, and only after that should the interval get longer?
Answer:
Having intervals of one day would have the following negative
consequences:
- you would increase your workload
- the memory imprint produced would be weaker due to the spacing effect
- successive intervals would also need to get shorter, multiplying the negative effects
SuperMemo produces intervals that are as long as possible within the limits of your desired level of recall. For example, if you plan to recall 95% of foreign vocabulary, the first interval is more likely to be around 8 than to be around 1.
Intervals are slightly randomized
Question:
Why is the first interval, after which the first repetition takes place, not the same in all cases?
Answer:
It is randomly modified to speed up computing its optimal value.
Additionally, random
dispersion of intervals around the optimum value prevents repetitions
from being packed on
a given day, while neighboring days have lots of room to accommodate
new items.
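A minimal sketch of such dispersion is shown below. It draws the interval from a normal distribution centered on the optimum, which matches the description of dispersion given earlier in this FAQ. The function name dispersed_interval and the spread parameter (standard deviation as a fraction of the optimum) are hypothetical; the magnitude of dispersion actually used by SuperMemo is not stated here.

```python
import random
from typing import Optional

def dispersed_interval(optimum_days: float, spread: float = 0.1,
                       rng: Optional[random.Random] = None) -> int:
    """Return a whole-day interval scattered around the optimum.

    `spread` is an assumed value chosen for illustration only.
    """
    rng = rng or random.Random()
    candidate = rng.gauss(optimum_days, spread * optimum_days)
    return max(1, round(candidate))  # never schedule sooner than one day

rng = random.Random(42)  # seeded only to make the demonstration repeatable
print([dispersed_interval(8.0, rng=rng) for _ in range(10)])
```

Items memorized on the same day with the same optimum interval end up on several neighboring days instead of piling up on one.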
The algorithm used in SuperMemo is not "fixed"
(DaVinci, Oct 04, 2004, 13:20:30)
Question:
SuperMemo is always increasing the intervals at the same rate (given
the same series of grades), while
FullRecall is smarter and adapts to the learner
Answer:
This is not true. The rate of interval increase is determined by the matrix of optimum intervals and is by no means constant. Moreover, the matrix of optimum intervals changes over time depending on the user's performance. You may have an impression of a fixed or rigid algorithm only after months or years of use (the speed of change is inversely proportional to the amount of available learning data). This convergence reflects the invariability of the human memory system. It does not matter whether you use an algebraic or a neural approach to the optimization problem. In the end, you will arrive at a spaced repetition function that reflects the true properties of your memory. In that light, the speed of convergence should be held as a benchmark of an algorithm's quality. In other words, the faster the interval function becomes "fixed", the better.
The more time you give to SuperMemo, the closer it will approximate your memory needs
(Luis Gustavo Neves da Silva, Brazil, Thursday, January 04, 2001 4:29 PM)
Question:
If I memorize a collection of 200 items with SuperMemo and
make regular repetitions, when will my
measured forgetting index
be closest to the requested forgetting index: after a day, after a
week, after a month or after a year?
Answer:
In conditions of no outside interference, the more time you
give to SuperMemo, the better it will approach the requested level of
retention. However, if you encounter this knowledge in real life (i.e.
outside SuperMemo), the result cannot be predicted. For example,
interference early in the interval may have no effect, while
interference later in the interval may increase retention in the
repetition to come and reduce the retention in the repetition that will
follow yet another interval. The outcome will depend on the timing
relationship of interference and measurement.
SuperMemo contradicts some results reported by Tony Buzan
Question:
Tony Buzan claims that 75% of information is lost if not reviewed in 24
hours. Does it not
defeat the validity of SuperMemo in which the first interval is often
longer than a week?
Answer:
No. Buzan's claim may refer to textbook knowledge or complex knowledge structures (e.g. large mind maps). However, in the light of results obtained with SuperMemo, it does not seem accurate in reference to simple, well-structured material. In SuperMemo, if the student chooses a retention of 95%, the typical value of the first interval falls in the range of 2-5 days, depending on the student and the difficulty of the learned material. For a retention of 25%, the same interval might be as long as one month, though this cannot be verified experimentally with SuperMemo, which limits the forgetting index to the range of 3-20%. This implies overall retention in the range of 89-99%. For more, see: Theoretical background of SuperMemo
Grades in final drill do not affect the interval
(Grzegorz Malewski, Poland, Dec 10, 1997)
Question:
Do grades at final drill affect the learning process?
Answer:
No. They are only used to eliminate items from the final
drill queue.
Your response time does not matter
(Ryszard Siwczyk, Poland, Nov 4, 1997)
Question:
Does the response time at repetitions influence the next interval?
Answer:
No. Repetition timer is only used to compute the average response time
and Workload.
Setting the forgetting index to 9% and later increasing it to 12% will likely produce a 2% drop in retention
(Tomasz Szynalski, Poland, Oct 18, 1998)
Question:
What retention can I obtain with the forgetting index
set to 9%? What if I then change it to
12%?
Answer:
The formula that relates the forgetting index to the retention looks
like this (source):
retention = -(forgetting index)/ln(1-(forgetting index))
If you accomplish the forgetting index of 9%, the retention will equal 95.4%. For 12%, the same figure will be 93.9%. Note that if your material is very difficult, your measured forgetting index may be higher than the requested forgetting index. This comes from the fact that SuperMemo imposes some boundary conditions on the increase of intervals. Elements that have been forgotten more than five times should be reformulated with a view to reducing their difficulty or increasing their mnemonic component.
If you initially set the forgetting index to 9% and later increase it to 12%, you will probably start with a retention of 94-95%, which will then gradually decrease to 92-93% after the change.
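The formula quoted above is easy to evaluate directly. The short sketch below does just that; the function name retention is chosen for this example, and the forgetting index is passed as a fraction rather than a percentage.

```python
import math

def retention(forgetting_index: float) -> float:
    """Average retention implied by a forgetting index (both as fractions)."""
    return -forgetting_index / math.log(1.0 - forgetting_index)

for fi in (0.09, 0.12):
    print(f"forgetting index {fi:.0%} -> retention {retention(fi):.1%}")
# prints:
# forgetting index 9% -> retention 95.4%
# forgetting index 12% -> retention 93.9%
```

As a side note, the same expression is obtained by averaging an exponentially decaying recall probability over an interval that ends when recall has dropped to 1 minus the forgetting index. This is offered only as a way of reading the formula, not as a statement about how it was originally derived.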
Different intervals used in different SuperMemos
(Stracner, Jason, United States, Jul 13, 2004, 06:39:47)
Question:
Why are there different intervals used in paper SuperMemo, SuperMemo
for Palm Pilot, SuperMemo for Pocket PC and SuperMemo for Windows?
Answer:
Only SuperMemo for Windows computes accurate intervals that ensure achieving the desired forgetting index (assuming no overload and assuming the learning material is correctly formulated). For technical reasons, other SuperMemos use simplified algorithms (usually based on E-Factors, which keep computation to a minimum). In addition, authors of individual SuperMemos often introduce their own modifications to the algorithm (e.g. limits on E-Factor change, an arbitrary first interval, their own interpretation of memory lapses, etc.). Paper SuperMemo is the least accurate. It uses fixed intervals for conglomerate pages of items. In such circumstances, it is not possible to differentiate between difficult and easy material. Nor is it easy (or worthwhile) to modify the function of optimum intervals on paper based on the feedback from the learning process.
Use Simulation to estimate workload
(Piotr Wasik, Poland, Tue, Apr 24, 2001 14:26)
Question:
I would like to know how to estimate my workload on a large collection
if I commit 40 items per day and keep the forgetting index at the
default ten percent
Answer:
If you use Tools : Statistics : Simulation and set (1) the forgetting index to 10% and (2) daily repetitions to 230 items, you will get 40 new items memorized per day. In other words, your workload might be roughly 230 items/day (this will vary greatly depending on the quality of your learning material).
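To get a feel for why a steady intake of new items translates into a much larger daily repetition count, a toy model can help. The sketch below is not the Simulation tool and does not reproduce its results: it assumes no memory lapses, a fixed first interval, and intervals that grow by a constant factor. The function name daily_repetitions and all parameter values are made up for illustration.

```python
from typing import List

def daily_repetitions(new_per_day: int, days: int,
                      first_interval: float = 4.0,
                      growth: float = 2.0) -> List[int]:
    """Count scheduled repetitions per day in a toy model without lapses."""
    load = [0] * days
    for start in range(days):            # a batch of new items is added each day
        day, interval = start, first_interval
        while True:
            day += max(1, round(interval))
            if day >= days:
                break
            load[day] += new_per_day     # the whole batch comes due together
            interval *= growth           # assumed constant interval growth
    return load

load = daily_repetitions(new_per_day=40, days=366)
print("repetitions due on the last simulated day:", load[-1])
```

In this simplified model the daily load grows only when another generation of repetitions comes due, so it rises in steps that get further apart. Real collections behave differently because of lapses, item difficulty and interval dispersion, which is what the built-in simulation accounts for.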
SuperMemo ain't science! It's just ad junk
(Ben Lister, Apr 5, 2000; comment placed at PalmGear in reference to SuperMemo for Palm Pilot)
Comment:
[…] this "scientific study" thing [in reference to SuperMemo],
that's a bunch of advertisement junk. It's not science that they use,
but simple
variable calculations. The program detects the number of times you have
gotten
something right or wrong and then uses this number to decide when it
should next
be on a test, like a priority list
Rejoinder:
Ben Lister's observations were, surprisingly, derived from SuperMemo for Palm Pilot, which reveals little of its internal optimization procedures (these are slightly simplified compared with the Windows version). The timing of repetitions may be remotely correlated with the number of bad grades, but it is actually derived from the current estimates of memory stability (which reflects how quickly memory traces volatilize) and knowledge difficulty (which reflects the increase in stability as a result of a repetition). Consequently, contrary to Ben Lister's claim, items with a large number of memory lapses can be repeated at longer intervals than items with fewer memory lapses. SuperMemo does not use a priority list in scheduling repetitions. The scheduling algorithm does not consider the relationship between items, but their reference to time, their difficulty and the history of repetitions. For example, if repetitions were to be delayed, the "priorities" would change. So would the scheduling. The central point of the algorithm is plotting the forgetting curve and predicting the optimum timing of the next repetition. The greatest merit of SuperMemo is in fine-tuning the algorithm so that it can adapt to individual skills and knowledge in the shortest possible time. No algorithm running in abstraction from the actual forgetting curve stands a chance of competing with SuperMemo in efficiency (this includes neural algorithms that predict the probability of forgetting indirectly).
SuperMemo is better than re-wise
(Tomasz Strzelczyk, Poland, Jan 31, 2001)
Question:
I would like to invest in a Langmaster course of English.
How would you
convince me that SuperMemo is superior to the rewise method?
Answer:
English courses by Dr Lang have a very good reputation for quality, and they can be recommended independently of the question of the efficiency of the re-wise method. You could invest in a Langmaster course that best suits your needs and boost your vocabulary with SuperMemo and Advanced English. Alternatively, you could use stand-alone SuperMemo for learning the material from the Langmaster course. As for the re-wise method (developed in the Czech Republic around 1994), its principles are similar to those of SuperMemo; however, we have no doubts as to the superiority of SuperMemo technology, which encompasses far more than repetition scheduling. We have not tested re-wise extensively, but our customers who also use Langmaster CD-ROM titles unanimously confirm that they prefer to keep their learning material in SuperMemo. Combining Langmaster courses with stand-alone SuperMemo or with Advanced English would probably be the recommended course of action.
Mid-interval repetitions do not bias your measured forgetting index
(Robert Drzd, Poland, Friday, November 04, 2005 6:16 PM)
Question:
The measured forgetting index is not updated when I mid-review items
from the future.
For instance when I repeat those items collected with the Filter
option. A retention value in the Workload
window is updated correctly
Answer:
To prevent early repetitions from improving the measured forgetting index, the index is not updated if a repetition occurs before its scheduled date. This selective measurement is applied neither to "today's" measured forgetting index nor to the retention data displayed in the Workload window. Consequently, the latter two may show better readings if you use mid-interval review a lot.
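The selective measurement can be pictured as a filter applied before counting lapses. The sketch below is an illustration only: the Repetition record, its field names, and the convention that grades below Pass (3) count as lapses are assumptions made for this example, not SuperMemo's internal data format.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Repetition:
    scheduled_day: int   # day on which the repetition was due
    actual_day: int      # day on which it was actually made
    grade: int           # grades below 3 are treated as lapses (assumption)

def measured_forgetting_index(history: List[Repetition]) -> Optional[float]:
    """Fraction of lapses among repetitions made on or after their due date."""
    counted = [r for r in history if r.actual_day >= r.scheduled_day]
    if not counted:
        return None
    lapses = sum(1 for r in counted if r.grade < 3)
    return lapses / len(counted)

history = [
    Repetition(scheduled_day=10, actual_day=10, grade=4),
    Repetition(scheduled_day=20, actual_day=22, grade=2),  # lapse
    Repetition(scheduled_day=30, actual_day=25, grade=5),  # early: ignored
    Repetition(scheduled_day=40, actual_day=40, grade=3),
]
print(measured_forgetting_index(history))
```

Here the early repetition on day 25 is left out, so one lapse is counted among three repetitions rather than four.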
Simulation of learning process in SuperMemo may be inaccurate
(Terje A. Tonsberg, Kuwait, Jan 13, 2001)
Question:
I received seemingly wrong results when using Tools : Statistics
: Simulation in SuperMemo. I got the minimum speed of
learning for the forgetting
index of 8%. Once I reduced the forgetting index to 5% and
further to 3%,
the expected speed of learning increased substantially!
Answer:
Your observation is accurate. Indeed, there is a discrepancy
between the simulation
procedure and SuperMemo
Algorithm used in
computing the optimum schedule of repetitions. In many texts about
SuperMemo you
will find that SuperMemo ensures the retention programmed with the
forgetting
index on the assumption there is no delay in repetitions. This is
however
imprecise. To prevent clogging up the learning process, Algorithm SM-15
makes an
assumption for grades Pass and above (3-5) that the
interval must
increase by at least one day (i.e. it cannot decrease nor can it stay
the same).
For low values of the forgetting index and for difficult learning material, this assumption actually means a significant departure from the expected retention level. Unfortunately, the simulation procedure does not take this fact into account, and it does not attempt to correct for the forgetting rate, which increases as a result of the rigid limitation imposed on the increase in intervals.
Consequently, for a low value of the requested forgetting index, the
measured
forgetting index may be higher than the one assumed for the simulation
purposes.
This will naturally produce skewed simulation results: repetitions
scheduled
late by Algorithm SM-15 will still produce high retention as programmed
by the
forgetting index fed into the simulation input data. For that reason,
simulation
may be inaccurate for a low forgetting index if your learning material
is
difficult.
In the future, we hope to make it possible to adjust retention for the
departure
of Algorithm SM-15 from the optimum repetition schedule, as well as to
make it
possible to simulate repetitions at intervals that fully comply with
the
forgetting index. The latter option would naturally slow down the learning
process even further.
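The boundary condition described in this answer can be sketched in a few lines. The function name constrained_interval is hypothetical, and the handling of failing grades (the computed interval is passed through unchanged) is an assumption, since the answer above only describes what happens for grades Pass and above.

```python
def constrained_interval(previous_days: int, computed_days: float, grade: int) -> int:
    """Apply the 'at least one day longer' rule for passing grades (3-5)."""
    proposed = max(1, round(computed_days))
    if grade >= 3:
        # The interval may neither decrease nor stay the same.
        return max(proposed, previous_days + 1)
    return proposed  # assumption: no such floor after a failing grade

# A difficult item at a low forgetting index: the computed interval is
# shorter than the previous one, yet the scheduled interval still grows.
print(constrained_interval(previous_days=30, computed_days=22.0, grade=4))
```

Whenever the floor overrides the computed value, the item is reviewed later than the forgetting index alone would dictate. This is the departure that the simulation does not model.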
Short-term memory requires no spaced repetition
(Mark G. Patterson, USA, Wednesday, July 18, 2001 6:24 AM)
Question:
The encoding phase of SuperMemo could be dramatically improved by
providing a
micro-spacing algorithm that presented each new item for recall 3 to 4
times during a 30 minute interval in an expanding pattern. For example,
0, 5, 15, 30 minutes
Answer:
Spaced repetition is valid for long-term processes and its
purpose is to minimize the number of presentations and maximize the
memory effect by sufficient spacing. However, improved recall within
the span of short-term memory can be accomplished only in cases where
the initial encoding was incomplete or insufficient. In other words, an
important assumption in SuperMemo is that the first exposure should be
used to formulate a valid memory engram that will last until the first
repetition. Ideally, even the concept of final drill is excessive and
serves solely as an insurance against imperfect concentration on the
memory task. Sufficiently encoded short-term memories will always be
converted to long-term memories and will likely last a few days until
the moment of the first repetition.
Multitasking is not recommended in learning
(dansujp, Sun, Sep 16, 2001 3:07 PM)
Question:
Here is another improvement for SuperMemo. When I reviewed the
flashcards, I would lay them out on a large table so that I could see
30 at a time, and would pick up the cards for which I knew the answer.
Sometimes the answer takes a few seconds to surface. In the meantime, I
can be looking at other cards and thinking about them in a multitasking
fashion. In SuperMemo there is only one question at a
time, so it is frustrating to sit there and wait and not have anything
else to do until the answer appears.
Answer:
Research shows that multitasking considerably reduces cognitive powers. Optimally, you should be able to focus on a single recall at a time. In addition, recall should, ideally, be instantaneous. Long and frustrating retrieval times typically indicate ill-formulated items of high complexity. Your solution might increase the fun of learning for overly complex material, but if you apply the minimum information principle along with other pivotal rules of knowledge representation, multitasking will reduce your processing speed. In the past, we have added a number of options to SuperMemo under sheer user pressure; however, it can be demonstrated that in many cases this has actually harmed the user's learning process. Consequently, we remove options that are frequently misused (e.g. Batch Repetitions, Background Repetitions, some rescheduling tools, and more).
Repetition category is used to update optimization matrices
(Steven Trezise, USA, Apr 20, 1999)
Question:
In my collection, I have items for which I have done
between 1 and 8 repetitions.
However, when I look at the Cases matrix, there are no entries beyond
repetition 3
Answer:
The algorithm used by SuperMemo updates all optimization matrices using repetition categories, not the actual repetition number (you can view the optimization matrices with Tools : Statistics : Analysis : Matrices). A repetition category is the expected number of repetitions needed to reach the currently used interval. Once the matrices change, the estimation of the repetition category may change too. If, for example, you score well in repetitions and your intervals become longer, it will take fewer repetitions to get to a given interval. In such a case, you might be at the 8th repetition while your repetition category is 3. All matrices, such as the OF matrix, the RF matrix, etc., will then be updated in the third row (not in the 8th row).
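One way to picture a repetition category is to count how many steps of the current interval function are needed to reach the interval an item has now. The sketch below does this under loud assumptions: the function name repetition_category, the first interval and the list of growth factors are all invented for the example, and the real computation in SuperMemo draws on its optimization matrices rather than a flat list.

```python
from typing import Sequence

def repetition_category(current_interval: float, first_interval: float,
                        growth_factors: Sequence[float]) -> int:
    """Estimate how many repetitions it takes to reach current_interval."""
    category = 1
    expected = first_interval
    for factor in growth_factors:
        if expected >= current_interval:
            break
        expected *= factor   # interval expected after one more repetition
        category += 1
    return category

# With these hypothetical values, a 30-day interval is reached by the third
# repetition, regardless of how many repetitions the item actually had.
print(repetition_category(30.0, first_interval=5.0,
                          growth_factors=[3.0, 2.5, 2.5, 2.0]))
```

If the growth factors increase because you score well, the same interval maps to a lower category, which is the effect described in the answer.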
First Grade vs. A-Factor graph data is kept as a collection of trailing averages
(WangDong, China, Dec 08, 2003)
Question:
You say that "At each repetition, the current element's old A-Factor
estimation is removed from the G-AF graph and the new estimation is
added". But I found a different result. I have used a collection which
has only one item. After many repetitions, I saw the G-AF graph. I
found many changes (not a single point change). Only one point in the
G-AF graph should change, because there was only one item in the
collection. I guess your algorithm is: "In the G-AF graph, for every
A-Factor value (1.2-6.9), calculate the average first grade of the items
which have this certain A-Factor. When the A-Factor of an item is
changed, calculate the average first grade of the corresponding
A-Factor again." Is it right?
Answer:
Your reasoning is absolutely correct! However, to minimize
the size of data kept by the algorithm, SuperMemo does not use averages
to compute the first grade for each A-Factor category. Instead, it uses
trailing averages. This way, instead of storing a large sum and the
number of cases recorded, SuperMemo stores a single short number
expressing the average first grade. This approach saves space, but does
not make it possible to correct averages once A-Factor changes.
SuperMemo simply computes the new trailing average for the new
A-Factor. This is why a single item can introduce a lot of noise in the
graph. Please note that the weight for averaging is highest for
Repetition=2 where A-Factor is the same as O-Factor. For that single
repetition, you should see the greatest change in the graph.
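The difference between a plain average and a trailing average can be shown with a small sketch. The update rule below (a weighted blend of the stored value and the new observation) is one common form of trailing average and is an assumption here, as are the function name trailing_average and the weight value; the exact rule and weights used by SuperMemo are not given in this answer.

```python
def trailing_average(previous: float, new_value: float, weight: float = 0.2) -> float:
    """Blend a new observation into a stored average; older data fades gradually."""
    return (1.0 - weight) * previous + weight * new_value

# Only one number is stored per A-Factor category. There is no sum and no
# count, so an earlier contribution cannot be subtracted later.
stored = 4.0
for first_grade in (3, 5, 2):
    stored = trailing_average(stored, first_grade)
print(round(stored, 3))
```

Because each update shifts the stored value, a single item that moves between A-Factor categories leaves a trace in every category it has visited.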
The first experiments helped predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks
(Tomasz Szynalski, Poland, Oct 18, 1998)
Question:
When first versions of SuperMemo were released, how could
SuperMemo predict
intervals that were many years long if it had only been researched for
a couple of years?
I read that the first version was released after just 3-4 years of
research on the length
of intervals.
Answer:
The first experiments on the length of the optimum interval led to conclusions that made it possible to predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks! In short, this can be illustrated with the following reasoning: if the first months of research yielded the optimum intervals of 1, 2, 4, 8, 16 and 32 days, you could hope with confidence that successive intervals would keep increasing by a factor of two. To better understand the reasoning that led to the first formulation of SuperMemo, read: First experiments: 1982-1985
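The extrapolation in this example amounts to estimating the growth ratio from the measured intervals and applying it to later repetitions. The sketch below uses the illustrative series from the answer; the function name extrapolate_intervals is hypothetical, and averaging the ratios is just one simple way to estimate the factor.

```python
from typing import List, Sequence

def extrapolate_intervals(observed: Sequence[float], extra: int) -> List[float]:
    """Extend a series of intervals using the average ratio between neighbors."""
    ratios = [later / earlier for earlier, later in zip(observed, observed[1:])]
    factor = sum(ratios) / len(ratios)
    result = list(observed)
    for _ in range(extra):
        result.append(result[-1] * factor)
    return result

# Intervals measured over a few months suggest much longer ones to come.
print(extrapolate_intervals([1, 2, 4, 8, 16, 32], extra=4))
```

Four more repetitions at the same ratio already take the interval past a year, which is how a few months of data could support predictions on a much longer time scale.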
Why don't you continue your experiment with neural networks?
(Bartosz, Poland, Nov 01, 2006, 14:08:40)
Question:
Why don't you continue your experiment with neural networks? I agree
with
MemAid
in that your models might be wrong, and a neural network can find the
real truth about how memory works? Neural networks are unprejudiced
Answer:
It is not true that SuperMemo is prejudiced while a neural network is not. Nothing prevents the optimization matrices in SuperMemo from departing from the memory model and producing an unexpected result. It is true that over the years, with more and more knowledge of how memory works, the algorithm used in SuperMemo has been armed with restrictions and custom-made components. None of these were the result of a wild guess, though. The progression of "prejudice" in SuperMemo algorithms is only a reflection of findings from previous years. The same would inevitably affect any neural network implementation if it were to maximize its performance.
It is also not true that the original pre-set values of optimization matrices in SuperMemo are a form of prejudice. These are an equivalent of pre-training in a neural network. Moreover, a neural net that has not been pre-trained will be slower to converge to the optimum model. This is why SuperMemo is "pre-trained" with the model of an average student.
Finally, there is another area where neural networks must either use the existing knowledge of memory models (i.e. carry a dose of prejudice) or lose out on efficiency. The experimental neural network SuperMemo, MemAid, as well as FullRecall have all exhibited an inherent weakness. The network achieves stability when the intervals produce the desired effect (e.g. a specific level of the measured forgetting index). Each time the network departs from the optimum model, it is fed with a heuristic guess on the value of the optimum interval depending on the grade scored during repetitions (e.g. grade=5 would correspond with 130% of the optimum interval in SuperMemo NN or 120% in MemAid). Algebraic SuperMemo, on the other hand, can compute an accurate value of the A-Factor, use the accurate retention measurement, and produce an accurate adjustment of the value of the OF matrix. In other words, it does not guess the optimal interval. It computes its exact value for that particular repetition. The adjustments to the OF matrix are weighted and produce a stable, non-oscillating convergence. In other words, it is the memory model that makes it possible to eliminate the guess factor. In that respect, algebraic SuperMemo is less prejudiced than the neural network SuperMemo.
Neural network SuperMemo was a student project with a sole intent to verify the viability of neural networks in spaced repetition. Needless to say, neural networks are a viable tool. Moreover, all imaginable valid optimization tools, given sufficient refinement, are bound to produce similar results to those currently accomplished by SuperMemo. In other words, as long as the learning program is able to quickly converge to the optimum model and produce the desired level of knowledge retention, the optimization tool used to accomplish the goal is of secondary importance.
Why does not SuperMemo experiment with shorter intervals?
(Jeremias Sauceda, Tuesday, November 10, 2009 9:14)
Question:
I have been using SuperMemo to learn Japanese,
it's been great! I have also used it to teach my children math and they
enjoy it very much. When I started learning Japanese I used the
Pimsleur tapes which use very short intervals during each lesson; 5
seconds, 25 seconds, 2 minutes, 10 minutes, etc. according to
Wikipedia. Also Anki reviews missed repetitions after 20 minutes. I am
curious if there is a reason SuperMemo does not use a similar approach
for learning new material or dealing with the drill queue? Or is it
just a feature you have not explored at this time?
Answer:
Classical SuperMemo adds the concept of the final drill, which can also be considered a form of microspacing. The final drill might be useful if you work on a high volume of high-interference memories (e.g. vocabulary of a foreign language). Its main role, however, should not be as a reinforcer of memories, but as a tool to establish or re-establish memories in cases of poor recall. In other words, the final drill is not to be used for microspacing, but to build the first imprint of memories that one failed to imprint at the first attempt. Compared with standard repetitions, the final drill can also produce less value per unit of time, and advanced users are advised never to use it. The most efficient approach is to develop a habit of never passing a failed repetition without an attempt to establish some form of a correct memory imprint. This may be difficult for beginners, but the habit can be developed over time, and an advanced student should never need the final drill. If you learn a foreign language, you may still want to use the final drill as a form of battle against memory interference. However, if you use incremental reading to gain structured knowledge (e.g. of sciences), going through the final drill will only add to your learning time. If you feel you still need the final drill in incremental reading, look closely at the quality of the learning material and your repetition habits, esp. the ability to focus on each and every repetition.
None of the above is to say that Pimsleur or Anki takes a wrong approach. Each method and application has its own focus, goals and user base. If you learn high-interference knowledge (e.g. vocabulary), or add a degree of procedural learning (e.g. pronunciation), or add a degree of pattern recognition (e.g. comprehension), or limit the size of the body of knowledge (e.g. core vocabulary), optimum strategies will change. Experimenting with different approaches can serve as a rich cross-fertilization for the future (as evidenced by your interesting question). Moreover, SuperMemo works on the assumption that the user is aware of the optimum learning techniques. However, there are psychological effects related to concepts such as the final drill or microspacing. If the user experiences a false sense of higher productivity, or just plain enjoyment, the benefits may outweigh the costs. This is why having many systems with many approaches allows for richer comparisons that are not limited to dry theory.
Comparing SuperMemo on paper with freeware applications
(Dan S.K., Russian Federation, Dec 28, 2009, 10:03:32)
Question:
Is figure 2 table from
http://www.supermemo.com/articles/paper.htm applicable for words
cramming? Paper-and-pencil SuperMemo article is soon 20 years old and I
believe you have more accurate data on time spacing when it comes to
using no algorithms and computers, but just constant schedules. I also
came across this http://www.learnwords.ru/repeat.html. And it is stated
here that:
1st repetition should be scheduled 30 minutes after reading words for the first time
2 - after an hour
3 - after 9 hours
4 - after 24 hours
5 - after 3 days
6 - after 6 days
7 - after 12 days
Answer:
The term optimum interval used by SuperMemo is misleading and inaccurate, as optimum intervals do not exist until you define the criteria for optimality. SuperMemo uses the concept of the forgetting index to define the optimum interval for a given desired retention level. SuperMemo on paper was defined in 1985 and roughly approximates optimum intervals for heterogeneous material based on the minimum information principle and a forgetting index of 10%. This definition strongly depends on how heterogeneous material is defined, as the distribution of material difficulty can dramatically affect the intervals. The original definition of SuperMemo was based on material largely composed of word pairs and should be applicable to most selections of foreign vocabulary within the same language family. This means that it could work well for Poles who learn English, but might need some revision for Swedes who learn Russian, and a major revision for Spaniards who learn Korean. Naturally, all those theoretical digressions are of little relevance today, when we can employ computers to compute the optimum spacing of repetitions for each individual piece of information.
As for the quoted Russian software for learning words, it seems to approximate SuperMemo with an extra criterion of adding a bit of "mental comfort". It takes the intervals used by SuperMemo on paper (some shortened by a day), and adds four extra microspaced repetitions before the first repetition that would normally be scheduled by SuperMemo. This resolves the problem of mental discomfort experienced by those who encounter SuperMemo for the first time, where the first repetition is often scheduled far later than the user expects. A frequent complaint from beginners is "why are intervals in SuperMemo so long? by the time I review, I no longer remember". Microspacing has been proposed over and over again from various quarters (incl. more advanced users who believe in theories that back the value of microspacing beyond the added mental comfort).
The benefit of spaced repetition as compared to traditional learning is substantial enough to make the above differences less significant. You can use SuperMemo on paper, simple freeware applications or the newest SuperMemo algorithms. All of these are likely to be highly beneficial.
OF matrix is 20x20
(saeideh monfared, Iran, Nov 07, 2009, 14:34:41)
Question:
How big is the OF matrix used by SuperMemo (what are its dimensions)? Is
it the same for SuperMemo 5 as it is for SuperMemo 15?
Answer:
The matrix has 20 rows and 20 columns. It can be inspected in SuperMemo
15 with Tools : Statistics : Analysis : Matrices : OF Matrix.
It is the same matrix as used in SuperMemo 5; however, its entries are
processed differently in successive algorithms. The 20 rows correspond
to 20 repetitions (or repetition categories in later SuperMemos). The
20 columns correspond to A-Factors in newer SuperMemos (or E-Factors in
older SuperMemos). The entries of the matrix determine the intervals
between repetitions. Each item difficulty (A-Factor or E-Factor) has a
different set of intervals for different repetitions (or repetition
categories).
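As a rough illustration of the layout described above, the sketch below stores the matrix as 20 rows by 20 columns in Python. The starting value of 2.0, the zero-based indices and the helper name `of_entry` are assumptions made for illustration only. How an actual A-Factor or E-Factor value maps to a column is not described here, so the sketch takes a column index directly.

```python
ROWS = 20     # repetitions (or repetition categories)
COLUMNS = 20  # item difficulty (A-Factor or E-Factor)

# Hypothetical starting matrix: every entry set to the same constant.
of_matrix = [[2.0 for _ in range(COLUMNS)] for _ in range(ROWS)]


def of_entry(matrix, repetition_index, difficulty_index):
    """Return the entry for a zero-based repetition row and difficulty column."""
    return matrix[repetition_index][difficulty_index]


# Total number of entries in the matrix.
entry_count = sum(len(row) for row in of_matrix)
```

Each (row, column) pair selects one entry, so a given difficulty column carries its own sequence of entries across the repetition rows.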
After repetition, SuperMemo modifies OF matrix entry that was used in computing the interval
(saeideh monfared, Nov 19, 2009, 23:17:02)
Question:
In SM-5 algorithm which is explained here:
http://www.supermemo.com/english/ol/sm5.htm
in step 7: what do you mean by relevant entry of the OF matrix?
Answer:
SuperMemo uses a simple principle: "use, verify and correct". After a
repetition, a new interval is computed with the help of the OF matrix.
The "relevant entry" used to compute the interval depends on the
repetition (category) and item difficulty. After the interval elapses,
SuperMemo calls for the next repetition. The grade tells SuperMemo how
well the interval "performed". If the grade is low, we have reason to
believe that the interval was too long and the OF matrix entry is too
high. In such cases, we reduce the OF entry slightly. The "relevant
entry" here is the one that was used previously in computing the
interval (i.e. before the interval started). In other words, in both
cases, the "relevant entry" is the entry that is used to compute the
interval (after the n-th repetition) and then to correct the OF matrix
(after the (n+1)-th repetition).
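The "use, verify and correct" cycle can be sketched in Python as follows. This is a minimal illustration, not the correction formula of Algorithm SM-5: the names `next_interval` and `correct_entry`, the grade threshold `low_grade` and the 5% reduction `step` are all assumptions. Only the low-grade case described above is handled.

```python
def next_interval(previous_interval, of_matrix, row, column):
    """Use: compute the new interval from the relevant OF entry."""
    return previous_interval * of_matrix[row][column]


def correct_entry(of_matrix, row, column, grade, low_grade=3, step=0.05):
    """Verify and correct: after the interval has elapsed, a low grade
    suggests the entry that produced the interval was too high, so it is
    reduced slightly. The threshold and step size are hypothetical."""
    if grade < low_grade:
        of_matrix[row][column] *= (1.0 - step)
    return of_matrix[row][column]


# Hypothetical 20x20 matrix with every entry equal to 2.0.
matrix = [[2.0] * 20 for _ in range(20)]
row, column = 2, 5                                  # the "relevant entry"
interval = next_interval(10, matrix, row, column)   # 20.0 days
correct_entry(matrix, row, column, grade=1)         # same entry is lowered
```

The point of the sketch is that the same `(row, column)` pair is used twice: once to produce the interval and once, a repetition later, to correct the matrix.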
Changing the OF matrix in SuperMemo
(saeideh monfared, Nov 19, 2009, 23:17:02)
Question:
Would you please give me an OF matrix sample
and explain how it will be changing, and why we need to create this
matrix fully at first?
Answer:
You can see the OF matrix in SuperMemo in Tools : Statistics : Analysis :
Matrices : OF Matrix.
This matrix changes over time to make sure that the intervals produced
meet the optimization criteria (in the latest SuperMemos, the only
criterion is the requested forgetting index). We set the original value
of the matrix so that we can use it in computing the initial intervals
that will later be used in verifying and modifying the matrix. The
initial value of the matrix is taken from earlier versions of
SuperMemo. In the late 1980s, some experiments were run to show that
this matrix can easily be produced within a couple of months from a
constant matrix with all entries equal to 1.5, 2.0, 3.0 or similar,
using early SuperMemo algorithms (e.g. Algorithm SM-5).
How is the RF matrix computed?
(Edward Douglas, Jul 31, 2010, 21:19:41)
Question:
I have not managed to find information on the functions of
approximating the RF-matrix in SuperMemo
Answer:
The RF matrix contains actual data/measurements, and thus is not an
approximation. Naturally, this data can be considered an approximation
of how your memory works: the more data you get, the closer you get to
a true reflection of your memory. In simplest terms, the RF matrix
contains columns that reflect item difficulty, and rows that reflect
the strength of memory. For each entry, SuperMemo collects repetition
data and plots a forgetting curve, i.e. how much you forget with the
passage of time. Each entry of the RF matrix corresponds to the point
at which forgetting reaches the level defined by the forgetting index.
This way, if we know how difficult an item is and how well it is
remembered at the moment of the repetition (or how many repetitions it
went through), we can predict (roughly) the moment at which the
probability of forgetting will equal the forgetting index. That is the
time at which we want to have the next repetition.
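The idea of reading an interval off a forgetting curve can be sketched in Python. The sketch assumes an exponential curve, R(t) = exp(-d * t), in line with the exponential approximation mentioned in a later answer on this page. The fitting method (least squares through the origin on ln R), the function names and the sample data are assumptions for illustration, not SuperMemo's actual procedure.

```python
import math


def fit_decay(points):
    """Fit ln(R) = -d * t by least squares through the origin.

    points: list of (elapsed_time, recall_fraction) pairs,
    with 0 < recall_fraction <= 1.
    """
    numerator = sum(t * math.log(r) for t, r in points)
    denominator = sum(t * t for t, _ in points)
    return -numerator / denominator


def interval_for_forgetting_index(decay, forgetting_index):
    """Time at which the probability of forgetting, 1 - exp(-d * t),
    equals the requested forgetting index."""
    return -math.log(1.0 - forgetting_index) / decay


# Hypothetical measurements generated from a curve with d = 0.01 per day.
data = [(t, math.exp(-0.01 * t)) for t in (5, 10, 20, 40)]
decay = fit_decay(data)
interval = interval_for_forgetting_index(decay, forgetting_index=0.10)
```

With more measurements the fitted curve changes, and so does the point at which it crosses the forgetting index.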
Licensing Algorithm SM-15
(Krzysztof Brzezina, Sep 29, 2011, 08:13:06)
Question:
Do I need to buy a license to implement Algorithm SM-15?
Answer:
We no longer support projects based on the newest SuperMemo algorithm
except under a separate contract with a major nationwide or global
implementation in mind. The main reasons for this are (1) the complexity
of transferring know-how and trade secrets, and (2) the expensive support
and testing needed for a proper implementation.
This does not mean that you cannot use a SuperMemo algorithm in your
projects. The simplest solution is to implement Algorithm SM-2, which is
described at supermemo.com. Our only requirement in such cases is a
prominent credit given to the authors of SuperMemo. You have to include
the following copyright note and site reference regarding Algorithm SM-2:
If your project is successful and gets a substantial following, you can consider getting in touch again to integrate newer SuperMemo technologies and receive promotional support.
Recursive Interval Function
(Joseph Freeman, Oct 06, 2011, 21:01:08)
Question:
1. on http://www.supermemo.com/help/smalg.htm you write that the Optimal Interval is found by the following function
I(n) = I(n-1) * OF[n,AF]
This seems to be a recursive function.
Answer:
Your derivation is correct; however, not only is it less readable than the original formula, it would also produce a different algorithmic outcome. The reason is that you compute an interval only when you need it, not in advance. In the meantime, between repetitions, the OF matrix keeps changing. This is why OF[n,AF] is different at different points in time. Computing the third interval at the beginning of the process, using your formula, would produce a suboptimum result (discarding the actual measurements of forgetting). Computing the third interval when it is needed, using your formula, would also be invalid, because the current interval, as opposed to the product of OF entries, is actually the best expression of memory strength, even if it was computed suboptimally.
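A small numeric sketch in Python may make the difference concrete. All numbers, and the names `interval_on_demand` and `interval_from_product`, are hypothetical. The sketch assumes the expanded formula is the first interval multiplied by a product of OF entries.

```python
def interval_on_demand(current_interval, of_entry):
    """I(n) = I(n-1) * OF[n,AF], using the OF entry as it stands now."""
    return current_interval * of_entry


def interval_from_product(first_interval, of_entries):
    """Expanded form: the first interval times a product of OF entries,
    all read from the matrix at a single moment."""
    result = first_interval
    for entry in of_entries:
        result *= entry
    return result


first_interval = 4.0       # hypothetical I(1), in days
of_at_start = [2.0, 2.0]   # hypothetical OF[2,AF] and OF[3,AF] at the start
of_later = [1.8, 1.5]      # the same entries after the matrix was corrected

# Original formula: each interval is computed only when it is needed.
second = interval_on_demand(first_interval, of_at_start[0])  # 8.0, actually used
third = interval_on_demand(second, of_later[1])              # 12.0

# Expanded product computed in advance: ignores later corrections.
third_in_advance = interval_from_product(first_interval, of_at_start)  # 16.0

# Expanded product computed late: ignores the interval actually used.
third_recomputed = interval_from_product(first_interval, of_later)  # about 10.8
```

The three values for the third interval differ. Only the first builds on the interval the item actually went through.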
Only one forgetting curve is updated at each repetition
(Joseph Freeman, Oct 06, 2011, 21:01:08)
Question:
You maintain an internal collection of "forgetting curves" for each A-Factor and each Repetition, thus you have 400 curves maintained, and constantly being recalculated for every repetition?
Answer:
There are 400 entries in the RF matrix. However, not all entries can have their R-Factors computed by filling the forgetting curve array with data. For example, easy items rarely go beyond 10 repetitions (e.g. items with AF=6.9 and just 5 repetitions are very rare). At each repetition, only a single data point is added. This means that the exponential approximation of only a single forgetting curve can be, and needs to be, updated.
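The bookkeeping can be pictured with a short Python sketch. The storage layout (one list of observations per matrix entry), the zero-based indices and the name `record_repetition` are assumptions for illustration, not SuperMemo's internal format.

```python
from collections import defaultdict


def record_repetition(curves, repetition_index, difficulty_index,
                      elapsed_days, recalled):
    """Add one data point to the curve of a single RF matrix entry.

    Returns the key of the only curve whose approximation needs updating.
    """
    key = (repetition_index, difficulty_index)
    curves[key].append((elapsed_days, recalled))
    return key


# Hypothetical store: curves exist only for entries that have received data.
curves = defaultdict(list)
touched = record_repetition(curves, 4, 7, elapsed_days=30, recalled=True)
```

After the call, only the curve at `touched` has new data. Entries that never receive repetitions stay empty.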