SuperMemo.com

Do you have a question about memory, learning or SuperMemo? Write to Dr Wozniak

SuperMemo Algorithm

Intervals used in SuperMemo are not optimum intervals!
Higher grades can produce shorter intervals
Even low forgetting index can produce long intervals
First repetition does not have to take place on the next day
Why isn't first repetition followed by interval=1?
Intervals are slightly randomized
The algorithm used in SuperMemo is not "fixed"
The more time you give to SuperMemo, the closer it will approximate your memory needs
SuperMemo contradicts some results reported by Tony Buzan
Grades in final drill do not affect the interval
Your response time does not matter
You can compute your retention from the forgetting index
Different intervals used in different SuperMemos
Use Simulation to estimate workload
SuperMemo ain't science! It's just ad junk
SuperMemo is better than re-wise
Mid-interval repetitions do not bias your measured forgetting index
Simulation of learning process in SuperMemo may be inaccurate
Short-term memory requires no spaced repetition
Multitasking is not recommended in learning
Repetition category is used to update optimization matrices
First Grade vs. A-Factor graph data is kept as a collection of trailing averages
The first experiments helped predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks
Setting the forgetting index to 9% and later increasing it to 12% will likely produce a 2% drop in retention
Why don't you continue your experiment with neural networks?
Why does not SuperMemo experiment with shorter intervals?
Comparing SuperMemo on paper with freeware applications
OF matrix is 20x20
After repetition, SuperMemo modifies OF matrix entry that was used in computing the interval
Changing the OF matrix in SuperMemo
How is the RF matrix computed?
Licensing Algorithm SM-15
Recursive interval function
Only one forgetting curve is updated at each repetition

(Manfred Kremer, Germany, Sep 7, 1998)
Question:
I noticed that frequently I get Optimum Interval in Element Data window shorter than the last interval displayed as Interval. Is it a bug in SuperMemo?
Answer:
No. If your forgetting index is very low, e.g. 3%, SuperMemo will often conclude that you stand a 97% chance of remembering a given element only if your next interval is shorter than the one currently used. In such cases, it will not accept the new value, and the new interval will be at least 5% longer than the previous interval. Please note that a forgetting index of 3% should only be used for selected high-priority items. Keeping the forgetting index at this level throughout the collection will make repetitions annoyingly frequent and ineffective.
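The rule described above can be illustrated with a minimal sketch. It assumes only what the answer states (the new interval is at least 5% longer than the previous one); the function and parameter names are hypothetical and are not taken from SuperMemo itself.

```python
def next_interval(previous_interval: float, proposed_optimum: float,
                  min_growth: float = 1.05) -> float:
    """Return the interval to schedule, in days.

    If the proposed optimum is shorter than the floor (previous interval
    plus 5%), the floor is used instead. min_growth=1.05 mirrors the
    "at least 5% longer" rule quoted in the answer.
    """
    floor = previous_interval * min_growth
    return max(proposed_optimum, floor)


# Optimum (15 days) is shorter than the last interval (20 days):
# the proposed value is rejected and the 5% floor applies.
print(next_interval(20, 15))

# Optimum (30 days) is already longer than the floor: it is used as is.
print(next_interval(20, 30))
```

This is why Optimum Interval in the Element Data window can be shorter than the interval actually used.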

Higher grades can produce shorter intervals
(Mohamad Syafri, Aug 05, 2004, 16:36:09)
Question:
I think that lower grades, e.g. Pass (3), should produce shorter intervals in comparison to higher grades, e.g. Bright (5). It is not always so in the SuperMemo method. Why can I not see the correlation between intervals and the grades given in learning?
Answer:
This is not an error in the algorithm. In SuperMemo, lower grades may produce longer intervals for a number of reasons:

Random dispersion: All intervals are always slightly dispersed around the optimum value. This results in more accurate plotting of the forgetting curves. In addition, interval dispersion prevents large variations in the number of repetitions executed on a single day. If you cancel the grade a few times, you will see that at each try, SuperMemo will provide a slightly different interval. Those intervals are normally distributed around the optimum. This means that most of the time the deviation is small, but occasionally SuperMemo will schedule the element at an interval that is quite different from the optimum interval. Even if none of the remaining reasons listed below apply, it may happen that a lower grade will result in a longer interval by mere chance (esp. for lower forgetting index values, which may approach the boundary conditions on interval change)
Incomplete information: Early in the learning process, SuperMemo is gradually modifying the function of optimum intervals. The speed at which the change occurs depends on the number of repetitions falling into a given category. By definition of SuperMemo, good grades are prevalent and optimization for the corresponding A-Factors proceeds faster. It may happen that for a few weeks, for a range of A-Factors, lower grades will produce longer intervals
Spacing effect: Grade Pass (3) may result in the enhancement of the so-called spacing effect, which may be less visible for Bright (5). The spacing effect says that longer intervals, and consequently greater recall efforts, produce more stable memory engrams. SuperMemo does not arbitrarily set the function of optimal intervals. It computes intervals which are most likely to result in the requested forgetting index
First grade: When pending items are memorized, SuperMemo ignores grades as it has no way of knowing how a given item found its way into the collection. For example, a good grade might result from the fact that the item has just been introduced to the collection

Your impression of no correlation between grades and intervals is quite common among those who begin their work with SuperMemo. If you are not sure if the reasons listed above are legitimate, you can monitor your retention with Tools : Statistics : Analysis : Use : Efficiency. This way you can be sure that SuperMemo will keep its promise of reaching your desired knowledge retention (assuming appropriate formulation of the learning material)
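The effect of random dispersion listed above can be sketched as follows. This is an illustration only: the answer says intervals are normally distributed around the optimum, but it does not give the spread, so the relative standard deviation (rel_sd) and the two optimum intervals below are assumptions.

```python
import random


def dispersed_interval(optimum_days, rel_sd=0.1, rng=random):
    """Draw an interval (in whole days) normally distributed around the optimum.

    rel_sd is a hypothetical parameter: the standard deviation expressed
    as a fraction of the optimum interval.
    """
    draw = rng.gauss(optimum_days, rel_sd * optimum_days)
    return max(1, round(draw))  # never schedule sooner than the next day


rng = random.Random(42)
optimum_for_pass, optimum_for_bright = 28, 30  # hypothetical optimum intervals
trials = 10_000

# Count how often the lower grade ends up with the longer interval by chance.
inversions = sum(
    dispersed_interval(optimum_for_pass, rng=rng)
    > dispersed_interval(optimum_for_bright, rng=rng)
    for _ in range(trials)
)
print(f"lower grade got the longer interval in {inversions / trials:.0%} of trials")
```

Even with a modest spread, a noticeable share of trials puts the lower grade at the longer interval, which is the "mere chance" case mentioned above.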

(Matt Cassidy, New Zealand, Sep 11, 1997)
Question:
Is it possible that with forgetting index equal to 3% I get the first interval equal to 6 days?
Answer:
Yes, especially if the material you work with is relatively easy. You should also remember about random dispersion of intervals. In isolated cases, dispersion might produce intervals substantially longer (or shorter) than the optimum interval. For more, read about the SuperMemo Algorithm

(David Mckenzie, New Zealand, Apr 8, 1998)
Question:
Why doesn't the first repetition after forgetting take place on the day after the unsuccessful repetition (as advised by Tony Buzan and others)?
Answer:
In SuperMemo, the length of the first interval is computed from the forgetting curve plotted in the course of repetitions. This is to make sure that a defined proportion of items is remembered (usually 80-97%). This proportion is programmed by means of the forgetting index. Depending on the forgetting index, the length of the first interval may range from 1 to 20 days, and is not set arbitrarily. It is computed from the record of repetitions and determined by the desired forgetting index (requested forgetting index is the proportion of items that are not remembered at repetitions). While Buzan’s recommendation is valid in many cases, you should not forget that SuperMemo computes intervals with a high degree of accuracy that cannot otherwise be easily achieved
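A minimal sketch of how a first interval can be read off a forgetting curve for a given forgetting index. It assumes a simple exponential forgetting curve, recall(t) = exp(-k * t), and a single hypothetical observation (95% recall after 8 days). SuperMemo plots its curves from the actual record of repetitions, so both the curve shape and the numbers here are assumptions.

```python
import math


def decay_rate_from_observation(days: float, recall: float) -> float:
    """Fit k in recall(t) = exp(-k * t) from one (days, recall) observation."""
    return -math.log(recall) / days


def first_interval(decay_rate: float, forgetting_index: float) -> float:
    """Time at which recall drops to 1 - forgetting_index."""
    return -math.log(1 - forgetting_index) / decay_rate


# Hypothetical observation: 95% of items are still recalled after 8 days.
k = decay_rate_from_observation(days=8, recall=0.95)

for fi in (0.03, 0.05, 0.10, 0.20):
    print(f"forgetting index {fi:.0%}: first interval ~{first_interval(k, fi):.1f} days")
```

The lower the requested forgetting index, the shorter the first interval. The interval follows from the curve and the index, and is not fixed at one day.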

Why isn't first repetition followed by interval=1?
(Rick Natt, South Africa, Jun 30, 2007, 13:02:38)
Question:
I use SuperMemo mainly for memorizing foreign languages and my mathematics notes. When I learn a new word, how come the first repetition is about 8 days from now? Surely the first repetition should be the next day and only after that should the interval get longer
Answer:
Having intervals of one day would have the following negative consequences:

you would increase your workload
memory imprint produced would be weaker due to the spacing effect
successive intervals would also need to get shorter multiplying the negative effects

SuperMemo produces intervals that are as long as possible within the limits of your desired level of recall. For example, if you plan to recall 95% of foreign vocabulary, the first interval is more likely to be around 8 than to be around 1.

Question:
Why is the first interval after which the first repetition takes place not equal in all cases?
Answer:
It is randomly modified to speed up computing its optimal value. Additionally, random dispersion of intervals around the optimum value prevents repetitions from being packed on a given day, while neighboring days have lots of room to accommodate new items.

The algorithm used in SuperMemo is not "fixed"
(DaVinci, Oct 04, 2004, 13:20:30)
Question:
SuperMemo is always increasing the intervals at the same rate (given the same series of grades), while FullRecall is smarter and adapts to the learner
Answer:
This is not true. The rate of interval increase is determined by the matrix of optimum intervals and is by no means constant. Moreover, the matrix of optimum intervals changes over time depending on the user's performance. You may have an impression of a fixed or rigid algorithm only after months or years of use (the speed of change is inversely proportional to the available learning data). This convergence reflects the invariability of the human memory system. It does not matter if you use an algebraic or a neural approach to the optimization problem. In the end, you will arrive at the spaced repetition function that reflects the true properties of your memory. In that light, the speed of convergence should be held as a benchmark of an algorithm's quality. In other words, the faster the interval function becomes "fixed", the better

The more time you give to SuperMemo, the closer it will approximate your memory needs
(Luis Gustavo Neves da Silva, Brazil, Thursday, January 04, 2001 4:29 PM)
Question:
If I memorize a collection of 200 items with SuperMemo and make regular repetitions, when will my measured forgetting index be closest to the requested forgetting index: after a day, after a week, after a month or after a year?
Answer:
In conditions of no outside interference, the more time you give to SuperMemo, the better it will approach the requested level of retention. However, if you encounter this knowledge in real life (i.e. outside SuperMemo), the result cannot be predicted. For example, interference early in the interval may have no effect, while interference later in the interval may increase retention in the repetition to come and reduce the retention in the repetition that will follow yet another interval. The outcome will depend on the timing relationship of interference and measurement

Question:
Tony Buzan claims that 75% of information is lost if not reviewed in 24 hours. Does it not defeat the validity of SuperMemo in which the first interval is often longer than a week?
Answer:
No. Buzan's claim may refer to textbook knowledge or complex knowledge structures (e.g. large mind maps). However, it does not seem accurate in reference to simple well-structured material in the light of results obtained with SuperMemo. In SuperMemo, if the student chooses the retention of 95%, the typical value of the first interval falls in the range 2-5 days depending on the student and the difficulty of the learned material. For retention 25%, the same interval might be as long as one month, though this cannot be verified experimentally with SuperMemo, which limits the forgetting index to the range 3-20%, implying overall retention in the range of 89-99%. For more see: Theoretical background of SuperMemo

(Grzegorz Malewski, Poland, Dec 10, 1997)
Question:
Do grades at final drill affect the learning process?
Answer:
No. They are only used to eliminate items from the final drill queue.

(Ryszard Siwczyk, Poland, Nov 4, 1997)
Question:
Does the response time at repetitions influence the next interval?
Answer:
No. Repetition timer is only used to compute the average response time and Workload.

Setting the forgetting index to 9% and later increasing it to 12% will likely produce a 2% drop in retention
(Tomasz Szynalski, Poland, Oct 18, 1998)
Question:
What retention can I obtain with the forgetting index set to 9%? What if I then change it to 12%?
Answer:
The formula that relates the forgetting index to the retention looks like this (source):

retention = -(forgetting index)/ln(1-(forgetting index))

If you accomplish the forgetting index of 9%, the retention will equal 95.4%. For 12%, the same figure will be 93.9%. Note that if your material is very difficult, your measured forgetting index may be higher than the requested forgetting index. This comes from the fact that SuperMemo imposes some boundary conditions on the increase of intervals. Elements that have been forgotten more than five times should be reformulated with a view to reducing their difficulty or increasing their mnemonic component.

If you initially set the forgetting index to 9% and later on increase it to 12%, you will probably start with retention of 94-95% which will later gradually decrease to 92-93% (after the change)
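The formula above is easy to check numerically. The sketch below is a direct transcription of it; only the function name is an assumption.

```python
import math


def retention(forgetting_index: float) -> float:
    """retention = -(forgetting index) / ln(1 - (forgetting index))"""
    if not 0 < forgetting_index < 1:
        raise ValueError("forgetting index must be a fraction between 0 and 1")
    return -forgetting_index / math.log(1 - forgetting_index)


for fi in (0.03, 0.09, 0.12, 0.20):
    print(f"forgetting index {fi:.0%} -> retention {retention(fi):.1%}")
```

For 9% and 12% this reproduces the 95.4% and 93.9% quoted above. For the limits of the forgetting index range (3% and 20%), it gives roughly 98.5% and 89.6%.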

Different intervals used in different SuperMemos
(Stracner, Jason, United States, Jul 13, 2004, 06:39:47)
Question:
Why are there different intervals used in paper SuperMemo, SuperMemo for Palm Pilot, SuperMemo for Pocket PC and SuperMemo for Windows?
Answer:
Only SuperMemo for Windows computes accurate intervals that ensure achieving the desired forgetting index (assuming no overload and assuming the learning material is correctly formulated). For technical reasons, other SuperMemos use simplified algorithms (usually based on E-Factors, which lead to minimum computing). In addition, authors of individual SuperMemos often introduce their own modifications to the algorithm (e.g. limits on E-factor change, arbitrary first interval, interpretation of memory lapses, etc.). Paper SuperMemo is the least accurate. It uses fixed intervals for conglomerate pages of items. In such circumstances, it is not possible to differentiate between difficult and easy material. Nor is it easy (or worthwhile) to modify the function of optimum intervals on paper based on the feedback from the learning process

Use Simulation to estimate workload
(Piotr Wasik, Poland, Tue, Apr 24, 2001 14:26)
Question:
I would like to know how to estimate my workload on a large collection if I commit 40 items per day and keep the forgetting index at the default ten percent
Answer:
If you use Tools : Statistics : Simulation and set: (1) the forgetting index to 10%, and (2) daily repetitions to 230 items, you will get 40 new items memorized per day. In other words, your workload might roughly be 230 items/day (this will vary greatly depending on the quality of your learning material)

(Ben Lister, Apr 5, 2000)(comment placed at PalmGear in reference to SuperMemo for Palm Pilot)
Comment:
[…] this "scientific study" thing [in reference to SuperMemo], that's a bunch of advertisement junk. It's not science that they use, but simple variable calculations. The program detects the number of times you have gotten something right or wrong and then uses this number to decide when it should next be on a test, like a priority list
Rejoinder:
Ben Lister's observations were, surprisingly, derived from SuperMemo for Palm Pilot, which reveals little of its internally used optimization procedures (which are slightly simplified as compared with the Windows version). The timing of repetitions may remotely be correlated with the number of bad grades, but is actually derived from the currently estimated so-called memory stability (which reflects how quickly memory traces volatilize) and knowledge difficulty (which reflects the increase in stability as a result of a repetition). Consequently, items with a large number of memory lapses can be repeated in longer intervals than items with fewer memory lapses (contrary to what Ben Lister claims). SuperMemo does not use a priority list in scheduling repetitions. The scheduling algorithm does not consider the relationship between items but their reference to time, their difficulty and the history of repetitions. For example, if repetitions were to be delayed, the "priorities" would change. So would the scheduling. The central point of the algorithm is based on plotting the forgetting curve and predicting the optimum timing of the next repetition. The greatest merit of SuperMemo is in fine-tuning the algorithm so that it can adapt to individual skills and knowledge in the shortest possible time. No algorithm running in abstraction from the actual forgetting curve stands a chance to compete with SuperMemo in its efficiency (this includes neural algorithms that indirectly will predict the probability of forgetting)

(Tomasz Strzelczyk, Poland, Jan 31, 2001)
Question:
I would like to invest in a Langmaster course of English. How would you convince me that SuperMemo is superior to the rewise method?
Answer:
English courses by Dr Lang have a very good reputation for quality and they can be recommended independent of the question on the efficiency of the re-wise method. You could invest in a Langmaster course that best suits your needs and boost your vocabulary with SuperMemo and Advanced English. Alternatively, you could use stand-alone SuperMemo for learning the material from the Langmaster course. As for the re-wise method (developed in the Czech Republic around 1994), its principles are similar to SuperMemo; however, we have no doubts as to the superiority of SuperMemo technology that encompasses far more than the repetition scheduling. We have not tested re-wise extensively, but our customers who also use Langmaster CD-ROM titles unanimously confirm that they prefer to keep their learning material in SuperMemo. Combining Langmaster courses with stand-alone SuperMemo or with Advanced English would probably be the recommended course of action

Mid-interval repetitions do not bias your measured forgetting index
(Robert Drzd, Poland, Friday, November 04, 2005 6:16 PM)
Question:
The measured forgetting index is not updated when I mid-review items from the future. For instance when I repeat those items collected with the Filter option. A retention value in the Workload window is updated correctly
Answer:
To prevent improving the measured forgetting index with early repetitions, it is not updated if the repetition occurs before its scheduled date. This selective measurement is not applied to "today's" measured forgetting index nor to the retention data displayed in the Workload window. Consequently, the latter two may show better readings if you use mid-interval review a lot

(Terje A. Tonsberg, Kuwait, Jan 13, 2001)
Question:
I received seemingly wrong results when using Tools : Statistics : Simulation in SuperMemo. I got the minimum speed of learning for the forgetting index of 8%. Once I reduced the forgetting index to 5% and further to 3%, the expected speed of learning increased substantially!
Answer:
Your observation is accurate. Indeed, there is a discrepancy between the simulation procedure and the SuperMemo Algorithm used in computing the optimum schedule of repetitions. In many texts about SuperMemo you will find that SuperMemo ensures the retention programmed with the forgetting index on the assumption that there is no delay in repetitions. This is, however, imprecise. To prevent clogging up the learning process, Algorithm SM-15 makes an assumption for grades Pass and above (3-5) that the interval must increase by at least one day (i.e. it cannot decrease nor can it stay the same). For low values of the forgetting index and for difficult learning material, this assumption actually means a significant departure from the expected retention level. Unfortunately, the simulation procedure does not take this fact into account, and it does not attempt to correct the forgetting rate, which increases as a result of the rigid limitation imposed on the increase in intervals. Consequently, for a low value of the requested forgetting index, the measured forgetting index may be higher than the one assumed for the simulation purposes. This will naturally produce skewed simulation results: repetitions scheduled late by Algorithm SM-15 will still produce high retention as programmed by the forgetting index fed into the simulation input data. For that reason, simulation may be inaccurate for a low forgetting index if your learning material is difficult.
In the future, we hope to make it possible to adjust retention for the departure of Algorithm SM-15 from the optimum repetition schedule, as well as to make it possible to simulate repetitions at intervals that fully comply with the forgetting index. That latter option would naturally slow down the learning process even further
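The discrepancy can be illustrated with a small sketch. It assumes exponential forgetting calibrated so that recall at the optimum interval equals 1 minus the forgetting index; that curve shape, the function names and the example numbers are assumptions made for illustration, and only the "at least one day longer" rule comes from the answer above.

```python
def scheduled_interval(previous: float, optimum: float,
                       min_increase_days: float = 1) -> float:
    """Interval actually used for grades Pass and above (3-5):
    it must exceed the previous interval by at least one day."""
    return max(optimum, previous + min_increase_days)


def recall_at(interval: float, optimum: float, forgetting_index: float) -> float:
    """Expected recall at `interval`, assuming exponential forgetting
    with recall == 1 - forgetting_index exactly at the optimum interval."""
    return (1 - forgetting_index) ** (interval / optimum)


# Hypothetical difficult item: previous interval 10 days, but the optimum
# for a requested forgetting index of 3% would be only 6 days.
requested_fi = 0.03
used = scheduled_interval(previous=10, optimum=6)
actual_recall = recall_at(used, optimum=6, forgetting_index=requested_fi)

print(f"interval used: {used} days")
print(f"expected recall: {actual_recall:.1%} (requested: {1 - requested_fi:.0%})")
```

A simulation that assumes the requested recall at every repetition, while the schedule runs later than the optimum, will overestimate retention in exactly this way.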

Short-term memory requires no spaced repetition
(Mark G. Patterson, USA, Wednesday, July 18, 2001 6:24 AM)
Question:
The encoding phase of SuperMemo could be dramatically improved by providing a micro-spacing algorithm that presented each new item for recall 3 to 4 times during a 30 minute interval in an expanding pattern. For example, 0, 5, 15, 30 minutes
Answer:
Spaced repetition is valid for long-term processes and its purpose is to minimize the number of presentations and maximize the memory effect by sufficient spacing. However, improved recall within the span of short-term memory can be accomplished only in cases where the initial encoding was incomplete or insufficient. In other words, an important assumption in SuperMemo is that the first exposure should be used to formulate a valid memory engram that will last until the first repetition. Ideally, even the concept of final drill is excessive and serves solely as an insurance against imperfect concentration on the memory task. Sufficiently encoded short-term memories will always be converted to long-term memories and will likely last a few days until the moment of the first repetition

Multitasking is not recommended in learning
(dansujp, Sun, Sep 16, 2001 3:07 PM)
Question:
Here is another improvement for SuperMemo. When I reviewed the flashcards, I would lay them out on a large table so that I could see 30 at a time, and would pick up the cards for which I knew the answer. Sometimes the answer takes a few seconds to surface. In the mean time I can be looking at other cards and thinking about them in a multitasking fashion. In SuperMemo there is only one question at a time, so it is frustrating to sit there and wait and not have anything else to do until the answer appears
Answer:
Research shows that multitasking considerably reduces cognitive powers. Optimally you should be able to focus on a single recall at a time. In addition, recall should, ideally, be instantaneous. Long and frustrating retrieval times would typically indicate ill-formulated items of high complexity. Your solution might increase the fun of learning for overly complex material, but if you apply the minimum information principle along with other pivotal rules of knowledge representation, multitasking would reduce your processing speed. In the past, we have added a number of options to SuperMemo by sheer user pressure; however, it can be demonstrated that in many cases this has actually done harm to the user's learning process. We consequently remove options that are frequently misused (e.g. Batch Repetitions, Background Repetitions, some rescheduling tools, and more)

Repetition category is used to update optimization matrices
(Steven Trezise, USA, Apr 20, 1999)
Question:
In my collection, I have items for which I have done between 1 and 8 repetitions. However, when I look at the Cases matrix, there are no entries beyond repetition 3
Answer:
The algorithm used by SuperMemo updates all optimization matrices using repetition categories, not the actual repetition number (you can view the optimization matrices with Tools : Statistics : Analysis : Matrices). A repetition category is an expected number of repetitions needed to reach the currently used interval. Once the matrices change, the estimation of repetition category may change too. If, for example, you score well in repetitions and your intervals become longer, it will take fewer repetitions to get to a given interval. In such a case, you might be at 8-th repetition while your repetition category will be 3. All matrices such as OF matrix, RF matrix, etc. will be updated in the third row (not in the 8-th row)
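A simplified sketch of the idea of a repetition category. It assumes a hypothetical first interval and a constant growth factor standing in for the function of optimum intervals. SuperMemo derives its expected intervals from the optimization matrices, so this only illustrates why the category can be lower than the actual repetition count.

```python
def repetition_category(current_interval: float, first_interval: float,
                        growth_factor: float) -> int:
    """Expected number of repetitions needed to reach current_interval,
    if intervals start at first_interval and grow by growth_factor each time.
    Both parameters are hypothetical stand-ins for the matrix-derived values."""
    if first_interval <= 0 or growth_factor <= 1:
        raise ValueError("first_interval must be > 0 and growth_factor > 1")
    category = 1
    interval = first_interval
    while interval < current_interval:
        interval *= growth_factor
        category += 1
    return category


# With slowly growing intervals, a 100-day interval takes six repetitions:
print(repetition_category(100, first_interval=4, growth_factor=2))

# After the matrices change and intervals grow faster, the same 100-day
# interval corresponds to category 3, regardless of how many repetitions
# the item has actually had.
print(repetition_category(100, first_interval=8, growth_factor=4))
```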

First Grade vs. A-Factor graph data is kept as a collection of trailing averages
(WangDong, China, Dec 08, 2003)
Question:
You say that "At each repetition, the current element's old A-Factor estimation is removed from the G-AF graph and the new estimation is added". But I found a different result. I have used a collection which has only one item. After many repetitions, I saw the G-AF graph. I found many changes (not a single point change). Only one point in the G-AF graph should change, because there was only one item in the collection. I guess your algorithm is: "In the G-AF graph, for every A-Factor value (1.2-6.9), calculate the average first grade of the items which have this certain A-Factor. When the A-Factor of an item is changed, calculate the average first grade of the corresponding A-Factor again." Is it right?
Answer:
Your reasoning is absolutely correct! However, to minimize the size of data kept by the algorithm, SuperMemo does not use averages to compute the first grade for each A-Factor category. Instead, it uses trailing averages. This way, instead of storing a large sum and the number of cases recorded, SuperMemo stores a single short number expressing the average first grade. This approach saves space, but does not make it possible to correct averages once A-Factor changes. SuperMemo simply computes the new trailing average for the new A-Factor. This is why a single item can introduce a lot of noise in the graph. Please note that the weight for averaging is highest for Repetition=2 where A-Factor is the same as O-Factor. For that single repetition, you should see the greatest change in the graph
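The difference between a true average and a trailing average can be sketched as follows. The weighting scheme (a fixed weight of 0.2) and the dictionary layout are assumptions made for illustration. The answer only states that a single number per A-Factor category is stored and that the weight varies with the repetition number.

```python
def update_trailing_average(stored: float, new_grade: float,
                            weight: float = 0.2) -> float:
    """Blend a new first grade into the single stored number.
    No sum and no case count are kept, so an earlier contribution
    cannot be taken back out later."""
    return (1 - weight) * stored + weight * new_grade


# Hypothetical graph data: average first grade per A-Factor category.
first_grade_by_af = {3.0: 4.0, 3.2: 4.0}

# A single item with first grade 2 is initially estimated at A-Factor 3.0 ...
first_grade_by_af[3.0] = update_trailing_average(first_grade_by_af[3.0], 2)

# ... and later re-estimated at A-Factor 3.2. The new category is updated,
# but the old one keeps the trace of the earlier estimate.
first_grade_by_af[3.2] = update_trailing_average(first_grade_by_af[3.2], 2)

print(first_grade_by_af)  # both points have moved, although there is only one item
```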

The first experiments helped predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks
(Tomasz Szynalski, Poland, Oct 18, 1998)
Question:
When first versions of SuperMemo were released, how could SuperMemo predict intervals that were many years long if it had only been researched for a couple of years? I read that the first version was released after just 3-4 years of research on the length of intervals.
Answer:
The first experiments in reference to the length of optimum interval resulted in conclusions that made it possible to predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks! In short, it could be illustrated with the following reasoning: if the first months of research yielded the following optimum intervals: 1, 2, 4, 8, 16 and 32 days, you could with confidence hope that the successive intervals would increase by a factor of two. To better understand what reasoning led to the first formulation of SuperMemo read: First experiments: 1982-1985
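The reasoning can be written down as a short extrapolation. This is only a sketch of the illustrative example given in the answer (intervals of 1, 2, 4, 8, 16 and 32 days), not of the actual experimental procedure. The function name is hypothetical.

```python
def extrapolate_intervals(observed, extra):
    """Extend a series of intervals using the average ratio between
    successive observed intervals."""
    ratios = [later / earlier for earlier, later in zip(observed, observed[1:])]
    factor = sum(ratios) / len(ratios)
    result = list(observed)
    for _ in range(extra):
        result.append(result[-1] * factor)
    return result


observed_days = [1, 2, 4, 8, 16, 32]       # intervals measured within weeks
predicted = extrapolate_intervals(observed_days, extra=8)
print(predicted)
print(f"longest predicted interval: about {predicted[-1] / 365:.0f} years")
```

Eight further doublings after 32 days already reach intervals of more than two decades, none of which had to be measured directly.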

Why don't you continue your experiment with neural networks?
(Bartosz, Poland, Nov 01, 2006, 14:08:40)
Question:
Why don't you continue your experiment with neural networks? I agree with MemAid in that your models might be wrong, and a neural network can find the real truth about how memory works? Neural networks are unprejudiced
Answer:
It is not true that SuperMemo is prejudiced while a neural network is not. Nothing prevents the optimization matrices in SuperMemo from departing from the memory model and producing an unexpected result. It is true that over the years, with more and more knowledge of how memory works, the algorithm used in SuperMemo has been armed with restrictions and custom-made components. None of these were a result of a wild guess though. The progression of "prejudice" in SuperMemo algorithms is only a reflection of findings from previous years. The same would inevitably affect any neural network implementation if it wanted to maximize its performance.

It is also not true that the original pre-set values of optimization matrices in SuperMemo are a form of prejudice. These are an equivalent of pre-training in a neural network. Moreover, a neural net that has not been pre-trained will be slower to converge to the optimum model. This is why SuperMemo is "pre-trained" with the model of an average student.

Finally, there is another area where neural networks must either use the existing knowledge of memory models (i.e. carry a dose of prejudice) or lose out on efficiency. The experimental neural network SuperMemo, MemAid, as well as FullRecall have all exhibited an inherent weakness. The network achieves stability when the intervals produce a desired effect (e.g. specific level of the measured forgetting index). Each time the network departs from the optimum model, it is fed with a heuristic guess on the value of the optimum interval depending on the grade scored during repetitions (e.g. grade=5 would correspond with 130% of the optimum interval in SuperMemo NN or 120% in MemAid). Algebraic SuperMemo, on the other hand, can compute an accurate value of A-Factor, use the accurate retention measurement, and produce an accurate adjustment of the value of the OF matrix. In other words, it does not guess on the optimal interval. It computes its exact value for that particular repetition. The adjustments to the OF matrix are weighted and produce a stable non-oscillating convergence. In other words, it is the memory model that makes it possible to eliminate the guess factor. In that respect, algebraic SuperMemo is less prejudiced than the neural network SuperMemo.

Neural network SuperMemo was a student project with a sole intent to verify the viability of neural networks in spaced repetition. Needless to say, neural networks are a viable tool. Moreover, all imaginable valid optimization tools, given sufficient refinement, are bound to produce similar results to those currently accomplished by SuperMemo. In other words, as long as the learning program is able to quickly converge to the optimum model and produce the desired level of knowledge retention, the optimization tool used to accomplish the goal is of secondary importance.

Why does not SuperMemo experiment with shorter intervals?
(Jeremias Sauceda, Tuesday, November 10, 2009 9:14)
Question:
I have been using SuperMemo to learn Japanese, it's been great! I have also used it to teach my children math and they enjoy it very much. When I started learning Japanese I used the Pimsleur tapes which use very short intervals during each lesson; 5 seconds, 25 seconds, 2 minutes, 10 minutes, etc. according to Wikipedia. Also Anki reviews missed repetitions after 20 minutes. I am curious if there is a reason SuperMemo does not use a similar approach for learning new material or dealing with the drill queue? Or is it just a feature you have not explored at this time?

Answer:

This is one of the most frequently asked questions about the SuperMemo method. SuperMemo is all about the maximum speed of learning understood as the increase in the total body of knowledge. Repetitions at intervals shorter than those computed by SuperMemo are sometimes referred to as microspacing. Those repetitions may operate on short-term memory (in intervals of minutes or hours), or on long-term memory with a strong impact of the spacing effect. As such, those repetitions are of far less value than the usual repetitions served by SuperMemo. If you learn for an exam, or strive for high fluency or speed of recall, microspacing may play its role. However, if you work on long-term memories that are to last for years, those additional repetitions are either of lesser value per memory effect, or even counterproductive (i.e. not only reducing the inflow of new knowledge, but also adding extra work to retaining the equivalent body of material). Boosting short-term memory with microspacing has a negligible long-term effect (esp. for long-interval repetitions). Any attempt at repetition ahead of time inevitably evokes the spacing effect and thus reduces the impact of the repetition (even though many users just cannot resist the temptation and abuse the tools for early repetitions provided by SuperMemo). The value of microspacing cannot be demonstrated with SuperMemo, with memory models used in SuperMemo, or with any notable research in the field of spaced repetition (if you happen to encounter sources claiming otherwise, please get in touch).

Classical SuperMemo adds the concept of the final drill, which can also be considered a form of microspacing. The final drill might be useful if you work on a high volume of high-interference memories (e.g. vocabulary of a foreign language). Its main role, however, should not be as a reinforcer of memories, but as a tool to establish or re-establish memories in cases of poor recall. In other words, the final drill is not to be used for microspacing, but to build the first imprint of memories that one failed to imprint at the first attempt. As compared with standard repetitions, the final drill can also produce less value per unit of time, and advanced users are advised never to use it. The most efficient approach is to develop a habit of never passing a repetition with a failing grade without an attempt to establish some form of a correct memory imprint. This may be difficult for beginners. But this habit can be developed over time, and an advanced student should never need to use the final drill. If you learn a foreign language, you may still want to use the final drill as a form of battle against memory interference. However, if you use incremental reading to gain structured knowledge (e.g. of sciences), going through the final drill will only add to your learning time. If you feel you still need the final drill in incremental reading, look closely at the quality of the learning material and your repetition habits, esp. the ability to focus on each and every repetition.

None of the claims above is to say that Pimsleur and/or Anki take a wrong approach. Each method and application has its own focus, goals and user base. If you learn high-interference knowledge (e.g. vocabulary), or add a degree of procedural learning (e.g. pronunciation), or add a degree of pattern recognition (e.g. comprehension), or limit the size of the body of knowledge (e.g. core vocabulary), optimum strategies will change. Experimenting with different approaches can serve as a rich cross-fertilization for the future (as evidenced by your interesting question). Moreover, SuperMemo works on the assumption that the user is aware of the optimum learning techniques. However, there are psychological effects related to concepts such as the final drill, or microspacing. If the user experiences a false sense of higher productivity, or just plain enjoyment, the benefits may outweigh the costs. This is why having many systems with many approaches allows for richer comparisons that are not just limited to dry theory.

Comparing SuperMemo on paper with freeware applications
(Dan S.K., Russian Federation, Dec 28, 2009, 10:03:32)
Question:
Is figure 2 table from http://www.supermemo.com/articles/paper.htm applicable for words cramming? Paper-and-pencil SuperMemo article is soon 20 years old and I believe you have more accurate data on time spacing when it comes to using no algorithms and computers, but just constant schedules. I also came across this http://www.learnwords.ru/repeat.html. And it is stated here that:
1st repetition should be scheduled 30 minutes after reading words for the first time
2 - after an hour
3 - after 9 hours
4 - after 24 hours
5 - after 3 days
6 - after 6 days
7 - after 12 days
Answer:

The term optimum interval used by SuperMemo is misleading and inaccurate, as optimum intervals do not exist until you define the criteria for computing the optimality. SuperMemo uses the concept of the forgetting index to define the optimum interval for a given desired retention level. SuperMemo on paper was defined in 1985 and roughly approximates optimum intervals for heterogeneous material based on the minimum information principle and the forgetting index of 10%. This definition strongly depends on how heterogeneous material is defined, as the distribution of material difficulty can dramatically impact the value of intervals. The original definition of SuperMemo was based on material largely composed of word pairs and should be applicable to most selections of foreign vocabulary material within the same language family. This means that it could work well for Poles who learn English, but might need some revision for Swedes who learn Russian, and a major revision for Spaniards who learn Korean. Naturally, all those theoretical divagations are of little relevance today when we can employ computers to compute the optimum spacing of repetitions for all individual pieces of information.

As for the quoted Russian software for learning words, it seems to approximate SuperMemo with an extra criterion of adding a bit of "mental comfort". It takes the intervals used by SuperMemo on paper (some shortened by a day), and adds four extra microspaced repetitions before the first repetition that would normally be scheduled by SuperMemo. This resolves the problem of mental discomfort experienced by those who encounter SuperMemo for the first time, where the first repetition is often scheduled at a far later time than expected by the user. A frequent complaint from beginners is "why are intervals in SuperMemo so long? by the time I review, I no longer remember". Microspacing has been proposed over and over again from various quarters (incl. more advanced users who believe in theories that back the value of microspacing beyond the simple added mental comfort).

The benefit of spaced repetition as compared to traditional learning is substantial enough to make the above differences less significant. You can use SuperMemo on paper, simple freeware applications or the newest SuperMemo algorithms. All of these are likely to be highly beneficial.

OF matrix is 20x20
(saeideh monfared, Iran, Nov 07, 2009, 14:34:41)
Question:
How big is the OF matrix used by SuperMemo (what are its dimensions)? Is it the same for SuperMemo 5 as it is for SuperMemo 15?
Answer:
The matrix has 20 rows and 20 columns. It can be inspected in SuperMemo 15 with Tools : Statistics : Analysis : Matrices : OF Matrix. It is the same matrix as used in SuperMemo 5; however, its entries are processed differently in successive algorithms. The 20 rows correspond to 20 repetitions (or repetition categories in later SuperMemos). The 20 columns correspond to A-Factors in newer SuperMemos (or E-Factors in older SuperMemos). The entries of the matrix determine the intervals between repetitions. Each item difficulty (A-Factor or E-Factor) has a different set of intervals for different repetitions (or repetition categories).
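
The layout described above can be sketched in a few lines of Python. This is only an illustration, not SuperMemo's code: the names `make_of_matrix`, `af_column` and `of_entry` are hypothetical, the constant starting value is arbitrary, and the even spacing of A-Factor categories from 1.2 to 6.9 in steps of 0.3 is an assumption made for the example.

```python
ROWS = 20  # repetitions (or repetition categories)
COLS = 20  # item difficulty categories (A-Factors)

# Assumption: A-Factor categories are evenly spaced from 1.2 to 6.9.
AF_MIN, AF_STEP = 1.2, 0.3


def make_of_matrix(initial=2.0):
    """Return a 20x20 matrix with every entry set to `initial`."""
    return [[initial] * COLS for _ in range(ROWS)]


def af_column(a_factor):
    """Map an A-Factor to a column index, clamped to the matrix bounds."""
    col = round((a_factor - AF_MIN) / AF_STEP)
    return max(0, min(COLS - 1, col))


def of_entry(of_matrix, repetition, a_factor):
    """Look up the entry for a 1-based repetition number and an A-Factor."""
    row = max(0, min(ROWS - 1, repetition - 1))
    return of_matrix[row][af_column(a_factor)]


of_matrix = make_of_matrix()
print(len(of_matrix), len(of_matrix[0]))  # 20 20
print(of_entry(of_matrix, 3, 6.9))        # 2.0
```

Repetition numbers beyond 20 are clamped to the last row here, which is one simple way of treating rows as categories rather than exact repetition counts.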

After repetition, SuperMemo modifies OF matrix entry that was used in computing the interval
(saeideh monfared, Nov 19, 2009, 23:17:02)
Question:
In SM-5 algorithm which is explained here: http://www.supermemo.com/english/ol/sm5.htm in step 7: what do you mean by relevant entry of the OF matrix?
Answer:
SuperMemo uses a simple principle: "use, verify and correct". After a repetition, a new interval is computed with the help of the OF matrix. The "relevant entry" to compute the interval depends on the repetition (category) and item difficulty. After the interval elapses, SuperMemo calls for the next repetition. The grade is used to tell SuperMemo how well the interval "performed". If the grade is low, we have reasons to believe that the interval is too long and the OF matrix entry is too high. In such cases, we reduce the OF entry slightly. The "relevant entry" here is the one that was used previously in computing the interval (i.e. before the interval started). In other words, in both cases, the "relevant entry" is the entry that is used to compute the interval (after the n-th repetition) and then to correct the OF matrix (after the (n+1)-th repetition).
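
The "use, verify and correct" cycle can be sketched as follows. This is a simplified illustration under stated assumptions, not the SM-5 correction formula: the function names are hypothetical, the threshold for a "low" grade (below 3) and the 10% reduction are assumptions, and the sketch only handles the low-grade case described above.

```python
PASS_GRADE = 3  # assumption: grades below 3 count as "low"


def compute_interval(of_matrix, row, col, previous_interval):
    """Use: multiply the previous interval by the relevant OF entry."""
    return previous_interval * of_matrix[row][col]


def correct_entry(of_matrix, row, col, grade, factor=0.9):
    """Verify and correct: shrink the entry that produced a failed interval."""
    if grade < PASS_GRADE:
        of_matrix[row][col] *= factor  # the interval was probably too long


of_matrix = [[2.0] * 20 for _ in range(20)]
row, col = 2, 5  # the "relevant entry" chosen after the n-th repetition

interval = compute_interval(of_matrix, row, col, previous_interval=6.0)
# ... the interval elapses and the (n+1)-th repetition is graded ...
correct_entry(of_matrix, row, col, grade=1)

print(interval)             # 12.0
print(of_matrix[row][col])  # slightly below 2.0 after the low grade
```

Note that the same `(row, col)` pair is passed to both functions: the entry corrected after the (n+1)-th repetition is the one that was used to compute the interval after the n-th repetition.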

Changing the OF matrix in SuperMemo
(saeideh monfared, Nov 19, 2009, 23:17:02)
Question:
Would you please give me an OF matrix sample and explain how it will be changing, and why we need to create this matrix fully at first?
Answer:
You can see the OF matrix in SuperMemo in Tools : Statistics : Analysis : Matrices : OF Matrix. This matrix changes over time to make sure that the intervals produced meet the optimization criteria (in the latest SuperMemos, the only criterion is the requested forgetting index). We set the original value of the matrix to be sure we can use it in computing the initial intervals that will later be used in verifying and modifying the matrix. The initial value of the matrix is taken from earlier versions of SuperMemo. In the late 1980s, some experiments were made to prove that this matrix can easily be produced within a couple of months from a constant matrix composed of all entries equal to 1.5, 2.0, 3.0 or similar using early SuperMemo algorithms (e.g. Algorithm SM-5).

How is the RF matrix computed?
(Edward Douglas, Jul 31, 2010, 21:19:41)
Question:
I have not managed to find information on the functions of approximating the RF-matrix in SuperMemo
Answer:
The RF-matrix contains actual data/measurements, and thus is not an approximation. Naturally, this data can be considered an approximation of how your memory works: the more data you get, the closer you get to the real reflection of your memory. In simplest terms, the RF matrix contains columns that reflect item difficulty, and rows that reflect the strength of memory. For each entry, SuperMemo collects repetition data and plots a forgetting curve, i.e. how much you forget with the passage of time. Each entry of the RF matrix corresponds to the point at which forgetting reaches the level defined by the forgetting index. This way, if we know how difficult an item is and how well it is remembered at the moment of the repetition (or how many repetitions it went through), we can predict (roughly) at which moment of time the probability of forgetting will equal the forgetting index. That's the time we want to have the next repetition.
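
As a rough illustration of the last step, the sketch below fits an exponential forgetting curve R(t) = exp(-d * t) to a handful of data points and then finds the time at which forgetting reaches the forgetting index. The exponential form, the least-squares fit through the origin, the function names and the sample data (generated from a known decay constant of 0.05) are all assumptions made for this example, not a description of SuperMemo's internals.

```python
import math


def fit_decay(points):
    """Least-squares fit of ln(R) = -d * t through the origin.

    `points` is a list of (elapsed_time, retention) pairs with 0 < retention <= 1.
    """
    num = sum(t * math.log(r) for t, r in points)
    den = sum(t * t for t, _ in points)
    return -num / den


def time_at_forgetting_index(decay, forgetting_index):
    """Time at which retention drops to 1 - forgetting_index."""
    return -math.log(1.0 - forgetting_index) / decay


# Hypothetical measurements generated from a known decay constant.
points = [(t, math.exp(-0.05 * t)) for t in (1.0, 2.0, 4.0, 8.0)]

decay = fit_decay(points)
optimum_time = time_at_forgetting_index(decay, forgetting_index=0.10)
print(round(decay, 4), round(optimum_time, 2))
```

With real repetition data the points would be noisy, and the fit would get closer to the underlying curve as more repetitions are collected.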

Licensing Algorithm SM-15
(Krzysztof Brzezina, Sep 29, 2011, 08:13:06)
Question:
Do I need to buy a license to implement Algorithm SM-15?
Answer:
We no longer support projects based on the newest SuperMemo algorithm, except on the basis of a separate contract with a major nationwide or global implementation in mind. The main reasons for this are (1) the complexity of transferring know-how and trade secrets, and (2) the expensive support and testing involved in a proper implementation. This does not mean that you cannot use a SuperMemo algorithm in your projects. The simplest solution is to implement Algorithm SM-2, which is described at supermemo.com. Our only requirement in such cases is a prominent credit given to the authors of SuperMemo. You have to include the following copyright note and site reference regarding Algorithm SM-2:

Algorithm SM-2, Copyright SuperMemo World, 1991. www.supermemo.com, www.supermemo.eu

If your project is successful and gets a substantial following, you can consider getting in touch again to integrate newer SuperMemo technologies and receive promotional support.

Recursive Interval Function
(Joseph Freeman, Oct 06, 2011, 21:01:08)
Question:

1. on http://www.supermemo.com/help/smalg.htm you write that the Optimal Interval is found by the following function

I(n) = I(n-1) * OF[n,AF]

This seems to be a recursive function.

I(3) expands to:

I(3) = I(3-1) * OF[3,AF] = ( I(2-1) * OF[2,AF] ) * OF[3,AF] = ( OF[1,L+1] * OF[2,AF] ) * OF[3,AF]

Correct?

Answer:

Your derivation is correct. However, not only is it less readable than the original formula, it would actually produce a different algorithmic outcome. The reason is that you compute an interval only when you need it, not in advance. In the meantime, between repetitions, the OF matrix keeps changing. This is why OF[n,AF] is different at different points of time. Computing the third interval at the beginning of the process, using your formula, would produce a suboptimum result (discarding the actual measurements of forgetting). Computing the third interval when it is needed, using your formula, would also be invalid, as the current interval, as opposed to the product of OF entries, is actually the best expression of memory strength, even if it was computed suboptimally.
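
A small numeric sketch may make the difference concrete. All values below are hypothetical: the first interval stands in for OF[1,L+1], and the OF entries are assumed to have been corrected between repetitions.

```python
def interval_when_needed(previous_interval, of_entry_now):
    """I(n) = I(n-1) * OF[n,AF], using the entry as it stands at repetition time."""
    return previous_interval * of_entry_now


first_interval = 4.0  # stands in for OF[1,L+1]

# Case 1: expand the formula in advance, with the entries known at the start.
of2_at_start, of3_at_start = 2.0, 2.0
expanded_in_advance = first_interval * of2_at_start * of3_at_start

# Case 2: compute each interval only when it is needed.
i2 = interval_when_needed(first_interval, 2.0)  # OF[2,AF] at the 2nd repetition
i3 = interval_when_needed(i2, 1.5)              # OF[3,AF] corrected by the 3rd

# Case 3: expand the formula at the 3rd repetition with the current entries.
# OF[2,AF] has changed since I(2) was computed, so the product no longer
# matches the interval that actually elapsed.
of2_now, of3_now = 1.75, 1.5
expanded_now = first_interval * of2_now * of3_now

print(expanded_in_advance, i3, expanded_now)  # 16.0 12.0 10.5
```

Only the second case uses the interval that was actually applied (8 days in this example) as the starting point for the next one.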

Only one forgetting curve is updated at each repetition
(Joseph Freeman, Oct 06, 2011, 21:01:08)
Question:

You maintain an internal collection of "forgetting curves" for each A-Factor and each Repetition, thus you have 400 curves maintained, and constantly being recalculated for every repetition?

Answer:

There are 400 entries in the RF matrix. However, not all entries can have their R-Factors computed by filling the forgetting curve array with data. For example, easy items rarely go beyond 10 repetitions (e.g. items with AF=6.9 and just 5 repetitions are very rare). At each repetition, only a single data point is added. This means that an exponential approximation of only a single forgetting curve can and needs to be updated.
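
The bookkeeping can be sketched as follows, assuming hypothetical names and a deliberately simple storage scheme: each curve keeps recall counts per elapsed-time bucket, and the exponential fit R(t) = exp(-d * t) is recomputed only for the curve that received the new data point. This illustrates the idea and is not SuperMemo's implementation.

```python
import math
from collections import defaultdict

# One forgetting curve per RF entry, keyed by (row, column).
# Each curve maps an elapsed-time bucket to [recalled, total] counts.
curves = defaultdict(lambda: defaultdict(lambda: [0, 0]))
decay = {}              # fitted decay constant per curve
stats = {"refits": 0}   # how many single-curve refits were performed


def fit_decay(curve):
    """Fit R(t) = exp(-d * t) through the bucketed recall rates."""
    num = den = 0.0
    for t, (recalled, total) in curve.items():
        if recalled == 0:
            continue  # ln(0) is undefined; skip buckets with no recall
        num += t * math.log(recalled / total)
        den += t * t
    return -num / den if den else None


def record_repetition(row, col, elapsed, recalled):
    """Add one data point and refit only the curve it belongs to."""
    counts = curves[(row, col)][elapsed]
    counts[0] += 1 if recalled else 0
    counts[1] += 1
    decay[(row, col)] = fit_decay(curves[(row, col)])
    stats["refits"] += 1


record_repetition(row=2, col=5, elapsed=10, recalled=True)
record_repetition(row=2, col=5, elapsed=10, recalled=False)
print(len(curves), stats["refits"])  # 1 2
```

Entries that never receive data (for example, very easy items at high repetition numbers) simply never appear in `curves`, so no fit is attempted for them.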

Frequently asked questions about the SuperMemo algorithm

SuperMemo Algorithm

See also: