Rick Carlino

Co-founder at FarmBot and Fox.Build Makerspace,

computer history buff, Open Source tinkerer.



Problems and Solutions for Spaced Repetition Software

Spaced repetition is a technique for the efficient long-term memorization of facts. It precisely schedules review times (usually, but not always, via software) to minimize studying time while maximizing retention time. It is most often used for foreign language vocabulary acquisition.
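To make the idea of scheduled reviews concrete, here is a minimal sketch of an interval scheduler. The rule is an assumption chosen for illustration and is not the algorithm of any package named in this article: the gap between reviews doubles after a correct answer and resets to one day after a miss. The names `Card` and `review` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Card:
    prompt: str
    answer: str
    interval_days: int = 1          # gap until the next review
    due: Optional[date] = None      # unset until the first review


def review(card: Card, correct: bool, today: date) -> Card:
    """Reschedule a card after one review (toy rule, not a real SRS algorithm)."""
    if correct:
        # Each success doubles the gap, so well-known cards appear less often.
        card.interval_days *= 2
    else:
        # A miss sends the card back to daily review.
        card.interval_days = 1
    card.due = today + timedelta(days=card.interval_days)
    return card
```

Real schedulers also track how difficult each card is for the learner, but the core idea is the same: the better you know a fact, the longer the wait before you see it again.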

This article describes my experience with commonly used spaced repetition systems (SRS) as it relates to foreign language acquisition (as opposed to memorization of general knowledge). It analyzes a number of observed problems with current spaced repetition software and proposes possible solutions.

This article is a planning document for an Anki addon I am writing. I hope that this article helps guide other authors of SRS software as well.

Question: Is Spaced Repetition Effective?

I studied Korean in university. This often meant memorizing large word lists for both short- and long-term retention.

After trying various SRSes, I eventually found Memrise. I would use Memrise for 20-40 minutes a day and quickly found myself able to memorize large wordsets. After a few semesters of daily use, I had memorized several thousand words and it was rare for me to score lower than 95% on vocabulary exams. Having a large set of words in memory meant I was able to understand texts and extract the key ideas from written articles.

Memrise was an effective tool for memorizing large vocabulary sets. I have no doubt that spaced repetition is based on real science and is a proven means of long-term memorization. A number of formal studies have also been published supporting the use of spaced repetition. This article does not aim to challenge the efficacy of spaced repetition. Instead, it presents a number of problems and possible solutions that could benefit users and authors of SRS software.

Question: Why Not Just Practice?

The best way to learn a language is to use a language. If the goal is to attain language proficiency, then spaced repetition can only be used as a supplement to deliberate practice of actual language use.

With that being said, spaced repetition systems still provide value as a supplemental study aid because:

  • You can practice without the presence of a native speaker or instructor.
  • For students in formal classes, the ability to “solve” memorization problems allows them to focus on other course material, such as grammar and intentional practice.
  • You can use spaced repetition in times that would otherwise be wasted, such as waiting rooms and public transit.
  • You can focus on specialized vocabulary that is harder to memorize due to infrequent occurrence.

Problem: Your Ears Can’t Read Flashcards

During my time with Memrise, I did notice one peculiarity: my listening skills did not keep pace with my reading skills. Despite having a fairly large vocabulary, I still had problems understanding all but the slowest of native speakers. I could follow written articles, but I found it difficult to understand TV shows and conversation.

My problems stemmed from the fact that listening and reading are two separate skills. I could visually identify printed words instantaneously. When those same words were spoken, however, things were not so easy. Because I spent all my time memorizing printed words, when I heard a new Korean word, my brain needed to sound it out mentally, then transcribe it to letters and finally visualize the word to identify it. Improving my reading skills did not noticeably improve my listening skills.

A similar situation happened when studying for the listening section of a Morse code exam. I was warned by many to not memorize written dots and dashes and that I should instead only use audio tapes or audio training software. The guidance held true and I ultimately passed the exam. If you were to ask me how to write the letter “Q” in Morse code on a sheet of paper, it would take me a moment. Conversely, my ability to identify the same letter by ear was instantaneous. As with foreign language vocabulary, the brain must “convert audio to text” before it can come up with a written answer. The latency incurred by this process will impede a learner’s ability to speak and listen effectively.

Language learners seeking conversational proficiency must place more emphasis on listening and speaking than on reading and writing. This is also a key weakness of paper-based systems such as Leitner boxes.

Problem: Vocabulary is Multifaceted

When you memorize foreign language vocabulary, you’re memorizing more than just a two sided note card.

Despite the appearance of a single fact, there are actually 6 “facets” to new vocabulary words.

A single vocabulary word could be expanded to the following memorization strategies:

#  Quiz                 Against              Example
1  A word’s sound       A word’s spelling    Transcription test
2  A word’s sound       A word’s definition  Listening comprehension test
3  A word’s definition  A word’s spelling    Writing test*
4  A word’s definition  A word’s sound       Speaking test
5  A word’s spelling    A word’s definition  Reading test*
6  A word’s spelling    A word’s sound       Dictation test*

* In languages with extremely difficult writing systems (Chinese, Japanese) these tests are even more important. In languages with uniform writing systems (Spanish, Korean, Esperanto), these tests are of less importance.
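The six rows of the table are simply every ordered pair of a word's three facets (sound, spelling, definition). A minimal sketch of generating them from a single entry follows. The names `Word`, `FacetCard`, and `expand`, as well as the audio file name, are hypothetical and exist only for this illustration.

```python
from itertools import permutations
from typing import NamedTuple


class Word(NamedTuple):
    sound: str        # path to an audio clip (hypothetical file name)
    spelling: str
    definition: str


class FacetCard(NamedTuple):
    quiz: str         # facet shown (or played) to the learner
    against: str      # facet the learner must produce
    prompt: str
    expected: str


def expand(word: Word) -> list[FacetCard]:
    """Return one card for every ordered pair of distinct facets (3 x 2 = 6)."""
    facets = word._asdict()
    return [
        FacetCard(quiz, against, facets[quiz], facets[against])
        for quiz, against in permutations(facets, 2)
    ]


cards = expand(Word(sound="sagwa.mp3", spelling="사과", definition="apple"))
for card in cards:
    print(f"{card.quiz} -> {card.against}: {card.prompt} -> {card.expected}")
```

Entering a word once and deriving all six quizzes from it keeps data entry at one record per word, rather than six hand-made cards.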

Some of these facets are not easily implemented in software. In the case of speaking and dictation tests, I am only aware of one commercially available software package (Duolingo). I have also seen some Anki users write custom JavaScript tools that tap into Chrome’s speech recognition API. Outside of these use cases, software-assisted speech drills seem to be in their early stages.

The table above also illustrates why I do not view paper-based systems (such as the Leitner system) as a good tool for foreign language learners: paper cannot quiz sound. That is not to say the Leitner system is ineffective. It is still a good tool for college students reviewing class notes in their native language.

Problem: Curated Content is of Limited Use

By now, it should be apparent that 2-sided review systems (simplistic flash card apps) are not a complete solution for language learners. When removing flash card style SRS software from the list, only a few commercially available choices remain.

Although multifaceted reviews offer a more complete memorization solution, they suffer a major tradeoff: data entry becomes cumbersome. Since a student will be expected to enter thousands or even tens of thousands of facts into an SRS, it is extremely important that data entry is streamlined and customizable to the learner’s needs.

Some strategies used to increase the volume of data entry are:

  • Browser extensions that save unknown vocabulary words as the user finds them in webpages (example).
  • Using text-to-speech systems to eliminate the need to upload custom audio files in listening tests (Quizlet).
  • Providing specialized document readers that are well adapted to vocabulary entry (LingQ, BliuBliu, FLTR).
  • Allowing users to enter vocabulary via large spreadsheets rather than cumbersome forms (Anki).
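As a rough illustration of the spreadsheet approach, the sketch below parses a CSV export into note records using Python's standard `csv` module. The column names (`spelling`, `definition`, `audio`) and the function `load_notes` are assumptions made for this example, not a format required by any of the packages above.

```python
import csv
import io

# A spreadsheet exported as CSV. The column names are an assumption made
# for this sketch, not a format required by any particular SRS.
SHEET = """spelling,definition,audio
사과,apple,sagwa.mp3
학교,school,hakgyo.mp3
,orphan row without a spelling,
"""


def load_notes(csv_text: str) -> list[dict[str, str]]:
    """Parse rows into notes, skipping rows that lack a spelling or definition."""
    reader = csv.DictReader(io.StringIO(csv_text))
    notes = []
    for row in reader:
        if row["spelling"].strip() and row["definition"].strip():
            notes.append({key: value.strip() for key, value in row.items()})
    return notes


notes = load_notes(SHEET)
print(f"Imported {len(notes)} notes")
```

Bulk import like this lets a learner prepare hundreds of entries in a tool they already know, then load them in one step.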

For creators of paid SRS products, this challenge is particularly thorny. If your users do not want to spend time entering data or are unsure of which data to enter, there is a risk of losing customers. Authors of SRS software have an incentive to cater to entry level students, and an easy path to mass adoption is to create curated content that simplifies the on-boarding process for new learners. To address this concern, some publishers have adopted the anti-pattern of only allowing curated content.

On the surface, having an SRS that offers curated content seems like a very good idea. Curated content is ready to be used, and is often complete with native speaker audio and other multimedia assets. It is typically reviewed by a qualified instructor and offers extremely high quality content. For novice language learners with a vocabulary in the single or double digits, this is a perfectly adequate solution. When a learner enters the advanced intermediate stages, however, curated content becomes less useful.

Once a learner has mastered the first thousand words or so, the learning path becomes increasingly specific. A mechanical engineer will have different learning needs than a literature undergrad. Since software publishers are incapable of anticipating the needs of every learner, curated content falls flat.

For an SRS package to be truly useful, it is not enough to simply allow custom content. It also needs to provide first-class support for custom content in a way that is fast and scales to a vocabulary count in the thousands.

Other Anti-Patterns

The issues noted above are not the only problems I’ve seen, but they are the most pressing. Below are other SRS problems that I could not dedicate an entire section to:

Only using audio as a supplement. I’ve seen some flashcard apps play a word’s audio alongside its spelling. This is wrong because it assumes that audio does not need to be quizzed independently.

Using the mouse during reviews. Since a language learner will need to memorize a five-figure vocabulary set, it is essential that each review takes as little time as possible. The user interface of an SRS must respond quickly to the user. Although a mouse makes a UI more discoverable, it is much slower than keyboard shortcuts. Losing three seconds per review to mouse movement will add up over time and reduce the effectiveness of an SRS.
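A quick back-of-the-envelope calculation shows how the three-second figure adds up. The daily review count below is an assumption picked for illustration. It is not a measurement, and real workloads vary widely.

```python
SECONDS_LOST_PER_REVIEW = 3   # figure from the paragraph above
REVIEWS_PER_DAY = 200         # assumption for illustration only
DAYS_PER_YEAR = 365


def hours_lost_per_year(seconds_per_review: float, reviews_per_day: int) -> float:
    """Convert a per-review delay into hours lost over a year of daily study."""
    return seconds_per_review * reviews_per_day * DAYS_PER_YEAR / 3600


lost = hours_lost_per_year(SECONDS_LOST_PER_REVIEW, REVIEWS_PER_DAY)
print(f"{lost:.1f} hours per year")
```

Under these assumptions the overhead is roughly 61 hours a year, time that could have gone to actual reviews.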

Asking the user if they know a word instead of quizzing them. This is more of a personal preference. Some users will disagree with this point. When recalling words in conversation, you either know a word or you don’t. Review software should tell the user if they were correct rather than asking them.
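One way to have the software decide correctness is to compare a typed answer against the expected one. The sketch below is a minimal, assumed approach (exact match after normalization), and the function names `normalize` and `grade` are hypothetical. It does not handle synonyms or near-miss typos.

```python
import unicodedata


def normalize(text: str) -> str:
    """Fold case, trim whitespace, and unify Unicode forms before comparing."""
    # NFC matters for Korean: the same syllable can be stored as one
    # precomposed character or as a sequence of separate jamo.
    return unicodedata.normalize("NFC", text).strip().casefold()


def grade(typed: str, expected: str) -> bool:
    """Return True only when the typed answer matches the expected one."""
    return normalize(typed) == normalize(expected)
```

For example, `grade("  Apple ", "apple")` passes while `grade("aple", "apple")` fails, so the learner never has to judge their own answer.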

SRS Features That Got It Right

To avoid sounding like a complete pessimist, I’ve compiled a list of SRS features that have benefited me greatly:

Peer reviewed content. Allowing users to share and review content created by others is a great alternative to curated content. It provides greater flexibility than curated content, and higher quality than user generated content. Memrise once did a great job of this, although I’ve heard that community-based content has recently been split off into a separate app that I have yet to try.

Microphone-assisted speaking reviews. With the advent of the Web Speech API, it is possible for SRS packages to review verbal speaking skills. The only package I am aware of that does this today is Duolingo.

Whole sentence review instead of single word review. This strategy is hard to get right, but offers additional benefits over memorizing single vocabulary words. It usually involves importing custom news articles. The best implementation I’ve seen is LingQ.

Hands-free user interfaces and reviews. The whole point of spaced repetition is to optimize time. You want to memorize as much as possible, for as long as possible, in as little time as possible. A spaced repetition system that allows reviews while driving or walking helps meet those goals. The best example I’ve seen is Gradint. Although less popular than other options, Gradint implements some very innovative ideas and is commendable for its listening-first approach.

Third-party developer integrations. No one package will satisfy the needs of every learner. Providing a plugin system for software developers to customize application behavior is essential. In this regard, Anki is the clear winner, as you can customize every aspect of the platform via Python or JavaScript.

Where Do I Go From Here?

Based on the advantages and disadvantages of available SRS software, I have concluded that Anki is the best solution for advanced SRS users. Despite an extremely steep learning curve, Anki offers the best feature set to meet every need listed above. The things that I don’t like about Anki can easily be fixed with addons, thanks to a vibrant open source ecosystem.

I appreciate community feedback on the points I’ve outlined in this article. Please let me know what you think by sending me a message on Reddit or Lobste.rs. Thanks for making it this far into my article!

Appendix of SRS Packages

NOTE: If you would like a package added to this list, please shoot me a message on Reddit. This list intentionally avoids flashcard-only apps, as there are plenty of options out there and they are all relatively easy to find.

  • Anki - The gold standard of SRSes.
  • Polar Bookshelf - Not specifically focused on language learners, but worth mentioning. Polar is essentially a spaced repetition ebook reader.
  • Duolingo - The only SRS I am aware of that offers speaking practice.
  • Memrise - A personal favorite.
  • SuperMemo
  • LingQ - Great for learning grammar.
  • FLTR - An open source reading app similar to LingQ. This is abandonware, unfortunately.
  • Mnemosyne
  • Gradint - A listening-first SRS that works particularly well for visually impaired learners.
  • iKnow.jp - An SRS for learners of English, Chinese and Japanese and one of the earliest SRS packages I am aware of. Although they were very innovative ten years ago, it seems that their competition has caught up with them in terms of features.
  • Quizlet - Offers spaced repetition with their premium version.
  • Rosetta Stone

Developer Resources

  • Ebisu - A scheduling algorithm. Thanks to /u/BonoboBanana of Reddit for suggesting this.