About this idea
To motivate the topic of spaced-repetition learning software, here are two
quotes from the Wikipedia pages on the German psychologist Hermann Ebbinghaus
and the concept of spaced repetition, respectively:
In 1885, Hermann Ebbinghaus published his groundbreaking Über das Gedächtnis
("On Memory", later translated to English as "Memory. A Contribution to
Experimental Psychology") in which he described experiments he conducted on
himself to describe the processes of learning and forgetting.
Ebbinghaus made several findings that are still relevant and supported to this
day. First, Ebbinghaus made a set of 2,300 three-letter syllables to measure
mental associations that helped him find that memory is orderly. Second, and
arguably his most famous finding, was the forgetting curve. The forgetting
curve describes the exponential loss of information that one has learned.
The sharpest decline occurs in the first twenty minutes and the decay is
significant through the first hour. The curve levels off after about one day.
Spaced repetition is a method that uses knowledge of this forgetting curve to
increase the efficiency of the learning process, where the subject is asked to
remember a certain fact with the time intervals increasing each time the fact
is presented or said. If the subject is able to recall the information
correctly the time is doubled to further help them keep the information fresh
in their mind to recall in the future. With this method, the patient is able to
place the information in their long-term memory. If they are unable to remember
the information they go back to the previous step and continue to practice to
help make the technique lasting (Vance & Farr, 2007).
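The doubling rule described in the quote above can be sketched in a few lines of Python. This is only an illustration of that rule under simple assumptions: the `Card` class, the `review` function, and the one-day starting interval are hypothetical names and values chosen here, and "going back to the previous step" is modelled as halving the interval. It is not Anki's actual scheduler, which uses a more elaborate algorithm.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """A hypothetical flashcard with a review interval measured in days."""
    prompt: str
    interval_days: int = 1


def review(card: Card, recalled: bool, min_interval: int = 1) -> Card:
    """Return a new card whose interval reflects the review outcome.

    Assumption: success doubles the interval; failure steps back to the
    previous interval (half the current one), never below min_interval.
    """
    if recalled:
        new_interval = card.interval_days * 2
    else:
        new_interval = max(min_interval, card.interval_days // 2)
    return Card(card.prompt, new_interval)


# Three successful reviews followed by one lapse: 1 -> 2 -> 4 -> 8 -> 4
card = Card("Who published 'Über das Gedächtnis' in 1885?")
for outcome in (True, True, True, False):
    card = review(card, outcome)
```

After the loop, `card.interval_days` is 4: the three successes grow the interval from one day to eight, and the single lapse steps it back once.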
Anki is an open-source spaced repetition flashcard program for Windows, macOS,
Linux, Android, and iOS. Widely used by the medical student community [1] (some
estimates suggest roughly half of all US medical students use Anki for
studying) and gaining popularity in the hard sciences [2], this software exists
to help users retain large swaths of material over very long time periods.
However, the internal data format suffers from a number of flaws, chief among
them its incompatibility with most existing collaboration software (e.g. version
control systems like git, svn, hg, and fossil, or document editors like Overleaf
and Google Docs). The Anki community has already experienced growing pains
related to this issue: the largest collaborative deck effort, the massive
AnKing deck created for 1st-3rd year medical students, uses a complicated
installation/update process in order to facilitate user-provided corrections to
its content [3]. A system for converting Anki collections to and from a format
suited to traditional version control would go a long way toward resolving this
issue.
More concerningly, the user community of Anki is largely at the mercy of
organizations with no oversight [4], who control and gatekeep access to
user-created content on privately-owned servers. Without exception, all
user-created content is hosted on a single server owned by a single individual,
and if this server goes down, loses funding, or experiences data loss, the
community could permanently lose access to decades of its own data. On top of
this, organizations like the AnKing group [5] have begun work on monetizing
this user-created content, breaking the tradition of keeping Anki-related
software free and open source, and charging subscription fees for access to
collaborative deck edits via their commercial, proprietary tool.
The `ki` command-line tool aims to solve all of these issues via a simple
invertible transformation of the underlying SQLite3 database (a
machine-readable-only format) into a directory tree of markdown files (a very
human-readable format). By enabling conversions to and from a format that can
easily be put under version control, `ki` will allow collaboration on
spaced-repetition decks in exactly the same way millions of software engineers
collaborate on open-source projects: by making peer-reviewed edits to a single
source of ground-truth hosted on a public, resilient archive (e.g. GitHub.com)
under an open-source license (enforcing free access to all derivative works).
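To make the idea of an invertible database-to-files transformation concrete, here is a minimal Python sketch. It is not `ki`'s implementation and does not use Anki's real schema: the four-column `notes` table, the `## Front` / `## Back` file layout, and the function names `export_notes` and `import_notes` are all assumptions made for illustration. It requires Python 3.9 or later.

```python
import sqlite3
import tempfile
from pathlib import Path

SEPARATOR = "\n\n## Back\n"  # hypothetical delimiter between the two fields


def export_notes(conn: sqlite3.Connection, root: Path) -> None:
    """Write each row of the toy `notes` table to root/<deck>/<id>.md."""
    query = "SELECT id, deck, front, back FROM notes ORDER BY id"
    for note_id, deck, front, back in conn.execute(query):
        deck_dir = root / deck
        deck_dir.mkdir(parents=True, exist_ok=True)
        text = f"## Front\n{front}{SEPARATOR}{back}\n"
        (deck_dir / f"{note_id}.md").write_text(text, encoding="utf-8")


def import_notes(root: Path) -> list[tuple[int, str, str, str]]:
    """Rebuild the rows from the markdown tree written by export_notes."""
    rows = []
    for path in root.glob("*/*.md"):
        body = path.read_text(encoding="utf-8")
        body = body.removeprefix("## Front\n").removesuffix("\n")
        # Limitation of this sketch: a front field containing SEPARATOR
        # would be split in the wrong place.
        front, back = body.split(SEPARATOR, 1)
        rows.append((int(path.stem), path.parent.name, front, back))
    return sorted(rows)


# Build a toy in-memory database standing in for an Anki collection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, deck TEXT, front TEXT, back TEXT)"
)
original = [
    (1, "anatomy", "Longest bone in the human body?", "Femur"),
    (2, "math", "Derivative of sin(x)?", "cos(x)"),
]
conn.executemany("INSERT INTO notes VALUES (?, ?, ?, ?)", original)

with tempfile.TemporaryDirectory() as tmp:
    export_notes(conn, Path(tmp))
    restored = import_notes(Path(tmp))

round_trip_ok = restored == original
```

Because every note lives in its own small text file, an edit to one card would show up as a one-file diff under version control, which is the property that makes peer-reviewed corrections practical.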
Hundreds of hours of work have already been put into developing this tool, all
unpaid. It is already a minimum viable product, and usable by developers
willing to put up with a rough-around-the-edges UI and the potential for
occasional bugs and crashes. The source code is freely available on GitHub [6].
A particular emphasis has been placed on writing clean, user-friendly
documentation [7] which demonstrates usage patterns and details the tool's
capabilities.
[1] https://www.reddit.com/r/medicalschoolanki/
[2] See Michael Nielsen's essay here:
https://cognitivemedium.com/srs-mathematics
[3] https://www.ankipalace.com/step-1-deck
[4] https://ankiweb.net/account/privacy
[5] https://www.ankihub.net/coming-soon
[6] https://github.com/langfield/ki/blob/main/ki/__init__.py
[7] https://langfield.github.io/ki/
Impact
The userbase of Anki consists largely of students, and a significant proportion
of them are medical students studying for exams like the USMLE Step 1. Almost
all of these users are already hundreds of thousands of dollars in
debt. They do not need to be charged large monthly subscription fees to work
with each other on content that they themselves created. The adoption of the
`ki` tool can prevent this. Other user communities include
language learners; in fact, the Anki program was originally created so
that its author could learn Japanese.
More generally, the availability of Anki decks in a readable format on a
platform as public and widely used as GitHub could drastically increase both
the availability of existing learning resources, and the rate of creation of
new data. It could also help push spaced-repetition into the mainstream, which
could eventually lead to reform in formal educational institutions towards more
effective learning and away from inefficient teacher and student practices
(e.g. lecturing, note-taking, re-reading, highlighting) [1].
[1] Brown, Peter C. Make It Stick: The Science of Successful Learning.
Cambridge, Massachusetts: The Belknap Press of Harvard University Press, 2014.
What I'll do with $5,000
The $5,000 will be used for USER TESTING above all else. A technically perfect,
feature-rich software project is pointless if people do not know how to use it.
As much money as is useful will be put towards incentivizing users to give
detailed feedback on usage patterns, frustrations, wants/needs, and bugs in the
tool's operation. This can be done by recruiting from existing online Anki
communities hosted on Discord, Reddit, and the Anki forums. Users will be paid
a small fee ($10-$50) for giving feedback, and larger bug bounties may be used
to find critical security flaws. Amazon Mechanical Turk may also be used if
larger sample sizes are necessary. The remainder of the funds will be put
towards developing accessibility features and integration with GitHub's Pages
service. As of May 2022, it supports LaTeX rendering of mathematical
equations, which are used very frequently in Anki decks. These services, if
built out, could serve as a front end and previewer for hosted decks, making it
easier for users to see what they're getting before they download content.
Quick Bio
Links