Announcing the launch of vocab!
I am pleased to announce that my command line utitlity
vocab
is available on github and ready
for use by the community!
vocab
is a tool to help you expand your vocabulary, designed with simplicity in
mind.
Motivation
Being an avid reader of non fiction I often come across words I don’t know. Sometimes I can infer the meaning of an unknown word from its context and sometimes I can’t. When I can’t, I’m forced to pause my reading and look it up. Sometimes the word will come up a second time and I’ve already forgot its meaning! I want to better remember those words so I can expand my vocabulary and not be forced to interrupt my reading flow.
Being a developer I spend a lot of my computer time in the terminal. I created
vocab
as a flashcard-like vocabulary tool for the terminal designed with
simplicity in mind. The goal of vocab
is to help the user remember words they
want to remember. Implementation details below!
Implementation Details
This section will describe the inner workings of vocab such as data
persistence, how it selects words for a practice session, and functional
programming techniques that were used. If you haven’t read vocab
’s
README
please do that before reading this section.
Data Persistence
Word and practice session data are stored in CSV files on your machine. The path to these files is OS dependent and I used the directories-jvm project to abstract that away from my program.
Ad-hoc polymorphism for converting a data object to its CSV representation
Ad-hoc polymorphism is a technique for adding functionality to types on the fly
as an alternative to subclassing. In scala the type class pattern is one
technique to achieve ad-hoc polymorphism. Let’s see how type classes are used in
vocab
to achieve ad-hoc polymorphism.
There are two main case classes in vocab representing the data model:
case class Word(
word: String,
definition: String,
partOfSpeech: Option[SpeechPart],
numTimesPracticed: Int
)
case class PracticeSession(
sessionType: PracticeSessionType,
numWords: Int,
duration: Int,
timestamp: Int,
didFinish: Boolean
)
Before writing Word
and PracticeSession
objects to the storage files they
need to be converted to comma separated values. Using scala’s implicits we can
wrap Word
and PracticeSession
objects in objects with types that provide
this behavior:
sealed trait ToCSVRepr {
// A comma separated value representation of the implementing class
def toCSVRepr(): String
}
// A wrapper class for converting a word to its CSV representation
implicit class WordToCSVRepr(word: Word) extends ToCSVRepr {
def toCSVRepr: String = ???
}
// A wrapper class for converting a practice session to its CSV representation
implicit class PracticeSessionToCSVRepr(practiceSession: PracticeSession) extends ToCSVRepr {
def toCSVRepr: String = ???
}
We’ve defined an two implicit classes, one to wrap Word
s and one to wrap
PracticeSession
’s and now we can call toCSVRepr
directly on Word
and
PracticeSession
objects, even though toCSVRepr
isn’t defined in those
classes!
Using Phantom Types To Make The Application Class Safer
Phantom types are useful for enforcing an ordering in a workflow. We can tell the compiler to only allow certain actions to occur in a specific order.
There are two distinct steps in running the vocab
application:
- Interpreting the command line arguments
- Running the provided command
The steps are encapsulated in functions called on an instance of the
Application
class.
// Parses the command line arguments
def parseArgs(...) = ???
// Runs the command generated from command line parsing
def runCommand(...) = ???
It obviously doesn’t make sense to call runCommand
before calling parseArgs
.
We can tell the compiler to make an invalid ordering like this impossible
using phantom types.
Let’s define a trait for each state of the Application: parsing the arguments, running the command, and being done:
object Application {
sealed trait State
object State {
sealed trait ParseArgs extends State
sealed trait RunCommand extends State
sealed trait PostCommand extends State
}
}
Now let’s make the Application
class generic and impose the restriction that
the type parameter must be a subtype of Application.State
:
case class Application[S <: Application.State](...) { ... }
Using the concept of implicit evidence we tell the compiler that
methods defined on Application
can only be called for certain subtypes of S
.
// Application must be in ParseArgs state to call this method
def parseArgs(args: Seq[String])(implicit ev: S =:= ParseArgs): Application[RunCommand]
// Application must be in RunCommand state to call this method
def runCommand(implicit ev: S =:= RunCommand): Application[PostCommand]
This tells the compiler that parseArgs
can only be called on Applications in
the ParseArgs
state and runCommand
can only be called on Applications in the
RunCommand
state.
Trying to call runCommand
before parseArgs
would yield the following
error:
> val app = Application[Application.State.ParseArgs]()
> app.runCommand
Cannot prove that application.Application.State.ParseArgs =:= application.Application.State.RunCommand.
While this example is almost trivial, enforcing an ordering on method calls could add a great deal of safety for a complex application with many states.
Reflection
The project’s first realease has 6,332 lines of code (LOC) written and 4,035 LOC removed. Roughly 2 lines of code were removed for every 3 written! I attribute this to a lot of unnecessary complexity and over-engineering in the beginning that I eventually removed and refactored. I think I started planning and generalizing the application for use cases that were never going to exist. Once I fleshed out a reasonable feature set and stuck to implementing only that feature set I was no longer over-engineering or over-generalizing.
Functional handling of side effects
Cats and
Scalaz offer datatypes for handling side
effects and I’m aware there is a great benefit to using IO monads for your
program’s input/ouput. This is something I hope to understand better in the
future but I prioritized launching the project over learning how to work with
side effect mondas because I was eager to start using vocab
myself!
Manually reading/writing CSV files
CSV reading and writing is a solved problem and there are many libraries available. I wrote my own as an exercise in understanding the type class pattern, which I’m happy to say I’m now comfortable with.
Manually Parsing Command Line Arguments
vocab
was a simple enough program for this to be tenable but I really should
invest time into learning how to use a solid argument parsing framework. I
recently discoverd scopt and it looks
promising. I will definitely use a third party argument parsing framework next
time I build a command line utility.