Extensible Type-Safe Error Handling in Haskell
A colleague of mine introduced me to the benefits of
error-chain
, a crate which aims to
implement “consistent error handling” for Rust. I found the overall design
pretty convincing, and in his use case, the crate really makes error handling
clearer and flexible. I knew Pijul was also using
error-chain
at that time, but I never had the opportunity to dig more into it.
At the same time, I have read quite a lot about extensible effects in
Functional Programming, for an academic article I have submitted to Formal
Methods 2018 The odds were in my favor: the aforementioned academic article has
been accepted.
. In particular, the
freer package provides a very nice
API to define monadic functions which may use well-identified effects. For
instance, we can imagine that Console
identifies the functions
which may print to and read from the standard output. A function
askPassword
which displays a prompt and get the user password would
have this type signature:
askPassword :: Member Console r => Eff r ()
Compared to IO
, Eff
allows for meaningful type
signatures. It becomes easier to reason about function composition, and you
know that a given function which lacks a given effect in its type signature
will not be able to use them. As a predictable drawback, Eff
can
become burdensome to use.
Basically, when my colleague showed me his Rust project and how he was using
error-chain
, the question popped out. Can we use an approach similar to
Eff
to implement a Haskell-flavored error-chain
?
Spoiler alert: the answer is yes. In this post, I will dive into the resulting API, leaving for another time the details of the underlying implementationFor once, I wanted to write about the result of a project, instead of how it is implemented. . Believe me, there is plenty to say. If you want to have a look already, the current implementation can be found on GitHub .
In this article, I will use several “advanced” GHC pragmas. I will not explain each of them, but I will try to give some pointers for the reader who wants to learn more.
State of the Art
This is not an academic publication, and my goal was primarily to explore the arcane of the Haskell type system, so I might have skipped the proper study of the state of the art. That being said, I have written programs in Rust and Haskell before.
Starting Point
In Rust, Result<T, E>
is the counterpart of Either E T
in
HaskellI wonder if they deliberately choose to swap the two type arguments.
. You can use it to model to wrap either the result of a
function (T
) or an error encountered during this computation (~E~). Both
Either
and Result
are used in order to achieve the same
end, that is writing functions which might fail.
On the one hand, Either E
is a monad. It works exactly as
Maybe
(returning an error acts as a shortcut for the rest of the
function), but gives you the ability to specify why the function has failed.
To deal with effects, the mtl
package provides EitherT
, a
transformer version of Either
to be used in a monad stack.
On the other hand, the Rust language provides the ?
syntactic sugar,
to achieve the same thing. That is, both languages provide you the means to
write potentially failing functions without the need to care locally about
failure. If your function f
uses a function g
which might fail, and want to
fail yourself if f
fails, it becomes trivial.
Out of the box, neither EitherT
nor Result
is extensible.
The functions must use the exact same E
, or errors must be converted
manually.
Handling Errors in Rust
Rust and the error-chain
crate provide several means to overcome this
limitation. In particular, it has the Into
and From
traits to
ease the conversion from one error to another. Among other things, the
error-chain
crate provides a macro to easily define a wrapper around many
errors types, basically your own and the one defined by the crates you are
using.
I see several drawbacks to this approach. First, it is extensible if you take the time to modify the wrapper type each time you want to consider a new error type. Second, either you can either use one error type or every error type.
However, the error-chain
package provides a way to solve a very annoying
limitation of Result
and Either
. When you “catch” an
error, after a given function returns its result, it can be hard to determine
from where the error is coming from. Imagine you are parsing a very complicated
source file, and the error you get is SyntaxError
with no additional
context. How would you feel?
error-chain
solves this by providing an API to construct a chain of errors,
rather than a single value.
my_function().chain_err(|| "a message with some context")?;
The chain_err
function makes it easier to replace a given error in its
context, leading to be able to write more meaningful error messages for
instance.
The ResultT
Monad
The ResultT
is an attempt to bring together the extensible power of
Eff
and the chaining of errors of chain_err
. I will admit that,
for the latter, the current implementation of ResultT
is probably
less powerful, but to be honest I mostly cared about the “extensible” thing, so
it is not very surprising.
This monad is an alternative to neither Monad Stacks a la mtl nor to the
Eff
monad. In its current state, it aims to be a more powerful and
flexible version of EitherT
.
Parameters
As often in Haskell, the ResultT
monad can be parameterized in
several ways.
data ResultT msg (err :: [*]) m a
msg
is the type of messages you can stack to provide more context to error handlingerr
is a row of errorsYou might have noticederr
is of kind[*]
. To write such a thing, you will need theDataKinds
GHC pragmas. , it basically describes the set of errors you will eventually have to handlem
is the underlying monad stack of your application, knowing thatResultT
is not intended to be stacked itselfa
is the expected type of the computation result
achieve
and abort
The two main monadic operations which comes with ~ResultT~ are ~achieve~ and ~abort~. The former allows for building the context, by stacking so-called messages which describe what you want to do. The latter allows for bailing on a computation and explaining why.
achieve :: (Monad m)
=> msg
-> ResultT msg err m a
-> ResultT msg err m a
achieve
should be used for do
blocks. You can use
<?>
to attach a contextual message to a given computation.
The type signature of abort
is also interesting, because it
introduces the Contains
typeclass (e.g., it is equivalent to
Member
for Eff
).
abort :: (Contains err e, Monad m)
=> e
-> ResultT msg err m a
This reads as follows: “you can abort with an error of type e
if
and only if the row of errors err
contains the type
e
.”
For instance, imagine we have an error type FileError
to describe
filesystem-related errors. Then, we can imagine the following function:
readContent :: (Contains err FileError, MonadIO m)
=> FilePath
-> ResultT msg err m String
We could leverage this function in a given project, for instance, to read its
configuration files (for the sake of the example, it has several configuration
files). This function can use its own type to describe ill-formed description
(ConfigurationError
).
parseConfiguration :: (Contains err ConfigurationError, MonadIO m)
=> String
-> String
-> ResultT msg err m Configuration
To avoid repeating Contains
when the row of errors needs to
contains several elements, we introduce :<
If you are confused by :<
, it is probably because you were
not aware that the
TypeOperators
GHC pragma was a thing.
(read subset or
equal):
getConfig :: ( '[FileError, ConfigurationError] :< err
, MonadIO m)
=> ResultT String err m Configuration
getConfig = do
achieve "get configuration from ~/.myapp directory" $ do
f1 <- readContent "~/.myapp/init.conf"
<?> "fetch the main configuration"
f2 <- readContent "~/.myapp/net.conf"
<?> "fetch the net-related configuration"
parseConfiguration f1 f2
You might see, now, why I say ~ResultT~ is extensible. You can use two functions with totally unrelated errors, as long as the caller advertises that with ~Contains~ or ~:<~.
Recovering by Handling Errors
Monads are traps, you can only escape them by playing with their
rules. ResultT
comes with runResultT
.
runResultT :: Monad m => ResultT msg '[] m a -> m a
This might be surprising: we can only escape out from the ResultT
if we do not use any errors at all. That is, ResultT
forces us to
handle errors before calling runResultT
.
ResultT
provides several functions prefixed by recover
.
Their type signatures can be a little confusing, so we will dive into the
simpler one:
recover :: forall e m msg err a.
(Monad m)
=> ResultT msg (e ': err) m a
-> (e -> [msg] -> ResultT msg err m a)
-> ResultT msg err m a
recover
allows for removing an error type from the row of errors,
To do that, it requires to provide an error handler to determine what to do
with the error raised during the computation and the stack of messages at that
time. Using recover
, a function may use more errors than advertised
in its type signature, but we know by construction that in such a case, it
handles these errors so that it is transparent for the function user. The type
of the handler is e -> [msg] -> ResultT msg err m a
, which means
the handler can raise errors if required.
recoverWhile msg
is basically a synonym for achieve msg $ recover
. recoverMany
allows for doing the same with a
row of errors, by providing as many functions as required. Finally,
recoverManyWith
simplifies recoverMany
: you can provide
only one function tied to a given typeclass, on the condition that the handling
errors implement this typeclass.
Using recover
and its siblings often require to help a bit the
Haskell type system, especially if we use lambdas to define the error handlers.
Doing that is usually achieved with the Proxy a
dataype (where
a
is a phantom type). I would rather use the
TypeApplications
pragmaThe
TypeApplications
pragmas is probably one of my favorites.
When I use it, it feels almost like if I were writing a Coq document.
.
recoverManyWith @[FileError, NetworkError] @DescriptiveError
(do x <- readFromFile f
y <- readFromNetwork socket
printToStd x y)
printErrorAndStack
The DecriptiveError
typeclass can be seen as a dedicated
Show
, to give textual representation of errors. It is inspired by
the macros of error_chain
.
We can start from an empty row of errors, and allows ourselves to
use more errors thanks to the recover*
functions.
cat
in Haskell using ResultT
ResultT
only cares about error handling. The rest of the work is up
to the underlying monad m
. That being said, nothing forbids us to
provide fine-grained API, e.g., for Filesystem-related functions. From an
error handling perspective, the functions provided by Prelude
(the standard
library of Haskell) are pretty poor, and the documentation is not really
precise regarding the kind of error we can encounter while using it.
In this section, I will show you how we can leverage ResultT
to
(i) define an error-centric API for basic file management functions and
(ii) use this API to implement a cat
-like program which read a file and
print its content in the standard output.
(A Lot Of) Error Types
We could have one sum type to describe in the same place all the errors we can find, and later use the pattern matching feature of Haskell to determine which one has been raised. The thing is, this is already the job done by the row of errors of ~ResultT~. Besides, this means that we could raise an error for being not able to write something into a file in a function which /opens/ a file.
Because ~ResultT~ is intended to be extensible, we should rather define several types, so we can have a fine-grained row of errors. Of course, too many types will become burdensome, so this is yet another time where we need to find the right balance.
newtype AlreadyInUse = AlreadyInUse FilePath
newtype DoesNotExist = DoesNotExist FilePath
data AccessDeny = AccessDeny FilePath IO.IOMode
data EoF = EoF
data IllegalOperation = IllegalRead | IllegalWrite
To be honest, this is a bit too much for the real life, but we are in a blog post
here, so we should embrace the potential of ResultT
.
Filesystem API
By reading the
System.IO
documentation, we can infer what our functions type signatures should look
like. I will not discuss their actual implementation in this article, as this
requires me to explain how IO
deals with errors itself (and this
article is already long enough to my taste). You can have a look at this
gist if you
are interested.
openFile :: ( '[AlreadyInUse, DoesNotExist, AccessDeny] :< err
, MonadIO m)
=> FilePath -> IOMode -> ResultT msg err m Handle
getLine :: ('[IllegalOperation, EoF] :< err, MonadIO m)
=> IO.Handle
-> ResultT msg err m Text
closeFile :: (MonadIO m)
=> IO.Handle
-> ResultT msg err m ()
Implementing cat
We can use the ResultT
monad, its monadic operations and our
functions to deal with the file system in order to implement a cat
-like
program. I tried to comment on the implementation to make it easier to follow.
cat :: FilePath -> ResultT String err IO ()
cat path =
-- We will try to open and read this file to mimic
-- `cat` behaviour.
-- We advertise that in case something goes wrong
-- the process.
achieve ("cat " ++ path) $ do
-- We will recover from a potential error,
-- but we will abstract away the error using
-- the `DescriptiveError` typeclass. This way,
-- we do not need to give one handler by error
-- type.
recoverManyWith @[Fs.AlreadyInUse, Fs.DoesNotExist, Fs.AccessDeny, Fs.IllegalOperation]
@(Fs.DescriptiveError)
(do f <- Fs.openFile path Fs.ReadMode
-- `repeatUntil` works like `recover`, except
-- it repeats the computation until the error
-- actually happpens.
-- I could not have used `getLine` without
-- `repeatUntil` or `recover`, as it is not
-- in the row of errors allowed by
-- `recoverManyWith`.
repeatUntil @(Fs.EoF)
(Fs.getLine f >>= liftIO . print)
(\_ _ -> liftIO $ putStrLn "%EOF")
closeFile f)
printErrorAndStack
where
-- Using the `DescriptiveError` typeclass, we
-- can print both the stack of Strings which form
-- the context, and the description of the generic
-- error.
printErrorAndStack e ctx = do
liftIO . putStrLn $ Fs.describe e
liftIO $ putStrLn "stack:"
liftIO $ print ctx
The type signature of cat
teaches us that this function handles any
error it might encounter. This means we can use it anywhere we want: both in
another computation inside ResultT
which might raise errors
completely unrelated to the file system, or we can use it with
runResultT
, escaping the ResultT
monad (only to fall
into the IO
monad, but this is another story).