Myth and truth in Haskell asynchronous exceptions

This article explains my best current practice on asynchronous exceptions in Haskell using the standard library - Control.Exception. Other libraries for safe exceptions are out of scope. The followings are the definitions of two kinds of exceptions.

Synchronous exceptions are ones raised by your actions. You can throw a synchronous exception to yourself via throwIO.
Asynchronous exception are ones thrown by other threads. You can re-throw them to yourself via throwIO when you catch them.

Before talking about asynchronous exceptions, let's start with synchronous exceptions.

Synchronous exceptions

You can catch synchronous exceptions by catch, handle and try:

catch :: Exception e => IO a -> (e -> IO a) -> IO a
handle :: Exception e => (e -> IO a) -> IO a -> IO a
try :: Exception e => IO a -> IO (Either e a)

As you can see from this signature, the handler can take only one type. Here is an example:

import Control.Exception
import GHC.IO.Exception
import System.IO.Error

handled_action :: IO ()
handled_action = your_action `catch` handler
  where
    handler :: IOException -> IO ()
    handler e
        -- Bad file descriptor
        | ioeGetErrorType e == InvalidArgument = return ()
        -- Connection refused
        | ioeGetErrorType e == NoSuchThing = return ()
        -- Print it for debugging
        | otherwise = print e

You can define your own exception as follows:

import Control.Exception

data EarlyReturn = EarlyReturn deriving (Eq, Show)
instance Exception EarlyReturn

The important methods of the Exception class are as follows:

class (Typeable e, Show e) => Exception e where
    toException   :: e -> SomeException
    fromException :: SomeException -> Maybe e

instance Exception EarlyReturn derives the methods properly.

The following code implements early return like other languages:

import Control.Monad (when)

early_return :: IO ()
early_return = handle ignore $ do
    when condition1 $ throwIO EarlyReturn
    when concition2 $ throwIO EarlyReturn

    action1
    action2
    action3
    ...
  where
    ignore EarlyReturn = return ()

If you want to catch two or more types, you can use catches:

catches :: IO a -> [Handler a] -> IO a

In my opinion, catches is a little bit hard to use. Rather, I usually use PatternGuard. The following example defines the Break exception as well as EarlyReturn:

{-# LANGUAGE PatternGuards #-}

data Break = Break deriving (Eq, Show)
instance Exception Break

handled_action2 :: IO ()
handled_action2 = your_action `catch` handler
  where
    handler :: SomeException -> IO ()
    handler se
        | Just Break <- fromException se = return ()
        | Just EarlyReturn <- fromException se = return ()
        -- Print it for debugging
        | otherwise = print se

The super data type, SomeException is defined as follows:

{-# LANGUAGE ExistentialQuantification #-}

data SomeException = forall e . Exception e => SomeException e

The constructor SomeException is a kind of container which can contain several types and provides a single type, SomeException. The following example converts IOError and ErrorCall to SomeException:

ghci> import Control.Exception

ghci> :type userError "foo"
userError "foo" :: IOError
ghci> :type SomeException (userError "foo")
SomeException (userError "foo") :: SomeException

ghci> :type ErrorCall "bar"
ErrorCall "bar" :: ErrorCall
ghci> :type SomeException (ErrorCall "bar")
SomeException (ErrorCall "bar") :: SomeException

fromException returns Just if SomeException contains an expected exception type. Otherwise, it returns Nothing:

ghci> let se = SomeException $ ErrorCall "bar"
ghci> fromException se :: Maybe ErrorCall
Just bar
ghci> fromException se :: Maybe IOError 
Nothing

A downside of SomeException is that asynchronous exceptions are caught as well as synchronous exceptions. This could introduce nasty bugs. We will resolve this issue according to rule 3 later.

Asynchronous exceptions

If your code acquires a resource, it must be released after your job. The following is an example of unsafe code to obtain a resource:

unsafe_action_with_resource = do
   x <- acquire_resource
   use x
   release_resource x

If use receives an asynchronous exception, x is not released, thus leaked. To prevent this resource leak, you should use bracket:

bracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c

The example can be convert into asynchronous-exception-safe code with bracket:

safe_action_with_resource = bracket acquire_resource release_resource use

Some people (including me) misunderstand/misunderstood that asynchronous exceptions are not delivered into the resource release clause. But this is not quit right.

Consider the following situation:

You run a program in Haskell in a terminal
Your program is blocked in the resource release clause of bracket.
You type Control-C to stop the program on the terminal.

Control-C is an asynchronous exception (in particular UserInterrupt :: AsyncException). If this asynchronous exception is not delivered to the resource release clause, you cannot stop your program.

But Control.Exception is well-defined. Blocking actions such as takeMVar is interruptible by asynchronous exceptions. In other words, non blocking code cannot be interrupted by asynchronous exceptions in the clause. As you will see later, timeout is defined with an asynchronous exception. You can use even timeout in the resource release clause. This is a crucial point to implement graceful-closing of network connections as it is used in the resource release clause typically.

As you understood, the semantics of Control.Exception is clear and well-defined. That's why this article is stick to it.

As described above, if you catch asynchronous exceptions which others define via SomeException, nasty bugs happens. For instance, the async package uses an asynchronous exception, AsyncCanceled. If you run two threads which ignore AsyncCanceled via concurrently, these threads would be leaked.

So, you should obey the following three rules:

Rule 1

If you define an asynchronous exception, you must explicitly define the methods of the Exception class:

import Control.Exception

data Timeout = Timeout deriving (Eq, Show)

instance Exception Timeout where
    toException = asyncExceptionToException
    fromException = asyncExceptionFromException

If we follow this rule, asynchronous exceptions can be distinguished from synchronous exceptions with the following function:

isAsyncException :: Exception e => e -> Bool
isAsyncException e =
    case fromException (toException e) of
        Just (SomeAsyncException _) -> True
        Nothing -> False

Rule2: catch the asynchronous exception which you are using

If you don't catch, the exceptions are leaked from your threads. GHC RTS catches them and displays them to stdout.

The following is an example to define naive timeout:

import Control.Concurrent
import Control.Exception

data Timeout = Timeout deriving (Eq, Show)

instance Exception Timeout where
    toException = asyncExceptionToException
    fromException = asyncExceptionFromException

timeout :: Int -> IO a -> IO (Maybe a)
timeout t action = do
    pid <- myThreadId
    handle handler $ bracket (spawn pid) kill $ \_ -> Just <$> action
  where
    spawn pid = forkIO $ do
        threadDelay t
        throwTo pid Timeout
    kill tid = throwTo tid ThreadKilled
    handler Timeout = return Nothing

In this example, an asynchronous exception, ThreadKilled, is defined and caught by handler.

Rule 3: don't catch asynchronous exceptions of others

Exceptions of others have their own semantics. If you catch them, your code does not work well.

If you use SomeException for catch, handle or try, check if the caught exception is asynchronous. And re-throw it via throwIO if asynchronous. You can use the following pattern for this purpose:

{-# LANGUAGE PatternGuards #-}

handled_action3 :: IO ()
handled_action3 = your_action `catch` handler
  where
    handler :: SomeException -> IO ()
    handler se
        | isAsyncException se = throwIO se -- HERE HERE HERE
        | Just Break <- fromException se = return ()
        | Just EarlyReturn <- fromException se = return ()
        -- Print it for debugging
        | otherwise = print se

Event-poll programming

Recently, Kei Hibino, my colleage, spoke eloquently to me about the dangers of asynchronous exceptions. Asynchronous exceptions are thrown to us even when we are not expected them. Rather, he suggested event-poll programming.

Let's consider the recv function of the network library with timeout. This is an essential part of the graceful-closing function (read Implementing graceful-close in Haskell network library in detail):

import Data.ByteString
import Network.Socket
import Network.Socket.ByteString
import System.Timeout

timeoutRecv :: Socket -> Int -> IO (Maybe ByteString)
timeoutRecv sock usec = timeout usec $ recv sock 1024

As I explained, timeout uses an asynchronous exception. How can we implement this function without asynchronous exceptions?

The idea is as follows:

Prepare one STM action and ask the system TimerManager to wake up me via the action on timeout.
Prepare another STM action and ask the system IOManager to wake up me via the action when the socket is ready for read.
Race two managers and check if which one is a winner via the composition of these STM actions. If TimerManager wins, return Nothing from IO. Otherwise, the winner is IOManager. This means that data is available for the socket. So, call recv.

The following code implements this idea. I don't explain this code in detail but please read carefully if you are interested.

import Control.Concurrent.STM (
    atomically,
    newEmptyTMVarIO,
    orElse,
    putTMVar,
    takeTMVar,
 )
import Control.Exception (bracket)
import Data.ByteString (ByteString)
import GHC.Event (getSystemTimerManager, registerTimeout, unregisterTimeout)
import Network.Socket (Socket, waitAndCancelReadSocketSTM)
import Network.Socket.ByteString (recv)

data Wait = MoreData | TimeoutTripped

recvEOFevent :: Socket -> Int -> IO (Maybe ByteString)
recvEOFevent sock usec = do
    tmmgr <- getSystemTimerManager
    tmvar <- newEmptyTMVarIO
    bracket (setupTimeout tmmgr tmvar) (cancelTimeout tmmgr) $ \_ -> do
        bracket (setupRead sock) cancelRead $ \(rxWait', _) -> do
            let toWait = do
                    takeTMVar tmvar
                    return TimeoutTripped
                rxWait = do
                    rxWait'
                    return MoreData
            waitRes <- atomically (toWait `orElse` rxWait)
            case waitRes of
                TimeoutTripped -> return Nothing
                MoreData -> Just <$> recv sock 1024
  where
    setupTimeout tmmgr tmvar =
        registerTimeout tmmgr usec $ atomically $ putTMVar tmvar ()
    cancelTimeout = unregisterTimeout
    setupRead = waitAndCancelReadSocketSTM
    cancelRead (_, cancel) = cancel

I'm trying to get rid of asynchronous exceptions as much as possible from my network libraries by introducing the event-poll programming.

ã‚ã©ã‘ãªã„è©±

Internet technologies