Thursday, February 25, 2010

HaskellTorrent v0.0 and further developments

HaskellTorrent v0.0 released!

Last weekend I released version 0.0 of the HaskellTorrent project. However, as modern development will have it, the interesting things happen on the main integration branch, master, on github: haskell-torrent. From the point where the 0.0 release was done and till today, three main things happened in the client.

CPU optimizations

A little bit of work with the profiler has shaved the use of the CPU down by a fairly large amount. I optimized the assertions on the piece database by using Data.IntSet rather than plain old lists. Right now the cost centres are the piece database assertions (still) and deciding which blocks to download. The former of these is simple to get rid of. We don’t need to assert the database at each message, but can do so with a rarer frequency. As for the latter, I have a couple of ideas to shave off CPU cycles on that one as well.

Listen sockets

HaskellTorrent now accepts incoming connections! It does this on port 1579 that has no special connotation apart from being one of the fairly low prime numbers. The etorrent project uses 1729 which has a more interesting history associated with it. Of course, one has to open the eventual NAT/PAT or Firewall to get connections flowing in, should you want to test it.

test-framework

Finally, we now use the excellent test-framework by Max Bolingbroke. The test inclusion was inspired a lot by Eric Koweys blog post on the subject, and it also used bits and pieces from Real World Haskell.

The bottom line is that now you can execute tests directly:

jlouis@illithid:~/Projects/haskell-torrent$ make test
runghc Setup.lhs build
Preprocessing executables for HaskellTorrent-0.0...
Building HaskellTorrent-0.0...
runghc Setup.lhs test
Test test-framework:
reverse-reverse/id: [OK, passed 100 tests]
Protocol/BCode:
QC encode-decode/id: [OK, passed 100 tests]
HUnit encode-decode/id: [OK]
Protocol/Wire:
Piece (-1) 1 ""
QC encode-decode/id: [Failed]
Falsifiable with seed 2776559770653812966, after 1 tests. Reason: Falsifiable

       Properties  Test Cases  Total
Passed  2           1           3
Failed  1           0           1
Total   3           1           4

though it does seem we need to do some work in order to correct the software :)

Wednesday, February 03, 2010

How logging is performed in Haskell-Torrent

First an update: Haskell-torrent has downloaded its first torrent file “from the wild”. Unfortunately, this behaviour is not deterministic yet, and sometimes the download fails in spectacular ways.

For a bittorrent client, which is heavily concurrent, we have a few obvious tools at our disposal to ensure correctness. We can use QuickCheck to generate test cases and test properties. This is best done with the parts of the code relating to encoding and decoding of data in the client. Second, and I think this is important, we can pose invariants on the various processes and check that these invariants hold periodically. Third, we can trace the execution.

In order to trace the execution, we need a logging framework. Several exist for Haskell, but none of these really suits our purpose.

Logging

The first thing we have done is to create a process with a channel. This process is the logger and the only task is to drain the channel and send out log messages on stdout. All other processes log by synchronization on an event where a log message is sent to this process.

Log messages have three components: A /name/ of the logging process, a /message/ to log, and a priority:

> data LogPriority = Debug -- ^ Fine grained debug info
>          | Warn  -- ^ Potentially harmful situations
>          | Info  -- ^ Informational messages, progress reports
>          | Error -- ^ Errors that are continuable
>          | Fatal -- ^ Severe errors. Will probably make the application abort
>          | None  -- ^ No logging at all
>              deriving (Show, Eq, Ord)

Since the log channel is part of so many process configurations, it is beneficial to create a type class of those configurations which can log:

> class Logging a where
>    getLogger :: a -> (String, LogChannel)

For example, the Piece Manager has a configuration

> data PieceMgrCfg = PieceMgrCfg
>     { logCh :: LogChannel
>     , pieceMgrCh :: PieceMgrChannel
>     , fspCh :: FSPChannel
>     , chokeCh :: ChokeInfoChannel
>     , statusCh :: StatusChan
>     }

so to make it possible to log, we instance this

> instance Logging PieceMgrCfg where
>   getLogger cf = ("PieceMgrP", logCh cf)

And to formalize what a log message and channel are:

> data LogMsg = Mes LogPriority String String
> type LogChannel = Channel LogMsg
> instance Show LogMsg where
>     show (Mes pri name str) = show name ++ "(" ++ show pri ++ "):\t" ++ str

The logging itself is done by a logger function. Here is its type:

> log :: Logging a => LogPriority -> String -> Process a b ()

It designates that it takes a Priority and a Message to log (the String). It then logs in a Process monad where the configuration is of type a and the state is of type b. Note that the configuration must be an instance of the Logging class.

The code itself is rather straightforward, save for the call to the logLevel function:

> log prio msg = do
>   (name, logC) <- asks getLogger
>   when (prio >= logLevel name)
>       (liftIO $ logMsg' logC name prio msg)
>   where logMsg' c name pri = sync . transmit c . Mes pri name

For the sake of convenience, we want it simpler for most uses, so without much further ado:

> logInfo, logDebug, logFatal, logWarn, logError :: Logging a => String -> Process a b ()
> logInfo  = log Info
> logDebug = log Debug
> logFatal = log Fatal
> logWarn  = log Warn
> logError = log Error

Filtering

The missing piece of the puzzle is now the “logLevel” function. In the function above, we only log messages which has a priority higher than the current log-level. The log-level is discriminated by the process name in question that tries to log.

Basically, what we want is a simple way to filter log messages. Rather than log all messages, we want to selectively log only part of these, depending on what we are currently debugging or tracing. A log filter is:

> type LogFilter = String -> LogPriority

That is, given a process name, they return a minimum priority for that name. The log function then filters the message if it is not above this priority. Here are a couple of simple combinators:

> matchP :: String -> LogPriority -> LogFilter
> matchP process prio = \s -> if s == process then prio else None
> 
> matchAny :: LogPriority -> LogFilter
> matchAny prio = const prio

The first of those matches a process by name. If there is a match it sets a specific priority. Otherwise it punts by returning the highest priority possible. The second is a simple wrapper for always matching at a given level.

With this down, we get to action. Log priorities are Monoids in the sense that None is the neutral element and composition ignores None by “eating” it:

> instance Monoid LogPriority where
>     mempty = None
>     mappend None g = g
>     mappend f g    = f

Since there is already an instance:

> instance Monoid b => Monoid (a -> b) where ...

Log filters are now monoids. So we can write:

> logLevel :: LogFilter
> #ifdef DEBUG
> logLevel = mconcat [matchP "SendQueueP" Fatal,
>             matchP "ReceiverP" Fatal,
>             matchAny Debug]
> #else
> logLevel = matchAny Info
> #endif

Which means that we do not care much for log messages from the processes “SendQueueP” and “ReceiverP” unless they are fatal. Otherwise we want Debug messages or above. The only restriction is that the first match in the monoid chain is the one that steers the level.

About Me

My Photo
Lambda-loving CS Geek. Likes metal music. Likes dogs, cats. Does not like pictures of dogs and cats (unless they are lambdacats!)

Has an unhealthy coffee addiction. Calls himself the coffee zombie in the morning (BEEEEANS!)

Has a neverending curiosity gene, loves intelligence and passion.