Enter the void *

Hakyll 101

Etienne Millon — Mon, 21 Nov 2011 00:00:00 UT

So, the recent trend seems to be static blogging. Indeed, as a web application, a blog is mostly read-only. By generating static .html files, one can eliminate :

CPU load : static content is what’s easiest to serve, especially with modern servers using sendfile(2).
security issues : without dynamic page generation, the attack surface is also vastly reduced. Authentication is moved from a PHP or Python script to the Unix way.
deployment problems : I don’t know a free host that won’t serve static files. I use S3 (and the free tier will often be enough !) but if I am not satisfied, it’s dead simple to migrate.

Basically, it’s like moving from a dynamic language to a static one ☺. The only problem is if you want to add comments. The popular solution is Disqus but it is unfortunately a non-free application. I’ll probably stick to it but I fear data lock-in.

As it is fashionable, a lot of tools have appeared : pelican, blogofile, ikiwiki, jekyll… Being a haskeller, I decided to give hakyll a try.

Hakyll is a haskell library for writing and deploying static websites ; that’s about it. As in a dynamic application, you define routes and how to serve them :

makeCss :: Rules
makeCss =
  void $ match "css/*" $ do
      route   idRoute
      compile compressCssCompiler

Most rules consist of compiling markdown to HTML (with the fantastic pandoc library) and copying stuff around.

The resulting binary, when compiled, can be run to see previews, build files or even deploy the site.

 ~/www/blog [master] % ./blog
ABOUT

This is a Hakyll site generator program. You should always
run it from the project root directory.

USAGE

blog build           Generate the site
blog clean           Clean up and remove cache
blog help            Show this message
blog preview [port]  Run a server and autocompile
blog rebuild         Clean up and build again
blog server [port]   Run a local test server
blog deploy          Upload/deploy your site

So far I’ve found it very easy to use. That’s it for this first mini-tour. Stay tuned !

What's in an ADT ?

Etienne Millon — Wed, 14 Dec 2011 00:00:00 UT

Introduction

Algebraic Data Types, or ADTs for short, are a core feature of functional languages such as OCaml or Haskell. They are a handy model of closed disjoint unions and unfortunately, outside of the functional realm, they are only seldom used.

In this article, I will explain what ADTs are, how they are used in OCaml and what trimmed-down versions of them exist in other languages. I will use OCaml, but the big picture is about the same in Haskell.

Principles

Functional languages offer a myriad of types for the programmer.

some base types, such as int, char or bool.
functions, ie arrow types. A function with domain a and codomain b has type a -> b.
tuples, ie product types. A tuple is an heterogeneous, fixed-width container type (its set-theoretic counterpart is the cartesian product) For example, (2, true, 'x') has type int * bool * char. record types are a (mostly) syntactic extension to give name to their fields.
some parametric types. For example, if t is a type, t list is the type of homogeneous linked list of elements having type t.
what we are talking about today, algebraic types (or sum types, or variant types).

If product types represent the cartesian product, algebraic types represent the disjoint union. In another words, they are very adapted for a case analysis.

We will take the example of integer ranges. One can say that an integer range is either :

the empty range
of the form ]-∞;a]
of the form [a;+∞[
an interval of the form [a;b] (where a ≤ b)
the whole range (ie, ℤ)

With the following properties :

Disjunction : no range can be of two forms at a time.
Injectivity : if [a;b] = [c;d], then a = c and b = d (and similarly for other forms).
Exhaustiveness : it cannot be of another form.

Syntax & semantics

This can be encoded as an ADT :

type range =
  | Empty
  | HalfLeft of int
  | HalfRight of int
  | Range of int * int
  | FullRange

Empty, HalfLeft, HalfRight, Range and FullRange are t’s constructors. They are the only way to build a value of type t. For example, Empty, HalfLeft 3 and Range (2, 5) are all values of type t¹. They each have a specific arity (the number of arguments they take).

To deconstruct a value of type t, we have to use a powerful construct, pattern matching, which is about matching a value against a sequence of patterns (yes, that’s about it).

To illustrate this, we will write a function that computes the minimum value of such a range. Of course, this can be ±∞ too, so we have to define a type to represent the return value.

type ext_int =
  | MinusInfinity
  | Finite of int
  | PlusInfinity

In a math textbook, we would write the case analysis as :

min ∅ = +∞
min ]-∞;a] = -∞
min [a;+∞[ = a
min [a;b] = a
min ℤ = -∞

That translates to the following (executable !) OCaml code :

let range_min x =
  match x with
  | Empty -> PlusInfinity
  | HalfLeft a -> MinusInfinity
  | HalfRight a -> Finite a
  | Range (a, b) -> Finite a
  | FullRange -> MinusInfinity

In the pattern HalfLeft a, a is a variable name, so it get bounds to the argument’s value. In other words, match (HalfLeft 2) with HalfLeft x -> e bounds x to 2 in e.

It’s functions all the way down

Pattern matching seems magical at first, but it is only a syntactic trick. Indeed, the definition of the above type is equivalent to the following definition :

type range

(* The following is not syntactically correct *)
val Empty : range
val HalfLeft : int -> range
val HalfRight : int -> range
val Range : int * int -> range
val FullRange : range
(* Moreover, we know that they are injective and mutually disjoint *)

val deconstruct_range :
  (unit -> 'a) ->
  (int -> 'a) ->
  (int -> 'a) ->
  (int * int -> 'a) ->
  (unit -> 'a) ->
  range ->
  'a

deconstruct_range is what replaces pattern matching. It also embodies the notion of exhaustiveness, because given any value of type range, we can build a deconstructed value out of it.

Its type looks scary at first, but if we look closer, its arguments are a sequence of case-specific deconstructors² and the value to get “matched” against.

To show the equivalence, we can implement deconstruct_range using pattern patching and range_min using deconstruct_range³ :

let deconstruct_range
      f_empty
      f_halfleft
      f_halfright
      f_range
      f_fullrange
      x
    =
  match x with
  | Empty -> f_empty ()
  | HalfLeft a -> f_halfleft a
  | HalfRight a -> f_halfright a
  | Range (a, b) -> f_range (a, b)
  | FullRange -> f_fullrange ()

let range_min' x =
  deconstruct_range
    (fun () -> PlusInfinity)
    (fun a -> MinusInfinity)
    (fun a -> Finite a)
    (fun (a, b) -> Finite a)
    (fun () -> MinusInfinity)
    x

Implementation

After this trip in denotational-land, let’s get back to operational-land : how is this implemented ?

In OCaml, no type information exists at runtime. Everything exists with a uniform representation and is either an integer or a pointer to a block. Each block starts with a tag, a size and a number of fields.

With the Obj module (kids, don’t try this at home), it is possible to inspect blocks at runtime. Let’s write a dumper for range value and watch outputs :

(* Range of integers between a and b *)
let rec rng a b =
  if a > b then
    []
  else
    a :: rng (a+1) b

let view_block o =
  if (Obj.is_block o) then
    begin
      let tag = Obj.tag o in
      let sz = Obj.size o in
      let f n =
        let f = Obj.field o n in
        assert (Obj.is_int f);
        Obj.obj f
      in
      tag :: List.map f (rng 0 (sz-1))
    end
  else if Obj.is_int o then
    [Obj.obj o]
  else
    assert false

let examples () =
  let p_list l =
    String.concat ";" (List.map string_of_int l)
  in
  let explore_range r =
    print_endline (p_list (view_block (Obj.repr r)))
  in
  List.iter explore_range
    [ Empty
    ; HalfLeft 8
    ; HalfRight 13
    ; Range (2, 5)
    ; FullRange
    ]

When we run examples (), it outputs :

0
0;8
1;13
2;2;5
1

We can see the following distinction :

0-ary constructors (Empty and FullRange) are encoded are simple integers.
other ones are encoded blocks with a constructor number as tag (0 for HalfLeft, 1 for HalfRight and 2 for Range) and their argument list afterwards.

Thanks to this uniform representation, pattern-matching is straightforward : the runtime system will only look at the tag number to decide which constructor has been used, and if there are arguments to be bound, they are just after in the same block.

Conclusion

Algebraic Data Types are a simple model of disjoint unions, for which case analyses are the most natural. In more mainstream languages, some alternatives exist but they are more limited to model the same problem.

For example, in object-oriented languages, the Visitor pattern is the natural way to do it. But class trees are inherently “open”, thus breaking the exhaustivity property.

The closest implementation is tagged unions in C, but they require to roll your own solution using enums, structs and unions. This also means that all your hand-allocated blocks will have the same size.

Oh, and I would love to know how this problem is solved with other paradigms !

Unfortunately, so is Range (10, 2). The invariant that a ≤ b has to be enforced by the programmer when using this constructor.↩︎
For 0-ary constructors, the type has to be unit -> 'a instead of 'a to allow side effects to happen during pattern matching.↩︎
More precisely, we would have to show that any function written with pattern matching can be adapted to use the deconstructor instead. I hope that this example is general enough to get the idea.↩︎

Comonadic Life

Etienne Millon — Thu, 18 Oct 2012 00:00:00 UT

Of monads and comonads

This post is written in Literate Haskell. This means that you can copy it into a .lhs file¹ and run it through a Haskell compiler or interpreter.

Today we’ll talk about…

import Control.Comonad
import Control.Monad

Comonads ! They are the categoric dual of monads, which means that the type signatures of comonadic functions look like monadic functions, but with the arrow reversed. I am not an expert in category theory, so I won’t go further.

I will use the following typeclass for comonads : it’s from Edward Kmett’s comonad package (split from the infamous category-extras package).

class Functor w => Comonad w where
  extract :: w a -> a
  extend :: (w a -> b) -> w a -> w b
  duplicate :: w a -> w (w a)

extend or duplicate are optional, as one can be written in terms of the other one. The Monad typeclass, for reference, can be described as² :

class Functor m => Monad m where
  return :: a -> m a
  (=<<) :: (a -> m b) -> m a -> m b
  join :: m (m a) -> m a

The duality is quite easy to see : extract is the dual of return, extend the one of (=<<) and duplicate the one of join.

So what are comonads good for ?

I stumbled upon an article which explains that they can be used for computations which depend on some local environment, like cellular automata. Comments ask whether it’s possible to generalize to higher dimensions, which I will do by implementing Conway’s Game of Life in a comonadic way.

List Zippers

List zippers are a fantastic data structure, allowing O(1) edits at a “cursor”. Moving the cursor element to element is O(1) too. This makes it a very nice data structure when your edits are local (say, in a text editor). You can learn more about zippers in general in this post from Edward Z Yang. The seminal paper is of course Huet’s article.

A list zipper is composed of a cursor and two lists.

data ListZipper a = LZ [a] a [a]

To go in a direction, you pick the head of a list, set it as your cursor, and push the cursor on top of the other list. We assume that we will only infinte lists, so this operation can not fail. This assumption is reasonnable especially in the context of cellular automata³.

listLeft :: ListZipper a -> ListZipper a
listLeft (LZ (l:ls) x rs) = LZ ls l (x:rs)
listLeft _ = error "listLeft"

listRight :: ListZipper a -> ListZipper a
listRight (LZ ls x (r:rs)) = LZ (x:ls) r rs
listRight _ = error "listRight"

Reading and writing on a list zipper at the cursor is straightforward :

listRead :: ListZipper a -> a
listRead (LZ _ x _) = x

listWrite :: a -> ListZipper a -> ListZipper a
listWrite x (LZ ls _ rs) = LZ ls x rs

We can also define a function to convert a list zipper to a list, for example for printing. As it’s infinite on both sizes, it’s not possible to convert it to the whole list, so we have to pass a size parameter.

toList :: ListZipper a -> Int -> [a]
toList (LZ ls x rs) n =
  reverse (take n ls) ++ [x] ++ take n rs

We can easily define a Functor instance for ListZipper. To apply a function on whole zipper, we apply it to the cursor and map it on the two lists :

instance Functor ListZipper where
  fmap f (LZ ls x rs) = LZ (map f ls) (f x) (map f rs)

Time for the Comonad instance. The extract method returns an element from the structure : we can pick the one at the cursor.

duplicate is a bit harder to grasp. From a list zipper, we have to build a list zipper of list zippers. The signification behind this (confirmed by the comonad laws that every instance has to fulfill) is that moving inside the duplicated structure returns the original structure, altered by the same move : for example, listRead (listLeft (duplicate z)) == listLeft z.

This means that at the cursor of the duplicated structure, there is the original structure z. And the left list is composed of listLeft z, listLeft (listLeft z), listLeft (listLeft (listLeft z)), etc (same goes for the right list).

The following function applies repeatedly two movement functions on each side of the zipper (its type is more generic than needed for this specific case but we’ll instanciate z with something other than ListZipper in the next section).

genericMove :: (z a -> z a)
            -> (z a -> z a)
            -> z a
            -> ListZipper (z a)
genericMove a b z =
  LZ (iterate' a z) z (iterate' b z)

iterate' :: (a -> a) -> a -> [a]
iterate' f = tail . iterate f

And finally we can implement the instance.

instance Comonad ListZipper where
  extract = listRead

  duplicate = genericMove listLeft listRight

Using this comonad instance we can already implement 1D cellular automata, as explained in the sigfpe article. Let’s see how they can be extended to 2D automata.

Plane zippers

Let’s generalize list zippers to plane zippers, which are cursors on a plane of cells. We will implement them using a list zipper of list zippers.

data Z a = Z (ListZipper (ListZipper a))

We start by defining move functions. As a convention, the external list will hold lines : to move up and down, we will really move left and right at the root level.

up :: Z a -> Z a
up (Z z) = Z (listLeft z)

down :: Z a -> Z a
down (Z z) = Z (listRight z)

For left and right, it is necessary to alter every line, using the Functor instance.

left :: Z a -> Z a
left (Z z) = Z (fmap listLeft z)

right :: Z a -> Z a
right (Z z) = Z (fmap listRight z)

Finally, editing is quite straightforward : reading is direct (first read the line, then the cursor) ; and in order to write, it is necessary to fetch the current line, write to it and write the new line.

zRead :: Z a -> a
zRead (Z z) = listRead $ listRead z

zWrite :: a -> Z a -> Z a
zWrite x (Z z) =
  Z $ listWrite newLine z
    where
      newLine = listWrite x oldLine
      oldLine = listRead z

Time for algebra. Let’s define a Functor instance : applying a function everywhere can be achieved by applying it on every line.

instance Functor Z where
  fmap f (Z z) = Z (fmap (fmap f) z)

The idea behind the Comonad instance for Z is the same that the ListZipper one : moving “up” in the structure (really, “left” at the root level) returns the original structure moved in this direction.

We will reuse the genericMove defined earlier in order to build list zippers that describe movements in the two axes⁴.

horizontal :: Z a -> ListZipper (Z a)
horizontal = genericMove left right

vertical :: Z a -> ListZipper (Z a)
vertical = genericMove up down

This is enough to define the instance.

instance Comonad Z where
  extract = zRead

  duplicate z =
    Z $ fmap horizontal $ vertical z

Conway’s (comonadic) Game of Life

Let’s define a neighbourhood function. Here, directions are moves on a plane zipper. Neighbours are : horizontal moves, vertical moves and their compositions (liftM2 (.))⁵.

neighbours :: [Z a -> Z a]
neighbours =
  horiz ++ vert ++ liftM2 (.) horiz vert
    where
      horiz = [left, right]
      vert  = [up, down]

aliveNeighbours :: Z Bool -> Int
aliveNeighbours z =
  card $ map (\ dir -> extract $ dir z) neighbours

card :: [Bool] -> Int
card = length . filter (==True)

The core rule of the game fits in the following function : if two neighbours are alive, return the previous state ; if three neighbours are alive, a new cell is born, and any other count causes the cell to die (of under-population or overcrowding).

It is remarkable that its type is the dual of that of a Kleisli arrow (a -> m b).

rule :: Z Bool -> Bool
rule z =
  case aliveNeighbours z of
    2 -> extract z
    3 -> True
    _ -> False

And then the comonadic magic happens with the use of extend :

evolve :: Z Bool -> Z Bool
evolve = extend rule

evolve is our main transition function between world states, and yet it’s only defined in terms of the local transition function !

Let’s define a small printer to see what’s going on.

dispLine :: ListZipper Bool -> String
dispLine z =
  map dispC $ toList z 6
    where
      dispC True  = '*'
      dispC False = ' '

disp :: Z Bool -> String
disp (Z z) =
  unlines $ map dispLine $ toList z 6

Here is the classic glider pattern to test. The definition has a lot of boilerplate because we did not bother to create a fromList function.

glider :: Z Bool
glider =
  Z $ LZ (repeat fz) fz rs
    where
      rs = [ line [f, t, f]
           , line [f, f, t]
           , line [t, t, t]
           ] ++ repeat fz
      t = True
      f = False
      fl = repeat f
      fz = LZ fl f fl
      line l =
        LZ fl f (l ++ fl)

*Main> putStr $ disp glider
             
             
             
             
             
             
             
        *    
         *   
       ***   
             
             
             
*Main> putStr $ disp $ evolve glider
             
             
             
             
             
             
             
             
       * *   
        **   
        *

We did it ! Implementing Conway’s Game of Life is usually full of ad-hoc boilerplate code : iterating loops, managing copies of cells, etc. Using the comonadic structure of cellular automata, the code can be a lot simpler.

In this example, ListZipper and Z should be library functions, so the actual implementation is only a dozen lines long!

The real benefit is that it has really helped be grasp the concept of comonads. I hope that I did not just fall into the comonad tutorial fallacy :)

Update (March 10th): Brian Cohen contributed a simple extension to simulate a closed topology. Thanks !

Or download the source on github.↩︎
In the real Haskell typeclass, there are the following differences: Monad and Functor are not related, join is a library function (you can’t use it to define an instance), (>>=) is used instead of its flipped counterpart (=<<) and there two more methods (>>) and fail.↩︎
Simulating a closed topology such as a torus may even be possible using cyclic lists instead of lazy infinite lists. Update: see Brian Cohen’s response at the end of this post.↩︎
At first I thought that it was possible to only use the Comonad instance of ListZipper to define horizontal and vertical, but I couldn’t come up with a solution. But in that case, the z generic parameter is instanciated to Z, not ListZipper. For that reason I believe that my initial thought can’t be implemented. Maybe it’s possible with a comonad transformer or something like that.↩︎
This could have been written in extension as there are only 8 cases, but it’s funnier and arguably less error prone this way :-)↩︎

A lens-based ST20 emulator

Etienne Millon — Thu, 20 Aug 2015 00:00:00 UT

Every year, as part of the SSTIC conference, there is a forensics/reverse engineering challenge. I participated in the 2015 edition. Though I did not manage to complete it, I made an emulator for the exotic ST20 architecture, which is probably worth describing here.

Note that this emulator is not really optimized for pure speed. In the actual challenge I actually had to rewrite it as pure Haskell (i.e., removing the emulation part) so that it was faster. Instead, the goal of this article is to show a few techniques to write powerful emulators in Haskell.

{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE TemplateHaskell #-}
import Control.Applicative
import Control.Concurrent
import Control.Monad.RWS
import Control.Lens hiding (imap, op)
import Data.Bits
import Data.Int
import Data.Maybe
import Data.Word
import Numeric
import System.Exit
import System.IO
import Text.Printf

import qualified Data.ByteString as BS
import qualified Data.Map as M
import qualified Data.Set as S

The evaluation monad

This program uses Template Haskell to define lenses, so unfortunately we need to start with a few type definitions.

The ST20’s memory goes from 0x80000000 to 0x7fffffff:

type Address = Int32

We’ll represent the memory using a map. The performance is surprisingly close to that of an array. It is possible to get significantly speeds up memory access by using an IOUArray but it turns loads and stores become monadic operations and makes it impossible to use lenses.

type Mem = M.Map Address Word8

As we’ll see, transputers (hardware threads) can communicate together. We’ll be able to connect it either between them, or to a tty.

data IChannel = InChan (Chan Word8)
              | InHandle Handle

data OChannel = OutChan (Chan Word8)
              | OutHandle Handle

type IChannelMap = [(Int32, IChannel)]
type OChannelMap = [(Int32, OChannel)]

All evaluations take place in a Eval Monad which is a monad transformer stack with the following capabilities:

read and write an EvalState value;
read an EvalEnv value
do some I/O.

newtype Eval a = Eval (RWST EvalEnv () EvalState IO a)
    deriving ( Functor
             , Monad
             , MonadIO
             , MonadReader EvalEnv
             , MonadState EvalState
             )

data EvalEnv =
    EvalEnv
        { envInChans :: IChannelMap
        , envOutChans :: OChannelMap
        }

data EvalState =
    EvalState
        { _iptr :: !Address
        , _intStack :: [Int32]
        , _wptr :: !Int32
        , _mem :: !Mem
        }
$(makeLenses ''EvalState)

runEval :: Mem -> IChannelMap -> OChannelMap -> Eval a -> IO a
runEval memory imap omap (Eval m) =
    fst <$> evalRWST m env st
        where
            env = EvalEnv imap omap
            st =
                 EvalState
                     { _iptr = memStart
                     , _intStack = []
                     , _wptr = 0xaaaaaaaa
                     , _mem = memory
                     }

The above $(...) is a Template Haskell splice. It creates lenses based on the record declaration of EvalState. Lenses are a very powerful tool that makes it possible to compose record reads and updates in a functional way. Here, it defines a lens for each record field; for example, the splice expands to a top-level declaration iptr :: Lens' EvalState Address. But we will define our own lenses too, and everything will remain composable.

Memory

This is naturally adapted to byte access:

memByteOpt :: Address -> Lens' EvalState (Maybe Word8)
memByteOpt addr =
  mem . at addr

See? We composed the mem lens (between an evaluation state and a memory state) with at addr, which is a lens between a memory state and the value at address addr. Well, not exactly: at actually returns a Maybe Word8. We will assume that all memory accesses will succeed, so we want a lens that returns a plain Word8. To achieve this, we can compose with a lens that treats Maybe a as a container of a:

maybeLens :: Lens' (Maybe a) a
maybeLens = lens fromJust (const Just)

memByte :: Address -> Lens' EvalState Word8
memByte addr =
  memByteOpt addr . maybeLens

Sometimes we will also need to access memory word by word. To achieve that, we first define conversion functions.

bytesToWord :: (Word8, Word8, Word8, Word8) -> Int32
bytesToWord (b0, b1, b2, b3) =
    sum [ fromIntegral b0
        , fromIntegral b1 `shiftL` 8
        , fromIntegral b2 `shiftL` 16
        , fromIntegral b3 `shiftL` 24
        ]

wordToBytes :: Int32 -> (Word8, Word8, Word8, Word8)
wordToBytes w =
    (b0, b1, b2, b3)
        where
            b0 = fromIntegral $ w .&. 0x000000ff
            b1 = fromIntegral $ (w .&. 0x0000ff00) `shiftR` 8
            b2 = fromIntegral $ (w .&. 0x00ff0000) `shiftR` 16
            b3 = fromIntegral $ (w .&. 0xff000000) `shiftR` 24

Then, we can define a lens focusing on a 32-bit value.

compose :: [a -> a] -> a -> a
compose = foldr (.) id

get32 :: Address -> EvalState -> Int32
get32 base s =
    bytesToWord (b0, b1, b2, b3)
        where
            b0 = s ^. memByte base
            b1 = s ^. memByte (base + 1)
            b2 = s ^. memByte (base + 2)
            b3 = s ^. memByte (base + 3)

set32 :: Address -> EvalState -> Int32 -> EvalState
set32 base s v =
   compose
       [ set (memByte base) b0
       , set (memByte (base + 1)) b1
       , set (memByte (base + 2)) b2
       , set (memByte (base + 3)) b3
       ] s
        where
            (b0, b1, b2, b3) = wordToBytes v

memWord :: Address -> Lens' EvalState Int32
memWord addr = lens (get32 addr) (set32 addr)

The instruction set reference defines a handy operator to shift an address by a word offset:

(@@) :: Address -> Int32 -> Address
a @@ n = a + 4 * n

It will be also handy to access the memory in list chunks:

mem8s :: Address -> Int32 -> Lens' EvalState [Word8]
mem8s base len = lens getList setList
    where
        getList s =
            map (\ off -> s ^. memByte (base + off)) [0 .. len - 1]
        setList s0 ws =
            compose (zipWith (\ off w -> set (memByte (base + off)) w) [0..] ws) s0

Instruction decoding

Instructions are usually encoded on a single byte: the opcode is in the first nibble, and a parameter is in the second one. For example this is how a LDC (load constant) is encoded:

   .--- 0x40 LDC
  |.---          0x5
  ||
0x45         LDC 0x5

This only works for 4-bytes constants. To load bigger constants, there is a “prefix” operation that will shift the current operand:

   .-------- 0x20 PFX
  |.--------          0x2
  ||
  ||    .--- 0x40 LDC
  ||   |.---          0x5
  ||   ||
0x22 0x45    LDC 0x25

Those are chainable; for example 0x21 0x22 0x45 encodes LDC 0x125.

Another prefix shifts and complements the current operand value:

   .-------- 0x60 NFX
  |.--------          0x2
  ||
  ||    .--- 0x40 LDC
  ||   |.---          0x5
  ||   ||
0x62 0x45    LDC (~0x25)

The ST20 architecture actually provides two type of instructions:

“primary” instructions such as LDC. Their operand is directly encoded.
“secondary” instructions such as MINT (equivalent to LDC 0x80000000). They do not have operands. On the contrary, they are actually a special case of the first type, using a special OPR n opcode. For example, MINT is OPR 0x42, which is encoded using 0x24 0xF2.

We know enough to draft an instruction decoder.

data PInstr = AJW   | ADC
            | LDC   | STL
            | LDL   | LDNL
            | LDLP  | LDNLP
            | CJ    | J
            | EQC   | CALL
            | STNL
    deriving (Eq, Ord, Show)

data SInstr = PROD  | MINT   | GAJW
            | LDPI  | OUT    | IN
            | LB    | XOR    | SB
            | BSUB  | SSUB   | DUP
            | GTx   | WSUB   | AND
            | RET   | GCALL  | SHR
            | SHL   | REM
    deriving (Eq, Ord, Show)

data Instr = Pri PInstr Int32
           | Sec SInstr
    deriving (Eq, Ord)

instance Show Instr where
    show (Pri p n) = show p ++ " " ++ show n
    show (Sec s) = show s

Instruction decoding will need to move within the instruction stream, so it is part of the evaluation monad.

decodeInstr :: Eval Instr
decodeInstr = decodeInstr_ 0

decodeInstr_ :: Int32 -> Eval Instr
decodeInstr_ acc = do
    b <- peekAndIncr
    let acc' = acc + fromIntegral (b .&. 0xf)
    case () of
        _ | b <= 0x0f -> return $ Pri J acc'
        _ | b <= 0x1f -> return $ Pri LDLP acc'
        _ | b <= 0x2f -> decodeInstr_ $ acc' `shiftL` 4
        _ | b <= 0x3f -> return $ Pri LDNL acc'
        _ | b <= 0x4f -> return $ Pri LDC acc'
        _ | b <= 0x5f -> return $ Pri LDNLP acc'
        _ | b <= 0x6f -> decodeInstr_ $ complement acc' `shiftL` 4
        _ | b <= 0x7f -> return $ Pri LDL acc'
        _ | b <= 0x8f -> return $ Pri ADC acc'
        _ | b <= 0x9f -> return $ Pri CALL acc'
        _ | b <= 0xaf -> return $ Pri CJ acc'
        _ | b <= 0xbf -> return $ Pri AJW acc'
        _ | b <= 0xcf -> return $ Pri EQC acc'
        _ | b <= 0xdf -> return $ Pri STL acc'
        _ | b <= 0xef -> return $ Pri STNL acc'
        _             -> return $ Sec $ parseSecondary acc'

peekAndIncr :: Eval Word8
peekAndIncr = do
    addr <- use iptr
    b <- use (memByte addr)
    iptr += 1
    return b

parseSecondary :: Int32 -> SInstr
parseSecondary 0x01 = LB
parseSecondary 0x02 = BSUB
parseSecondary 0x06 = GCALL
parseSecondary 0x07 = IN
parseSecondary 0x08 = PROD
parseSecondary 0x09 = GTx
parseSecondary 0x0a = WSUB
parseSecondary 0x0b = OUT
parseSecondary 0x1b = LDPI
parseSecondary 0x1f = REM
parseSecondary 0x20 = RET
parseSecondary 0x33 = XOR
parseSecondary 0x3b = SB
parseSecondary 0x3c = GAJW
parseSecondary 0x40 = SHR
parseSecondary 0x41 = SHL
parseSecondary 0x42 = MINT
parseSecondary 0x46 = AND
parseSecondary 0x5a = DUP
parseSecondary 0xc1 = SSUB
parseSecondary b = error $ "Unknown secondary 0x" ++ showHex b ""

The two stacks

Data is manipulated using two different mechanisms: the integer stack and the workspace.

The integer stack is a set of three registers: A, B, and C, which can be used as a stack using these operations. Actually, it can only be manipulated through push and pop operations, so we represent this using a list.

The instruction set reference says that an undefined value will be popped if the stack is empty; here we consider that this will not happen, and allow a partial pattern matching.

pushInt :: Int32 -> Eval ()
pushInt n =
    intStack %= (n:)

popInt :: Eval Int32
popInt = do
    (h:t) <- use intStack
    intStack .= t
    return h

popAll :: Eval (Int32, Int32, Int32)
popAll = do
    a <- popInt
    b <- popInt
    c <- popInt
    return (a, b, c)

Only the head (A) can be directly accessed, so we first define a lens between a list and its head, and compose it with intStack.

headLens :: Lens' [a] a
headLens = lens head $ \ l x -> x:tail l

areg :: Lens' EvalState Int32
areg = intStack . headLens

The workspace is a place in memory (pointed to by a register wptr) where local variables can be stored and loaded, a bit like a stack pointer. We first define push and pop operations.

pushWorkspace :: Int32 -> Eval ()
pushWorkspace value = do
    wptr -= 4
    var 0 .= value

popWorkspace :: Eval Int32
popWorkspace = do
    w <- use $ var 0
    wptr += 4
    return w

Then we define a lens to focus on a variable.

var :: Int32 -> Lens' EvalState Int32
var n =
    lens getVar setVar
        where
            varLens s = memWord ((s ^. wptr) @@ n)
            getVar s = s ^. varLens s
            setVar s v = set (varLens s) v s

Input and output

The main particularity of the ST20 architecture is that it has hardware support of message channels. They map fairly naturally to Control.Concurrent.Chan channels. Each ST20 thread will have a map from channel numbers to input or output channels:

getXChan :: (EvalEnv -> [(Int32, a)]) -> Int32 -> EvalEnv -> a
getXChan member w st =
    fromJust $ lookup w $ member st

getIChan :: Int32 -> EvalEnv -> IChannel
getIChan = getXChan envInChans

getOChan :: Int32 -> EvalEnv -> OChannel
getOChan = getXChan envOutChans

And these channels can be either a Chan Word8 or a plain Handle, to connect a thread to the process’ standard input and output.

readFromIChan :: IChannel -> Int32 -> Eval [Word8]
readFromIChan (InChan chan) n =
    liftIO $ mapM (\ _ -> readChan chan) [1..n]
readFromIChan (InHandle h) n =
    liftIO $ do
        bs <- BS.hGet h $ fromIntegral n
        return $ BS.unpack bs

writeToOChan :: OChannel -> [Word8] -> Eval ()
writeToOChan (OutChan chan) ws =
    liftIO $ writeList2Chan chan ws
writeToOChan (OutHandle h) ws =
    liftIO $ do
        BS.hPutStr h $ BS.pack ws
        hFlush h

A few combinators

We first define a few combinators that will help us define the interpret function.

Pop two operands, and push the result:

liftOp :: (Int32 -> Int32 -> Int32) -> Eval ()
liftOp op = do
    a <- popInt
    b <- popInt
    pushInt $ op a b

Exchange two registers:

xchg :: Lens' EvalState Int32 -> Lens' EvalState Int32 -> Eval ()
xchg l1 l2 = do
  x1 <- use l1
  x2 <- use l2
  l1 .= x2
  l2 .= x1

Convert a boolean to an integer:

fromBool :: Bool -> Int32
fromBool False = 0
fromBool True = 1

The `interpret` function

The core of the interpreter is the following function. It takes an instruction and transforms it into a monadic action in Eval.

interpret :: Instr -> Eval ()

Some cases are very simple.

interpret (Pri AJW n) = wptr += 4 * n
interpret (Pri LDNLP n) = areg += 4 * n
interpret (Pri J n) = iptr += n
interpret (Pri LDC n) = pushInt n
interpret (Sec MINT) = pushInt 0x80000000
interpret (Sec GAJW) = xchg areg wptr
interpret (Sec GCALL) = xchg areg iptr
interpret (Pri ADC n) = areg += n
interpret (Pri EQC n) = areg %= (\ a -> fromBool $ a == n)

For some others, we can lift them into the host language and use Haskell operations.

interpret (Sec PROD) = liftOp (*)
interpret (Sec XOR) = liftOp xor
interpret (Sec AND) = liftOp (.&.)
interpret (Sec BSUB) = liftOp (+)
interpret (Sec SSUB) = liftOp $ \ a b -> a + 2 * b
interpret (Sec WSUB) = liftOp (@@)
interpret (Sec GTx) = liftOp $ \ a b -> fromBool $ b > a
interpret (Sec SHR) = liftOp $ \ a b -> b `shiftR` fromIntegral a
interpret (Sec SHL) = liftOp $ \ a b -> b `shiftL` fromIntegral a
interpret (Sec REM) = liftOp $ \ a b -> b `mod` a

Others need a few operations to prepare the operands and access memory.

interpret (Sec SB) = do
    a <- popInt
    b <- popInt
    memByte a .= fromIntegral b
interpret (Sec DUP) = do
    a <- popInt
    pushInt a
    pushInt a
interpret (Pri STL n) = do
    v <- popInt
    var n .= v
interpret (Pri LDLP n) = do
    v <- use wptr
    pushInt $ v @@ n
interpret (Pri LDL n) = do
    v <- use $ var n
    pushInt v
interpret (Sec LDPI) = do
    ip <- use iptr
    areg += ip
interpret (Pri CJ n) = do
    a <- popInt
    let d = if a == 0 then n else 0
    iptr += d
interpret (Sec LB) = do
    a <- use areg
    a' <- fromIntegral <$> use (memByte a)
    areg .= a'
interpret (Pri STNL n) = do
    a <- popInt
    b <- popInt
    memWord (a @@ n) .= b
interpret (Pri LDNL n) = do
    a <- use areg
    a' <- use $ memWord $ a @@ n
    areg .= a'

Call and return instructions use the workspace to pass arguments.

interpret (Pri CALL n) = do
    (a, b, c) <- popAll
    pushWorkspace c
    pushWorkspace b
    pushWorkspace a
    ip <- use iptr
    pushWorkspace ip
    areg .= ip
    iptr += n
interpret (Sec RET) = do
    newIp <- popWorkspace
    _ <- popWorkspace
    _ <- popWorkspace
    _ <- popWorkspace
    iptr .= newIp

To perform I/O, the calling transputer needs to supply three things in the int stack:

the number of bytes to transfer;
a pointer to a channel;
where to read or write the message.

The channel itself is abstracted in the transputer’s channel maps. Most reads succeed; however the first transputer’s channel 0 will read directly from a file, so it will reach end of file at some time. We can detect that when an empty list is read, and exit the process.

interpret (Sec OUT) = do
    (len, pChan, pMsg) <- popAll
    message <- use $ mem8s pMsg len
    chan <- asks $ getOChan pChan
    writeToOChan chan message
interpret (Sec IN) = do
    (len, pChan, pMsg) <- popAll
    chan <- asks $ getIChan pChan
    input <- readFromIChan chan len
    when (null input) $ liftIO exitSuccess
    mem8s pMsg (fromIntegral $ length input) .= input

The core of the interpreter is then very simple:

evalLoop :: Eval ()
evalLoop = do
    i <- decodeInstr
    interpret i
    evalLoop

Boot from link

Several things are missing: the memory map, and how the system boots.

It turns out that the ST20 has a very simple boot protocol:

read 1 byte from port 0, call it n
read n bytes from port 0
store those at memStart
set the workspace just after this memory chunk
jump to memStart

bootSeq :: Eval ()
bootSeq = do
    chan <- asks $ getIChan $ iPin 0
    len <- head <$> readFromIChan chan 1
    prog <- readFromIChan chan $ fromIntegral len
    mem8s memStart (fromIntegral $ length prog) .= prog
    wptr .= memStart + fromIntegral len

There’s some flexibility on memStart, but this value works:

memStart :: Address
memStart = 0x80000100

Pin numbers, however, are mapped to fixed address:

iPin :: Int32 -> Int32
iPin n = 0x80000010 @@ n

oPin :: Int32 -> Int32
oPin n = 0x80000000 @@ n

We decide to initialize the memory with zeroes:

initialMem :: Mem
initialMem =
    M.fromList $ zip [0x80000000 .. memEnd] $ repeat 0
        where
            memSize = 0x4000
            memEnd = memStart + memSize - 1

Booting a transputer is then just a matter of reading from the correct channel and doing the rest of the evaluation loop.

transputer :: Maybe Analysis
           -> [((Int32, IChannel), (Int32, OChannel))]
           -> IO (MVar ())
transputer analysis cmap = do
    let (imap, omap) = unzip cmap
    fork $ runEval initialMem imap omap $ do
        bootSeq
        runAnalysis analysis
        evalLoop

Multithreading boilerplate

If you fork threads and don’t wait for them, nothing will happen since the main thread will just exit. The solution is to create a “control” MVar that will be signalled to by each thread:

fork :: IO () -> IO (MVar ())
fork io = do
    mvar <- newEmptyMVar
    _ <- forkFinally io $ \ _ -> putMVar mvar ()
    return mvar

And to wait for all of them:

runAll :: [IO (MVar ())] -> IO ()
runAll ms = do
    threads <- sequence ms
    mapM_ takeMVar threads

Connecting the lines

For this problem we have 13 transputers.

data TransputerName = T00 | T01 | T02 | T03
                    | T04 | T05 | T06 | T07
                    | T08 | T09 | T10 | T11
                    | T12
  deriving (Enum, Eq)

We devise a way to connect them together. The communication between two transputers is bidirectional, so we need two channels. Each of them is converted to an OChannel on one side and an IChannel on the other one.

connect :: TransputerName -> Int32
        -> TransputerName -> Int32
        -> IO [(TransputerName, Int32, OChannel, IChannel)]
connect src srcPort dst dstPort = do
  x <- newChan
  y <- newChan
  return [ (src, srcPort, OutChan x, InChan y)
         , (dst, dstPort, OutChan y, InChan x)
         ]

Booting them is a matter of creating the correct communication channels (this pinout list comes from a schematic that was present in the challenge files).

main :: IO ()
main = do
    pins <- concat <$> sequence
        [ connect T00 1 T01 0
        , connect T00 2 T02 0
        , connect T00 3 T03 0
        , connect T01 1 T04 0
        , connect T01 2 T05 0
        , connect T01 3 T06 0
        , connect T02 1 T07 0
        , connect T02 2 T08 0
        , connect T02 3 T09 0
        , connect T03 1 T10 0
        , connect T03 2 T11 0
        , connect T03 3 T12 0
        , connect T11 1 T12 1
        ]
    runAll $ map (buildTransputer pins) [T00 ..]
        where
            buildTransputer pins t =
                transputer (isDebug t) $ onlyFor t pins ++ extraPins t
            pin n ochan ichan = ((iPin n, ichan), (oPin n, ochan))
            onlyFor src l = [pin p oc ic | (name, p, oc, ic) <- l, name == src]
            extraPins T00 = [((iPin 0, InHandle stdin), (oPin 0, OutHandle stdout))]
            extraPins _ = []

Bonus: static analysis tools

The above transputer function is controlled by the following configuration:

data Analysis = Graph | Disasm

isDebug :: TransputerName -> Maybe Analysis
isDebug _ = Nothing

It means that for each transputer, we can choose to print a graph or a disassembly of the code that will be executed. To do that, we will first compute the set of all edges in the control flow graph.

This analysis relies on a nextInstr function that statically computes the set of next instructions. These can be reached either because it’s the next one in the instruction flow (DSeq), because of jump (DJmp), or an unknown destination, for example after a RET (DDyn).

data Dest = DSeq Address
          | DJmp Address
          | DDyn
    deriving (Eq, Ord)

nextInstrs :: Instr -> [Dest]
nextInstrs (Pri CJ n) = [DSeq 0, DJmp n]
nextInstrs (Pri J n) = [DJmp n]
nextInstrs (Pri CALL n) = [DSeq 0, DJmp n]
nextInstrs (Sec GCALL) = [DDyn]
nextInstrs (Sec RET) = [DDyn]
nextInstrs _ = [DSeq 0]

We can wrap this function in a monadic one that can turn these relative addresses into absolute ones (since it can know the addresses of functions).

type EdgeSet = S.Set (Address, Instr, Dest)

instrDests :: Address -> Eval EdgeSet
instrDests start = do
    iptr .= start
    i <- decodeInstr
    let deltaips = nextInstrs i
    new <- use iptr
    return $ S.fromList $ map (\ d -> (start, i, adjust new d)) deltaips
        where
            adjust n (DSeq d) = DSeq $ n + d
            adjust n (DJmp d) = DJmp $ n + d
            adjust _ DDyn = DDyn

Then, the algorithm consists in computing the fixpoint of the following iterating function:

step :: EdgeSet -> Eval EdgeSet
step s = do
    xs <- mapM (basicBlockM . getDest) $ S.toList s
    return $ S.union s $ S.unions xs
        where
            getDest (_, _, DSeq a) = Just a
            getDest (_, _, DJmp a) = Just a
            getDest (_, _, DDyn) = Nothing
            basicBlockM (Just a) = instrDests a
            basicBlockM Nothing = return S.empty

The fixpoint itself is computed using the following function, which takes a predicate on two EdgeSets to stop the iteration.

stepUntil :: ((EdgeSet, EdgeSet) -> Bool) -> (EdgeSet, EdgeSet) -> Eval EdgeSet
stepUntil p (a, b) | p (a, b) = return b
stepUntil p (_, b) = do
    c <- step b
    stepUntil p (b, c)

We’ll stop when their size is equal.

runAnalysis :: Maybe Analysis -> Eval ()
runAnalysis Nothing = return ()
runAnalysis (Just analysis) = do
    s0 <- instrDests memStart
    let p (a, b) = S.size a == S.size b
    r <- stepUntil p (S.empty, s0)
    liftIO $ putStrLn $ convert analysis r
    iptr .= memStart

Finally, here is how to convert the EdgeSets in a human-readable form.

convert :: Analysis -> EdgeSet -> String
convert Graph es =
    "digraph G{\n"
    ++ "node[shape=point]\n"
    ++ concatMap edge (S.toList es)
    ++ "}"
        where
            edge (x, i, y) = show x ++ " -> " ++ toNode x y ++ "[label=\"" ++ show i ++ "\"];\n"
            toNode _ (DSeq a) = show a
            toNode _ (DJmp a) = show a
            toNode x DDyn = "dyn" ++ show x
convert Disasm es = concatMap go $ S.toList es
    where
        go (x, i, DSeq _) =
            printf "%04x %s\n" x (show i)
        go (x, i, DJmp y) =
            printf "%04x %s  [* %04x]\n" x (show i) y
        go (x, i, DDyn) =
            printf "%04x %s  [* dyn]\n" x (show i)

For example here is an extract of the beginning of the first transputer. You can notice instructions with several destinations (conditional jumps) are displayed twice.

80000100 AJW -76
80000102 LDC 0
80000103 STL 1
80000104 LDC 0
80000105 STL 3
80000106 MINT
80000108 LDNLP 1024
8000010b GAJW
8000010d AJW -76
8000010f LDC 201
80000111 LDPI
80000113 MINT
80000115 LDC 8
80000116 OUT
80000117 LDLP 73
80000119 MINT
8000011b LDNLP 4
8000011c LDC 12
8000011d IN
8000011e LDL 73
80000120 CJ 21
80000120 CJ 21  [* 80000137]
80000122 LDC 205
80000124 LDPI

For the graph output, I assume that you have already seen graphviz output:

The introduction image was done using the same output but an alternative layout engines.

Hope you enjoyed this article!