Enter the void *

A lens-based ST20 emulator

by Etienne Millon on August 20, 2015

Tagged as: haskell, emulator, lenses.

Every year, as part of the SSTIC conference, there is a forensics/reverse engineering challenge. I participated in the 2015 edition. Though I did not manage to complete it, I made an emulator for the exotic ST20 architecture, which is probably worth describing here.

Some programs will loop. It’s OK.

Some programs will loop. It’s OK.

Note that this emulator is not really optimized for pure speed. In the actual challenge I actually had to rewrite it as pure Haskell (i.e., removing the emulation part) so that it was faster. Instead, the goal of this article is to show a few techniques to write powerful emulators in Haskell.

> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> {-# LANGUAGE Rank2Types #-}
> {-# LANGUAGE TemplateHaskell #-}
> import Control.Applicative
> import Control.Concurrent
> import Control.Monad.RWS
> import Control.Lens hiding (imap, op)
> import Data.Bits
> import Data.Int
> import Data.Maybe
> import Data.Word
> import Numeric
> import System.Exit
> import System.IO
> import Text.Printf
> 
> import qualified Data.ByteString as BS
> import qualified Data.Map as M
> import qualified Data.Set as S

The evaluation monad

This program uses Template Haskell to define lenses, so unfortunately we need to start with a few type definitions.

The ST20’s memory goes from 0x80000000 to 0x7fffffff:

> type Address = Int32

We’ll represent the memory using a map. The performance is surprisingly close to that of an array. It is possible to get significantly speeds up memory access by using an IOUArray but it turns loads and stores become monadic operations and makes it impossible to use lenses.

> type Mem = M.Map Address Word8

As we’ll see, transputers (hardware threads) can communicate together. We’ll be able to connect it either between them, or to a tty.

> data IChannel = InChan (Chan Word8)
>               | InHandle Handle
> 
> data OChannel = OutChan (Chan Word8)
>               | OutHandle Handle
> 
> type IChannelMap = [(Int32, IChannel)]
> type OChannelMap = [(Int32, OChannel)]

All evaluations take place in a Eval Monad which is a monad transformer stack with the following capabilities:

> newtype Eval a = Eval (RWST EvalEnv () EvalState IO a)
>     deriving ( Functor
>              , Monad
>              , MonadIO
>              , MonadReader EvalEnv
>              , MonadState EvalState
>              )
> 
> data EvalEnv =
>     EvalEnv
>         { envInChans :: IChannelMap
>         , envOutChans :: OChannelMap
>         }
> 
> data EvalState =
>     EvalState
>         { _iptr :: !Address
>         , _intStack :: [Int32]
>         , _wptr :: !Int32
>         , _mem :: !Mem
>         }
> $(makeLenses ''EvalState)
> 
> runEval :: Mem -> IChannelMap -> OChannelMap -> Eval a -> IO a
> runEval memory imap omap (Eval m) =
>     fst <$> evalRWST m env st
>         where
>             env = EvalEnv imap omap
>             st =
>                  EvalState
>                      { _iptr = memStart
>                      , _intStack = []
>                      , _wptr = 0xaaaaaaaa
>                      , _mem = memory
>                      }

The above $(...) is a Template Haskell splice. It creates lenses based on the record declaration of EvalState. Lenses are a very powerful tool that makes it possible to compose record reads and updates in a functional way. Here, it defines a lens for each record field; for example, the splice expands to a top-level declaration iptr :: Lens' EvalState Address. But we will define our own lenses too, and everything will remain composable.

Memory

This is naturally adapted to byte access:

> memByteOpt :: Address -> Lens' EvalState (Maybe Word8)
> memByteOpt addr =
>   mem . at addr

See? We composed the mem lens (between an evaluation state and a memory state) with at addr, which is a lens between a memory state and the value at address addr. Well, not exactly: at actually returns a Maybe Word8. We will assume that all memory accesses will succeed, so we want a lens that returns a plain Word8. To achieve this, we can compose with a lens that treats Maybe a as a container of a:

> maybeLens :: Lens' (Maybe a) a
> maybeLens = lens fromJust (const Just)
> 
> memByte :: Address -> Lens' EvalState Word8
> memByte addr =
>   memByteOpt addr . maybeLens

Sometimes we will also need to access memory word by word. To achieve that, we first define conversion functions.

> bytesToWord :: (Word8, Word8, Word8, Word8) -> Int32
> bytesToWord (b0, b1, b2, b3) =
>     sum [ fromIntegral b0
>         , fromIntegral b1 `shiftL` 8
>         , fromIntegral b2 `shiftL` 16
>         , fromIntegral b3 `shiftL` 24
>         ]
> 
> wordToBytes :: Int32 -> (Word8, Word8, Word8, Word8)
> wordToBytes w =
>     (b0, b1, b2, b3)
>         where
>             b0 = fromIntegral $ w .&. 0x000000ff
>             b1 = fromIntegral $ (w .&. 0x0000ff00) `shiftR` 8
>             b2 = fromIntegral $ (w .&. 0x00ff0000) `shiftR` 16
>             b3 = fromIntegral $ (w .&. 0xff000000) `shiftR` 24

Then, we can define a lens focusing on a 32-bit value.

> compose :: [a -> a] -> a -> a
> compose = foldr (.) id
> 
> get32 :: Address -> EvalState -> Int32
> get32 base s =
>     bytesToWord (b0, b1, b2, b3)
>         where
>             b0 = s ^. memByte base
>             b1 = s ^. memByte (base + 1)
>             b2 = s ^. memByte (base + 2)
>             b3 = s ^. memByte (base + 3)
> 
> set32 :: Address -> EvalState -> Int32 -> EvalState
> set32 base s v =
>    compose
>        [ set (memByte base) b0
>        , set (memByte (base + 1)) b1
>        , set (memByte (base + 2)) b2
>        , set (memByte (base + 3)) b3
>        ] s
>         where
>             (b0, b1, b2, b3) = wordToBytes v
> 
> memWord :: Address -> Lens' EvalState Int32
> memWord addr = lens (get32 addr) (set32 addr)

The instruction set reference defines a handy operator to shift an address by a word offset:

> (@@) :: Address -> Int32 -> Address
> a @@ n = a + 4 * n

It will be also handy to access the memory in list chunks:

> mem8s :: Address -> Int32 -> Lens' EvalState [Word8]
> mem8s base len = lens getList setList
>     where
>         getList s =
>             map (\ off -> s ^. memByte (base + off)) [0 .. len - 1]
>         setList s0 ws =
>             compose (zipWith (\ off w -> set (memByte (base + off)) w) [0..] ws) s0

Instruction decoding

Instructions are usually encoded on a single byte: the opcode is in the first nibble, and a parameter is in the second one. For example this is how a LDC (load constant) is encoded:

   .--- 0x40 LDC
  |.---          0x5
  ||
0x45         LDC 0x5

This only works for 4-bytes constants. To load bigger constants, there is a “prefix” operation that will shift the current operand:

   .-------- 0x20 PFX
  |.--------          0x2
  ||
  ||    .--- 0x40 LDC
  ||   |.---          0x5
  ||   ||
0x22 0x45    LDC 0x25

Those are chainable; for example 0x21 0x22 0x45 encodes LDC 0x125.

Another prefix shifts and complements the current operand value:

   .-------- 0x60 NFX
  |.--------          0x2
  ||
  ||    .--- 0x40 LDC
  ||   |.---          0x5
  ||   ||
0x62 0x45    LDC (~0x25)

The ST20 architecture actually provides two type of instructions:

We know enough to draft an instruction decoder.

> data PInstr = AJW   | ADC
>             | LDC   | STL
>             | LDL   | LDNL
>             | LDLP  | LDNLP
>             | CJ    | J
>             | EQC   | CALL
>             | STNL
>     deriving (Eq, Ord, Show)
> 
> data SInstr = PROD  | MINT   | GAJW
>             | LDPI  | OUT    | IN
>             | LB    | XOR    | SB
>             | BSUB  | SSUB   | DUP
>             | GTx   | WSUB   | AND
>             | RET   | GCALL  | SHR
>             | SHL   | REM
>     deriving (Eq, Ord, Show)
> 
> data Instr = Pri PInstr Int32
>            | Sec SInstr
>     deriving (Eq, Ord)
> 
> instance Show Instr where
>     show (Pri p n) = show p ++ " " ++ show n
>     show (Sec s) = show s

Instruction decoding will need to move within the instruction stream, so it is part of the evaluation monad.

> decodeInstr :: Eval Instr
> decodeInstr = decodeInstr_ 0
> 
> decodeInstr_ :: Int32 -> Eval Instr
> decodeInstr_ acc = do
>     b <- peekAndIncr
>     let acc' = acc + fromIntegral (b .&. 0xf)
>     case () of
>         _ | b <= 0x0f -> return $ Pri J acc'
>         _ | b <= 0x1f -> return $ Pri LDLP acc'
>         _ | b <= 0x2f -> decodeInstr_ $ acc' `shiftL` 4
>         _ | b <= 0x3f -> return $ Pri LDNL acc'
>         _ | b <= 0x4f -> return $ Pri LDC acc'
>         _ | b <= 0x5f -> return $ Pri LDNLP acc'
>         _ | b <= 0x6f -> decodeInstr_ $ complement acc' `shiftL` 4
>         _ | b <= 0x7f -> return $ Pri LDL acc'
>         _ | b <= 0x8f -> return $ Pri ADC acc'
>         _ | b <= 0x9f -> return $ Pri CALL acc'
>         _ | b <= 0xaf -> return $ Pri CJ acc'
>         _ | b <= 0xbf -> return $ Pri AJW acc'
>         _ | b <= 0xcf -> return $ Pri EQC acc'
>         _ | b <= 0xdf -> return $ Pri STL acc'
>         _ | b <= 0xef -> return $ Pri STNL acc'
>         _             -> return $ Sec $ parseSecondary acc'
> 
> peekAndIncr :: Eval Word8
> peekAndIncr = do
>     addr <- use iptr
>     b <- use (memByte addr)
>     iptr += 1
>     return b
> 
> parseSecondary :: Int32 -> SInstr
> parseSecondary 0x01 = LB
> parseSecondary 0x02 = BSUB
> parseSecondary 0x06 = GCALL
> parseSecondary 0x07 = IN
> parseSecondary 0x08 = PROD
> parseSecondary 0x09 = GTx
> parseSecondary 0x0a = WSUB
> parseSecondary 0x0b = OUT
> parseSecondary 0x1b = LDPI
> parseSecondary 0x1f = REM
> parseSecondary 0x20 = RET
> parseSecondary 0x33 = XOR
> parseSecondary 0x3b = SB
> parseSecondary 0x3c = GAJW
> parseSecondary 0x40 = SHR
> parseSecondary 0x41 = SHL
> parseSecondary 0x42 = MINT
> parseSecondary 0x46 = AND
> parseSecondary 0x5a = DUP
> parseSecondary 0xc1 = SSUB
> parseSecondary b = error $ "Unknown secondary 0x" ++ showHex b ""

The two stacks

Data is manipulated using two different mechanisms: the integer stack and the workspace.

The integer stack is a set of three registers: A, B, and C, which can be used as a stack using these operations. Actually, it can only be manipulated through push and pop operations, so we represent this using a list.

The instruction set reference says that an undefined value will be popped if the stack is empty; here we consider that this will not happen, and allow a partial pattern matching.

> pushInt :: Int32 -> Eval ()
> pushInt n =
>     intStack %= (n:)
> 
> popInt :: Eval Int32
> popInt = do
>     (h:t) <- use intStack
>     intStack .= t
>     return h
> 
> popAll :: Eval (Int32, Int32, Int32)
> popAll = do
>     a <- popInt
>     b <- popInt
>     c <- popInt
>     return (a, b, c)

Only the head (A) can be directly accessed, so we first define a lens between a list and its head, and compose it with intStack.

> headLens :: Lens' [a] a
> headLens = lens head $ \ l x -> x:tail l
> 
> areg :: Lens' EvalState Int32
> areg = intStack . headLens

The workspace is a place in memory (pointed to by a register wptr) where local variables can be stored and loaded, a bit like a stack pointer. We first define push and pop operations.

> pushWorkspace :: Int32 -> Eval ()
> pushWorkspace value = do
>     wptr -= 4
>     var 0 .= value
> 
> popWorkspace :: Eval Int32
> popWorkspace = do
>     w <- use $ var 0
>     wptr += 4
>     return w

Then we define a lens to focus on a variable.

> var :: Int32 -> Lens' EvalState Int32
> var n =
>     lens getVar setVar
>         where
>             varLens s = memWord ((s ^. wptr) @@ n)
>             getVar s = s ^. varLens s
>             setVar s v = set (varLens s) v s

Input and output

The main particularity of the ST20 architecture is that it has hardware support of message channels. They map fairly naturally to Control.Concurrent.Chan channels. Each ST20 thread will have a map from channel numbers to input or output channels:

> getXChan :: (EvalEnv -> [(Int32, a)]) -> Int32 -> EvalEnv -> a
> getXChan member w st =
>     fromJust $ lookup w $ member st
> 
> getIChan :: Int32 -> EvalEnv -> IChannel
> getIChan = getXChan envInChans
> 
> getOChan :: Int32 -> EvalEnv -> OChannel
> getOChan = getXChan envOutChans

And these channels can be either a Chan Word8 or a plain Handle, to connect a thread to the process’ standard input and output.

> readFromIChan :: IChannel -> Int32 -> Eval [Word8]
> readFromIChan (InChan chan) n =
>     liftIO $ mapM (\ _ -> readChan chan) [1..n]
> readFromIChan (InHandle h) n =
>     liftIO $ do
>         bs <- BS.hGet h $ fromIntegral n
>         return $ BS.unpack bs
> 
> writeToOChan :: OChannel -> [Word8] -> Eval ()
> writeToOChan (OutChan chan) ws =
>     liftIO $ writeList2Chan chan ws
> writeToOChan (OutHandle h) ws =
>     liftIO $ do
>         BS.hPutStr h $ BS.pack ws
>         hFlush h

A few combinators

We first define a few combinators that will help us define the interpret function.

Pop two operands, and push the result:

> liftOp :: (Int32 -> Int32 -> Int32) -> Eval ()
> liftOp op = do
>     a <- popInt
>     b <- popInt
>     pushInt $ op a b

Exchange two registers:

> xchg :: Lens' EvalState Int32 -> Lens' EvalState Int32 -> Eval ()
> xchg l1 l2 = do
>   x1 <- use l1
>   x2 <- use l2
>   l1 .= x2
>   l2 .= x1

Convert a boolean to an integer:

> fromBool :: Bool -> Int32
> fromBool False = 0
> fromBool True = 1

The interpret function

The core of the interpreter is the following function. It takes an instruction and transforms it into a monadic action in Eval.

> interpret :: Instr -> Eval ()

Some cases are very simple.

> interpret (Pri AJW n) = wptr += 4 * n
> interpret (Pri LDNLP n) = areg += 4 * n
> interpret (Pri J n) = iptr += n
> interpret (Pri LDC n) = pushInt n
> interpret (Sec MINT) = pushInt 0x80000000
> interpret (Sec GAJW) = xchg areg wptr
> interpret (Sec GCALL) = xchg areg iptr
> interpret (Pri ADC n) = areg += n
> interpret (Pri EQC n) = areg %= (\ a -> fromBool $ a == n)

For some others, we can lift them into the host language and use Haskell operations.

> interpret (Sec PROD) = liftOp (*)
> interpret (Sec XOR) = liftOp xor
> interpret (Sec AND) = liftOp (.&.)
> interpret (Sec BSUB) = liftOp (+)
> interpret (Sec SSUB) = liftOp $ \ a b -> a + 2 * b
> interpret (Sec WSUB) = liftOp (@@)
> interpret (Sec GTx) = liftOp $ \ a b -> fromBool $ b > a
> interpret (Sec SHR) = liftOp $ \ a b -> b `shiftR` fromIntegral a
> interpret (Sec SHL) = liftOp $ \ a b -> b `shiftL` fromIntegral a
> interpret (Sec REM) = liftOp $ \ a b -> b `mod` a

Others need a few operations to prepare the operands and access memory.

> interpret (Sec SB) = do
>     a <- popInt
>     b <- popInt
>     memByte a .= fromIntegral b
> interpret (Sec DUP) = do
>     a <- popInt
>     pushInt a
>     pushInt a
> interpret (Pri STL n) = do
>     v <- popInt
>     var n .= v
> interpret (Pri LDLP n) = do
>     v <- use wptr
>     pushInt $ v @@ n
> interpret (Pri LDL n) = do
>     v <- use $ var n
>     pushInt v
> interpret (Sec LDPI) = do
>     ip <- use iptr
>     areg += ip
> interpret (Pri CJ n) = do
>     a <- popInt
>     let d = if a == 0 then n else 0
>     iptr += d
> interpret (Sec LB) = do
>     a <- use areg
>     a' <- fromIntegral <$> use (memByte a)
>     areg .= a'
> interpret (Pri STNL n) = do
>     a <- popInt
>     b <- popInt
>     memWord (a @@ n) .= b
> interpret (Pri LDNL n) = do
>     a <- use areg
>     a' <- use $ memWord $ a @@ n
>     areg .= a'

Call and return instructions use the workspace to pass arguments.

> interpret (Pri CALL n) = do
>     (a, b, c) <- popAll
>     pushWorkspace c
>     pushWorkspace b
>     pushWorkspace a
>     ip <- use iptr
>     pushWorkspace ip
>     areg .= ip
>     iptr += n
> interpret (Sec RET) = do
>     newIp <- popWorkspace
>     _ <- popWorkspace
>     _ <- popWorkspace
>     _ <- popWorkspace
>     iptr .= newIp

To perform I/O, the calling transputer needs to supply three things in the int stack:

The channel itself is abstracted in the transputer’s channel maps. Most reads succeed; however the first transputer’s channel 0 will read directly from a file, so it will reach end of file at some time. We can detect that when an empty list is read, and exit the process.

> interpret (Sec OUT) = do
>     (len, pChan, pMsg) <- popAll
>     message <- use $ mem8s pMsg len
>     chan <- asks $ getOChan pChan
>     writeToOChan chan message
> interpret (Sec IN) = do
>     (len, pChan, pMsg) <- popAll
>     chan <- asks $ getIChan pChan
>     input <- readFromIChan chan len
>     when (null input) $ liftIO exitSuccess
>     mem8s pMsg (fromIntegral $ length input) .= input

The core of the interpreter is then very simple:

> evalLoop :: Eval ()
> evalLoop = do
>     i <- decodeInstr
>     interpret i
>     evalLoop

Several things are missing: the memory map, and how the system boots.

It turns out that the ST20 has a very simple boot protocol:

> bootSeq :: Eval ()
> bootSeq = do
>     chan <- asks $ getIChan $ iPin 0
>     len <- head <$> readFromIChan chan 1
>     prog <- readFromIChan chan $ fromIntegral len
>     mem8s memStart (fromIntegral $ length prog) .= prog
>     wptr .= memStart + fromIntegral len

There’s some flexibility on memStart, but this value works:

> memStart :: Address
> memStart = 0x80000100

Pin numbers, however, are mapped to fixed address:

> iPin :: Int32 -> Int32
> iPin n = 0x80000010 @@ n
> 
> oPin :: Int32 -> Int32
> oPin n = 0x80000000 @@ n

We decide to initialize the memory with zeroes:

> initialMem :: Mem
> initialMem =
>     M.fromList $ zip [0x80000000 .. memEnd] $ repeat 0
>         where
>             memSize = 0x4000
>             memEnd = memStart + memSize - 1

Booting a transputer is then just a matter of reading from the correct channel and doing the rest of the evaluation loop.

> transputer :: Maybe Analysis
>            -> [((Int32, IChannel), (Int32, OChannel))]
>            -> IO (MVar ())
> transputer analysis cmap = do
>     let (imap, omap) = unzip cmap
>     fork $ runEval initialMem imap omap $ do
>         bootSeq
>         runAnalysis analysis
>         evalLoop

Multithreading boilerplate

If you fork threads and don’t wait for them, nothing will happen since the main thread will just exit. The solution is to create a “control” MVar that will be signalled to by each thread:

> fork :: IO () -> IO (MVar ())
> fork io = do
>     mvar <- newEmptyMVar
>     _ <- forkFinally io $ \ _ -> putMVar mvar ()
>     return mvar

And to wait for all of them:

> runAll :: [IO (MVar ())] -> IO ()
> runAll ms = do
>     threads <- sequence ms
>     mapM_ takeMVar threads

Connecting the lines

For this problem we have 13 transputers.

> data TransputerName = T00 | T01 | T02 | T03
>                     | T04 | T05 | T06 | T07
>                     | T08 | T09 | T10 | T11
>                     | T12
>   deriving (Enum, Eq)

We devise a way to connect them together. The communication between two transputers is bidirectional, so we need two channels. Each of them is converted to an OChannel on one side and an IChannel on the other one.

> connect :: TransputerName -> Int32
>         -> TransputerName -> Int32
>         -> IO [(TransputerName, Int32, OChannel, IChannel)]
> connect src srcPort dst dstPort = do
>   x <- newChan
>   y <- newChan
>   return [ (src, srcPort, OutChan x, InChan y)
>          , (dst, dstPort, OutChan y, InChan x)
>          ]

Booting them is a matter of creating the correct communication channels (this pinout list comes from a schematic that was present in the challenge files).

> main :: IO ()
> main = do
>     pins <- concat <$> sequence
>         [ connect T00 1 T01 0
>         , connect T00 2 T02 0
>         , connect T00 3 T03 0
>         , connect T01 1 T04 0
>         , connect T01 2 T05 0
>         , connect T01 3 T06 0
>         , connect T02 1 T07 0
>         , connect T02 2 T08 0
>         , connect T02 3 T09 0
>         , connect T03 1 T10 0
>         , connect T03 2 T11 0
>         , connect T03 3 T12 0
>         , connect T11 1 T12 1
>         ]
>     runAll $ map (buildTransputer pins) [T00 ..]
>         where
>             buildTransputer pins t =
>                 transputer (isDebug t) $ onlyFor t pins ++ extraPins t
>             pin n ochan ichan = ((iPin n, ichan), (oPin n, ochan))
>             onlyFor src l = [pin p oc ic | (name, p, oc, ic) <- l, name == src]
>             extraPins T00 = [((iPin 0, InHandle stdin), (oPin 0, OutHandle stdout))]
>             extraPins _ = []

Bonus: static analysis tools

The above transputer function is controlled by the following configuration:

> data Analysis = Graph | Disasm
> 
> isDebug :: TransputerName -> Maybe Analysis
> isDebug _ = Nothing

It means that for each transputer, we can choose to print a graph or a disassembly of the code that will be executed. To do that, we will first compute the set of all edges in the control flow graph.

This analysis relies on a nextInstr function that statically computes the set of next instructions. These can be reached either because it’s the next one in the instruction flow (DSeq), because of jump (DJmp), or an unknown destination, for example after a RET (DDyn).

> data Dest = DSeq Address
>           | DJmp Address
>           | DDyn
>     deriving (Eq, Ord)
> 
> nextInstrs :: Instr -> [Dest]
> nextInstrs (Pri CJ n) = [DSeq 0, DJmp n]
> nextInstrs (Pri J n) = [DJmp n]
> nextInstrs (Pri CALL n) = [DSeq 0, DJmp n]
> nextInstrs (Sec GCALL) = [DDyn]
> nextInstrs (Sec RET) = [DDyn]
> nextInstrs _ = [DSeq 0]

We can wrap this function in a monadic one that can turn these relative addresses into absolute ones (since it can know the addresses of functions).

> type EdgeSet = S.Set (Address, Instr, Dest)
> 
> instrDests :: Address -> Eval EdgeSet
> instrDests start = do
>     iptr .= start
>     i <- decodeInstr
>     let deltaips = nextInstrs i
>     new <- use iptr
>     return $ S.fromList $ map (\ d -> (start, i, adjust new d)) deltaips
>         where
>             adjust n (DSeq d) = DSeq $ n + d
>             adjust n (DJmp d) = DJmp $ n + d
>             adjust _ DDyn = DDyn

Then, the algorithm consists in computing the fixpoint of the following iterating function:

> step :: EdgeSet -> Eval EdgeSet
> step s = do
>     xs <- mapM (basicBlockM . getDest) $ S.toList s
>     return $ S.union s $ S.unions xs
>         where
>             getDest (_, _, DSeq a) = Just a
>             getDest (_, _, DJmp a) = Just a
>             getDest (_, _, DDyn) = Nothing
>             basicBlockM (Just a) = instrDests a
>             basicBlockM Nothing = return S.empty

The fixpoint itself is computed using the following function, which takes a predicate on two EdgeSets to stop the iteration.

> stepUntil :: ((EdgeSet, EdgeSet) -> Bool) -> (EdgeSet, EdgeSet) -> Eval EdgeSet
> stepUntil p (a, b) | p (a, b) = return b
> stepUntil p (_, b) = do
>     c <- step b
>     stepUntil p (b, c)

We’ll stop when their size is equal.

> runAnalysis :: Maybe Analysis -> Eval ()
> runAnalysis Nothing = return ()
> runAnalysis (Just analysis) = do
>     s0 <- instrDests memStart
>     let p (a, b) = S.size a == S.size b
>     r <- stepUntil p (S.empty, s0)
>     liftIO $ putStrLn $ convert analysis r
>     iptr .= memStart

Finally, here is how to convert the EdgeSets in a human-readable form.

> convert :: Analysis -> EdgeSet -> String
> convert Graph es =
>     "digraph G{\n"
>     ++ "node[shape=point]\n"
>     ++ concatMap edge (S.toList es)
>     ++ "}"
>         where
>             edge (x, i, y) = show x ++ " -> " ++ toNode x y ++ "[label=\"" ++ show i ++ "\"];\n"
>             toNode _ (DSeq a) = show a
>             toNode _ (DJmp a) = show a
>             toNode x DDyn = "dyn" ++ show x
> convert Disasm es = concatMap go $ S.toList es
>     where
>         go (x, i, DSeq _) =
>             printf "%04x %s\n" x (show i)
>         go (x, i, DJmp y) =
>             printf "%04x %s  [* %04x]\n" x (show i) y
>         go (x, i, DDyn) =
>             printf "%04x %s  [* dyn]\n" x (show i)

For example here is an extract of the beginning of the first transputer. You can notice instructions with several destinations (conditional jumps) are displayed twice.

80000100 AJW -76
80000102 LDC 0
80000103 STL 1
80000104 LDC 0
80000105 STL 3
80000106 MINT
80000108 LDNLP 1024
8000010b GAJW
8000010d AJW -76
8000010f LDC 201
80000111 LDPI
80000113 MINT
80000115 LDC 8
80000116 OUT
80000117 LDLP 73
80000119 MINT
8000011b LDNLP 4
8000011c LDC 12
8000011d IN
8000011e LDL 73
80000120 CJ 21
80000120 CJ 21  [* 80000137]
80000122 LDC 205
80000124 LDPI

For the graph output, I assume that you have already seen graphviz output:

T03 with dot driver

T03 with dot driver

The introduction image was done using the same output but an alternative layout engines.

Hope you enjoyed this article!