Getting Started

backprop is a Haskell library available on Hackage, so you can depend on it the same way you would any other library. Be sure to add it to your cabal file's (or package.yaml's) build-depends field.
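For example, in a .cabal file the stanza might look like this (no version bounds shown; pin them as your project requires):

```
build-depends:
    base
  , backprop
```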

Automatic Backpropagated Functions

With backprop, you can write your functions in Haskell as normal functions:

import Numeric.Backprop

myFunc x = sqrt (x * 4)

They can be run with evalBP:

evalBP myFunc 9 :: Double => 
6.0

And...the twist? You can also get the gradient of your functions!

gradBP myFunc 9 :: Double => 
0.3333333333333333

We can even be cute with the simple-reflect library:

evalBP myFunc x :: Expr => 
sqrt (x * 4)
gradBP myFunc x :: Expr => 
4 * (1 / (2 * sqrt (x * 4)))

And that's the gist of the entire library: write your functions to compute your things, and gradBP will give you the gradients and derivatives of those functions.
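Put together as a complete, compilable module (the main wrapper is my own addition, not part of the library's examples), the pattern looks like this:

```haskell
import Numeric.Backprop

-- an ordinary numeric function; because it is polymorphic over
-- Floating, it also works on backprop's BVar values
myFunc :: Floating a => a -> a
myFunc x = sqrt (x * 4)

main :: IO ()
main = do
    print (evalBP myFunc (9 :: Double))   -- the function's value at 9
    print (gradBP myFunc (9 :: Double))   -- its derivative at 9
```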

Multiple Same-Type Inputs

Multiple inputs of the same type can be handled with sequenceVar:

-- requires -XViewPatterns
funcOnList (sequenceVar->[x,y,z]) = sqrt (x / y) * z
evalBP funcOnList [3,5,-2] :: Double => 
-1.5491933384829668
gradBP funcOnList [3,5,-2] :: [Double] => 
[ -0.2581988897471611
, 0.15491933384829668
, 0.7745966692414834]

Heterogeneous Backprop

But the real magic happens when you mix and match types. Let's make a simple type representing a feed-forward fully connected artificial neural network with 100 inputs, a single hidden layer of 20 nodes, and 5 outputs:

data Net = N { _nWeights1 :: L 20 100
             , _nBias1    :: R 20
             , _nWeights2 :: L  5  20
             , _nBias2    :: R  5
             }
  deriving (Show, Generic)

instance Backprop Net

-- requires -XTemplateHaskell
makeLenses ''Net

using the L m n type from the hmatrix library to represent an m-by-n matrix, and the R n type to represent an n-vector.

We can write a function to "run" the network on a R 100 and get an R 5 back, using ^^. for lens access and #> from the hmatrix-backprop library for matrix-vector multiplication:

runNet net x = z
  where
    -- run first layer
    y = logistic $ (net ^^. nWeights1) #> x + (net ^^. nBias1)
    -- run second layer
    z = logistic $ (net ^^. nWeights2) #> y + (net ^^. nBias2)

logistic :: Floating a => a -> a
logistic x = 1 / (1 + exp (-x))

We can run this with a network and input vector:

evalBP2 runNet myNet myVector => 
[0.7710060782631345,0.3382199144893642,0.4700125359176054,0.45157174197218575,0.5091200271700325]

But --- and here's the fun part --- if we write a "loss function" to evaluate "how badly" our network has done, using dot from the hmatrix-backprop library:

squaredError target output = err `dot` err
  where
    err = target - output

we can "test" our networks:

netError target input net = squaredError (auto target)
                                         (runNet net (auto input))

(auto lifts a constant value into a BVar; more on auto later)

evalBP (netError myTarget myVector) myNet => 
0.4510551183616181

At this point, we've written a normal function to compute the error of our network. And, with the backprop library...we now have a way to compute the gradient of our network's error with respect to all of our weights!

gradBP (netError myTarget myVector) myNet => 
N {_nWeights1 = (matrix -- ...
 [    4.11263641483051e-4,  1.2452768908611275e-4,   4.945692214073669e-4, -1.8787627939922071e-6,   -- ...
 ,   4.456096403579713e-3,  1.3492741188636985e-3,   5.358723472095238e-3, -2.0356645433810056e-5,   -- ...
 ,  -2.363504793989016e-3,  -7.156523466991931e-4,   -2.84225642106488e-3,  1.0797124818415972e-5,   -- ...
 ,   6.226054129738608e-3,  1.8852046587574407e-3,   7.487203907160475e-3, -2.8442287799021238e-5,   -- ...
-- ...

We can now use the gradient to "train" our network to give the correct responses given a certain input! This can be done by computing the gradient for every expected input-output pair, and adjusting the network in the opposite direction of the gradient every time.
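A single such update might look like the following sketch. The learning rate r is a hand-picked hyperparameter, and the subtraction assumes a Num instance for Net (not defined above; it could be derived generically, e.g. with the one-liner-instances package), so none of this is prescribed by the library itself:

```haskell
-- one step of gradient descent on the network's error;
-- assumes a Num instance for Net (hypothetical here)
trainStep :: Double -> R 5 -> R 100 -> Net -> Net
trainStep r target input net =
    net - realToFrac r * gradBP (netError target input) net
```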

Main Idea

The main pattern of usage for this library is:

  1. Write your function normally to compute something (like the loss function)
  2. Use gradBP to automatically get the gradient of that something with respect to your inputs!

In the case of optimizing models, you:

  1. Write your function normally to compute the thing you want to minimize
  2. Use gradBP to automatically get the gradient of that quantity with respect to your inputs. Then, adjust your inputs against the gradient, repeating until you reach a minimum!
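The steps above can be sketched in a self-contained way on a toy problem (the function, step size, and iteration count here are illustrative choices, not anything from the library): minimize f(x) = (x - 3)^2 by repeatedly stepping against the gradient that gradBP computes.

```haskell
import Numeric.Backprop

-- the quantity we want to minimize
f :: Floating a => a -> a
f x = (x - 3) ^ (2 :: Int)

-- repeatedly step against the gradient; rate and iteration
-- count are hand-chosen hyperparameters
minimizeBy :: Double -> Int -> Double -> Double
minimizeBy rate n = (!! n) . iterate step
  where
    step x = x - rate * gradBP f x
```

For example, minimizeBy 0.1 100 0 converges to (approximately) 3, the minimum of f.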

Now that you've had a taste, let's look at the details. You can also just go ahead and jump into the Haddock documentation!