Chris Donnan : Programming – Brooklyn Style
software, trading, family, fun
Posted erlang, functional programming on Sunday, August 9th, 2009.
Why blog about Erlang now?
For me, writing blog posts is often a therapeutic way to work out puzzles and clarify my own understanding of a concept. I have been writing Erlang on and off for around 1.5 years now. As of late I have also been typing a lot more Haskell and some OCaml. This has all been great and I love these languages. I have decided to commit a bit more to Erlang lately and I thought I would start blogging again in earnest – mostly with small-ish Erlang examples, just to force myself to explain the concepts. I can program in Erlang (to some level at least) – now I want to be good enough to explain Erlang software to people.
Essentially – I think that Erlang the language is OK, I think that Erlang the runtime is amazing, and I think that OTP – the common set of Erlang behaviors – is very powerful. The entire ecosystem adds up to a very compelling offering. You can build massively parallel, manageable software in pieces of code small enough that they are ownable. So – Erlang it is…
So in that spirit – I will start here with some very simple Erlang concepts.
Erlang is Dynamically Typed
Erlang is a dynamically typed language that, to be honest, I found pretty ugly (syntactically) at first. I have done plenty of programming in dynamic languages – especially in Ruby (and to a slightly lesser extent Python). I really do like dynamic languages, but I do believe that in larger software projects today static typing can buy you a generally higher degree of assertable ‘correctness’. I prefer Haskell’s type system to many others – but my basic point is that I think types are powerful. In any case – I still think Erlang’s total offering is powerful enough to make me concede this point.
While Erlang is dynamically typed it does have a simple way to organize primitive data elements into composite types – this is Erlang records. You can also of course use tuples (a sort of anonymous type). In addition, Erlang code is organized into modules. So – while we may organize our code differently than in other languages – we have some good organizational concepts.
Modules
-module(exchange).
-export([addOrder/2,exchangeLoop/0]). % exchangeLoop's arity (0) is assumed here
At the top of our .erl Erlang file, we put the name of our module. We also put the names of the functions we want to make visible to the outside world. Note the . at the end of each line. This is the way we tell Erlang that we are at the end of a statement. In the export attribute – we give it some stuff inside the []s. Those [...] bits are how we declare a list. Inside the list we are giving it the names of the functions we want to export, each along with its arity – the # of arguments it takes – written as name/arity. This gives us a module named exchange that exports 2 functions to be used by the outside world.
Records and Tuples
As I mentioned – Erlang lets us organize primitive elements into composite data types. The anonymous way of doing this is tuples. The ‘named’ way of doing this is records.
-record(order,{market, price, size, side}).
This is the declaration of a record in Erlang. You can create an instance of a record like this:
#order{market=vodln,price=127.95,size=350,side=bid}
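It is essentially a nice dress on top of a tuple. Written as a plain tuple, the same data would look like this:
{vodln,127.95, 350, bid}
So – records are really just nice ways to put together composite types with names (under the hood a record is a tuple whose first element is the record name, so the order above is stored as {order, vodln, 127.95, 350, bid}). Tuples are ways to assemble composite types anonymously. Both have their uses.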
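To make the record/tuple relationship concrete, here is a minimal sketch. The module name order_demo and its function names are made up for illustration; only the order record comes from the example above. It relies on the fact that a record is compiled down to a tuple tagged with the record name.

```erlang
-module(order_demo).
-export([new_order/0, as_tuple/0, same/0]).

%% The record from the post.
-record(order, {market, price, size, side}).

%% Build the order using record syntax.
new_order() ->
    #order{market = vodln, price = 127.95, size = 350, side = bid}.

%% The same value written by hand as a tagged tuple.
as_tuple() ->
    {order, vodln, 127.95, 350, bid}.

%% Compares the two representations with exact equality.
same() ->
    new_order() =:= as_tuple().
```

Calling order_demo:same() in the shell compares the two forms directly, which is a handy way to convince yourself that the record syntax is only compile-time sugar.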
Object Orientation vs Functional Programming
Notice that these records/ tuple things have no methods ‘on’ them. Typically when you declare a type in OO languages – you put methods on them. This whole bundling up of methods and data IS OO programming. Functional Programming lets you have data and describe functionality. What this means is that it is usually easier to add more functions on a given data type in a functional language – but harder to change the data structures. This is because in an OO language – the data is WITH the functionality. In a functional language the data is apart from the functionality. It is all a tradeoff, but I think the functional paradigm is often much closer to the domains I have been working in (trading specifically). So – FP takes away the general encapsulation of data/ methods together. Note that you can still hide private functions inside a module. Eg; you implement a module and export 1 of 10 methods. This is still a good way to expose a minimal surface area of your software to the outside consumer of it.
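As a rough sketch of the ‘export 1 of N functions’ idea: the module below keeps the data (the order record) separate from the functions that work on it, and exposes only one function to the outside world. The module name order_math and all three function names are hypothetical, not part of my exchange code.

```erlang
-module(order_math).
%% Only notional/1 is visible outside this module.
-export([notional/1]).

-record(order, {market, price, size, side}).

%% Public: price * size, rounded to cents.
notional(Order) ->
    round_cents(raw_notional(Order)).

%% Private helper: pattern match pulls the fields out of the record.
raw_notional(#order{price = Price, size = Size}) ->
    Price * Size.

%% Private helper: round a number to two decimal places.
round_cents(X) ->
    round(X * 100) / 100.
```

Adding another function over orders means adding another top-level function here; nothing about the record itself has to change. Calling order_math:raw_notional/1 from another module fails, because it was never exported.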
Functions
Functions are, as you may have guessed, at the heart of Erlang.
levelFor(Order, Exch) when Order#order.side == bid ->
    levelForSide(Order, Exch, bid, fun(E) -> E#exch.bids end);
levelFor(Order, Exch) when Order#order.side == offer ->
    levelForSide(Order, Exch, offer, fun(E) -> E#exch.offers end).
Here we have an example function that I will use as a demonstration for several points.
- You declare functions in Erlang as a top level item. You do not put them inside of records, you put them in a module – but that is it.
- Variables (including function parameters) all start with upper case letters (I found this very odd at 1st).
- You can declare the same function N times, each with different ‘guard clauses’, and the order of the declarations is the order in which the runtime tries to match them.
- This example shows how to extract values from records.
- This example shows simple funs- anonymous functions.
Declaring functions, Guard Clauses, Extracting data from records
“levelFor” is the name of our function. This function takes 2 arguments named Order and Exch. The arguments come after the name in the (…).
After that is a guard clause – this ‘when’ is evaluated by the runtime to see if that clause of the function should be executed. So – we can see in case 1 – if the side == bid – then we will execute that clause. The actual function body comes after the ->.
You can extract values from records using this syntax: Variable#recordName.fieldName. Our example above has several of those.
Funs and multiple function Declarations
The bit after the -> is invoking another function and passing in, as the last argument, an anonymous function: the bit that starts with fun(E) and ends with “end”. This is simply an unnamed function (delegate, function ptr, etc) passed to the next function.
Also notice that I declared the function 2x – at the end of the 1st declaration I put a ; and at the end of the 2nd I put a “.” (a dot). This is a convenient way to get switch statements out of our code nesting. We use the guard clauses to decide which of the declarations gets picked at runtime. We could have also used if or case statements – but for this example’s sake – we are not.
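Here is a self-contained sketch that puts the pieces together so it can be compiled and run. The levelFor clauses are the ones from above; everything else is an assumption made for illustration: the shape of the exch record, the body of levelForSide (here it just returns the resting orders at the same price) and the levelForCase and demo functions.

```erlang
-module(exchange_sketch).
-export([levelFor/2, levelForCase/2, demo/0]).

-record(order, {market, price, size, side}).
%% Assumed shape: an exchange holds a list of bids and a list of offers.
-record(exch, {bids = [], offers = []}).

%% Guards pick the clause; the fun tells the helper which field to read.
levelFor(Order, Exch) when Order#order.side == bid ->
    levelForSide(Order, Exch, bid, fun(E) -> E#exch.bids end);
levelFor(Order, Exch) when Order#order.side == offer ->
    levelForSide(Order, Exch, offer, fun(E) -> E#exch.offers end).

%% The same dispatch written with a case expression instead of guards.
levelForCase(Order, Exch) ->
    case Order#order.side of
        bid -> levelForSide(Order, Exch, bid, fun(E) -> E#exch.bids end);
        offer -> levelForSide(Order, Exch, offer, fun(E) -> E#exch.offers end)
    end.

%% Hypothetical body: call the fun to get one side of the book, then
%% keep the resting orders that sit at the incoming order's price.
levelForSide(Order, Exch, Side, GetSide) ->
    Resting = GetSide(Exch),
    {Side, [O || O <- Resting, O#order.price == Order#order.price]}.

%% Both dispatch styles should agree on the same input.
demo() ->
    Bid = #order{market = vodln, price = 127.95, size = 350, side = bid},
    Exch = #exch{bids = [Bid], offers = []},
    levelFor(Bid, Exch) =:= levelForCase(Bid, Exch).
```

The guard version keeps each branch at the top level of the module, while the case version nests the branches inside one function body. Which reads better is mostly a matter of taste and of how many branches there are.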
OK – that is all the time I have for today… more next weekend
-Chris
Posted .net, c#, erlang, functional programming, haskell on Tuesday, July 28th, 2009.
So – I am deep into Erlang (still learning after > 1.5 years), getting into Clojure (new) and getting into Haskell (a few months now). I have decided to spurn the OCaml/ F# branch of the world at least for a little while…
Anyhow – Erlang is the functional language I have the most experience with, yet next to Haskell – the LANGUAGE feels weak. Erlang the language + OTP + the Erlang VM runtime is amazing but the Language itself is not as lovely as Haskell. When I am typing C# code, I am finding myself more and more functionally minded. For a few years now I have felt the move to functional programming on me. C# has had great improvements – but it still feels way too brittle and verbose next to Haskell.
I recently read these articles:
Haskell for C# Programmers Part 2: Understanding IO.
Haskell for C# Programmers Part 3: Visualizing Monads
I found them helpful understanding monads as someone with a large C# experience. I also found this one:
Monads as containers very helpful. I will admit to having read dozens of articles on monads and just now really getting to a point where I think I can actually perceive the need for writing one of my own.
Functional programming style has made a great difference to my personal style in several languages. I like programming languages and I am literate in many. Seek the monad, but 1st seek to understand the primitive bits of functional programming and it will aid your programming in any language.
Posted C++, algorithmic trading, coding, functional programming on Sunday, July 12th, 2009.
I have been thinking quite a lot about a few programming topics:
- Correctness – how do we make it easy to write correct software
- Size – how do we curb growth in size of our software (characters – not just lines – it all counts!)
- Understandability – how do we keep our software simple looking
- Parallelism – how do we really scale out across CPUs, machines from a language perspective
- Performance – how do we keep all of these other things AND have performance
All in all – I am after a functional or functional-hybrid language in order to achieve improved correctness, reduce size, increase understandability and promote parallelism. I have been working on a ‘pet project’ – a mini-exchange for matching orders – a limit order book system. This has got me thinking about extreme performance.
Recently a former colleague went to work for the LSE, while another went to work for a low latency trading group @ one of the banks. In light of recent events (the LSE is going to abandon Windows) and what I hear of real-world experience – I think Windows is simply not the tool for exchanges/ low latency trading – at least not now. This is much debated and I do not want to debate it – just to state my opinion.
Windows is not the solution for high frequency trading ’nuff said
Now – I have classically spent most of my professional development career working on windows machines developing software to be used by traders. I have done Java programming, some C++ programming, ruby and a zillion other languages for play.
Here is a good pointer to a set of writings on Windows vs *nix performance.
What functional languages offer what performance?
Functional languages will win in the end for all the goals I have - especially the languages that allow some OO-isms. I am really a fan of several functional languages. I have been doing small to mid size experimental projects with Clojure, OCaml, F#, Clean, Haskell and a few others. I have done some work on my mac, some on windows, some on Ubuntu. At the end of the day, this is all well and good – and with my 1st caveat (not just windows), the next question is what language on what *nix?
Here are a few links to the great language/ performance shootout
- Ubuntu 64 Bit Quad Core
- Gentoo P4
- Ubuntu 64 Bit Quad Core OCaml vs C++ Intel
- Ubuntu 64 Bit Quad Core OCaml vs Clean
You can see a few basic things here. Currently - C++ GNU g++ is the best perf across the board- not surprising at all.
Essentially we see that OCaml is fast – 1-3x slower than C++ at most. Memory is similar – in certain tests OCaml wins, but mostly it uses 1-3x the memory. On code size OCaml unsurprisingly wins – its code is typically 1/2 the size of the C/C++ code. Clean is also fast in many cases. In certain cases it outperforms OCaml, but mostly OCaml wins.
In many – less trivial cases I bet that OCaml/ functional solutions will compare with C/C++ because it will be easier to make them small, understandable, less redundant etc. For ‘benchmarks’ where you can optimize the code in a small area – you can be as obtuse/ obscure as you need to in order to optimize for the benchmark. It becomes increasingly difficult to scale your software and maintain its performance if it is in some obscure form. Program size and understandability become very important – even to performance as the software scales up.
We need a higher performance functional language in order to compete in the arms race vs uber low-latency trading systems, so that we can write low latency trading systems that are competitive with C based systems on *nix machines. 3x slower worst case is still way too slow. It is not 30x slower like F#/ mono – but it is still too slow at that level of competition. Haskell is doing better these days – ranking 6th after C/ C++ and Java server. OCaml and Clean are a bit behind – but doing OK relative to the bunch.
Abstraction costs performance
Ultimately C/C++ abstract the computer for the most part – they abstract the computer’s memory and CPU resources. Functional languages are more abstract – they try to hide away the hardware-ness of the computer. This has a cost. The cost is still performance. What you get for that cost is size, understandability etc. Hopefully we can get the perf more in line and then we will be onto something…
OCaml Trading
Old news, but an excellent watch – Jane Street does trading using OCaml, here is a video…
http://video.google.com/videoplay?docid=-2336889538700185341
ATS
Interestingly ATS is a programming language I never heard of. According to Wikipedia:
The performance of ATS has been demonstrated to be comparable to that of the C and C++ programming languages.
and
ATS is derived mostly from the ML and Objective Caml programming languages.
Now this is interesting…. According to the most relevant current shootout mentioned above, ATS is at worst 1.83x worse than C/C++. This is not a bad start. I bet that we could improve that number and we would have a functional – OCaml like programming language that is nearer to competitive with C/C++….
Posted .net, F#, functional programming, programming on Monday, July 6th, 2009.
LAgent : an agent framework in F# – Part I – Workers and ParallelWorkers
LAgent : an agent framework in F# – Part II – Agents and control messages
LAgent: an agent framework in F# – Part III – Default error management
LAgent: an agent framework in F# – Part IV – Custom error management
LAgent: an agent framework in F# – Part V – Timeout management
LAgent: an agent framework in F# – Part VI – Hot swapping of code (and something silly)
Posted c#, concurrency, erlang, functional programming, scala on Saturday, May 9th, 2009.
Due to the nature of my current project, and more-or-less my career developing trading related software – I have been thinking hard about concurrency lately. This is not a new thing, but I feel for the 1st time in some years that we are really ready to have some substantial changes in the ways that we write highly concurrent software.
Pub/sub messaging
I have had some great experience doing pub/sub messaging work in the past, specifically @ Merrill Lynch, and to a lesser degree at Morgan Stanley years back now. As it goes, many applications in finance tend to be based on pub/sub messaging in one form or another. These types of systems create a few dynamics that are interesting. Incoming messages are coming from elsewhere on the network so they are inherently immutable. As it goes, immutability winds up being very important – essentially having as little shared state as possible is key. Messaging systems tend to make this true to some degree at least.
Immutability
We are constantly trying to make more and more immutable in my current project. I have been looking a whole bunch at BCL Extras (more on that below) for immutable collections. (When we are doing ‘edits’ we are basically copy-on-write since we are still working on mutable systems to one degree or another). In any case, the whole immutability thing simply enables simpler concurrent access; things can’t change underneath you if they can’t change at all…
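To illustrate the copy-on-write idea in a language where it is the default, here is a small Erlang sketch (not the C# code from my project). The module name immutable_demo and the order record are assumptions for the example. An ‘edit’ of a record produces a new value and leaves the original untouched, so anyone still holding the old value never sees it change.

```erlang
-module(immutable_demo).
-export([demo/0]).

-record(order, {market, price, size, side}).

demo() ->
    Old = #order{market = vodln, price = 127.95, size = 350, side = bid},
    %% Record update syntax builds a new record; Old is not modified.
    New = Old#order{size = 200},
    {Old#order.size, New#order.size}.
```

immutable_demo:demo() returns the size seen through the old value and the size seen through the new one, as a pair.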
(Better Concurrent Collections)
This is an aside, but I am also working hard at creating/ finding/ using better concurrent collections that are harder to use incorrectly. There are some good articles on low/ no lock techniques and Jared Parsons has a good post here.
BCL Extras
BCL Extras is an interesting library written by Jared Parsons. The most interesting bits in here are good implementations of immutable collections, including a good immutable map. We are currently trying to find a good way to use immutable collections alongside our better ‘concurrent mutable’ collections (low locking, still mutable, etc). I recommend just checking this library out – it is along many of my favorite trajectories: Concurrency, Immutability, Functional Programming and LINQ.
Scala, Erlang
Erlang and more recently Scala use the messaging/ actor based concurrency model. People wrote much more intelligently than I on this here, here, here, etc.
Axum
Axum (formerly called Maestro) is here. This is an incubation project at Microsoft for implementing the actor/ message concurrency model natively in the .NET framework. Their team blog is here. Axum is pretty exciting – if it ever makes it out of incubation.
Retlang
Written mostly by Mike Rettig for use at DRW Trading. Retlang is most interesting to me because #1 it is usable today in the context of my current project (unlike Scala/ Erlang/ Haskell, Axum, etc), #2 it is being used today for real world applications, and #3 a former colleague, Mike Roberts, and I exchanged a few messages on it and he gave it his ‘thumbs up’. Personal vouchers from trustworthy sources are very useful. His slides for his presentation on Jetlang/ Retlang are here.
Retlang feels like a pub/ sub messaging system, the difference is that you use it in process. The model that has worked successfully for messaging and driven a certain way of working that removes your locking code etc. is just used in process to give you similar dynamics.
Retlang gives you channels and fibers as its basic bits and pieces. You pub at one end of a channel, you sub at the other end of the channel. It gives you execution queues that your components/ subscribers feed from, so they can act as if they are all working on 1 thread. This means you can punt the locking out of your business logic.
Again, with this model, within a fiber, you think about everything as single threaded. Dependency injection is for getting the assembly out of the business logic. With retlang (the messaging/ actor model) we get the multi-threaded-ness out of the business logic and it is wired together. Everything in a fiber is just a component/ set of classes that are wired together and are in a single thread together. We now can punt all the locking, etc code OUT of our business logic code.
Channels are pub/ sub conduits and fibers are the wiring of what is executing on the same thread. You can have 1 fiber that listens to N channels, and all the subscribers on that fiber will be executing in the same thread. This is SO much like dependency injection it astounds me. We get to NOW assume we are single threaded within this component.
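The ‘single threaded inside the fiber’ idea is the same one the Erlang actor model gives you, so here is a rough sketch of it in Erlang rather than in Retlang itself (this is not Retlang's API). The module name fiber_sketch, the message shapes and the function names are all made up for illustration. One process plays the role of the fiber: it owns its state and handles one message at a time, so the ‘business logic’ in loop/1 needs no locks.

```erlang
-module(fiber_sketch).
-export([start/0, publish/2, total/1, demo/0]).

%% Start the "fiber": a process whose only state is a running total.
start() ->
    spawn(fun() -> loop(0) end).

%% Publish a fill to the fiber (asynchronous, like a channel publish).
publish(Pid, Size) ->
    Pid ! {fill, Size},
    ok.

%% Ask the fiber for its current total and wait up to 1s for the reply.
total(Pid) ->
    Ref = make_ref(),
    Pid ! {total, self(), Ref},
    receive
        {Ref, Total} -> Total
    after 1000 ->
        timeout
    end.

%% Messages are handled strictly one at a time, so no locking is needed.
loop(Total) ->
    receive
        {fill, Size} ->
            loop(Total + Size);
        {total, From, Ref} ->
            From ! {Ref, Total},
            loop(Total)
    end.

demo() ->
    Pid = start(),
    lists:foreach(fun(N) -> publish(Pid, N) end, [100, 250, 50]),
    total(Pid).
```

Because messages from one sender to one process arrive in the order they were sent, the total request in demo/0 is handled after the three fills. With several publishers the interleaving would vary, but loop/1 would still only ever see one message at a time.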
This all sounds good to me. I am going to seriously consider Retlang currently.
Conclusions
I really think we are at the edge of changing the way that we build concurrent applications. We need to. Building concurrent apps the way we do today is too hard. Really it is. I am looking forward to working hard to make it better so I can deliver better software, faster to my desks!
-Chris-
Posted User Interfaces, functional programming, haskell, programming on Sunday, May 18th, 2008.
This is a great Google Tech Talk on Functional Programming and GUI Development. These are of course 2 things that are near and dear to my programming heart.
The basic concepts that I liked are:
- Functional programming = value oriented programming
- UIs are visualizations of values.
- You expose the parameterization of the functions to users.
- You want to use the unix pipe concepts to pipe inputs/ outputs to each other.
Essentially, the difficult bit for me is that much of the work that I do is about passing around an ‘entity’, then editing, maintaining and persisting its state. I am sure there is a good FP answer to this, but I need to see how to fit it into my mental development model.
If I view Excel as the de-facto FP app – I am sure that I can find a way to map my world to a more pure FP stance.
The things randomly rattling around in my brain – vaguely related are:
- Column oriented databases (Bigtable, HBase, etc)
- Google AppEngine’s data model – especially expando types (basically each row may have different or extra columns)
- I am increasingly interested in concepts from Lisp – like the fact that types are just lists of attributes.
Putting together those ideas gets you something like bigtable for persisting potentially sparse-ish, or versions of, or extended type hierarchies of values sets. Value sets are basically attribute lists.
All of that gives a different world than I have today. I cannot say where it is going for sure, but all of this gives me a scent of the next set of dominant paradigms. Learning, building, advancing, etc.
-Chris
