May 2009
Chris Donnan : Programming – Brooklyn Style

software, trading, family, fun

Axum Genetic Algorithm Example

I have worked for some years on genetic, evolutionary, memetic and other optimisation software – specifically in the context of automated/ systematic trading. I have also worked for some years on very multi-threaded software systems. I have been interested in the actor model for writing concurrent software.

Axum

Axum is an incubation project at Microsoft implementing the actor model for the CLR. Others have posted on it – Matthew Podwysocki has two posts here and here that are worth a look. The Axum team blog is here. Have a read if you can; they are much more ‘on it’ wrt Axum than I.

Genetic Algorithms

Genetic algorithms are a class of algorithms from the school of ‘evolutionary computation’. I will not go into a study here, just google genetic algos. Needless to say – this example is a very naive example. I tried pretty hard to use any of my ‘real’ evolutionary algorithm libraries, but they just did not work easily enough with Axum. More on that later…

Caveats

Aside from being a really naive GA, the demo is also very imperative and array/ for loop based. This is not good – but I was focused on getting Axum running. After I was unable to pull in my real evolutionary algo libs I just went for bare bones. We are also doing a ‘pure numeric’ optimization problem; this is a common test algorithm for testing multi-objective optimizer software. It was easy to convert to Axum – so I used it. Perhaps for future examples I will try to apply a GA in a more interesting way (e.g. not just a maths problem).

High level steps I followed to getting the demo running

  1. Define the data elements flowing between actors. I 1st tried to use a class called ‘individual’ from my standard libs. Axum has a concept called schemas. Here is a post from the Axum team blog on schemas. Schemas are data transfer objects or ‘schematized messages’ that are immutable and used to travel in a type safe way between agents, across channels.
  2. Next I defined the channels. Channels are aptly named: they are typed channels that your messages (those schematized object instances from above) travel through from one agent to another.
  3. Next I defined the agents that receive messages from a channel and publish messages back onto a channel. The agents are where any ‘real code’ lives. The schemas and channels are just construct definitions to establish what can travel around.
Here is my very simple Individual schema. Notice some things were commented out. There is a coerce function in Axum that will take a class that has the same shape as a schema and turn it into one. I had some difficulty with non-trivial objects coercing into their schema counterparts, so I just wound up dumbing down the object and bringing it ALL into the Axum demo. There is supposed to be the ability to share libs – but you must ‘behave correctly’ in your libs. No static data mutation, no side effects etc.
This is basically just 2 arrays of doubles: 1 array represents the ‘genes’, the other array represents the ‘fitness’ along N dimensions. In our example, we are using 3 genes and 2 fitness objectives.
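Since the original Axum listing is not reproduced here, the shape of that schema can be sketched in Python instead. This is only an analogue under my own assumptions: the names Individual, genes and fitness follow the description above, while the frozen dataclass and tuples stand in for Axum's immutable schema and arrays.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Individual:
    genes: Tuple[float, ...]          # 3 genes in this example
    fitness: Tuple[float, ...] = ()   # 2 objectives once evaluated


# Hypothetical values, purely for illustration.
seed = Individual(genes=(0.5, -1.0, 2.0))

# "Changing" an individual means building a new one from the old one.
scored = replace(seed, fitness=(1.0, 2.0))
```

The point of the sketch is only that an individual is a small immutable value, which is what lets it travel safely between agents.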
Here are our 2 channel defs:
One channel is used to send individuals off for evaluation (look @ the genes and assign fitness values). The other channel is used to take an entire population of individuals and produce the next generation. The interesting bit is that individuals can be evaluated completely on their own in whatever the evaluation environment is. Populations of individuals can also be evolved (crossbred, mutated etc) as a unit of work. The classic EA algos have many nice partitions where you can parallelize etc.
The agents
The agents are where all the work goes. The agents look like something from a classic message based system. They listen on a queue for incoming data and they send data out to a queue; the queues are channels in Axum. Here is our ‘evaluator’ agent. The job of this agent is to listen on the evaluation channel for incoming individuals, evaluate them, then send each evaluated individual back on the channel. Here is the simple start:
You can see that it is declared as an agent and that the declaration names the channel EvaluationChannel, which the agent then has as its ‘Primary Channel’. The agent goes into a loop in the constructor (sort of odd), then calls ‘receive’ to get messages from its channel. It calls its own Evaluate function, then sends the result back to the Evaluated port on its channel. The receive call is special in Axum: it blocks until there are messages to consume. This is classical ‘producer consumer’ stuff. The other call, with the funny <--, sends the data back to the port on the channel.
Note: this is the Kursawe MOP problem evaluator
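The receive/evaluate/send loop described above can be approximated in Python with two queues standing in for the two ports of the channel. This is a sketch of the producer/consumer pattern only, not Axum code: evaluator_agent, STOP and the toy evaluate lambda are names I made up, and a real evaluator would replace the lambda.

```python
import queue
import threading

STOP = object()  # sentinel used to shut the agent down (an assumption of this sketch)


def evaluator_agent(to_evaluate, evaluated, evaluate):
    """Loop forever: block for a message, evaluate it, send the result back."""
    while True:
        individual = to_evaluate.get()        # blocks, like Axum's receive
        if individual is STOP:
            break
        evaluated.put(evaluate(individual))   # plays the role of the <-- send


to_evaluate = queue.Queue()   # incoming port of the channel
evaluated = queue.Queue()     # outgoing port of the channel

worker = threading.Thread(
    target=evaluator_agent,
    # Toy evaluation: pair the genes with their sum.
    args=(to_evaluate, evaluated, lambda genes: (genes, sum(genes))),
)
worker.start()

to_evaluate.put((1.0, 2.0, 3.0))
result = evaluated.get(timeout=5)

to_evaluate.put(STOP)
worker.join(timeout=5)
```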
Functions
Functions in Axum are routines that do not cause side effects (commonly called procedures when you do cause side effects). Here are some functions from the above Kursawe Evaluator:
This is the function that evaluates the incoming individual. I have left out some code for simplicity’s sake. I just wanted to show a few things.
  1. Immutability of schema objects: I am not mutating the incoming individual to the Evaluate function, I am returning a new individual with the same genes as the incoming one and new fitness values that I assign.
  2. No side effects: Functions dealt only with local variables and returned data without causing side effects.
  3. Lots of odd casting business: Axum simply had strange problems calling external libs (like System.Math in this case). I got many errors like “cannot cast double[] to double[]”. I saw some errors suggesting that Axum wraps all the primitive types in an AxumSomethingOrOther<primitive> type. Again, this just caused ugly code to deal with it + casting galore.
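The first two points can be sketched in Python. The fitness formulas below are the standard textbook definition of the Kursawe test problem as I know it, not a copy of the lost Axum listing, and the Individual type is my own stand-in for the schema.

```python
import math
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Individual:
    genes: Tuple[float, ...]
    fitness: Tuple[float, ...] = ()


def kursawe(genes):
    """Return the two Kursawe objectives for a gene vector (no side effects)."""
    f1 = sum(
        -10.0 * math.exp(-0.2 * math.sqrt(a * a + b * b))
        for a, b in zip(genes, genes[1:])   # neighbouring gene pairs
    )
    f2 = sum(abs(x) ** 0.8 + 5.0 * math.sin(x ** 3) for x in genes)
    return (f1, f2)


def evaluate(individual):
    """Return a NEW individual: same genes, freshly computed fitness."""
    return replace(individual, fitness=kursawe(individual.genes))


unscored = Individual(genes=(0.0, 0.0, 0.0))
scored = evaluate(unscored)
```

Only local values are touched, and the incoming individual is left exactly as it arrived.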
Here are a few more code examples of both the evolution bits (the thing that takes a whole population of individuals and evolves them into a new population) and the Axum application that coordinates the whole thing.
I have ‘red boxed’ the Axum bits: the channel declaration, the receive calls, etc. I have also yellow highlighted another ‘no side effects’ bit. I could not use any ‘standard’ random number generators, not even my custom ‘deterministic’ random number generator, because it modified static variables. In any case – this random # generator returns an instance of a new RandomNumberGenerator with each call. This is back to immutability. Once I create one of these, it will set its own internal state and return another new rng. This may look strange in the yellow highlighted lines but that is the reason. I actually like that feel. I have been using BclExtras immutable collections lately and it is the same feel, goodness.
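A minimal Python sketch of that ‘every call hands back a new generator’ idea follows. The class name Rng, the method next_double and the linear congruential constants are my own choices for illustration; the original RandomNumberGenerator may well work differently inside.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rng:
    state: int

    def next_double(self):
        """Return (value in [0, 1), new Rng). This instance is never mutated."""
        # A 64-bit linear congruential step; the constants are an assumption.
        new_state = (self.state * 6364136223846793005 + 1442695040888963407) % 2 ** 64
        value = (new_state >> 11) / float(2 ** 53)
        return value, Rng(new_state)


rng = Rng(42)
x, rng = rng.next_double()   # rebinding the name is what looks odd at first
y, rng = rng.next_double()
```

Because nothing is mutated, calling next_double twice on the same instance gives the same answer, which is what makes the generator safe to pass between agents.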
This class, you can see, is getting an input population from its channel, then it is making a new population of individuals to use as seeds for the next population. This is the really naive bit. We do a purely random selection and a purely random crossover of genes from each individual we have randomly selected. So – we randomly couple individuals and randomly select their genes for their offspring. There is no mutation in this example. For my real optimizers, I tried to use one that was fit for solving MOPs (multi-objective problems – problems where you want to minimize/ maximize > 1 fitness/ utility function/ value). Axum just could not integrate too well with a big bit of preexisting code.
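The naive selection and crossover step can be sketched in a few lines of Python. Note the assumptions: crossover and next_generation are hypothetical names, individuals are reduced to bare gene tuples, and I use Python's ordinary (mutable) random.Random here for brevity rather than the immutable generator the Axum version needed.

```python
import random


def crossover(mother, father, rng):
    """For each gene position, pick the gene from one parent at random."""
    return tuple(rng.choice(pair) for pair in zip(mother, father))


def next_generation(population, rng):
    """Randomly couple individuals; no fitness-based selection, no mutation."""
    return [
        crossover(rng.choice(population), rng.choice(population), rng)
        for _ in population
    ]


rng = random.Random(1)  # seeded so runs are repeatable
parents = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]
children = next_generation(parents, rng)
```

Since fitness never influences who breeds, this only shuffles existing genes around; a real optimizer would select on fitness and mutate.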
This final bit is the agent that derives from AxumApplication – it is basically our main. You can see 2 calls to create each agent in a new domain (more on that next time, perhaps). You can also see the same pattern for using the RNG (immutability). Note that these agents have a bunch of your normal mutable state, standard .net-ish stuff in them. They feel very much like normal C#.
Axum issues:
Axum had some general issues at the .net CLR level. I think I was trying to program too much in a .net 3.5 way – I believe Axum is 3.0 compliant. Here is my little list of issues:
  • Generally red herring error messages like ‘unable to recover’ with no more detail
  • Poor handling of arrays (like my above casting issues); in many cases I had to copy values out of the arrays into new arrays.
  • When newing a schema type, you can’t use () – just the new XXX {propName=xxx} if you put new XXX() you would get truly odd errors.
  • Generally high difficulty integrating with other C# libs. I tried 1st to interop with a big lib and it barfed. I then tried to have another small C# project in the same solution and got null pointer errors when adding refs to it, etc. I had all kinds of errors using external types – just too hard or not possible.
Closing thoughts
Although a simple example, it was fun. I missed having Resharper also ;) . I will likely continue on and try to get a more ‘real’ optimizer running using Axum and report back. Love the model, I am skeptical that much will make it into the CLR :( I hope some does…
Here is the code

Responses are currently closed, but you can trackback from your own site.



Message/ Actor based Concurrency for .Net

Due to the nature of my current project, and more-or-less my career developing trading related software – I have been thinking hard about concurrency lately. This is not a new thing, but I feel for the 1st time in some years that we are really ready to have some substantial changes in the ways that we write highly concurrent software.

Pub/sub messaging

I have had some great experience doing pub/sub messaging work in the past, specifically @ Merrill Lynch, and to a lesser degree at Morgan Stanley years back now. As it goes, many applications in finance tend to be based on pub/sub messaging in one form or another. These types of systems create a few dynamics that are interesting. Incoming messages are coming from elsewhere on the network so they are inherently immutable. As it goes, immutability winds up being very important – essentially, having as little shared state as possible is key. Messaging systems tend to make this true to some degree at least.

Immutability

We are constantly trying to make more and more immutable in my current project. I have been looking a whole bunch at BCL Extras (more on that below) for immutable collections. (When we are doing ‘edits’ we are basically copy-on-write since we are still working on mutable systems to one degree or another). In any case, the whole immutability thing simply enables simpler concurrent access; things can’t change underneath you if they can’t change at all…
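To make the copy-on-write ‘edit’ concrete, here is a small Python sketch. It is not BCL Extras and not our project code: with_price is a hypothetical helper, the symbols and prices are made up, and MappingProxyType is just a convenient read-only view from the standard library.

```python
from types import MappingProxyType


def with_price(book, symbol, price):
    """Copy-on-write edit: copy, change the copy, hand back a read-only view."""
    updated = dict(book)       # the copy
    updated[symbol] = price    # the write
    return MappingProxyType(updated)


book_v1 = MappingProxyType({"IBM": 101.5})
book_v2 = with_price(book_v1, "MSFT", 20.1)
```

Any reader still holding book_v1 keeps seeing exactly what it saw before, so no lock is needed around the edit.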

(Better Concurrent Collections)

This is an aside, but I am also working hard at creating/ finding/ using better concurrent collections that help us to use them correctly. There are some good articles on low/ no lock techniques and Jared Parsons has a good post here.

BCL Extras

BCL Extras is an interesting library written by Jared Parsons. The most interesting bits in here are good implementations of immutable collections, including a good immutable map. We are currently trying to find a good way to use immutable collections alongside our better ‘concurrent mutable’ collections (low locking, still mutable, etc). I recommend just checking this library out – it is along many of my favorite trajectories: Concurrency, Immutability, Functional Programming and LINQ.

Scala, Erlang

Erlang and more recently Scala use the messaging/ actor based concurrency model. People have written much more intelligently than I on this here, here, here, etc.

Axum

Axum (formerly called Maestro) is here. This is an incubation project at Microsoft for implementing the actor/ message concurrency model natively in the .net framework. Their team blog is here. Axum is pretty exciting – if it ever makes it out of incubation.

Retlang

Written mostly by Mike Rettig for use at DRW Trading. Retlang is most interesting to me because #1 it is usable today in the context of my current project (unlike Scala/ Erlang/ Haskell, Axum, etc), #2 it is being used today for real world applications, and #3 a former colleague, Mike Roberts, and I exchanged a few messages on it and he gave it his ‘thumbs up’. Personal vouchers from trustworthy sources are very useful. His slides for his presentation on Jetlang/ Retlang are here.

Retlang feels like a pub/ sub messaging system, the difference is that you use it in process. The model that has worked successfully for messaging and driven a certain way of working that removes your locking code etc. is just used in process to give you similar dynamics.

Retlang gives you channels and fibers as its basic bits and pieces. You pub at one end of a channel, you sub at the other end of the channel. It gives you execution queues that your components/ subscribers feed from, so they act as if they are all working on 1 thread. This means you can punt the locking out of your business logic.

Again, with this model, within a fiber, you think about everything as single threaded. Dependency injection is for getting the assembly out of the business logic. With Retlang (the messaging/ actor model) we get the multi-threaded-ness out of the business logic and into the wiring. Everything in a fiber is just a component/ set of classes that are wired together and are in a single thread together. We can now punt all the locking etc. code OUT of our business logic code.

Channels are pub/ sub conduits and fibers are the wiring of what is executing on the same thread. You can have 1 fiber that listens to N channels, and all the subscribers on that fiber will be executing in the same thread. This is SO much like dependency injection it astounds me. We get to NOW assume we are single threaded within this component.
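To show the shape of this, here is a toy Python sketch of one fiber listening to 2 channels. To be clear, this is not Retlang's API: Fiber, Channel, enqueue, subscribe and publish are names I picked for the sketch, and the messages are made up.

```python
import queue
import threading


class Fiber:
    """One thread draining one queue of actions, in order."""

    def __init__(self):
        self._actions = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def enqueue(self, action):
        self._actions.put(action)

    def stop(self):
        self._actions.put(None)        # processed after everything queued so far
        self._thread.join(timeout=5)

    def _run(self):
        while True:
            action = self._actions.get()
            if action is None:
                break
            action()


class Channel:
    """Pub/sub conduit: publishing hands the message to each subscriber's fiber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, fiber, handler):
        self._subscribers.append((fiber, handler))

    def publish(self, message):
        for fiber, handler in self._subscribers:
            fiber.enqueue(lambda h=handler, m=message: h(m))


fiber = Fiber()
fiber.start()
trades, quotes = Channel(), Channel()

seen = []  # touched only from the fiber's thread, so no lock is needed


def record(kind):
    return lambda m: seen.append((kind, m, threading.current_thread().name))


trades.subscribe(fiber, record("trade"))
quotes.subscribe(fiber, record("quote"))

trades.publish("buy 100")
quotes.publish(101.5)
fiber.stop()
```

Both handlers run on the same thread even though they subscribe to different channels, which is why the handler code can stay lock free.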

This all sounds good to me. I am going to seriously consider Retlang.

Conclusions

I really think we are at the edge of changing the way that we build concurrent applications. We need to. Building concurrent apps the way we do today is too hard. Really it is. I am looking forward to working hard to make it better so I can deliver better software, faster to my desks!

-Chris-





The Canonical Meta-Model, Code Generation, T4 and Domain Projections

In my current project, I have been using some very interesting (to me at least) tools and techniques that I thought were worth mentioning. There are many vaguely related people doing similar things, but not exactly the same…

Common Problems

Boilerplate code, excessive mapping, and DRY (Don’t Repeat Yourself) violations. It is very easy to have several essentially redundant yet seemingly necessary representations of your domain model manifest as code in your software. Let’s take a simplified example domain – Trade. (pseudo-code)

class Trade { int TradeId; Trader Trader; Security Security; int Quantity; } // etc

Now – it is common to have some mapping to persistence, either your classical ORM, or XML persistence. You might have some code that essentially takes that domain (your *canonical representation*) and maps it (or *projects* the canonical representation) into another domain (xml, database, etc). You get something like:

class TradeDao { void Save(Trade trade); Trade LoadById(int id); }

It is also common to have a messaging/ transportable representation of the type, again, this could be XML, bytes, serializable in some custom form + a message envelope, etc. The point is there is usually some other mapping for ‘messaging’.

class TradeMessageAdapter {Trade FromMessage (Message msg); Message ToMessage(Trade t);}

In addition, it is also common to have a mapping or a set of representations of your types that are specialized for user interface code. UIs do not have the same locking semantics as servers usually, they have databinding ramifications (especially in WPF as our apps are), so you have a mapping or *projection* into that domain. These types of projections are interesting, since in many cases the type system just does not have any real way (in C#/ the ‘raw’ CLR at least) to generalize these types of relationships, more on this another time…

class TradePresentation { UiProperty<Trader> Trader; UiProperty<Counterparty> Counterparty; }

In some cases you have a pure data projection; our message example is like this. In other cases you have a semantic projection: you figure out the relevant operations based on the data and do something with it (our DAO example above is like this).

Data Projection = Mapper

Anytime you are taking a property/ field/ value from 1 place and packing/ unpacking it into an alternate form with essentially the same name, you have some sort of mapper. This is essentially what I will call a data projection, as in our Trade to TradePresentation projection.
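A data projection is mechanical enough that it can be written once against the shape of the type. The Python sketch below illustrates that under my own assumptions: the field names mirror the Trade pseudo-code, a plain dict stands in for the message, and to_message/from_message are hypothetical helpers.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Trade:
    trade_id: int
    trader: str
    security: str
    quantity: int


def to_message(instance):
    """Pack every field into a dict under the same name."""
    return {f.name: getattr(instance, f.name) for f in fields(instance)}


def from_message(cls, message):
    """Unpack a dict back into the canonical type."""
    return cls(**{f.name: message[f.name] for f in fields(cls)})


trade = Trade(trade_id=1, trader="chris", security="IBM", quantity=100)
message = to_message(trade)
round_tripped = from_message(Trade, message)
```

Nothing in the two helpers mentions Trade, which is the whole point: the mapper follows from the model.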

Semantic Projection = Boilerplate Code

Anytime you can look at the model, infer some operations, and generate boilerplate code (e.g. our DAO example), you have what I will call a semantic projection. In the DAO example, you can look at the relationships in the canonical model and then generate methods and their bodies in a templated way. Boilerplate code is really just a form of template that can’t be automated/ abstracted away by the type system you are using.
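As a rough illustration of inferring operations from the model, this Python sketch derives the SQL a DAO would need. The table naming, the ‘first field is the key’ rule and the function names are all assumptions of the sketch, not something our real DAOs do.

```python
from dataclasses import dataclass, fields


@dataclass
class Trade:
    trade_id: int
    trader: str
    security: str
    quantity: int


def insert_sql(model):
    """Infer a Save operation from the model's fields."""
    names = [f.name for f in fields(model)]
    return "INSERT INTO {} ({}) VALUES ({})".format(
        model.__name__.lower(),
        ", ".join(names),
        ", ".join("?" for _ in names),
    )


def load_by_id_sql(model):
    """Infer a LoadById operation; assumes the first field is the key."""
    key = fields(model)[0].name
    return "SELECT * FROM {} WHERE {} = ?".format(model.__name__.lower(), key)


save_statement = insert_sql(Trade)
load_statement = load_by_id_sql(Trade)
```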

Canonical Model = the essence of your domain model

The canonical model is just that – the essence of your domain. It is our 1st Trade example above. There is no target-domain specific (persistence, messaging, UI) information in it, just the most terse, easy to understand version of your domain.

How do we make practical use of these concepts?

C# classes suffice to represent the canonical domain. You can be fancy or simple, but essentially you model your canonical domain using C# classes. How do we then project the canonical domain into another domain? Using T4 Templates/ Code Generation with your canonical model.

T4 Templates

VS 2008 has a great feature called T4 templates. You can simply add a file with a .tt extension, put some ASP.Net like code into it and it will generate some new files for you, usually C# classes.

Example projection T4 Template

<#@ template language="C#" #>
<#@ output extension=".cs" #>
<# var myModel = new MyCanonicalModel(); #>
<# foreach (var type in myModel) { #>
class <#= type.Name #>Message : BaseMessageTypeOfYourFlavor {
    public void FromInstance(<#= type.Name + " " + type.AsVariableName #>) {
<# foreach (var property in type.Properties) { #>
        AddValue<<#= property.Type #>>(<#= type.AsVariableName #>.<#= property.Name #>);
<# } #>
    }
}
<# } #>

 

Assuming your canonical model had a few types in it, it would generate something like:

class TradeMessage : BaseMessageTypeOfYourFlavor {
    public void FromInstance(Trade _trade) {
        AddValue<int>(_trade.TradeId);
        …etc
    }
}

We will have to do some more work to make it handle mapping other types from the canonical model, but that is still the basic idea. We generate any repetitive code, mapping code etc.
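If you want to play with the idea outside of Visual Studio, the same projection can be mimicked with a few lines of Python string formatting. This is only a stand-in for the T4 template: MODEL is a made-up description of the canonical model, and render_message_class is a hypothetical helper.

```python
# Hypothetical canonical model: type name -> list of (C# type, property name).
MODEL = {
    "Trade": [("int", "TradeId"), ("int", "Quantity")],
}


def render_message_class(type_name, properties):
    """Emit the C# text of a <Type>Message class for one model type."""
    variable = "_" + type_name[0].lower() + type_name[1:]
    lines = [
        "class {}Message : BaseMessageTypeOfYourFlavor {{".format(type_name),
        "    public void FromInstance({} {}) {{".format(type_name, variable),
    ]
    for prop_type, prop_name in properties:
        lines.append(
            "        AddValue<{}>({}.{});".format(prop_type, variable, prop_name)
        )
    lines += ["    }", "}"]
    return "\n".join(lines)


generated = "\n\n".join(
    render_message_class(name, props) for name, props in MODEL.items()
)
```

Adding a type to MODEL adds a message class to the output, with no hand-written mapping code in between.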

If you use partial classes, it turns out you can easily generate different aspects of the same class into different files; it is really quite nice! More on that next…

Conclusions

So – this works towards a dynamic where you focus on writing ‘framework layers’ and your canonical model, then generating instances of what you need from your domain layer for each of your framework layers… Again, more in the future here.

So – this is some of the stuff on my mind these days. More to come on these topics. The basic ideas are relentless automation, only stating relationships and the general model 1x in your software (DRY), not writing boilerplate code, etc. This gives you a LOT more leverage. It also lets you focus on modeling the actual domain model, but does not restrict you from using it projected into several other areas by making concessions IN your canonical domain model. The last thing we want is ALL of our domains subtly manifesting themselves IN our canonical model…

So – go have a play with T4 and consider ways to do some of what I have mentioned. I think the important ideas are presented herein. I will work on some better practical examples going forward.

-Chris-

