• srdan Oct 28, 2011

    Doing it simultaneously. by srdan

    Making it big means making it scalable. And making it scalable means making it so that many independent processes can share the load. Coordinating these processes and making sure they’re not stepping on each other’s toes while maintaining full speed ahead is one of the leading causes of Tourette’s within the software developer community [1]. Then again…

    Doing things one at a time sucks. It’s boring. And it takes forever. You don’t wanna be going to the grocery store buying twizzlers one pack at a time. Sure, you can carry a couple more by getting a bag, and even more by loading them all up in your car, but the only way this’ll scale is if you can get a couple of buddies to help you out. Then you don’t all have to go to the same store, and you can buy more simultaneously.

    But you may still run into a problem if you all only have one credit card account. If you don’t stop to think about it you may find yourself in a situation where a bunch of your buddies are standing at the register and paying at the same time. Each of them may have checked that there were sufficient funds at some point and then just gone ahead and handed the card over to the cashier. Only to find that the money had already been withdrawn. Purchase denied.
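
    That failure is a check-then-act race. Here’s a minimal sketch in plain Ruby, with the interleaving spelled out by hand instead of left to real threads, so it goes wrong the same way every time. The SharedCard class and the amounts are made up for illustration:

```ruby
# Hypothetical shared card: every buddy reads and writes the same balance.
class SharedCard
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  # Step 1 of a purchase: glance at the balance.
  def sufficient_funds?(amount)
    @balance >= amount
  end

  # Step 2 of a purchase: the register declines if the money is already gone.
  def charge(amount)
    return :denied if amount > @balance
    @balance -= amount
    :approved
  end
end

card = SharedCard.new(10)

# Alice and Bob both check before either of them pays...
alice_ok = card.sufficient_funds?(8)
bob_ok   = card.sufficient_funds?(8)

# ...so both walk up to the register convinced the purchase will go through.
alice_result = card.charge(8) if alice_ok
bob_result   = card.charge(8) if bob_ok

puts "Alice: #{alice_result}, Bob: #{bob_result}, left on card: #{card.balance}"
```

    Both checks pass, but only the first charge does. The check and the charge are two separate steps, and anything can happen in between.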

    Oh the embarrassment. Everybody looking. Pointing.

    This threatens your entire twizzler-repackaging and reselling business so you bite the bullet and say that until you’ve figured this out, there’s gonna be only one card being used at any given time. If Alice goes to buy twizzlers, Bob’s gonna have to wait until she returns. It’s basically your synchronized approach. Your throughput is much lower, but at least no one is laughing at your crowd.
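
    The one-card-at-a-time rule is a lock. A rough sketch using Ruby’s standard Mutex follows; the LockedCard class, the number of buddies and the amounts are assumptions made for the example:

```ruby
# Hypothetical card guarded by a lock: only one buddy can hold it at a time.
class LockedCard
  attr_reader :balance

  def initialize(balance)
    @balance = balance
    @lock = Mutex.new
  end

  # Check and charge happen as one step that nobody else can interrupt.
  def buy(amount)
    @lock.synchronize do
      if @balance >= amount
        @balance -= amount
        :approved
      else
        :skipped
      end
    end
  end
end

card = LockedCard.new(50)

# Ten buddies try to spend 8 each; only six purchases fit in the budget.
results = 10.times.map { Thread.new { card.buy(8) } }.map(&:value)

puts "approved: #{results.count(:approved)}, left on card: #{card.balance}"
```

    Nobody gets declined at the register, but everybody queues for the card, and that queue is where the throughput goes.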

    After a while you work out a way in which you pre-assign how many twizzlers each person should buy based on the financial situation at that time. Then you ask your buddy to go get you just the right amount of twizzlers and pay out of their own pocket, and you give them back the money when they bring you the merchandise. No embarrassments, just the satisfaction of a job well done. What’s more, if your buddy were to perish in a twizzler-related shooting incident, it’s no biggie. You just send out another buddy [2].
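
    In code, that plan has two halves: split the budget up front, then hand each order to whoever is available. A sketch under made-up assumptions (plan_orders and send_out are hypothetical helpers, and the block stands in for the actual shopping trip):

```ruby
# Hypothetical helper: split the budget into self-contained orders up front,
# so nobody needs to look at the shared card while shopping.
def plan_orders(budget, pack_price, buddy_count)
  total_packs = budget / pack_price            # integer division: whole packs only
  base, extra = total_packs.divmod(buddy_count)
  Array.new(buddy_count) { |i| base + (i < extra ? 1 : 0) }
end

# Hypothetical dispatcher: try buddies in turn until one completes the order.
# The block may raise, which here means the buddy never came back.
def send_out(packs, buddies)
  buddies.each do |buddy|
    begin
      return yield(buddy, packs)
    rescue StandardError
      next # the order itself is still valid, so just send the next buddy
    end
  end
  nil # nobody left to send
end

orders = plan_orders(100, 3, 4) # 33 packs spread across 4 buddies
puts orders.inspect

haul = send_out(orders.first, %w[alice bob]) do |buddy, packs|
  raise "never came back" if buddy == "alice"
  "#{buddy} brought #{packs} packs"
end
puts haul
```

    The order carries everything the buddy needs, so losing a buddy costs you a retry and nothing else.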

    This is what Actors do. It’s a ridiculously old concept really, originating in the ’70s. It avoids the race in the naive first solution and the bad throughput of the second, and it lets you focus on getting it right.

    All actors are self-contained. They share no state or resources. Every actor handles one message at a time. That way, the actor’s methods are limited, specific and easily testable. And since you know there’s only ever going to be one message at a time, you don’t really have to worry about threading issues (sort of).
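
    To make the one-message-at-a-time rule concrete, here is a toy actor in plain Ruby: a mailbox and a single thread draining it. It is a sketch of the idea only, not how Akka works under the hood, and ToyActor and Counter are names invented for the example:

```ruby
require "thread" # Queue; already loaded on recent Rubies

# A toy actor, NOT Akka: one mailbox, one thread, one message at a time.
class ToyActor
  STOP = Object.new

  def initialize
    @mailbox = Queue.new
    @thread = Thread.new do
      # Messages are handled strictly in arrival order, never concurrently.
      while (msg = @mailbox.pop) != STOP
        receive(msg)
      end
    end
  end

  # Senders never touch the actor's state; they only drop a message off.
  def <<(msg)
    @mailbox << msg
    self
  end

  # Handle everything queued so far, then shut down.
  def stop
    @mailbox << STOP
    @thread.join
  end
end

class Counter < ToyActor
  attr_reader :total

  def initialize
    @total = 0 # private state; only the actor's own thread changes it
    super
  end

  def receive(amount)
    @total += amount
  end
end

counter = Counter.new
senders = 4.times.map { Thread.new { 100.times { counter << 1 } } }
senders.each(&:join)
counter.stop

puts counter.total
```

    Four threads hammer the counter at once, yet @total needs no lock, because only the actor’s own thread ever touches it.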

    Ruby MRI has no native implementation of actors though. Rubinius apparently does. But we don’t use Rubinius, nor do we use MRI. We use JRuby and we are truly blessed. Because… There’s this cool Actor framework called Akka. It’s written in Scala, but Java does Scala, and JRuby… JRuby does it all. And, courtesy of Theo, we have Mikka, which is an Akka wrapper… Creating an actor is now as simple as:

    class AnActor < Mikka::Actor
      def receive(msg)
        # do a bunch of stuff
      end
    end 

    To start an actor you go:

    @an_actor = Mikka.actor_of { AnActor.new }.start 

    To start several actors and load-balance across them, you go:

    @actor_load_balancer = Mikka.load_balancer(
      :actors => 40.times.map { Mikka.actor_of { AnActor.new }.start }
    )

    To have the actor do your chore, you go:

    @an_actor << message
    @actor_load_balancer << message 

    …where message is whatever should get passed to the receive method of the class above. The actor has its own initialize method, where you can set up anything else you might require, respecting of course the rule that no actors share state. Apart from that you’re free to act.
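
    If you’re wondering what a load balancer does with a message, the simplest policy is round-robin: hand each one to the next actor in line. The sketch below is plain Ruby and purely illustrative. RoundRobin is a made-up class, and it says nothing about how Mikka or Akka actually pick a target:

```ruby
# Hypothetical round-robin balancer; not Mikka's or Akka's implementation.
class RoundRobin
  def initialize(targets)
    @targets = targets
    @next = 0
    @lock = Mutex.new
  end

  # Same interface as a single actor: balancer << message.
  def <<(msg)
    target = @lock.synchronize do
      chosen = @targets[@next]
      @next = (@next + 1) % @targets.size
      chosen
    end
    target << msg
    self
  end
end

# Arrays respond to << too, so they stand in for actors here.
inboxes = Array.new(3) { [] }
balancer = RoundRobin.new(inboxes)
7.times { |i| balancer << "message #{i}" }

inboxes.each_with_index { |inbox, i| puts "actor #{i}: #{inbox.inspect}" }
```

    Because the balancer answers to << just like an actor does, the sender doesn’t need to know whether it is talking to one actor or forty.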

    The basic actors have a very small footprint so you can easily spawn hundreds of thousands of them. When one dies or malfunctions you can kill it and replace it — no problem.

    This paradigm was famously used in Ericsson’s AXD301 switch to achieve a reported uptime of 99.9999999%, or roughly 2.6 ms of downtime per month. That was Erlang though. And dedicated hardware. And years of development… Soo. Yeah. That probably won’t mean that your blog-engine on AWS is gonna be as stable, but it won’t hurt. =)
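
    For the curious, the downtime figure falls straight out of the percentage. A quick sketch, assuming a 30-day month and using exact rationals so that floating point doesn’t eat the ninth nine (downtime_ms is a helper invented for this example):

```ruby
# Downtime implied by an availability percentage over a number of days.
# The percentage is passed as a string so Rational can parse it exactly.
def downtime_ms(availability_percent, days)
  unavailable = 1 - Rational(availability_percent) / 100
  unavailable * days * 24 * 60 * 60 * 1000
end

per_month = downtime_ms("99.9999999", 30)
per_year  = downtime_ms("99.9999999", 365)

puts format("%.3f ms per 30-day month", per_month.to_f)
puts format("%.3f ms per year", per_year.to_f)
```

    That comes to about 2.6 ms a month, or a bit over 31 ms a year.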

    [1] Note. Not a fact.
    [2] Note. This is no way to treat your IRL buddies.
