Asynchronous Games in Firebase; pt II

Note: this post is cross-posted on Cubeia.com.

In the last post I looked at the motivation for writing asynchronous multiplayer games on top of Firebase. In this post I’ll sketch an outline of how to actually do it.

The first thing to consider is that we now have two notions: the Firebase concept of an area (called “table” for legacy reasons) and the overall concept of a “game”. Firebase tables are associated with games, but whereas in a synchronous game the table probably would be the game, such that if the table is closed the game is over, in asynchronous games the table may come and go while the game itself survives.

Oh, by the way, I’ll use Java in this post, but remember you can write it in the script languages Ruby, Python and Groovy as well. :-) Let’s step through it:

  1. The game is created. This is probably done on a website somewhere and does not involve Firebase at all.
  2. Client connects. When the game client is opened (someone wants to make a move), the client connects to Firebase and does the following:
    1. Search the lobby for any table with the correct “gameId” set in the attributes.
    2. If a table is found, attempt a “join” command.
    3. If a table is not found, or the join fails, send a table creation request and make sure to include the “gameId”.
  3. Activator creates table. When a client cannot find a table for the game, it sends a table creation request. The activator then reads the game and its state from the database, and creates a new table.
  4. Game play. The player makes its moves and actions normally via Firebase. The game in Firebase either saves the state on each action, or defers saving until the table is closed (see below).
  5. Table closes. The activator closes any table which has not been accessed for some time. If needed, the game state should also be saved at this point (see above).
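
To make step 2 concrete, here is a minimal sketch of the client-side decision: search, try to join, otherwise ask for a new table. It deliberately does not use the Firebase client API; FakeServer, LobbyEntry and openGame are hypothetical names, and the "server" is just an in-memory list standing in for the lobby.

```java
import java.util.ArrayList;
import java.util.List;

final class FindOrCreateTable {

    /** Hypothetical lobby entry: a table id plus the "gameId" lobby attribute. */
    static final class LobbyEntry {
        final int tableId;
        final int gameId;

        LobbyEntry(int tableId, int gameId) {
            this.tableId = tableId;
            this.gameId = gameId;
        }
    }

    /** In-memory stand-in for the server side; not the Firebase client API. */
    static final class FakeServer {
        private final List<LobbyEntry> lobby = new ArrayList<>();
        private int nextTableId = 1;

        /** Step 2.1: returns the table id for the game, or -1 if none is open. */
        int findTable(int gameId) {
            for (LobbyEntry e : lobby) {
                if (e.gameId == gameId) {
                    return e.tableId;
                }
            }
            return -1;
        }

        /** Step 2.2: a join can fail, e.g. if the table closed after the search. */
        boolean join(int tableId) {
            for (LobbyEntry e : lobby) {
                if (e.tableId == tableId) {
                    return true;
                }
            }
            return false;
        }

        /** Step 2.3: the creation request carries the game id. */
        int createTable(int gameId) {
            LobbyEntry e = new LobbyEntry(nextTableId++, gameId);
            lobby.add(e);
            return e.tableId;
        }
    }

    /** Returns the id of the table the player ends up at for the given game. */
    static int openGame(FakeServer server, int gameId) {
        int tableId = server.findTable(gameId);
        if (tableId != -1 && server.join(tableId)) {
            return tableId;
        }
        return server.createTable(gameId);
    }
}
```

Opening the same game twice lands on the same table, while a different game id gets a table of its own. In a real client each of these calls is a network round trip, so the join failure path matters more than it looks here.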

I’ll focus on points 3, 4 and 5 above for the rest of this post, and I’ll use code examples from the Kalaha game I’m currently involved in writing.

Activator / Table creation

The game activator in Firebase is the component that knows how to create and destroy tables in the system. We’ll follow best practices (but we’ll skip the “init” state for now) as it helps us save the state to the database.

The activator should first implement RequestAwareActivator to make sure it gets the requests:

public class ActivatorImpl implements GameActivator, RequestAwareActivator {
    [...]
}

When a client wants a table to join for a specific game, it’ll send a table creation request, and the activator should read the game state from the database and create a new table. Somewhat compressed, it may look like this:

@Override
public RequestCreationParticipant getParticipantForRequest(int pid, int seats, Attribute[] atts)
        throws CreationRequestDeniedException {
    // find the game id in the parameters
    int gameId = getKalahaGameId(atts);
    if (gameId == -1) {
        // you may want to handle this as a special form of "new game"
    } else {
        log.debug("Resurrecting game " + gameId + " for player id " + pid);
        // read the game from the database
        Game game = gameManager.getGame(gameId);
        if (game == null) {
            // code 1 for "no such game"
            throw new CreationRequestDeniedException(1);
        }
        return new Participant(game);
    }
}

The Participant is an inner class for handling the request, like so:

private static class Participant implements RequestCreationParticipant {

    private final Game game;

    public Participant(Game game) {
        this.game = game;
    }

    @Override
    public void tableCreated(Table table, LobbyTableAttributeAccessor atts) {
        table.getGameState().setState(new net.kalaha.game.action.State(game.getState()));
        atts.setStringAttribute(TABLE_STATE_ATTRIBUTE, "OPEN");
        atts.setIntAttribute("gameId", game.getId());
    }

    @Override
    public LobbyPath getLobbyPathForTable(Table table) {
        return new LobbyPath(table.getMetaData().getGameId(), "", table.getId());
    }

    [...]
}

As you can see above, the creation participant sets the game state on the table when it is created. It also sets a lobby attribute with the game ID, which is important for the client to find the table. The lobby path above is kept simple for this example, and the TABLE_STATE_ATTRIBUTE is a constant you can define yourself.

Now we need to close the table when it isn’t used. This is somewhat outside the scope of this post, but I’ll post some pseudo code here to demonstrate table destruction:

public void checkTables() {
    TableFactory fact = context.getTableFactory();
    for (LobbyTable table : fact.listTables()) {
        int tableId = table.getTableId();
        long lastModified = getLastModifiedFromAttributes(table);
        int seated = getSeatedFromAttributes(table);
        if (seated == 0 && isOld(lastModified)) {
            checkClose(tableId, table);
        }
    }
}

The above should be called regularly from a scheduled task. The “get from attributes” methods are trivial, for example:

private long getLastModifiedFromAttributes(LobbyTable table) {
    Map map = table.getAttributes();
    AttributeValue a = map.get(DefaultTableAttributes._LAST_MODIFIED);
    return a.getDateValue().getTime();
}
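
For the scheduling itself, plain java.util.concurrent is enough. The sketch below is only one way to do it: TableReaper, MAX_IDLE_MILLIS and the ten minute limit are assumptions of mine, and isOld takes the current time as a second argument (unlike the one-argument call in the pseudo code) simply so it can be tested without waiting.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class TableReaper {

    /** Hypothetical idle limit; tune it to how long a table should linger unused. */
    static final long MAX_IDLE_MILLIS = TimeUnit.MINUTES.toMillis(10);

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** True if the table was last modified more than MAX_IDLE_MILLIS before "now". */
    static boolean isOld(long lastModified, long now) {
        return now - lastModified > MAX_IDLE_MILLIS;
    }

    /** Runs the given check (for example the activator's checkTables) once a minute. */
    void start(Runnable checkTables) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                checkTables.run();
            } catch (RuntimeException e) {
                // an uncaught exception would silently cancel all future runs
                e.printStackTrace();
            }
        }, 1, 1, TimeUnit.MINUTES);
    }

    /** Stops the periodic check, for example when the activator is stopped. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

The try/catch inside the task is the one detail worth copying: a scheduled executor stops rescheduling a task after the first exception that escapes it, and a reaper that quietly dies is hard to notice.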

The attributes “last modified” and “seated” are standard attributes and always available. The “table state” attribute is not, and you’d have to set and get it yourself. In “check close” we’ll check the table state, destroy the table if it is closed and, if not, send an action to the table in order to close it:

private void checkClose(int tableId, LobbyTable table) {
    TableFactory fact = context.getTableFactory();
    String state = getTableStateFromAttributes(table);
    if (state.equals("CLOSED")) {
        // the table is closed, so destroy
        fact.destroyTable(tableId, true);
    } else {
        sendCloseActionToTable(tableId);
    }
}

And finally, the method to send a “close yourself” action to the table would look something like this:

private void sendCloseActionToTable(int tableId) {
    ActivatorRouter router = context.getActivatorRouter();
    CloseTableAction action = // create your action here
    byte[] actionBytes = // convert action to bytes
    GameDataAction wrap = new GameDataAction(-1, tableId);
    wrap.setData(ByteBuffer.wrap(actionBytes));
    router.dispatchToGame(tableId, wrap);
}

Which should be more or less self-explanatory. The action is of course whatever type of object and encoding you use in your game; it could be standard Java objects and serialization, for example.

Table Play / Closing

The table should work as usual; the only thing we’ll add is to save the game state on the “close table” command. We need to translate the action byte data to an object, then distinguish between internal actions and client actions and process them. Something like this perhaps (again from my Kalaha game):

public void handle(GameDataAction action, Table table) {
    Object act = // translate to action object
    log.debug("Got action: " + act);
    if (act instanceof KalahaAction) {
        // here's where you'll handle the actual game state
    } else if (act instanceof CloseTableAction) {
        setTableClosedAttribute(table);
        saveTableStateToDb(table);
    } else {
        log.warn("Unknown action: " + act);
    }
}
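
The two elided conversions, action to bytes in the activator and bytes back to an action object in the game, could be done with standard Java serialization. This is a minimal sketch under that assumption; ActionCodec is a name made up for this example, and the CloseTableAction here is a stand-in, not the class from the Kalaha game.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ActionCodec {

    /** Stand-in internal action; it carries no data, its type is the message. */
    static final class CloseTableAction implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    /** Converts an action to the bytes that go into the GameDataAction. */
    static byte[] toBytes(Serializable action) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(action);
        }
        return bytes.toByteArray();
    }

    /** Converts received bytes back to an action object. */
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

The result of fromBytes is what the instanceof checks in the handle method would then inspect. Java serialization is convenient for actions the server sends to itself, but deserializing arbitrary bytes from untrusted clients is risky, so a stricter wire format is probably the better choice for client actions.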

In the above, all Kalaha actions are treated separately as client actions, and the “close table” action simply sets the table state attribute to “CLOSED”, to signal to the activator that the table is safe to remove, and then saves the game state to the database.

Conclusion

As you can see, the actual code to manage asynchronous games in Firebase is minimal; you’re going to spend far more time on game logic than on state handling. All the code I’ve omitted is trivial. In fact, writing this article took me longer than implementing the feature in my game!

Which ends our discussion about asynchronous games in Firebase. Last post we looked at motivation and background, and in this post we’ve seen how to actually program it in Firebase. Now go and try it yourself!

Asynchronous Games in Firebase; pt I

Note: this post is cross-posted on Cubeia.com.

Since the social web started to expand, asynchronous multiplayer games seem to have really taken off. So my immediate question is obviously: can I do that on Firebase? And the short answer is: of course. For the long answer, please read on.

Let’s recap first. In a traditional synchronous game you play against other people in real time, so everyone you play against has to be online at the same time. But given the nature of our everyday Internet use, a different pattern has emerged: in asynchronous games not all players have to be online at the same time. For example, turn-based games can easily be made asynchronous: you play your turn, and then sometime in the future your opponents will have been online and performed their moves. As such, a classic game such as chess in play-by-email mode is a prime example of an asynchronous game. But you’re not limited to turn-based games; you can define the order to act on time or any metric you’d like. The only mandatory feature is that all players do not have to be online at the same time.

Firebase is fundamentally a synchronous multiplayer server. The technical difficulties Firebase set out to solve are all issues arising from the kind of distributed computing you get when a couple of thousand players have to interact in real time. So normally you’d build an asynchronous game on a traditional web platform: as the demand for timely updates is a lot less urgent, standard web techniques can easily be applied. But having said that, is there anything you’d gain by using Firebase?

I believe there is: a synchronous option. Easy to overlook, this is actually a big one. If you build your asynchronous game on Firebase from the start, you have an immediate option of including synchronous play. Or better yet, the difference between synchronous and asynchronous could be a simple configuration issue. And by the way, who says it’s either or? Why not both?

Let me explain the above with an example: at the moment I’m hacking on a Kalaha game for Facebook in my spare time. And obviously it’ll be asynchronous: players will take their turn while logged in, and the game mechanics will inform you of your move via the Facebook streams. But if two players are online at the same time and want to play against each other, why shouldn’t they? And would quick matches be fun, say max 5 seconds per move? And how about tournaments?

Here are some other, perhaps minor, points to consider:

  • Duplex communication: implicit in the entire discussion is that Firebase can push events both ways from and to clients and server. In HTTP you’re forced to use Comet patterns, whereas Firebase uses a persistent connection and any actions are delivered to the clients immediately.
  • Latency: Just because you’re building an asynchronous game doesn’t automatically mean latency is not important. If your game is based on actions against a common notion of time, you might be interested in latency anyway. Firebase has a very low latency compared to most application/web servers.
  • Bandwidth: If it turns out you have a hugely successful game, and a large portion of the traffic is made up of game data (as opposed to actual web pages or other contextual data) you’ll be happy just to avoid the HTTP headers and the overhead they automatically bring to the table. Firebase uses TCP only at the moment, thus dramatically reducing the bandwidth requirements.

There are other, less apparent options as well, but let’s be honest: anything but a huge advantage would probably be negated by the additional platform, with the integration and administration it’d mean. That is to say, if you can build it using standard web tools only, why add another platform with which to communicate, integrate and administrate?

So that’s my declaration: using Firebase gives you the entire range from highly synchronous to entirely asynchronous on one platform! Next post I’ll sketch a proposal on how to actually do it as well. Stay tuned!

Script Support in Firebase

Note: this post is cross-posted on Cubeia.com.

Now we have released a candidate for script support in Firebase! This is something we’re very excited about as it means no more Java (unless you want to of course; old hands like me aren’t likely to change in a hurry).

This is a first release so there’s no support for tournaments or services yet. But it is not far off.

We’re using Java’s built-in scripting support under the hood. It turned out to be not too trivial, but not very hard either. The interesting issues are likely to arrive when we start optimizing and bug hunting. And speaking of optimizing, I ran a few bots, say 50, against a very small script (basically the equivalent of a hello world) and on average the bots got their responses in 10 ms. That’s 10 ms for network latency, Firebase internals, and script evaluation for each event. Pretty damn good! The next step will be to start optimizing depending on the script implementation, caching compiled scripts, multi-threading etc.

One major upshot of writing in a script language and re-evaluating the script for each event is the velocity: you don’t have to restart Firebase when you change code, the script is re-evaluated automatically. The round-trip time is cut dramatically!

And… You want to see code? Here you are, this is the server part of the Hello World tutorial, written in… JavaScript:

function handleDataAction(action, table) {
    _log.debug('Entering handleDataAction');
    var data = _support.getActionDataAsUTF8(action);
    var playerId = action.getPlayerId();
    var outAction = _support.newGameDataAction(playerId, table);
    _support.setActionDataAsUTF8(outAction, data);
    table.getNotifier().notifyAllPlayers(outAction);
    _log.debug('Exiting handleDataAction');
}

Ruby…

def handleDataAction(action, table)
  $_log.debug("Entering handleDataAction")
  data = $_support.getActionDataAsUTF8(action)
  playerId = action.getPlayerId()
  outAction = $_support.newGameDataAction(playerId, table)
  $_support.setActionDataAsUTF8(outAction, data)
  table.getNotifier().notifyAllPlayers(outAction)
  $_log.debug('Exiting handleDataAction')
end

Python…

def handleDataAction(action, table):
    _log.debug("Entering handleDataAction")
    data = _support.getActionDataAsUTF8(action)
    playerId = action.getPlayerId()
    outAction = _support.newGameDataAction(playerId, table)
    _support.setActionDataAsUTF8(outAction, data)
    table.getNotifier().notifyAllPlayers(outAction)
    _log.debug('Exiting handleDataAction')

Groovy…

def handleDataAction(action, table) {
    _log.debug('Entering handleDataAction')
    data = _support.getActionDataAsUTF8(action)
    playerId = action.getPlayerId()
    outAction = _support.newGameDataAction(playerId, table)
    _support.setActionDataAsUTF8(outAction, data)
    table.getNotifier().notifyAllPlayers(outAction)
    _log.debug('Exiting handleDataAction')
}

Cool, eh?

You’ll notice some strange objects above. We bound some helper objects in the evaluation context: “_log”, a Firebase Log4j logger; “_support”, a tool for string to byte conversion etc.; and some other helpful stuff.

The JavaScript Hello World can be found here. And tentative documentation here. Have fun!

Guice Support in Firebase

I’ve always wanted to add dependency injection support to Firebase, and today we released a candidate for Guice! And if you ask me, it’s very cool indeed.

The documentation is a bit sparse at the moment, but can be found on our wiki. For the rest of the post I’ll just show how a small fictional game would look using Guice.

To start with, the Guice support comes as a set of abstract base classes, one for each Firebase artefact. To use those you’d have to add a dependency to your Maven build (I’ll assume Maven here, you can of course use whatever you’d like):

<dependency>
    <groupId>com.cubeia.firebase</groupId>
    <artifactId>guice-support</artifactId>
    <version>1.0-RC.1</version>
</dependency>

And if you haven’t already got it, you’d need our repository as well:

<repository>
    <id>cubeia-nexus</id>
    <url>http://m2.cubeia.com/nexus/content/groups/public/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</repository>

Now you’re all set to go. Just extend GuiceGame and return the class of your game processor within the configuration, like so:

public class MyGame extends GuiceGame {

    public Configuration getConfigurationHelp() {
        return new ConfigurationAdapter() {
            public Class getGameProcessorClass() {
                return MyProcessor.class;
            }
        };
    }
}

So what’s the magic then? It is this: the class MyProcessor will be instantiated by Guice and can therefore contain injections. Further, this is done in a custom scope, per event, thus isolating instances nicely.

You can also add your own modules to the injection context, again by overriding a method in GuiceGame:

protected void preInjectorCreation(List list) {
    list.add(new MyGameModule());
}

Which means you can inject not only stuff from the current table, but also your own classes. So if we continue:

public class MyProcessor implements GameProcessor {

    /*
     * This is probably bound in the "MyGameModule" configured
     * in the Guice game extension.
     */
    @Inject
    private MyHandler handler;

    /*
     * This is a speciality, you can inject Firebase services
     * right into your classes.
     */
    @Service
    private ScriptSupport support;

    /*
     * And another shortcut, if you use Log4j, we have
     * a helper annotation for you...
     */
    @Log4j
    private Logger log;

    public void handle(GameDataAction action, Table table) {
        // do something here eh?
    }

    [...]
}

That should give you the idea. You can inject Firebase services as well as a logger (and remember, if you don’t use Log4j, Guice supports the Java utility logging package out of the box). There are a couple of things not shown here; for example, you can inject table members and the state object directly into your classes, so you don’t have to pass those around.

Any catch? Well, when you create your own modules you’ll need to keep in mind that the processor will only work in a custom scope, called EventScope. So if you have something which should be bound neither as a singleton nor in the default scope, you’ll probably need to do something like this:

bind(MyHandler.class).to(MyHandlerImpl.class).in(EventScoped.class);

And that’s it! In a few days we’ll release our script support, which is as you might imagine built on top of the Guice support. And so far? I love it!

Write a multiplayer game in 10 minutes or less!

Note: this post is cross-posted on Cubeia.com.

As I was surfing along the other day it struck me that one of the coolest things about Firebase Community Edition is how incredibly fast you can get going. Do you think the title is a boast? Well, in a manner of speaking it is. You see, we’re using Maven to build, and if you haven’t used Maven to build a Flex/Flash client before, Maven is going to start by downloading half of the Internet for you, which will inevitably slow you down and may take a few minutes. But hear me! If you have used Maven before, and if you allow for the time Maven needs on the first run to download the artifacts required to compile the Firebase game and the Flex client, then I stand firm: you will have a Flex client and a Java server going in less than 10 minutes!

Do you want to get started on a multiplayer game really, really fast? Here are two different ways:

  • The Extreme Quick Start – This hard-core, and Maven only, quick start will have you up in less than 5 minutes (excluding Maven download times). It does not however, send actual game actions between the client and the server, all you can do is join/leave tables and chat with other players… Excuse me? All you can do?! It’s completely awesome if you ask me.
  • The Beloved Hello World – This tutorial can be done with Maven and optionally Flexbuilder. It will explain along the way what happens, and it will also replace the Firebase standard chat with game actions (also chat) showing you how to communicate properly between game server and client. 10 minutes? Well, if you’re impatient and a fast reader, or if you do it twice you will most certainly beat the 10 minute mark.

If you ask me, and I’m obviously biased, this is extremely cool. Of course, this isn’t actually a game yet and there’s a lot more to learn before launching your own international success on Firebase, but hell, you want to write a game? Hop right to it!

Update: The commenting system seems to be behaving badly. Even I can’t seem to comment at the moment. Please check the main blog for updates. I’ll be looking at switching blog system now…

Firebase Community Edition

Note: this post is cross-posted on Cubeia.com.

So what can you expect from the Firebase Community Edition (FCE)? Here’s a basic rundown:

  • It’ll be free and open source under the AGPL license. This means you can use, modify and even redistribute it to your heart’s delight. However, you can’t change the copyright, nor the license itself. Also, there’s this viral GPL thing going on…
  • It’s limited to a single server. Sounds too restricted to you? Well, we’ve run thousands and thousands of players on single, rather cheap, servers so we’re not too concerned. Try it!
  • Performance! Basically FCE is an optimized single server version of the Enterprise Edition. While perhaps not your first choice for first person shooters, the FCE boasts a very low latency indeed. Not to mention the parallel event execution and the transparent transactions and so on.
  • Community support will be available. Forums, wikis etc.
  • There’s a clear upgrade path for you. Do you have a lot of players and you’re getting edgy about uptime? There’s the Enterprise Edition just waiting for you. Want more support? Sure. Scalability up to high heaven? Sure. Want a cherry on top? We’ll see what we can do…

I mean, seriously: The industry’s best and sexiest game server, for free?! It’ll be cool!

What’s this Open Source Thingy?

Note: this post is cross-posted on Cubeia.com.

Yesterday we silently let slip that we’re open sourcing Firebase. It is a major change of direction and we are very excited about the whole thing.

So Firebase, which has so far only been available under a closed proprietary license, will now be split into three distinct versions:

  • Firebase Community Edition – Open source under the AGPL license. This version is a single server only version, but is otherwise feature complete.
  • Firebase Standard Edition – The Community Edition under a proprietary license. Still a single server only version.
  • Firebase Enterprise Edition – All the bells and whistles of the current Firebase version. This includes transparent clustering, fail-over safety, rolling cluster updates, unlimited scalability and so on.

Confusing? Stay tuned, over the next couple of days and weeks I’ll post the details here. Exciting times ahead!

A Java Concurrency Bug, and How We Solved It

Note: this post is cross-posted on Cubeia.com.

Everyone agrees that good debugging is critical. But it is not entirely trivial when it comes to multi-threaded applications. So here’s the story of how I tracked down a nasty bug in the Java5 ReentrantReadWriteLock.

Update 15/4: To be a bit clearer, the Java bug in question is on 1) Java 5; and 2) only when using fair locks, non fairness seems to work, however, that was never an option for us, if nothing else due to this other Java bug… Thanks to Artur Biesiadowski for the heads up.

Our product Firebase is designed to offload threading issues and scalability from game developers to a generic platform. However, one of our customers experienced mysterious “freezes” when running a fairly small system load in their staging environment.

First step on the way: getting stack dumps when the system is frozen. In other words, request that the client do a “kill -3” on the server process when it has hung, as this dumps all threads and their current stack traces to standard output. This we did, but were only confused by it: all threads seemed dormant and there was no visible deadlock in the stack traces.

However, several threads were all mysteriously waiting on a read lock deep down in the server, and seemingly not getting any further, despite the fact that no-one was holding the write lock. This wasn’t conclusive though, as there was one thread waiting to take the write lock and this would block the readers. But given that the server was actually frozen it looked suspicious. So my first analysis concluded:

As far as I can see, the only abnormal thing that could have caused this stack is if a thread have taken the lock (read or write) and then crashed without releasing it, however there doesn’t seem to be any place in the code not safeguarded by try/finally (where the lock is released in the finally clause).

Implied in that conclusion is of course that this might either be a normal behavior and we’re looking in the wrong direction, or that we have a more serious Java error on our hands.

There’s a lot of information to be had from a ReentrantReadWriteLock, including the threads waiting for either read or write privileges, and the thread holding a write lock (if any), but not (strangely enough) the threads actually holding a read lock. And as a reader thread can effectively block the entire lock by not unlocking while a writer is waiting, this is information you really need to know.

So the next step was to get hold of the reader threads. I did this by sub-classing the ReentrantReadWriteLock to return my own version of the read lock, which, simplified, did this:

Set<Thread> readers = Collections.synchronizedSet(new HashSet<Thread>());

public Set<Thread> getReaders() {
    return readers;
}

public void lock() {
    super.lock();
    readers.add(Thread.currentThread());
}

public void unlock() {
    super.unlock();
    readers.remove(Thread.currentThread());
}

Given this code, we now have a set containing a snapshot of the threads holding the read lock. I then added a JMX command for printing the following information to standard out for the given read/write lock:

  1. The threads waiting for a read lock, including their stack traces.

  2. The threads waiting for the write lock, including their stack traces.

  3. Any threads holding a read lock, including their stack traces.

  4. The thread holding the write lock, including its stack trace, if any.
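
Put together, the tracking lock and the four-part report might look roughly like the sketch below. It is a reconstruction under assumptions, not the patch that was shipped: TrackingReadWriteLock and dump are hypothetical names, the report is returned as a string instead of being wired to JMX, and only plain lock() and unlock() calls are tracked. Because the holders are kept in a set, a thread that re-enters the read lock and then unlocks once drops out of the set while still holding the lock, which is acceptable for a snapshot but worth knowing.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class TrackingReadWriteLock extends ReentrantReadWriteLock {

    private final Set<Thread> readers =
            Collections.synchronizedSet(new HashSet<Thread>());
    private final ReentrantReadWriteLock.ReadLock trackingReadLock =
            new TrackingReadLock(this, readers);

    TrackingReadWriteLock(boolean fair) {
        super(fair);
    }

    @Override
    public ReentrantReadWriteLock.ReadLock readLock() {
        return trackingReadLock;
    }

    /** Read lock that records which threads currently hold it. */
    private static final class TrackingReadLock extends ReentrantReadWriteLock.ReadLock {
        private final Set<Thread> readers;

        TrackingReadLock(ReentrantReadWriteLock lock, Set<Thread> readers) {
            super(lock);
            this.readers = readers;
        }

        @Override
        public void lock() {
            super.lock();
            readers.add(Thread.currentThread());
        }

        @Override
        public void unlock() {
            super.unlock();
            readers.remove(Thread.currentThread());
        }
    }

    /** Builds the report: waiting readers, waiting writers, holding readers, holding writer. */
    String dump() {
        StringBuilder out = new StringBuilder();
        append(out, "Waiting reader", getQueuedReaderThreads());
        append(out, "Waiting writer", getQueuedWriterThreads());
        Set<Thread> holding;
        synchronized (readers) {
            holding = new HashSet<Thread>(readers); // copy under the set's own lock
        }
        append(out, "Holding reader", holding);
        Thread owner = getOwner();
        if (owner != null) {
            append(out, "Holding writer", Collections.singleton(owner));
        }
        return out.toString();
    }

    private static void append(StringBuilder out, String label, Collection<Thread> threads) {
        for (Thread t : threads) {
            out.append(label).append(":\n").append(t).append('\n');
            for (StackTraceElement frame : t.getStackTrace()) {
                out.append("    ").append(frame).append('\n');
            }
        }
    }
}
```

The queued-thread accessors and getOwner are protected methods of ReentrantReadWriteLock, which is why the report lives in the subclass. A JMX operation would then only need to call dump() and print the result.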

I shipped this patched code to the client and asked them to freeze the server with the patch, print the new debug information, and then send me the output. Which they did, and the relevant output looked like this (very much shortened):

Holding reader:
Thread[DQueue Handoff Executor Pool Thread { GAME }-1,5,main]
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:118)
    […]

Waiting writer:
Thread[Incoming,dqueue,127.0.0.1:7801,5,Thread Pools]
    sun.misc.Unsafe.park(Native Method)
    […]

Waiting reader:
Thread[DQueue Handoff Executor Pool Thread { GAME }-1,5,main]
    sun.misc.Unsafe.park(Native Method)
    […]

See anything strange here? It appears that the same thread is both holding and waiting for a read lock at the same time. But this is supposed to be a reentrant lock. In which case…

So, the freeze was caused by this: There’s a bug in Java5 (but not in Java6) where a fair ReentrantReadWriteLock stops a reader from re-entering the lock if there’s a writer waiting. It is of course easy to write a test case for, which you can find here.
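
The scenario can be reproduced in a few lines. What follows is my own minimal sketch of such a check, not the linked test case: a thread takes the read lock of a fair lock, a second thread queues up for the write lock, and the first thread then tries to take the read lock again. A timed tryLock is used instead of lock() so that the check reports false rather than hanging if re-entry is refused. FairReentrantReadCheck is a hypothetical name, and I have only reasoned about the behaviour on Java 6 and later, where re-entry is expected to succeed.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class FairReentrantReadCheck {

    /**
     * Returns true if a thread that already holds the read lock of a fair
     * lock can take it again while a writer is queued.
     */
    static boolean canReenterWithWaitingWriter() throws InterruptedException {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
        lock.readLock().lock();

        Thread writer = new Thread(() -> {
            lock.writeLock().lock();   // queues behind the read lock held above
            lock.writeLock().unlock();
        }, "writer");
        writer.start();
        while (!lock.hasQueuedThread(writer)) {
            Thread.sleep(10);          // wait until the writer is really queued
        }

        // the timed tryLock honours fairness; the untimed one would barge
        boolean reentered = lock.readLock().tryLock(2, TimeUnit.SECONDS);
        if (reentered) {
            lock.readLock().unlock();
        }
        lock.readLock().unlock();      // lets the writer through
        writer.join();
        return reentered;
    }
}
```

Waiting for hasQueuedThread before the second acquisition is what makes the check deterministic; without it the re-entry could happen before the writer is queued and prove nothing.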

This is now submitted as a bug to Sun, but I have yet to get a confirmation and bug number.

As for Firebase, it is now patched and released to use manually tracked re-entrance for read/write locks through the entire server when running under Java5, looking, again very much simplified, like this:

private ThreadLocal<AtomicInteger> count = new ThreadLocal<AtomicInteger>();

public void lock() {
    if (get() == 0) {
        // don't lock if we already have a count, in other words, only
        // really lock the first time we enter the lock
        super.lock();
    }
    // increment counter
    increment();
}

public void unlock() {
    if (get() == 0) {
        // we don't have the lock, this is an illegal state
        throw new IllegalMonitorStateException();
    } else if (get() == 1) {
        // we have the lock, and this is the “first” one, so unlock
        super.unlock();
        remove();
    } else {
        // this is not the “first” lock, so just count down
        decrement();
    }
}

// --- HELPER METHODS --- //

private void remove() {
    count.remove();
}

private int get() {
    AtomicInteger i = count.get();
    if (i == null) return 0;
    else {
         return i.intValue();
    }
}

private void increment() {
    AtomicInteger i = count.get();
    if (i == null) {
        i = new AtomicInteger(0);
        count.set(i);
    }
    i.incrementAndGet();
}

private void decrement() {
    AtomicInteger i = count.get();
    if (i == null) {
        // we should never get here...
        throw new IllegalStateException();
    }
    i.decrementAndGet();
}

The above code isn’t trivial, but shouldn’t be too hard to decode: we’re simply managing a “count” each time a thread takes the read lock, but we’re only actually locking the first time; the other times we simply increment the counter. On unlock, we decrement the counter, and if it is the “last” lock, that is, if the counter equals 1, we do the real unlock and remove the counter.

There are two things to learn: 1) in any sufficiently complex server, the above debug information is nice to have from the start on any important lock; and 2) it really is time to migrate to Java6.