J. Brisbin

Just another WordPress.com weblog

RabbitMQ as a NoSQL distributed cache


Part of what I’ve been doing with the cloud-friendly Tomcat session manager is basically implementing my own asynchronous distributed object cache. At the moment, this functionality is tightly coupled to what I’m doing inside the session Store. But in making some changes recently to add Spring Security integration and make working with Spring Security 3.0 a little easier, I noticed that there’s a lot of what I’m doing inside the session Store that could simply be abstracted into its own package and used as a standalone distributed cache.

The concept is simple and I think the code will be straightforward. Instead of synchronously loading an object from a data store (which is configured on the back end to shard its data or do other kinds of distributed load-balancing and failover replication), code would request that the object be loaded asynchronously and provide a callback to be executed when the load is complete. This would actually simplify my own code and make it quite a bit more robust, and it would add another voice to an area of our industry that is getting a lot of focus at the moment: distributed caching.
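
To make that concrete, here's roughly the shape of the API I have in mind. None of these names exist yet; AsyncCache and LoadCallback are just placeholders for the pattern of asking for an object and handing over a callback:

// None of these names exist yet; this is just the shape of the API I have in mind.
public interface AsyncCache {

  // Callback handed in by the caller; invoked whenever the object arrives from
  // whichever node in the cloud has it in memory.
  interface LoadCallback<T> {
    void onLoad(T obj);
  }

  // Ask for the object identified by key and return immediately; the callback
  // fires when a provider somewhere in the cloud answers over RabbitMQ.
  <T> void load(String key, LoadCallback<T> callback);

  // Fire-and-forget store; replication of the value happens via the broker.
  void store(String key, Object obj);
}

Calling code would ask for a session by its ID, hand over a callback, and get on with its life; the Store only touches the session once the callback fires.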

Terracotta looks like a killer app in all ways. I’d love to be able to use something I didn’t write myself to solve a lot of our problems. But I spent all our money on VMware support and new servers. There’s nothing left for chasing proprietary, heavyweight solutions to our problems. I’ll use OpenSource or software I’ve written myself, or I’ll do something else entirely. A distributed data cache backed by RabbitMQ will be relatively lightweight (probably not at first, as I often have to strip things out to get to my lightweight goal) and, I’m sure, quite fast. It will transparently allow for sharding and aggregating data with no additional configuration. Since queue subscribers get load-balanced anyway, there’s no need to figure out some way to split up objects; they’ll be spread over however many listeners I put on those queues. I can also partition data by using different RabbitMQ servers and combinations of queues and exchanges.

I’m starting work on this right away since I’ll be on vacation next week and, geek that I am, will likely not be able to pull myself away for long. Expect to see something on GitHub week after next!

Written by J. Brisbin

July 7, 2010 at 7:14 pm

Adventures in GrAppEngine


Although I’ve blogged quite a bit about the cloud-based utilities I’ve been writing, releasing as OpenSource on GitHub, and working with here at the world’s largest Pizza Hut franchisee, there’s still plenty we’re doing to deploy Web 2.0 apps that I haven’t spent much time talking about. We’ve traditionally been a little tight-lipped about our application development because, quite frankly, there was no one to talk to about it. No one really cares much what a company they’ve never heard of does internally to develop applications for its own users.

But the tide is shifting away from this closed, isolationist attitude. It’s far from being endorsed at the highest levels; my superiors don’t particularly care that I blog on technical things but they’re not going to encourage me to do it. That said, I feel like I’ve made a smallish contribution to the global discussion of cloud computing. I hope to continue to do that by introducing you to aspects of our development efforts that have broader application. One of the things that might make good discussion material is our use of a custom-built Groovy-based REST framework for deploying web applications. I’ve alluded to it several times but never discussed it in detail. I’d like to open up this web framework a little and explain why we do what we do because I think it has broader application to the cloud deployment model.

The (currently unnamed) Groovy web framework I wrote is very opinionated about how to build and deploy applications. We use ExtJS (now called Sencha) internally as our Ajax toolkit of choice–primarily for the grid. There is no other JavaScript Ajax grid that I’ve found that is as powerful and easy to use as the Ext grid. It’s the foundation of a lot of our applications because the users understand how to use it. They know what a spreadsheet is and they know how to use a grid because it looks a lot like their email application. To power the grid, a developer needs to use a DataStore, which is an abstraction over Ajax and JSON that exposes server-side data to Ext components.

I realized early on that a lot of what we do when we build applications is simply expose data to end users. We give them lists (grids) and detail items that explain things to them. We give them links to other detail pages. Even updating this information is simply exposing a form that, when the user hits the “Update” button, sends the data to the back end to be persisted. There’s not a lot of actual code that needs to go into the basic CRUD operations that are the majority of our applications.

So the first thing I did when designing this framework was provide a way to map an HTTP verb (GET, POST, PUT, DELETE) to a CRUD operation (create, retrieve, update, delete) as defined in a Groovy source file. This works excellently for DataStore applications because this is all handled automatically by the Store. When the user interface updates a record on the client end and requests a save, the Store handles PUT’ing the data back to the server.

The whole of the web framework is not really designed to return HTML. It’s designed to return JSON. It handles serializing the data you want to send back to the client at the framework level. The way it does this is by using an SQL DSL (Domain-Specific Language) that allows the developer to express an SQL statement such that it can be built based on input data (e.g. by adding or removing columns or changing sort orders) and can be handed off to the framework for delegated execution.

By way of example, here’s a REST definition file that is part of a maintenance application to update an Ext menubar component:

import com.npci.enterprise.rest.util.SqlTemplate

create = {sql ->
}

retrieve = {sql ->
 dataSource = bean("postgres")

 minId = 0;
 sql {
   select "id,parent_id,title as \"text\",not(has_children) as leaf,order_index"
   from "webmenu.items_tree"
   where {
     condition column: "id", operator: ">", var: "minId"
     condition column: "id", var: "id", required: false
     condition column: "parent_id", param: "parent", type: "integer", required: false
   }
   if (exists("sort")) {
     order by: [sort], direction: dir
   }
   outputRawData true
   nolimit true
 }
}

update = {sql ->
 dataSource = bean("postgres")

 def relDelete = new SqlTemplate(dataSource, "DELETE FROM webmenu.relationships WHERE child_id = ?")
 relDelete.execute([childId])

 sql {
   insert "webmenu.relationships"
   column name: "parent_id", param: "parentId"
   column name: "child_id", param: "childId"
   column name: "order_index", param: "orderIndex"
 }
}

delete = {sql ->
}

This illustrates several things about the REST framework that make it worth our time to develop with:

  1. When the DataStore that powers the Tree component requests the data with a “GET”, the framework runs the “retrieve” closure. This particular closure doesn’t actually execute anything. It simply sets up an environment that the framework will use when it invokes the SQL being returned by the configured DSL helper object (the “sql” variable being passed into the closure). The developer can also specify a filter expression, which will only output the row of data if the filter closure returns “true”.
  2. In the “where” block, there are multiple column definitions, but only the first will always be included in the query. You’ll notice the other column definitions have “required: false” set on them. This means that if no variable exists with the name you’ve defined, the column won’t be included in the WHERE clause.
  3. The sort ORDER BY and direction are controlled by the Store. I wrote a little helper Closure called “exists” that serves the same function as PHP’s isset.
  4. “outputRawData” means don’t include some of the other metaData that is normally included in JSON responses that inform the requesting Store how many records there are and other such information about the results being returned. But the developer *can* specify any extra metaData that should be included with the results and sent back to the client if they wanted to. This provides a clean mechanism for returning not just results, but arbitrary data that can be consumed by any Ajax request, not just Ext DataStores.
  5. “nolimit” is a setting that means don’t include any kind of pagination. By default, the REST framework will NOT return full result sets. It will only return pages of results at a time. This is controlled by the Ext grid, in combination with the Store. This makes the applications incredibly fast. In addition to keeping the JSON responses small, we’re not selecting massive amounts of data from the database. We’re only working with slices of data at a time. This means performance improvements all the way back to the Postgres server, which can pull out a slice of records very quickly and efficiently.
  6. The “update” closure is invoked in response to an HTTP PUT request. You’ll notice that I also have a helper object here that ties the Spring JdbcTemplate to developer-supplied Groovy code (SqlTemplate); there’s a rough sketch of what that helper amounts to just after this list. The execute method is overloaded to take several different kinds of input. In addition to parameters for the SQL statement, it can take a closure, which is invoked with every record. Executing SQL in a REST operation, then, means specifying the SQL statement in the helper, calling execute (passing any required parameters), and providing a Closure as a callback for each record returned.
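
By way of a rough illustration, here's approximately what a helper like SqlTemplate boils down to. The real class is Groovy-aware and part of our closed-source framework, so treat this Java sketch built on Spring's JdbcTemplate as an approximation, not the actual code:

import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowCallbackHandler;

// Simplified approximation of the SqlTemplate helper described above. The real
// implementation accepts Groovy closures; here a RowCallbackHandler stands in
// for the per-record callback.
public class SqlTemplate {

  private final JdbcTemplate jdbcTemplate;
  private final String sql;

  public SqlTemplate(DataSource dataSource, String sql) {
    this.jdbcTemplate = new JdbcTemplate(dataSource);
    this.sql = sql;
  }

  // Execute with bind parameters only (e.g. the DELETE in the update closure).
  public void execute(Object... params) {
    jdbcTemplate.update(sql, params);
  }

  // Execute a query and invoke the callback once per returned record.
  public void execute(Object[] params, RowCallbackHandler perRecord) {
    jdbcTemplate.query(sql, params, perRecord);
  }
}

The Groovy closures in the REST files call into something like this under the covers; the framework supplies the DataSource and wires the closure in as the per-record callback.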

We see some real benefit to developing applications in a strictly RESTful manner. Our HTML pages are simply wrappers that define the important areas of the screen into which our Ext components will be inserted. Since our web framework is transparently integrated with our client framework, we don’t have to write code to handle plumbing. I designed the framework so that the developer would only have to define the bare minimum of information required to get data from the user’s browser into the database and vice-versa. The useful abstractions I’ve included like the SqlTemplate mean I don’t have to write any more code than is required to execute the business logic.

Groovy and AppEngine

This is all closed-source, unfortunately. I’ve asked a couple of times about simply OpenSource-ing it as is, but I get the impression there are too many fires burning (there are *always* too many fires burning) to give the idea much thought. It’s not that they’re opposed to OpenSource (in general), just that they don’t know much about giving away internal code, fear the idea at least a little bit, and would rather take the path of least resistance, which is to ignore the question and move on to the next crisis.

I firmly believe this paradigm could be useful to cloud deployments outside our company, though, and since I like to keep up-to-date with what everyone’s doing in the community, I decided to port this REST framework to an AppEngine-friendly, OpenSource version. In taking on a start-up project recently, I investigated using Grails on AppEngine and found it was unwieldy because it required restarts of the AppEngine server *every time I saved my Groovy source*. I simply can’t be productive that way, so I chose to use Rails 3 and deploy on Heroku.

But I still want to see a cloud deployment option for Java/Spring apps and, even though this framework is no different than Grails in that you will be constantly restarting the server (it’s a side-effect of the Draconian limitations Google places on Java apps running on AppEngine), it should make developing RESTful Ajax AppEngine applications a little less painful. It won’t include a full ORM because, in my opinion, that’s impractical when developing Ajax/REST applications where the data is meant to be sent to the client for the actual processing. Why go to the trouble of wrapping a datastore in an ORM so you can have dynamic finder methods when all you’re going to do with those objects is serialize them and send them on out to the client? Less is more in this case, so I opted to adapt the SQL DSL I wrote to interact with PostgreSQL and the AS/400 to a JDOQL version that puts fewer layers between the actual data and the JavaScript on the other end of the request.
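
To give a feel for how thin that layer is, a retrieve against BigTable ends up being little more than this. MenuItem here is a hypothetical JDO-annotated class; the real DSL wraps this plumbing so the developer never writes it by hand:

import java.util.ArrayList;
import java.util.List;
import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManager;
import javax.jdo.PersistenceManagerFactory;
import javax.jdo.Query;
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

// Hypothetical persistent class; the real data obviously depends on the app.
@PersistenceCapable
class MenuItem {
  @PrimaryKey
  @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
  Long id;
  @Persistent
  Long parentId;
  @Persistent
  String title;
}

public class MenuItemRepository {

  // "transactions-optional" is the usual PMF name in an AppEngine jdoconfig.xml.
  private static final PersistenceManagerFactory PMF =
      JDOHelper.getPersistenceManagerFactory("transactions-optional");

  // Pull the children of a menu node straight out of BigTable with JDOQL.
  // The list gets serialized to JSON and handed to the Ext component on the
  // other end; no dynamic finders, no extra mapping layer.
  @SuppressWarnings("unchecked")
  public List<MenuItem> findChildren(Long parentId) {
    PersistenceManager pm = PMF.getPersistenceManager();
    try {
      Query query = pm.newQuery(MenuItem.class, "parentId == p");
      query.declareParameters("Long p");
      List<MenuItem> results = (List<MenuItem>) query.execute(parentId);
      // Copy before closing the PersistenceManager so the JSON serializer
      // isn't touching a closed result set later.
      return new ArrayList<MenuItem>(results);
    } finally {
      pm.close();
    }
  }
}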

Another limitation I don’t like at all is the time limit for requests. Since this REST framework enforces delegation of the execution of queries to the framework itself, it would be easy to farm out requests for large amounts of data processing to an asynchronous queue, where the work could be done in true parallel, cloud fashion. Page requests, then, would be shorter in duration because of the parallelism. But AppEngine has no such capability. Task Queues are an approximation, of course. But Task Queues cannot replace an asynchronous message bus where workers are listening for events and do work in an event-driven way.

Another piece I’ve intentionally left out, one that developers familiar with other web frameworks might be expecting, is a templating system. I use Sitemesh to make things a little more Grails-like, but we don’t have a need for a complex templating system because we don’t generate HTML. Everything that comes out of our REST applications comes out in JSON format. Data display is handled entirely by the Ext toolkit in the user’s browser. If your REST operation Groovy code delegates execution of a SQL statement to the framework, the framework handles streaming the JSON out to the browser; the developer doesn’t need to do anything at all. The developer can, of course, choose to output their own content. You *could* output HTML or plain text or a PDF if you wanted. But the point here is that you don’t have to. The idea of the framework is to let the developer get to the business of the application faster, without taking time and energy away to write code that doesn’t really contribute to the execution of the business logic.

I translated most of the critical portions of the codebase into an OpenSource version that’s friendly to the confines of AppEngine. Script reloading could be achieved by storing Groovy code in BigTable and providing a ResourceConnector to the GroovyScriptEngine that reads files from BigTable rather than loading a resource from the classpath. But I’m not really sure how easy it would be to provide some kind of build hook that loaded the Groovy source into BigTable whenever the local resource is saved. Without this, the developer would have to use a form and a textbox to edit their Groovy code. That’s doable, but pretty rotary-phone if you ask me.
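
For the curious, the ResourceConnector half is the easy part. Here's a rough Java sketch of what it might look like; loadSourceFromDatastore() is a hypothetical helper that would look the script up in BigTable by name:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import groovy.util.ResourceConnector;
import groovy.util.ResourceException;

// Rough sketch: serve Groovy source to the GroovyScriptEngine out of the
// datastore instead of the classpath. A real version would also report a
// last-modified timestamp so the engine knows when to recompile.
public class DatastoreResourceConnector implements ResourceConnector {

  public URLConnection getResourceConnection(String name) throws ResourceException {
    final byte[] source = loadSourceFromDatastore(name);
    if (source == null) {
      throw new ResourceException("No script named " + name + " in the datastore");
    }
    try {
      // The API wants a URLConnection, but all the data comes from memory.
      return new URLConnection(new URL("file:///datastore/" + name)) {
        public void connect() throws IOException {
          // nothing to do; the bytes are already in hand
        }
        public InputStream getInputStream() throws IOException {
          return new ByteArrayInputStream(source);
        }
      };
    } catch (IOException e) {
      throw new ResourceException("Could not wrap script " + name, e);
    }
  }

  private byte[] loadSourceFromDatastore(String name) {
    // hypothetical: look the script body up by name in BigTable (JDO or the
    // low-level datastore API)
    return null;
  }
}

The engine itself would then be built with new GroovyScriptEngine(new DatastoreResourceConnector()) instead of being pointed at a directory of scripts.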

I’ll be posting an example application that should run as-is on AppEngine, interacts with BigTable via JDO, and illustrates all the points I’ve tried to outline in this rather lengthy article. It will be on GitHub, alongside my other cloud utilities. I won’t give myself a deadline just yet. I’m still halfway through a fairly extensive renovation of my house, where my wife and I are doing all the work. Time is limited. But I’ve been wanting to publish an OpenSource version of what I’m doing at work for a long time now. Keep an eye out!

Written by J. Brisbin

July 7, 2010 at 1:11 am

Cloud-friendly Classloading with RabbitMQ


One of the things everyone who deploys artifacts into the cloud has to deal with is the issue of classloading. If you have multiple nodes out there, listening to your RabbitMQ server, waiting to do work, you have to have pre-deployed all the dependencies you need. That means either having some system that copies them out there automatically (in the case of deployable artifacts), or simply copying the JAR files into a lib/ directory somewhere that the listener has access to.

Neither of these solutions is ideal.

I was contemplating this on my way to work the other day and I’ve come up with a solution that I’m most of the way finished coding: write a ClassLoader that uses RabbitMQ to load the class data from a “provider” (just a listener somewhere in the cloud that actually *does* have that class in its CLASSPATH).

There are two moving parts: a Provider and a ClassLoader. The Provider has a number of message listeners and binds them to the configured exchange with routing keys that could be “com.mycompany.cloud.#”, or “com.thirdparty.#”, or simply “#”. The routing key is the class or resource name, so you could have different providers for different areas of responsibility. Third-party classes could come from one provider, while your own internal class files could come from an entirely different provider (ostensibly running on a different VM).
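
To give you an idea of where I'm headed, here's a stripped-down sketch of the ClassLoader half. This isn't the code I'll be publishing; the exchange name, the crude polling, and the timeout are all placeholders:

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.GetResponse;

// Stripped-down sketch of the ClassLoader half. The routing key is the class
// name, so whichever Provider has bound a matching key answers with the bytes.
public class CloudClassLoader extends ClassLoader {

  private final Channel channel;
  private final String exchange; // e.g. "vcloud.classloading" -- placeholder name

  public CloudClassLoader(ClassLoader parent, Channel channel, String exchange) {
    super(parent);
    this.channel = channel;
    this.exchange = exchange;
  }

  @Override
  protected Class<?> findClass(String name) throws ClassNotFoundException {
    try {
      // Private reply queue for this request.
      String replyQueue = channel.queueDeclare().getQueue();
      AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
          .replyTo(replyQueue)
          .correlationId(name)
          .build();
      // The class name is the routing key; a Provider bound to "com.mycompany.#"
      // (or just "#") picks it up and publishes the class bytes back.
      channel.basicPublish(exchange, name, props, new byte[0]);

      // Poll the reply queue for a few seconds (crude, but fine for a sketch).
      for (int i = 0; i < 50; i++) {
        GetResponse response = channel.basicGet(replyQueue, true);
        if (response != null) {
          byte[] bytes = response.getBody();
          return defineClass(name, bytes, 0, bytes.length);
        }
        Thread.sleep(100);
      }
    } catch (Exception e) {
      throw new ClassNotFoundException(name, e);
    }
    throw new ClassNotFoundException(name);
  }
}

The Provider side is just the mirror image: a consumer bound with those routing keys that looks the class up in its own CLASSPATH, reads the bytes, and publishes them to the replyTo queue.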

Some potential uses:

1. You could provide added layers of class file security, because you could control exactly where class files come from without those class files ever being copied to the file system.
2. You could provide class files dynamically to nodes that come up and down based on system demand but still need to do work that requires those individual classes. Amazon EC2 instances would not need to be pre-populated with JAR files, simply configured to use the cloud classloader pointed to your RabbitMQ server.
3. You could wrap normal classloading with some AOP hooks that would cloud-ify an entire installation without touching the source code or using special configurations.

Point number 3 is the most interesting to me. Using Spring AOP, one could wrap normal classloading with a cloud-friendly version, which would alter the way all your classloaders work, without having to hack on the Tomcat source code (or whatever application you’re deploying).

I suspect I’ll write a Maven-aware Provider that will search Maven repositories for requested class files. I’m sure there are other possibilities here.

Code will be posted on GitHub this week or next.

As always, patches and feedback are eagerly sought and heartily welcomed.

Written by J. Brisbin

June 29, 2010 at 8:33 pm

Log4J Logging with RabbitMQ


In troubleshooting some problems I was having deploying my cloud-based session manager, I quickly grew frustrated by having to tail log files in three or four windows at once. With no real ability to filter what I was looking for, my important log messages would get buried under the truckloads of other DEBUG-level messages being dumped into those log files. I simply needed a better way to aggregate and monitor my log files.

I wrote an appender for Log4J that dumps logging events into a RabbitMQ queue rather than writing them to disk or inserting them into a database.
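
The core of such an appender is surprisingly small. Here's the gist of it; the real amqp-appender handles connection management, configuration, and asynchronous publishing, so treat the exchange name, the routing-key scheme, and the way the channel gets wired in below as illustrative only:

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.MessageProperties;

// The gist of the appender: instead of writing the event to disk, format it
// and publish it to a topic exchange. How the Channel gets created and
// configured is glossed over here.
public class AmqpAppender extends AppenderSkeleton {

  private Channel channel;
  private String exchange = "log4j.events"; // illustrative name

  public void setChannel(Channel channel) {
    this.channel = channel;
  }

  @Override
  protected void append(LoggingEvent event) {
    try {
      // Route by logger name and level so consumers can bind to just what
      // they care about, e.g. "com.mycompany.#" or "#.ERROR".
      String routingKey = event.getLoggerName() + "." + event.getLevel().toString();
      String message = (layout != null) ? layout.format(event) : event.getRenderedMessage();
      channel.basicPublish(exchange, routingKey, MessageProperties.TEXT_PLAIN, message.getBytes());
    } catch (Exception e) {
      errorHandler.error("Could not publish log event to RabbitMQ", e, 0);
    }
  }

  public boolean requiresLayout() {
    return false;
  }

  public void close() {
    // channel/connection shutdown would go here
  }
}

A consumer somewhere (the web front-end, once I write it) just binds a queue to that exchange and gets a live, filterable stream of every event from every node.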

Our company is quite frugal, so the quote we got for Splunk, a tool to aggregate log files, was throat-constricting. Something in the tens of thousands! Thanks, but no thanks.

I haven’t written a web front-end for this yet, but it will be really simple to do when I get around to it. It will have a listener on the log events queue that processes incoming log events and builds nice grids so I can sort and search and do all those other Web 2.0 Ajax-y things.

It’s part of the larger umbrella of private, hybrid cloud utilities I have on GitHub. You can download the source on the vcloud project page: http://github.com/jbrisbin/vcloud/tree/master/amqp-appender/

Written by J. Brisbin

June 25, 2010 at 8:33 pm

Cloud Artifact Deployment with RabbitMQ and Ruby


Running a hybrid or private cloud is great for your scalability but can get a little dodgy when it comes to deploying artifacts onto the various servers that need them. To show how I’m solving this problem, I’ve uploaded my Ruby scripts that monitor and deploy artifacts that have been staged by the automated processes on my continuous integration server, TeamCity. In order to make it fairly secure, it will not deploy arbitrary artifacts. Anything you want automatically deployed must be explicitly configured with the URL from which to download the artifact and the path to which you want it copied (or unzipped/untarred).

The Parts

There are a couple of moving parts here. You need a RabbitMQ server, of course. You also need a couple of servers to deploy things to. I use three instances of SpringSource tcServer (basically Tomcat 6.0) per Ubuntu 10.04 virtual machine, so this script needs to deploy the same file to three different locations. I also need to deploy HTML files to my Apache server’s document root. As an aside: Apache has now been relegated to only serving static resources and PHP pages and is no longer the out-in-front proxy server. I’ve switched to HAProxy. I love it. More on that in a future post.

The Scripts

I haven’t included the script that actually publishes the notifications in the repository yet. That’s a Python script at the moment (Ruby is so much more fun to program in than Python :). It looks like this:


#!/usr/bin/python

import os, sys, hashlib
from amqplib import client_0_8 as amqp

EXCHANGE = 'vcloud.deployment.events'

def get_md5_sum(filename):
  if not os.path.exists(filename):
    return None

  md5 = hashlib.md5()
  try:
    with open(filename, 'rb') as f:
      bytes = f.read(4096)
      while bytes:
        md5.update(bytes)
        bytes = f.read(4096)
  except IOError:
    # Probably doesn't exist
    pass
  return md5.hexdigest()

def send_deploy_message(queue=None, artifact=None, unzip=False):
	if not queue is None and not artifact is None:
		md5sum = get_md5_sum('/var/deploy/artifacts/%s' % sys.argv[2])
		#print 'MD5: %s' % md5sum

		mq_conn = amqp.Connection(host='rabbitmq', userid='guest', password='guest', virtual_host='/')
		mq_channel = mq_conn.channel()
		mq_channel.exchange_delete(EXCHANGE)
		mq_channel.exchange_declare(EXCHANGE, 'topic', durable=True, auto_delete=False)
		mq_channel.queue_declare(queue, durable=True, auto_delete=False, exclusive=False)
		mq_channel.queue_bind(queue=queue, exchange=EXCHANGE)
		msg = amqp.Message(artifact, delivery_mode=2, correlation_id=md5sum, application_headers={ 'unzip': unzip })
		mq_channel.basic_publish(msg, exchange=EXCHANGE, routing_key='')

if __name__ == '__main__':
	send_deploy_message(queue=sys.argv[1], artifact=sys.argv[2], unzip=sys.argv[3])

I’ll be converting this to Ruby at some point soon.

You can check out the Ruby scripts themselves on GitHub: http://github.com/jbrisbin/cloud-utils-deployer

The Deployment Chain

When our developers check anything into our Git repository, TeamCity sees that change, builds the project, and automagically stages those artifacts onto the development server. This deployment requires no manual intervention. We always want development to use the latest bleeding edge of our application code. Once we’ve had a chance to test those changes and we’re ready to push them to production, I have a configuration in TeamCity that calls the above Python script. The developer can just click the button and it publishes a message to RabbitMQ announcing the availability of that project’s artifacts (of which there are likely several). We haven’t decided how often we want the actual deployment to happen, but for the moment a cron job runs at 7:00 A.M. every morning on all the running application servers (it should also be run from an init.d script to catch servers that have been down and are behind on their artifacts). That script is the “monitor” script. It simply subscribes to a queue with the same name as the configuration section in the monitor.yml YAML file:


myapp.war:
  :deploy: deploy -e %s

The “%s” placeholder in the “:deploy” section (the preceding colon is significant in Ruby) will be replaced by the name of the artifact as pulled from the body of the message. It may or may not correspond to the queue name. It doesn’t have to because it’s simply an arbitrary key in the deploy.yml file.

The “deploy” script is where all the fun happens. Via command-line switches, you can turn on or off the ETag matching and MD5 sum matching it does to keep from redeploying something that it’s already deployed (it keeps track in its own cache files).

First, the deployment script has to download the resource to a temporary file:


request = Net::HTTP::Get.new(@uri.request_uri)
load_etags do |etags|
	etag = etags[@name]
	if !@force and !etag.nil?
		request.initialize_http_header({
			'If-None-Match' => etag
		})
	end

	response = @http.request(request)
	case response
		when Net::HTTPSuccess
			# Continue to download file...
			$log.info(@name) { "Downloading: #{@uri.to_s}..." }
			bytes = response.body
			require "md5"
			@hash = MD5.new.update(bytes).hexdigest
			# Write to temp file, ready to deploy
			@temp_file = "/tmp/#{@name}"
			File.open(@temp_file, "w") { |f| f.write(bytes) }
			# Update ETags
			etags[@name] = response['etag']

			outdated = true
		when Net::HTTPNotModified
			# No need to download it again
			$log.info(@name) { "ETag matched, not downloading: #{@uri.to_s}" }
		else
			$log.fatal(@name) { "Error HTTP status code received: #{response.code}" }
	end

	if @use_etags
		save_etags(etags)
	end
end

This method returns true or false depending on whether it thinks the resource is out of date. The deployment script then calls the “deploy!” method, which attempts to either copy the resource (if it’s, say, a WAR file) or unzip the resource to the pre-configured path (if it’s, say, a “.tar.gz” file of static HTML resources or a “.zip” file of XML definitions). The deployer decides whether to try to unzip or untar based on the extension. If it’s “.tar.gz” it will run the “tar” command. If it’s anything else, it will try to unzip it. This isn’t configurable, but it might be a good project for someone if they want to use “.tbz2” files or something! 🙂

Permissions

The user you run this as matters. I have the log file set to “/var/log/cloud/deployer.log”. This is configurable in the sense that you can download the source code and change it in the constant where it’s defined (cloud/logger.rb). Your user should also have write permission to a directory named “/var/lib/cloud/”. You can change this (at the moment) only by editing the “cloud/deploy.rb” file and changing the constants. There are only so many hours in the day; I just didn’t have time to make it fully configurable. I’d love some help on that, though, and would gladly accept patches!

Still to come…

I just haven’t had time to make it a true Gem yet. That’s my intention, but at this point, on a Friday afternoon, I’m thinking it’ll be next week before that’s done. UPDATE: Done! This is now on RubyGems.org.

As always, the full source (Apache licensed) is on GitHub:

http://github.com/jbrisbin/cloud-utils-deployer

I’d love to hear what you think.

Written by J. Brisbin

June 11, 2010 at 8:55 pm

Beware the 802.3ad Beast!


I readily admit I’m not a network person. We have engineers for that where I work, so I don’t have to be. But every once in a while, something crops up that makes even our network engineers scratch their heads, shrug, and say something to the effect of: “from what I can tell, it should work.”

We use VMware ESX server as our virtualization hypervisor and we manage the hosts with vSphere. I love my VMware boxes. You’d have to pry them from my cold, dead fingers if you want to get them away from me. But last week I came very close to actually cursing. The problem seemed like it was something very basic: virtual machines on one ESX host couldn’t always talk to virtual machines on another ESX host. Identically configured hosts, no less! If I ssh’d into one box and issued a “ping” back to the other box, traffic started flowing again. The switch was simply forgetting how to get to that other VM after some number of minutes. We tried every setting in the Cisco switch. We even called VMware support and they looked at everything and couldn’t find any glaring errors.

The next day, on a whim, I had our network engineer enable EtherChannel (Cisco’s name for 802.3ad) on the ports my ESX hosts were plugged into and I switched the ESX server’s NIC load balancing to “IP hash” instead of originating port ID and voila! It magically started working.

I was so frustrated the previous day that I was a little annoyed, truth be told, that the fix was something so simple. So if your VMs have trouble talking to other VMs reliably and you think your switches are losing the ARP entries for those VMs, enable EtherChannel for the NICs you’re using on that ESX host and change your load balancing to “IP hash” on the vSwitch properties in your vSphere client. It’ll save you a lot of headache.

Written by J. Brisbin

June 7, 2010 at 4:23 pm

Posted in The Server Side

Tomcat/tcServer session manager with attribute replication


I’d like to think that my programming projects don’t suffer from philosophical schizophrenia but that I simply move from point A to point B so fast it just looks that way. Unfortunately, sometimes I have to come face-to-face with this and accept it for what it is: my programming projects suffer from philosophical schizophrenia.

I say this because I’ve been hacking on some changes to the virtual/hybrid cloud Tomcat and tcServer session manager and I went through several stages while I was trying to solve some basic problems.

For one, how do I get a serialized object in the cloud when I don’t know where it is? RabbitMQ comes to my rescue here because I think I’ve finally settled on the most efficient way to answer this question: bind a queue to a topic exchange using the object’s key or identifier (in this case, a session ID). Then any messages for that object automagically get routed to the right VM (the one that has that object in memory). I’m thinking this idea can be extended to create an asynchronous, RabbitMQ-backed Map implementation.
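
In code, the trick amounts to a couple of lines against the RabbitMQ Java client. The exchange and queue names here are illustrative, not what the session manager actually uses:

import java.io.IOException;
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;

// The trick in a nutshell: the node that owns a session binds a queue to the
// topic exchange with the session ID as the binding key. Anyone who needs that
// session publishes a load request using the ID as the routing key and the
// broker delivers it to the right VM.
public class SessionRouting {

  static final String EXCHANGE = "vcloud.sessions"; // illustrative name

  // On the server that owns the session: bind a queue using the session ID
  // as the binding key so load messages for that session land here.
  static void advertiseOwnership(Channel channel, String queueName, String sessionId)
      throws IOException {
    channel.exchangeDeclare(EXCHANGE, "topic", true);
    channel.queueDeclare(queueName, true, false, false, null);
    channel.queueBind(queueName, EXCHANGE, sessionId);
  }

  // On any other server: ask for the session by publishing with the session ID
  // as the routing key; the reply comes back on our own reply queue.
  static void requestLoad(Channel channel, String sessionId, String replyQueue)
      throws IOException {
    AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
        .replyTo(replyQueue)
        .build();
    channel.basicPublish(EXCHANGE, sessionId, props, "load".getBytes());
  }
}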

Now that I have the object, how do I keep it in sync with the “master”? In my case, I send a replication message out whenever a session’s (set|remove)Attribute methods are called and the value objects differ. One notable problem that I don’t see being easily overcome (though thankfully it doesn’t apply to my scenario) is sessions that have listeners attached. I don’t have RabbitMQ wired into the internal Catalina session event mechanism. I could add that at some point, but for the moment, I think this kind of dumb saving/loading via RabbitMQ messages will work for what we’re doing.

I’ve now switched back to a ONEFORALL type of operation mode, which means there is only one real session object, and it resides in an internal map on the server that first created it. Whenever another server sees a request for this session, it will send a load message to this queue every time it needs it. That last part is important: it loads the session object every time it needs it. When code sets an attribute on server TC3, that attribute is replicated back to the original server (TC1) so subsequent session loads get that updated object. I’m still trying to wrap my head around how I want to handle replication in case of node failures. No good answer on that one yet.

REPLICATED mode is my next task. In simplifying this code, I focused on getting the ONEFORALL mode working right to begin with. Now I can go back and make it more performant by cranking out a little extra code to handle replication events.

Initial smoke tests seem to indicate this works pretty well. Session load times in my testing were around 20-60 milliseconds. Depending on the location of your RabbitMQ server and your network topology, you might experience different results. I’m sanding off a few rough edges now and I’ll be testing this on a new set of cloud servers we’re setting up as part of a new vSphere/vCenter-managed cloud.

As always, the code is available on GitHub:

http://github.com/jbrisbin/vcloud/tree/master/session-manager/

Written by J. Brisbin

May 4, 2010 at 6:35 pm