OpenSRF Jabber: A Technical Review


As has been mentioned before on this blog, OpenSRF relies on Jabber for its communication layer. Jabber is an instant messaging service much like AIM, Yahoo Messenger, and the like. The advantage of Jabber, of course, is that it’s an open spec (see xmpp.org) and there are a number of open source server implementations, allowing us to run servers locally and write our own server code if we feel so inclined.

As in most chat frameworks, a Jabber client is identified by its username on the network. So a unique Jabber “account” would consist of something like bill@gapines.org. Jabber also adds a component called the “resource”, which allows a single account to have multiple open connections to a Jabber server. A full client login would be something like bill@gapines.org/home, bill@gapines.org/work, etc. A single user may be logged into a given server as many times as they want, so long as the resource is unique for each connection.
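To make the account/resource split concrete, here is a minimal Python sketch that breaks a full login string into its parts. The names JabberId and parse_jid are made up for this illustration and are not part of OpenSRF; real Jabber addresses also have normalization rules that this sketch ignores.

```python
from typing import NamedTuple, Optional


class JabberId(NamedTuple):
    """Hypothetical container for the parts of a Jabber address."""
    user: str
    domain: str
    resource: Optional[str]

    @property
    def bare(self) -> str:
        # The "account" part, shared by every connection of this user.
        return f"{self.user}@{self.domain}"


def parse_jid(jid: str) -> JabberId:
    """Split 'user@domain/resource' into its parts (illustrative only)."""
    # Everything after the first "/" is the resource.
    bare, slash, resource = jid.partition("/")
    user, at, domain = bare.partition("@")
    if not at or not user or not domain:
        raise ValueError(f"not a user@domain address: {jid!r}")
    return JabberId(user, domain, resource if slash else None)


home = parse_jid("bill@gapines.org/home")
work = parse_jid("bill@gapines.org/work")
# Same account, two distinct connections.
print(home.bare == work.bare, home.resource != work.resource)
```

Both logins share the bare account bill@gapines.org and differ only in the resource, which is what lets them coexist on the same server.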

OpenSRF Application Servers

In typical server fashion (think Apache) OpenSRF applications have a single inbound component and one or more worker processes that actually handle incoming requests. Both the inbound handler and each worker process will connect to the local jabber server (or servers, but we’ll save that for later :)). The username and resource don’t really matter (you’ll see why shortly), so long as the resource is unique per username (we usually add the hostname and process id to the resource for this purpose) and the username can be authenticated. A typical OpenSRF jabber id might look something like: client@app03/opensrf.math_drone_at_app05_12434. Not something you would want to use for a chat login, but it works well for machines.
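As a rough illustration of the naming scheme, the following Python sketch builds a resource from the service name, a role, the hostname, and the process id. The exact layout is inferred from the example id above, and make_resource and make_jid are hypothetical helpers, not OpenSRF functions.

```python
import os
import socket
from typing import Optional


def make_resource(service: str, role: str,
                  hostname: Optional[str] = None,
                  pid: Optional[int] = None) -> str:
    """Build a per-process resource such as 'opensrf.math_drone_at_app05_12434'.

    The format is an assumption inferred from the example id in the text.
    """
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    # Hostname plus pid keeps the resource unique per username.
    return f"{service}_{role}_at_{hostname}_{pid}"


def make_jid(username: str, jabber_host: str, resource: str) -> str:
    """Join the pieces into a full Jabber id."""
    return f"{username}@{jabber_host}/{resource}"


resource = make_resource("opensrf.math", "drone", hostname="app05", pid=12434)
print(make_jid("client", "app03", resource))
```

With the hostname and pid passed explicitly, this reproduces the example id from the paragraph above; left to their defaults, they come from the running process.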

The Router

The router, seen previously here, is the communication brains of the OpenSRF network. The router gives us a known destination for client requests as well as a way to balance client requests across multiple like servers (think 4 opensrf.math servers starting up). Every application inbound process will register itself with the router. Every application class (e.g. opensrf.math) will have a virtual server that lives on the router and proxies requests to the backend servers. For example, each opensrf.math app will announce to the router that it is alive and can serve requests destined for the “opensrf.math” virtual destination.
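The bookkeeping this implies can be sketched in a few lines of Python. This is only a toy model: the class name RouterRegistry and its methods are invented here, and plain round-robin is an assumption about how the router picks a server from the pool, not a description of the real router code.

```python
from collections import deque
from typing import Deque, Dict


class RouterRegistry:
    """Toy model of the router's per-class server pools (hypothetical)."""

    def __init__(self) -> None:
        # Application class name -> pool of registered listener ids.
        self._pools: Dict[str, Deque[str]] = {}

    def register(self, app_class: str, listener_jid: str) -> bool:
        """Add a listener to its class pool.

        Returns True when this registration created the class.
        """
        created = app_class not in self._pools
        pool = self._pools.setdefault(app_class, deque())
        if listener_jid not in pool:
            pool.append(listener_jid)
        return created

    def next_listener(self, app_class: str) -> str:
        """Pick the next listener, assuming simple round-robin."""
        pool = self._pools[app_class]
        listener = pool[0]
        pool.rotate(-1)  # move the chosen listener to the back
        return listener


registry = RouterRegistry()
registry.register("opensrf.math", "client@server1/opensrf.math_listener_at_server2_12346")
registry.register("opensrf.math", "client@server1/opensrf.math_listener_at_server4_777")
picks = [registry.next_listener("opensrf.math") for _ in range(3)]
print(picks)
```

The second listener id is made up for the example. Three consecutive picks alternate between the two registered listeners, which is the balancing effect described above.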

Piecing it all together. Assume server1 is hosting jabber and the applications are running on server2.

1. Jabber server is online

2. Router comes online and connects to the jabber server as router@server1/router

3. The OpenSRF math application is launched
a. The worker processes are launched and each process connects to the jabber server as client@server1/opensrf.math_drone_at_server2_12345, etc.

b. The listener process connects to the jabber server as client@server1/opensrf.math_listener_at_server2_12346, for example. The listener process then sends a registration request to the router asking to be added to the pool of “opensrf.math” servers.

c. If this is the first registration for opensrf.math, the router creates the new opensrf.math class and opens a new connection to the jabber server as router@server1/opensrf.math. Otherwise, the new listener connection is simply added to the pool.

4. All clients will send opensrf.math requests to the well-known location router@server1/opensrf.math. The router will go to the next registered listener in the pool and send the request to that listener.

Note: Here’s where some trickery comes in. Jabberd2, our home-spun jabber server (chop-chop), and probably just about any jabber server you find do not allow spoofing of sender addresses. If you log in as bill@gapines.org/home, any message coming through your socket will be stamped as having come from bill@gapines.org/home regardless of the “from” field on the message. Because of this, the router appends a “router_from” field to the message so the listener applications can tell who actually sent the request. Jabberd2 and chop-chop (the only servers we have tested) ignore extraneous xml attributes in the message, so this extension is essentially ignored by anyone not expecting it.

5. The listener process that receives the request passes it on to one of its worker processes.

6. The worker process does its thing, then sends its response directly back to the client that made the initial request.

7. The client realizes that it is now receiving data from a new connection and sends any future intra-session requests to this new connection.

8. Any new top-level requests for math will go right back to router@server1/opensrf.math.
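The address handling in steps 4 through 8 can be traced with a small Python simulation. Messages are plain dicts rather than real Jabber XML, the client id is invented for the example, and every function and class name here is hypothetical. The listener-to-worker hand-off is not modeled, since the mechanism is not described above.

```python
from typing import Dict

Message = Dict[str, str]


def server_stamp(message: Message, authenticated_jid: str) -> Message:
    """Mimic a Jabber server overwriting 'from' with the sender's real id."""
    stamped = dict(message)
    stamped["from"] = authenticated_jid
    return stamped


def router_forward(message: Message, router_jid: str, listener_jid: str) -> Message:
    """Router relays a request, preserving the original sender in router_from."""
    forwarded = dict(message)
    forwarded["router_from"] = message["from"]
    forwarded["to"] = listener_jid
    # The server re-stamps 'from' because the router is now the sender.
    return server_stamp(forwarded, router_jid)


def reply_address(message: Message) -> str:
    """Who should receive the response: router_from if present, else from."""
    return message.get("router_from", message["from"])


class ClientSession:
    """Tracks where a client sends requests for one session (illustrative)."""

    def __init__(self, router_address: str) -> None:
        self.router_address = router_address
        self.remote = router_address  # top-level requests start at the router

    def on_response(self, message: Message) -> None:
        # Step 7: follow-up requests go straight to whoever answered.
        self.remote = message["from"]

    def new_top_level_request(self) -> None:
        # Step 8: a new top-level request goes back to the router.
        self.remote = self.router_address


CLIENT = "client@server1/example_client_at_server3_999"  # made-up client id
ROUTER = "router@server1/opensrf.math"
LISTENER = "client@server1/opensrf.math_listener_at_server2_12346"
WORKER = "client@server1/opensrf.math_drone_at_server2_12345"

session = ClientSession(ROUTER)
request = server_stamp({"to": session.remote, "body": "request"}, CLIENT)
at_listener = router_forward(request, ROUTER, LISTENER)
# Worker replies directly to the original client (step 6).
response = server_stamp({"to": reply_address(at_listener), "body": "response"}, WORKER)
session.on_response(response)
print(at_listener["from"], at_listener["router_from"], session.remote, sep="\n")
```

The message that reaches the listener is stamped as coming from the router, yet router_from still names the client, so the worker can answer the client directly and the client can then talk to that worker for the rest of the session.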

Basic Jabber Connections

Workflow of a client request