Wednesday, March 19, 2008

SimpleDB and Eventual Concurrency

So we have a system that takes in user input, stores it in SimpleDB and then query SimpleDB to display it back to the user. Very easy. But guess what, when you query SimpleDB, that data you just put in is not there! Yikes!

That's actually expected because the nature of SimpleDB. Data is propagated between "nodes" and you do not know which node you will hit for save and which node for query. They call this "eventual concurrency". You can be assured that your data will eventually be propagated to all nodes but you do not know when and how fast.

This limitation make our simple use case impossible.

The easiest solution is to maintain our own aware-node. A node that we talk to for update and a node that we talk to for query. Whether this is a standalone app or built-in doesn't matter as long as it can have the latest value and the app (and other nodes) know its location and can talk to it.

Assuming that we have a distributed system. A cluster of nodes running the same app. When user made an entry, one of the app received it and update SimpleDB.

What's next?

How would user see latest value? Well, we can keep this user on the same node. Since this node is the node that is aware of the latest data, then we can see the latest data from here. Other nodes will hit SimpleDB to get data and that's okay. In all likelyhood, User B wouldn't mind if it takes a few minutes before seeing User A's update.

This requires sticky session. Any user session must go to the same server. That's probably not too hard to do. Designing an aware node is definitely harder to do. Basically it is a cache against SimpleDB and somehow it must know when to ask SimpleDB and when to rely on what's cached.

Now the next level of use case.

User enter data. We store it in SimpleDB. Alas, SimpleDB doesn't have much in terms of arithmetic operation, so we need a job that aggregate data. No problem can be done. We put an entry in SQS for the job to pick it up.

First problem. When the job picks it up and query SimpleDB, it might not see the data yet. So it will aggregate without the new data. Pointless.

To fix that we can stick information in SQS entry on where to look for latest data. Easy enough. We tell the job to look for latest data in the aware node.

What if two different users enter data one after another. Then we have two entries in SQS.

The problem is that we cannot be sure which entries are picked up first, and each entry is not aware of the other.

Let's say the last entry is picked up first by the job. Then the job will hit the aware node for that one latest data and get the rest from SimpleDB (maybe by asking the aware node or by itself, it doesn't matter). But if the second latest data has not been propagated, it will not know where to get it from.

One solution would be for the job to get all SQS entries in the queue. The sort them into buckets of similar data (say it knows that User A and User B are both doing update on the thing that has to be computed together). Then it takes the two SQS entries, pick up the latest data from respective aware nodes, and process data.

The drawback is that this one job will consume this entire queue and we cannot have another job on this same queue. This is okay as we can create unlimited queues.

No comments: