Greg Reinacker’s Weblog

Musings on just about everything.

Archive for December, 2002

Google and Greg

December 20th, 2002 by gregr

Ok, every now and then I search for ‘Greg’ in Google, just to see where I am.  Today I’m up to #40, behind sites like Dharma & Greg and Greg the Bunny.  :-)  But the one that caught my eye was #43, which was for Greg Ray.  He is an IRL driver (a pretty famous one), and in his bio it says:

First racing experience
Drove a spec racer at SCCA driver’s school, Denver, Colorado in September 1991. 

He raced with LaRue Motorsports, which is the company I currently race with.  The LaRues have two daughters, named Nikki (now 21) and Michelle (16), who work on the cars along with their parents.  They have a picture on their site of Nikki and Michelle with Greg Ray, way back when he was racing the Spec Racer Fords.  Very cool!  Small world.  :-)

Category: Uncategorized | No Comments »

More on Caching

December 20th, 2002 by gregr

After writing a bit the other day on server farm caching, I noticed an article in December’s MSDN Magazine titled “Use Data Caching Techniques to Boost Performance and Ensure Synchronization for Your Web Services”.  I read it this morning over a bagel…

The article describes an N x N cross-linked mechanism for cache coherency notifications.  Whenever a server updates the data, it calls web services on all the other servers in the farm telling them to update themselves.  It looks up the server list in a database and walks through the list, calling the “refresh” web service on each of the servers.

I don’t mean to be negative, but I don’t think this is anywhere close to something that could be usable for a production server environment.  Here’s why:

  • In a production environment, the server list typically isn’t in the database.  It’s managed by administration tools like Application Center 2000, or it’s managed by your hardware load balancer.  And when a server goes down, the load balancer knows about it – the database list doesn’t.  Keeping this separate list is unworkable unless you can script quite a bit of updating code from the load balancer.
     
  • Suppose server 1 gets an update, and server 3 has just gone down.  While server 1 is walking through the list of servers, it calls the refresh function on server 3.  Depending on the failure situation server 3 is in, this call might seem to be accepted, but not respond…which would cause the call from server 1 to time out.  Server 4 doesn’t hear about the update until that timeout expires, so I could have more than a minute of cache incoherency between servers 1 and 4.
     
  • Suppose server 1 gets an update, writes it to the database, and is about to start walking through the list of servers, when suddenly it goes down for some reason.  Servers 2-4 will never know about the update.  We now have a potential for indefinite cache incoherency.

And in addition, this solution suffers from the problems discussed in my previous post.  It may seem like this is all nitpicky, but these are the things you have to worry about when building a system which must run reliably and ensure data consistency in the face of multiple failure modes.
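
The second and third failure modes above can be played out in a small simulation.  This is only a sketch under my own assumptions, not the article's code: the `Server` class, the `crash_after` knob, and the 90-second timeout are all made up for illustration, and the "clock" is simulated rather than real.

```python
from dataclasses import dataclass, field

TIMEOUT_SECONDS = 90  # assumed web-service call timeout, not from the article


@dataclass
class Server:
    name: str
    up: bool = True
    cache: dict = field(default_factory=dict)

    def refresh(self, key, value):
        # Stand-in for the "refresh" web service; a down server never answers.
        if not self.up:
            raise TimeoutError(f"{self.name} did not respond")
        self.cache[key] = value


def notify_peers(origin, farm, key, value, crash_after=None):
    """Walk the server list in order, calling refresh on each peer.

    crash_after: hypothetical knob; the origin dies after that many attempts.
    Returns {peer name: simulated seconds until refreshed, or None if never}.
    """
    origin.cache[key] = value
    clock = 0
    attempts = 0
    refreshed_at = {}
    for peer in farm:
        if peer is origin:
            continue
        if crash_after is not None and attempts >= crash_after:
            refreshed_at[peer.name] = None  # origin is gone; nobody retries
            continue
        attempts += 1
        try:
            peer.refresh(key, value)
            refreshed_at[peer.name] = clock
        except TimeoutError:
            clock += TIMEOUT_SECONDS  # the walk stalls for a full timeout
            refreshed_at[peer.name] = None
    return refreshed_at


# Server 3 is down: server 4 stays stale until the call to server 3 times out.
farm = [Server("s1"), Server("s2"), Server("s3", up=False), Server("s4")]
stalled = notify_peers(farm[0], farm, "price", 10)

# Server 1 dies right after its own update: servers 2-4 never hear about it.
farm2 = [Server("s1"), Server("s2"), Server("s3"), Server("s4")]
orphaned = notify_peers(farm2[0], farm2, "price", 10, crash_after=0)
```

In the first run server 4 is refreshed only after the simulated timeout; in the second, no peer is ever refreshed, which is the indefinite incoherency case.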

I’m looking forward to seeing what Justin has come up with – this is an interesting problem, without a simple answer!

Virtual PC

December 19th, 2002 by gregr

I’ve been reading about what sounds like a super-cool product – Connectix Virtual PC.  It’s similar to VMware, in that you can create virtual machines and install different operating systems.  Great tools – especially for those who must develop and test software for multiple platforms.  But what really caught my eye was this feature:

“Undoable Drives allow you to store or discard changes without affecting original state of underlying drive”

Among other things, what an awesome tool to test application installations.  I can create an image with, say, Windows .NET Server, and save it.  I can then start this OS, run my installation package, see how it went, and undo the whole thing automatically to restore the previous image.  Instantly.  No re-imaging of a drive.  No running the un-install and assuming it worked like it was supposed to.  We’ve all been through the pain of having to test and re-test installations on clean OS installations; the pain is even worse sometimes for server installations, where you are running scripts to modify the IIS metabase, etc. 

Very cool, if it does what it claims.  Was I the only one who didn’t know about this?

Caching in Web Farms

December 15th, 2002 by gregr

Justin is discussing caching in a multi-server environment, which I wanted to comment on.

First, there is the question of what data is good to cache.  Justin mentions a possible rule of thumb:

Let’s assume you have some data that is updated 1 time out of every 1000 reads. That’s a good candidate for caching in my mind.

I think this really depends on the application.  The more the data gets updated, the more cache coherency processing will be involved (unless you allow stale data, which we will discount for this discussion).  And as he mentions elsewhere, the cost of reading the data from the actual data store certainly factors into the equation.

He then discusses the interesting question of how to implement a cache in a multi-server environment.  He concludes that an N x N cross-linked notification mechanism is unworkable in real life (I agree in general), and continues:

So what are some other ways? I could do a reliable TCP multicast to all the web servers. That way all I have to know is one IP address. But TCP multicasting doesn’t scale that well in my experience with SwiftMQ and iBus JMS solutions.

The tough problem I see here, even if you could make this work and scale, is the race conditions.  Sometimes you use transactional reads from databases to ensure that you read the data either before or after a writing transaction, but not during.  But in any case, you know the data is the most up to date available.  With a system like this, you could have just updated the cache on your machine, and you’re starting to broadcast updates to the other servers, but at the same time another server is reading the same value from its own cache.  The data is stale, if only by a few milliseconds.  But in many applications, this will be a huge problem.

So one might say, “I’m only caching data that doesn’t change very often, like my catalog and prices, so this won’t be a problem.”  Well, I’m not so sure…if you update the price of an item, but there is a small window where your servers do not agree on what the price is for the item, then you’re screwed – and depending on how your load is distributed and how your application is designed, it’s conceivable you could end up with corrupt data.
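
To make that window concrete, here is a deterministic sketch of it.  Everything in it is hypothetical (the `WebServer` class, the `between_steps` hook, the prices); the hook simply stands in for a request that another server happens to handle after the local update but before the broadcast arrives.

```python
class WebServer:
    def __init__(self, name, prices):
        self.name = name
        self.cache = dict(prices)  # each server holds its own copy

    def read_price(self, item):
        return self.cache[item]


def update_price(origin, peers, item, new_price, between_steps=None):
    """Update the origin's cache, then broadcast to the peers one at a time.

    between_steps is a hypothetical hook that runs inside the window between
    the local update and the broadcast.
    """
    origin.cache[item] = new_price
    if between_steps is not None:
        between_steps()
    for peer in peers:
        peer.cache[item] = new_price


a = WebServer("a", {"widget": 100})
b = WebServer("b", {"widget": 100})
seen = []

# Record what each server would quote for the same item inside the window.
update_price(
    a, [b], "widget", 120,
    between_steps=lambda: seen.append((a.read_price("widget"), b.read_price("widget"))),
)
```

Inside the window the two servers quote different prices for the same item; once the broadcast finishes they agree again, but any order priced in between has already used the stale value.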

Justin then describes another scenario:

One thing that I’ve been playing around with is this –


Here I have a cache server. The web servers simply talk to the cache server and the cache server can batch up commands to the database. This works very well with Domain Models. I’ve never tried it using a Table Module Gateway. But it should work the same.

This certainly solves the problem I was just discussing.  However, you now have something between the web servers and the database, which has become a single point of failure.

Aha, one thinks, I will have multiple cache servers, and they will somehow ensure coherency between themselves.  Well, now you’re back to the original problem of coherency and timing, and you also need a fail-over model which maintains absolute coherency - and it’s not an easy problem to solve.

I think this is a hard problem to solve in the general case, and that’s probably why there aren’t any magic cache servers being sold.  But for a particular application, it’s not quite as bad, because you know how often your data will be updated, possibly the time of day it will be updated, etc.  The easiest data to cache is going to be data that doesn’t change except on a fixed schedule (for example, update the product catalog once a day).  For data that is subject to updates at any point, I think you’re left with an application-specific solution, which can take into account the strictness of your coherency requirements.
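
For the fixed-schedule case, a cache can simply expire at the next scheduled update, with no notifications between servers at all.  The sketch below assumes a once-a-day catalog update at 03:00; that time, the `ScheduledCache` class, and the `loader` callback are all my own placeholders.  It also quietly assumes every server's clock agrees and that the update has finished by the scheduled time.

```python
from datetime import datetime, time, timedelta


def next_refresh(now, refresh_at=time(3, 0)):
    """Return the first scheduled update instant strictly after `now`."""
    candidate = datetime.combine(now.date(), refresh_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


class ScheduledCache:
    """Caches one value and reloads it only after a scheduled update time."""

    def __init__(self, loader, clock=datetime.now, refresh_at=time(3, 0)):
        self._loader = loader          # reads the real data store
        self._clock = clock            # injectable so the schedule can be tested
        self._refresh_at = refresh_at
        self._value = None
        self._expires = None

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._loader()
            self._expires = next_refresh(now, self._refresh_at)
        return self._value
```

Each server reloads independently the first time it is asked after the scheduled time, so the only coordination needed is agreeing on the schedule.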

Transaction Management with EnterpriseServices

December 1st, 2002 by gregr

Lots of comments on my previous post.

Ingo says:

But what happens, if you later decide that you need distributed TX, probably because another method wants to integrate the addition of a new customer with a post to a message queue? This wouldn’t be possible using the code you’ve shown.

Right, obviously my example only works with a single RM, and it was intended as such.  The situation you describe is, in my mind, the most pressing argument for just biting the bullet with automatic/distributed transactions up front, even if you don’t need them immediately.

Clemens asks:

What are you doing with that method signature when you’re porting that app to another data provider? What if a future revision of that method wants to add that customer to the database asynchronously and does so by stuffing it into a transactional queue, first?

I read two parts to this.  First, the transactional queue scenario is as Ingo mentioned above, to which I concede this implementation is for a single RM only.  Second, the oft-pondered question – what if I need to change databases?  Well, there are so many more issues involved in changing a database out from under a real-world application, that I don’t give this a lot of weight.  The way to squeeze the absolute best performance out of your database is to leverage the proprietary extensions available in that database; for example, using stored procedures.  And as soon as I do that, I’ve got some real work to do if I want to switch databases.  Another issue is lock management – if you’re changing, for example, between MS SQL Server (lock-based) and Oracle (version-based, last I checked), you need to at least evaluate your transaction locking to make sure you’re getting the right result with the required isolation.

All that said, however, you could relatively easily change my signature and implementation with something that uses a generic interface (IMyTransaction, IMyConnection, etc.) and be able to switch data providers from under it.
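
Here is roughly what I mean, sketched in Python rather than C# so it runs as-is with nothing but the standard library.  The `MyTransaction` protocol is a hypothetical stand-in for something like IMyTransaction, `SqliteTransaction` is just one possible provider, and the `customers` table is invented for the example.  Note that the `?` placeholder style is itself provider-specific, which is a small taste of the porting issues mentioned above.

```python
import sqlite3
from typing import Protocol


class MyTransaction(Protocol):
    """Hypothetical provider-neutral interface, analogous to IMyTransaction."""

    def execute(self, sql: str, params: tuple = ()) -> None: ...
    def commit(self) -> None: ...
    def rollback(self) -> None: ...


class SqliteTransaction:
    """One concrete provider; others would implement the same three methods."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def execute(self, sql, params=()):
        self._conn.execute(sql, params)

    def commit(self):
        self._conn.commit()

    def rollback(self):
        self._conn.rollback()


def add_customer(tx: MyTransaction, name: str) -> None:
    # The business method takes the transaction explicitly as its first
    # parameter and never commits; the caller owns the transaction boundary.
    tx.execute("INSERT INTO customers (name) VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.commit()

tx = SqliteTransaction(conn)
add_customer(tx, "Ada")
tx.rollback()   # the caller decides the outcome: this insert is undone
add_customer(tx, "Bob")
tx.commit()     # and this one is kept
```

Swapping data providers then means writing another class with the same three methods; `add_customer` itself doesn't change, apart from any provider-specific SQL.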

On Clemens’ comments page, Tomas doesn’t like my method signature and implementation:

I agree this isn’t a great way of doing it. For one thing, it’s way to intrusive on your objects and methods, and hard to extend.

I could change the syntax to make this look a whole lot more like automatic transactions.  A method attribute, combined with a single-call hook in each method (or use the undocumented and soon-to-be-not-working-in-1.1 ContextAttribute and associated interception architecture), combined with implicitly “flowing” a transaction through CallContext, would reduce a lot of the requirements.  The example was kept simple to illustrate a point.  We’re talking about syntactic sugar.

Morten Abrahamsen has a few interesting comments, which I’ve quoted some snippets from here:

Suddenly a new component requires a new execution context (another process ?) and you get a lot of not so fun cross process marshalling to get the SqlTransaction to interact with your component code.

If this scenario were real, I’d agree, and use a distributed transaction in that case.  However, how often (in real life) do I really need a transaction to cross execution context boundaries (we’re still talking about a single RM scenario here)?  In my experience, it’s not all that common.  And as soon as you do, and start stretching arbitrary transactions across process/machine boundaries, you’re prone to “accidentally” holding database locks for quite some time.

The fear of distributed transactions is in my experience often driven by the fear of performance losses. However, today the performance loss isn’t all that big, yet it is there!

If it were all about execution performance, then I’d be all for distributed transactions.  On a typical request, I don’t mind paying a small performance penalty for convenience, and I can add cheap hardware to my business tier to make up for this (to a point, then I’m concerned about performance again).  But using a distributed transaction, and an out-of-process transaction coordinator (such as MSDTC), means there is more communication going on, transactions are lasting longer, and thus database locks are being held longer than necessary.  And when you’re hitting a wall in database throughput, it’s all about releasing locks sooner (assuming your bottleneck isn’t CPU or IO).

On another note, I always find it interesting when people automatically bind the need for EnterpriseServices with distribution transactions.

Well, I think for most applications, transaction management is really the only COM+ service being intentionally used.  Not all applications – but most.

I feel that there is a big difference between a component that accepts a SqlTransaction object as the first method parameter and a “transactional component”. It’s a conceptual difference, a design difference and an implementation difference.

I don’t understand this comment.  A “transactional component” is one that can/must do its work in the context of a transaction, and affect the outcome of said transaction.  How are the SqlTransaction version and the “transactional component” different?

Ingo also points to a great article that mentions:

In the future, Enterprise Services will support the concept of promotable transactions which will begin as a local transaction and will be promoted to a distributed transaction when required.

NOW we’re talking.  That would be awesome.

All in all, don’t get me wrong – I like EnterpriseServices.  I’m certainly not knocking the power of distributed transactions, and the convenience of declarative transaction boundaries.  In an application I’m working on right now, I may actually end up switching to using distributed transactions to be able to support MSMQ/SQL transactions together – and let me tell you, I’m going to get to delete a lot of code if I do this.  But for many applications, which use a single RM, I don’t think using distributed transactions should be the automatic choice; people should evaluate their needs, and intelligently choose a transaction management strategy.  Local transactions are lean, mean, and effective.  Let’s not dismiss them out of hand.
