https://web.archive.org/web/20181004043647/http://www.addsimplicity.com/adding_simplicity_an_engi/2007/02/latency_exists_.html

Adding Simplicity - An Engineering Mantra

"A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away." ~Antoine de Saint-Exupery

Note: the opinions stated here are mine alone and are not those of any past, present, or future employer.

Friday, February 09, 2007

Latency Exists, Cope!

I put this line on a slide recently for a presentation at work. Looking around the room, I could tell that some of the people understood, some were perplexed, and some were annoyed. What does latency have to do with architecture anyway? We're concerned with proper component factoring, interfaces, and a collection of "ilities". How could latency be relevant to any of these?

In any large system, there are a few inescapable facts:

1. A broad customer base will demand reasonably consistent performance across the globe.
2. Business continuity will demand geographic diversity in your deployments.
3. The speed of light isn't going to change.

Given these facts, latency is a critical part of every system architecture. Yet making latency a first-order constraint in the architecture is not that common. The result is systems that become heavily influenced by the distance between deployments and that limit the business's ability to serve its customers effectively and to protect itself against localized disasters.

So how do you design for latency? There are a few strategies you can apply to your architecture that will allow you to deploy your components across diverse geographic locations. Here are the ones that I find particularly important.

Good Decomposition - Highly coupled, monolithic applications are the bane of any distributed architecture.
Allowing components with little functional overlap to be coupled, either in code or during deployment, will pretty much kill any hope of distributing your architecture across a collection of global data centers. Do it badly enough and you will kill any hope of distributing your architecture across two cities in the same state. This sounds obvious, but there are plenty of enterprise-level applications in use today that have forced themselves into data centers on the far edges of the same city as their only business contingency plan.

Asynchronous Interactions - This is more than just using messaging between components. It starts by setting the appropriate expectations on your external interfaces, whether that is SOA or a web page. Companies get tripped up here by exposing an early version of an interface that sets the client's expectation of synchronous, low-latency interactions. As the interface becomes more heavily used, it becomes more and more difficult to change that semantic. If the client expects a synchronous response, the likelihood of leveraging a collection of components with asynchronous interactions becomes low. Start with an expectation of asynchronous behavior and you can more readily add latency as needed to meet your deployment demands.

Monolithic Data - You can decompose your applications into a collection of loosely coupled components, expose your services using asynchronous interfaces, and yet still leave yourself parked in one data center with little hope of escape. You have to tackle your persistence model early in your architecture and require that data can be split along both functional and scale vectors, or you will not be able to distribute your architecture across geographies. I recently read an article that recommended delaying horizontal data spreading until you reach vertical scaling limits. I can think of few pieces of worse advice for an architect. Splitting data is more complex than splitting applications.
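To make the functional-and-scale split concrete, here is a minimal sketch of a routing layer that partitions data first by business function and then by a hash of the key within each function. The store names, the SHA-256 choice, and the `route` helper are illustrative assumptions, not anything from the post:

```python
# Sketch: splitting data along a functional vector (one set of stores per
# business function) and a scale vector (hash-sharding within a function).
# All names here are hypothetical, for illustration only.
import hashlib

FUNCTIONAL_STORES = {
    # functional split: each domain gets its own independent store group
    "orders":   ["orders-shard-0", "orders-shard-1"],
    "profiles": ["profiles-shard-0", "profiles-shard-1", "profiles-shard-2"],
}

def route(function: str, key: str) -> str:
    """Pick a physical store for (function, key)."""
    shards = FUNCTIONAL_STORES[function]
    # Stable hash so the same key always lands on the same shard.
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return shards[digest % len(shards)]
```

Because each function's shards are independent, a shard group can be relocated to another geography without dragging the rest of the schema along, which is the property the post is arguing for.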
But if you don't do it at the beginning, applications will ultimately take shortcuts that rely on a monolithic schema. These dependencies will be extremely difficult to break in the future.

Design for Active/Active - If you do a good job with the preceding recommendations, then you've most likely created an architecture that can serve your customers from all of your locations simultaneously. This is a more efficient and responsive approach than an active/passive pattern, where only one location serves traffic at a time. Utilization of your resources will be higher, and by placing services nearer your customers you are better meeting their needs as well. Additionally, active/active designs handle localized geographic events better, as traffic can simply be rebalanced from the impacted data center to your remaining data centers. Business continuity is improved.

Latency is another example of how whatever you don't take into consideration in your architecture will ultimately undo your design. It is one of the more difficult constraints to design for correctly. As such, it should be given more attention, early in your architectural process. Are there other aspects of this that you think are important? I'd love to hear them.

Posted at 07:42 AM

Comments

Noah Campbell: Can you provide some references on data architecture for performance?
Posted by: Noah Campbell | Friday, February 09, 2007 at 12:54 PM

Geva Perry: Excellent post, Dan. These are very similar principles to the ones we refer to with our Space-Based Architecture. Other ideas for good low-latency design include:
1. Co-location of the tiers (logic, data, messaging, presentation) on the same physical machine (but with a shared-nothing architecture, so that there is minimal communication between machines)
2. Co-location of services on the same machine
3. Maintaining data in memory (caching)
4. Asynchronous communication to a persistent store and across geographical locations
There's more about this here: http://www.gigaspacesblog.com/2006/12/09/sbas-general-principles/
Posted by: Geva Perry | Friday, February 09, 2007 at 01:30 PM

Dossy Shiobara: As I explain it to colleagues, latency is "death by a thousand papercuts" to your poor design. I always warn folks away from premature micro-optimization (shaving microseconds off an operation, etc.), but an effective design needs to treat overall latency as a first-order constraint, as you point out. Because, you know, "the speed of light isn't going to change," and latency translates into effective throughput.
Posted by: Dossy Shiobara | Monday, February 12, 2007 at 07:27 PM

Guy Nirpaz: I find the part on data decomposition the most insightful. One of the killers for this state of mind is the ORM trend on top of relational databases.
Posted by: Guy Nirpaz | Tuesday, February 13, 2007 at 09:19 PM

Bob Carpenter: It's a problem even if you don't have global networking issues. My first real-world programming assignment was to integrate the SpiderMonkey implementation of JavaScript into a cross-platform, multi-threaded, highly memory-constrained speech recognition engine. The JavaScript was being used to define grammars describing what people could say in response to prompts, so any latency was a customer waiting for the system to say something. To make a long story short, the requirements were something like a 50ms average response time and a 200ms maximum latency. The first integration hit a 20ms average response time over the test collection (across 48 threads sharing the JavaScript code), but the spikes in latency were terrible: up to 2000ms. Of course, the culprit was garbage collection. The solution (hat tip to Sol Lerner) was to build a completely new JavaScript engine on every call on every thread (that is, every dialogue turn).
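The per-call isolation pattern Bob describes might look roughly like the following sketch, in Python rather than C++, with a hypothetical Engine class standing in for the real JavaScript engine; nothing here is from his actual implementation:

```python
# Sketch of per-call engine isolation: instead of one long-lived script
# engine whose garbage collector causes latency spikes, each dialogue turn
# builds a fresh engine and discards it, so garbage never accumulates.
# The Engine class below is a hypothetical stand-in for illustration only.

class Engine:
    """Stand-in for a script engine with per-instance state."""

    def __init__(self, grammar: str):
        self.grammar = grammar  # setup cost is paid here, on every call

    def evaluate(self, utterance: str) -> bool:
        # Trivial substring check standing in for real grammar matching.
        return utterance in self.grammar

def handle_turn(grammar: str, utterance: str) -> bool:
    engine = Engine(grammar)    # fresh engine: no garbage from prior turns
    try:
        return engine.evaluate(utterance)
    finally:
        del engine              # teardown; nothing survives the call

print(handle_turn("yes no maybe", "yes"))  # → True
```

The trade, as the comment goes on to note, is paying a fixed setup/teardown cost per call in exchange for eliminating unpredictable collection pauses.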
The setup and teardown cost was about 30ms, which was just fast enough to squeak by under the average requirement and sail under the maximum latency requirement.
Posted by: Bob Carpenter | Thursday, March 22, 2007 at 12:07 PM

The comments to this entry are closed.