https://www.usenix.org/publications/loginonline/understanding-software-dynamics

Understanding Software Dynamics

by Dick Sites
March 3, 2022
Book review
Authors: Rik Farrow
Article shepherded by: Rik Farrow

I started reading this book in December, and I am still reading it as
of March 2022. I needed that much time because there is a lot to
digest in Sites' book. I've also enjoyed reading it and, as with other
books I enjoy, I often put it down after finishing a section that I
want to spend more time thinking about.

While you might think that a book with this title would only be
important to programmers, its audience should be a lot wider. SREs,
operating system designers, real-time systems designers, and hardware
designers will all find much useful information in this book. The
author's focus is on uncovering the subtle causes of long-tail
latencies, but there is much more to learn here.

The book is divided into four parts. The first two parts, more than
half the book, explain measurement and observation, the tools and
techniques needed to understand the design of KUTrace, and also
provide great advice for SREs and programmers. In the first seven
chapters, Sites demonstrates the importance of measuring the four
major components of computer systems: CPU, memory, disks/SSDs, and
network. He includes Jeff Dean's famous chart depicting the
approximate time for completing various system activities, such as
reading from L1 cache, reading from main memory on a cache miss, or
reading from a disk. Sites adds a column giving the order of magnitude
of each time.
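
To illustrate what an order-of-magnitude column adds, here is a
minimal Python sketch. The three latencies are rough figures from
commonly circulated versions of Dean's chart, not values copied from
the book, and rounding on a log scale is my assumption about how one
might derive the column.

```python
import math

# Approximate latencies in nanoseconds, from commonly circulated
# versions of Jeff Dean's chart. Illustrative only; the table in the
# book may list different values.
APPROX_LATENCY_NS = {
    "L1 cache reference": 0.5,
    "main memory reference": 100,
    "disk seek": 10_000_000,
}


def order_of_magnitude(ns):
    """Return the exponent of the power of ten nearest to ns on a log scale."""
    return round(math.log10(ns))


for name, ns in APPROX_LATENCY_NS.items():
    print(f"{name:24s} {ns:>14,.1f} ns   ~10^{order_of_magnitude(ns)} ns")
```

Seeing the exponents side by side makes it plain that a disk seek
sits about five orders of magnitude above a main memory reference,
which is the kind of estimate the early chapters ask readers to make.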

Sites strongly encourages readers to estimate how long their systems
should take to complete transactions. He starts with simple
arithmetic program examples, running long loops so that operations
requiring nanoseconds take long enough to measure easily, then points
out where compiler optimizations will completely wipe out loops that
do nothing further with their variables. In chapter five, he starts
with a matrix operation, one that is memory bound, and shows how
interference with how cache lines get chosen slows down the matrix
multiplication. In the end, Sites reorganizes how the data is
accessed, improving performance by an order of magnitude. He then
challenges readers, as an exercise, to improve his sample program and
squeeze out another 20% performance gain.
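
The loop-timing idea from the start of this section can be sketched
in a few lines of Python. This is my own illustration, not code from
the book: the function name and iteration count are made up, and the
figure it prints includes interpreter overhead rather than the cost
of a single add instruction. CPython will not delete the loop, but
returning the checksum mirrors the habit that matters in compiled
languages, where a loop whose result is never used may be optimized
away.

```python
import time


def time_loop(iterations):
    """Run a simple add loop; return (nanoseconds per iteration, checksum)."""
    total = 0
    start = time.perf_counter_ns()
    for i in range(iterations):
        total += i  # the work being measured
    elapsed_ns = time.perf_counter_ns() - start
    # Returning the checksum means the loop's result is actually used.
    return elapsed_ns / iterations, total


ns_per_iter, checksum = time_loop(1_000_000)
print(f"{ns_per_iter:.1f} ns per iteration (checksum {checksum})")
```

Running a million iterations spreads the timer's own cost and
resolution over enough work that the per-iteration figure becomes
meaningful.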

In Part two, which covers observation, Sites provides clear
information about how to log, collect, and display information.
Following his theme of measurement, he points out how best to log data
so as to avoid slowing down the very systems you need to observe. The
focus is on being able to instrument systems in production, where the
maximum acceptable slowdown is 1% or less. Meeting that limit depends
on how data is collected, how often, and how it is stored, always with
a strict focus on usability and efficiency. He goes as far as
describing how data is best used in dashboards.
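
A quick back-of-the-envelope calculation shows what a 1% budget
implies. The sketch below is mine, not from the book, and the 50 ns
cost per logged event is a hypothetical number chosen only to make
the arithmetic concrete. It treats the budget as a fraction of one
CPU core's time.

```python
NS_PER_SECOND = 1_000_000_000


def overhead_fraction(cost_ns_per_event, events_per_second):
    """Fraction of one core's time spent recording events."""
    return cost_ns_per_event * events_per_second / NS_PER_SECOND


def max_events_per_second(cost_ns_per_event, budget=0.01):
    """Highest event rate that stays within the overhead budget."""
    return budget * NS_PER_SECOND / cost_ns_per_event


# Hypothetical cost: 50 ns to record one event.
print(f"{max_events_per_second(50):,.0f} events/s fit in a 1% budget")
print(f"{overhead_fraction(50, 1_000_000):.1%} overhead at 1M events/s")
```

Under these assumptions, a cheap 50 ns event still blows the budget
at a million events per second, which is why how often you collect
matters as much as how you collect.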

While this might seem to have strayed far from programming, Sites
points out that you need to be able to measure accurately the systems
you are observing. Monitoring that distorts what you are measuring is
useless for finding the sources of long-tail latency. Profiling, for
example, can show you where your code executes most often based on
timer interrupts, yet miss the rare occasions that cause the very
long-tail latency you are striving to uncover.
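
A small synthetic example, entirely my own and not taken from the
book, shows why aggregate views hide the tail. It assumes 10,000
requests, of which 10 are slow, and compares the summary statistics
with the worst case.

```python
import statistics

# Synthetic latencies in milliseconds: 9,990 fast requests and 10 slow
# ones. These numbers are invented for illustration.
latencies_ms = [1] * 9_990 + [50] * 10

total_ms = sum(latencies_ms)
mean_ms = total_ms / len(latencies_ms)
median_ms = statistics.median(latencies_ms)
worst_ms = max(latencies_ms)
# Share of all elapsed time that falls inside the slow requests.
slow_share = sum(x for x in latencies_ms if x > 10) / total_ms

print(f"median {median_ms} ms, mean {mean_ms:.3f} ms, worst {worst_ms} ms")
print(f"slow requests account for {slow_share:.1%} of total time")
```

With these numbers the mean barely moves from the median, and a tool
that samples in proportion to elapsed time would place fewer than 5%
of its samples in the slow requests, even though those are exactly
the ones you are hunting.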

Part three describes the design of KUTrace, short for kernel-user
trace. KUTrace is a complete toolchain that includes kernel patches, a
kernel module, and tools for converting the millions of data points
into comprehensible figures. The postprocessing tools take the trace
output, insert log observations, correct for time skews between
systems, convert the result into JSON, and create an SVG and HTML page
that displays the data in useful form in a browser.

Part four provides examples of using KUTrace. You can get a feel for
these chapters by reading Sites' June 2020 ;login: article that
incorporates examples from the book
(https://www.usenix.org/system/files/login/articles/login_summer20_05_sit...).
This final section, entitled Reasoning, covers execution, slow
instruction execution, waiting for CPU, memory, disk, network,
software locks, queues, and timers. As elsewhere in the book, I found
words of wisdom here that anyone interested in improving the
performance of software services can learn from. When considering
whether to run multiple instances of the same program, Sites writes:

Mixing programs that run well against themselves likely will
encounter little if any interference.

That was in a chapter examining interference between compute-bound
programs. It appears obvious, but so do a lot of things you can read
in this book. And they aren't really obvious until they have been
explained.

The HTML files created using the toolchain contain an incredible
amount of information. The diagrams use symbols, text labels, 256
colors, and even Morse code, and it takes practice to make sense out
of what you are seeing. The illustrations in both the print and
electronic versions of the book are in color, but I sometimes needed
to magnify figures to see details I was missing. For example,
small, pointed triangles indicating the IPC at different points
overlapped so closely that I couldn't make them out without
magnification. Younger eyes may not have any trouble with the
illustrations. When working with the HTML files in a browser, you can
zoom in, mark areas of interest, and select other ways to display the
data.

In the Preface, Sites mentions that he got many helpful suggestions
while teaching graduate-level courses after retiring from Google.
Unpacking this a bit, you can imagine that this is a book for
graduate students and advanced, professional programmers, written by
an older man who worked at Google for many years. I think that any
senior CS student or professional can benefit from reading this book.
While all the material in the first half of the book leads up to the
use of KUTrace, the first two parts are worth reading on their own by
anyone who wants to better understand the systems they are building
and using.

Understanding Software Dynamics

by Richard L. Sites

Addison-Wesley, 2022, 465 pages

ISBN-13: 978-0-13-758973-9

Article Categories: 
Operating Systems
Programming
Hardware
Last updated February 8, 2023
Authors: 

Rik Farrow has been a consultant for 40 years. He has written two
books, as well as worked as the technical editor for a UNIX magazine
and for two editions of a popular operating system book. He also
taught UNIX system administration and Internet security
internationally during the 1990s, and worked as a volunteer for USENIX
program and steering committees. Rik has been the editor of ;login:
since 2005.

rik@rikfarrow.com
