title: command center: Simplicity
url: https://commandcenter.blogspot.com/2023/12/simplicity.html
hash_url: 6b26bff7f4772cf8fb78878ff4f9594f
archive_date: 2024-02-18
og_image: https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQvTZvSIAYokvx00Scs3gCg8rR2X74iruzto6wDM1Tth-ZeUPrEH01XM3NPwhLa62ga6pQlMMMzKXrsh3rP_BzOSL4eahGtRZw1LaESnDhmHFmot1OcEqsKsMr84KED9HO4m2F7VJcksZrE-U7WtacxMmXeRTeaveAIEvmHwvs36TOfSnbizPzzw/w400-h295/golang.org-2009.png
description: In May 2009, Google hosted an internal "Design Wizardry" panel, with talks by Jeff Dean, Mike Burrows, Paul Haahr, Alfred Spector, Bill Cou...
favicon: https://commandcenter.blogspot.com/favicon.ico
language: en_US
<p><span>In May 2009, Google hosted an internal "Design Wizardry" panel, with talks by Jeff Dean, </span><span>Mike Burrows, Paul Haahr, Alfred Spector, Bill Coughran, and myself.</span><span><span> Here is a lightly edited transcript of my talk. Some of the details have aged out, but the themes live on, now perhaps more than ever.</span></span></p><p><span>---</span></p><p class="p1"><span class="s1"><span><br></span></span></p><p class="p1"><span class="s1"><span>Simplicity is better than complexity.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Simpler things are easier to understand, easier to build, easier to debug, and easier to maintain. Easier to understand is the most important, because it leads to the others. </span></span><span>Look at the web page for google.com. One text box. Type your query, get useful results. That's brilliantly simple design and a major reason for Google's success. Earlier search engines had much more complicated interfaces. Today they have either mimicked ours, or feel really hard to use.</span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>That's google.com. But what about what's behind it? What about GWS? How do you invoke it? I looked at the argument list of a running GWS (<a href="https://en.wikipedia.org/wiki/Google_Web_Server" target="_blank">Google Web Server</a>) instance. XX,XXX characters of configuration flags. XXX arguments. A few name backend machines. Some configure backends. Some enable or disable properties. Most of them are probably correct. I guarantee some of them are wrong or at least obsolete.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>So, here's my question: How can the company that designed google.com be the same company that designed GWS? The answer is that GWS configuration structure was not really designed. 
It grew organically. Organic growth is not simple; it generates fantastic complexity. Each piece, each change may be simple, but put together the complexity becomes overwhelming.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Complexity is multiplicative. In a system, like Google, that is assembled from components, every time you make one part more complex, some of the added complexity is reflected in the other components. It's complexity runaway.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>It's also endemic.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Many years ago, Tom Cargill took a year off from Bell Labs Research to work in development. He joined a group where every subsystem's code was printed in a separate binder and stored on a shelf in each office. Tom discovered that one of those subsystems was almost completely redundant; most of its services were implemented elsewhere. So he spent a few months making it completely redundant. He deleted 15,000 lines of code. When he was done, he removed an entire binder from everybody's shelf. He reduced the complexity of the system. Less code, less to test, less to maintain. His coworkers loved it.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>But there was a catch. During his performance review, he learned that management had a metric for productivity: lines of code. Tom had negative productivity. In fact, because he was so successful, his entire group had negative productivity. He returned to Research with his tail between his legs.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>And he learned his lesson: complexity is endemic. 
Simplicity is not rewarded.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>You can laugh at that story. We don't do performance review based on lines of code.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>But we're actually not far off. Who ever got promoted for deleting Google code? We revel in the code we have. It's huge and complex. New hires struggle to grasp it and we spend enormous resources training and mentoring them so they can cope. We pride ourselves in being able to understand it and in the freedom to change it.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Google is a democracy; the code is there for all to see, to modify, to improve, to add to. But every time you add something, you add complexity. Add a new library, you add complexity. Add a new storage wrapper, you add complexity. Add an option to a subsystem, you complicate the configuration. And when you complicate something central, such as a networking library, you complicate everything.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Complexity just happens and its costs are literally exponential.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>On the other hand, simplicity takes work—but it's all up front. Simplicity is very hard to design, but it's easier to build and much easier to maintain. By avoiding complexity, simplicity's benefits are exponential.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Pardon the solipsism but look at the query logging system. 
It's far from perfect but it was designed to be—and still is—the only system at Google that solves the particular, central problem it was designed to solve. Because it is the only one, it guarantees stability, security, uniformity of use, and all the economies of scale. There is no way Google would be where it is today if every team rolled out its own logging infrastructure.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>But the lesson didn't spread. Teams are constantly proposing new storage systems, new workflow managers, new libraries, new infrastructure.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>All that duplication and proliferation is far too complex and it is killing us because the complexity is slowing us down.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>We have a number of engineering principles at Google. Make code readable. Make things testable. Don't piss off the SREs. Make things fast.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Simplicity has never been on that list. But here's the thing: Simplicity is more important than any of them. Simpler designs are more readable. Simpler code is easier to test. Simpler systems are easier to explain to the SREs, and easier to fix when they fail.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Plus, simpler systems run faster.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Notice I said systems there, not code. Sometimes—not always—to make code fast you need to complicate it; that can be unavoidable. 
But complex systems are NEVER fast—they have more pieces and their interactions are too poorly understood to make them fast. Complexity generates inefficiency.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Simplicity is even more important than performance. Because of the multiplicative effects of complexity, getting 2% performance improvement by adding 2% complexity—or 1% or maybe even .1%—isn't worth it.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>But hold on! What about our Utilization Code Red?</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>We don't have utilization problems because our systems are too slow. We have utilization problems because our systems are too complex. We don't understand how they perform, individually or together. We don't know how to characterize their interactions.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>The app writers don't fully understand the infrastructure.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>The infrastructure writers don't fully understand the networks.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Or the apps for that matter. And so on and so on.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>To compensate, everyone overprovisions and adds zillions of configuration options and adjustments. 
That makes everything even harder to understand.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Products manage to launch only by building walls around their products to isolate them from the complexity—which just adds more complexity.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>It's a vicious cycle.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>So think hard about what you're working on. Can it be simpler? Do you really need that feature? Can you make something better by simplifying, deleting, combining, or sharing? Sit down with the groups you depend on and understand how you can combine forces with them to design a simpler, shared architecture that doesn't involve defending against each other.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>Learn about the systems that already exist, and build on them rather than around them. If an existing system doesn't do what you want, maybe the problem is in the design of your system, not that one.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>If you do build a new component, make sure it's of general utility. Don't build infrastructure that solves only the problems of your own team.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>It's easy to build complexity. In the rush to launch, it's quicker and easier to code than to redesign. But the costs accumulate and you lose in the long run.</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>The code repository contains 50% more lines of code than it did a year ago. Where will we be in another year? 
In 5 years?</span></span></p><p class="p2"><span><span class="s1"></span><br></span></p><p class="p1"><span class="s1"><span>If we don't bring the complexity under control, one day it won't be a Utilization Code Red. Things will get so complex, so slow, they'll just grind to a halt. That's called a Code Black.</span></span></p>