The Nonesuch Beast

I keep running into the same problem.

This problem, or maybe it's a class of problems, is hard to talk about because of its inherent complexity. But not being able to articulate the problem well hasn't actually deterred it from coming to bite me repeatedly. So I'll try to talk through it. Bear with me.

Let's start with "the documentation problem". You know, tech docs. Everyone knows it's a problem. Everyone says we need a great documentation system: the all-singing, all-dancing, uber-documentation system that's simple to use but does everything. You know the one I'm talking about — you've wished for it yourself, I'm sure.

Close your eyes for a minute and visualize this uber-documentation system. What properties does it have? Here are a few "obvious" ones that everyone mentions:

1) It should let me create documents. You know, sort of like a blog lets me create documents, except I want templates for all the standard document types we have at Amazon.

2) It should let me upload documents, and store them somehow. I certainly don't want to have to worry about mundane details like persistence, not when I'm busy writing docs for my project.

3) It should let me search for documents easily. It should have a good search engine, you know, like Google. Google is good.

4) Google sucks for browse, though. This UberDoc system should also have a browse tree, and I, for one, should not have to maintain it. Maybe it could automatically create a non-sucky browse tree by letting someone (other than me) attach hierarchical keyword attributes to docs. Or something. I just know how it should work; don't make me design it.

5) It ought to notify people when documents are stale. You know: it should, like, email someone. If the owner of the document doesn't exist anymore, maybe it emails the dev-services org, and they have to go find an owner for the doc. But definitely it should have owners, and staleness notifications.

6) It should have version tracking. And permissions. And a good content-authoring tool so I don't have to learn any technical stuff. I like WYSIWYG. But it should let me use HTML if I want to. Oh, and it should let me have attachments. And let users put in comments about the doc that aren't inline with the doc. Definitely meta-comments. And voting. People like to vote on things. I vote that UberDoc should have a voting system. Oh, and stylesheets. I hate it when they make me use their font. And...

7) It should be really simple to use. You know, kinda like how Wiki is simple. Except better than Wiki. Wiki's simple, but it's basically the internet equivalent of a refrigerator covered in post-it notes. I need something more powerful than that, for my projects. But it has to be simple.

Are you starting to see the outlines of the problem I'm talking about? I heard my boss use the term "irreducible complexity" once, and it had a nice ring to it. I think that's a good partial description of this class of problems.

The seven "obvious" properties I listed above are obvious only in isolation. When you combine them, you run into problems. One problem is that some of the wish-list items are poorly specified at best. If you were to drill into the details of the templates in #1, for example, you'd discover that everyone in the room had different ideas about what the templates should be. That's a rathole I won't touch further in this blog entry — but it's a subject that could be a fully mature rant, peer to this one. (That's what I write, you know — blog rants. Usually after having a few glasses of wine.)

The really nasty problem with the seven requirements above is that they're mutually incompatible. #7 is the offender here: it's incompatible with all of the other requirements. In fact, I'm going to argue, later in this BlogRant, that #7 is incompatible with itself, provided I don't first pass out on the floor.

Fundamental Tradeoffs

Here's how I think of it: it's an unavoidable tradeoff between feature-richness and usability. I'm finding that the general problem of design — whether it's software design, API design, UI design, or interior decorating for that matter — is hard because designs always suffer from fundamental tradeoffs. No matter what choice(s) you make, someone's going to be unhappy.

There's nothing novel about this idea: Computer Science is filled with fundamental tradeoffs like space complexity vs. time complexity, or static vs. dynamic type systems. There are plenty of problems for which you know you're going to have to live with some pros and some cons for any choice you make.

In fact, a design problem wouldn't be a design problem if it didn't have at least one hard choice to make. It would just be a done deal, and nobody would talk about it. The thing that makes design problems hard is that they make you choose the lesser of two evils, or worse, the least among N evils.

Well I'm here to tell you that the Uber Documentation system, as outlined by the Seven Guiding Principles above, is in fact a logical impossibility. Sorry to disappoint you.

To make my case, let's take a look at Wiki. Good Old Wiki, the Post-It™ note of the future. What are the observable properties of Wiki?

1) You can learn how to use it in four and a half minutes, even if you're on a concall with the president, and someone's escaped ferret has just run under your desk. With no distractions, you can learn it in 2 minutes. And believe me, that's all anyone wants to spend.

2) You can find what you're looking for, as long as you know pretty much exactly what you're looking for. Or you can surf and stumble on things randomly. Both are admirably useful in some situations.

3) You can create formatted documents using nothing but a web browser. There's no software download, no user manual, nothing. Just cough up an idea, et voila! It's now available for everyone else to try to find.

4) It's easy to create links to other documents inside the Wiki or out on the web. Some Wikis let you do attachments.

5) Other people can edit your work, for good or for evil.

Regarding #2 (search) — I was talking to a co-worker about this problem, and my co-worker insisted that Wiki search was great. I pretended to be trying to illustrate my point, and said "OK, think of... what was that project we did a few years ago, you know, the one where you could do bulk orders?" I knew darn well I was talking about Institutional Buying. My co-worker knew exactly which project I was talking about, but couldn't remember the name of it. "You mean the one where you could invoice things, right? Yeah, where big buyers could order lots of an item, yeah, what was that called again?"

Well, in Wiki, if you did a title search, you'd come up blank, because you used neither "Institutional" nor "Buying". You might get lucky with a content search, maybe, but more likely you'd just get a bunch of hits on supply chain or German-invoicing projects, or on stuff that's not project docs at all. Given that Wiki has no concept of namespaces, or hierarchies, or even AOL/IMDB-style keywords, there's no way to refine your search, and you might never find the thing you were looking for. [Note: my co-worker was convinced by my trick. Wiki search sucks.]
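The difference between those kinds of search can be sketched in a few lines of Python. Everything here is hypothetical: the three pages, their text, and the keyword tags are invented for illustration, and this is not how any real Wiki is implemented. It only shows why a title search on "bulk orders" comes up blank, why a content search is noisy, and how keywords would let you narrow things down.

```python
# Hypothetical mini-wiki: page titles mapped to body text. All data is invented.
PAGES = {
    "Institutional Buying": "Lets big buyers place large orders and pay by invoice.",
    "German Invoicing": "Invoice formats required for orders shipped in Germany.",
    "Supply Chain Overview": "How bulk shipments move between warehouses.",
}

# Hypothetical keyword tags, the kind of metadata a plain Wiki lacks.
KEYWORDS = {
    "Institutional Buying": {"bulk orders", "invoicing", "project"},
    "German Invoicing": {"invoicing", "project"},
    "Supply Chain Overview": {"bulk orders"},
}


def title_search(query):
    """Return titles containing any word of the query (case-insensitive)."""
    words = query.lower().split()
    return sorted(t for t in PAGES if any(w in t.lower() for w in words))


def content_search(query):
    """Return titles whose body text contains any word of the query."""
    words = query.lower().split()
    return sorted(
        t for t, body in PAGES.items() if any(w in body.lower() for w in words)
    )


def keyword_search(*tags):
    """Return titles tagged with every one of the given keywords."""
    wanted = set(tags)
    return sorted(t for t, kws in KEYWORDS.items() if wanted <= kws)


if __name__ == "__main__":
    print(title_search("bulk orders"))                # no title matches
    print(content_search("bulk orders"))              # every page matches
    print(keyword_search("bulk orders", "project"))   # narrowed to one page
```

With this toy data the title search finds nothing, the content search returns all three pages, and only the keyword search lands on the page you wanted. Of course, somebody still has to attach those keywords.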

Our Wiki really doesn't offer much functionality, but it sure is popular. It's pushing {Amazon confidential} documents now, which is undoubtedly the single largest shared repository of documentation at Amazon, and it's getting a lot of mileage.

Heck, people even use Wiki for making mission-critical sev-1 and sev-2 debugging information available to others, even though they know they're introducing a dependency from our customer-facing, revenue-generating website to a system that could best be described as "sometimes available".

Here's how I feel about those people:

(I have deleted the 10 rewrites of how I feel about them. Let's just leave it as this parenthesized note.)

In any case, it does show that Wiki is a very popular tool, and that its simplicity is appealing enough to make people want to use it for... all sorts of stuff.

The Sweet Spot

Let's look at this from another angle: if you could add anything to Wiki, what would it be? What is the ONE feature that Wiki needs the most? Let's consider performance and availability not to be features for this discussion — assume someone is fixing those things. I want you to think of a genuine feature-request.

What would you add? Browse? Better search? Staleness notifications? Attachments? More formatting primitives? Templates?

In my opinion, there isn't a single feature that Wiki needs desperately — or someone would have added it already. (I'm referring to the open-source versions, not just our version.)

There's a product called Twiki, and we have it at Amazon. It has more features than Wiki, and some people like it. I personally don't know how to use it, because I took one look at it and thought: Gosh, I really ought to take the time to learn Twiki someday.

That's because it was patently obvious that it was going to take me longer than 2 minutes to learn. There are only two things that could induce me to try to learn Twiki:

a) Everyone in the universe starts using it, and it dawns on me that this is going to be like mp3s all over again, where even my 10-year-old cousins knew how to use Napster before I'd so much as heard they were getting shut down. I'd learn Twiki if it were clear that I'm really one of the last people on earth who hasn't learned it yet.

b) I somehow discover, probably via hallway discussion, that Twiki solves a particular problem I'm facing significantly better than any other product I use.

I'm not going to learn Twiki because Wiki is "good enough" for my purposes. Yeah, it sucks, but not bad enough to make me want to switch to something else.

Guess what? This is how we all operate, all the time. It's the principle of Pain Minimization, and it works pretty well, all things considered.

I didn't learn to use blogs until I realized it was a *much* better (i.e., easier) way to get random rants posted on the web than what I'd been doing before, namely, desperately trying to find out what the hell infrastructure had named my home directory (and what boxes it was accessible from) in the year since I'd last put something on my home page. Even then, I didn't learn it until I also realized I was one of the last people on earth not using blogs.

A "sweet spot" is (to my knowledge) the point on the face of a golf club that delivers the maximum force, with minimal wasted energy. It's really nice to hit the ball on the sweet spot, since it hooks or slices much further than usual. Since business people like sports metaphors, it's come to mean something like "juuuuust right", sort of like the third bear's bowl of porridge.

Getting a big population to adopt some new technology like TiVo or CDs or MP3s or Java or any other fad requires that it be in the sweet spot — it has to sit right at the nexus of being easy to learn, easy to use, easy to describe to others, and full of cool new features.

This doesn't happen very often. Why? Because the first requirement, "easy to learn", has to be true for people who are really busy, really dumb, or often both (like me).

Wiki is juuuuuuust right. Wiki is sitting in a sweet spot. If you add too much to it (maybe anything), then it might tip just enough that new users look at it and think: Gosh, I should really learn that someday.

But we're talking about mass-market adoption here (where "mass market" could mean "all Amazon employees" or "all US citizens" — a majority of any community, basically). What about specialty groups? Wiki doesn't help you at all if you're a PM trying to manage some huge project like Institutional Buying. Heck, people can't even find your docs in Wiki, unless they can remember the project name.

What do the PMs do, then? Wiki's clearly not the solution.

Well, PMs need one kind of system. Engineers need another kind of system. Executives need still another, and admins yet another. Case in point: some Executive Assistants here have asked me if it would be possible to put up a simple intranet content-management system; they just want to be able to create and update team pages, essentially. I can envision what they want, and I can assure you that it wouldn't make the PM crowd happy, nor would a tech-spec repository help the EAs much.

Hopefully I've convinced you by now that the UberDoc system doesn't exist. Even if it could exist, I'm not sure who would build it. Certainly not Developer Services, or we'd design it for developers. If there were a PM Services group (and maybe there should be!), they'd design it for PMs. And an Everyone Services group seems like a bad idea.

So it can't exist, and it can't be built. Disappointed? I was. I know how it feels to want the UberDoc system, since I write a lot of documentation. I want a content-creation system, and a content-management system, and good search, browse and retrieval facilities. I want attachments instead of links. I want better formatting capabilities without needing to actually commit the syntax to memory. I want it all, just like you.

But we aren't gonna get it. It's a pink elephant, a Nonesuch Beast.

But Wait — My Vendor Sales Rep Says Otherwise!

Ah. Yes, those sales reps. Veeeeeery slippery. I bet you're thinking of half a dozen products, none of which you can remember the names of right this second, that seem like they ought to offer the Seven Obvious Features that I started this blog entry with.

Well, in my nearly six years at Amazon, we've been in a perpetual state of "evaluating a vendor's documentation product". But we're not using one, not universally. The ones that we have, e.g. iBase, are pretty much reviled universally, so that sort of counts. It's like Stroustrup said: there are two kinds of programming languages: the ones nobody uses, and the ones everybody bitches about. What a charming guy, that Bjarne is.

But let's consider a different problem space: one that suffers from the exact same "irreducible complexity" problem. It's a system that everyone asks me for, and everyone says they can visualize, but that nobody wants to sit down and spec, because they'd prefer to leave it at the hand-waving stage: metrics.

Let me start with the obvious: everyone hates metrics. They hate creating metrics, that is; nobody minds looking at them if they're ready-made and nicely formatted.

People don't like instrumenting things because they're not being rewarded for it. They're ostensibly being rewarded for (and measured on) improving the things that they're measuring, not doing the measurement itself. So doing metrics is just an ugly necessity that needs to be addressed before you can start analyzing your problem domain and looking for ways to improve it.

But metrics seems like it should be soooooo eeeeeeeeasy. I know what metrics look like: they're these simple charts and graphs; I see them all the time. And we always seem to be doing the same kinds of metrics: performance, financials, availability, work-item backlogs. Anyone who chooses to think about the problem for 2 minutes will inevitably conclude that there ought to be an UberMetrics system that meets everyone's needs, or failing that, at least their own needs, so they don't have to roll their own.

The "requirements" I hear bandied about include wishes such as:

— I should be able to get data inputs that other folks are measuring, trivially, and include them in my own functions. People are measuring things all over the company, and those measurements ought to be available to me.

— I should be able to use widgets (e.g. graphers, control charts, pivot tables) that other people have built on my data sets, since my data is just data. Because these widgets are ubiquitous, I should not need to learn how to use them.

— My persistence requirements are simple: I just need to be able to store arbitrary quantities of data for all eternity, and query them trivially in real-time with a point-and-click interface, with no detectable performance issues. How hard can that be? It was easy to think, so it must be easy to do.

— I should be able to click on things. If I see a graph with an anomaly, I should be able to click on it and see the underlying data that produced the graph. I should be able to apply this rule recursively, even if I don't know what recursion is, until I can see the very ants crawling into the boxes on the loading dock.

— I should be able to do simple what-if scenarios, in which I say things like: "what if this sales figure had been twice as high a year ago?" I should be able to assemble these scenarios by using my mouse, or failing that, stating my queries in plain English.

— I shouldn't have to apply any developer resources to this problem. I may be able to spare a perl scripter or web-dev for a few weeks, but my developers are all busy implementing features. I cannot spare any time for metrics, nor should I need to, since they should be simple to assemble via a unified graphical interface.

Yes, Amazon employees have told me these things with straight faces; in fact some of them were actually yelling at me.

If you took the right (small) subset of the features above, you'd have the Data Warehouse. Another very small subset would get you Microsoft Excel. Still another view might get you the current Pizza Portal. But I can assure you that nothing that meets all the requirements above exists in the world today, and I'm not about to try to build it.

Lest this blog entry go on forever, I'll stop trying to use Proof by Exhaustion (i.e., by going into the complex details until you and I both die of exhaustion), although that effort would certainly succeed.

Instead, I'll use some simple case studies and existence "proofs", and we'll leave it at that for today.

Case Study #1: Actuate. We happen to have a license for a product called Actuate that purports to be the all-singing, all-dancing UberMetrics system. Here are some of its features:

— it can be instructed (via a custom scripting language) to pull data from multiple, heterogeneous sources; e.g. flatfiles, databases, Excel files, etc.

— it can aggregate data samples into an intermediate database that can then be used for metrics/reporting.

— it generates fancy, web-based, click-through reports, similar to those produced by, say, Crystal Reports (another large, expensive 3rd-party metrics package.)

— it can be configured to run automated reports, or to do manual one-off reports, or even one-off queries.

— it has an Excel-like spreadsheet/graphing application, web-based, written in Java, that you can pull up to do ad-hoc slicing and dicing of your data.

I'm betting that MANY OF YOU READERS nearly broke your noses just now, because you dozed so hard your face almost made contact with your desk. I used all-caps above to help guide your eye past the boring parts. Forgive me.

Actuate is basically the closest thing in existence to what everyone is envisioning, or at least it was 3 years ago when we (ERP) evaluated it and purchased it.

It's basically a dead initiative. Nobody's using it. Everyone who has tried to use it, those still left at Amazon anyway, says it's absolutely awful, and they laugh heartily when you ask for pointers to docs. It's not that we don't have docs; they were just thinking of something really funny that they don't want to share with you right now.

Why did Actuate fail? Can it be revived? I've spent a little time digging into it — not much, mind you — and it appears that it failed for a combination of reasons:

a) it required too much programming sophistication to make it do the things we wanted it to do. We were unwilling to put senior developers on something as trivial as metrics, and less experienced developers simply found it obtuse.

b) it was so finicky and complex that we needed quite a bit of support from the vendor, who, after we had pretty much negotiated them down to a shiny nickel, didn't really want to talk to us anymore.

c) it was (and is) closed-source, and the docs weren't all the vendor claimed they'd be (which doesn't surprise me, given that you can't actually read any of the whitepapers on the website unless you purchase some sort of license), so we really had no way of digging in and figuring out how to use it.

The three reasons above are (slightly) contradictory and incompatible, but again, I didn't dig into it that much. Maybe it can be made to work. But it certainly looks as if it's a product that offers a great deal of flexibility, but that flexibility comes at the cost of a painful learning curve and a lot of sophistication in order to use it properly.

Case Study #2: Amazon's Data Warehouse. This is a work in progress, and I don't want to bash on it, since it does its job well for a certain segment of our biz users. But the DSS UI (which we invested a lot of time in building) wasn't even *close* to the hand-wavy UI magic that people have been asking me to build, and it doesn't (and can't) hide the complexity of our data model. You have to be something of an expert in order to use the Data Warehouse. This isn't a bad thing, but it does show that our most successful metrics project to date still doesn't come close to meeting the needs of the hand-wavers.

Case Study #3: Microsoft Excel. If you've used Excel much, you'll see that it's a pretty good tool. In fact it's quite powerful — not surprising, really, given that Microsoft has invested hundreds of person-years and hundreds of millions of dollars (and tens of millions of lines of code) into designing and implementing it. You'd hope that it would be a pretty good tool.

But Excel doesn't come even close to meeting the UberMetrics requirements; for starters, it doesn't even have a database. Even if you simply chose to make it the frontend, and assumed that some suitable persistent backend were built to interoperate with Excel, you'd still find that Excel isn't what people are all wishing for. Why? Because it's hard to learn, and hard to use properly. You can't just click and drag and expect to have a beautiful chart. Heck, I have a thick O'Reilly book on Excel, and I've seen plenty of other books on it too. You have to commit a tremendous amount of time in order to develop proficiency with it.

So... how about that UberMetrics system? It should be clear by now that you can't even talk about such a system without occupying hundreds of pages of written material. I've barely scratched the surface here. Metrics systems, much like documentation systems, have a fundamental tradeoff: you can have a complex system with lots of features, or you can have a simple system with few features. But not both.

What's the Solution, Steve?

Like I said: I keep running into this problem.

Part of the solution is education (hence me taking the time to write this blog entry.) People need to understand when they're asking for something that's going to be too complex to learn and use.

Part of the solution is to keep trying, or at least keep looking. The most popular technologies are simple ones, but using them, you can sometimes build relatively complex systems. This blog is a simple(-ish) technology — at least, simple to use. (MovableType's source code is anything but trivial; even installing it takes half a day.) But the blog is sitting in an Apache-based system, on a Linux-based system, and it's using TCP networks, and browsers, and operating systems, and graphical displays, and other "simple" (to use) technologies to get my words to you.

There are some simple-to-use building blocks that can be glued together to form reasonable metrics systems. Relational databases and SQL are one such building block. Charting tools like Excel (and charting widgets like JFreeChart or GD::Graph) are another building block. If we keep our eyes open, we may find (or build) a better set of building blocks that are individually simple to use, but allow our teams to produce their metrics more efficiently than they can today. One can always hope. Who knows — maybe someone will figure out a way to use MRT, or the Idea Tool, to help with their metrics generation.
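To make the building-block idea a little more concrete, here is a minimal sketch in Python. It assumes an invented table of customer contacts with made-up root causes; none of the names or numbers come from a real system. An in-memory SQLite database stands in for whatever relational database you happen to have, and a crude text bar chart stands in for a real charting widget.

```python
import sqlite3

# Hypothetical sample data: (contact_id, root_cause). Invented for illustration.
CONTACTS = [
    (1, "late delivery"),
    (2, "late delivery"),
    (3, "damaged item"),
    (4, "late delivery"),
    (5, "billing question"),
    (6, "damaged item"),
]


def top_root_causes(rows, limit=3):
    """Load rows into an in-memory SQLite table and count contacts per cause."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE contacts (id INTEGER PRIMARY KEY, root_cause TEXT)"
        )
        conn.executemany("INSERT INTO contacts VALUES (?, ?)", rows)
        query = (
            "SELECT root_cause, COUNT(*) AS n FROM contacts "
            "GROUP BY root_cause ORDER BY n DESC, root_cause ASC LIMIT ?"
        )
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()


def text_chart(counts):
    """Render (label, count) pairs as a text bar chart, one line per label."""
    width = max(len(label) for label, _ in counts)
    return "\n".join(f"{label.ljust(width)} {'#' * n}" for label, n in counts)


if __name__ == "__main__":
    print(text_chart(top_root_causes(CONTACTS)))
```

Two small functions, one query, one ugly chart. It is nowhere near an UberMetrics system, and that is the point: each block is simple on its own, and gluing them together gets you just enough to see which cause to go after first.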

Part of the solution is for teams to just buckle down and do the bare minimum. Every team needs to produce some documentation, and some metrics. Managers and directors have to realize that this is part of the package; when you assign a bunch of SDEs and PMs to a task, some of their time is going to be "overhead" spent on docs and measurements.

But they don't need to build whiz-bang, fully-generalized, reusable metrics systems. That's not what my contact-reduction team did in 2003 — we knew we had to reduce customer contacts, so we scraped out just enough metrics to be able to identify the major root causes, and then we went to work on eliminating those causes. My boss would have liked for us to produce a generalized, self-sustaining metrics system that forever after allowed people to drill down on customer contacts with a point-and-click interface. I would have liked that as well. But it would have taken twenty staff-years, and it would have ultimately been unsatisfying to the users, who would still have had to learn how to use it.

Irreducible complexity. That nine-syllable phrase is, itself, irreducibly complex. It's over half a haiku. It's hard to talk about systems like this, and it's easy to wish that they were easy. Just keep in mind that unless (like Wiki) you can describe its entire operation in a few sentences, it's probably more complicated than you're giving it credit for. And it's not going to be in that sweet spot, so you're not likely to adopt it, even if someone does try to build it for you.

And don't file sev-2 tickets on Wiki!

(Published Aug 18th 2004)
