Wednesday, August 18, 2010

The two faces of the mainframe: different economics for legacy and new workloads

Until I learned about the mainframe business, I always thought that if I have a computer program (whether purchased or homemade), the cost of running it on a given machine depends exclusively on how resource-hungry the software is: how much memory it allocates, how much CPU power it consumes, how much it writes on storage media, and so forth. But I never thought that age could make a huge difference if all other factors are equal.

Then I found out that IBM's mainframe business is kind of schizophrenic. There isn't just one mainframe price matrix based on performance. There are two sets of rules: one for old ("legacy") workloads and another for the new workloads. The price points for identical levels of performance are worlds apart.

What's even more paradoxical is that contrary to conventional IT wisdom, "old" costs far more than "new". I can see why that would be the case for vintage cars, art or wine. But it flies in the face of all I thought I knew about IT economics. The explanation: negative forces are in play and override healthy market dynamics.

For several decades, the mainframe was the only platform for heavy duty computing

When most of the world economy became computerized, starting in the 1950s, the only platform capable of crunching large amounts of business data was the mainframe.

For a few decades there were smaller players competing with IBM (which was only made possible by antitrust intervention), offering plug-compatible mainframes. So it wasn't always just a single-vendor platform like today's mainframe business, or the iPhone. But for the important first several decades of the age of computing, the mainframe category as a whole was the only game in town if you meant big business (or government).

As a result, most of the world's governments, banks, insurance companies and reservation systems developed (internally or with the help of subcontractors) custom software on the mainframe for the most critical parts of their IT operations.

Computerization progressed, so they added ever more functionality to existing programs and created ever more new ones. Circumstances (such as legal parameters or business models) changed, so they kept those custom programs up to date. They had to.

Much later, the PC revolution set in and, in accordance with Moore's Law, Intel CPUs became ever more powerful and narrowed the gap. IBM just presented its new zEnterprise mainframe, breaking through the 5 gigahertz barrier for the first time in mainframe history. Today's Intel CPUs reach about 3.5 GHz. Still a difference, but not a huge one.

With programming techniques that enable a whole "farm" of servers to work together and share the load, distributed computing became an intriguing, cost-efficient alternative to the mainframe. New businesses such as Google and Facebook started and scaled out impressively on that technological foundation.

But all the existing operations, most of which existed long before Google's and Facebook's founders were even born, didn't have the luxury of starting from scratch. They had to keep going and going, updating and updating, from one incremental, evolutionary cycle to the next. They were -- and still are -- chained to their mainframes.

The collective value of all mainframe legacy code: $5,000,000,000,000

Theoretically, migration is an option. Any existing piece of software can be ported to another platform. But in practical terms this is difficult for an organization's most mission-critical IT infrastructure, and it's economically a tough choice if the existing program code represents an extremely expensive business asset.

IBM itself estimates that the total value of applications residing on today's IBM mainframe systems amounts to approximately $5 trillion. $5,000,000,000,000. That's about the annual GDP of Japan. In most parts of the world, $500,000 will buy you a home, and for $5 trillion you could buy 10 million such homes.

The number is mind-boggling but realistic. I mentioned in my previous posting on mainframe economics that there are about 200-300 billion lines of mainframe program code still in use. If you multiply the midpoint of that range (250 billion lines) by a development cost per line in the $20 range, you arrive at $5 trillion. Those mission-critical business applications are expensive development projects that require a lot of planning and testing, which limits the output of programmers. The average cost per line of dynamic web page code in PHP or Visual Basic is presumably much lower.
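As a quick sanity check on that estimate, here is a minimal Python sketch of the multiplication. The line counts and the $20 figure come from the paragraph above; the function and variable names are made up for illustration, and the flat cost per line is a rough assumption rather than a measured value.

```python
# Figures quoted in the post; the names below are illustrative only.
LINES_LOW = 200 * 10**9    # lower bound: 200 billion lines still in use
LINES_HIGH = 300 * 10**9   # upper bound: 300 billion lines still in use
COST_PER_LINE = 20         # assumed development cost per line, in US dollars


def code_value(lines: int, cost_per_line: int) -> int:
    """Return the estimated development value of a code base in US dollars."""
    return lines * cost_per_line


low = code_value(LINES_LOW, COST_PER_LINE)
high = code_value(LINES_HIGH, COST_PER_LINE)
midpoint = code_value((LINES_LOW + LINES_HIGH) // 2, COST_PER_LINE)

print(f"low:      ${low:,}")
print(f"midpoint: ${midpoint:,}")
print(f"high:     ${high:,}")
```

The range works out to $4 trillion to $6 trillion, with IBM's $5 trillion figure sitting exactly at the midpoint.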

Another consideration that makes the number realistic: the collective revenues of all mainframe customers are unbelievably massive. Each of the top 585 corporations that use mainframes generates, on average, annual revenues of $36 billion. Multiplying the two numbers results in collective revenues of $21 trillion. That is substantially more than the GDP of the European Union ($16.4 trillion) or the United States ($14.3 trillion). And that's just, roughly, the top 10% of mainframe customers and only one year's revenues. So it's not too hard to imagine that over the years all of them created mainframe code worth $5 trillion.
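The same kind of back-of-the-envelope check works for the revenue comparison. This sketch only uses the numbers quoted in the paragraph above (585 corporations, $36 billion average revenue, and the two GDP figures); the variable names are illustrative.

```python
# All inputs are the figures quoted in the post, in US dollars.
TOP_CUSTOMERS = 585                 # top corporations that use mainframes
AVG_REVENUE = 36 * 10**9            # average annual revenue per corporation
EU_GDP = 16_400 * 10**9             # GDP of the European Union as quoted
US_GDP = 14_300 * 10**9             # GDP of the United States as quoted

collective_revenue = TOP_CUSTOMERS * AVG_REVENUE

print(f"collective revenue: ${collective_revenue:,}")
print(f"exceeds EU GDP: {collective_revenue > EU_GDP}")
print(f"exceeds US GDP: {collective_revenue > US_GDP}")
```

The product comes to $21.06 trillion, which the post rounds to $21 trillion.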

IBM's monopoly: the only platform capable of executing legacy mainframe code

The staggering numbers I just mentioned show what an enormous leverage IBM has as the only player in the market to offer platforms (in terms of hardware and operating systems) on which that legacy code can be executed.

In mainframe lingo, a distinction is made between "legacy workloads" and "new workloads". The term "workload" stems from the multi-tenancy concept of mainframes: multiple processes running in parallel on the same system. With today's multi-core CPUs and virtualization software, an analogy exists even for Intel-based PCs.

It's laughable that IBM claims the mainframe is just a "niche" of the overall server market and denies antitrust implications by claiming fierce competition from other server platforms, especially from distributed solutions. Alternative platforms can't run those mainframe legacy programs. For companies starting from scratch -- I mentioned Google and Facebook -- there are certainly some more cost-effective choices. But not for all those banks, insurance companies or governmental agencies who depend on their mainframe every day.

Migrations that enable such customers to dump their overpriced mainframes are few and far between. Not only are they rare, but they also usually involve smaller solutions, so they don't represent a significant chunk of the overall mainframe business in financial terms.

I have seen the slides of an internal IBM presentation that was given last year and whose presenter claimed that for banks, (other) financial services, reservations, transaction accounts and batch workloads there's simply "no effective alternative on distributed [platforms]". So much for a competitive market.

The sheer cost of migration isn't the only obstacle. There are also other factors that make it very difficult in technical and organizational terms, and I'll elaborate on those some other time.

z/Linux isn't the answer to the lock-in problem

So where does z/Linux -- the mainframe version of the GNU operating system and the Linux kernel -- fit into all of this?

IBM recently celebrated the tenth birthday of z/Linux. Novell's SUSE Linux is still the most popular z/Linux distribution, and there are other choices. Virtualization makes it an option to run z/OS (the proprietary mainframe operating system on which the legacy code runs) and z/Linux in parallel.

Programs that are written for z/Linux can be easily recompiled (often without source code changes) for GNU/Linux on other platforms. But z/Linux isn't the answer to the existing lock-in problem: porting legacy code from z/OS to z/Linux is just as hard as porting it to any other platform.

New workloads are the raison d'être of z/Linux. If companies want to write new program code for an operating system that is available for different hardware platforms, z/Linux is eligible. In some cases, those new workloads may very well benefit from data exchange with legacy workloads. If a bank processes most of its transactions on a mainframe (which is what virtually all major banks do), it may choose z/Linux for the generation of dynamic web pages that use data from the transaction system. So z/Linux plays a complementary role -- but it isn't a substitutive force.

There may be a few exceptions, but it's a pretty accurate portrayal of the situation to say that mainframe legacy workloads run on z/OS and that z/Linux becomes a choice only at the start of entirely new projects.

IBM wouldn't have enabled the creation of z/Linux if its availability hurt its core business in any way. There was nothing stopping companies from developing software for new workloads on GNU/Linux for other servers, exchanging data with the mainframe via the network. There are important technical advantages to running everything on the same machine (which will be the subject of another posting), but the cost of mainframe equipment is so outrageous that z/OS would have lost the business of most of the new workloads to other operating systems (with GNU/Linux obviously capturing a large share on the server side) one way or the other.

The massive lock-in tax: ten times the price for identical performance

Recognizing that many new workloads have a real platform choice while legacy workloads are hopelessly locked in, IBM came up with a way to have it both ways: "coprocessors".

I put the word in quotes because IFL (Integrated Facility for Linux), zIIP (z Integrated Information Processor) and zAAP (z Application Assist Processor) aren't coprocessors in the traditional sense, such as an arithmetic coprocessor that can only perform some auxiliary computations. Instead, IFL, zIIP and zAAP are processors that execute actual program code. They are real CPUs.

The only way in which they're limited is that IBM erected artificial barriers (through microcode changes). Those have the effect that legacy workloads can't be executed on those processors. IFL can only be used for z/Linux, zIIP only for database purposes and zAAP only for Java programs and certain XML-related computations under z/OS, while legacy workloads are generally a matter of z/OS-based COBOL programs. But again, those limitations were built in on purpose.

Customers get two economic benefits from using those limited-purpose processors. One is that mainframe software license fees are calculated based on CPU power, and those limited-purpose processors are usually not factored in. The other is that IBM sells those limited-purpose processors at prices that are far lower than those of the general-purpose chips on which everything, including the legacy code, can be executed.

For example, if you buy a mainframe and "insert" a certain amount of processing power for all purposes (including legacy workloads), you pay about ten times as much as for the same hardware component that is limited to z/Linux. So for economic reasons, a mainframe customer will want to use the expensive general-purpose component only for legacy code and make as much use of lower-cost limited-purpose processors as possible for new workloads.
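To see how strongly that factor of ten shapes a purchasing decision, here is a small Python sketch. It is a toy model under stated assumptions: prices are in relative units (10 for general-purpose capacity, 1 for limited-purpose capacity, reflecting only the ratio mentioned above), the capacity figures in the example are invented, and software license fees are ignored entirely.

```python
# Relative prices per unit of processing capacity. Only the 10:1 ratio
# comes from the post; absolute prices are not given there.
GENERAL_PURPOSE_PRICE = 10
SPECIALTY_PRICE = 1


def hardware_cost(legacy_units: int, new_units: int, use_specialty: bool) -> int:
    """Relative hardware cost for a mix of legacy and new workloads.

    Legacy workloads must always run on general-purpose capacity.
    New workloads run on specialty capacity only if use_specialty is True.
    """
    if use_specialty:
        general_units, specialty_units = legacy_units, new_units
    else:
        general_units, specialty_units = legacy_units + new_units, 0
    return general_units * GENERAL_PURPOSE_PRICE + specialty_units * SPECIALTY_PRICE


# Hypothetical customer: 4 units of legacy capacity, 6 units of new workloads.
all_general = hardware_cost(4, 6, use_specialty=False)
split = hardware_cost(4, 6, use_specialty=True)

print(f"everything on general-purpose processors: {all_general}")
print(f"new workloads on specialty processors:    {split}")
```

In this invented example the split configuration costs 46 relative units instead of 100, and nearly all of the remaining cost (40 of 46) comes from the legacy capacity that has no cheaper home.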

Unbelievable but true

You might wonder whether the price difference -- a factor of 10 in general -- is justified by any other reason than the artificial limitation I mentioned. There isn't any other reason. You get the same processing speed and in terms of hardware (at the level of the circuitry) it's also the identical product. The limited-purpose components aren't even optimized to be particularly efficient for Linux, Java or XML or whatever. It's just that IBM determines unilaterally what you are allowed to run on them -- and what you are not.

This is a unique two-tiered business model. Customers see that IBM can, if it wants, sell them everything at a tenth of the price and still make money (otherwise IBM wouldn't do it). However, to the extent customers are required to use general-purpose processors because of the $5 trillion lock-in, they have to pay ten times as much.

Even one tenth of regular mainframe costs is still high compared to, for example, Intel-based servers. But it's pricing that shows there's at least a minimum of competitive dynamics in play.

Actually, there is a technical solution on the market to make legacy workloads eligible for execution on those less expensive "specialty" processors: zPrime. IBM doesn't want that one to be used. Its vendor, NEON Enterprise Software, filed an antitrust lawsuit against IBM in the US last year and recently lodged a complaint with the European Commission.

The lock-in tax reduces to absurdity any claims that z/Linux competes with z/OS or that Intel-based servers compete with mainframes. IBM couldn't charge -- for identical performance -- ten times the price if it didn't have a monopoly for platforms that execute legacy mainframe workloads. It shows that competitive server platforms and the legacy mainframe business aren't in the same market.

Think of two gas stations, located next to each other in the same street. At one of them, it costs you 60 euros to fill up your car. At the other, it costs 600 euros -- quantity and quality being equal. So no customer in his right mind would go to the place that charges 600 euros -- unless there's some reason for which certain customers don't have a choice.

This must change.
