11 Jun Welcome (to) The Machine by HP

Every generation of technology carries legacy artifacts and limitations from its formation. The user experience (UX) of everything before the Macintosh was based on a command line interface (CLI), which is still present as a deep control mode for mainframes, servers, and even PCs – even though most users have forgotten how to use it or have never developed a need to know. The UX of the Macintosh, which borrowed from Xerox and in turn heavily influenced Windows, was initially based on WIMP – windows, icons, menus, and pointer. Anyone who has used a modern Mac or Windows PC can attest that touch is still not quite a first-class citizen in a keyboard-first UX design. Smartphones and tablets are great for touch UX, but it’s amusing to see people yelling voice commands into them. And so progress marches on.

Some assumptions are even more basic – they were baked into the very foundations of electronic computing over half a century ago. For instance, mainframes were initially designed to run a single program from start to finish – called “batch” jobs. The concept was (and is) that a computer would run one program at a time, and each program owned the machine it was running on while it was running – the running program had exclusive access to all of the resources, all of the time, until it completed, if it completed (not a given). Often the only way to stop an inadvertent infinite loop was to reset the machine. We invented time slicing so that we might save the state of one job, run another job for a while, and go back to the first. Then we invented virtualization to hide the actual state of the machine so that a scheduler might play fancy tricks with exposing some features to applications code, but not all of them – and we learned how to expose different features to different applications running in the same virtual environment.
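To make the time-slicing idea concrete, here is a minimal sketch in Python. It is a toy, not how any real scheduler is built: the names `job` and `run_round_robin` are made up for illustration, and each generator stands in for a job whose state is saved when it yields and restored when it resumes.

```python
from collections import deque


def job(name, steps):
    """A toy batch job that gives up the machine after each unit of work."""
    for i in range(steps):
        # The generator keeps its local state (i) while suspended,
        # which stands in for "saving the state of one job".
        yield f"{name}:{i}"


def run_round_robin(jobs, slice_len=1):
    """Run each job for up to slice_len steps, then switch to the next one."""
    queue = deque(jobs)
    trace = []
    while queue:
        current = queue.popleft()
        finished = False
        for _ in range(slice_len):
            try:
                trace.append(next(current))
            except StopIteration:
                finished = True
                break
        if not finished:
            queue.append(current)  # go back to this job later
    return trace


trace = run_round_robin([job("A", 3), job("B", 2)])
print(trace)
```

With a slice length of one step, the two jobs interleave instead of job A owning the machine until it completes.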

But, at the core, our enterprise hypervisors and operating systems have a frighteningly simple set of assumptions they inherited from simpler days, when electricity was cheap and dinosaurs ruled the Earth:

  • System memory is called “primary” memory – memory that is immediately accessible to compute resources. Secondary storage is where data goes to rest for a while when it is not needed at the moment – today we use hard disk and solid state drives as secondary storage. Tertiary storage is usually considered archival media today – tape drives, optical media, and the like.
  • Primary memory (system memory, all of it) is assumed to run at the same speed. There are a variety of causes for this assumption persisting, some involving the relative ease of simply enforcing least common denominator memory configurations and some because no one invested in challenging conventional wisdom for OS design.
    • Non-uniform memory architectures (NUMA) let us split memory into pools with latency penalties for accessing memory in a pool far away, but the pools are still assumed by default to have homogeneous performance.
    • Many embedded and mobile OSes can run code from non-volatile memory, but there are no performance assumptions behind that feature – no general OS constructs to say, for instance, that some code, but not other code, should be loaded into system memory before running it for better performance.
    • Clever processor cache hierarchies do not count; they replicate memory addresses as a bandage for this very deep and very old assumption.
  • Processors are assumed to be expensive, scarce commodities. Code and data both must be moved into the processor’s system memory in order for the code (was a program, now is an application) to perform some kind of operation based on the data or apply a transformation to the data.
  • Processors in a system are assumed to run a uniform instruction set architecture (ISA). At least the “real” processors running the core applications should run the same ISA – our current mainstream virtualization environments make significant compromises to enforce a least common denominator ISA across a pool of processors. We use a homogeneous core processor complex to dole out code and data to offload engines, and while we are just starting to let those offload engines access their data from system memory as “first-class citizens,” we still use a homogeneous core ISA to schedule what are effectively batch jobs on increasingly sophisticated and specialized offload engines.

The substantial architectural impacts of these assumptions include:

  • In a “real-time” system, external inputs need to be collected in a file or dedicated buffer and then moved into system memory so that code might use them to make decisions or transform them in some way. We improved performance on the movement part by enabling direct memory access (DMA) offload engines to automatically move data from buffers to a processor’s system memory without directly involving the processor. But as people and things continue to spew increasing amounts of data, moving data from end-points to secondary storage and moving data between secondary storage systems becomes a structural problem we call “data ingress” – how fast can a receiving system digest incoming data?
  • Processors that share a physical pool (or multiple pools) of memory must do so in a coherent fashion, to prevent time-based errors such as one processor reading data from shared system memory that another processor has already changed but not yet written back to shared system memory. We are starting to see looser definitions of coherence for some workload types, but even they must enforce “eventual” consistency.
  • File systems are assumed to contain one known-good copy of a file, mostly to prevent human confusion. Versioning file systems have been around for a while – when was the last time you explicitly used one? And when was the last time you put a version indicator explicitly in a file name? We keep an offline backup of all of our files in case the (singular) online version becomes corrupt. Or we don’t and we lose files. Newer distributed storage systems, such as those that slot into OpenStack’s Swift APIs, are starting to break this convention, but the idea of deduping files in a NAS or SAN stems from this primitive assumption.
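The stale-read hazard in the coherence point above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a real coherence protocol: `SharedMemory`, `Processor`, and their methods are hypothetical names, each processor has a private write-back cache, and invalidation is triggered by hand.

```python
class SharedMemory:
    """Shared system memory, modeled as address -> value."""

    def __init__(self):
        self.cells = {}


class Processor:
    """A processor with a private write-back cache (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}     # private copies of memory cells
        self.dirty = set()  # addresses changed locally, not yet written back

    def write(self, addr, value):
        # The change lands in the private cache first.
        self.cache[addr] = value
        self.dirty.add(addr)

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory.cells[addr]
        return self.cache[addr]

    def write_back(self):
        # Push local changes out to shared memory.
        for addr in self.dirty:
            self.memory.cells[addr] = self.cache[addr]
        self.dirty.clear()

    def invalidate(self, addr):
        # Drop the private copy so the next read goes to shared memory.
        self.cache.pop(addr, None)
        self.dirty.discard(addr)


memory = SharedMemory()
memory.cells["x"] = 1
p0, p1 = Processor(memory), Processor(memory)

p0.write("x", 2)      # changed in p0's cache, not yet in shared memory
stale = p1.read("x")  # p1 still sees the old value

p0.write_back()       # a coherence mechanism would force these two steps
p1.invalidate("x")
fresh = p1.read("x")  # now p1 sees the update

print(stale, fresh)
```

Hardware coherence automates the write-back and invalidate steps; looser models delay them but must still perform them eventually.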

It is sobering to think that many of these core assumptions have not changed substantially in decades. Our hardware has improved by orders of magnitude and scaling out clouds is challenging some of our ideas about coherence and file systems, but our widely used operating system environments are still stuck in the past.

HP’s new system architecture concept, based on their memristor memory design and photonic communications (silicon photonics and optical media), will enable us to fundamentally rethink computer architecture at a wide range of scales. It’s called simply “The Machine.”

The easy prediction is that first-generation memristor designs will use memristor-based system memory DIMMs as yet another level of cache (YALOC). We believe this will be the least interesting use for memristor DIMMs. Even though memristor DIMMs will eventually be fast enough to displace DRAM DIMMs, practically speaking that won’t happen for a while. That means, in order for HP to ensure success for memristor-based memory, HP must enable a heterogeneous memory universe. The real opportunity comes in mixing memory types with different features and performance characteristics over optical communications media.
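As a rough illustration of what software might do with mixed memory types, here is a minimal placement sketch. Everything in it is an assumption made for illustration: `Tier` and `place_pages` are hypothetical names, the latency and capacity numbers are placeholders rather than measurements, and nothing here reflects how HP's Machine OS actually works.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    relative_latency: int  # placeholder value, not a measurement
    capacity_pages: int


def place_pages(access_counts, tiers):
    """Greedy placement: hottest pages go to the lowest-latency tier with room."""
    ordered_tiers = sorted(tiers, key=lambda t: t.relative_latency)
    free = {t.name: t.capacity_pages for t in ordered_tiers}
    placement = {}
    # Sort by access count (descending), then by name so ties are deterministic.
    hottest_first = sorted(access_counts, key=lambda p: (-access_counts[p], p))
    for page in hottest_first:
        for tier in ordered_tiers:
            if free[tier.name] > 0:
                placement[page] = tier.name
                free[tier.name] -= 1
                break
        else:
            raise MemoryError(f"no tier has room for page {page!r}")
    return placement


# Two tiers of primary memory with different (made-up) characteristics.
tiers = [Tier("memristor", relative_latency=4, capacity_pages=3),
         Tier("dram", relative_latency=1, capacity_pages=2)]
counts = {"a": 90, "b": 5, "c": 40, "d": 1}

placement = place_pages(counts, tiers)
print(placement)
```

The two most frequently accessed pages land in the smaller, faster tier and the rest fall through to the larger one, which is the kind of decision a uniform-speed memory model never has to make.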

And that’s why HP has invested in Machine OS as a fundamental enabler for The Machine. And why one of the first things HP intends to do is to open source the Machine OS architecture and design. Core OS features are not reinvented every decade – involving as many people as possible will give HP new implementation ideas, but also new usage models.

This is big stuff. We have heterogeneous networks. We are well down the path to heterogeneous processing. Heterogeneous storage, however, is still viewed as a mix of secondary online and tertiary offline storage media, and that misses the point.

The Machine is offering a new model for heterogeneous primary memory. It has the potential to make the concept of “offline” seem quaint. And it could conceivably span end-points to datacenters. The Machine is about as disruptive as new tech can get.