[Virtualization] Introduction

Why study virtualization?

  • Almost all cloud applications run in the virtualization environment
  • Most IT infrastructures run in the cloud or on-prem virtualization environment
  • Understanding virtualization is key to building cloud infrastructures
  • Understanding virtualization will help application design

Operating Systems

  • A piece of software that manages and virtualizes hardware for applications
    – An indirection layer between applications and hardware
    – Provides a high-level interface to applications
    – While interacting with hardware devices through low-level interfaces
    – Runs privileged instructions to interact with hardware devices
  • Applications
    – Can only execute unprivileged instructions
    – Make system calls or trigger faults to “trap” into the OS
    – The OS protects applications from each other (to some extent), e.g., with separate address spaces
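
To make the “trap into the OS” idea concrete, here is a minimal sketch in Python. It assumes only the standard `os` module, whose `pipe`, `write`, and `read` functions are thin wrappers over the corresponding system calls; the helper name `roundtrip_through_kernel` is made up for illustration. The application never touches the pipe buffer directly: every step is a request to the kernel.

```python
import os


def roundtrip_through_kernel(payload: bytes) -> bytes:
    """Send a small payload through a kernel-managed pipe and read it back."""
    # pipe(): ask the kernel to create a pipe; we only get two file descriptors.
    read_fd, write_fd = os.pipe()
    try:
        # write(): trap into the kernel, which copies the bytes into its pipe buffer.
        written = os.write(write_fd, payload)
        # read(): trap again; the kernel copies the bytes back into our address space.
        return os.read(read_fd, written)
    finally:
        # close(): release the kernel objects behind the descriptors.
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(roundtrip_through_kernel(b"hello from user space"))
```

The payload is kept small on purpose so that the single `write` fits in the pipe buffer and does not block.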

Virtualization

  • Adding another level of indirection to run OSes on an abstraction of hardware
  • Virtual Machine (Guest OS)
    – OS that runs on virtualized hardware resources
    – Managed by another piece of software (VMM/hypervisor)
  • Virtual Machine Monitor (Hypervisor)
    – The software that creates and manages the execution of virtual machines
    – Runs on bare-metal hardware

History

Mainframes and IBM

  • Before we had datacenters or PCs, there were giant metal frames
  • Supported compute- and I/O-intensive commercial and scientific workloads
  • Expensive (the IBM 704 (1954) cost from $250K to millions)
  • “IBM and the seven dwarfs” – their heyday was the late ’50s through ’70s

Issues with Early Mainframes

  • Different generations were not architecturally compatible
  • Batch-oriented (as opposed to interactive)
  • Meanwhile, ideas for time-sharing OSes started to appear
  • The computer was becoming a multiplexed tool for a community of users, instead of being a batch tool for wizard programmers

IBM’s Response

  • IBM bet the company on the System/360 hardware family [1964]
    – S/360 was the first to clearly distinguish architecture and implementation
    – Its architecture was virtualizable
  • The CP/CMS system software [1968]
    – CP: a “control program” that created and managed virtual S/360 machines
    – CMS: the “Cambridge Monitor System”, a lightweight, single-user OS
    – With CP/CMS, several different OSes could run concurrently on the same hardware
  • IBM CP/CMS was the first virtualization system. Main purpose: letting multiple users share a mainframe

IBM’s Mainframe Product Line

  • System/360 (1964-1970)
    – Supported virtualization via CP/CMS, channel I/O, virtual memory, …
  • System/370 (1970-1988)
    – Reimplementation of CP/CMS as VM/370
  • System/390 (1990-2000)
  • zSeries (2000-present)
  • A huge moneymaker for IBM, and many businesses still depend on these!

PCs and Multi-User OSes

  • 1976: Steve Jobs and Steve Wozniak start Apple Computer and roll out the Apple I, the first computer with a single circuit board
  • 1981: The first IBM personal computer, code-named “Acorn,” is introduced. It uses Microsoft’s MS-DOS
  • 1983: Apple’s Lisa is the first personal computer with a GUI
  • 1985: Microsoft announces Windows
  • The PC market (1980-90s): ship hundreds of millions of units, not hundreds of units
  • Cluster computing (1990s): build a cheap mainframe out of a cluster of PCs

Multiprocessor and Stanford FLASH

  • Development of multiprocessor hardware boomed (1990s)
  • Stanford FLASH Multiprocessor
    – A multiprocessor that integrates global cache coherence and message passing
  • But system software lagged behind
  • Commodity OSes do not scale and cannot isolate/contain faults

Stanford Disco and VMware

  • Stanford Disco project (SOSP’97 Mendel Rosenblum et al.)
    – Extend modern OS to run efficiently on shared memory multiprocessors
    – A VMM built to run multiple copies of Silicon Graphics IRIX OS on FLASH
  • Mendel Rosenblum, Diane Greene, and others co-founded VMware in 1998
    – Brought virtualization to PCs. Main purpose: run different OSes on different architectures
    – Initial market was software developers testing software on multiple OSes
    – Acquired by EMC (2003), which later merged with Dell (2016)

Server Consolidation

  • Datacenters often run many services (e.g., search, mail server, database)
    – Easier to manage by running one service per machine
    – Leads to low resource utilization
  • Virtualization can “consolidate” servers by hosting many VMs per machine, each running one service
    – Higher resource utilization while still delivering manageability
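
The utilization argument can be checked with a little arithmetic. The sketch below is illustrative only: the per-service loads (percent of one machine's CPU) are made-up numbers, and first-fit decreasing is just one simple placement heuristic chosen for the example, not necessarily what any real datacenter scheduler uses. The names `consolidate` and `average_utilization` are hypothetical.

```python
def consolidate(loads, capacity=100):
    """Place each service's VM on a host using first-fit decreasing."""
    hosts = []  # each host is the list of loads placed on it
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError("service does not fit on a single host")
        for host in hosts:
            if sum(host) + load <= capacity:
                host.append(load)  # fits on an existing host
                break
        else:
            hosts.append([load])  # no host had room: power on another one
    return hosts


def average_utilization(loads, num_hosts, capacity=100):
    """Fraction of total host capacity that the services actually use."""
    return sum(loads) / (num_hosts * capacity)


if __name__ == "__main__":
    loads = [40, 35, 30, 25, 20, 10]  # hypothetical CPU load per service, in percent
    hosts = consolidate(loads)
    print("one service per machine:", len(loads), "hosts,",
          average_utilization(loads, len(loads)))
    print("consolidated with VMs:  ", len(hosts), "hosts,",
          average_utilization(loads, len(hosts)))
```

Each service still lives in its own VM, so the one-service-per-unit manageability is kept while the number of physical machines drops.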

The Cloud Era

  • The cloud revolution is what really made virtualization take off
  • Instead of renting physical machines, rent VMs
    – Better consolidation and resource utilization
    – Better portability and manageability
    – Easy to deploy and maintain software
    – However, raises certain security and QoS concerns
  • Many instance types, some with specialized hardware; all well maintained and patched
    – AWS: 241 instance types in 30 families (as of Dec 2019)

The Virtuous Cycle for Cloud Providers

  • More customers utilize more resources
  • Greater utilization of resources requires more infrastructure
  • Buying more infrastructure in volume leads to lower unit costs
  • Lower unit costs allow for lower customer prices
  • Lower prices attract more customers

Container

  • VMs run a complete OS on emulated hardware
    – Too heavyweight and unnecessary for many cloud use cases
    – Need to maintain OS versions, libraries, and make sure applications are compatible
  • Containers (e.g., Docker, LXC)
    – Run multiple isolated user-space applications on the host OS
    – Much more lightweight: better runtime performance, less memory, faster startup
    – Easier to deploy and maintain applications
    – But don’t provide as strong security boundaries as VMs

Managing Containers

  • Need a way to manage a cluster of containers
    – Handle failure, scheduling, monitoring, authentication, etc.
  • Kubernetes: the most popular container orchestration system today
  • Cloud providers also offer various container orchestration services
    – e.g., AWS ECS, EKS

Serverless Computing

  • VMs and containers in cloud still need to be “managed”
  • Is there a way to just write software and let the cloud do all the rest?
  • Serverless computing (mainly in the form of Function as a Service)
    – Autoscaled and billed by request load
    – No need to manage “server cluster” or handle failure
    – A lot less control and customization (e.g., fixed CPU/memory ratio, no direct communication across functions, no easy way to maintain state)
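
As a rough illustration of the programming model, here is a sketch of a stateless function plus a stand-in for the platform that invokes it. The `handler(event, context)` shape loosely follows the event/context style used by some FaaS platforms; the event fields, the response layout, and the `invoke_all` helper are assumptions made up for this example, not any provider's actual API.

```python
import json


def handler(event, context=None):
    """Stateless function: everything it needs arrives in `event`."""
    name = event.get("name", "world")
    # Nothing is kept between calls; any state would have to live in external storage.
    return {"statusCode": 200, "body": json.dumps({"message": f"hello, {name}"})}


def invoke_all(events):
    """Hypothetical platform stand-in: one independent invocation per request."""
    responses = [handler(event) for event in events]
    # In this toy model the billed unit is simply the number of invocations.
    return responses, len(responses)


if __name__ == "__main__":
    responses, billed = invoke_all([{"name": "vm"}, {}])
    print(billed, [r["body"] for r in responses])
```

The developer writes only `handler`; deciding how many copies run, where they run, and what happens when one fails is left to the platform.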

Summary of Virtualization History

  • Invented by IBM in the 1960s for sharing expensive mainframes
  • A popular research topic in the 1960s and 1970s
  • Interest died down as the adoption of cheap PCs and multi-user OSes surged in the 1980s
  • A (somewhat accidental) research idea got transferred to VMware
  • Real adoption happened with the growth of cloud computing
  • New forms of virtualization: container and serverless, in the modern cloud era
