It used to be the case that when you wanted to deploy a new application, you would need to buy new server hardware to host it. Today, however, there are many different virtualization technologies to choose from, each allowing you to run more than one virtual server per physical machine. Virtualization has a number of benefits: lower cost, and reduced power, space, and cooling requirements. Of course, you need a machine powerful enough, but many services, especially internal ones such as company wikis and instant messaging servers, do not require the full resources of a physical server, and it makes sense to combine them using virtualization.
In many cases, web applications can be combined on a single server using virtual hosting facilities in Apache, but this is an imperfect solution. Inevitably the situation arises where you have an application that doesn't play well in a virtual hosting situation, be it badly written, or requiring specific versions of libraries or modules that conflict with another application. There are also administrative concerns — anybody who has access to one application has access to them all.
The virtual hosting approach also forgoes one of the biggest benefits of virtualizing entire servers. Many virtualization technologies provide some method of transferring a virtual machine between physical hardware: if a particular server is behaving badly, just transfer all the virtual machines onto replacement hardware, with little to no loss of service and without having to reinstall the operating system or applications.
Here at OmniTI many of our servers run Solaris, giving us two very powerful features on which we heavily rely when it comes to making use of virtualization: Solaris containers (Zones), and ZFS.
Virtualization using Zones
Zones provide lightweight virtualization for Solaris. Unlike many other virtualization solutions, such as VMware or VirtualBox, Solaris zones don't emulate physical hardware on which several complete operating systems run; instead, a single kernel runs on the system, with user programs running in multiple partitions (the zones).
This type of virtualization doesn't force you to pick a set amount of RAM for each virtual machine, or set up virtual disk images (although resource limits can be set for each zone). Because there is no hardware emulation going on, zones are also incredibly fast — fast enough that we are able to run multiple production services on a single machine without any perceptible slowdown. Even for high traffic sites that can saturate an entire (physical) server, we are still able to make use of zones (with just one non-global zone per server) without any significant performance hit. This allows us to benefit from the ease of moving a zone from one machine to another, either in the event of hardware failure, or to migrate to a more powerful machine.
Zones can also ease administration of multiple servers by centralizing package management. By default, any package installed in the global zone is automatically installed in all non-global zones. You can also specify that certain paths are inherited from the global zone, reducing the disk space required per zone. The inherited paths become read-only, forcing them to be the same across all zones. If all packages are installed from the global zone, and you make use of inherited paths, then you can be assured that every zone has the same software configuration.
However, this doesn't have to be the case — you also have the option of installing packages in the zones themselves if different zones need different packages installed. To do this, don't inherit any directories. This creates a 'large' or 'whole root' zone, and you are free to install whatever is needed inside the zone itself.
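To make this concrete, here is roughly what configuring, installing and booting a new zone looks like from the global zone (the zone name, zonepath, network interface and address below are made-up values for illustration):

# zonecfg -z wiki
zonecfg:wiki> create
zonecfg:wiki> set zonepath=/zones/wiki
zonecfg:wiki> set autoboot=true
zonecfg:wiki> add net
zonecfg:wiki:net> set physical=e1000g0
zonecfg:wiki:net> set address=10.0.0.10/24
zonecfg:wiki:net> end
zonecfg:wiki> commit
zonecfg:wiki> exit
# zoneadm -z wiki install
# zoneadm -z wiki boot

The plain create above gives you the default sparse zone, which inherits /lib, /platform, /sbin and /usr from the global zone; use create -b instead to start from a blank configuration with no inherited directories and build a whole root zone.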
ZFS and Zones
ZFS has many useful features that put it far ahead of most other filesystems available today. Several of them are of particular interest here because they make virtualization better: a pooled storage model, snapshots, and the ability to transfer filesystems via the zfs send command.
Pooled storage does away with the idea of having filesystems on individual partitions, and having to guess how much space will be occupied by individual filesystems. You just create one pool across the entire disk (or set of disks) that you want to store your data on. Any filesystems you then create in that pool will only use up as much space as needed to hold the data.
In practice, this means we can create individual filesystems for each of our zones without having to worry about how much space to assign to each. Having each zone on its own filesystem is required to be able to snapshot, back up, restore, and transfer zones individually.
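As a sketch (the pool and device names here are placeholders), setting up per-zone filesystems is as simple as:

# zpool create data mirror c1t0d0 c1t1d0
# zfs create data/zones
# zfs set mountpoint=/zones data/zones
# zfs create data/zones/wiki

None of these filesystems has a fixed size; they all draw from the pool as needed, and per-filesystem quotas or reservations can be set later with zfs set if you do want limits.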
Snapshots give you almost instant point-in-time copies of your filesystem, each of which only takes up enough space to hold what has changed since the snapshot was taken. The benefits of this are numerous, including the ability to roll back to an earlier point in time and to take consistent backups (back up from the snapshot, and you won't have files being modified while the backup is in progress). From the point of view of virtualization, however, one of the biggest benefits of snapshots comes in combination with zfs send.
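Taking and using a snapshot is a one-liner (the filesystem and snapshot names are just examples):

# zfs snapshot data/zones/wiki@before-upgrade
# zfs rollback data/zones/wiki@before-upgrade

The snapshot appears almost instantly and consumes no extra space at first; rollback returns the filesystem to exactly the state it was in when the snapshot was taken, and zfs list -t snapshot shows what snapshots you have and how much space each one has grown to use.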
The zfs send command allows you to send a snapshot of a ZFS filesystem from one machine to another (or to the same machine, if you so desire):
# zfs send data/zones/myzone@somesnapshot | \
    ssh remote_machine zfs receive data/zones/myzone
This allows you to quickly move (or copy) a zone from one machine to another: detach your zone, zfs send the filesystem to another machine, attach the zone, and you have your zone up and running on a completely different machine.
You can also make use of incremental snapshots to minimize the amount of time the zone is down (a zone has to be halted in order to detach it): snapshot the zone's filesystem and send it across while the zone is still running, shut the zone down, detach it, snapshot the zone's filesystem once more and send the incremental snapshot across.
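Put together, a migration along those lines looks roughly like this (zone, pool and host names are placeholders; the -F on the second receive simply discards any changes made to the copy on the new host since the first stream arrived):

# zfs snapshot data/zones/wiki@move1
# zfs send data/zones/wiki@move1 | \
    ssh newhost zfs receive data/zones/wiki
# zoneadm -z wiki halt
# zoneadm -z wiki detach
# zfs snapshot data/zones/wiki@move2
# zfs send -i data/zones/wiki@move1 data/zones/wiki@move2 | \
    ssh newhost zfs receive -F data/zones/wiki

Then, on the new host, recreate the configuration from the detached zone and attach it:

newhost# zonecfg -z wiki
zonecfg:wiki> create -a /zones/wiki
zonecfg:wiki> commit
zonecfg:wiki> exit
newhost# zoneadm -z wiki attach
newhost# zoneadm -z wiki boot

This sketch assumes a data/zones filesystem mounted at /zones already exists on the new host, and that both machines have compatible package and patch levels, which zoneadm attach will verify.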
Until recently there were issues with upgrading zones that live on a ZFS filesystem, but this has been fixed in Solaris 10 10/08 (update 6), and Live Upgrade is now supported. There is now little reason not to use ZFS as the filesystem for zones.
Backing it all up with Zetaback
It doesn't take much thought to realize that the snapshot and zfs send facilities can also be used to take backups of systems, especially when you make use of incremental snapshots. At OmniTI we have developed Zetaback, a backup tool based on ZFS that automates much of the work of taking and managing backups.
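Under the hood this is the same snapshot-and-send pattern shown above; an incremental backup is really just a stream stored somewhere safe, for example (paths, host and snapshot names are illustrative):

# zfs snapshot data/zones/wiki@2009-01-05
# zfs send -i data/zones/wiki@2009-01-04 data/zones/wiki@2009-01-05 | \
    ssh backuphost "cat > /backups/wiki.2009-01-04_2009-01-05.zfs"

To restore, the streams are replayed with zfs receive in order: the most recent full stream first, then each incremental on top of it. Keeping track of full and incremental streams by hand gets tedious quickly, and that bookkeeping is exactly what Zetaback takes care of.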
With Zetaback, you specify a list of hosts, a retention policy, and how often to take full and incremental backups; then you just let it go. It connects to each host via ssh, scans the host for filesystems to back up, and by default will back up everything, automatically picking up new filesystems. You can filter the list using regular expressions if you want to limit what is backed up.
In addition to taking the backups themselves, Zetaback provides tools to quickly restore ZFS filesystems, view the status of backups, and generate reports showing which filesystems violate the backup policy (e.g. those that have not had a successful backup in a week).
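Conceptually, the policy you hand Zetaback boils down to something like the following sketch. Note that the key names and layout here are purely illustrative, not the literal configuration syntax; consult the documentation that ships with Zetaback for the real format:

# illustrative sketch only; see the Zetaback documentation for the real key names
db1.example.com {
  retention       = 14d            # keep backups for two weeks
  full_interval   = 7d             # one full backup per week
  backup_interval = 1d             # daily incrementals in between
  filesystems     = ^data/zones/   # regex filter; the default is everything
}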
The choice to make use of virtualization is often an easy one; the choice of which solution to go with is somewhat harder. If Solaris meets the needs of your applications, then Zones are worth considering. Combined with the features of ZFS and Zetaback, they provide a flexible and powerful solution.
Some real-world numbers
We're a web infrastructure and development shop, so we run a lot of development servers. Each environment needs the flexibility of its own software selection, right down to specific versions. To accommodate that, we run 37 zones on 2 development servers. Each development server has 8GB of RAM and two dual-core 64-bit AMD processors; in financial terms, about $2300 each. Our production boxes, which serve corporate mail, document management, version control, instant messaging, directory services, etc., all run in zones as well. For that we have two boxes (just like the development ones) on which 17 zones happily reside. All of our important services run on a rather small set of machines that are easy to manage, cheap to power, and cheap to cool. And for our purposes, it is far more efficient than heavyweight virtualization like VMware ESX.
We've been running this type of lightweight virtualization for over two years now. We're pretty happy with it. I suggest you give it a whirl.