z/VM’s ancestry dates back to the ’60s, although most sites today understand it purely as a facilitator of Linux virtualization—and for good reason.
z/VM’s career began with hypervisor technology that first appeared commercially in the early ’70s as VM/370, which allowed guest operating systems to access services from a simulated machine created by the underlying operating system control program. The hypervisor literally ran on “bare metal,” supporting a variety of operating systems at one level removed from the hardware. z/VM was the first way, and is still the favored way, to run multiple system images of diverse operating systems on a System z.
However, there are many other z/VM capabilities sites can exploit to their advantage that both complement and extend beyond virtualization. This article discusses how z/VM is making a difference for organizations—and what sites can do to get the most out of it.
The Current z/VM Market
IBM acknowledges that the ability to run virtual Linux servers captures the lion’s share of interest in z/VM these days. Forty percent of VSE customers also run production workloads in the z/VM environment, and practically 100 percent of TPF clients run z/VM for development and test support. “Approximately eight percent of z/OS clients also run test and development environments on z/VM,” says Reed Mullen of IBM System z Strategy and Technology. “They’re finding a savings in sharing system resources among multiple copies of z/OS running on z/VM in a single Logical Partition (LPAR). Developers and testers appreciate having their own copy of z/OS. They can test their environment on their schedule, without worrying about the impact their work may have on other users.” According to Velocity Software CEO, Barton Robinson, there also are pockets of sites that have built up libraries of custom applications in z/VM that aren’t vendor-dependent.
IBM’s Mullen says it’s hard to categorize all the benefits of z/VM because its options and uses are so broad.
“One reason z/VM is valuable to enterprises is because of its hypervisor technology,” Mullen says. “z/VM doesn’t constrain customers from using whatever System z technology they wish to use, and it literally frees them from some of the limitations a non-virtualized environment might impose. In a virtual machine environment, it’s possible to configure systems with assets that might not exist in the real hardware configuration. This allows clients to optimize their workloads and to experiment, innovate, and perform system performance assessments. In this way, sites can try out ideas in the virtual world before implementing them in production.”
Certainly, z/VM use is growing with the introduction of Linux. “The market had significantly dropped during the ’90s as IBM was sending out signals that the product was being killed off,” says Marist College’s Martha McConaghy. “But the introduction of Linux stopped that decline … I’ve consistently seen new faces at SHARE and other conferences and have heard anecdotal stories of new IBM customers who are purchasing hardware and z/VM in order to run Linux.”
Flexibility With Operating Systems and Facilities
With the focus on Linux, other mission-critical operating systems such as TPF (now known as z/TPF) aren’t often talked about. Nor is the fact that TPF clients would struggle to support their production workloads without the z/VM layer providing cost-effective development and test support.
“The TPF operating system framework and z/VM are used in transaction-critical environments,” says David Boyes, founder and CEO of Sine Nomine Associates. “TPF requires z/VM for testing, and TPF was originally used for rapid transaction processing of reservations in the airline industry. TPF was then extended to other industries with rapid transaction processing requirements, such as hotels with reservation systems and telephone companies with billing systems. In this environment, transaction processing must be quick and response times must be guaranteed. z/VM is the facilitator that makes testing and production efficient.”
Until recently, software developers using the TPF operating system were limited to writing applications in a development environment entirely outside of TPF. However, the growth of management and development tools for Linux application development under z/VM also has directly transferred benefits to developers working in TPF. “TPF developers can now construct system images using Linux-based tools and then directly stick them into the TPF framework on the System z,” Boyes says. “The end result is that these developers now have dozens of development tools and frameworks to work with.”
Application development and test advantages with z/VM extend to other operating systems besides TPF and Linux.
IBM’s Mullen estimates that 40 to 50 percent of System z sites now use z/VM in test as well as in production environments because they’re able to recognize and resolve problems quicker. “We use z/VM internally at IBM for the same reasons,” Mullen adds.
Sites that have copies of z/OS often use z/VM to perform system level configuration testing, which they can’t do when they’re sharing z/OS in a big LPAR with other developers. “The early success of z/VM came from the ability to consolidate boxes and provide a cost savings for application development and test,” Mullen says. “There was newfound efficiency in sharing hardware assets between developers … and if the end user needed to reboot in a z/VM environment, who cared? There was no need to be sensitive to other resources going down or becoming unavailable.”
Equally significant is the fact that organizations can nimbly implement multiple system images since they can readily allocate and move resources in the z/VM virtual environment without having to disturb hardware assets. On the System z10, the recently released z/VM Version 5 Release 4 can run in partitions that mix Integrated Facility for Linux (IFL) engines with other processor types, serving Linux and other operating systems together. VM has always been able to run multiple operating systems in one partition (z/OS, z/VSE, TPF, and Linux on System z), but with this release it can now mix special purpose processors and application accelerators such as IFLs, IBM z10 Integrated Information Processors (zIIPs), and System z Application Assist Processors (zAAPs) with workloads dispatched on the appropriate real CPU. Fractions of physical processors can be allocated to guest systems, which is a nice economy for sites. This differs from the System z9 environment, which has earlier processing technology and requires separate LPARs to run Linux on IFLs and z/OS on general purpose CPUs.
Management, Automation, and Workflow Control
As they move to virtualization and host multiple operating systems, sites familiar with earlier VM utilities that have been carried forward into z/VM are also finding new ways to handle workload management, automation, and workflow control.
Two legacy tools that have remained a part of z/VM are the Conversational Monitor System (CMS) and the Rexx programming language.
“CMS’ scripting language is what makes it so valuable,” says Boyes. “With CMS scripting, a z/VM systems programmer can automate many different z/VM operations. This enhances the overall operations of z/VM because Rexx scripting allows sites to control the processing of multiple applications on a single machine and save money over a distributed environment. Sites also get full mainframe-strength Quality of Service (QoS), availability, and security.”
Sites can additionally use CMS as a systems management tool and as a control interface for z/VM’s virtual hypervisor layer. “CMS, part of the z/VM product, is a single user environment that runs in its own virtual machine and allows a systems programmer to script workflow and automation routines,” says Mullen. “Many clients still use CMS to host business applications, but CMS serves an even more valuable role in providing infrastructure support to help manage large-scale, virtual server deployments. CMS provides an execution environment for running performance monitoring tools such as the Performance Toolkit for VM feature, as well as external security managers such as the z/VM RACF feature. It allows you to write scripts to automate operational tasks and respond to virtual machine demands as they present themselves.”
Because of its history with the System Control Program (CP), CMS has much richer and more well-developed interfaces for interacting with CP than Linux does. These include the ability to use Rexx and obtain success/fail information about commands, as well as access to tape library management and other complex System z devices that simply don’t yet exist for Linux. If a site adopts the approach of starting CMS first, doing setup, and then starting Linux, IT gains access to 35-plus years of knowledge about CP that CMS and Rexx already have to get automation and environments set up before Linux takes control. This allows the guests to adapt to conditions that are outside a specific Linux guest.
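The CMS-first approach described above amounts to a guarded startup sequence: run each setup step, check its return code, and hand control to Linux only if every step succeeded. The Python sketch below is an illustration only. At a real site this logic would more likely live in a Rexx exec under CMS, and the step names and callables here are hypothetical stand-ins, not actual CP or CMS commands.

```python
from typing import Callable, List, Tuple

# Each step is a (description, action) pair; the action returns a
# return code, where 0 means success (an assumed convention).
Step = Tuple[str, Callable[[], int]]


def run_setup_then_boot(steps: List[Step],
                        boot: Callable[[], int]) -> Tuple[bool, List[str]]:
    """Run setup steps in order; boot the guest only if all succeed.

    Returns (booted, log), where log records the outcome of each step.
    """
    log: List[str] = []
    for description, action in steps:
        rc = action()
        log.append(f"{description}: rc={rc}")
        if rc != 0:
            # Stop at the first failure so the guest never starts in a
            # half-configured environment.
            log.append("boot skipped")
            return False, log
    rc = boot()
    log.append(f"boot: rc={rc}")
    return rc == 0, log


# Hypothetical steps standing in for real setup work.
demo_steps: List[Step] = [
    ("attach disks", lambda: 0),
    ("check tape library", lambda: 0),
]
booted, demo_log = run_setup_then_boot(demo_steps, boot=lambda: 0)
```

The point of the design is the early exit: because each step reports success or failure, the sequence can refuse to start the guest rather than discover a problem after Linux has taken control.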
An execution environment such as CMS and operating systems such as TPF also are being used by sites for rapid transaction processing, and for automation and workflow environments that deliver both maturity and sophistication in their solutions. All are part of an enormous collection of tooling that characterizes z/VM by virtue of its longevity as a System z operating platform. “CMS and TPF bring a lot to the table, but I think one of the most underexploited resources in z/VM is Rexx,” says Boyes. “Rexx is an interpreted programming language often used for automation and workflow control, and is one of the best descriptive language creators. Even our UNIX guys can read its code!”
Business process flows can be captured and automated with Rexx, which also provides access to system functions and gives the ability to write well-structured, understandable code. Multi-thousand-line Rexx applications run in production at installations around the world.
“If you’ve used a procedural programming language, you can recognize the coding structure of Rexx,” says Boyes. “Rexx is a language designed for non-computer-literate people. Also, if you’re a site that’s had VM for a longer period of time, you’ll remember how wonderful Rexx was when it first was introduced in 1982. But even if you’re a new site with a z/VM focus primarily on virtualization and you have no immediate CMS or workload management needs, you can still use Rexx if you’re open to exploration. System education is driven by applications—and Rexx can do some very sophisticated things.”
Two of these things are streamlining workloads and payloads. A site can easily get the system to a point where it has one LPAR that runs everything under z/VM. This can translate to three people supporting 15 to 20 systems under z/VM, as compared to z/OS, which requires three people by itself. “This makes everything easier to manage,” says Boyes. “You can have open systems such as Solaris under z/VM and at the same time have mainframe strength quality of service and automation.”
Scalability and Continuity
Because of its vast tooling and flexible capabilities for virtualization, z/VM also is widely employed in disaster recovery and business continuation. Its strength rests in being able to rapidly replicate a real machine environment in a virtual machine environment. Using this feature, a disaster recovery site can have a pre-set client configuration equivalent to the normal operating environment set up in a virtual machine, and can quickly deploy this in the event of an emergency. The time to deploy is significantly less than what it would be using LPARs.
“This is an extremely efficient way to do business, and is an easy way for a client to satisfy a concern of whether its production environment will recover and run acceptably in a failover situation without the look and feel of an actual production environment,” says Mullen. “When you compare virtual vs. LPAR costs, the differences are dramatic and you can easily see the savings in using z/VM.”
Boyes says that from a disaster recovery point of view, the optimum benefit of z/VM is to get parallel recovery streams going. “To address this, you can construct a very simple, one-pack system using z/VM,” he says. “In comparison, with z/OS you normally need to restore eight to 10 physical volumes to have a viable recovery system. With z/VM, you can restore and boot from one pack and then restore a large number of virtual systems in one simultaneous operation. If you add Rexx to automate that process, you can load tapes on as many tape drives as you have and literally walk away. There is an advantage to this automation from a human standpoint. Once the single system pack goes on, you’re free to work on something else and the time to restore is shorter. Sites with a history and familiarity with z/VM are using this now and have been for decades, and the z/OS world is just discovering how cool it is.”
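The parallel recovery streams Boyes describes can be pictured as a pool of restore jobs, with one job running per available tape drive. The Python sketch below is a rough illustration under stated assumptions: restore_guest is a hypothetical placeholder that only reports success, the guest names are invented, and a real site would drive the actual restore with Rexx and its own tape tooling.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def restore_guest(name: str) -> str:
    """Hypothetical stand-in for restoring one virtual system from tape."""
    # A real job would mount a tape and copy data; here we only report.
    return f"{name} restored"


def restore_all(guests: List[str], tape_drives: int) -> Dict[str, str]:
    """Restore guests in parallel, with at most one job per tape drive."""
    if tape_drives < 1:
        raise ValueError("need at least one tape drive")
    with ThreadPoolExecutor(max_workers=tape_drives) as pool:
        # map() preserves input order, so results line up with guests.
        results = list(pool.map(restore_guest, guests))
    return dict(zip(guests, results))


# Invented guest names, two drives: the third restore waits for a free drive.
status = restore_all(["LINUX01", "LINUX02", "VSE01"], tape_drives=2)
```

Capping the pool at the number of drives mirrors the constraint in the quote: the operator loads as many tapes as there are drives, and the remaining restores queue up without further attention.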
The “Wish List” for z/VM
z/VM possesses enormous flexibility, but IBM still positions it as a virtualization environment for System z that gives sites flexibility with the scaling of their IT architecture and with asset management.
“You can start very modestly with a single CPU of capacity on the mainframe, and then scale this to your needs, all the while taking advantage of z/VM’s ability to over-commit CPU resources, which helps control software costs,” says Mullen. “Today on the IBM System z10, there are up to 64 physical CPUs that are separately configurable. You also have the full benefit of mainframe-strength security with SSL and data encryption capabilities. With z/VM 5.4, you can add hardware to a System z/VM LPAR without having to bring down the LPAR.”
Nevertheless, there are capabilities z/VM doesn’t have today that sites and industry experts would like to see.
“IBM has improved the LPAR flexibility under z/VM, but sites still don’t have the ability to dynamically change large parts of the system environment without resetting an LPAR,” says Boyes.
IBM’s Mullen comments, “There are some support options available with LPAR technology that aren’t concurrently available in z/VM from the standpoint of hardware support. Most notable might be the fact that z/OS guests running on z/VM can’t access a real Parallel Sysplex environment. At the same time, it should be noted that one can host a virtual Parallel Sysplex environment wholly in a single copy of z/VM. z/VM will host a copy of the Coupling Facility Control Code (CFCC) in any number of virtual machines. Any number of z/OS guests connect to the CFCC to form a Parallel Sysplex. This allows users to test production workloads and profile certain changes they want to make in this test environment before going live. z/VM also is a training environment for the z/OS staff using Parallel Sysplex. It helps that you don’t have to worry during a test if you break the system. The same is true in the Linux space. If clients bring instances of Linux to the mainframe, they can host additional copies of z/VM, which give staff opportunities to practice using z/VM facilities and features. This capability is quite unique to the industry because you can actually run the z/VM hypervisor on top of itself. This isn’t possible with virtualization products from VMware or Microsoft.”
Boyes would like to see expansion of integrating z/VM into the larger arena of enterprise system automation.
“Many tools have been developed in the past to handle the sequencing of events for multiple platforms,” says Boyes. “z/VM surrounds the other operating systems running as guest virtual machines, and has a complete toolset for sequencing and automation. An interesting area would be extending this capability outside of z/VM. Many things are built into z/VM like a programmable operator. These workload management tools could be brought to bear outside the System z. The emphasis should be on system management and on answering the question of how network and storage components can be incorporated for pieces of transactions that cross multiple systems. Distributed sites are starting to talk about this, and z/VM could deliver.”
Another area is the enhancement of Network Job Entry (NJE). “One big printer vendor didn’t want to make channel-attached hardware anymore, so the question for a large and complex mainframe is how do you integrate with devices like this with minimum input?” says Boyes. “NJE can fill this need because it was introduced back in the late ’70s to handle remote NJE workstations. It also has a combination of hardware and software support, but it never made it outside the mainframe system. It has unattended file transfer for printouts and deliveries, a spooling system and automation that automatically take place.”
Since NJE can provide peer services, sites suddenly have access to transport log files and real-time log data, which can be moved up to mainframe tools. “With the mainframe, you have analytics that can be applied,” says Boyes. “If you have network connectivity between distributed systems and the mainframe, you can start to apply mainframe automation to a distributed environment. The bottom line is that it’s easier to manage from one platform. It also pleases the auditors, because mainframe resources meet the requirements for audit trails, which isn’t always the case with distributed architectures.”
Marist’s McConaghy would like to see “live migration,” where a Linux workload could be moved from one VM system to another without losing network connection or data. “This isn’t a ‘fail over’ to another physical server, but the movement of the server from one LPAR or processor to another,” she says. “This would allow for planned outages, without taking an outage of the service. When you run dozens or even hundreds of virtual servers on a VM system, it becomes very difficult to find a window in which to do system maintenance. In some cases, VM systems haven’t been shut down for years because there are no windows of opportunity. The capability of being able to move live workloads from one system to another would solve many problems like that. Though no announcement has been made, I know IBM is very seriously looking at this.”
Future Outlook and Concluding Remarks
IBM announced z/VM Version 5 Release 4 on Aug. 5, 2008. The release became generally available on Sept. 12, 2008.
In addition to supporting dynamic memory upgrade and z/VM-mode LPARs, which are System z10 LPARs that can be configured with IFLs and general purpose CPUs, z/VM 5.4 provides support for an added level of CPU utilization for guest systems. A virtual Linux server configured with multiple CPUs can dynamically turn off some of the CPUs when the multi-processing characteristics of its workload diminish. This happens without reducing overall performance of the Linux on z/VM environment, and it cuts the number of unnecessary virtual CPUs z/VM must manage.
“This means, for example, if a Linux guest is configured with four virtual CPUs, and the workload running in the guest can be dispatched on no more than two CPUs, Linux can turn off two of the CPUs and z/VM will automatically allocate the capacity of those CPUs to those that are still active,” Mullen says. “It’s a case of optimizing the efficiency of the machine and wringing out every bit of capacity, enhancing the Total Cost of Ownership (TCO) associated with running Linux on z/VM.”
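Mullen’s four-CPU example can be worked through as simple arithmetic. The Python sketch below assumes, purely for illustration, that a guest’s total capacity is split evenly across whichever of its virtual CPUs are active. The function name, the CPU labels, and the figure of 100 capacity units are hypothetical, and the real z/VM scheduler is more involved than this model.

```python
from typing import Dict, List


def capacity_per_active_cpu(guest_capacity: float,
                            cpus: List[str],
                            stopped: List[str]) -> Dict[str, float]:
    """Split a guest's total capacity evenly across its active virtual CPUs."""
    active = [c for c in cpus if c not in stopped]
    if not active:
        raise ValueError("at least one virtual CPU must stay active")
    share = guest_capacity / len(active)
    return {cpu: share for cpu in active}


cpus = ["cpu0", "cpu1", "cpu2", "cpu3"]

# All four virtual CPUs active: 25 units each.
before = capacity_per_active_cpu(100.0, cpus, stopped=[])

# Linux turns off two CPUs: the same 100 units now go to the remaining two.
after = capacity_per_active_cpu(100.0, cpus, stopped=["cpu2", "cpu3"])
```

In this model the guest’s total stays at 100 units in both cases. Only the number of virtual CPUs the hypervisor has to dispatch changes, which is the efficiency gain the quote describes.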
z/VM will continue to be a System z “workhorse” operating system. Today, sites tend to use TPF, Rexx, CMS, and disaster recovery and business continuity applications in isolation, often depending on their histories with z/VM. The next step could well be for each site to take stock of all z/VM has to offer and make the most of it across multiple phases of IT infrastructure.