As a continuously developed, steadily enhanced operating system, VSE offers many advantages, such as system stability, compatibility, and performance, in rapidly changing heterogeneous environments. However, the interrelations between many different system components and their huge number of functions and parameters involve some complexity, which is often difficult to understand, even from a systems programmer’s perspective.
This situation has led to the idea of an automated tool to provide an overall preventive Health Check by collecting VSE status and performance data and displaying aggregated information in a user-friendly, comprehensive way. Retrieved information can also be analyzed by applying a given set of rules, making it possible to express recommendations about how to optimize relevant system parameters. (For a detailed analysis on this, see “Design and Implementation of an XML-Based Rule Component for Health Checking on the VSE Operating System,” a diploma thesis by Jürgen-Hendrik Kuhn, Berufsakademie Stuttgart, September 2004, IBM Lab Boeblingen, Germany.)
Initial Prototype
From the first idea about a VSE Health Checker utility in Hans Joachim Ebert’s workshop, “Health Check for VSE/ESA” in 2002, it took one year to develop an initial prototype that was presented to customers at the Guide Share Europe (GSE) conference in Ulm, Germany, in October 2003. The prototype included these major functions:
• Automated retrieval of VSE system and performance data
• Graphical data display
• Portability of collected data by allowing storage of all collected information on disk in XML format, facilitating data exchange via e-mail
Data retrieval uses only VSE base functionality; no optional or third-party products are required. The tool never changes the VSE system; it only reads data.
Figure 1 shows the Health Checker’s main window with a display of the CICS TS storage layout below the line. The VSE Health Checker’s Graphical User Interface (GUI) has a tree view on the left side of the main window that shows all considered VSE system components hierarchically. For each component, a tabbed pane on the right side of the main window shows selected parameters and their actual values. Graphical charts help to visualize the information.
Advantages
The first prototype collected selected system data in an automated way, relieving the user from typing console commands, submitting jobs, invoking CICS transactions, and manual data evaluation. Here are some additional benefits of VSE Health Checker:
• Displayed information is aggregated from the output of multiple console commands or job listings. With priority settings, for example, data from console commands PRTY, D DYNC, and MAP for each static and dynamic partition can be collected and put together (see Figure 2).
• Displayed values are calculated from different single retrieved values. (In Figure 2, see size of the POWER partition above the line. This value is important, because if there are at least 2 to 3MB free space above the line, the POWER queue file can be kept in memory.)
• Unimportant information is removed. (In Figure 3, see the simple VTAM buffer usage table distilled from the large textual output of the D NET,BFRUSE command.)
Automated rule-based analysis is beneficial, as is the ability to detect possible problems before they show up (an example of the preventive service concept). System stability and performance can be optimized with recommended values of the various system parameters. Finally, working with the tool might help systems programmers hone their skills.
A-B-C Analysis
In the first step, the Health Checker provided functions to collect, display, import, and export selected system data. In the second step, a rule-based component was added for analyzing the collected data. The analysis work can be roughly divided into three areas, where data collection and data evaluation have different purposes:
A: Checking the most important things—20 percent of the work for 80 percent of the success
B: Getting into the details—additional 30 percent work for 15 percent more success
C: The fine tuning—another 50 percent work for the last 5 percent of the success.
The C-analysis is the most time-consuming task for a human expert, which makes it a good subject for a tool that can significantly speed up the process by automatically retrieving and processing the necessary data. In its first implementation, the VSE Health Checker covers many areas of the A-analysis and selected parts of the B-analysis. The C-analysis, with both information display and rule-based analysis, will require further development of the tool.
The Rule Component
The rule-based analysis component enables the VSE Health Checker to apply expert knowledge to the previously gathered data. This knowledge is stored in a knowledge base as a given set of rules with proposed optimal parameter values. The knowledge base is the most valuable part of the tool, because it took question-and-answer sessions with many human experts to transfer their knowledge into it.
Analysis results based on checked parameter values appear in the GUI with a traffic light analogy; they show as either green (good value), yellow (subject for review), or red (probable problem). There’s also a function to generate an analysis report in HTML format for printing. As rules are written in XML, customers can modify pre-defined rules according to their system environment and add new rules by editing the XML rules file, which represents the knowledge base. A graphical dialog for handling rules represents a potential future enhancement.
The Rule Syntax
Rules are written in XML and have two main parts:
• The rule section specifies up to three conditions, which are mapped to the colors red, yellow, and green. The conditions may appear in any order, but they’re evaluated in the order given. If more than one condition holds, the last one that evaluates to true determines the color.
• The explanation section refers to a local HTML file with a text description of the rule.
Figure 4 shows an example rule that says: In CICS TS, the TIMES_AT_MXT value shouldn’t be higher than 8. CICS can handle a maximum number of tasks in parallel and this value tells how many times this limit was reached. A value of zero is marked green; values between 1 and 8 are marked yellow. Values greater than 8 are marked red.
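The evaluation order described above can be sketched in Java. This is only an illustration, not the Health Checker's actual code: the class `TrafficLightRule` and its methods `when` and `evaluate` are hypothetical names, and the three conditions simply restate the TIMES_AT_MXT example (zero is green, 1 to 8 is yellow, above 8 is red).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical sketch; these are not the Health Checker's real classes.
class TrafficLightRule {
    // LinkedHashMap keeps the conditions in the order they were written in the rule.
    private final Map<String, IntPredicate> conditions = new LinkedHashMap<>();

    public TrafficLightRule when(String color, IntPredicate condition) {
        conditions.put(color, condition);
        return this;
    }

    // Evaluates every condition in the given order.
    // The last condition that holds determines the color; null means none held.
    public String evaluate(int value) {
        String result = null;
        for (Map.Entry<String, IntPredicate> entry : conditions.entrySet()) {
            if (entry.getValue().test(value)) {
                result = entry.getKey();
            }
        }
        return result;
    }

    // The TIMES_AT_MXT example: 0 is green, 1 to 8 is yellow, above 8 is red.
    public static TrafficLightRule timesAtMxt() {
        return new TrafficLightRule()
            .when("green", v -> v == 0)
            .when("yellow", v -> v >= 1 && v <= 8)
            .when("red", v -> v > 8);
    }

    public static void main(String[] args) {
        TrafficLightRule rule = timesAtMxt();
        for (int v : new int[] {0, 5, 12}) {
            System.out.println(v + " -> " + rule.evaluate(v));
        }
    }
}
```

Because the last matching condition wins, a rule author could also write overlapping conditions (for example, green for any value and red for values above 8) and rely on the order to resolve them.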
A formal syntax for conditional expressions is included in the tool’s online documentation. Here’s a short description of the rule syntax:
• Variables are prefixed with a dollar character such as $TIMES_AT_MXT. You can choose variable names, which are not case-sensitive.
• Due to restrictions of the XML file, the characters in Figure 5 must be written as the corresponding character entities.
• The equality expression is coded, as it would be in C or Java, with two equal signs (e.g., a == b).
• Boolean constants are true and false:
<green>
($DEBUG == false);
</green>
• Float numbers are coded with a dot:
<red>
($MAX_USED >= $MAX_TOTAL * 0.9);
</red>
For more on how to use variables in a rules file, see the Health Checker’s online help.
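As an illustration of the XML restriction mentioned above, the following Java sketch prepares a condition for the rules file. It assumes only what XML itself requires, namely that `<` and `&` can't appear literally in element content; whether Figure 5 lists further characters isn't assumed here. The class `RuleConditionEscaper` is a hypothetical helper and not part of the tool.

```java
// Hypothetical helper; not part of the Health Checker.
class RuleConditionEscaper {
    // Escapes the two characters that XML forbids in element content.
    // '&' is replaced first so the ampersand of an inserted "&lt;" isn't escaped again.
    public static String escape(String condition) {
        return condition.replace("&", "&amp;").replace("<", "&lt;");
    }

    public static void main(String[] args) {
        // Prints: ($MAX_USED &lt; $MAX_TOTAL * 0.9);
        System.out.println(escape("($MAX_USED < $MAX_TOTAL * 0.9);"));
    }
}
```

A condition using `>=`, as in the red example above, passes through unchanged, because XML allows a literal `>` in element content.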
Data Pool Mapping
The rule in Figure 4 uses a variable, $TIMES_AT_MXT, which is assumed to contain the actual CICS “times at MXT” value. In the data_pool_mapping section of the rules file, this variable maps to a Java method of a particular Java class in the Health Checker’s internal data pool. The data pool contains all data retrieved from a running VSE system or loaded from a previously saved XML file (see Figure 6).
StatDataCollection is the Java wrapper class that provides access to the CICS TS statistics data for each CICS partition. The data class provides methods to get parameter values out of the Health Checker’s data pool. In Figure 6, there’s a method getTimesAtMxt(). The mapping just references the name of the related Java method without the “get” prefix. A complete list of data classes and the methods to get data is included in the tool as Java Application Program Interface (API) documentation (Javadoc).
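One plausible way to resolve such a mapping is Java reflection. The text doesn't say how the tool does it, so the following is a sketch under that assumption. `StatDataCollectionStub` is a stand-in that models only the `getTimesAtMxt()` method named above, and `DataPoolLookup` is a hypothetical helper.

```java
import java.lang.reflect.Method;

// Stand-in for the real data class; only the getter named in the text is modeled.
class StatDataCollectionStub {
    private final int timesAtMxt;

    public StatDataCollectionStub(int timesAtMxt) {
        this.timesAtMxt = timesAtMxt;
    }

    public int getTimesAtMxt() {
        return timesAtMxt;
    }
}

// Hypothetical helper that turns a mapped name into a getter call.
class DataPoolLookup {
    // Resolves a mapped name such as "TimesAtMxt" to "getTimesAtMxt" and invokes it.
    public static Object resolve(Object dataObject, String mappedName) {
        try {
            Method getter = dataObject.getClass().getMethod("get" + mappedName);
            return getter.invoke(dataObject);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No readable getter for " + mappedName, e);
        }
    }

    public static void main(String[] args) {
        Object value = resolve(new StatDataCollectionStub(3), "TimesAtMxt");
        System.out.println("$TIMES_AT_MXT = " + value);
    }
}
```

Keeping only the method name in the rules file means a new parameter can be made available to rules without changing the rule evaluator, as long as the data class offers a matching getter.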
GUI Mapping
To visualize the analysis result in the Health Checker GUI, all analyzed parameters are mapped to their related GUI components, currently text labels or table cells. The GUI mapping looks like this:
<TIMES_AT_MXT>
[CICS:General;Times at MXT]
</TIMES_AT_MXT>
CICS references the major tree node. General references the sub tree node under which the parameter is shown on the right side of the Health Checker’s main window. Times at MXT is the static text that appears next to the analyzed parameter value (see Figure 7).
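The three parts of a GUI mapping entry can be separated with a few lines of Java. This is a minimal sketch that assumes the format is exactly `[node:subnode;label]` as in the example above; the class `GuiMapping` and its field names are hypothetical.

```java
// Hypothetical value class for one GUI mapping entry.
class GuiMapping {
    public final String treeNode;
    public final String subNode;
    public final String label;

    private GuiMapping(String treeNode, String subNode, String label) {
        this.treeNode = treeNode;
        this.subNode = subNode;
        this.label = label;
    }

    // Parses an entry of the assumed form "[node:subnode;label]".
    public static GuiMapping parse(String text) {
        String s = text.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Expected [node:subnode;label] but got: " + text);
        }
        String body = s.substring(1, s.length() - 1);
        int colon = body.indexOf(':');
        // Look for the semicolon only after the colon, so the label may contain colons.
        int semicolon = body.indexOf(';', colon + 1);
        if (colon < 0 || semicolon < 0) {
            throw new IllegalArgumentException("Expected [node:subnode;label] but got: " + text);
        }
        return new GuiMapping(
            body.substring(0, colon).trim(),
            body.substring(colon + 1, semicolon).trim(),
            body.substring(semicolon + 1).trim());
    }

    public static void main(String[] args) {
        GuiMapping m = parse("[CICS:General;Times at MXT]");
        System.out.println(m.treeNode + " / " + m.subNode + " / " + m.label);
    }
}
```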
The Scope of Rules
The knowledge base has different sections so you can define the scope of a particular rule. Rules can be divided into several categories. Some rules:
• Apply only for CICS (i.e., the analysis must check each CICS partition)
• Reference TCP/IP parameters and must be evaluated for each TCP/IP partition
• Are independent of a partition and reference basic system parameters
• Apply only for one specific VSE system or one specific partition (e.g., the POWER partition).
These categories are reflected by a hierarchical structure of the knowledge base. The scope of a rule is defined by adding the rule to one of the related sections. The rule component of the VSE Health Checker then processes the rule for each applicable host and partition.
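How a scope might select the targets of a rule can be sketched as follows. Everything here is hypothetical: the enum `Scope`, the class `Partition`, the method `targets`, and the host and partition names in `main` are illustrative only. Rules bound to one specific system or partition (the fourth category above) aren't modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of scope handling; not the Health Checker's real structure.
class RuleScopeDemo {
    public enum Scope { SYSTEM, CICS, TCPIP }

    // A partition tagged with the kind of subsystem that runs in it.
    public static final class Partition {
        public final String id;
        public final Scope kind;

        public Partition(String id, Scope kind) {
            this.id = id;
            this.kind = kind;
        }
    }

    // Returns the targets against which a rule of the given scope is evaluated.
    // A SYSTEM rule is evaluated once per host; other rules once per matching partition.
    public static List<String> targets(Scope ruleScope, String host, List<Partition> partitions) {
        List<String> result = new ArrayList<>();
        if (ruleScope == Scope.SYSTEM) {
            result.add(host);
            return result;
        }
        for (Partition p : partitions) {
            if (p.kind == ruleScope) {
                result.add(host + "/" + p.id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Host and partition names are made up for this example.
        List<Partition> partitions = Arrays.asList(
            new Partition("F2", Scope.CICS),
            new Partition("F8", Scope.CICS),
            new Partition("F7", Scope.TCPIP));
        System.out.println(targets(Scope.CICS, "VSEHOST", partitions));
        System.out.println(targets(Scope.SYSTEM, "VSEHOST", partitions));
    }
}
```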
Using the Tool
The VSE Health Checker provides single snapshots of the system’s state. There’s no measurement of the system’s behavior over time such as that implemented by the VSE Navigator. (You can download both the free VSE Health Checker and the VSE Navigator at http://www.ibm.com/servers/eserver/zseries/zvse/downloads/.) The Health Checker is intended to support preventive analysis runs every few weeks and a comparison of the results with previous runs. An automated comparison of different snapshots may be another future enhancement.
Prerequisites
The VSE Health Checker has these prerequisites:
• VSE/ESA V2.6 (PQ88809/UQ88864) or higher
• VSE/ESA V2.7 (PQ88809/UQ88865) or higher
• z/VSE V3.1 GA or higher
• TCP/IP running on the VSE side (TCP/IP for VSE can be analyzed; other stacks can be used to retrieve data, but can’t be analyzed)
• VSE Connector Server running on VSE (job STARTVCS)
• VSE Connector Client installed on the workstation side
• Java 1.4 or higher installed on the workstation side
• The STAT transaction (program DFH0STAT), which must be defined to obtain CICS TS statistics data
• An additional transaction (CHKT), which must be defined to obtain a list of TS queues. (A link job to catalog the related phase is part of the Health Checker package.)
Future Work
In the future, the tool may provide display and analysis of DB2, DL/I, and VSAM-related data. Other requested enhancements and new parameters will also be considered. The most important task is to steadily increase the knowledge base with the help of IBMers and customers.
Acknowledgements
The VSE Health Checker wouldn’t exist in its current development state without the help of numerous people from inside and outside IBM. Many thanks to Andreas Groeschl and Jürgen-Hendrik Kuhn, students at the Berufsakademie Stuttgart; Ingo Franzki, Karsten Graul, Wilhelm Mild, Wolfgang Bosch, and August Madlener from VSE development; Hans Joachim Ebert and Dagmar Kruse from Technical Sales zSeries; and Heinz Peter Maassen from Lattwein GmbH for providing code and many ideas for the tool.
More Information
To learn more, access these resources:
• VSE/ESA e-Business Connectors User’s Guide, SC33-6719 (accessible as a downloadable PDF at http://www.ibm.com/servers/eserver/zseries/zvse/downloads/)
• IBM z/OS and Sysplex Health Checker User’s Guide, Version 3 Release 0, Document Number SA22-7931-03, 2002.