zjournal
 
   





Shared Data Tables and CICS TS

 

A Shared Data Table (SDT) is simply a VSAM Key Sequenced Data Set (KSDS) that has been loaded into virtual storage and is accessed as a table, which reduces physical I/O and improves response time. The feature for using SDTs has been available in CICS TS since CICS/ESA 3.3, but SDT use has been somewhat limited in many installations for several reasons:

 

• The amount of real storage required to support data sets defined as SDTs

• The recommendation that SDTs be used for data sets that reflect a high read-to-write activity ratio or for read only

• Several restrictions, when data tables were initially announced, that required programming changes

• The original design of the data table, which called for using CICS region storage.

 

For these reasons, many installations limited use of SDTs to small- or intermediate-size, read-only data sets. The recommended read-to-write ratio Rule of Thumb (ROT) for data sets that require output operations has been around 90 percent; that is, the data set receives 90 percent of its activity doing read operations. However, with the availability of more real storage on processors today, users should re-evaluate the use of SDTs to obtain better performance and response times.

 

This article offers tuning recommendations for SDTs, including insight on the break-even point for a data set in terms of its read-to-write ratio.

 

Virtual Storage: A Concern?

 

We haven’t mentioned virtual storage as a concern in a CICS TS address space when using SDTs. The reason is that SDTs are placed into data spaces that have their own virtual storage, independent of the CICS TS address space. The CICS TS address space contains only control blocks that identify where the information is kept in the data spaces. Small control blocks also are placed into the z/OS Extended Common Storage Area (ECSA) when the SDT is shared by more than one CICS region. However, virtual storage may be an issue in the data spaces themselves. CICS TS can generate up to 100 data spaces, each up to 2GB, so the total capacity available for data tables is 200GB.

 

The actual number depends on several factors, including the total number of data sets declared as data tables and the installation System Management Facilities (SMF) exit, IEFUSI, which can be used to limit both the total number of data spaces CICS TS can support and the maximum size allowed for each data space. So, check with your operating system staff to see what, if any, limitations have been placed on CICS TS regarding the use of data spaces. Also, the default values for the parameters in IEFUSI may not suit your needs. The allocated data spaces are retained until CICS TS is shut down, even if the file is closed and the space is no longer used.

 

The names CICS TS assigns to the data spaces are DFHDT001 to DFHDT100. CICS TS initially creates three data spaces when the first data table is opened. DFHDT003 is used to store the actual data. Should this data space fill up, CICS TS will allocate the next data space available, if any. In this case, it would be DFHDT004. The first data space, DFHDT001, is used to store table entry descriptors while DFHDT002 is used to store the index nodes.

 

Space in a data space is acquired in 16MB units for the data component and is allocated to individual tables in increments of 128KB. So, each 16MB unit can hold up to 128 increments of 128KB. If additional space is required, another 16MB unit is acquired, if available. If not, then another data space is acquired, if available. If not, then a “no more space available” condition is raised. Within each 128KB increment, table data is stored in page-aligned frames that can accommodate the maximum record length. The process continues until there are no data spaces available or the index node or table entry descriptor data spaces are full.
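The allocation arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 16MB and 128KB figures quoted in the text; the function names are hypothetical, and the sketch assumes a single table and ignores page-frame overhead, so real usage is likely to be somewhat higher.

```python
KB = 1024
MB = 1024 * KB

UNIT_BYTES = 16 * MB        # data-component space is acquired in 16MB units
INCREMENT_BYTES = 128 * KB  # and given to a table in 128KB increments


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def increments_per_unit() -> int:
    """How many 128KB increments fit in one 16MB unit."""
    return UNIT_BYTES // INCREMENT_BYTES


def increments_needed(table_bytes: int) -> int:
    """128KB increments needed for a table of the given size (assumption: no overhead)."""
    return ceil_div(table_bytes, INCREMENT_BYTES)


def units_needed(table_bytes: int) -> int:
    """16MB units needed if this table were the only one using them (assumption)."""
    return ceil_div(increments_needed(table_bytes), increments_per_unit())


if __name__ == "__main__":
    size = 50 * MB  # hypothetical table size
    print(increments_per_unit(), increments_needed(size), units_needed(size))
```

For a hypothetical 50MB table, this gives 400 increments spread over four 16MB units.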

 

Data Page Frames

 

The concept of data page frames can be loosely equated to a VSAM Control Interval (CI). Data with similar keys are stored in the frames. However, in contrast to VSAM, records aren’t necessarily stored in ascending sequence in the frame. The reason is that records are located by indirect means using record descriptors. For this reason, records can’t be moved to consolidate free space because it would affect the table entry descriptors used to access the records and would limit concurrent access to records.

 

So, if you have many insertions and/or the record length is expanded, you could wind up using more virtual storage. The extra virtual storage can be up to almost double the size of the original data table requirements. Free space resulting from record deletions can be reused by new records or record expansions. Virtual storage monitoring may be required when IEFUSI limits the number or size of the data spaces.

 

Space for the index nodes and table entry descriptors is acquired in increments of 32KB. There’s a table entry descriptor for every record in the data table. In addition, there’s a table entry descriptor for each gap in the key sequence where one or more records may have been eliminated from a CICS Maintained Data Table.

 

The size of the table entry descriptor is the record key length plus 9 bytes of control information, rounded up to a doubleword. Table entry descriptors are placed into the DFHDT001 data space. Space for index nodes also is allocated in 32KB increments and is placed in the DFHDT002 data space. The size of the keys varies and depends on several factors such as distribution and format of the key values and the total number of records in the data table. For example, the table in Figure 1, from the IBM Shared Data Tables Guide (CICS TS V3.R2 [SC34-6836]), provides insight on the size of the index node keys.
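As a rough illustration of the table entry descriptor sizing described above (key length plus 9 bytes, rounded up to a doubleword, acquired in 32KB increments), the following Python sketch estimates the storage needed in DFHDT001. The function names are hypothetical, and the sketch assumes one descriptor per record only; descriptors for gaps in the key sequence are not counted.

```python
DOUBLEWORD = 8                # bytes in a doubleword
CONTROL_BYTES = 9             # control information per descriptor, per the text
INCREMENT_BYTES = 32 * 1024   # descriptor space is acquired in 32KB increments


def descriptor_bytes(key_length: int) -> int:
    """Key length plus control bytes, rounded up to a doubleword boundary."""
    raw = key_length + CONTROL_BYTES
    return -(-raw // DOUBLEWORD) * DOUBLEWORD


def descriptor_increments(records: int, key_length: int) -> int:
    """32KB increments needed for one descriptor per record (gap descriptors ignored)."""
    total = records * descriptor_bytes(key_length)
    return -(-total // INCREMENT_BYTES)


if __name__ == "__main__":
    # Hypothetical table: one million records with a 10-byte key.
    print(descriptor_bytes(10), descriptor_increments(1_000_000, 10))
```

With a 10-byte key, each descriptor takes 24 bytes, so a hypothetical one-million-record table needs about 24MB of descriptor space, or 733 increments of 32KB.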

 

When the space in either DFHDT001 or DFHDT002 is full, then there’s no more room for data tables. Due to the limited use of data tables in many installations, this hasn’t been a major issue. However, as users begin to use more data tables for larger data sets, this could become an issue, especially if IEFUSI is used to limit the size and number of data spaces that CICS TS can create. So, monitoring the amount of virtual storage used in DFHDT001 and DFHDT002 may be necessary.

 

SDT Types

 

There are two types of SDTs available for use in CICS:

 

• CICS Maintained Data Table (CMDT)

• User Maintained Data Table (UMDT).

 

Another type of data table, the Coupling Facility Data Table (CFDT), isn’t discussed in this article. SDTs are easy to implement in CICS TS. The definition occurs in the file definition using the CEDA resource definition transaction. Two important parameters are used under the title of “DATATABLE PARAMETERS”:

 

TABLE {NO|CICS|USER|CF}

MAXNUMRECS {NOLIMIT|1-99999999}

 

The first parameter identifies whether the data set is to be used as a CMDT (CICS) or a UMDT (USER) data table. The second parameter can be used to limit the size of the table by indicating how many records will be loaded. If you’re going to specify the number of records in the data table, normal installation disk space allocation standards should apply when determining the table space. You would probably want to reserve space for a static data set (one that receives no insertions) so the data occupies around 90 percent of the space allocated.

 

For a dynamic data set (one that receives insertions), you would probably want to reserve space so the data set occupies between 70 and 80 percent of the allocated space. This leaves room for expansion, but the actual value will depend on the rate of insertions. Leaving space removes the requirement for constant monitoring. Under-allocation can be a problem because records not loaded in the table require access to the disk. The maximum number of records supported is 16,777,215 for an SDT.
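The occupancy guidelines above (around 90 percent for a static data set, 70 to 80 percent for a dynamic one) can be turned into a MAXNUMRECS estimate with a small calculation. The Python sketch below is one possible way to do it; the function name is hypothetical, and the 16,777,215 ceiling is simply the SDT record limit quoted in the text.

```python
SDT_RECORD_LIMIT = 16_777_215  # maximum records for an SDT, as stated in the text


def suggested_maxnumrecs(current_records: int, target_fill_pct: int) -> int:
    """Record count at which current_records would occupy target_fill_pct of the table.

    Integer arithmetic is used so the rounding up is exact.
    """
    if not 1 <= target_fill_pct <= 100:
        raise ValueError("target_fill_pct must be between 1 and 100")
    suggestion = -(-current_records * 100 // target_fill_pct)  # ceiling division
    if suggestion > SDT_RECORD_LIMIT:
        raise ValueError("suggested value exceeds the SDT record limit")
    return suggestion


if __name__ == "__main__":
    # Hypothetical data set holding 900,000 records today.
    print(suggested_maxnumrecs(900_000, 90))  # static data set
    print(suggested_maxnumrecs(900_000, 75))  # dynamic data set
```

For a hypothetical 900,000-record data set, this suggests 1,000,000 records for a static table and 1,200,000 for a dynamic table filled to 75 percent.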

 

You may be a little confused regarding the growth and space allocation for a data table because we mentioned that data tables are mainly used for read-only or low output activity (10 percent). Recall that a data set may not receive any insertions in CICS TS, but there may be a batch process that may increase the number of records in the table.

 

The no-limit default eliminates the need to specify the number of records and possible continuous monitoring of the data table to ensure there’s sufficient space. The only potential danger is if a data set assigned to a data table significantly grows, it can possibly affect the amount of real storage required and introduce potential paging. This type of situation usually comes as a “surprise,” causing system problems and requiring potential explanations to upper management as to why it happened and wasn’t being monitored.

 

SDT Support

 

Support for SDTs requires the file be defined to Local Shared Resources (LSR). This support is required for operations that require physical access to the data set, such as an update or insertion. Naturally, you would want this type of operation to be kept at a minimum (e.g., less than 10 percent). The number of strings also can be an issue. If the data set is generally read-only or has little output activity, then the default number of strings is taken. This can be a problem when two simultaneous output operations are required. So, you may also want to add a few strings to avoid string waits affecting response times. The rest of the CEDA parameters depend on the data set requirements.

 

There are some important differences between the CMDT and UMDT. A CMDT is a data table that’s coordinated with the source VSAM data set. Any updates are automatically reflected in the source VSAM data set. A CMDT is easy to implement and requires no application program changes as the full Application Program Interface (API) support is available. When the data set defined as a data table is opened, the data set is read and the data table is created. The data set is maintained, opened, and associated with the data table. Update operations are synchronized and are automatically made to both the data table and source VSAM data set.

 

A CMDT supports both fixed and variable length records. The user can suppress the loading of certain records via an exit (XDTRD). However, records loaded into the table must be of the same layout as those on the VSAM data set. That is, the user can’t extract only the fields (subset) desired when loading the file. This lets CICS maintain the CMDT and source VSAM data set synchronized during update operations. Note that as records can be suppressed from the CMDT through the exit when loading the data table, some physical I/O activity can occur when reference is made to a record that’s not in the data table but is present in the source VSAM KSDS. We won’t be considering this type of I/O here, but the tuning recommendations similarly apply.

 

A UMDT may require changes to the application program; it doesn’t support the full API. Unlike a CMDT, a UMDT isn’t associated with the source data set once the data table is loaded. The data set is closed. Thus, any updates to the UMDT won’t be automatically made to the source VSAM data set. The user would have to handle this condition through application programming.

 

A UMDT supports only variable length records. However, the user can load records from other sources (e.g., IMS or DB2) and extract certain information when building the table through the use of an exit (XDTRD). So, in the case of a UMDT, the exit (XDTRD) can be used to suppress records from being loaded and/or to modify the records to be loaded. This can be done because there’s no automatic synchronization with the source VSAM data set. Another exit, XDTAD, is available for a UMDT and can be used to add records to the table after the initial load. However, this exit isn’t allowed to modify the records. A UMDT is good for information that doesn’t have to be altered or saved when the file is closed.

 

SDT Use and Benefits

 

A major advantage of using SDTs is the performance the user can obtain because there’s a reduction in or no physical I/O associated with the request. Even if a data set has all the index and data CIs loaded in an LSR pool, the SDT will provide an improvement over LSR even though no physical I/O occurs. This is because there’s a reduction in CPU required to access the record. Some benchmarks (see the CICS TS Performance Guide) have seen up to a 70 percent reduction in overall CPU utilization over LSR. However, the actual improvement will vary and will depend on the application usage of the data set.

 

What data sets are best suited for data tables? First, consider using CMDT data tables mainly because they are easy to implement and don’t require any programming changes. The best performer is a read-only data set that requires no physical I/O. So, consider this type of data set first, then consider predominantly read request data sets accessed from other regions.

 

Many installations have File-Owning Regions (FOR) and duplicate read-only data sets in other CICS TS regions to avoid the function shipping cost for a read-only operation. By using an SDT, these duplicated read-only data sets can be eliminated. Data sets that have a low output activity (10 percent or less) also can be considered candidates. However, insertions may be more costly in terms of physical I/O and should be analyzed.

 

From where does the magic ROT of 90 percent read operation come? Let’s analyze this and see whether the 90 percent is a valid objective and what can be done to improve the response times associated with data tables that have output activity. When a request such as a read for update is issued for a record in the data table, CICS TS issues the request through VSAM. The data set has to be defined in LSR. This operation occurs as if the data set wasn’t a data table. A search through the buffers is performed for the indices and data. If not found, a physical I/O results for each CI not found in the buffers. So, a worst-case scenario would be that none of the desired CIs were found in the LSR buffers, requiring a physical read to the data set for the index and data CIs. Figure 2 indicates the total number of I/O operations that would be required, depending on the number of index levels. So, in a worst-case scenario, we find the number of physical I/Os would be from two to four, depending on the number of index levels associated with the data set.

 

Let’s analyze the I/O operations for a data set that receives 100,000 requests during the day. Suppose 90 percent of the I/O requests are for read requests, or 90,000. In this case, there are no physical I/O operations and you have a look-aside hit ratio for this portion of the requests of 100 percent. This leaves 10 percent of the operations, or 10,000, that are update requests. Figure 3 shows the requirements for the 10,000 update operations.

 

The total number of I/O operations doesn’t take into account the rewrite or delete of the record to disk because we’re only interested in the effect the I/O has on the front side of the operation. As you can see, the physical I/O operations vary by number of index levels associated with the file. This is a worst-case scenario where we’re assuming a 0 percent look-aside hit ratio. However, this scenario does help reveal the effectiveness of the data table with 90 percent read operations.
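The worst-case count described above is simple enough to compute directly. The Python sketch below assumes, as the text does, one physical I/O per index level plus one for the data CI, and a 0 percent LSR look-aside hit ratio; the function name is hypothetical.

```python
def worst_case_ios(update_requests: int, index_levels: int) -> int:
    """Physical reads when no CI is found in the LSR buffers.

    Each request reads one CI per index level plus one data CI.
    The rewrite or delete of the record is deliberately not counted.
    """
    return update_requests * (index_levels + 1)


if __name__ == "__main__":
    updates = 10_000  # 10 percent of the 100,000 daily requests
    for levels in (1, 2, 3):
        print(levels, worst_case_ios(updates, levels))
```

For the 10,000 update requests in the example, one, two, or three index levels give 20,000, 30,000, or 40,000 physical I/O operations.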

 

Performance Analysis

 

So, we now can analyze different hit ratios and see the effect that output I/O has on the data table’s performance. We’ll analyze three hit ratio scenarios: 0 percent, 50 percent, and 90 percent. These hit ratios are for the LSR pool. Remember that the I/O operations occur using an LSR buffering technique. We’ll be using the following formula to compute the look-aside hit ratio for the data table:

 

Look-aside % = (hits requiring no I/O / (hits requiring no I/O + I/O operations)) * 100

 

“Hits requiring no I/O” represents the reads where the data record was found directly in the data table or LSR buffer, so no physical I/O occurs. In this case, we’re referring to the 90,000 reads that represent 90 percent read requests. “I/O operations” represents the physical I/O operations required for the output requests, or 10,000, which represents 10 percent of the total requests to the data table. However, as the hit ratio in the LSR pool improves, the number of hits without I/O rises. We must analyze two different look-aside hit ratios: the ratio to the data table (in this case, 90,000) and the ratio in the LSR pool where the output operations for the data table (in this case, 10,000) are handled.

 

As the look-aside hit ratio in the LSR pool improves from 0 to 90 percent, the number of look-aside hits increases for the data table. Some of these look-aside hits take more CPU time than if the record had been found in the data table because they have to be made in the LSR pool. At a 0 percent look-aside hit ratio in the LSR pool, we would have to do physical I/O for every record desired. So, we would require an additional 20,000, 30,000, or 40,000 operations, depending on the number of index levels. However, as you improve the look-aside hit ratio to, say, 50 percent, then one-half (50 percent) of the 10,000 records requested would be located in the LSR pool, increasing the number of look-aside hits for the data table to 95,000 from 90,000. The remaining 5,000 records would require physical I/O operations. The number of operations in this case would be 10,000, 15,000, or 20,000, depending on the number of index levels for the file. An LSR look-aside hit ratio of 90 percent would provide even better results.
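The scenarios above can be worked through with the look-aside formula. The following Python sketch is one reading of the text rather than a reproduction of the article's figures: it assumes every update that misses the LSR pool costs the full two, three, or four physical I/Os, and the function name is hypothetical.

```python
def data_table_lookaside_pct(reads: int, updates: int,
                             lsr_hit_pct: int, ios_per_miss: int) -> float:
    """Look-aside % = hits without I/O / (hits without I/O + physical I/Os) * 100.

    reads        : requests satisfied from the data table (no I/O)
    updates      : requests that go through the LSR pool
    lsr_hit_pct  : percentage of updates found in the LSR buffers
    ios_per_miss : physical I/Os per update that misses (index levels + 1)
    """
    lsr_hits = updates * lsr_hit_pct // 100
    hits_no_io = reads + lsr_hits
    physical_ios = (updates - lsr_hits) * ios_per_miss
    return 100.0 * hits_no_io / (hits_no_io + physical_ios)


if __name__ == "__main__":
    for lsr_hit_pct in (0, 50, 90):
        row = [round(data_table_lookaside_pct(90_000, 10_000, lsr_hit_pct, k), 2)
               for k in (2, 3, 4)]
        print(lsr_hit_pct, row)
```

Under these assumptions, a 0 percent LSR hit ratio leaves the data table between roughly 69 and 82 percent, while a 90 percent LSR hit ratio lifts it above 96 percent for all three index depths.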

 

So, using the previous percentage LSR hit ratios, we can compute the look-aside hit ratio for the data table shown in Figure 4. We can see from these tables that the ROT is valid only for certain data set configurations and much depends on the hit ratio we can obtain in the LSR pool. Generally, you would want to get a combined index and data hit ratio for an LSR data set to be around 90 percent or higher. This should be the objective for a data table. We consider a 50 percent or lower look-aside hit ratio in an LSR pool to be poor. So, to get a relatively good look-aside hit ratio for a data table that has output operations would probably require a fair LSR hit ratio in the 70 to 75 percent range to occur, especially for a multilevel index data set. The higher the LSR hit ratio, the more output requests for a data table can be handled.
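The break-even point can also be estimated by solving the look-aside formula for the LSR hit ratio. The Python sketch below does this under the same assumptions as the worked scenario (90,000 reads, 10,000 updates, and two to four physical I/Os per LSR miss); the function name and the 90 percent target are illustrative choices, not values taken from the article's figures.

```python
def min_lsr_hit_ratio(reads: int, updates: int,
                      ios_per_miss: int, target: float) -> float:
    """Smallest LSR hit ratio (0.0 to 1.0) giving the target data table look-aside ratio.

    Derived from: (reads + updates*h) /
                  (reads + updates*h + updates*(1 - h)*ios_per_miss) >= target
    """
    if updates == 0:
        return 0.0  # no output activity, so the LSR pool is never consulted
    numerator = target * (reads + updates * ios_per_miss) - reads
    denominator = updates * (1 - target + target * ios_per_miss)
    return min(1.0, max(0.0, numerator / denominator))


if __name__ == "__main__":
    for ios in (2, 3, 4):
        needed = min_lsr_hit_ratio(90_000, 10_000, ios, 0.90)
        print(ios, round(needed * 100, 1))
```

With four I/Os per miss, this works out to an LSR hit ratio of about 73 percent, which is in line with the 70 to 75 percent range suggested above for a multilevel index data set.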

 

Buffers and Data Table Tuning

 

You may wonder: if you can’t control the LSR pool look-aside hit ratio for data sets containing output operations, why not assign the data set to an individual LSR pool and dedicate enough buffers to accommodate all the data and index CIs? Once the entire data set is in the LSR buffers, there wouldn’t be any I/O operations except for the update request. However, there’s a huge cost in loading these buffers, as every single CI (data and index) has to be read directly to get it into the pool. The data table, by contrast, is loaded using a sequential browse of the data set, reducing the I/O activity needed to load the entire data set. You could write a program that runs during the Program List Table Post Initialization (PLTPI) phase to do the same (read the entire data set sequentially to populate the LSR buffers), but that requires additional programming.

 

Data table tuning information is available from the IBM-supplied STAT transaction (DFH$STAT) or from your performance monitor. The total amount of storage allocated to the data spaces (in particular, DFHDT001 and DFHDT002) is an important area to analyze. Obtaining the look-aside hit ratio is a little more complex and requires analyzing LSR statistics. However, remember that the LSR look-aside figures apply to all data sets that are using that particular buffer size. When you look at a look-aside hit ratio (say, for the 4KB buffer) and see 90 percent, this is the combined hit ratio for all the data sets that use this buffer size. This may not be what the data table is actually getting. Some data sets may be getting a higher percentage while others get a lower percentage.

 

If there’s a lot of contention for a particular buffer size used by a data table, then there’s a good probability that the data table request will result in a physical I/O. This is because the objective of the data table is to do minimum I/O, so it doesn’t reference the LSR buffers often enough to keep its CIs in the pool under the Least Recently Used (LRU) algorithm that VSAM LSR uses.

 

Tuning the LSR pool for a data table data set requires some planning. Because the data table accesses the LSR pool only occasionally (e.g., 10 percent of the time, if you follow the ROT), the odds are high that the buffers used by the data tables in the LSR pool have been replaced by higher activity data sets. Although some performance recommendations mention that you may want to reduce the number of buffers in an LSR pool when you convert a data set to a data table to reduce the real storage requirements, this may not be a good decision based on the information previously mentioned. On the contrary, you may want to move these buffers to a separate LSR pool that can be dedicated only to data sets that are accessed as data tables. This way, the only competition for buffers comes from other data tables. Since the quantity of output requests is lower, there’s a better chance that the data table’s buffers, especially the index, are still in the pool and haven’t been replaced.

 

Tuning your LSR pools is a simple matter with the right tools. For additional insight, you can read a previous article on LSR tuning published in z/Journal that’s available at http://zjournal.com/index.cfm?section=article&aid=1025.

 

 

Acknowledgement

 

The author would like to thank Randy Horowitz for his assistance in reviewing this article.


 
   
 
ARTICLE INFO
ISSUE:
DEPTS: CICS Spotlight




ABOUT THE AUTHOR

Eugene S. Hudders
email: es_hudders@actpr.com...

 





 

©2010 Thomas Communications, Inc.