Mainframe systems have always used sophisticated I/O data buffering to mask disk mechanical activity, and with up to 1.5TB of memory, today’s System z servers can buffer massive amounts of data. Similarly, System z storage has traditionally used a large data cache to retain frequently referenced data in memory and so avoid disk I/O service times. So, given current memory and subsystem cache data buffering, how can SSDs help?
These days, one disk I/O can take anywhere from 10 to 20 milliseconds to service, and reading data from subsystem cache can take from 1 to 2 ms. But a storage cache can hold only 256GB of data.
In contrast, SSDs available from EMC, Hitachi Data Systems, and IBM can hold anywhere from 73GB to 400GB of data per drive. Also, SSD storage needs no mechanical movement to access data and as such, easily reads data as fast as subsystem cache. Thus, one subsystem supporting eight 400GB SSDs (3.2TB in all) could sustain just as much low-latency I/O as more than 12 caching disk-only subsystems at considerably less capital expense and power. With data on SSDs, it’s like having more data in cache, all the time.
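The comparison with 12 disk-only subsystems is simple capacity arithmetic. A minimal sketch, assuming the configuration in question is eight 400GB SSDs measured against a 256GB cache per subsystem (the function and constant names are illustrative only):

```python
# Figures assumed from the text: eight 400GB SSDs, 256GB of cache per subsystem.
SSD_COUNT = 8
SSD_CAPACITY_GB = 400
CACHE_PER_SUBSYSTEM_GB = 256


def cache_equivalents(ssd_count, ssd_capacity_gb, cache_gb):
    """Return how many subsystem caches the total SSD capacity equals."""
    return (ssd_count * ssd_capacity_gb) / cache_gb


ratio = cache_equivalents(SSD_COUNT, SSD_CAPACITY_GB, CACHE_PER_SUBSYSTEM_GB)
total_gb = SSD_COUNT * SSD_CAPACITY_GB
print(f"{total_gb}GB of SSD equals {ratio:.1f} caches of {CACHE_PER_SUBSYSTEM_GB}GB")
```

Under those assumptions the ratio works out to 12.5, which is where "more than 12" comes from.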
Furthermore, given current size limitations, a subsystem cache normally holds only the most frequently referenced data. Caching such data works well for most applications, but occasionally applications also need to access infrequently referenced data, resulting in long disk service times. Because it is referenced so rarely, this data is flushed out of cache soon after being read and as such, often shows symptoms of long “disconnect times” or high “read-misses” when accessed again. It is this data that will experience maximum performance benefits when placed on SSDs.
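To see why poorly cached data gains the most, it helps to weight the cache and disk service times by hit ratio. The sketch below uses the midpoints of the ranges quoted earlier (1.5 ms for cache, 15 ms for disk) and assumes an SSD read costs about the same as a cache hit. The hit ratios are hypothetical, chosen only to illustrate the trend.

```python
def average_read_ms(hit_ratio, hit_ms, miss_ms):
    """Average read service time for a given cache hit ratio (0.0 to 1.0)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0.0 and 1.0")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


CACHE_MS = 1.5   # midpoint of the 1 to 2 ms cache read time
DISK_MS = 15.0   # midpoint of the 10 to 20 ms disk service time
SSD_MS = 1.5     # assumption: an SSD read is roughly as fast as a cache hit

# Hypothetical hit ratios: well-cached, mixed, and rarely referenced data.
for hit_ratio in (0.95, 0.50, 0.10):
    on_disk = average_read_ms(hit_ratio, CACHE_MS, DISK_MS)
    on_ssd = average_read_ms(hit_ratio, CACHE_MS, SSD_MS)
    print(f"hit ratio {hit_ratio:.0%}: disk {on_disk:.2f} ms, SSD {on_ssd:.2f} ms")
```

The lower the hit ratio, the wider the gap between the disk and SSD figures, which matches the point that rarely referenced data benefits most.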
Thus, with SSDs the problem now becomes discovering which infrequently accessed data constrains application or transaction performance. EMC, IBM, and HDS all supply tools (e.g., “FLASHDA”) that analyze SMF data to identify potential candidate data sets. However, only IT personnel can know whether candidate data sets, when stored on SSDs, can improve system performance. Likely suspects may include page and swap data sets. Such data is often infrequently referenced, but when accessed, can significantly delay application or transaction execution. Also, any data sets bound to cache would be good candidates for SSD storage.
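As a rough illustration of the screening idea, the sketch below ranks data sets by the time lost to read-misses. It is not how FLASHDA or any vendor tool works, and it does not parse real SMF records. The `DataSetStats` fields, the data set names, the sample numbers, and the 50 percent miss-ratio threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSetStats:
    """Hypothetical per-data-set summary; not an actual SMF record layout."""
    name: str
    reads: int
    read_misses: int
    avg_disconnect_ms: float


def rank_ssd_candidates(stats, min_miss_ratio=0.5):
    """Keep data sets that mostly miss cache, ordered by total miss time."""
    candidates = [
        s for s in stats
        if s.reads > 0 and s.read_misses / s.reads >= min_miss_ratio
    ]
    return sorted(
        candidates,
        key=lambda s: s.read_misses * s.avg_disconnect_ms,
        reverse=True,
    )


# Made-up sample figures, for illustration only.
sample = [
    DataSetStats("PROD.PAGE.LOCAL1", 2_000, 1_800, 12.0),
    DataSetStats("PROD.DB.HOTINDEX", 900_000, 9_000, 8.0),
    DataSetStats("PROD.HIST.ARCHIVE", 5_000, 4_500, 14.0),
]

ranked = rank_ssd_candidates(sample)
for s in ranked:
    print(s.name, s.read_misses * s.avg_disconnect_ms)
```

In this sample the heavily read index drops out because cache already serves it well. The list that remains is only a starting point for the judgment call described above.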
Other SSD considerations include:
• System z storage subsystems normally support multiple backend data paths. As such, spreading SSDs across as many backend paths as possible will help balance I/O performance.
• Most storage subsystems limit the number of SSDs they support. Contact storage vendors to understand SSD limitations for your subsystems.
• Storage subsystems write data first to cache and then destage it out to disk. As such, heavy write workloads may not benefit from SSDs. However, some write activity can become destage-limited, and when this occurs regularly, SSDs can help.
• Storage subsystems often spread sequentially accessed data across many disk spindles to speed up performance. Such data may not benefit from SSD storage, according to Clod Barrera, IBM’s chief technology strategist.
• Some subsystem vendors recommend installing IBM’s High Performance FICON for System z (zHPF) to take advantage of SSD performance. However, according to John Grehl, director of Mainframe Business at EMC, zHPF only provides marginal improvement over normal SSD performance and as such, may not justify the added expense.
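The first consideration above, spreading SSDs across backend paths, amounts to a round-robin assignment. A minimal sketch, with hypothetical drive and path names (real subsystems expose placement through vendor-specific configuration tools, not through code like this):

```python
from collections import defaultdict
from itertools import cycle


def spread_across_paths(ssds, paths):
    """Assign each SSD to a backend path in round-robin order."""
    if not paths:
        raise ValueError("at least one backend path is required")
    placement = defaultdict(list)
    for ssd, path in zip(ssds, cycle(paths)):
        placement[path].append(ssd)
    return dict(placement)


# Hypothetical example: eight SSDs over four backend paths.
layout = spread_across_paths(
    [f"SSD{i}" for i in range(8)],
    ["PATH_A", "PATH_B", "PATH_C", "PATH_D"],
)
for path, drives in layout.items():
    print(path, drives)
```

Each path ends up with two drives, so no single backend path carries a disproportionate share of the SSD I/O.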
SSD Déjà Vu
SSDs aren’t new to mainframe environments. In the late ’80s, multiple vendors introduced DRAM SSDs for mainframes. Such DRAM SSDs sustained blistering performance but were very expensive, often 1,000 or more times the price of equivalent disk capacity. DRAM’s high price and IBM support of shared external memory ultimately combined to drive these SSDs out of the mainframe market.
In contrast, today’s NAND (aka FLASH) SSDs have similar, DRAM-like performance but at only 15 to 40 times the cost of disk. Unlike DRAM SSDs, NAND SSDs retain data without requiring any battery backup. Also, NAND SSDs now match hard drive form factors, capacities and interfaces, allowing today’s SSDs to replace hard drives on a one-for-one basis in a subsystem rather than requiring a separate subsystem as before. As such, NAND SSDs inherit all the fault tolerance available in today’s mainframe storage. Given all this, current SSDs have become much more affordable than past DRAM versions and as such, much more popular.
Summary
Today’s NAND SSDs can dramatically improve System z application and transaction performance at reasonable expense. However, determining the optimal data sets to place on SSDs will take some effort. Fortunately, when done properly, the right data on SSDs can easily shrink a batch window from 22 to 17 hours, reduce transaction response time by 40 percent, and/or increase transaction postings by more than 25 percent, which can easily justify the purchase of SSD storage for your System z environment. As a result, this time SSDs are here for good.