VLINK="#008040" ALINK="#000000">

Compressed CKD Dasd Emulation

Introduction
Shadow Files
File Structure
Methodology
Quick Start
Using Shadow Files
Options
Utilities
FAQ
Changes
Bugs
cckddump os/390 hlasm program

Introduction

Using compressed CKD files, or cckd files, you can significantly reduce the file space required for emulated CKD dasd and possibly gain a performance benefit because less physical i/o occurs. Using the shadow function, you can minimize the amount of data loss in the event of file corruption.

A cckd file contains track images, which may be compressed or uncompressed, and overhead blocks which are headers, lookup tables, and free space. Compressed track images may be compressed by zlib or bzip2.

Track images are addressed by track number using a two table lookup method: track number divided by 256 (trk >> 8) indexes into the primary lookup table, which contains the file offset to the secondary lookup table; the remainder of track number divided by 256 (trk & 0xff) indexes into the corresponding secondary lookup table, which contains the offset and length of the track image.

There is a single primary lookup table and a variable number of secondary lookup tables. The maximum number of secondary lookup tables is the number of tracks for the device type divided by 256, rounded up. For example, a 3390-3 contains 50085 tracks and would require at most 196 secondary lookup tables.
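
As a rough illustration of the two-table lookup, the index arithmetic can be sketched in C as follows. The function names are hypothetical and are not taken from cckddasd.c; only the shift, mask, and round-up rules come from the description above.

```c
#include <stdint.h>

/* Index into the primary lookup table (L1TAB): track number divided by 256. */
static uint32_t l1_index(uint32_t trk)
{
    return trk >> 8;
}

/* Index into the selected secondary lookup table (L2TAB): remainder of trk / 256. */
static uint32_t l2_index(uint32_t trk)
{
    return trk & 0xff;
}

/* Maximum number of secondary lookup tables: tracks / 256, rounded up. */
static uint32_t max_l2_tables(uint32_t tracks)
{
    return (tracks + 255) / 256;
}
```

For the last track of a 3390-3 (track 50084), l1_index gives 195 and l2_index gives 164, and max_l2_tables(50085) gives 196.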

A regular CKD file contains a 512 byte header followed by track images, each taking the same amount of space: the maximum track size. The offset of a track image can be readily calculated by the track number:

offset = 512 + trk * maxtrksz

A cckd file can take significantly less file space than a regular CKD file because

A track image only occupies its length in the file and not the maximum track size
A track image may be compressed, reducing its length
Unused or null tracks do not occupy any space at all

Performance improvements may also occur because less data is read and written to the hard drive.
However, the lookup tables must be accurately maintained, track images must be compressed and uncompressed, and free space must be kept track of, and dealt with by a garbage collector. This results in a more complicated file structure, more CPU activity, and the possibility of file corruption due to program failure or bug. The introduction of shadow files, however, reduces the impact of possible file corruption.

Shadow Files

Malcolm Beattie originally introduced the concept of shadow files in a post to the newsgroup on 8 December 2000. The function is actually implemented as a kind of snapshot, where a new shadow file can be created on demand. A CKD emulated dasd is represented by a base file and 0 or more shadow files. The base file can be either a regular CKD file (with some restrictions) or a cckd file. All files are opened read-only except for the current file, which is opened read-write.

Shadow files are implemented using the same file structure as base cckd files. By default, there can be up to 8 shadow files in use at any time for an emulated CKD device. The base file is designated file [0] and the shadow files are files [1] up to file [8]. The highest numbered file in use at a given time is the current file, where all writes will occur. Track reads start at the current file and proceed down until a file is found that actually contains the track image.

A shadow file, then, contains all the changes made to the emulated CKD dasd since its creation, until the creation of the next shadow file. The moment of the shadow file's creation can be thought of as taking a snapshot of the current emulated CKD dasd at that time, because if the shadow file is later removed, then the emulated CKD dasd will revert to the state it was at when the snapshot was taken.

Using shadow files, you can keep the base CKD file on a read-only device such as cdrom, or change the base CKD file attributes to read-only, ensuring that this file can never be corrupted.

Hercules console commands are provided to add a new shadow file, remove the current shadow file (with or without backward merge), and display the shadow file status and statistics.

CCKD File Structure

Like a regular CKD emulation file, the first 512 bytes of a compressed or shadow file contain a CKDDASD_DEVHDR block. The eye-catcher at the beginning is different to distinguish the file:

CKD_P370 Regular CKD file
CKD_C370 Compressed CKD file
CKD_S370 Shadow CKD file

The next 512 bytes contain a compressed device header or CCKDDASD_DEVHDR block. This contains the version-release-mod level of the file, options, space statistics, and total number of cylinders for the device. Next is the primary lookup table or the L1TAB. Each 4 byte entry in the L1TAB contains the file offset to a secondary lookup table (or L2TAB) or 0x00000000 (indicating that the secondary lookup table is null), or 0xffffffff (indicating that the previous file should be searched instead).
The size of the L1TAB is dependent on the number of tracks on the emulated device.

CKDDASD_DEVHDR

CCKDDASD_DEVHDR

L1TAB

. . .

Following the L1TAB, in no particular order, are L2TABs, compressed track images, and free spaces.

Each L2TAB contains 256 8-byte entries and is, consequently, 2048 bytes in length. Each entry contains the offset and length of a track image. If the offset is 0x00000000 then the track image is null; if the offset is 0xffffffff then the previous file should be searched instead.

L2TAB entry

offset	4 bytes
length	2 bytes
[unused]	2 bytes
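
To make the special offset values concrete, here is a minimal sketch of how a track might be located across a base file and its shadow files. The struct and function names are hypothetical, and the sketch simplifies matters by assuming the L2TAB entry for the track has already been fetched from each file; the real code must also handle 0x00000000 and 0xffffffff in the L1TAB.

```c
#include <stdint.h>

#define L2_OFFSET_NULL     0x00000000u  /* track image is null */
#define L2_OFFSET_PREVIOUS 0xffffffffu  /* search the previous file instead */

/* Hypothetical in-memory mirror of the 8-byte L2TAB entry. */
typedef struct {
    uint32_t offset;   /* file offset of the track image */
    uint16_t length;   /* length of the track image */
    uint16_t unused;
} l2_entry;

/*
 * entries[f] is the L2TAB entry for one track as found in file [f].
 * The search starts at the current file and proceeds down to the base file.
 * Returns the index of the file that supplies the track, or -1 if every
 * file defers to its predecessor. *is_null is set to 1 when the supplying
 * entry describes a null track.
 */
static int locate_track(const l2_entry *entries, int current_file, int *is_null)
{
    for (int f = current_file; f >= 0; f--) {
        if (entries[f].offset == L2_OFFSET_PREVIOUS)
            continue;                       /* not in this file, keep going down */
        *is_null = (entries[f].offset == L2_OFFSET_NULL);
        return f;
    }
    *is_null = 0;
    return -1;
}
```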

A compressed track image contains the following two fields:

A track header, also called a home address or a track index. The track header is never compressed.
The track image, beginning with the R0 count and ending with the end-of-track marker, which is a count field containing all hex 0xff's. The track image may or may not be compressed.

HA	5 bytes
track image (compressed or uncompressed)	length-5 bytes

The HA contains 0CCHH, that is, a byte of zeroes, 2 bytes indicating the cylinder of the track, and 2 bytes indicating the head of the track on the cylinder. Both CC and HH are stored in big-endian byte order. The track is computed by


trk = (((CC[0] << 8) + CC[1]) * trks_per_cyl) + (HH[0] << 8) + HH[1]

Since the first byte of the HA is always 0x00 (at least in emulated CKD files), this byte as stored in the file actually indicates the compression algorithm used for the remainder of the track image (0 = no compression, 1 = zlib compression, 2 = bzip2 compression).
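
The HA decoding just described can be sketched as below. The function and constant names are made up for illustration; the byte layout, the track formula, and the compression codes are the ones given above.

```c
#include <stdint.h>

/* Compression codes stored in the first HA byte. */
enum { HA_COMP_NONE = 0, HA_COMP_ZLIB = 1, HA_COMP_BZIP2 = 2 };

/* Compute the track number from the big-endian CC and HH fields (bytes 1-4). */
static uint32_t ha_track(const uint8_t ha[5], uint32_t trks_per_cyl)
{
    uint32_t cyl  = ((uint32_t)ha[1] << 8) + ha[2];
    uint32_t head = ((uint32_t)ha[3] << 8) + ha[4];
    return cyl * trks_per_cyl + head;
}

/* Return the compression code from byte 0, or -1 if it is not a known code. */
static int ha_compression(const uint8_t ha[5])
{
    return ha[0] <= HA_COMP_BZIP2 ? ha[0] : -1;
}
```

For example, an HA of 01 0002 0003 on a device with 15 tracks per cylinder describes track 33, compressed with zlib.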

Free space contains a 4-byte offset to the next free space, a 4-byte length of the free space, and zero or more bytes of residual data.

Free Space entry

offset	4 bytes
length	4 bytes
residual	(length - 8) bytes

The minimum length of a free space is 8 bytes. Since free space is ordered by file offset and no two free spaces are adjacent, offset in the free space entry is always greater than the current free space offset + the current free space length, unless the offset is zero, which indicates the free space chain is terminated.
The free space chain is read when the file is opened for read-write and written when the file is closed; while the file is opened, the free space chain is maintained in storage.
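
The ordering rules for the free space chain lend themselves to a simple consistency check. The following is only a sketch under the assumption that the chain has already been read into an array; the free_space struct (including its pos field, which is not stored in the file entry) and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-storage copy of one free space. */
typedef struct {
    uint32_t pos;   /* file offset of this free space (known from the chain) */
    uint32_t next;  /* offset of the next free space; 0 terminates the chain */
    uint32_t len;   /* length of this free space, at least 8 */
} free_space;

/* Return 1 if the chain obeys the rules described above, else 0. */
static int free_chain_ok(const free_space *fs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (fs[i].len < 8)
            return 0;                       /* too small to hold offset + length */
        if (i + 1 == n) {
            if (fs[i].next != 0)
                return 0;                   /* last entry must terminate the chain */
        } else {
            if (fs[i].next != fs[i + 1].pos)
                return 0;                   /* link does not match the next entry */
            /* Ordered by offset and never adjacent: strictly greater. */
            if ((uint64_t)fs[i].next <= (uint64_t)fs[i].pos + fs[i].len)
                return 0;
        }
    }
    return 1;
}
```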

Methodology

This section is tedious; you probably want to skip to the next section unless you are genuinely curious as to how cckd actually works. This section is for my edification as much as anything...

Initialization

When a CKD dasd emulation file is initialized, function ckddasd_init_handler in ckddasd.c is called. After ckddasd_init_handler has completed its initialization, if the file is a cckd file, or if shadowing was specified, function cckddasd_init_handler in cckddasd.c is called.

cckddasd_init_handler obtains a cckd extension and stores its address in field cckd_ext in the DEVBLK (the control block that represents the device).

The compressed device header (CCKDDASD_DEVHDR) and L1TAB for the file are read; however, if the file is a regular file, a dummy CCKDDASD_DEVHDR is built (all zeroes) and a dummy L1TAB is built (all 0xff's).

If shadow files exist, they are opened and their CCKDDASD_DEVHDRs and L1TABs are read. If the last file opened could only be opened read-only, then a new shadow file is created.

The basic point is that the CCKDDASD_DEVHDR and the L1TAB for the base file and each shadow file are read and stored in an array in the cckd extension, and each file is opened read-only, except the current file, which is opened read-write.

File I/O

In the course of executing a channel program, routines in ckddasd.c normally call the lseek, read, and write C library routines. These routines (as we all know;-) perform the following functions:

lseek Sets the current file offset to the specified value
read Reads the specified number of bytes from the current file offset to a buffer and increments the current file offset accordingly
write Writes the specified number of bytes from a buffer to the current file offset and increments the current file offset accordingly

If the cckd extension is present, however, routines cckd_lseek, cckd_read and cckd_write in cckddasd.c are called instead.

Routines cckd_read and cckd_write are not very interesting; they merely copy data to/from the caller's buffer from/to the active uncompressed track image buffer. cckd_write, it should be noted, sets a bit indicating that the active track image has been updated. All the interesting work results from cckd_lseek being called...

Track Switching

From the offset passed to cckd_lseek by ckddasd.c, using the formula described above, the requested track can be calculated using:

trk = (offset - 512) / maxtrksz

If the calculated track number is the same as the current track number, then the new offset is noted and the function returns. Otherwise, an event known as a track switch occurs, and high-level function cckd_read_trk is called to make the new track the active track and the new track number the current track number.
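
The calculation performed on each seek can be sketched like this. The names are hypothetical, the offset within the track is my own addition (it is simply the remainder of the same division), and the sketch assumes the offset is at least 512.

```c
#include <stdint.h>

#define CKD_HDR_SIZE 512u   /* size of the header in a regular CKD file */

typedef struct {
    uint32_t trk;      /* track number */
    uint32_t trkoff;   /* offset within the track image */
} track_pos;

/* Invert offset = 512 + trk * maxtrksz; assumes offset >= 512. */
static track_pos offset_to_track(uint32_t offset, uint32_t maxtrksz)
{
    track_pos p;
    p.trk    = (offset - CKD_HDR_SIZE) / maxtrksz;
    p.trkoff = (offset - CKD_HDR_SIZE) % maxtrksz;
    return p;
}

/* A track switch occurs when the requested track differs from the current one. */
static int is_track_switch(uint32_t offset, uint32_t maxtrksz, uint32_t curtrk)
{
    return offset_to_track(offset, maxtrksz).trk != curtrk;
}
```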

cckd_read_trk

Function cckd_read_trk scans the track cache array to see if the new track image is cached. (By default, a cylinder's worth of track images are cached). If the new track is found to be cached, then the timestamp in the cache entry is updated, and the track image pointed to by the cache entry is made active (this is known as a cache hit). Otherwise, the cache entry with the oldest timestamp (called the least recently used or lru entry) is stolen.
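
The cache scan can be pictured with the following sketch. It is not the actual cckd_read_trk logic: the cache_entry layout and the function name are hypothetical, the cache is assumed to hold at least one entry, and the handling of an updated buffer in the stolen entry is left out.

```c
#include <stddef.h>

/* Hypothetical track cache entry. */
typedef struct {
    int           trk;    /* cached track number, -1 if the entry is unused */
    unsigned long stamp;  /* time of last reference */
} cache_entry;

/*
 * Look for trk in the cache. On a hit the timestamp is refreshed. On a miss
 * the entry with the oldest timestamp (the lru entry) is stolen and assigned
 * to trk; the caller would then read the track image into it.
 * Returns the index of the entry; *hit is 1 for a cache hit, 0 for a miss.
 */
static size_t cache_find_or_steal(cache_entry *cache, size_t n, int trk,
                                  unsigned long now, int *hit)
{
    size_t lru = 0;
    for (size_t i = 0; i < n; i++) {
        if (cache[i].trk == trk) {
            cache[i].stamp = now;
            *hit = 1;
            return i;
        }
        if (cache[i].stamp < cache[lru].stamp)
            lru = i;                         /* remember the oldest entry so far */
    }
    cache[lru].trk = trk;
    cache[lru].stamp = now;
    *hit = 0;
    return lru;
}
```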

If the stolen cache entry has had its track image buffer updated (by cckd_write), then high-level routine cckd_write_trk is called to place the buffer on the deferred write queue and a new buffer is obtained.

If writes have previously occurred for the file, then the deferred write queue is scanned to see if the new track image has been queued to be written. If the new track image is in the deferred write queue, then the current lru buffer is discarded and replaced with the buffer scheduled to be written, and cckd_read_trk returns, counting the encounter as a cache hit since no physical i/o needs to be performed. (Also, a cache bit, called the writing bit, is turned on, indicating that if this cache entry is later stolen, then a new buffer must be obtained).

Otherwise, low-level routine cckd_read_trkimg is called to physically read the track image; the image is uncompressed (if necessary), and the cache entry timestamp is updated.

If cckd_read_trk was called by the i/o thread (ie by cckd_lseek), and the new track number is one more than the current track number, function cckd_readahead is called to asynchronously read 1 or more following track images, if those track images are not already in the track cache and readahead is enabled. cckd_readahead signals 1 or more readahead threads, implemented in function cckd_ra. cckd_ra, when signalled, calls cckd_read_trk to read the requested track. Note readahead is currently disabled for windows32 due to some, as yet unknown, problem in the pthreads implementation.

If the stolen cache entry had had its track image buffer updated (and cckd_write_trk was called), then the deferred-write thread is signalled to actually begin the process of writing the old updated track image. We see that the updated track image is not scheduled to be written until after the new track image has been read and the readahead threads have been signalled. Further, we see that an updated track image is not scheduled to be written until its cache entry has been stolen; hence the moniker very-lazy-write.

After all this, cckd_read_trk returns, and the new track image buffer is made active and the new track number is made current.

cckd_write_trk

Function cckd_write_trk is called by cckd_read_trk whenever a stolen cache entry's track image buffer has been updated. The function obtains a deferred-write entry and places the entry at the head of the deferred-write queue. If this is the first time that cckd_write_trk has been called for the device (ie the first write), then a bit is set on in the CCKDDASD_DEVHDR indicating that writes have occurred for the file, and the deferred-write threads and the garbage collection thread are created. It is the responsibility of the caller of cckd_write_trk to actually signal the deferred-write thread (cckd_dfw) to initiate the write process.

cckd_dfw

Function cckd_dfw is a high-level routine that actually causes a track image to be written. It pops an entry off the deferred-write queue, compresses the track image, calls the low-level routine cckd_write_trkimg to perform the physical i/o, and releases the track image buffer (unless the track is still in the track cache, in which case the updated bit is turned off instead).

cckd_dfw will also throttle the deferred-write queue if it becomes too large; this causes the callers of cckd_write_trk (eg cckd_read_trk) to be suspended until the queue drops below its threshold.

More than one deferred-write thread can be created by specifying a parameter on the device statement. The benefits, if any, of multiple threads have not yet been shown.

High level vs Low level routines

It has been casually mentioned above that cckd_read_trk, cckd_write_trk and cckd_dfw are high-level routines and cckd_read_trkimg and cckd_write_trkimg are low-level routines. In this context, high-level routines have no dependency on the underlying file structure while the low-level routines do. The high-level routines are thread-aware while the low-level routines are not.

The Low level routines

There are low level routines to read and write each of the components of the cckd file structure (headers, l1tabs, l2tabs, track images, and free spaces); each is cognizant of the base file and shadow files, if any. These routines consist of the following functions:

cckd_read_chdr read the compressed header
cckd_write_chdr write the compressed header
cckd_read_l1 read the primary lookup table
cckd_write_l1 write the primary lookup table
cckd_write_l1ent write a primary lookup table entry
cckd_read_fsp read the free space chain
cckd_write_fsp write the free space chain
cckd_read_l2 read a secondary lookup table
cckd_write_l2 write a secondary lookup table
cckd_read_l2ent read a secondary lookup table entry
cckd_write_l2ent write a secondary lookup table entry
cckd_read_trkimg read a track image
cckd_write_trkimg write a track image
All writes occur to the current file. The compressed header and the primary lookup table are kept in storage, so they are read once. Free space is read when the first write occurs for the file and is written when the file is closed. Secondary lookup tables are cached, so cckd_read_l2 doesn't necessarily perform physical file i/o. The base file can be a regular CKD file.

Shadow file routines

The routines that manipulate the shadow files are:

cckd_sf_name generates a file name for a given shadow or base file
cckd_sf_init performs shadow file initialization
cckd_sf_new creates a new shadow file
cckd_sf_add adds a shadow file (sf+ panel command)
cckd_sf_remove removes a shadow file, with or without backwards merge (sf- panel command)
cckd_sf_newname sets a new shadow file name if shadowing is not currently active (sf= panel command)
cckd_sf_stats display base and shadow file statistics (sfd panel command)

The garbage collector

The garbage collection thread is created when the first write occurs to the file. The garbage collection thread is only active for the current file. When a new track image is written, the space it previously occupied in the file is freed, and new space is acquired. The garbage collector moves track images and secondary lookup tables to combine free spaces and tends to move free space towards the end of the file so it can drop off. The garbage collector also schedules track images to be written if they haven't been referenced in some amount of time.

Byte order

As described above, a number of fields in the various blocks that comprise the spaces in a compressed CKD Dasd emulation file contain offsets and lengths that are more than 1 byte in length. Values in multiple bytes may be stored in either little-endian or big-endian byte order. For example, Intel architecture stores values in little-endian byte order and S390 architecture stores values in big-endian byte order. Consider the value 0x00010203; stored in little-endian byte order, we would see "03020100"; stored in big-endian byte order, we would see "00010203". The values in the compressed CKD Dasd emulation file are stored in the byte order of the host machine; a bit in the CCKDDASD_DEVHDR indicates in which order its values are stored. If a file is opened with the wrong byte order, then the initialization routine will automatically reverse all the values before continuing.

If a base file or shadow is read-only and contains the wrong byte order, then the fields are automatically converted when the blocks are read.
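
Reversing a value's byte order is straightforward; a minimal sketch follows. The helper names are hypothetical, and the file_is_big_endian argument stands in for the byte-order bit in the CCKDDASD_DEVHDR, whose actual name and position are not shown here.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the bytes of a 2-byte value (e.g. an L2TAB length). */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the bytes of a 4-byte value (e.g. a file offset). */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Return 1 if this machine stores the most significant byte first. */
static int host_is_big_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Convert a 4-byte value read from a file into host byte order if needed. */
static uint32_t fix_order32(uint32_t v, int file_is_big_endian)
{
    return (!file_is_big_endian == !host_is_big_endian()) ? v : swap32(v);
}
```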

Quick Start

The ckd2cckd utility can be used to create a new compressed CKD file from a regular CKD file. Your disk images can be a combination of regular CKD files and compressed CKD files. Simply specify the names of your new compressed ckd files in hercules.cnf in place of the regular CKD file names. You can also use the cckddump program on an os/390 system to build a compressed CKD file from a real disk that can be transferred to your Hercules machine and used right away.

Using Shadow Files

Shadow files enable you to make updates to cckd emulation files and not worry about possibly corrupting your entire disk image. I strongly urge those of you who use cckd to start using shadow files immediately and change your base file to read-only. This, in turn, reduces the amount of data you have to back up, increasing the amount of file savings cckd has to offer. You can even change shadow files to read-only, as long as a new shadow file can be created. You can also use shadow files for regular (non-cckd) files.

Shadow files are automatically enabled for cckd files; you must explicitly enable them for regular CKD files. To enable shadowing for a CKD device, specify

sf=shadow_file_name
on the device statement in the hercules.cnf file. shadow_file_name should include a spot in the file name, similar to multiple CKD dasd files, that can be used as a sequence number, for example, sf=../mvs/shadows/mvsres_1.500. The naming convention substitutes the shadow file number (1 thru 8) for the character preceding the period after the last slash, or for the last character if no period follows the last slash. Example:

0500 3390 ../mvs/disks/mvsres.500 sf=../mvs/shadows/mvsres_1.500
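
The naming convention can be sketched as follows. This is not the code of cckd_sf_name; the function name and signature are hypothetical, and I am assuming that "the period after the last slash" means the first period that follows it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Build the name of shadow file number sfx (1 thru 8) from the sf= template.
 * The digit replaces the character preceding the first period after the last
 * slash, or the last character if no period follows the last slash.
 * Returns 0 on success, -1 if the arguments cannot be handled.
 */
static int shadow_name(const char *tmpl, int sfx, char *out, size_t outsz)
{
    size_t len = strlen(tmpl);
    const char *base = strrchr(tmpl, '/');
    const char *dot;
    size_t start, pos;

    if (sfx < 1 || sfx > 8 || len == 0 || len >= outsz)
        return -1;

    base  = base ? base + 1 : tmpl;          /* first character of the file name */
    start = (size_t)(base - tmpl);
    dot   = strchr(base, '.');
    pos   = dot ? (size_t)(dot - tmpl) : len; /* one past the character to replace */
    if (pos <= start)
        return -1;                           /* no character available to replace */

    memcpy(out, tmpl, len + 1);
    out[pos - 1] = (char)('0' + sfx);
    return 0;
}
```

With the template ../mvs/shadows/mvsres_1.500, shadow file 3 would be named ../mvs/shadows/mvsres_3.500.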

If you did not specify sf= for a cckd file, or you wish to change the shadow file name for a cckd or regular file, but no shadow files are in use, then you can issue the following command on the Hercules console:

sf=xxxx shadow_file_name
where xxxx is the device unit address. For example, sf=0500 ../mvs/shadows/mvsres_1.500.

Specifying a shadow_file_name does not explicitly create a shadow file if the base file or current shadow file is able to be opened read-write. Otherwise, if the base file and all existing shadow files (if any) can only be opened read-only, then a new shadow file is created.

To explicitly create a new shadow file, issue the following command on the Hercules console:

sf+xxxx
where xxxx is the device unit address or * (for all eligible units). For example, sf+0500. All updated track images that haven't been written are written and the current file is hardened. Note that if a lot of write activity is occurring at the time the sf+ command is entered, then the exact state of the hardened file cannot be predicted. A new shadow file is created and all new writes are directed to it.

To remove the current shadow file, issue either of the following commands on the Hercules console:

sf-xxxx
sf-xxxx nomerge
where xxxx is the device unit address or * (for all eligible units). For example, sf-0500. If nomerge was not specified, then the current shadow file contents are merged into the preceding shadow file or base file. The current shadow file is deleted and the preceding shadow file or base file is made the current file. If the preceding file is read-only, then an error message is issued. If possible, you can make the preceding file read-write and re-issue the command. Note that if merge is specified or implied, then the command may take some amount of time depending on the size of the old shadow file.
[hmmm... note to myself -- if sf-xxxx nomerge was specified and preceding file is read-only, then delete the current file and recreate it ??]

To compress the current shadow file, issue the following command on the Hercules console:

sfcxxxx
where xxxx is the device unit address or * (for all eligible units). For example, sfc0500.

To display the status and statistics for a shadow-enabled file, issue the following command on the Hercules console:

sfdxxxx
where xxxx is the device unit address or * (for all eligible units). For example, sfd0500. This command displays status and statistics for the base file and all shadow files representing the emulated dasd. The following data is displayed:

	size	The total size of the file
	free	The amount of free space in the file as a percentage of the file size
	nbr	The number of free spaces in a file
	st	File open status - ro=read-only; rd=read-only, but can be opened read-write; rw=read-write
	reads	Number of times cckd_read_trkimg performed physical read i/o
	writes	Number of times cckd_write_trkimg performed physical write i/o
	l2reads	Number of times a secondary lookup table was read
	hits	Number of times cckd_read_trk found a track image in the track cache when called by the i/o thread
	switches	Number of times cckd_read_trk was called by the i/o thread (cckd_lseek)
	readaheads	Number of track images read by the readahead threads
	misses	Number of track images read by the readahead threads that were never referenced when the track cache entry was stolen

Special note when using shadowing for regular CKD files

You can use shadow files with regular CKD files providing that the regular CKD file is contained in a single file (since only 1 base file is supported) and if the sf= option was specified on the device initialization statement. If shadowing is active for a regular CKD file, then all i/o for the file is performed by the cckd code. Interestingly, if shadowing is specified for a regular CKD file, but the CKD file is opened read-write, and no sf+ command is issued to create a shadow file, then the regular CKD file is processed as a cckd file, with asynchronous readaheads, deferred writes and garbage collection (all the garbage collector does in this case is schedule updated track images for write after a specified amount of time). The final caveat is that cckd files, and by extension shadow files, are always the size of the device type, while regular CKD files can be less. For example, you can specify a 100 cylinder 3390 regular CKD file, but with shadowing, the file size will appear to be 1113 cylinders (the size of a 3390-1 device). The device may have to be varied offline and back online to the operating system (or the equivalent) for the new space to be recognized. However, if you do write data to the newly provided space, then a backwards merge cannot be performed (sf-).

CKD Options

In this section I will attempt to document all the options that can be specified for a CKD file (regular or compressed) in the hercules.cnf file (or on the attach panel command).

	Regular	cckd	Function
syncio syio nosyncio nosyio	X	X	Specifies whether or not synchronous I/O will be attempted for the device. For synchronous I/O, the channel program will be executed within the scope of the SIO or SSCH instruction as long as all data referenced by the channel program is already cached. If a ccw attempts to reference data that is not cached, then the channel program is restarted asynchronously at that ccw. Synchronous I/O reduces threading overhead, which may result in a performance boost. The default is syncio for cckd files and nosyncio for regular ckd files.
lazywrite nolazywrite	X		Data written to a cached track image will not be immediately written, but will be written when a track switch occurs. Thus, only one write will occur for a track image while it is the active image. nolazywrite, the default, specifies that all writes are performed when requested.
fulltrackio fulltrkio ftio nofulltrackio nofulltrkio noftio	X		Specifies whether or not a full track will be read when a track switch occurs. Subsequent reads to this track image will not cause any physical I/Os. Turning on fulltrackio can considerably enhance CKD device response time. However, if you are sharing CKD disk images with more than 1 instance of Hercules at the same time when writes could occur, you should specify nofulltrackio. The default is fulltrackio.
readonly rdonly ro	X	X	Causes the CKD file image to be opened read-only. Attempts to write to the emulated device will cause an I/O error unless option fakewrite is also specified. If readonly is specified for shadowed file images, then the base file will be opened readonly and a shadow file will be created if one doesn't exist.
fakewrite fakewrt fw	X	X	Writes to a readonly file will be considered successful even though no write actually occurred. This option is only meaningful if readonly is also specified. Fakewrite is ignored for shadowed file images.
cache=n	X	X	Specifies the number of track images that will be cached. The default is the number of tracks per cylinder for the device. [For cckd files, the default is the number of tracks per cylinder plus the number of readahead threads]. If nofulltrackio is specified for a regular CKD file, then no caching occurs. Caching always occurs for cckd files, although you can set the cache value to 1.
sf=file_name	X	X	Specifies the name of the shadow file(s) for the emulated device. The name should have a spot where the shadow file number can be inserted into the name (see above).
l2cache=n	*	X	Specifies the number of Secondary Lookup Tables (l2tabs) that will be cached for the cckd or shadowed device. (Each l2tab is 2048 bytes). The default is 32.
dfwq=n	*	X	Specifies a threshold for the size of the deferred-write-queue where processing will be throttled if the size exceeds this number. Each entry in the deferred-write-queue contains a pointer to a buffer whose size is max-track-size. The default is 64.
wt=n	*	X	Specifies the time in seconds that an updated track image will be written after its last reference. The garbage collector is responsible for scheduling these old track images to be updated. The default is 60 seconds.
ra=n	*	X	Specifies the number of readahead threads (and number of tracks to be read ahead) when sequential access to the emulated device is detected. That is, each track that is read ahead of time is read by a different thread. A value between 0 and 9 can be specified. Currently, readahead should be disabled for Windows32 due to an unknown error involving the pthreads implementation. The default for WIN32 is 0; otherwise the default is 2.
dfw=n	*	X	Specifies the number of deferred write threads. A number between 1 and 9 may be specified; the default is 1. It has not been shown that specifying a greater number results in any performance improvements.

* Option is only applicable if shadowing is active for the regular CKD file.

Generally, the defaults for all options (except sf=) should not be changed unless there is an explicit reason for doing so. If you use cckd files, then I strongly recommend that you start using shadow files. If you use regular CKD files, then you can use shadow files if you want to gain the snapshot benefit.

Utilities

ckd2cckd [options] source-file target-file

Description Copies a regular CKD Dasd emulation file to a compressed CKD Dasd emulation file. The target file cannot previously exist. If the emulated Dasd device is in more than 1 file then specify the first file. After the copy completes, the target file contains no free space, imbedded or otherwise.
Options
- -compress n
  Compression Algorithm
  - 0 don't compress
  - 1 compress using zlib
  - 2 compress using bzip2
- -dontcompress n
  Same as -compress 0
- -maxerrs errs
  Maximum number of errors that can occur before the copy is terminated; if 0 then errors are ignored. Default is 5.
- -nofudge
  [deprecated]
- -quiet
  Quiet mode; don't display status
- -z parm
  Parameter passed to compression
  
  zlib compression level:
  0 = no compression
  1=fastest ... 9=best
  
  bzip2 blockSize100k value:
  1=fastest ... 9=best

cckd2ckd [options] source-file target-file

Description Copies a compressed CKD Dasd emulation file to a regular CKD Dasd emulation file. The target file cannot previously exist. More than 1 target file may be created.
Options
- -cyls n
  Number of cylinders to copy if the entire file isn't to be copied. If 0 then only the number of cylinders in use are copied.
- -maxerrs errs
  Maximum number of errors that can occur before the copy is terminated; if 0 then errors are ignored. Default is 5.
- -quiet
  Quiet mode; don't display status
- -validate
  Validate track images [default]
- -novalidate
  Don't Validate track images

cckdcdsk [-level] file-name

Description Performs compressed or shadowed CKD Dasd emulation file integrity verification and recovery and repair.
Options
- -level
  A digit 0, 1 or 3 that specifies the level of checking. The higher the level, the longer the integrity check takes.
  - 0 Minimal checking. Device headers are verified, free space is verified, primary lookup table and secondary lookup tables are verified.
  - 1 Same checks as level 0 plus all 5-byte track headers are verified.
  - 3 Same checks as level 1 plus all track images are read, uncompressed and verified.

cckdcomp [-level] file-name

Description Removes all free space from a compressed or shadow CKD Dasd emulation file. (Compresses or compacts a cckd file ... your choice!). If level is specified, then cckdcdsk is called first with the specified level; this is a short-hand method to call both functions in one utility call.
Options
- -level
  A digit 0, 1 or 3 that specifies the level of checking. The higher the level, the longer the integrity check takes.
  - 0 Minimal checking. Device headers are verified, free space is verified, primary lookup table and secondary lookup tables are verified.
  - 1 Same checks as level 0 plus all 5-byte track headers are verified.
  - 3 Same checks as level 1 plus all track images are read, uncompressed and verified.

cckdfix file-name

Description This is a skeleton program that is not compiled during make. It can be edited to change/repair the device headers.
Compiling Enter `cc -o cckdfix -DARCH=390 cckdfix.c' to compile and link the edited program.

cckddump

Description This is an os/390 hlasm (High Level Assembler) program that will create a compressed CKD emulation file from an actual CKD device. See below for a description on how to build and run this program.

FAQ

Q. What devices are supported ?

A. 2311, 2314, 3330, 3340, 3350, 3375, 3380, 3390 and 9345.

Q. Is a 3390 model 9 supported ?

A. The short answer is "no"; the long answer is "sort of". A 3390-9 should compress to a file smaller than the 2G limit. However, the compressed dasd program "hooks" into ckddasd.c by replacing the lseek, read and write library calls with calls to an intermediate function, and the file offset parameter passed to lseek is a 32-bit signed number. For a compressed file, the cckd code treats this number as unsigned (for SEEK_SET) and uses it to calculate the dasd track and the offset into that track; in other words, the file offset maintained by ckddasd.c is just a number that identifies a track and a position within it. The largest such offset is therefore 4G-1, which is not a problem for a 3390-3 but reaches only about half of a 3390-9. It would be possible to modify ckddasd.c to use long long for file offsets, but I wanted to minimize changes to ckddasd.c and this change seemed a little too intrusive.
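The arithmetic behind "about half" can be checked with a small sketch. It uses the regular CKD offset formula from the introduction (offset = 512 + trk * maxtrksz). The track size of 56832 bytes and the 3390-9 track count of 150255 (10017 cylinders of 15 tracks) are assumptions that do not appear in this document; the 3390-3 track count is the one quoted in the introduction. The function names are made up for illustration.

```python
# Sketch: how much of a volume an unsigned 32-bit offset can address.
TRACK_SIZE = 56832        # ASSUMPTION: maximum 3390 track image size in bytes
HEADER_SIZE = 512         # CKD file header size, from the introduction
MAX_OFFSET = 2**32 - 1    # 4G-1: largest offset when the lseek value is unsigned

TRACKS_3390_3 = 50085     # from the introduction
TRACKS_3390_9 = 150255    # ASSUMPTION: 10017 cylinders * 15 tracks per cylinder


def highest_addressable_track(max_offset=MAX_OFFSET):
    """Return the last track whose whole image ends at or below max_offset."""
    # Track t occupies offsets 512 + t*size .. 512 + (t+1)*size - 1.
    return (max_offset + 1 - HEADER_SIZE) // TRACK_SIZE - 1


def fits(tracks):
    """True if every track of a volume with `tracks` tracks is addressable."""
    return tracks - 1 <= highest_addressable_track()


reachable = highest_addressable_track() + 1
print("3390-3 fully addressable:", fits(TRACKS_3390_3))
print("3390-9 fully addressable:", fits(TRACKS_3390_9))
print("fraction of a 3390-9 reachable: %.2f" % (reachable / TRACKS_3390_9))
```

Under these assumed figures, 75573 of the 150255 tracks of a 3390-9 are reachable, while all 50085 tracks of a 3390-3 fit comfortably.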

Q. How can I get rid of the free space in my files ?

A. Once the total amount of free space falls below 6% of the total file size, the garbage collector is not very aggressive about eliminating free space. To remove all free space from the file while Hercules is running, use the sfc console command (see Using Shadow Files above). Otherwise, you can use the cckdcomp utility (see Utilities above).

Q. How can I display the space statistics for a compressed file ?

A. The statistics are displayed when the compressed file is opened. Currently, there is no supplied method to display these statistics at any other time. However, it shouldn't be too hard to write a shell script (similar to dasdlist) to display these statistics. The statistics are contained in the CCKDDASD_DEVHDR which is at offset 512 in the compressed file; the header is mapped in hercules.h.
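As a starting point for such a script, here is a minimal sketch that pulls the raw CCKDDASD_DEVHDR bytes out of a file and prints them as hex, so they can be compared by eye against the mapping in hercules.h. The offset of 512 comes from the answer above; the header length of 512 bytes is an assumption, and the function names are hypothetical. The sketch deliberately does not decode individual fields, since their layout is defined only in hercules.h.

```python
CCKD_DEVHDR_OFFSET = 512   # from the text: CCKDDASD_DEVHDR starts at offset 512
CCKD_DEVHDR_LENGTH = 512   # ASSUMPTION: header length; check hercules.h


def read_cckd_devhdr(path, length=CCKD_DEVHDR_LENGTH):
    """Return the raw compressed device header bytes from a cckd file."""
    with open(path, "rb") as f:
        f.seek(CCKD_DEVHDR_OFFSET)
        raw = f.read(length)
    if len(raw) != length:
        raise ValueError("file too short to contain a compressed device header")
    return raw


def hexdump(raw, width=16):
    """Format bytes as offset-prefixed hex lines."""
    return "\n".join(
        f"{i:04x}  {raw[i:i + width].hex(' ')}" for i in range(0, len(raw), width)
    )
```

Calling `print(hexdump(read_cckd_devhdr("mydisk.cckd")))` (the file name is just an example) dumps the header; the offsets printed are relative to the start of the header, not to the start of the file.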

Q. What is a "null track" anyway ?

A. The term "null track" is just something I made up. It is what is returned when a zero offset is found in either the primary or secondary lookup table for the track. It contains the following fields:

`0CCHH`	Home address
`CCHH0008 00000000`	standard R0
`CCHH1000`	end-of-file marker
`ffffffff`	end-of-track marker

When a null track is written, space previously occupied by the track is freed and the offset in the secondary lookup table is set to zero. If all offsets in the secondary lookup table are zero, then the secondary lookup table is freed and the primary lookup table entry is zeroed.
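The layout in the table can be sketched as a byte string. This reading assumes that each character in the table stands for one byte (so `CCHH` is a two-byte cylinder number followed by a two-byte head number, `0008` is record 0, key length 0 and a two-byte data length of 8, and `00000000` is eight bytes of zero data); the helper name `null_track` is made up for illustration and is not a Hercules function.

```python
import struct


def null_track(cyl, head):
    """Build a null track image for the given cylinder and head (a sketch)."""
    cchh = struct.pack(">HH", cyl, head)               # big-endian CC, HH
    home_address = b"\x00" + cchh                      # 0CCHH
    r0_count = cchh + struct.pack(">BBH", 0, 0, 8)     # CCHH0008: R=0, KL=0, DL=8
    r0_data = bytes(8)                                 # 00000000: eight zero bytes
    eof_count = cchh + struct.pack(">BBH", 1, 0, 0)    # CCHH1000: R=1, KL=0, DL=0
    end_of_track = b"\xff" * 8                         # ffffffff
    return home_address + r0_count + r0_data + eof_count + end_of_track


image = null_track(cyl=1, head=2)
print(len(image), image.hex())
```

Under this interpretation a null track image is 5 + 8 + 8 + 8 + 8 = 37 bytes long, none of which needs to be stored in the file.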

Q. I want to try bzip2 but I'm getting compiler errors. What am I doing wrong ?

A. Probably bzip2 is not installed or is not installed properly. You can obtain bzip2 from here. If bzip2 is installed, then you need to find the directory where bzlib.h is installed and the directory where libbz2.a is installed. You can then add "-I bzlib.h-directory" to the CFLAGS in the make file and add "-L libbz2.a-directory" to the LFLAGS.

Q. Which is better, zlib or bzip2 ?

A. This is a religious question. I have no actual preference, I just wanted to make a choice available.

Q. Can other compression programs be used ?

A. Yes. The program is architecturally structured so that other compression algorithms can be added rather painlessly. This will require, of course, an update to the source.

Q. Can this compression scheme be used for FBA devices too ?

A. I have not worked with FBA devices for over 20 years. However, it seems to me that a similar program for FBA devices should be simpler than this program for CKD devices (none of those count/key/data fields mucking everything up). Since an FBA block is 512 bytes, it might not be efficient to have each block compressed individually; it might be better to compress blocks in 32K or 64K chunks. If someone asks very nicely, I may consider looking into it;-)
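To make the chunking idea concrete, here is a rough sketch that compresses an FBA image in groups of blocks with zlib and reads a single block back by decompressing only its group. This is purely illustrative and not part of the cckd code; the function names and the in-memory list of groups are made up for the example.

```python
import zlib

BLOCK_SIZE = 512   # FBA block size, from the text


def compress_in_groups(image, blocks_per_group):
    """Compress an FBA image in groups of `blocks_per_group` blocks."""
    step = BLOCK_SIZE * blocks_per_group
    return [zlib.compress(image[i:i + step]) for i in range(0, len(image), step)]


def read_block(groups, blocks_per_group, blkno):
    """Return one 512-byte block by decompressing only the group that holds it."""
    group = zlib.decompress(groups[blkno // blocks_per_group])
    start = (blkno % blocks_per_group) * BLOCK_SIZE
    return group[start:start + BLOCK_SIZE]


def stored_size(groups):
    """Total number of compressed bytes across all groups."""
    return sum(len(g) for g in groups)
```

Comparing `stored_size(compress_in_groups(image, 1))` with `stored_size(compress_in_groups(image, 128))` (128 blocks is 64K) shows the trade-off: larger groups amortize the per-stream overhead, at the cost of decompressing a whole group to read one block.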

Changes

0.2.0
- This release greatly enhances the stability of cckd files.
- Free spaces are read at file open time, maintained in storage, and written at file close time. If a file is not successfully closed, then free spaces are easily recovered by the chkdsk function when the file is opened next.
- Imbedded free space is deprecated. Because the free space chain is now in storage, the overhead required to keep an updated track image at its same location in the file is no longer necessary; the penalty for traversing the free space chain to find a new location is greatly reduced. Also, because a free space header is no longer written at the beginning of a freed space when it is freed, that space is eligible for recovery in the event of a failure.
- Garbage collection is more efficient because it can combine the most free spaces per iteration now that the penalty for scanning free space is gone.
- Secondary lookup tables (or l2tabs) are now cached. Most overhead I/O for cckd files is now eliminated.
- Support for read-only files is added.
- Utility cckdcomp is provided to remove all free space from a cckd file.
0.2.1
- The concept of shadow files is implemented, which logically performs a snapshot function.
- Windows 32 support is enabled.
- Seems like there should be more but I can't mush my brain much farther. Some new options have been added so #defines in hercules.cnf don't have to be changed; overflow tracks are supported (I hope) -- Thanks Valery!!

Bugs

This code is absolutely bug free; if you encounter any problems then it must be a personal problem and you've done something wrong. Also, there are no enhancements that can be made because I've already thought of them all and implemented them. By the way, I have some prime soon to be ocean front property in Tennessee to sell to the highest bidder;-)

cckddump os/390 hlasm program

The cckddump program (supplied in file cckddump.hla) is an os/390 assembler language program that creates a compressed CKD Dasd emulation file from a real DASD volume. This program must be APF-authorized since it modifies the DEB to be able to read all tracks from the real device. The program executes 16 or so instructions while in supervisor state/key 0; otherwise the program runs entirely in problem state/key 8. It is not the prettiest assembler language program I've ever written, and there are plenty of enhancements that I originally intended to put into the program but haven't yet; once I got the program working well enough, I spent the rest of my time writing the fun stuff, the Hercules part.

The real CKD Dasd volume that is dumped must be an ECKD device (i.e. it must support the 'Locate Record' and 'Read Track' CCWs); this shouldn't be a problem because I don't think any os/390 release supports a non-ECKD device. The output file must be a DASD file; its characteristics are LRECL=4096, BLKSIZE=4096, RECFM=F. The program only dumps allocated tracks (plus track 0) and only dumps tracks up to DS1LSTAR for DSORG=PS and DSORG=PO files. The program will call zlib to compress the track images if the zlib routines have been linked with the program; however, I don't think the program will be advantageous if it can't call zlib.

Preparing zlib

zlib can be obtained from here
Copy or ftp the *.c files to an LRECL=255,RECFM=VB partitioned dataset; here we will call the dataset prefix.ZLIB.C
Similarly, copy or ftp the *.h files to an LRECL=255,RECFM=VB partitioned dataset; we'll call it prefix.ZLIB.H

Edit member prefix.ZLIB.H(ZCONF). Near the bottom, before the 2nd to last #endif, add the following lines:

    #   pragma map(compress,"COMPRESS")
    #   pragma map(compress2,"COMPRES2")
    #   pragma map(uncompress,"UNCOMPRE")

Allocate an object partitioned dataset prefix.ZLIB.OBJ; LRECL=80,BLKSIZE=3200,RECFM=FB.
Submit the following job to compile zlib:

//         JOB
//CC    JCLLIB ORDER=(CBC.SCBCPRC)
//*
//ADLER32 EXEC EDCC,INFILE='prefix.ZLIB.C(ADLER32)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(ADLER32),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//COMPRESS EXEC EDCC,INFILE='prefix.ZLIB.C(COMPRESS)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(COMPRESS),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//CRC32   EXEC EDCC,INFILE='prefix.ZLIB.C(CRC32)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(CRC32),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//DEFLATE EXEC EDCC,INFILE='prefix.ZLIB.C(DEFLATE)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(DEFLATE),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//EXAMPLE EXEC EDCC,INFILE='prefix.ZLIB.C(EXAMPLE)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(EXAMPLE),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//GZIO    EXEC EDCC,INFILE='prefix.ZLIB.C(GZIO)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(GZIO),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFBLOCK EXEC EDCC,INFILE='prefix.ZLIB.C(INFBLOCK)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFBLOCK),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFCODES EXEC EDCC,INFILE='prefix.ZLIB.C(INFCODES)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFCODES),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFFAST EXEC EDCC,INFILE='prefix.ZLIB.C(INFFAST)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFFAST),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFLATE EXEC EDCC,INFILE='prefix.ZLIB.C(INFLATE)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFLATE),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFTREES EXEC EDCC,INFILE='prefix.ZLIB.C(INFTREES)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFTREES),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//INFUTIL EXEC EDCC,INFILE='prefix.ZLIB.C(INFUTIL)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(INFUTIL),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//TREES   EXEC EDCC,INFILE='prefix.ZLIB.C(TREES)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(TREES),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//UNCOMPR EXEC EDCC,INFILE='prefix.ZLIB.C(UNCOMPR)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(UNCOMPR),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H
//*
//ZUTIL   EXEC EDCC,INFILE='prefix.ZLIB.C(ZUTIL)',
//             CPARM='RENT,LIST,SOURCE,LONGNAME,AGG,OPT(2)',
//             OUTFILE='prefix.ZLIB.OBJ(ZUTIL),DISP=SHR'
//USERLIB   DD DISP=SHR,DSN=prefix.ZLIB.H

Prelink zlib using the following job:

//        JOB
//PLKED  EXEC PGM=EDCPRLK
//SYSMSGS  DD DISP=SHR,DSN=CEE.SCEEMSGP(EDCPMSGE)
//SYSLIB   DD DISP=SHR,DSN=prefix.ZLIB.OBJ
//         DD DISP=SHR,DSN=CEE.SCEEOBJ
//SYSOUT   DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DISP=SHR,DSN=prefix.ZLIB.OBJ(ADLER32)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(COMPRESS)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(CRC32)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(DEFLATE)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(GZIO)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFBLOCK)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFCODES)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFFAST)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFLATE)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFTREES)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(INFUTIL)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(TREES)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(UNCOMPR)
//         DD DISP=SHR,DSN=prefix.ZLIB.OBJ(ZUTIL)
//SYSMOD   DD DISP=SHR,DSN=prefix.ZLIB.OBJ(ZLIB)

Assemble and linkedit cckddump

Allocate partitioned dataset prefix.cckddump.source with LRECL=80,RECFM=FB, then copy or ftp file cckddump.hla into it as member CCKDDUMP.

Submit the following job:

//        JOB
//C      EXEC PGM=ASMA90
//SYSLIB   DD DISP=SHR,DSN=SYS1.MACLIB
//         DD DISP=SHR,DSN=SYS1.MODGEN
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DISP=SHR,DSN=prefix.cckddump.source(CCKDDUMP)
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(1,1))
//SYSLIN   DD DISP=(,PASS),DSN=&&OBJ,UNIT=SYSDA,SPACE=(CYL,(1,1)),
//            LRECL=80,BLKSIZE=3200,RECFM=FB
//L      EXEC PGM=HEWL
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(1,1))
//SYSLIB   DD DISP=SHR,DSN=CEE.SCEESPC
//         DD DISP=SHR,DSN=CEE.SCEELKED
//ZLIB     DD DISP=SHR,DSN=prefix.ZLIB.OBJ
//SYSLMOD  DD DISP=SHR,DSN=apfauth.load
//SYSLIN   DD DISP=(OLD,DELETE),DSN=&&OBJ
//         DD *
  INCLUDE  ZLIB(ZLIB)
  INCLUDE  SYSLIB(EDCXHOTL)
  INCLUDE  SYSLIB(EDCXHOTU)
  INCLUDE  SYSLIB(EDCXHOTT)
  ORDER    MAIN(P)
  ENTRY    MAIN
  SETCODE  AC(1)
  NAME     CCKDDUMP(R)

The assemble step (C) should complete with condition code 4. This is a `feature' due to the way IBM macro IECSDSL1 is coded. The linkedit step (L) should complete with condition code 0.

Executing cckddump

The volume to be dumped is identified by the SYSUT1 DD statement; the output compressed CKD Dasd emulation file is identified by the SYSUT2 DD statement.

Submit a job similar to the following:

//        JOB
//S1     EXEC PGM=CCKDDUMP
//STEPLIB  DD DISP=SHR,DSN=apfauth.load
//SYSPRINT DD SYSOUT=*,RECFM=VB,LRECL=255,BLKSIZE=4096
//SYSUT1   DD DISP=OLD,UNIT=SYSDA,VOL=SER=volser
//SYSUT2   DD DISP=(,CATLG),DSN=prefix.volser.cckd,
//            UNIT=SYSDA,SPACE=(TRK,(7500,1500),RLSE),
//            LRECL=4096,BLKSIZE=4096,RECFM=F

Make the file available to Hercules

Copy or ftp prefix.volser.cckd in binary mode to your platform running Hercules.

Feedback

Questions ?? Problems ?? Comments ?? Suggestions ?? Corrections ?? Bugs ??
Let me know at gsmith@nc.rr.com

greg smith

Last updated 11 January 2001
