This chapter describes the logical and physical concepts that apply to the Data Division. In addition, this chapter presents the general formats for all Data Division entries and clauses, describes their basic elements, and lists rules of use.
The Data Division defines the data processed by your COBOL program in both physical and logical terms. It also specifies whether the data is contained in files, a database, Oracle CDD/Repository, or is developed only for local use in your program.
The File and Report Sections of your program define data contained in files. A file description, sort-merge file description, or report file description entry creates a logical structure, or file connector, that refers to the physical file. It also can contain clauses that define physical file characteristics. A file description or sort-merge file description entry must be associated with at least one record description entry. A record description entry is a set of one or more data description entries, organized in a hierarchical structure which logically defines a set of related data within the file. The data description entries specify all the data used in your program. You logically define the record hierarchy by the level numbers you use for the data description entries (or entry) within the record description entry. Your logical link to a record or to a field in a record is the data-name you assign in a corresponding data description entry. The clauses in a data description entry also specify physical data attributes, such as storage format and initial values.
A report description entry must be associated with a report group description, which specifies both the logical hierarchy of data in the report and the data's physical attributes.
A screen description entry describes a video form or a portion of a video form.
The Working-Storage and Linkage Sections also contain data description entries, which describe characteristics of data developed for use in your program.
The following sections explain in more detail how a COBOL program
specifies physical and logical characteristics. Additionally, the
following sections describe how record descriptions impose logical
structures on data, and how the physical attributes of data affect the
way data is stored and manipulated.
5.1 Logical Concepts of Data Storage
Because a record description is a logical structure rather than a physical one, a program can define more than one record description for the same data. However, this redefinition does not change the physical data in any way. Multiple record descriptions for the same data all apply to one physical data unit on the file medium.
When you refer to a data-name in a Procedure Division statement, you are referring to a logical unit, either a logical record or a logical subset of that record. When your COBOL source statements execute, the logical units to which they refer are mapped to physical units on media. The logical units are then manipulated according to their physical attributes.
The correspondence between a logical record and a physical record is not necessarily a one-to-one correspondence. The term physical record applies to a data unit that is media dependent and defined by the I/O system. On OpenVMS Alpha systems, the I/O system is called OpenVMS Record Management Services (RMS). A logical record may correspond to one physical record, either alone or grouped with other logical records. Or, on disk, a logical record may need more than one physical record to contain it.
Several COBOL clauses (in the Environment and Data Divisions) describe the relationships between logical records and physical records. Programs can then access data as logical entities with little regard to the physical data definitions that the I/O system requires.
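As an illustration of such clauses, the following minimal sketch pairs a SELECT clause in the Environment Division with BLOCK CONTAINS and RECORD CONTAINS clauses in a file description entry. It is written in ANSI fixed reference format. The program name, the file name "MASTER.DAT", the blocking factor of 10, and the record layout are assumptions chosen for the example, not values taken from this manual.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BLOCKING-SKETCH.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * The file name "MASTER.DAT" is a placeholder.
           SELECT MASTER-FILE ASSIGN TO "MASTER.DAT"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
      * Logical view: fixed 30-character records.
      * Physical view: ten logical records per block (assumed).
       FD  MASTER-FILE
           BLOCK CONTAINS 10 RECORDS
           RECORD CONTAINS 30 CHARACTERS.
       01  MASTER-RECORD        PIC X(30).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The program works only with the logical record; the I/O
      * system decides when a physical block is transferred.
           OPEN OUTPUT MASTER-FILE
           MOVE "SAMPLE DATA" TO MASTER-RECORD
           WRITE MASTER-RECORD
           CLOSE MASTER-FILE
           STOP RUN.
```

The Procedure Division never mentions blocks; only the declarations do, which is the separation the paragraph above describes.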
During program execution, data transfer between the program and a
physical record can involve translation if the SELECT clause contains a
5.1.1 Record Description Entries
Logical records do not have to be subdivided; however, they often are. Subdivision can continue for each of the record's parts, allowing progressively more detailed data definition.
The basic subdivision of a record is the elementary data item (or elementary item), which you define by specifying a PICTURE clause (except for COMP-1 or COMP-2). As the term implies, elementary items are never subdivided. A logical record consists of one or more sets of elementary items, or is itself an elementary item.
A group data item (or group item) is a data set within a record that contains other subordinate data items. The lowest-level group item is always a named sequence of one or more elementary items. Group items can combine to form more inclusive group items. Therefore, an elementary item can be subordinate to more than one group item in the record.
Figure 5-1 represents a personnel record that illustrates how elementary and group items can be related in a record hierarchy. The record contains three group items directly subordinate to the top level: Identification Data, History, and Payroll Data. The first group item, Identification Data, directly contains two elementary items, Name and Job Title, and two other group items, Employee Number and Address. The group item, Employee Number, contains two elementary items: Department Code and Badge Number. The group item, Address, contains four elementary items: Street, City, State, and ZIP Code. The elementary item, City, belongs to three group items. It is subordinate to Address, Identification Data, and Personnel Record. The second group item, History, directly contains three elementary items: Hire Date, Last Promotion Date, and Termination Date. The third group item, Payroll Data, directly contains two elementary items: Current Salary and Previous Salary.
Figure 5-1 Hierarchical Record Structure
Record description entries use a system of level-numbers to specify the hierarchical organization of elementary and group items. Level-numbers that specify hierarchical structure can range from 01 to 49.
The record is the most inclusive data item; that is, there is no hierarchical relationship between one record description entry and any other. However, there is a hierarchical relationship between a group item and its subordinate group or elementary items. The level-number for records is 01. Less inclusive data items have greater (although not necessarily consecutive) level-numbers.
All items subordinate to a group item must have level-numbers greater than the group's level-number. In a record description, a group item is delimited by the next level-number that is less than or equal to that group's level number.
Figure 5-2 shows how level-numbers specify hierarchical structure and how the presence of the PICTURE clause defines an elementary item. Although line indentation can make record descriptions easier to read, it does not affect record structure; only the level-number values specify the hierarchy. The ellipsis (...) indicates that parts of the program line have been omitted.
Figure 5-2 Level-Number Record Structure
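As a rough sketch of how level-numbers express a hierarchy, the personnel record described for Figure 5-1 might be coded as follows. It is written in ANSI fixed reference format. The data-names, the PICTURE sizes, the order of the items, and the choice of level-numbers 05, 10, and 15 are assumptions made for illustration; any increasing level-numbers from 02 to 49 would describe the same structure.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LEVELS-SKETCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Entries with PICTURE are elementary; the others are groups.
       01  PERSONNEL-RECORD.
           05  IDENTIFICATION-DATA.
               10  EMP-NAME             PIC X(30).
               10  JOB-TITLE            PIC X(20).
               10  EMPLOYEE-NUMBER.
                   15  DEPARTMENT-CODE  PIC X(3).
                   15  BADGE-NUMBER     PIC 9(5).
               10  EMP-ADDRESS.
                   15  ADDR-STREET      PIC X(20).
                   15  ADDR-CITY        PIC X(15).
                   15  ADDR-STATE       PIC XX.
                   15  ADDR-ZIP-CODE    PIC X(5).
           05  EMP-HISTORY.
               10  HIRE-DATE            PIC 9(8).
               10  LAST-PROMOTION-DATE  PIC 9(8).
               10  TERMINATION-DATE     PIC 9(8).
           05  PAYROLL-DATA.
               10  CURRENT-SALARY       PIC 9(6)V99.
               10  PREVIOUS-SALARY      PIC 9(6)V99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * ADDR-CITY is subordinate to three enclosing items.
           MOVE "Boston" TO ADDR-CITY
           DISPLAY ADDR-CITY OF EMP-ADDRESS OF PERSONNEL-RECORD
           STOP RUN.
```

The indentation is only a reading aid. Removing it would not change the structure, because the level-numbers alone define which items are subordinate to which.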
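Three special level-numbers (66, 77, and 88) neither specify hierarchical structure nor actually indicate level. Rather, they define special types of data entries: level-number 66 identifies RENAMES entries, level-number 77 identifies noncontiguous elementary items that are not part of any record hierarchy, and level-number 88 associates condition-names with values of a conditional variable.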
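The short program below is a sketch of the three special level-numbers in use: a level 77 standalone item, level 88 condition-names attached to a one-character status field, and a level 66 RENAMES entry that regroups two existing elementary items. It is written in ANSI fixed reference format, and every data-name and value in it is hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPECIAL-LEVELS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * 77: a standalone elementary item, not part of any record.
       77  RECORD-COUNT             PIC 9(4) VALUE ZERO.
       01  EMPLOYEE-REC.
           05  EMP-STATUS           PIC X.
      * 88: condition-names for particular values of EMP-STATUS.
               88  EMP-ACTIVE       VALUE "A".
               88  EMP-TERMINATED   VALUE "T".
           05  HIRE-DATE.
               10  HIRE-YEAR        PIC 9(4).
               10  HIRE-MONTH       PIC 99.
               10  HIRE-DAY         PIC 99.
      * 66: an alternative grouping of existing elementary items.
       66  HIRE-YEAR-MONTH RENAMES HIRE-YEAR THRU HIRE-MONTH.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "A" TO EMP-STATUS
           MOVE "20010315" TO HIRE-DATE
      * The condition-name tests EMP-STATUS without naming a value.
           IF EMP-ACTIVE
               ADD 1 TO RECORD-COUNT
           END-IF
      * HIRE-YEAR-MONTH covers the first six characters of HIRE-DATE.
           DISPLAY HIRE-YEAR-MONTH
           DISPLAY RECORD-COUNT
           STOP RUN.
```

None of these entries adds a level to the record hierarchy. The level 66 and level 88 entries define no new storage, and the level 77 item stands outside every record.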
Example 5-1 shows a sample file description entry (FD) that contains three record description entries. The three record description entries define three logical templates the program can impose on a record to access data from it.
Example 5-1 Multiple Record Definition Structure

FD  MASTER-FILE.
01  T1.
    02  T1-ACCOUNT-NO    PIC 9(6).
    02  T1-TRAN-CODE     PIC 99.
    02  T1-NAME          PIC X(13).
    02  T1-BALANCE       PIC 9(5)V99.
    02  REC-TYPE         PIC XX.
01  T2.
    02  T2-ACCOUNT-NO    PIC 9(6).
    02  T2-ADDRESS.
        03  T2-STREET    PIC X(15).
        03  T2-CITY      PIC X(7).
    02  REC-TYPE         PIC XX.
01  RECORD-TYPE.
    02                   PIC X(28).
    02  REC-TYPE         PIC XX.
The three record description entries in Example 5-1, T1, T2, and
RECORD-TYPE, each define a fixed-length record of 30 characters. Once
the program reads a record, it can use the last two characters
(REC-TYPE) to determine which record description entry to use.
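The selection step might look like the following sketch. To keep it self-contained, it places the three templates in Working-Storage and overlays them with REDEFINES instead of reading a file, and it names REC-TYPE only once so that no qualification is needed. It is written in ANSI fixed reference format. The type codes "T1" and "T2" and the sample field values are assumptions; Example 5-1 does not state which values REC-TYPE holds.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECTYPE-SKETCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * One 30-character area viewed through three templates.
      * REDEFINES stands in for the implicit sharing among FD records.
       01  RECORD-TYPE.
           02  FILLER           PIC X(28).
           02  REC-TYPE         PIC XX.
       01  T1 REDEFINES RECORD-TYPE.
           02  T1-ACCOUNT-NO    PIC 9(6).
           02  T1-TRAN-CODE     PIC 99.
           02  T1-NAME          PIC X(13).
           02  T1-BALANCE       PIC 9(5)V99.
           02  FILLER           PIC XX.
       01  T2 REDEFINES RECORD-TYPE.
           02  T2-ACCOUNT-NO    PIC 9(6).
           02  T2-ADDRESS.
               03  T2-STREET    PIC X(15).
               03  T2-CITY      PIC X(7).
           02  FILLER           PIC XX.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Simulate a record arriving from the file.
           MOVE 123456       TO T2-ACCOUNT-NO
           MOVE "12 ELM ST"  TO T2-STREET
           MOVE "BOSTON"     TO T2-CITY
      * "T1" and "T2" are assumed type codes.
           MOVE "T2"         TO REC-TYPE
      * Pick the template that matches the record type.
           EVALUATE REC-TYPE
               WHEN "T1"
                   DISPLAY "T1 NAME: " T1-NAME
               WHEN "T2"
                   DISPLAY "T2 CITY: " T2-CITY
               WHEN OTHER
                   DISPLAY "UNKNOWN RECORD TYPE"
           END-EVALUATE
           STOP RUN.
```

In a program that uses the FD from Example 5-1 directly, the three 01 entries share the record area without REDEFINES, and a reference such as REC-TYPE OF RECORD-TYPE would be qualified because the name appears in all three records.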
5.2 Physical Concepts of Data Storage
The size of a physical record and the way it is recorded depend on the hardware device involved in an input or output operation. For example, tape and disk media store physical records differently. On tape, a physical record is written between interrecord gaps. On disk, a physical record is written in multiple units of a fixed number of bytes, which is determined by the hardware and operating system involved.
On OpenVMS Alpha systems, the term used for a physical record differs according to file organization. A physical record in a sequential file is called a block. A physical record in a relative or indexed file is called a bucket. A block or bucket corresponds to the unit used by the I/O system software to transfer records from a file to your program (and vice versa). The number of records (in logical terms) actually transferred by an input-output operation depends on the following:
The maximum physical record size depends on file organization and device. On OpenVMS Alpha systems, the maximum physical record sizes for sequential files on tape devices and for sequential, indexed, and relative files on disk are shown in terms of number of bytes in Table 5-1.
|Type of File|Magnetic Tape Devices|Disk|
|---|---|---|
|Sequential|65,535 bytes|65,024 bytes|
|Relative|N/A|32,255 bytes|
A compile-time informational diagnostic appears if the physical record size exceeds 65,024 bytes for a sequential file. However, Compaq COBOL programs are device-independent. Therefore, a fatal run-time error can also occur if the file is assigned to disk when the program runs.
The size and storage format of an elementary data item depend upon what class and category of data it represents and how that data can be used. A data item's PICTURE clause determines its class and category. The item's PICTURE clause and USAGE clause, in combination, specify its size and storage format. See the PICTURE clause (Section 5.3.37) and the USAGE clause (Section 5.3.52) for more information.
When an arithmetic or data-movement statement transfers data into an elementary item, the category of the item affects the way the data is positioned in storage. The COBOL Standard Alignment Rules (see Section 5.2.2) specify the relationship between category and positioning.
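A small sketch of how the receiving item's category affects positioning follows. It relies on the standard COBOL behavior for alphanumeric and numeric receiving items (left alignment with space fill or right truncation, and decimal-point alignment with zero fill). It is written in ANSI fixed reference format, and the data-names and values are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALIGN-SKETCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ALPHA-ITEM           PIC X(5).
       01  NUMERIC-ITEM         PIC 9(3)V99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Alphanumeric receiver: left-aligned, space-filled on the
      * right, so ALPHA-ITEM holds "AB" followed by three spaces.
           MOVE "AB" TO ALPHA-ITEM
           DISPLAY "[" ALPHA-ITEM "]"
      * A longer source is truncated on the right: "ABCDE" remains.
           MOVE "ABCDEFG" TO ALPHA-ITEM
           DISPLAY "[" ALPHA-ITEM "]"
      * Numeric receiver: aligned on the decimal point and
      * zero-filled, so the stored digits are 01250 (value 012.50).
           MOVE 12.5 TO NUMERIC-ITEM
           DISPLAY "[" NUMERIC-ITEM "]"
           STOP RUN.
```

How DISPLAY renders the numeric item (with or without a visible decimal point) can vary between compilers, so the comments describe the stored digits instead of the printed form.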
Depending on the symbols contained in its PICTURE clause, every elementary item belongs to one of the classes and categories of data items shown in Table 5-2. COMP-1, COMP-2, index data items, and index-names do not have PICTURE clauses; the format of these elementary items is specified by the compiler and they belong to the numeric category.
The class of a group item is treated as alphanumeric regardless of the class of elementary items subordinate to it. Therefore, your program statements should not specify a group item when a numeric item is expected or required.
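The following sketch shows one way to respect this rule when a group's characters also need to be used as a number. It is written in ANSI fixed reference format. The data-names are hypothetical, and the REDEFINES entry is simply one possible technique, not one this section prescribes.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GROUP-CLASS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PRICE-GROUP.
           05  PRICE-DOLLARS    PIC 9(3).
           05  PRICE-CENTS      PIC 99.
      * A numeric elementary view of the same five characters.
       01  PRICE-VALUE REDEFINES PRICE-GROUP PIC 9(3)V99.
       01  TOTAL-AMOUNT         PIC 9(5)V99 VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 12 TO PRICE-DOLLARS
           MOVE 50 TO PRICE-CENTS
      * ADD PRICE-GROUP TO TOTAL-AMOUNT is not valid, because
      * PRICE-GROUP is alphanumeric. Use a numeric item instead.
           ADD PRICE-VALUE TO TOTAL-AMOUNT
           DISPLAY TOTAL-AMOUNT
           STOP RUN.
```

Although both subordinate items are numeric, PRICE-GROUP itself is treated as a five-character alphanumeric item, so only PRICE-VALUE or the individual elementary items can serve as arithmetic operands.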
If the JUSTIFIED clause applies to the receiving item, the rules for the
JUSTIFIED clause override rule 3. See the JUSTIFIED clause (Section
5.3.28) for more information.
5.2.3 Additional Alignment Rules for Record Allocation
As stated in Section 5.2.2, the COBOL Standard Alignment Rules specify data positioning only within elementary data items. Compaq defines additional alignment rules that affect the positioning of:
Compaq COBOL offers the option of allocating subordinate record items along performance-optimal boundaries through the use of the alignment compiler option or directives (or the SYNCHRONIZED clause on OpenVMS Alpha). If you select one of these options, subordinate data items will be aligned automatically along optimal boundaries for their data type. The compiler may have to skip one or more bytes before assigning a location to the next data item. These skipped bytes, called fill bytes, are unused positions between one data item and the next. See the Compaq COBOL User Manual for information on using alignment compiler options and directives.
The presence of fill bytes can make a record's structure different from what you might expect. In particular, if a record contains many items requiring alignment, its size can increase significantly. If, unaware of the fill bytes, you tried to move a group item containing fill bytes to a single data item, right-end truncation would occur. You would not have this problem, however, if you moved the record into another identically defined group item. The method the compiler uses to allocate storage ensures that identically described group items have the same structure, even when their subordinate items are aligned on their required boundaries.
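The sketch below illustrates the two kinds of group move just described. It is written in ANSI fixed reference format. The data-names are hypothetical, and the code assumes that PIC 9(9) COMP occupies 4 bytes and that SYNCHRONIZED places it on a 4-byte boundary; the length actually displayed depends on the compiler and the alignment options in effect.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILL-SKETCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  REC-A.
           05  A-FLAG           PIC X.
           05  A-COUNT          PIC 9(9) COMP SYNCHRONIZED.
       01  REC-B.
           05  B-FLAG           PIC X.
           05  B-COUNT          PIC 9(9) COMP SYNCHRONIZED.
       01  FLAT-ITEM            PIC X(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "Y" TO A-FLAG
           MOVE 123456789 TO A-COUNT
      * With alignment in effect, fill bytes make REC-A longer
      * than the sum of its item sizes.
           DISPLAY "LENGTH OF REC-A: " FUNCTION LENGTH(REC-A)
      * Identically described groups share one layout, so this
      * group move carries A-COUNT over intact.
           MOVE REC-A TO REC-B
           DISPLAY "B-COUNT: " B-COUNT
      * FLAT-ITEM is sized from the item lengths alone (1 + 4,
      * an assumption), so this move can lose the right-hand
      * bytes of A-COUNT.
           MOVE REC-A TO FLAT-ITEM
           STOP RUN.
```

Sizing a receiving item from FUNCTION LENGTH of the group, or moving to an identically described group, avoids depending on a hand count of the bytes.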
Figure 5-3 shows alignment boundaries for a record. The boundary is the leftmost location of the 1-, 2-, 4-, or 8-byte area. All boundaries are relative to the beginning of the record as byte number 0.
Figure 5-3 Record Alignment Boundaries
The Compaq COBOL compiler allocates storage for data items within records according to the rules of the major-minor equivalence technique. The major-minor equivalence technique ensures that identically defined group items have the same structure, even when their subordinate items are aligned. Therefore, group moves always produce predictable results. This technique is based on the following two rules:
Location equivalence forces a group (major) item to the same storage location as its first subordinate (minor) item. This forced positioning occurs regardless of the boundary alignment of either the group or subordinate item.
See the Compaq COBOL User Manual chapter on aligning binary data for information on how location equivalence allocates storage.
The following example results in the major-minor location format:
01  ITEM-A.
    03  ITEM-B.
        05  ITEM-C      PIC 9(4) COMP SYNCHRONIZED.
    03  FILLER          PIC X.
    03  ITEM-D.
        05  ITEM-E      PIC 9(4) COMP SYNCHRONIZED.
    03  ITEM-F          PIC X.
The following example (omitting SYNCHRONIZED) results in the left-to-right location format:
01  ITEM-A.
    03  ITEM-B.
        05  ITEM-C      PIC 9(4) COMP.
    03  FILLER          PIC X.
    03  ITEM-D.
        05  ITEM-E      PIC 9(4) COMP.
    03  ITEM-F          PIC X.
Table 5-3 compares the major-minor technique of storage allocation with the left-to-right technique that assigns locations to a group item before its subsidiary items. Note that major-minor storage allocation adds a fill byte before ITEM-D. This forces location equivalence with ITEM-E, which is explicitly aligned by the SYNCHRONIZED clause.
The following diagram also shows the storage allocation for the record ITEM-A in Table 5-3 using both techniques. A hyphen (-) represents fill bytes caused by explicit alignment; an asterisk (*) represents the FILLER data item.
Regardless of the record allocation technique, an elementary move always produces the expected result. For example:
MOVE ITEM-C TO ITEM-E
A group move may produce an unexpected result, as in the following two situations:
Boundary equivalence forces a group item to a boundary determined by the alignment of its subordinate items.
Within a record, a group item aligns on a boundary as large as the forced alignment boundary of any data item that:
Refer to the Compaq COBOL User Manual chapter on alignment for more information about boundary equivalence.
Figure 5-4 shows how the compiler determines the boundary where each item begins when you specify the no-alignment compiler option.
Figure 5-4 Effect of Boundary and Location Equivalence Rules on Sample Record
Figure 5-5 Storage Allocation for Sample Record
Note the location of ITEM-D. Location equivalence requires only that it have the same location as ITEM-E, its first subordinate item. ITEM-E requires only 2-byte boundary alignment. However, another of ITEM-D's subordinate items, ITEM-F, contains ITEM-I, which must be aligned on a 4-byte boundary. Therefore, boundary equivalence forces ITEM-D to a 4-byte boundary as well, causing two fill bytes between ITEM-E and ITEM-F.
This example shows how boundary equivalence helps make group moves predictable:
01  ITEM-A.
    03  ITEM-B.
        05  ITEM-C      PIC X.
        05  ITEM-D      PIC 9(8) COMP SYNC.
    03  ITEM-E          PIC X.
    03  ITEM-F.
        05  ITEM-G      PIC X.
        05  ITEM-H      PIC 9(8) COMP SYNC.
    03  ITEM-I          PIC XX.
The descriptions of ITEM-B and ITEM-F are equivalent. Therefore, you would not expect the following sentence to change the values of ITEM-C and ITEM-D:
MOVE ITEM-B TO ITEM-F
MOVE ITEM-F TO ITEM-B.
Figure 5-6 shows how storage for the record would be allocated without and with boundary equivalence. A hyphen (-) indicates fill bytes caused by the SYNCHRONIZED clause. A plus sign (+) represents fill bytes resulting from boundary equivalence.
Figure 5-6 Storage Allocation Without and With Boundary Equivalence
Without boundary equivalence, ITEM-B occupies 8 bytes, and ITEM-F occupies 7 bytes. Moving the contents of ITEM-B to ITEM-F truncates the last byte of ITEM-D. Moving the contents of ITEM-F to ITEM-B pads the last byte of ITEM-D with a space character.
In contrast, boundary equivalence eliminates this unforeseen result. The elementary items occupy the same relative positions in each group. Therefore, the structures of ITEM-B and ITEM-F are the same, and the results of both group and elementary moves are predictable.
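One way to observe this on a given compiler is a round-trip check such as the sketch below, which reuses the ITEM-A record from this example. It is written in ANSI fixed reference format. The SAVED-D item, the test values, and the messages are additions made for illustration, and which message appears depends on the allocation rules the compiler applies.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ROUND-TRIP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ITEM-A.
           03  ITEM-B.
               05  ITEM-C       PIC X.
               05  ITEM-D       PIC 9(8) COMP SYNC.
           03  ITEM-E           PIC X.
           03  ITEM-F.
               05  ITEM-G       PIC X.
               05  ITEM-H       PIC 9(8) COMP SYNC.
           03  ITEM-I           PIC XX.
      * Hypothetical helper that remembers ITEM-D's original value.
       01  SAVED-D              PIC 9(8) COMP.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE "X" TO ITEM-C
           MOVE 12345678 TO ITEM-D
           MOVE ITEM-D TO SAVED-D
      * Copy the group out and back again.
           MOVE ITEM-B TO ITEM-F
           MOVE ITEM-F TO ITEM-B
      * If ITEM-B and ITEM-F have the same structure, the
      * round trip leaves ITEM-D unchanged.
           IF ITEM-D = SAVED-D
               DISPLAY "ITEM-D PRESERVED"
           ELSE
               DISPLAY "ITEM-D CHANGED"
           END-IF
           STOP RUN.
```

Under the boundary equivalence rule described above, ITEM-B and ITEM-F are both 8 bytes long, so the comparison should succeed. With the 8-byte and 7-byte layout described for the case without boundary equivalence, the last byte of ITEM-D would be replaced by a space and the values would differ.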
The following series of examples shows major-minor storage allocation. The notes after each example indicate its significant features. A hyphen (-) represents fill bytes.
WORKING-STORAGE SECTION.
01  ITEM-A.
    03  ITEM-B                  PIC X.
    03  ITEM-C.
        05  ITEM-D.
            07  ITEM-E          PIC 999 COMP SYNC.
            07  ITEM-F          PIC X(10).
        05  ITEM-G REDEFINES ITEM-D.
            07  ITEM-H          PIC 9(14) COMP SYNC.
            07  ITEM-I          PIC XXXX.
01  ITEM-J.
    03  ITEM-K.
        05  ITEM-L              PIC 999 COMP SYNC.
        05  ITEM-M              PIC X(10).
    03  ITEM-N REDEFINES ITEM-K.
        05  ITEM-O              PIC 9(14) COMP SYNC.
        05  ITEM-P              PIC XXXX.