Chapter 13. File I/O functions and commands for REXX.

This chapter will cover the different tools and techniques that a REXX programmer can use to read a file.  It primarily focuses on VM/ESA but looks at other platforms too (OS/390 and OS/2).  When appropriate, similarities and differences are highlighted.  Writing to files will be discussed in lesson 4.


Overview of methods for reading sequential files.

These are the functions or commands that can be used to access sequential files:

Function        Part of REXX   VM/ESA             OS/390   OS/2
EXECIO          no             yes                yes      no
LINEIN          yes            yes (footnote 1)   no       yes
LINEOUT         yes            yes (footnote 1)   no       yes
CHARIN          yes            yes (footnote 1)   no       yes
CHAROUT         yes            yes (footnote 1)   no       yes
CMS Pipelines   no             yes                no       no
XEDIT           no             yes                no       no
ISPF/PDF        no             (yes)              yes      no
CSL routines    no             yes                no       no

These are, in short, the major differences between the functions and commands:
EXECIO is not a REXX function or command (footnote 2), but an external utility to access files.  It is record oriented and allows sequential and direct access (CMS or GCS only) to file records.  It serves both input and output.
LINEIN is a REXX function.  It provides record-oriented read access to files.  VM/ESA allows direct access to a specific record.  OS/2 allows only sequential reading, or repositioning to the first record.
LINEOUT is a REXX function.  It provides record-oriented write access to files.  VM/ESA allows direct access to a specific record.  OS/2 allows only sequential writing.
CHARIN is a REXX function.  It provides character-oriented read access to a file.  Direct access to a specific byte in the file is possible, both on VM and OS/2.
CHAROUT is a REXX function.  It provides character-oriented write access to a file.  You can write or overwrite specific characters in the file.
CMS Pipelines is not a REXX function or command, but an external utility that can be used to read or write files.  Because CMS Pipelines has so many built-in features, it has become the preferred method for file I/O in VM/ESA.
XEDIT can also be used to read or write files from within a REXX procedure.
ISPF/PDF provides an editor (which can also be installed on a VM/ESA system) that lets you manipulate files in a way similar to XEDIT.
CSL provides routines callable from any language, including REXX.  This is the only way to fully exploit all SFS bells and whistles.

REXX level 4.00 (see Appendix D: REXX Versions for a complete list) provides extra functions that go along with some of the functions listed above.  These are:

Function        Part of REXX   VM/ESA             OS/390   OS/2
STREAM()        yes            yes (footnote 1)   no       yes
CHARS()         yes            yes (footnote 1)   no       yes
LINES()         yes            yes (footnote 1)   no       yes

This is what these functions provide:
STREAM() is a REXX function that lets you open or close a file, query it, or set a position in it.
CHARS() is a REXX function that, on OS/2, returns the number of characters remaining in the file between the current read position and the end of the file.  On VM/ESA, this function returns only 1 if there are still characters in the file after the current read position; otherwise it returns 0.
LINES() is a REXX function that, on OS/2, returns 1 if there are still records remaining in the file after the current read position, while on VM/ESA it returns the number of records remaining after the current read position.

Note: This is an example of the differences between host and PC systems.  OS/2 knows exactly how many characters (CHARS()) are in the file but has no idea how many records (LINES()) there are.  CMS, on the contrary, knows exactly how many records (LINES()) there are, but is ignorant about the number of characters.
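
As a small illustration of STREAM() and LINES() working together, here is a minimal sketch.  The file name is hypothetical, and the command strings ('QUERY EXISTS', 'OPEN READ', 'CLOSE') are the ones we know from OS/2; check the reference manual of your platform before relying on them.  Testing LINES() against 0 keeps the code valid on both OS/2 and VM/ESA, whatever the exact number returned.

```rexx
/* Sketch: test for a file, open it, look for data, close it          */
fileid='E:\MYDIR\SAMPLE.FIL'      /* hypothetical name; on VM/ESA for */
                                  /* example 'SAMPLE FILE E'          */
if stream(fileid,'C','QUERY EXISTS')='' then do
   say 'File' fileid 'not found'
   exit 28
end
state=stream(fileid,'C','OPEN READ')
if abbrev(state,'READY')=0 then do    /* success starts with READY    */
   say 'Cannot open' fileid': ' state
   exit 8
end
if lines(fileid)>0 then say 'There is at least one record to read'
call stream fileid,'C','CLOSE'
exit
```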

There are other functions or commands, mostly platform specific, that let you query or manipulate files.  The following list is not exhaustive, and we will not go into the details of these.

Operation                                           VM/ESA                                  OS/390 (TSO/E)                        OS/2
Split components of file names (e.g. path)          n/a (use PARSE)                         n/a (use PARSE)                       FILESPEC()
Query about drive characteristics (size, %use...)   QUERY DISK                              n/a                                   SysDriveInfo()
List accessed disks                                 QUERY ACCESSED                          n/a                                   SysDriveMap()
Delete a file                                       ERASE                                   DELETE                                SysFileDelete()
List files                                          LISTFILE                                LISTCAT                               SysFileTree()
Obtain file information                             LISTFILE, STATE, STREAM()               LISTDSI, SYSDSN                       SysFileTree(), STREAM()
Make a directory                                    CREATE DIRECTORY                        n/a                                   SysMkDir()
Remove a directory                                  ERASE                                   n/a                                   SysRmDir()
Query access order (path)                           Q ACCESSED                              n/a                                   SysSearchPath()
Generate a temporary file name                      n/a                                     n/a                                   SysTempFileName()
Search for data in a file                           CMS Pipelines, XEDIT, EXECIO, NAMEFIND  EXECIO, ISPF/PDF                      SysFileSearch()
Stack manipulation...                               MAKEBUF, DROPBUF                        MAKEBUF, DROPBUF, NEWSTACK, DELSTACK  RxQueue()
Closing a file explicitly                           FINIS, STREAM()                         EXECIO                                STREAM()
Detailed information about SFS objects              CSL calls                               n/a                                   n/a

Let's now compare the various alternatives for reading files.  This is just an overview, but we want to cover all alternatives to be complete.  Some are less useful, others are 'old-fashioned'.  In the next lesson, we will put more emphasis on the advantages and disadvantages of the different alternatives.

We will start with EXECIO as this was/is the most used method in CMS.  Then we'll see examples of the more modern File I/O solutions available in the latest releases of VM/ESA, such as LINEIN(), CHARIN() and CMS Pipelines.  Finally, we'll mention the more specialized solutions like XEDIT, CSL and NAMEFIND.


The EXECIO DISKR command

The EXECIO command was introduced in CMS with VM/SP Release 1 to give I/O capabilities to EXECs.  Our discussion will be brief, as this is probably not new to you.  This is the command format for reading a file:


                                      +-*--1----------------+
>>--EXECIO--+-lines-+--DISKR--fn--ft--+---------------------+-------->
            +-*-----+                 |         +-1-------+ |
                                      +-+-fm-+--+---------+-+
                                        +-*--+  +-linenum-+

>--+----------------------------------------+------------------|
   |                     (2)           (3)  |
   +-(-----+-------+--| Options |-----+---+-+
           +-FINIs-+                  +-)-+

Options:
                                           +-Zone--1--*---+
|--+-------------------------------------+-+--------------+-+------+-->
   |-+-FInd--/chars/---+--+------------+-| +-Zone--n1--n2-+ +-SKip-+
   | |-LOcate--/chars/-|  |-LIFO-------| |
   | +-Avoid--/chars/--+  |-FIFO-------| |
   |                      +-STEm--stem-+ |
   +-VAR--xxxx---------------------------+

   +-Margins--1--*---+
>--+-----------------+--+-------+--+--------+-------------------------|
   +-Margins--n1--n2-+  +-STRIP-+  +-NOTYPE-+

Most important to remember for now is that you can specify a starting record number (linenum parameter) and a number of records to read (lines parameter).  Also, the output can be placed either in the CMS stack or in a REXX variable or stem (array).
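
To make the two parameters concrete, the following sketch reads a block of ten records starting at record 21 into a stem.  The file name and the record numbers are only illustrations, and we assume here that return code 2 signals that end-of-file was reached before all requested records were read.

```rexx
/* VM/ESA: read records 21 through 30 of a file (sketch) */
fileid='SAMPLE FILE E'                 /* hypothetical file          */
'EXECIO 10 DISKR' fileid '21 (STEM REC. FINIS'
if rc<>0 & rc<>2 then exit rc          /* assumed: 2 = end of file   */
do i=1 to rec.0                        /* rec.0 = records really read */
   say rec.i
end
exit
```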

If linenum has a value of zero or is not specified, reading begins at the first line, or when the file was already open, at the current line pointer.  Let's give some examples.  Here we read a file, one record at a time:

  /* EXECIO: reading a file */
  fileid='SAMPLE FILE E'
  do forever
     'EXECIO 1 DISKR' fileid '(VAR RECORD'
     if rc<>0 then leave
     if left(record,1)='*' then iterate
     /* process record */
  end
  'FINIS' fileid /* do not forget to close the file */
  exit

The previous example isn't the fastest way to work.  Learn this general rule, valid on all systems:

Switching between environments is costly.  In general, you should ask for as much as is possible and useful in one statement.  So,

  1. Never read one byte or record at a time (unless you need only one), but rather the whole file at once if possible.
  2. Combine commands for an environment, for example:
    • 'CP SP PRT RSCS' '15'x 'TAG DEV PRT MYPRT'
      (do you remember the linends between CP commands?)
    • As REXX has no Change command, it is better to write a simple subroutine than to call XEDIT for just this one function, as the overhead of calling another environment is not negligible.
  3. Pass data directly rather than indirectly (see Chapter 11. "The EXECCOMM interface").
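
The simple subroutine mentioned in point 2 could look roughly like the sketch below.  The name change and its three arguments are our own choice for this illustration, not a built-in function; it replaces every occurrence of one string by another using only REXX built-in functions, so no other environment is called.

```rexx
/* Sketch: a simple CHANGE subroutine in plain REXX */
say change('MICKEY likes MICKEY','MICKEY','MOUSE')
exit

/* change: replace every occurrence of old in string by new      */
change: procedure
   parse arg string, old, new
   if old=='' then return string      /* nothing to search for   */
   out=''
   do forever
      p=pos(old,string)
      if p=0 then leave               /* no more occurrences     */
      out=out || left(string,p-1) || new
      string=substr(string,p+length(old))
   end
   return out || string               /* add the unchanged rest  */
```

The sample call displays MOUSE likes MOUSE.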

So, our first example can be improved, with measurable gains, if we read the file into an array:

  /* VM/ESA reading a file */
  fileid='SAMPLE FILE E'
  'EXECIO * DISKR' fileid '(STEM REC. FINIS'
  if rc<>0 then call errexit rc,'Problems reading file' fileid
  do i=1 by 1 to rec.0
     if left(rec.i,1)='*' then iterate
     /* process record */
  end
  exit

Question 21

The previous example can cause problems when handling large files, as the complete file has to be stored in a stem, hence in storage.  If your virtual storage isn't large enough to hold the file records, then you're stuck.

Would you in that case have to revert to line-by-line reading, or do you have a better solution?  Give a short example of the logic flow you would use in your solution.


The LINEIN() function.

The LINEIN() function is defined as part of the REXX SAA Level 2 set of functions.  It is available on VM/ESA, OS/2 and Object REXX.

Format

      LINEIN ( name , line , count )

returns count (0 or 1) lines read from the character input stream or file name.  The default count is 1.  If name is omitted, then the line will be read from the default input stream, the terminal input buffer (STDIN in OS/2).

For files, a read position is maintained for each file.  Any read from the file will by default start at the current read position.  When the read is completed the read position is increased by the number of characters read.

On VM/ESA, a line number may be given to set the read position to the start of a specified line.  This line number must be positive and within the bounds of the file.  A value of 1 for line refers to the first line in the file.

On OS/2 or Object REXX, the read position may be set to the beginning of the stream by giving line a value of 1, the only valid value for variable line on these platforms.

If count is 0, then the read position will be set to the start of the specified line but no characters will be read and the null string is returned.
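
A possible use of the line and count parameters on VM/ESA is sketched below: a first call only positions the file, and the following calls read from there.  The file name and the three-line header are assumptions made for the illustration, and the file must contain at least four records for the positioning call to be valid.

```rexx
/* VM/ESA only: skip a 3-line header, then read the rest (sketch) */
fileid='SAMPLE FILE E'              /* hypothetical file           */
call linein fileid,4,0              /* position at record 4,       */
                                    /* count 0: nothing is read    */
do i=1 by 1 while lines(fileid)>0
   rec.i=linein(fileid)
end
rec.0=i-1
call stream fileid,'C','CLOSE'
exit
```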

Let's have some examples.  We read a file into a stem:

  /* OS/2 reading a file */
  fileid='E:\MYDIR\SAMPLE.FIL'
  do i=1 by 1 while lines(fileid)>0
     rec.i=linein(fileid)
  end
  rec.0=i-1
  call stream fileid,'C','CLOSE'
  ...
  exit

We'll come back to the LINES() and STREAM() functions later.

For now, remember that on personal systems, LINES() returns a value 1 as long as there are records in the file (hence we didn't reach the end of the file), and becomes 0 once we reached the end of the file.

On VM/ESA, however, LINES() returns the actual number of records left in the file.  This is of course due to the differences between record-oriented host systems and byte-oriented personal systems.

This means that on VM/ESA we can avoid calling the LINES() function at each iteration, but then we lose portability...

Compare these:

  /* Reading a file, general solution */  | /* Reading a file, CMS only */
  fileid='SAMPLE FILE E'                  | fileid='SAMPLE FILE E'
                                          | rec.0=lines(fileid)
  do i=1 by 1 while lines(fileid)>0       | do i=1 to rec.0
     rec.i=linein(fileid)                 |    rec.i=linein(fileid)
  end                                     | end
  rec.0=i-1                               |
  call stream fileid,'C','CLOSE'          | call stream fileid,'C','CLOSE'
  ...                                     |
  exit                                    | exit

Finally, note that on personal systems LINEIN() is meant to be used only for text files, that is, files in which records are separated from each other by a CRLF combination.  The separator does not appear in the result of the function.


The CHARIN() function.

Format

     CHARIN ( name , start , length )

returns a string of up to length single-byte characters read from the character input stream name.

We mention the CHARIN() function for completeness, but will discuss its details only after we have explained stream I/O fully in the next lesson.
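
Just to give a first impression, this sketch reads a complete file with a single call on OS/2.  It relies on CHARS() returning the number of characters left to read, so it is not suitable for VM/ESA, where CHARS() returns only 0 or 1.  The file name is hypothetical.

```rexx
/* OS/2: read a whole file in one call (sketch) */
fileid='E:\MYDIR\SAMPLE.FIL'           /* hypothetical file          */
data=charin(fileid,1,chars(fileid))    /* chars() = bytes left       */
call stream fileid,'C','CLOSE'
say 'Read' length(data) 'bytes'        /* CRLF separators included   */
exit
```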


CMS Pipelines

CMS Pipelines deserves a course of its own, and explaining everything here would take us too far, but we need to include at least one simple example to be complete on the subject of file I/O.

If you read the comments in the procedure below, you should be able to follow roughly what it does.  The solution is equivalent to the one using XEDIT later in the text, but does not use the stack.

  /* An exec to read a -preprocessed- file via CMS Pipelines */
  address command
  'PIPE',                                  /* Start the pipeline */
    '< MY DATA A',                  /* Read the file in the pipe */
    '|CHANGE 10-20 /MICKEY/MOUSE/',   /* Change in cols 10 to 20 */
    '|NFIND *'||,                    /* remove all comment cards */
    '|SORT 1-5',                       /* sort remaining records */
    '|STEM RECORD.'        /* put all selected lines in an array */
  if rc<>0 then exit rc        /* the pipeline reported an error */
  do i=1 to record.0          /* Process the records of the stem */
     /* process the records */
  end

If CMS Pipelines are still brand new for you, a few more words of explanation may be required.


Reading a file via XEDIT

Before we had CMS Pipelines and EXECIO, an editor was the tool to work with files.

Normally, the use of the CMS stack is needed in this case.

XEDIT has many subcommands that act on the data, and many procedure writers are familiar with XEDIT.  In the next lesson, we will list the cases where XEDIT still proves to be an excellent tool.

A first very classic example is one where we modify a file with XEDIT:

  /* An exec to change some string in a file */
  address command
  'MAKEBUF'         /* be sure our commands are first in the stack */
  queue 'ZONE 10 20'          /* change only columns 10 through 20 */
  queue 'CHANGE /MICKEY/MOUSE/ *'
  queue 'COMMAND FILE'
  'XEDIT MY DATA A (NOPROF NOMSG'      /* avoid end-user's PROFILE */
  savrc=rc
  'DROPBUF'                                  /* remove our MAKEBUF */
  exit savrc

This example updates the file directly, but it is also possible to pass the file records to the REXX procedure.  In the next case, we apply some changes to the contents of the records, remove the comment records (the ones starting with an asterisk) and pass the remaining records to the REXX procedure through the stack:

  /* An exec to read a -preprocessed- file via XEDIT */
  address command
  'MAKEBUF'                      /* be sure our commands are first */
  oldq=queued()                /* how many lines stacked already ? */
  queue 'ZONE 10 20'          /* change only columns 10 through 20 */
  queue 'CHANGE /MICKEY/MOUSE/ *'
  queue 'ZONE 1 1'                  /* ALL must only look in col 1 */
  queue 'ALL ^/*/'                     /* remove all comment cards */
  queue 'SORT * 1 5'                     /* sort remaining records */
  queue 'STACK *'         /* place all selected lines in the stack */
  queue 'COMMAND QUIT'           /* do not update the file on disk */
  'XEDIT MY DATA A (NOPROF NOMSG'/* avoid end-user's PROFILE & MSG */
  savrc=rc
  do queued()-oldq                /* Process only data of the file */
     parse pull record
     /* process the records */
  end
  'DROPBUF'                                  /* remove our MAKEBUF */
  exit savrc

This is a fine, and even well-performing, method to read a file, but:

  1. we must use the stack
  2. we cannot test return codes of the stacked XEDIT commands (you can for example not know if ALL actually excludes some lines).  This problem can only be solved by using an XEDIT macro.
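
To illustrate the second point, here is a sketch of how the same subcommands could be placed in an XEDIT macro, where each return code can be tested.  The macro name PREPROC XEDIT is hypothetical; the calling exec would then queue only the macro name instead of the individual subcommands.  We deliberately do not interpret specific return code values here, as these depend on the subcommand.

```rexx
/* PREPROC XEDIT (hypothetical name): same steps, but as a macro  */
'ZONE 10 20'                    /* change only columns 10 to 20   */
'CHANGE /MICKEY/MOUSE/ *'
chgrc=rc                        /* remember the result of CHANGE  */
'ZONE 1 1'                      /* ALL must only look in col 1    */
'ALL ^/*/'                      /* remove all comment cards       */
allrc=rc                        /* now the rc of ALL is available */
if allrc<>0 then do
   say 'ALL ended with rc='allrc '(CHANGE gave rc='chgrc')'
   'COMMAND QUIT'               /* leave without stacking data    */
   exit allrc
end
'SORT * 1 5'                    /* sort remaining records         */
'STACK *'                       /* place the lines in the stack   */
'COMMAND QUIT'                  /* do not update the file on disk */
exit 0
```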

Anyway, XEDIT is used less for this purpose nowadays, as CMS Pipelines lets you perform similar functions, as the example in the previous topic showed.


File reading through CSL.

CSL (the Callable Services Library) was introduced in VM/SP Release 6.  In short, the callable services provide a kind of API (Application Programming Interface) to several functions or environments in CMS (such as SFS, APPC/VM or CMS Multitasking).

To be complete, we include an example using the CSL interface, but we don't expect you to use it often in REXX.  It is, however, the only method by which you can fully exploit the possibilities of SFS, and it proves most useful for other programming languages, such as PL/I or COBOL.  Applications could be prototyped in REXX first.

In our example we will read a file, but ask SFS to leave the file's last reference date unchanged.  It is possible to specify a directory name instead of a filemode, so that there is no need to access the directory to work with SFS files.

 parse upper arg fid
 if fid='' then fid='TC NAMES IECSYSU:FSCIPOE.TC.TCVM1'

 parse value 'READ OLDDATEREF;COMMIT  1   200 0    0',
       with   OpenType      ';'commit one twh retc reason token

 parse value length(fid) length(OpenType) length(commit),
       with  l_fid       l_OpenType       l_commit

 call csl 'DMSOPEN retc reason fid l_fid OpenType l_OpenType token'
 if retc>4 then do
    if reason=44000 then call errexit 28,'File not found'
    call errexit retc,'----> DMSOPEN error: retc='retc '; reason='reason
 end

 do i=1 until retc^=0
    call csl 'DMSREAD retc reason token one twh buffer twh lrecl'
    if retc>4 then call errexit retc,,
                   '----> DMSREAD error: retc='retc '; reason='reason
    if reason=90103 then leave i /* end-of-file */
    record.i=left(buffer,lrecl)
 end i
 record.0=i-1
 call csl 'DMSCLOSE retc reason token commit l_commit'

You can see why reading with CSL routines will not become popular very soon, though sometimes you may need to use it.  We won't explain all the CSL possibilities, but simply "walk" through the exec.

 call csl 'DMSOPEN retc reason fid l_fid OpenType l_OpenType token'

In REXX, all CSL routines are called by call csl, with the routine name followed by its parameters.  All parameters used here are REXX variables.  You also see that you have to pass the lengths of the parameters.  The reason is that other languages require explicit length definitions for their parameters.

 call csl 'DMSOPEN retc reason  ...

retc is the variable in which the CSL routine's return code is saved.  Negative values mean there is an error in the parameters; some of these are documented only in the REXX manuals.  reason is the variable that receives the reason code, which explains the actual cause of the failure and complements the return code, which gives only a general indication, such as 4 for a warning or 8 for a failure.

 call csl 'DMSOPEN ..... fid l_fid OpenType l_OpenType token'

Here we pass the fileid and the type of OPEN we want to perform.  The variable token receives a unique token (or file handle) that the routine returns, and that token then has to be used in subsequent CSL calls handling the same file (such as DMSREAD and DMSCLOSE).

 call csl 'DMSREAD retc reason token one twh buffer twh lrecl'

Variable one is input and contains the number of records to read (must be 1 for variable files, but can be a greater value for fixed record length files).
Variable twh is the number of bytes we want (important if more than 1 record is requested).
Variable buffer is where the record(s) will be stored; it is followed by its length (200 in our case).
Variable lrecl will contain the number of bytes actually read.

One can see that this CSL interface looks much like the traditional way of performing I/O on OS/390 or VSE/ESA.

Note: For more information on other parameters, just issue the command HELP ROUTINE DMSxxxx.

You should at least remember this about the CSL interface:

  1. CSL services can be called from REXX, but also from any other language in VM/CMS (Assembler, Cobol, PL/I, ...).
  2. CSL is the way by which new CMS API functions are implemented to make them available to languages other than just Assembler.
  3. Some return codes are only documented in the REXX Reference Guide.


Reading files via NAMEFIND

NAMEFIND is a very specialized command.  It can only be used to read files that contain so-called tagged data.  A good example is your NAMES file, which contains nicknames for your VM correspondents.  Using "tagged data files" can also be a useful technique to store control information for your applications.  The filetype of such a file must be NAMES.

For example, the file TC NAMES looks like this:

:nick.TCVM1    :type.C  :name.Advanced REXX

:nick.DECEULAE :userid.DECEULAE :type.I :course.REXX
               :name.Guy De Ceulaer
:nick.BUELENSC :userid.BUELENSC :type.I :course.REXX
               :name.Kris Buelensc
:nick.ANDREAS  :userid.ANDREAS  :type.S :course.REXX
               :name.Andreas Baader

Each entry starts with a :nick. tag and can span more than one source line.  Each entry has fields identified by other tags (for example the :name. tag); a field runs from one tag to the next, and trailing blanks are ignored.
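
The way a field runs from its tag to the next tag can be shown with a few lines of plain REXX.  This is only a sketch to clarify the file format, not a replacement for NAMEFIND: the function name tagvalue is ours, the entry is assumed to be already joined into one string, and a field value that itself contains a colon would be cut short.

```rexx
/* Sketch: extract one tagged field from a NAMES entry */
entry=':nick.DECEULAE :userid.DECEULAE :type.I :course.REXX',
      ':name.Guy De Ceulaer'
say '<'tagvalue(entry,'name')'>'
say '<'tagvalue(entry,'type')'>'
exit

/* tagvalue: return the value of :tag. in entry, or '' if absent */
tagvalue: procedure
   parse arg entry, tag
   key=':'tag'.'                    /* for example  :name.       */
   parse var entry (key) val ':' .  /* up to the next tag        */
   return strip(val)                /* ignore surrounding blanks */
```

The two sample calls display <Guy De Ceulaer> and <I>.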

This particular file lists different users, some of which have a type I (for instructors) and others a type S (for students).  We had such a file when we ran the course in the past.  A procedure could then extract the instructors, as here:

 'NAMEFIND :type I :course REXX :userid :name (FILE TC STEM NF. *'
 if rc<>0 then call errexit rc ,'Problem: no instructors found for',
               'course REXX in TC NAMES'
 do i=1 by 2 to nf.0
    'CP SP D TO' nf.i
    'PIPE Literal' userid() 'has ended lesson' lesson,
        '|Literal Message to' value('nf.'i+1),
        '|Punch'
    call diag 8,'SP D CLOSE NAME TcDone' lesson
 end
 call diag 8,'SP D OFF NONAME'
return

We specify :type I :course REXX as a complex search key (both criteria have to be true):

 'NAMEFIND :type I :course REXX ...

Next, we specify which fields we want to get back; hence we give the :userid and :name tags as parameters:

 'NAMEFIND .....  :userid :name ...

We give the filename of the NAMES file to use as the first option:

 'NAMEFIND .....  (FILE TC ...

We want the results in a REXX stem:

 'NAMEFIND .....  ( ... STEM NF. ...'

and we want ALL matching records, not just the first one:

 'NAMEFIND .....  ( ... *'

As NAMEFIND is only for specifically formatted files, we won't go into this subject any further.  Just remember that it can be useful whenever your exec needs some "side" information in which it should be able to search for keywords.  Any NAMES file can be maintained with the NAMES command, which presents a full-screen panel (issue for example the command names (file anyname).

We interrupt the discussion on File I/O here.  The next lesson will go deeper into other aspects of this matter.  But we consider this a good moment to put what we have explained into practice.


Footnotes:

(1) This function was added to REXX in VM/ESA Version 1, Release 2.1.

(2) On VSE/ESA and MVS/ESA, EXECIO is delivered along with REXX, but it is not considered part of the REXX language itself.