https://colinpaice.blog/2022/08/06/there-are-many-ways-to-fail-to-read-a-file-in-a-c-program/

There are many ways to fail to read a file in a C program.

Colin Paice | C, Z/OS | August 6, 2022 | August 12, 2022 | 8 Minutes

The 10 minute task to read a file from disk using a C program took a
week.

There are several options you can specify on the C fopen() function
and it was hard to find the one that worked. I basically tried all
combinations. Once I got it working, I tried on other files, and
these failed, so my little task turned into a big piece of research.

Depending on the fopen options and the use of fgets or fread I got
different data when getting from a dataset with two records in it!

  * AAAA - what I wanted.
  * AAAAx'15' - with a hex 15 on the end.
  * AA
  * AAAAx'15'BBBBx'15'- both the records, with each record terminated
    in a line-end x'15'.
  * AAAABBBB - both records with no line ends.
  * x'00140000' x'00080000'AAAAx'00080000'BBBB.
      + This has a Block Descriptor Word (BDW) (x'00140000') saying
        the length of the block is x'14'.
      + There is a Record Descriptor Word (RDW) (x'00080000') giving
        a record length of x'0008', which includes the 4 bytes for
        the RDW.
      + There is the data for the first record, AAAA.
      + A second RDW.
      + The data for the second record, BBBB.
      + The size of this block is 4 for the BDW, 4 for the first
        RDW, 4 for the AAAA, 4 for the second RDW, and 4 for the
        BBBB = 20 bytes = x'14'.
  * Nothing!
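
The BDW/RDW layout in the list above can be walked with a few lines of C. This is a minimal sketch, assuming the simple format shown (a 2-byte big-endian length followed by 2 bytes of zeros in each descriptor word); the function names be16 and walk_vb_block are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read a 2-byte big-endian length, as held in the first half of a BDW or RDW. */
unsigned int be16(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | p[1];
}

/*
 * Walk one variable-blocked block held in memory.
 * Prints the data length of each record and returns the number of records,
 * or -1 if the lengths are inconsistent.
 */
int walk_vb_block(const unsigned char *block, size_t avail)
{
    assert(block != NULL);
    if (avail < 4) return -1;

    size_t blockLen = be16(block);          /* BDW length includes its own 4 bytes */
    if (blockLen < 4 || blockLen > avail) return -1;

    size_t pos = 4;                         /* skip the BDW */
    int records = 0;
    while (pos < blockLen) {
        if (blockLen - pos < 4) return -1;  /* no room for an RDW */
        size_t recLen = be16(block + pos);  /* RDW length includes its own 4 bytes */
        if (recLen < 4 || recLen > blockLen - pos) return -1;
        printf("record %d: %lu data bytes\n",
               records + 1, (unsigned long)(recLen - 4));
        pos += recLen;
        records++;
    }
    return records;
}
```

Given the 20 bytes from the example it should report two records of 4 data bytes each.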

Sections in this blog post

  * There are different sorts of data
  * Different ways of getting the data
  * Introduction to fopen()
  * The file name
  * The options
  * How much data is read at a time?
  * Reading the data
  * You can use the fread() function
  * You can use the fgets function
  * Results from reading data sets and files
  * To read a dataset
  * Read a normal(EBCDIC) OMVS file
  * Read an ASCII file in OMVS
  * To read a binary file
  * Read a binary normal(EBCDIC) OMVS file
  * Read a binary ASCII file in OMVS
  * Reading data sets from a shell script
  * How big a buffer do I need?
  * How can I tell the size of the buffer I need?

There are different sorts of data

  * z/OS uses data sets which have records. Records can be blocked
    (for better disk utilisation). It is more efficient to write one
    block containing 40 records of 80 bytes than to write 40 blocks
    each of 80 bytes. You can treat the data as one big block (of
    size 3200 bytes), as 40 blocks of 80, or as 2 blocks of 1600,
    and so on. With good blocking you can get 10 times the amount of
    data onto a disk. (If you imagine each block written to disk
    needs a 650 byte header you will see why blocking is good.)
  * You can have "text" files, which contain printable characters.
    They often use the new line character (x'15') to indicate end of
    line.
  * You can have binary files which should be treated as a blob. In
    these, characters like x'15' do not represent a new line. For
    example it might just be part of a sequence x'13141516'.
  * Unix (and so OMVS) uses byte-addressable storage.
      + "Normal files" have data in EBCDIC
      + You can use enhanced ASCII, where you can have files with
        data in ASCII (or another code page). The file metadata
        contains the type of data (binary or text) and the code page
        of the text data.
      + With the environment variable _BPXK_AUTOCVT you can have
        automatic conversion, which converts a text file in ASCII to
        EBCDIC when it is read.

From this you can see that you need to use the correct way of
accessing the data (records or byte addressable), and of using the
data (text or binary).

Different ways of getting the data

You can use

  * the fread function which reads the data, according to the file
    open options (binary|text, blocked)
  * the fgets function which returns text data (often terminated by a
    line end character).

Introduction to fopen()

The fopen() function takes a file name and information on how the
file should be opened and returns a handle. The handle can be used in
an fread() function or to provide information about the file. The C
runtime reference gives the syntax of fopen(), and there is a lot of
information in the C Programming guide:

  * Introduction to C and C++ input and output
  * Types of C and C++ input and output
  * Opening files

There is a lot of good information - but it didn't help me open a
file and read the contents.

The file name

The name can be, for example:

  * "DD:SYSIN" - a reference to a data set via JCL
  * "//'USER.PROCLIB(MYPROC)'" - a direct reference to a dataset
  * "/etc/profile" - an OMVS file.

If you are using the Unix shell you need to use "//'datasetname'"
with both sets of quotes.
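
A small helper can take care of the two sets of quotes. This is only a sketch: make_dataset_path is a made-up name, and the 44 character limit is the usual maximum length of a z/OS data set name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "//'DSNAME'" from a fully qualified data set name.
 * Returns 0 on success, -1 if the name is empty, longer than 44
 * characters, or the result does not fit in out.
 */
int make_dataset_path(char *out, size_t outSize, const char *dsname)
{
    assert(out != NULL && dsname != NULL);
    size_t len = strlen(dsname);
    if (len == 0 || len > 44) {
        return -1;
    }
    int n = snprintf(out, outSize, "//'%s'", dsname);
    return (n < 0 || (size_t)n >= outSize) ? -1 : 0;
}
```

For example make_dataset_path(path, sizeof path, "COLIN.VB") leaves //'COLIN.VB' in path, ready to pass to fopen().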

The options

The options string has one or more parameters.

The first parameter defines the operation: read, write, read+write
etc. Any other parameters are in the format keyword=value.

The read operation can be

  * "rb" to read a binary file. In a binary file, the data is treated
    as a blob.
  * "rt" to read a text file. A text file can have auto conversion;
    see the FILETAG C run time option.
  * "r" to read without saying whether the file is binary or text.

The type can be

  * type=record - read a record of data from the disk, and give the
    record to the application.
  * type=blocked. It is more efficient, in disk storage terms, to
    build up 10 * 80 byte records into one 800 byte block (or even
    3200 byte blocks). When a blocked data set is read, the read
    request returns the first N bytes, then the next N bytes. When
    the end of the block is reached it will retrieve the next block
    from disk.
  * not specified, which implies byte access, rather than record
    access, and use of the fgets function.

An example fopen

FILE * f2 = fopen("//'COLIN.VB'","rt type=blocked") ;
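
Whichever options you pick, it is worth checking the handle that fopen() returns before using it. The following is a minimal sketch of such a check; open_for_read is a hypothetical helper, not part of the C runtime.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Open a file for reading and say why the open failed, if it did.
 * Returns the handle, or NULL after writing a message to stderr.
 */
FILE *open_for_read(const char *name, const char *mode)
{
    assert(name != NULL && mode != NULL);
    errno = 0;
    FILE *f = fopen(name, mode);
    if (f == NULL) {
        fprintf(stderr, "fopen(\"%s\", \"%s\") failed: %s\n",
                name, mode, strerror(errno));
    }
    return f;
}
```

On z/OS it could be called as open_for_read("//'COLIN.VB'", "rb,type=record"), using the name and options from this post.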

How much data is read at a time?

Depending on the type of data, and the open options a read request
may return data

  * up to the end of the record
  * up to and including the new-line character
  * up to the size of the buffer you give it.

If you do not give it a big enough buffer you may have to issue
several reads to get the whole logical record.

The system may also "compress" data, for example remove trailing
blanks from the end of a line. If you were expecting an 80 byte
record - you might only get 20 bytes - so you may need to check this,
and fill in the missing blanks if needed.
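
Filling in the missing blanks could look like the sketch below. It assumes the caller's buffer has room for the full record length; pad_record is a made-up name. Note that '\n' is whatever the compiler uses for new-line (x'15' when compiled as EBCDIC, x'0A' when compiled as ASCII).

```c
#include <assert.h>
#include <string.h>

/*
 * Pad a record in place with trailing blanks up to recordLen bytes.
 * dataLen is the number of bytes the read actually returned.
 * A trailing new-line, if present, is dropped first.
 * Returns the number of blanks written.
 */
size_t pad_record(char *buf, size_t dataLen, size_t recordLen)
{
    assert(buf != NULL && dataLen <= recordLen);
    if (dataLen > 0 && buf[dataLen - 1] == '\n') {
        dataLen--;                       /* overwrite the line end with a blank */
    }
    memset(buf + dataLen, ' ', recordLen - dataLen);
    return recordLen - dataLen;
}
```

The result is not null terminated; it is a fixed length record of recordLen bytes.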

Reading the data

You can use the fread() function

The fread() parameters are:

  * the address of a buffer to hold the data
  * the unit of the buffer block size
  * the number of buffer blocks
  * the file handle.

It returns the number of complete buffer blocks read.

It is usually used

    #define lBuffer 1024

    char buffer[lBuffer];
    size_t len = fread(buffer, 1 ,lBuffer ,fHandle );

This says the unit of the buffer is 1 character, and there are 1024
of them.

If the record returned was 80 bytes long, then len would be 80.

If it was written as

    size_t len = fread(buffer, lBuffer, 1 ,fHandle );

This says the unit of buffer is 1024 - and there is one of them. The
returned "length" is 0, as there were no full 1024 size buffers used.
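
The difference between the two forms can be seen with a small experiment. This sketch uses a temporary byte stream file rather than a data set, so it only illustrates how the return value is counted; fread_result is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write dataLen bytes to a temporary file, rewind, and read them back
 * with fread(buffer, unit, count, ...). Returns what fread returned,
 * or (size_t)-1 if the temporary file could not be set up.
 */
size_t fread_result(size_t dataLen, size_t unit, size_t count)
{
    char data[4096];
    char buffer[4096];
    assert(dataLen <= sizeof data && unit * count <= sizeof buffer);

    FILE *f = tmpfile();
    if (f == NULL) return (size_t)-1;

    memset(data, 'A', dataLen);
    if (fwrite(data, 1, dataLen, f) != dataLen) {
        fclose(f);
        return (size_t)-1;
    }
    rewind(f);

    size_t n = fread(buffer, unit, count, f);   /* number of complete units */
    fclose(f);
    return n;
}
```

With 80 bytes in the file, fread_result(80, 1, 1024) gives 80 and fread_result(80, 1024, 1) gives 0, even though the same 80 bytes were placed in the buffer both times.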

You can use the fgets function.

The fgets() parameters are:

  * the address of a buffer to hold the data
  * the size of the buffer
  * the file handle.

This returns a null terminated string in the buffer. You can use
strlen(buffer) to get the length of it. An empty string would have
length 0.
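
One thing to watch is that fgets() returns NULL at end of file (or on error), and the buffer should not be used in that case. A minimal sketch of a wrapper that checks this and removes the trailing new-line follows; read_line is a made-up name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one line with fgets and remove the trailing new-line, if any.
 * Returns the length of the text left in buf, or -1 at end of file
 * or on error.
 */
long read_line(FILE *f, char *buf, int bufSize)
{
    assert(f != NULL && buf != NULL && bufSize > 1);
    if (fgets(buf, bufSize, f) == NULL) {
        return -1;               /* end of file or error: buf is not usable */
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';       /* drop the line end */
    }
    return (long)len;
}
```

A loop such as while (read_line(f, line, sizeof line) >= 0) then processes one line per pass, provided the buffer is big enough for the longest line.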

Results from reading data sets and files

I did many tests with different options, and configurations, and the
results are displayed below.


The following are used in the tables:

  * Run.
      + job. The program was run in JCL using BPXBATSL or via PGM=...
      + unix shell - the program was run in OMVS shell.
  * BPXAUTO. This is used in OMVS shell. The environment variable
    _BPXK_AUTOCVT can have
      + "ON" - do automatic conversion from ASCII to EBCDIC where
        applicable
      + "OFF" - do not do automatic conversion from ASCII to EBCDIC
  * open. The options used.
      + r read
      + rb read binary
      + rt read text
      + "t=r" is shorthand for type=record
  * read. The C function used to read the data.
      + fread - this is the common read function. It can read text
        files and binary files
      + fgets - this is used to get text from a file. The records
        usually have the new-line character at the end of the data
      + aread is a C program which uses fread - but the program is
        compiled with the ASCII option.
  * data
      + E is the EBCDIC data I was expecting. For example reading
        from a FB dataset member gave me an 80 byte record with the
        data in it.
      + E15 is EBCDIC data, but trailing blanks have been removed.
        The last character is an EBCDIC line end (x'15').
      + A0A. The data is returned in ASCII, and has a trailing line
        end character (x'0A'). You can convert this data from ASCII
        to EBCDIC using the C run time function __a2e_l(buffer,len).
      + Abuffer - data up to the size of the buffer ( or size -1 for
        fgets) the data in ASCII.
      + Ebuffer - data up to the size of the buffer ( or size -1 for
        fgets) the data in EBCDIC. This data may have newlines within
        the buffer

To read a dataset

This worked with a data set and inline data within a JOB.

Run        BPXAUTO open   read  data
Job        NA      rb,t=r fread E
                   r|rt   fread Ebuffer
                   r      fgets E15
                   rt     fgets E15
Unix shell Any     rb,t=r fread E
                   r|rt   fread Ebuffer
                   r      fgets E15
                   rt     fgets E15

Read a normal(EBCDIC) OMVS file

Run        BPXAUTO open   read  data
Job        off     rb,t=r fread E
           off     rt     fgets E15
           off     *      fread Ebuffer
Unix shell off     r      fgets E15
           off     rt     fgets E15
           off     rb     fgets E15
           off     *      fread E15
           on      rb,t=r fread E
           on      rb     fgets E15

Read an ASCII file in OMVS

Run        BPXAUTO open read  data
Job        off     r    agets A0A
           off     rb   agets A0A
           off     rt   agets A0A
Unix shell on      r    fgets E15
           on      rb   fgets E15
           on      rt   fgets E15
           on      *    fread buffer
           off     r    agets A0A
           off     rb   agets A0A
           off     rt   agets A0A
           off     *    fread Abuffer

To read a binary file

To read a data set

Run        BPXAUTO open read  data
Job        off     rb   fread E
Unix shell on      rb   fread E

Read a binary normal(EBCDIC) OMVS file

Run        BPXAUTO open read  data
Job        off     rb   fread E
Unix shell on      rb   fread E
           off     rb   fread E

Read a binary ASCII file in OMVS

If you list the attributes of a binary file, for example "ls -T
ecrsa.p12" it gives

    b binary T=off ecrsa1024.p12

which shows it is a binary file.

Run        BPXAUTO open   read  data
Job        off     rb,t=r fread E
Unix shell on      rb,t=r fread E
           off     rb,t=r fread E

Reading data sets from a shell script

If you try to use fopen("DD:xxxx"...) from a shell script you will get

    FOPEN: EDC5129I No such file or directory. (errno2=0x05620062)

If you use fopen("//'COLIN.VB'"...) and specify a fully qualified
dataset name it will work.

fopen("//VB"..) will put the RACF userid in front of the name. For
example it will attempt to open "//'COLIN.VB'".

How big a buffer do I need?

If the buffer is not large enough to hold all of the data in the
record, the buffer will be filled up to its size. For example using
fgets and a 20 byte buffer, 19 characters were returned. The 20th
character was a null. The "new-line" character was present only when
the end of the record was reached.
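
This behaviour means a long record can be collected with several fgets() calls, stopping when the new-line shows up. The sketch below uses a 20 byte buffer to match the example; count_fgets_calls is a hypothetical helper that only counts, where a real program would append each piece to a larger area.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one logical line using a deliberately small fgets buffer.
 * Stores the total number of data characters (without the new-line)
 * in *total and returns the number of fgets calls that returned data
 * (0 at end of file).
 */
int count_fgets_calls(FILE *f, size_t *total)
{
    char piece[20];              /* at most 19 characters plus the null */
    int calls = 0;
    assert(f != NULL && total != NULL);
    *total = 0;
    while (fgets(piece, sizeof piece, f) != NULL) {
        size_t len = strlen(piece);
        calls++;
        if (len > 0 && piece[len - 1] == '\n') {
            *total += len - 1;   /* end of record reached */
            break;
        }
        *total += len;           /* buffer filled; more of the record follows */
    }
    return calls;
}
```

A 30 character line needs two calls: the first returns 19 characters, the second returns the remaining 11 and the new-line.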

How can I tell the size of the buffer I need?

There is data available which helps - but does not give the whole
picture.

With the C fldata() -- Retrieve file information function you can get
information from the file handle such as

  * the name of the dataset (as provided at open time)
  * record format - fixed, variable
  * dataset organisation (dsorg) - Partitioned, Temporary, HFS
  * open mode - text, binary, record
  * device type - disk, printer, hfs, terminal
  * blksize
  * maximum record length

With fstat(), fstat64() -- Get status information from a file handle 
and lstat(), lstat64() -- Get status of file name or symbolic link you
can get information about an OMVS file name (or file handle). This
has the information available in the OMVS "ls" command. For example

  * file serial number
  * owner userid
  * group userid
  * size of the file
  * time last access
  * time last modified
  * time last file status changed
  * file tag
      + ccsid, value or 0x0000 for untagged, or 0xffff for binary
      + pure text flag.

Example output

For data sets and files (along the top of the table)

  * VB a sequential data set Variable Blocked
  * FB a member of user.proclib which is Fixed Block
  * SYSIN inline data in a job
  * Loadlib a library containing load modules (binary file)
  * OMVS file for example zos.c
  * ASCII for example z.py which has been tagged as ASCII

Other values

  * V variable format data
  * F fixed format data
  * Blk it has blocked records
  * B "Indicates whether it was allocated with blocked records". I do
    not know the difference between Blk and B
  * U Undefined format
  * PS It is a Physical Sequential
  * PO It is partitioned - it has members
  * PDSE it is a PDSE
  * HFS it is a file from OMVS

                 VB      FB       SYSIN   Loadlib       OMVS     ASCII
                                                        file
fldata recfm     V Blk B F Blk B  F Blk B U             U        U
       dsorg     PS      PO       PS      PO PDSMem     HFS      HFS
                         PDSMem           PDSE
       device    disk    disk     other   disk          HFS      HFS
       blocksize 6144    610      80      6400          0        0
       maxreclen 1024    80       80      6400          1024     1024
stat   ccsid     0       0        0       0             0        819
       file size 0       0        0       0             4780     824

To read a record, you should use the maxreclen size. This may not be
the size of the data record but it is the best guess.
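
Picking the buffer size from maxreclen might be sketched as follows. The z/OS part relies on my understanding of fldata() and the __maxreclen field of fldata_t from the z/OS C runtime, so check it against the runtime reference; DEFAULT_RECORD_BUFFER and suggested_buffer_size are made-up names, and the fallback value of 1024 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>

#define DEFAULT_RECORD_BUFFER 1024   /* hypothetical fallback size */

/*
 * Suggest a read buffer size for an open file.
 * On z/OS this asks fldata() for the maximum record length. Elsewhere,
 * or if fldata() fails or reports 0, it falls back to
 * DEFAULT_RECORD_BUFFER.
 */
size_t suggested_buffer_size(FILE *f)
{
    assert(f != NULL);
#ifdef __MVS__
    char name[FILENAME_MAX];
    fldata_t info;
    if (fldata(f, name, &info) == 0 && info.__maxreclen > 0) {
        return (size_t)info.__maxreclen;
    }
#endif
    return DEFAULT_RECORD_BUFFER;
}
```

If the buffer is going to be used with fgets, allow one extra byte for the trailing null.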
