Appendix A: New Ingest File Formats

When ADAPS receives data in a format it does not recognize, the ingest tables, and possibly the application code, must be updated before the new format can be ingested. This appendix is a guide to determining the new format and the application modifications it requires.

Determining Format

MAPPER and FRAMESYNC are functions under LAS/ADAPS that can be used to determine the ingest information needed to update the "station.ids" file located in the $ADAPSTABLES directory. MAPPER lists the physical characteristics (files, number of lines, bytes per line, blocksize, ...) of tapes, LAS image files, or non-image disk files. The following is an example MAPPER listing for an input tape that contains a header file, an AVHRR data file, and a trailer file:

    SYSTEM:  SGS4          STORAGE LOCATION: KM0624

    Tape on Drive: 1

    File: 1
              3 Blocks have Blocksize =           80 Characters
    -----------
              3 Total Blocks

    < Tape mark >

    File: 2
              2700 Blocks have Blocksize =     27592 Characters
    -----------
              2700 Total Blocks

    < Tape mark >

    File: 3
              2 Blocks have Blocksize =           80 Characters
    -----------
              2 Total Blocks

    < Tape mark >

    < Tape mark >

    -----------
              4677 Total Blocks on Tape

FRAMESYNC searches the AVHRR data, on disk, for the six word frame sync information described in the HRPT minor frame format of NOAA Technical Memorandum NESS 107. Variations in the frame sync pattern are used to determine whether the data is byte swapped or packed (3 ten bit words stored in 4 bytes). The number of bytes between consecutive frame syncs indicates the number of bytes per line. When the frame sync information is not available, a hexadecimal dump of the data can be used to determine the necessary information. The following is an example of the output from FRAMESYNC:

    Packed, 0 bytes swapped
    Found at byte 1, line size = 1
    Found at byte 13797, line size = 13796
    Found at byte 27593, line size = 13796
    Found at byte 41389, line size = 13796
    Found at byte 55185, line size = 13796
    Found at byte 68981, line size = 13796
    Found at byte 82777, line size = 13796
    Found at byte 96573, line size = 13796
    Found at byte 110369, line size = 13796
    Found at byte 124165, line size = 13796

The FRAMESYNC output above indicates that the data is packed (3 ten bit samples placed in 32 bits), does not need to be byte swapped on input, and contains a frame sync every 13796 bytes. Because the MAPPER blocksize of the data file (27592) is twice the line size (13796), each block on tape contains two lines of AVHRR data. Using this information, the HRPT Minor Frame format description (NOAA Technical Memorandum NESS 107), and any other information provided by the creator of the tape, the following questions should be answered:


                    AVHRR Tape Format Questionnaire

First 103 Words:          Auxiliary Sync:            Header File:

___ 10 bit packed*        ___ 10 bit packed*         ___ No

___ 10 bit unpacked**     ___ 10 bit unpacked**      ___ Yes

___ 8 bit                 ___ 8 bit                     Number of records ____

___ Not present           ___ Not present               Size of records ____

                          ___ Partially present         ___ ASCII

                              Number of bytes ____      ___ Binary


TIP Data:               Computer System:             Header Records with
                                                          Image data:
___ 10 bit packed*      ___ IBM
                                                     ___ No
___ 10 bit unpacked**   ___ UNIX                     
                                                     ___ Yes
___ 8 bit               ___ VMS                      
                                                        Number of records ____
___ Not present         ___ Other _____________
                                                        Size of records   ____


Spare Words:            Format:                      Other Considerations:

___ 10 bit packed*       ___ Raw/Level 0              _____________________

___ 10 bit unpacked**    ___ Level 1B                 _____________________

___ 8 bit                ___ Sharp                    _____________________

___ Not present          ___ Other  _____________     _____________________

                         ________________________     _____________________


Earth Data:              Need Byte Swapping:

___ 10 bit packed*       ___ No

___ 10 bit unpacked**    ___ Yes

___ 8 bit

___ Number of channels             * 3 ten bit samples placed in 32 bits

___ Number of samples              ** 1 ten bit sample placed in 16 bits

Adding a CEOS ID

After determining the format of the new tape, a new CEOS ID must be added to the "station.ids" table located in the $ADAPSTABLES directory. A new CEOS ID should be added even if the format is the same as an existing format. The table contains the information needed to ingest the AVHRR data from the various received formats. The CEOS ID indicates which format is being ingested and is also saved in the archive header to track the receiving station that acquired or contributed the data. The table can be edited with "vi" to add the new format information. Refer to the ADAPS Configuration Notes for a detailed description of the "station.ids" table.

TPACQUIRE Modifications

A pre-processing step is necessary to extract the image data and ancillary files from the various formats and media before ingesting. TPACQUIRE uses UNIX commands to copy the defined formats from tape to disk. If a new format is received, the TPACQUIRE function must be modified to extract the AVHRR image and any useful ancillary files from the new tape format. The following steps must be added to TPACQUIRE:

INGEST Modifications

Ancillary files that contain other information, such as the satellite number, direction of pass, number of lines, type of data (HRPT, GAC, or LAC), or year of acquisition, require modifications to the c_acqhead() support function. c_acqhead() is called by the INGEST function to extract the desired information from the ancillary files. The extracted values override the user supplied values, which helps prevent user input errors. The following is an example of extracting the year from the HBK receiving station header file:

if (strcmp(pass->station,"HBK") == 0)
    {
    /*
      Read the first 14 bytes of the header from the acquired image.
      A short read is treated as an error because the year field
      would then be incomplete.
    */
    nbytes = 14;
    bytes_read = read(acq->fd,header,nbytes);
    if (bytes_read < nbytes)
        {
        c_errmsg("Error reading from the acquire header","acqhead-read",
                 &nonfatal);
        return(E_FAIL);
        }

    /*
      The two digit year starts at byte offset 8 of the header
    */
    ptr = (char *) &header[8];
    sscanf(ptr,"%2d",&pass->year);
    }

This support function must then be recompiled and added to the ACQIO library. The INGEST function should then be re-linked with the updated library.