
Chapter 6  The CFITSIO Iterator Function

The fits_iterate_data function in CFITSIO provides a unique method of executing an arbitrary user-supplied ‘work’ function that operates on rows of data in FITS tables or on pixels in FITS images. Rather than explicitly reading and writing the FITS images or columns of data, one instead calls the CFITSIO iterator routine, passing to it the name of the user’s work function that is to be executed along with a list of all the table columns or image arrays that are to be passed to the work function. The CFITSIO iterator function then does all the work of allocating memory for the arrays, reading the input data from the FITS file, passing them to the work function, and then writing any output data back to the FITS file after the work function exits. Because it is often more efficient to process only a subset of the total table rows at one time, the iterator function can determine the optimum amount of data to pass in each iteration and repeatedly call the work function until the entire table has been processed.

For many applications this single CFITSIO iterator function can effectively replace all the other CFITSIO routines for reading or writing data in FITS images or tables. Using the iterator has several important advantages over the traditional method of reading and writing FITS data files:

There are two steps in using the CFITSIO iterator function. The first step is to design the work function itself, which must have a prescribed set of input parameters. One of these parameters is a structure containing pointers to the arrays of data; the work function can perform any desired operations on these arrays and does not need to worry about how the input data were read from the file or how the output data get written back to the file.

The second step is to design the driver routine that opens all the necessary FITS files and initializes the input parameters to the iterator function. The driver program calls the CFITSIO iterator function which then reads the data and passes it to the user’s work function.

The following two sections describe these steps in more detail. Several example programs included with the CFITSIO distribution also illustrate how to use the iterator function.

6.1 The Iterator Work Function

The user-supplied iterator work function must have the following set of input parameters (the function can be given any desired name):

  int user_fn( long totaln, long offset, long firstn, long nvalues,
               int narrays, iteratorCol *data,  void *userPointer )

The totaln, offset, narrays, data, and userPointer parameters are guaranteed to have the same value on each iteration. Only firstn, nvalues, and the arrays of data pointed to by the data structures may change on each iterative call to the work function.

Note that the iterator treats an image as a long 1-D array of pixels regardless of its intrinsic dimensionality. The total number of pixels is just the product of the size of each dimension, and the order of the pixels is the same as the order in which they are stored in the FITS file. If the work function needs to know the number and size of the image dimensions then these parameters can be passed via the userPointer structure.
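As a sketch of the last point, a driver could pack the image dimensions into a small structure and hand it to the work function through userPointer. The image_dims structure and the pixel_to_xy helper below are hypothetical names and are not part of CFITSIO. The helper assumes a 2-D image whose first axis varies fastest, and it takes the 1-based position of a pixel in the flattened pixel stream.

```c
#include <assert.h>

/* Hypothetical structure (not part of CFITSIO) that a driver could pass
   to the work function through the userPointer argument. */
typedef struct {
    long naxis1;   /* length of the first (fastest varying) axis */
    long naxis2;   /* length of the second axis                  */
} image_dims;

/* Convert a 1-based position in the flattened pixel stream into
   1-based (x, y) image coordinates. */
void pixel_to_xy(const void *userPointer, long pixel, long *x, long *y)
{
    const image_dims *dims = (const image_dims *) userPointer;

    assert(dims != 0 && x != 0 && y != 0);
    assert(pixel >= 1 && pixel <= dims->naxis1 * dims->naxis2);

    *x = (pixel - 1) % dims->naxis1 + 1;
    *y = (pixel - 1) / dims->naxis1 + 1;
}
```

Inside a work function called with offset equal to zero, the pixel number of element i of the current chunk (with i running from 1 to nvalues) would be firstn + i - 1.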

The iteratorCol structure is currently defined as follows:

typedef struct  /* structure for the iterator function column information */
{
   /* structure elements required as input to fits_iterate_data: */

  fitsfile *fptr;       /* pointer to the HDU containing the column or image */
  int      colnum;      /* column number in the table; ignored for images    */
  char     colname[70]; /* name (TTYPEn) of the column; null for images      */
  int      datatype;    /* output data type (converted if necessary) */
  int      iotype;      /* type: InputCol, InputOutputCol, or OutputCol */

  /* output structure elements that may be useful for the work function: */

  void     *array;    /* pointer to the array (and the null value) */
  long     repeat;    /* binary table vector repeat value; set     */
                      /*     equal to 1 for images                 */
  long     tlmin;     /* legal minimum data value, if any          */
  long     tlmax;     /* legal maximum data value, if any          */
  char     unit[70];  /* physical unit string (BUNIT or TUNITn)    */
  char     tdisp[70]; /* suggested display format; null if none    */

} iteratorCol;

Instead of directly reading or writing the elements in this structure, it is recommended that programmers use the access functions that are provided for this purpose.

The first five elements in this structure must be initially defined by the driver routine before calling the iterator routine. The CFITSIO iterator routine uses this information to determine what column or array to pass to the work function, and whether the array is to be input to the work function, output from the work function, or both. The CFITSIO iterator function fills in the values of the remaining structure elements before passing it to the work function.

The array structure element is a pointer to the actual data array and it must be cast to the correct data type before it is used. The ‘repeat’ structure element gives the number of data values in each row of the table, so that the total number of data values in the array is given by repeat * nvalues. In the case of image arrays and ASCII tables, repeat will always be equal to 1. When the data type is a character string, the array pointer is actually a pointer to an array of string pointers (i.e., char **array). The other output structure elements are provided for convenience in case that information is needed within the work function. Any other information may be passed from the driver routine to the work function via the userPointer parameter.
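The cast and the repeat * nvalues arithmetic can be sketched in plain C as follows. The sum_chunk function is a hypothetical helper, not a CFITSIO routine. Its array and repeat arguments stand for the values a work function would obtain from fits_iter_get_array and fits_iter_get_repeat, and the column is assumed to have been requested with the TDOUBLE data type. The loop starts at index 1 because the first array element is reserved for the null value, as described in section 6.3; checking for undefined values is left out for brevity.

```c
#include <assert.h>

/* Sum all the data values of one column in the current chunk.
   Hypothetical helper; assumes the data type is double. */
double sum_chunk(void *array, long repeat, long nvalues)
{
    double *values;
    long ntotal;
    double sum = 0.0;
    long i;

    assert(array != 0 && repeat >= 1 && nvalues >= 0);

    values = (double *) array;    /* cast to the requested data type   */
    ntotal = repeat * nvalues;    /* number of data values in the chunk */

    /* values[0] holds the null value; the data occupy 1..ntotal */
    for (i = 1; i <= ntotal; i++)
        sum += values[i];

    return sum;
}
```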

Upon completion, the work routine must return an integer status value, with 0 indicating success and any other value indicating an error which will cause the iterator function to immediately exit at that point. Return status values in the range 1 – 1000 should be avoided since these are reserved for use by CFITSIO. A return status value of -1 may be used to force the CFITSIO iterator function to stop at that point and return control to the driver routine after writing any output arrays to the FITS file. CFITSIO does not consider this to be an error condition, so any further processing by the application program will continue normally.
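The three kinds of return value might be used as in the following sketch. The scan_chunk function, the WORK_BAD_VALUE code, and the two conditions it tests (a negative value is an error, a value at or above stop_at ends the iteration early) are all invented for illustration. The only points taken from the text above are that 0 means success, -1 requests an early stop, and application error codes should lie outside the range 1 to 1000.

```c
#include <assert.h>

/* Hypothetical application error code, chosen outside 1..1000 */
#define WORK_BAD_VALUE 1001

/* Decide which status a work function should return for one chunk.
   The data occupy values[1..n]; values[0] is the null-value slot. */
int scan_chunk(const float *values, long n, float stop_at)
{
    long i;

    assert(values != 0 && n >= 0);

    for (i = 1; i <= n; i++) {
        if (values[i] < 0.0f)
            return WORK_BAD_VALUE;  /* error: the iterator exits at once */
        if (values[i] >= stop_at)
            return -1;              /* early stop, not treated as an error */
    }
    return 0;                       /* success: go on to the next chunk */
}
```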

6.2 The Iterator Driver Function

The iterator driver function must open the necessary FITS files and position them to the correct HDU. It must also initialize the input elements of the iteratorCol structure (fptr, colnum, colname, datatype, and iotype; defined above) for each column or image before calling the CFITSIO iterator function. Several ‘constructor’ routines are provided in CFITSIO for this purpose.

After the driver routine has initialized all these parameters, it can then call the CFITSIO iterator function:

  int fits_iterate_data(int narrays, iteratorCol *data, long offset,
      long nPerLoop, int (*workFn)( ), void *userPointer, int *status);

When fits_iterate_data is called it first allocates memory to hold all the requested columns of data or image pixel arrays. It then reads the input data from the FITS tables or images into the arrays and passes the structure with pointers to these data arrays to the work function. After the work function returns, the iterator function writes any output columns of data or images back to the FITS files. It then repeats this process for any remaining sets of rows or image pixels until it has processed the entire table or image or until the work function returns a non-zero status value. The iterator then frees the memory that it initially allocated and returns control to the driver routine that called it.
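The cycle described above (allocate, read, call the work function, write back, repeat, free) can be illustrated with a toy model in plain C. This is not CFITSIO code and does not reflect how CFITSIO is implemented. The toy_iterate and double_values functions, the toy_work_fn type, and the TOY_MEMORY_ERROR code are hypothetical; the ‘file’ is an in-memory array of doubles, the offset is fixed at zero, and the chunk passed to the work function has no leading null-value element.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MEMORY_ERROR 2001   /* hypothetical error code */

/* Toy work-function type: a plain buffer replaces the iteratorCol array */
typedef int (*toy_work_fn)(long totaln, long offset, long firstn,
                           long nvalues, double *chunk, void *userPointer);

/* Toy model of the iteration cycle over an in-memory 'file'. */
int toy_iterate(double *file, long totaln, long nPerLoop,
                toy_work_fn work, void *userPointer)
{
    double *chunk;
    long firstn = 1;
    int status = 0;

    assert(file != NULL && work != NULL && totaln >= 0 && nPerLoop > 0);

    chunk = (double *) malloc((size_t) nPerLoop * sizeof *chunk);
    if (chunk == NULL)
        return TOY_MEMORY_ERROR;

    while (status == 0 && firstn <= totaln) {
        long nvalues = totaln - firstn + 1;
        if (nvalues > nPerLoop)
            nvalues = nPerLoop;

        /* "read" the next chunk, then hand it to the work function */
        memcpy(chunk, file + (firstn - 1), (size_t) nvalues * sizeof *chunk);
        status = work(totaln, 0L, firstn, nvalues, chunk, userPointer);

        /* "write" the chunk back on success or on an early stop (-1) */
        if (status == 0 || status == -1)
            memcpy(file + (firstn - 1), chunk,
                   (size_t) nvalues * sizeof *chunk);

        firstn += nvalues;
    }

    free(chunk);
    return (status == -1) ? 0 : status;   /* -1 is not reported as an error */
}

/* Example work function: double every value and count the calls
   in the int that userPointer points to. */
int double_values(long totaln, long offset, long firstn, long nvalues,
                  double *chunk, void *userPointer)
{
    long i;

    (void) totaln; (void) offset; (void) firstn;
    for (i = 0; i < nvalues; i++)
        chunk[i] *= 2.0;
    ++*(int *) userPointer;
    return 0;
}
```

With a five-element array and nPerLoop set to 2, the toy calls the work function three times, with chunks of 2, 2, and 1 values.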

6.3 Guidelines for Using the Iterator Function

The totaln, offset, firstn, and nvalues parameters that are passed to the work function are useful for determining how much of the data has been processed and how much remains left to do. On the very first call to the work function firstn will be equal to offset + 1; the work function may need to perform various initialization tasks before starting to process the data. Similarly, firstn + nvalues - 1 will be equal to totaln on the last iteration, at which point the work function may need to perform some clean up operations before exiting for the last time. The work function can also force an early termination of the iterations by returning a status value = -1.
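The first-call and last-call tests described above can be sketched as follows. The running_stats structure and the accumulate function are hypothetical and only mimic the position arguments of a work function; the data are passed as a plain array of doubles whose values start at index 1, rather than through an iteratorCol structure.

```c
#include <assert.h>

/* Hypothetical bookkeeping structure passed through userPointer */
typedef struct {
    double sum;     /* running total over all iterations  */
    long   count;   /* number of values seen so far       */
    double mean;    /* filled in on the last iteration    */
} running_stats;

/* Accumulate one chunk; the data occupy values[1..nvalues]. */
int accumulate(long totaln, long offset, long firstn, long nvalues,
               const double *values, void *userPointer)
{
    running_stats *stats = (running_stats *) userPointer;
    long i;

    assert(stats != 0 && values != 0 && nvalues >= 0);

    if (firstn == offset + 1) {            /* first call: initialize */
        stats->sum = 0.0;
        stats->count = 0;
        stats->mean = 0.0;
    }

    for (i = 1; i <= nvalues; i++) {
        stats->sum += values[i];
        stats->count++;
    }

    /* last call: finish the calculation */
    if (firstn + nvalues - 1 == totaln && stats->count > 0)
        stats->mean = stats->sum / stats->count;

    return 0;
}
```

Keeping the running totals in a structure reached through userPointer, rather than in static variables, lets the driver read the result after the iterator returns.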

The narrays and iteratorCol.datatype arguments allow the work function to double check that the number of input arrays and their data types have the expected values. The iteratorCol.fptr and iteratorCol.colnum structure elements can be used if the work function needs to read or write the values of other keywords in the FITS file associated with the array. This should generally only be done during the initialization step or during the clean up step after the last set of data has been processed. Extra FITS file I/O during the main processing loop of the work function can seriously degrade the speed of the program. Note that the behavior of fits_iterate_data() is undefined if narrays is zero.

If variable-length array columns are being processed, then the iterator will operate on one row of the table at a time. In this case the repeat element in the iteratorCol structure will be set equal to the number of elements in the current row that is being processed.

One important feature of the iterator is that the first element in each array that is passed to the work function gives the value that is used to represent null or undefined values in the array. The real data then begins with the second element of the array (i.e., array[1], not array[0]). If the first array element is equal to zero, then this indicates that all the array elements have defined values and there are no undefined values. If array[0] is not equal to zero, then this indicates that some of the data values are undefined and this value (array[0]) is used to represent them. In the case of output arrays (i.e., those arrays that will be written back to the FITS file by the iterator function after the work function exits) the work function must set the first array element to the desired null value if necessary, otherwise the first element should be set to zero to indicate that there are no null values in the output array. CFITSIO defines 2 values, FLOATNULLVALUE and DOUBLENULLVALUE, that can be used as default null values for float and double data types, respectively. In the case of character string data types, a null string is always used to represent undefined strings.
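The null-value convention can be sketched in plain C as follows. The mean_defined and set_undefined functions are hypothetical helpers, and DEMO_NULL is an arbitrary stand-in used here in place of FLOATNULLVALUE so that the example does not depend on the CFITSIO header file. The arrays are assumed to hold float data.

```c
#include <assert.h>

#define DEMO_NULL (-999.0f)   /* stand-in for FLOATNULLVALUE (assumption) */

/* Mean of the defined values in values[1..n].  values[0] holds the
   null value, or 0 if every element is defined.  The number of
   defined values is returned through *ngood. */
double mean_defined(const float *values, long n, long *ngood)
{
    float nullval;
    double sum = 0.0;
    long i;

    assert(values != 0 && ngood != 0 && n >= 0);

    nullval = values[0];
    *ngood = 0;
    for (i = 1; i <= n; i++) {
        if (nullval != 0.0f && values[i] == nullval)
            continue;               /* skip undefined values */
        sum += values[i];
        (*ngood)++;
    }
    return (*ngood > 0) ? sum / *ngood : 0.0;
}

/* Mark element i (i >= 1) of an output array as undefined and record
   the null value in the first element. */
void set_undefined(float *values, long i, float nullval)
{
    assert(values != 0 && i >= 1);
    values[0] = nullval;
    values[i] = nullval;
}
```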

In some applications it may be necessary to recursively call the iterator function. An example of this is given by one of the example programs that is distributed with CFITSIO: it first calls a work function that writes out a 2D histogram image. That work function in turn calls another work function that reads the ‘X’ and ‘Y’ columns in a table to calculate the value of each 2D histogram image pixel. Graphically, the program structure can be described as:

 driver --> iterator --> work1_fn --> iterator --> work2_fn

Finally, it should be noted that the table columns or image arrays that are passed to the work function do not all have to come from the same FITS file and instead may come from any combination of sources as long as they have the same length. The length of the first table column or image array is used by the iterator if they do not all have the same length.

6.4 Complete List of Iterator Routines

All of the iterator routines are listed below. Most of these routines do not have a corresponding short function name.

1. Iterator ‘constructor’ functions that set the value of elements in the iteratorCol structure that define the columns or arrays. These set the fitsfile pointer, column name, column number, datatype, and iotype, respectively. The last 2 routines allow all the parameters to be set with one function call (one supplies the column name, the other the column number).
  int fits_iter_set_file(iteratorCol *col, fitsfile *fptr);

  int fits_iter_set_colname(iteratorCol *col, char *colname);

  int fits_iter_set_colnum(iteratorCol *col, int colnum);

  int fits_iter_set_datatype(iteratorCol *col, int datatype);

  int fits_iter_set_iotype(iteratorCol *col, int iotype);

  int fits_iter_set_by_name(iteratorCol *col, fitsfile *fptr,
          char *colname, int datatype,  int iotype);

  int fits_iter_set_by_num(iteratorCol *col, fitsfile *fptr,
          int colnum, int datatype,  int iotype);
2. Iterator ‘accessor’ functions that return the value of the element in the iteratorCol structure that describes a particular data column or array.
  fitsfile * fits_iter_get_file(iteratorCol *col);

  char * fits_iter_get_colname(iteratorCol *col);

  int fits_iter_get_colnum(iteratorCol *col);

  int fits_iter_get_datatype(iteratorCol *col);

  int fits_iter_get_iotype(iteratorCol *col);

  void * fits_iter_get_array(iteratorCol *col);

  long fits_iter_get_tlmin(iteratorCol *col);

  long fits_iter_get_tlmax(iteratorCol *col);

  long fits_iter_get_repeat(iteratorCol *col);

  char * fits_iter_get_tunit(iteratorCol *col);

  char * fits_iter_get_tdisp(iteratorCol *col);
3. The CFITSIO iterator function.
  int fits_iterate_data(int narrays,  iteratorCol *data, long offset,
            long nPerLoop,
            int (*workFn)( long totaln, long offset, long firstn,
                           long nvalues, int narrays, iteratorCol *data,
                           void *userPointer),
            void *userPointer,
            int *status);
