STScI has committed itself to developing new calibration pipeline tasks (and eventually all new software) in the C language. As a consequence, the STIS and NICMOS pipelines are the first to be so developed. All STIS and NICMOS code has been written in ANSI C (using only the standard ANSI C library) with no vendor extensions. The pipeline software is modular in design, with dependencies on the software environment limited and managed. The pipelines use the IRAF/STSDAS/TABLES libraries to perform I/O and computational processing. The IRAF/STSDAS/TABLES routines are accessed through C bindings developed at STScI (which are distributed with STSDAS/TABLES 2.0). The current implementation of the bindings allows tasks to be built as IRAF native tasks or as host-level tasks (the native form of tasks can also be run at host level in the traditional IRAF manner). Currently the pipeline tasks have been built as host-level tasks because that form has been much more extensively tested.
Data from the new instruments are mapped into ANSI C data structures which correspond closely with the logical structure of the data files. The pipeline software consists of calibration steps that operate on these data structures. A major goal was to make the calibration algorithms dependent only on these data structures and not on the details of a particular file format. All I/O is encapsulated in basic routines that form a mapping between these data structures and the data files.
The science data files for STIS and NICMOS are FITS files with image extensions. Data from both instruments may be represented as sequences of FITS Header-Data-Units (HDUs) containing two-dimensional arrays. These HDUs are grouped as an array of science data together with an error array and a data quality array, representing, respectively, the statistical error associated with each pixel and a set of bit-encoded flags (stored as integers) that identify various anomalous conditions that may be associated with each pixel. NICMOS data include two additional arrays: one giving the number of samples associated with each pixel and one giving the integration time associated with each pixel. The structure of the FITS file for both instruments is shown below. The set of image extensions that make up the associated arrays for an exposure or readout is referred to as an IMSET. The primary header contains only those keywords that apply to all the FITS extensions contained in the file. For extracted spectra, BINTABLE extensions are used, with spectra stored in individual rows using arrays in table cells.
Primary Header Data Unit     (general keywords, no data)
    first Science HDU        \
    first Error HDU           |- First IMSET
    first Data Quality HDU   /
    second Science HDU       \
    second Error HDU          |- Second IMSET
    second Data Quality HDU  /
    etc.
Primary Header Data Unit          (general keywords, no data)
    first Science HDU            \
    first Error HDU               |
    first Data Quality HDU        |- First IMSET
    first Samples HDU             |
    first Integration Times HDU  /
    second Science HDU           \
    second Error HDU              |
    second Data Quality HDU       |- Second IMSET
    second Samples HDU            |
    second Integration Times HDU /
    etc.
A number of basic ANSI C data structures have been defined to correspond to the data files described above. These correspond to various elements such as 2-d arrays of all types, individual HDUs, individual and multiple IMSETs, and whole files.
Supporting these data structures is an I/O interface module called HSTIO. High-level functions in the HSTIO interface include:
The HSTIO interface also includes a number of lower-level functions that can be used if there is insufficient memory to store entire HDUs. These include:
All of the high-level functions are implemented in terms of the lower-level functions. These lower-level functions are implemented using IRAF's image I/O, enhanced by the addition of the FITS kernel with support for image extensions.
The design of the HSTIO interface employs the principle of encapsulation: implementation details are carefully concealed within these I/O functions. There are two important consequences of this design: flexibility in usage and flexibility in implementation. First, by encapsulating all I/O operations within the HSTIO interface, the pipeline calibrations can be written in the form of pure algorithms, with few environmental dependencies. Furthermore, since these algorithms are implemented in a widely available standard language (ANSI C), they are easily used in other environments. Second, encapsulation also allows alternative implementations of the I/O functions without impacting the calibration algorithms.
The Space Telescope Imaging Spectrograph (STIS) is a versatile instrument, making use of both CCD and MAMA detectors for visual and ultraviolet coverage, operating in either imaging or spectroscopic mode. It has both first-order and Echelle gratings, with a wide variety of apertures ranging from large imaging apertures to long slits to short Echelle slits. Images can be full frame, binned, or a subset of the full frame. MAMA data can be taken in either accumulate or time-tag mode. This versatility has resulted in greater complexity of the pipeline processing for STIS. The computation of the expected error of each pixel of the data products is an integral part of the pipeline processing at every stage, as is keeping a record of data quality for each pixel. This information is stored along with the science data as part of each IMSET described previously. It is used in subsequent processing to determine which pixels should be used, with weighting factors based on the expected error where appropriate.
STIS data are associated for three purposes. (1) Since the CCD detector is sensitive to cosmic rays, multiple exposures (CR-split) can be taken at the same pointing for the purpose of identifying and rejecting cosmic rays. (2) Multiple exposures (repeatobs) can be taken with either detector for time resolution or to prevent overflow. (3) For spectroscopic observations, line lamp exposures (wavecals) are taken to allow accurate location of the image on the detector. During pipeline processing, Generic Conversion combines the individual exposures into one FITS file; a second file is written for the wavecals. Calstis adds together the CR-split exposures prior to flat fielding, with cosmic rays excluded from the sum. Repeatobs data are maintained as separate exposures, although calstis also adds them together. Wavecals associated with science data are processed by calstis to determine the image offset from nominal, and that offset is taken into account when extracting spectra from the science data.
The basic steps performed by calstis are: cosmic ray rejection; basic 2-D image reduction (bias, dark, flat, etc.); wavecal processing; 2-D spectral rectification; and 1-D spectral extraction. Calstis is executed in the pipeline as one program that can perform all of these steps. The off-line version includes this program, but the basic steps can also be executed as separate tasks. The pipeline version of calstis (this and the following apply to calnic as well) is driven by keywords in the input header, while the separate tasks take command-line switches to control the processing. These are all host-level programs, but they can be run from the IRAF CL through the use of CL scripts. The script parameters include the input and output file names and calibration switches. Each script constructs a command line string, beginning with "!" and using osfn() to get the name of the executable. File names and switches are appended to the string, depending on the values of the CL parameters. Then the string is piped to the CL to execute the program.
NICMOS (Near Infrared Camera and Multi-Object Spectrograph) is an instrument capable of observing between 0.8 and 2.5 microns. There are three independent cameras capable of being used simultaneously; each has a different pixel size and corresponding field of view. The NICMOS data reduction and calibration pipeline is divided into two main tasks, calnica and calnicb. Calnica is applied to individual observations, while calnicb is used only for combining associated observations. Calnica performs the instrumental calibration of all observations and applies typical corrections such as the subtraction of detector dark current, correction for non-linear detector response, and flat fielding. For observations obtained in the NICMOS multiple accumulate (MultiAccum) mode, in which many non-destructive readouts of the detector are performed during the course of an exposure, calnica applies instrumental calibrations to all readouts individually and then combines the calibrated data from all readouts into a single, final image. Cosmic ray rejection is also performed in the course of combining the images.
Calnicb processes the data from associated observations and is used only after all observations in the association have been calibrated with calnica. NICMOS associated observations are typically used for 1) taking multiple exposures at a single position on the sky in order to perform cosmic ray rejection, 2) obtaining dithered exposures of a target in order to remove the effects of bad pixels and residual flat field errors, and 3) obtaining images of nearby off-target sky positions in order to remove the sky and telescope background signal from on-target observations. The last of these is typically of concern for NICMOS only when observing at wavelengths longer than ~2.0 microns, where the warm telescope and instrument optics begin to contribute a significant amount of signal.
NICMOS associated observations are a logical grouping only; the data from individual exposures within an association are stored and treated separately until they are combined by calnicb. The main task of calnicb is to combine the separate exposures--taken at identical or at different but overlapping positions--eliminating cosmic rays and subtracting the background in the process.
Both calnica and calnicb propagate four auxiliary data arrays associated with the science images (i.e., the IMSET). First, calnica computes statistical errors based on a combination of detector readnoise and Poisson noise in the signal. These errors are then propagated and combined with known statistical errors in all calibration data, such as uncertainties in the dark and flatfield data. Second, data quality flags, indicating known problem conditions with science image pixels (such as hot/cold pixels, saturation, etc.) are stored as bit-encoded integer arrays. These flags are used by many calibration steps to identify pixels that should not be used. Finally, two arrays are used to record the number of data samples used to compute each science image pixel value and their total exposure time. This information is computed and propagated in all steps involving image combination so that the end user will have an accurate map of the exposure time and the number of data points associated with each science image pixel.
Perry Greenfield, Phil Hodge, Howard Bushouse
Space Telescope Science Institute