Data artifacts are as old as observational astronomy itself. Dead pixels, hot pixels, and cosmic ray hits are all common in astronomical images. At best, they render the affected pixels' data unusable; at worst, they make the entire image unusable for downstream analysis.
In dealing with missing pixels, some astronomical procedures simply ignore them while others require imputing their values first. Optimal extraction of spectra and Point Spread Function (PSF) photometry ignore missing data, while box spectral extraction and aperture photometry do not. Aperture photometry and box extraction have the advantage of requiring little knowledge about the PSF or line-spread function (LSF). For this reason, aperture photometry has been used for ultra-precise Kepler photometry. Box extraction is currently standard for the SPHERE and GPI integral-field spectrographs.
In general, correcting the corrupted data in an image involves two steps: identifying the bad pixels and imputing their values. Existing algorithms have emphasized bad pixel identification and rejection. For example, there are well-developed packages that detect cosmic rays (CRs) by comparing multiple exposures. When multiple exposures are not available, Rhoads rejects CRs by applying a PSF filter, van Dokkum by Laplacian edge detection (LACosmic), and Pych by iterative histogram analysis. Among these methods, LACosmic offers the best performance. Machine-learning approaches such as deepCR, a deep-learning algorithm, may offer further improvements.
In contrast, the literature on methods of imputing missing data is sparse. Currently, the most common approach is median replacement, which replaces a bad pixel with the median of its neighbours. Algorithms that apply median replacement include LACosmic. Some other packages, such as astropy.convolution, take an average of the surrounding pixels, weighted by a Gaussian kernel. An alternative is 1D linear interpolation, which is standard for the integral-field spectrographs GPI and SPHERE. deepCR, on the other hand, predicts the true pixel values with a trained neural network. However, none of these methods is statistically well-motivated, and they usually apply a fixed interpolation kernel to all images and everywhere on the same image. In reality, the optimal kernel could vary from image to image and even from pixel to pixel. Moreover, in a continuous region of bad pixels or near the boundary of an image, most existing data imputation approaches either perform worse or have to treat these regions as special cases. Only deepCR can handle them naturally with minimal performance differences.
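To make the baseline concrete, here is a minimal sketch of median replacement written with NumPy only. The function name `median_replace` and the `half_width` parameter are illustrative choices, not taken from LACosmic or any other package, and the window is simply clipped at the image border.

```python
import numpy as np

def median_replace(image, bad_mask, half_width=1):
    """Replace each flagged pixel with the median of the good pixels
    in the (2*half_width + 1)-wide box around it (illustrative sketch)."""
    fixed = image.astype(float).copy()
    ny, nx = image.shape
    for y, x in zip(*np.nonzero(bad_mask)):
        # Clip the window at the image border.
        y0, y1 = max(y - half_width, 0), min(y + half_width + 1, ny)
        x0, x1 = max(x - half_width, 0), min(x + half_width + 1, nx)
        window = image[y0:y1, x0:x1]
        good = ~bad_mask[y0:y1, x0:x1]
        # The same fixed kernel is used everywhere, whatever the image looks like.
        fixed[y, x] = np.median(window[good]) if good.any() else np.nan
    return fixed

# Usage: a 5 x 5 ramp with one bad pixel in the middle and one in a corner.
image = np.arange(25, dtype=float).reshape(5, 5)
bad_mask = np.zeros_like(image, dtype=bool)
bad_mask[2, 2] = True
bad_mask[0, 0] = True
fixed = median_replace(image, bad_mask)
```

Note how the corner pixel has only three usable neighbours: this is the kind of edge case that fixed-kernel methods have to handle separately.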
In a recent paper, Zhang and Brandt presented astrofix, a robust and flexible image imputation algorithm based on Gaussian Process Regression (GPR). Through an optimization process, astrofix chooses and applies a different interpolation kernel to each image, using a training set extracted automatically from that image. It naturally handles clusters of bad pixels and image edges, and it adapts to various instruments and image types, including both imaging and spectroscopy. The mean absolute error of astrofix is several times smaller than that of median replacement and of interpolation by a Gaussian kernel (i.e. astropy.convolution).
astrofix accepts images with a bad pixel mask or images with bad pixels flagged as NaN, and it fixes any given image in three steps:
- Determine the training set of pixels that astrofix will attempt to reproduce.
- Find the optimal hyperparameters a and h (or a, hx and hy) given the training set from Step 1.
- Fix the image by imputing data for the bad pixels.
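
The three steps can be sketched in a few dozen lines of NumPy. This is not the astrofix API, and every function name below is hypothetical. The sketch assumes a zero-mean Gaussian process with a squared-exponential covariance of amplitude a and width h plus unit noise variance on the diagonal, takes the training pixels as an argument instead of selecting them automatically, and swaps the optimizer for a plain grid search that minimizes the mean absolute error.

```python
import numpy as np

def sq_exp(d2, a, h):
    """Squared-exponential covariance for squared distances d2."""
    return a**2 * np.exp(-d2 / (2.0 * h**2))

def gpr_weights(offsets, a, h):
    """Weights for the pixel at offset (0, 0), given the (dy, dx) offsets
    of its good neighbours. Unit noise variance is an assumption."""
    offsets = np.asarray(offsets, dtype=float)
    diff = offsets[:, None, :] - offsets[None, :, :]
    K = sq_exp((diff**2).sum(axis=-1), a, h) + np.eye(len(offsets))
    k_star = sq_exp((offsets**2).sum(axis=-1), a, h)
    return np.linalg.solve(K, k_star)

def predict_pixel(image, bad_mask, y, x, a, h, half_width=2):
    """Predict pixel (y, x) from the good pixels in the window around it."""
    ny, nx = image.shape
    offsets, values = [], []
    for dy in range(-half_width, half_width + 1):
        for dx in range(-half_width, half_width + 1):
            yy, xx = y + dy, x + dx
            inside = 0 <= yy < ny and 0 <= xx < nx
            if (dy, dx) != (0, 0) and inside and not bad_mask[yy, xx]:
                offsets.append((dy, dx))
                values.append(image[yy, xx])
    if not offsets:
        return np.nan
    return float(gpr_weights(offsets, a, h) @ np.array(values))

def fit_hyperparameters(image, bad_mask, train_pixels, a_grid, h_grid):
    """Step 2 (grid-search stand-in): pick (a, h) that best reproduces
    the training pixels in the mean-absolute-error sense."""
    best_err, best_a, best_h = np.inf, None, None
    for a in a_grid:
        for h in h_grid:
            err = np.mean([abs(predict_pixel(image, bad_mask, y, x, a, h)
                               - image[y, x]) for y, x in train_pixels])
            if err < best_err:
                best_err, best_a, best_h = err, a, h
    return best_a, best_h

def fix_image(image, bad_mask, a, h):
    """Step 3: impute every bad pixel with its GPR prediction."""
    fixed = image.astype(float).copy()
    for y, x in zip(*np.nonzero(bad_mask)):
        fixed[y, x] = predict_pixel(image, bad_mask, y, x, a, h)
    return fixed

# Usage on a synthetic Gaussian blob with bad pixels flagged as NaN.
yy, xx = np.mgrid[0:11, 0:11]
image = 100.0 * np.exp(-((yy - 5)**2 + (xx - 5)**2) / (2 * 3.0**2))
image[5, 6] = np.nan
image[0, 0] = np.nan
bad_mask = np.isnan(image)
train_pixels = [(3, 3), (4, 7), (7, 4), (6, 6)]   # Step 1, chosen by hand here
a, h = fit_hyperparameters(image, bad_mask, train_pixels,
                           a_grid=[1.0, 10.0, 100.0], h_grid=[0.5, 1.0, 2.0])
fixed = fix_image(image, bad_mask, a, h)
```

Because the weights are computed only from whichever neighbours are good and inside the image, clusters of bad pixels and image edges need no special treatment in this sketch: the kernel simply adapts to the neighbours that remain.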
According to the authors, the actual performance of astrofix may depend on the initial guess for the optimization, the choice of the training set, and the resemblance of the covariance function to the instrumental PSF. Other covariance functions remain to be explored, and the use of sGPR should be considered carefully.
They also showed that astrofix has good potential for bad pixel detection. One could compare the GPR-imputed values with the measured counts and the expected noise at each pixel, and iterate this procedure to reject continuous regions of bad pixels.
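The detection idea can be illustrated with a short sketch. It is not code from the paper: the function names, the 5-sigma threshold, and the iteration cap are all assumptions, and a 3 x 3 neighbour median stands in for the GPR prediction so that the example stays self-contained.

```python
import numpy as np

def neighbour_median(image, bad_mask, y, x):
    """Stand-in predictor: median of the good pixels in the 3 x 3 box
    around (y, x). A GPR prediction would take its place in practice."""
    ny, nx = image.shape
    y0, y1 = max(y - 1, 0), min(y + 2, ny)
    x0, x1 = max(x - 1, 0), min(x + 2, nx)
    good = ~bad_mask[y0:y1, x0:x1]
    good[y - y0, x - x0] = False          # never use the pixel being tested
    vals = image[y0:y1, x0:x1][good]
    return np.median(vals) if vals.size else np.nan

def detect_bad_pixels(image, noise, n_sigma=5.0, max_iter=5,
                      predict=neighbour_median):
    """Flag pixels whose measured counts differ from the predicted value
    by more than n_sigma times the expected noise, then repeat with the
    flagged pixels excluded so that bad regions can grow."""
    bad = ~np.isfinite(image)
    for _ in range(max_iter):
        new_bad = bad.copy()
        for y, x in zip(*np.nonzero(~bad)):
            pred = predict(image, bad, y, x)
            if np.isfinite(pred) and abs(image[y, x] - pred) > n_sigma * noise[y, x]:
                new_bad[y, x] = True
        if np.array_equal(new_bad, bad):  # converged: nothing new was flagged
            break
        bad = new_bad
    return bad

# Usage: a flat 7 x 7 image with a single hot pixel.
image = np.full((7, 7), 100.0)
image[3, 3] = 1000.0
noise = np.full_like(image, 10.0)
bad = detect_bad_pixels(image, noise)
```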
“astrofix has the potential to outperform conventional bad pixel detection algorithms because of its ability to train the imputation specifically for each image. This idea could be developed in future work,” the authors of the study concluded.
The authors demonstrated good performance of astrofix on both imaging and spectroscopic data, including data from the SBIG 6303 0.4m telescope and the FLOYDS spectrograph of Las Cumbres Observatory, and from the CHARIS integral-field spectrograph on the Subaru Telescope.
The algorithm is implemented in the publicly available Python package astrofix; the paper referenced below links to it.
Featured image: Corrections to the CHARIS Image by GPR, the 5 × 5 median filter, and astropy.convolution. The counts are plotted on a logarithmic scale. GPR best restores the structure of the bars, while the two other approaches produce fuzzier images. © Zhang and Brandt
Reference: Hengyue Zhang, Timothy D. Brandt, “Cleaning Images with Gaussian Process Regression”, ArXiv, 23 March 2021. https://arxiv.org/abs/2103.12250
The copyright of this article belongs entirely to our author S. Aman. It may be reused only with proper credit to him or to us.