Survey on CNN based Super Resolution Methods

Super-resolution is a field of image analysis that focuses on increasing the resolution of images and videos while preserving, and ideally enhancing, detail and visual quality. Low-resolution images are converted to high-resolution images by either multi-image methods (many input images, one output image) or single-image methods (one input, one output). This study examines super-resolution methods based on convolutional neural networks (CNNs) for super-resolution mapping at the sub-pixel level, together with their main characteristics and their limitations for noisy or medical images.


Introduction
Over the last few decades, display equipment such as plasma display panels (PDPs), liquid crystal displays (LCDs), and light-emitting diode (LED) screens has advanced tremendously, showing crystal-clear, visually pleasing images with high spatial and temporal resolution. However, despite the high demand for HD content, it is not always available because of factors such as bandwidth constraints, various noise sources, and different compression methods. Another important factor is the general availability of high-quality imaging equipment, and healthcare data is no longer small in volume (Al-falluji et al., 2017).
The data has grown enormously as a result of remarkable developments in image acquisition technologies, making image analysis both challenging and interesting. This rapid growth in medical images and modalities demands lengthy and tedious work from medical experts, is vulnerable to human error, and may lead to interpretations that differ significantly between specialists. Using machine learning techniques to automate the diagnostic process is an alternative; however, standard machine learning algorithms are insufficient for such a complex problem. The successful combination of high-performance computing and machine learning promises to process enormous amounts of medical image data and to provide accurate and timely diagnoses.
Deep learning will not only assist in the identification and extraction of features, but also in the creation of new ones; in addition, it will not only detect disease, but also measure the predictive target and provide actionable prediction models that assist the clinician efficiently (Razzak et al., 2018). The term "super-resolution" refers to a set of algorithms aimed at increasing the resolution of images and videos; upscaling and zooming describe the same operation. The assumption behind multi-image super-resolution methods is that the multiple images used to create a high-resolution image are geometrically transformed, non-aligned versions of each other, and that properly combining them will result in a more detailed image (Al-falluji et al., 2017). That is, super-resolution techniques are applied to transform several low-resolution (LR) images into one high-resolution (HR) image. High-resolution images provide more detailed information, which makes them more useful in a variety of applications such as satellite imagery and medical imaging. The increased technological interest in image reconstruction has led to several approaches in the field of advanced color digital image processing (Shukla et al., 2020).
This study reviews CNN-based super-resolution methods for image analysis problems, covering both current work and future directions. It covers the fundamentals of deep learning as well as the most recent approaches in the field of image processing and analysis.

CNN based super resolution methods
Researchers have used state-of-the-art techniques to build high-resolution images with CNNs. A CNN jointly optimizes all steps of the super-resolution process and learns the nonlinear mapping function between the low-resolution and high-resolution image spaces.

VDSR
VDSR builds a high-resolution image with a very deep convolutional network inspired by the work of Simonyan & Zisserman (2014). The network uses D layers: the first layer takes the LR image as input, and the final layer reconstructs the high-resolution image. The main contribution of Simonyan & Zisserman was a rigorous analysis of networks of increasing depth using an architecture with very small (3 × 3) convolution filters, which showed that raising the depth to 16-19 weight layers improves performance significantly over prior-art configurations. The network input is an interpolated version of the low-resolution image, with the same size as the output image. Learning the residual image has been found to provide several advantages, including faster convergence and better performance, that is, a higher PSNR. One key challenge for deep networks at prediction time is that the feature maps shrink with every convolution, whereas super-resolution needs the surrounding pixels to predict each pixel correctly.
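The two ideas in the paragraph above, shrinking feature maps and residual learning, can be sketched as follows. This is only an illustration: the 41 × 41 patch size, the depth of 20 layers, and the function names `feature_map_size` and `add_residual` are assumptions made for the example, and zero padding is shown as one common way to keep the feature-map size constant, not as a detail confirmed by this survey.

```python
def feature_map_size(size, depth, kernel=3, padding=0):
    """Side length of a square feature map after `depth` stride-1 convolutions."""
    for _ in range(depth):
        # Each stride-1 convolution changes the side length by 2*padding - kernel + 1.
        size = size + 2 * padding - kernel + 1
        if size <= 0:
            raise ValueError("feature map vanished before the last layer")
    return size


def add_residual(interpolated, residual):
    """Residual learning: HR estimate = interpolated LR input + predicted residual."""
    return [[x + r for x, r in zip(row_x, row_r)]
            for row_x, row_r in zip(interpolated, residual)]


# Hypothetical numbers: a 41x41 patch passed through 20 layers of 3x3 convolutions.
print(feature_map_size(41, 20))             # no padding: 2 pixels lost per layer
print(feature_map_size(41, 20, padding=1))  # zero padding of 1: size preserved
```

Without padding, each 3 × 3 convolution removes one pixel from every border, so a deep network quickly runs out of context; with a padding of 1 the output keeps the size of the interpolated input, which is what lets the predicted residual be added to it pixel by pixel.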
In 2016, Kim et al. presented a deeper architecture, the very deep convolutional network for super-resolution (VDSR), which uses residual learning and extremely high learning rates together with gradient clipping to keep training stable. The skip connection allows the network to be optimized quickly while producing accurate results. The reported results are PSNR = 37.53, SSIM = 0.9587, compared with PSNR = 33.66, SSIM = 0.9299 for bicubic interpolation. Yet, because of the slow convergence rate and the requirement to process bicubic-upscaled LR images, training a very deep network is difficult.
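Gradient clipping, mentioned above as the stabilizing ingredient of VDSR training, can be sketched as below. The survey does not state the clipping rule; the learning-rate-dependent bound used here (each gradient component is limited to the range [-theta / lr, theta / lr]) is an assumption based on the "adjustable" clipping described for VDSR, and the names `theta`, `clip_gradients`, and `sgd_step` are hypothetical.

```python
def clip_gradients(grads, theta, learning_rate):
    """Clip each gradient component to [-theta / learning_rate, theta / learning_rate]."""
    if theta <= 0 or learning_rate <= 0:
        raise ValueError("theta and learning_rate must be positive")
    bound = theta / learning_rate
    return [max(-bound, min(bound, g)) for g in grads]


def sgd_step(weights, grads, theta, learning_rate):
    """One plain SGD update using the clipped gradients (assumed optimizer)."""
    clipped = clip_gradients(grads, theta, learning_rate)
    return [w - learning_rate * g for w, g in zip(weights, clipped)]


# With this rule the size of a single weight update never exceeds theta,
# regardless of how large the learning rate or the raw gradient is.
print(sgd_step([0.0, 0.0], [1000.0, -0.001], theta=0.01, learning_rate=0.1))
```

Tying the bound to the learning rate is the design choice worth noting: a large learning rate tightens the clipping range, so the effective step stays bounded even when gradients explode.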
Zhang et al. (2018) observed that the depth of a convolutional neural network is critical for image super-resolution (SR), but that deeper networks for image SR are harder to train. Low-resolution inputs and features contain abundant low-frequency information, which is treated equally across channels, limiting the representational ability of CNNs. To address these issues, they proposed very deep residual channel attention networks (RCAN), with reported results of PSNR = 34.83, SSIM = 0.9296.
Kim et al. (2016) presented the deeply-recursive convolutional network (DRCN), whose three essential parts are an embedding network, an inference network, and a reconstruction network. The inference network contains a very deep recursive layer (up to 16 recursions). Although training on bicubic-upscaled LR images with the standard gradient approach is complicated by exploding/vanishing gradients and by the stronger dependence between parameters and layers, the reported results were PSNR = 37.63, SSIM = 0.9588, compared with PSNR = 33.66, SSIM = 0.9299 for bicubic interpolation.
Yang & Lu (2020) noted that model representations are commonly improved by increasing the network's depth and width, which yields better image reconstruction quality. However, a larger network requires more computing resources and memory, making the network harder to train and increasing prediction time. To address these issues, they presented a novel low- and high-frequency fusion network (DRFFN), which uses a parallel branching structure to extract low-frequency and high-frequency information from the image, respectively. The reported results were PSNR = 38.16, SSIM = 0.9649.
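The channel attention idea behind RCAN, which lets the network weight channels differently instead of treating them all the same way, can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the gate is computed as a sigmoid of a small two-layer bottleneck applied to the globally averaged channels, biases are omitted, and the name `channel_attention` and the toy weights are hypothetical.

```python
import math


def channel_attention(features, w_down, w_up):
    """Rescale each channel by a learned gate in (0, 1).

    features: list of C channels, each a 2D list of floats.
    w_down:   hidden x C weight matrix (channel reduction, followed by ReLU).
    w_up:     C x hidden weight matrix (channel expansion, followed by sigmoid).
    """
    # Global average pooling: one descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]
    # Bottleneck with ReLU (biases omitted in this sketch).
    hidden = [max(0.0, sum(w * z for w, z in zip(row, pooled))) for row in w_down]
    # Sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_up]
    # Channel-wise rescaling of the original features.
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Toy example: two 2x2 channels and one hidden unit (hypothetical weights).
feats = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = channel_attention(feats, w_down=[[1.0, 1.0]], w_up=[[0.0], [1.0]])
print(gates)
```

In the toy example the first channel receives a neutral gate of 0.5 while the second is passed through almost unchanged, which shows how informative channels can be emphasized over low-frequency ones.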

SRCNN
The goal of SRCNN is to learn the end-to-end mapping function from the bicubic-interpolated LR image to the HR image. Because the network consists entirely of convolution layers, the output image is the same size as the input image (Yang et al., 2010), as shown in Figure 1 (Umehara et al., 2018). Dong et al. (2015) showed that SRCNN outperformed the previous hand-crafted state-of-the-art models in both restoration quality and speed, with PSNR = 36.66, SSIM = 0.9542. Yet its high computational cost prevents practical use in settings that require real-time performance (24 fps). Pham et al. (2017) developed SRCNN3D, focusing on augmenting the SR training data by rotation and flipping to improve the generalization of the algorithm. The network parameters are then estimated from a training set of HR and LR image pairs by minimizing an objective function, which gives better performance and faster convergence on large data sets. The reported result was PSNR = 39.01. Critical issues are the size of the training patches and the number of subjects and, in the case of brain MRI, the acquisition of large data sets and their combination under homogeneous acquisition settings.
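The rotation-and-flipping augmentation described above can be sketched for 2D patches as follows. This is a simplified illustration, not the procedure of Pham et al.: their data are 3D volumes, whereas the sketch uses 2D lists, and the function names `rotate90`, `flip_horizontal`, `augment`, and `augment_pair` are hypothetical.

```python
def rotate90(patch):
    """Rotate a 2D patch (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def flip_horizontal(patch):
    """Mirror a 2D patch left to right."""
    return [row[::-1] for row in patch]


def augment(patch):
    """Return the 8 variants of a patch: 4 rotations, each with and without a flip."""
    variants = []
    current = [list(row) for row in patch]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


def augment_pair(lr_patch, hr_patch):
    """Apply the same transform to the LR and HR patch so the pairs stay aligned."""
    return list(zip(augment(lr_patch), augment(hr_patch)))


pairs = augment_pair([[1, 2], [3, 4]], [[1, 1, 2, 2], [1, 1, 2, 2],
                                        [3, 3, 4, 4], [3, 3, 4, 4]])
print(len(pairs))  # 8 aligned LR/HR training pairs from one original pair
```

The important detail is that each transform is applied identically to both members of an LR/HR pair; otherwise the network would be trained on misaligned targets.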
Li (2017) implemented SRCNN on 2D non-clinical images and evaluated it with the peak signal-to-noise ratio (PSNR). Although more advanced methods exist, implementing and improving this simpler model gave practical insight into how to apply deep learning to the task of super-resolution. The model reached a PSNR of 26.442 dB, compared with 23.226 dB for the bicubic upsampling baseline. It increases image resolution but fails to produce the clarity desired in the super-resolution task. The lack of available GPUs also limited the author to a relatively shallow neural network. This application is particularly significant for MRI scans in medical imaging and for satellite images. The PSNR of the SRCNN image is higher than that of the baseline, and higher PSNR values indicate better image quality. The output has the same size as the original image, with better texture quality and higher resolution. A limitation is the reliance on bicubic interpolation, which produces a smoother image: the bicubic-interpolated image is used as the input and passed through three convolution layers for further processing.
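Since PSNR is the metric used throughout this section, a short sketch of how it is computed may help in reading the numbers. The function below uses the standard definition, PSNR = 10 · log10(MAX² / MSE); the peak value of 255 (8-bit images) and the helper names `psnr` and `mse_ratio` are assumptions for illustration. The second helper converts a PSNR gain in dB into the corresponding reduction factor of the mean squared error, assuming the same peak value and a single MSE per comparison.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(reference, estimate)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio(gain_db):
    """Factor by which the MSE drops for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


# PSNR values quoted in the text: bicubic baseline vs. SRCNN.
gain = 26.442 - 23.226
print(round(gain, 3), "dB gain")
print(mse_ratio(gain))
```

Under these assumptions, the 3.216 dB gain reported for SRCNN over the bicubic baseline corresponds to a mean squared error about 2.1 times lower.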

Conclusion
This paper surveyed recent super-resolution algorithms based on learning from examples. Because they have more examples to match against, traditional learning methods are more accurate than self-correcting approaches. The approaches described are based on convolutional neural networks, which are more accurate, and FSRCNN was found to be the fastest and to produce the best results.