3 Sources
[1]
Intel releases new tool to measure gaming image quality in real time -- AI tool measures impact of upscalers, frame gen, others; Computer Graphics Video Quality Metric now available on GitHub
New dataset and companion AI model chart a new path forward for objectively quantifying image quality from modern rendering techniques.

Intel is potentially making it easier to objectively evaluate the image quality of modern games. A new AI-powered video quality metric, called the Computer Graphics Visual Quality Metric, or CGVQM, is now available on GitHub as a PyTorch application.

A frame of animation is rarely natively rendered in today's games. Between the use of upscalers like DLSS, frame-generation techniques, and beyond, a host of image quality issues, like ghosting, flicker, aliasing, disocclusion, and many more, can arise. We frequently discuss these issues qualitatively, but assigning an objective measurement or score to the overall performance of those techniques in the context of an output frame is much harder.

Metrics like the peak signal-to-noise ratio (PSNR) are commonly used to quantify image quality in evaluations of video, but those measurements are subject to limitations and misuse. One such potential misuse is in the evaluation of real-time graphics output with PSNR, which is mainly meant for evaluating the quality of lossy compression. Compression artifacts aren't generally a problem in real-time graphics, though, so PSNR alone can't account for all the potential issues we mentioned.

To create a better tool for objectively evaluating the image quality of modern real-time graphics output, researchers at Intel have taken a twofold approach, discussed in a paper, "CGVQM+D: Computer Graphics Video Quality Metric and Dataset," by Akshay Jindal, Nabil Sadaka, Manu Mathew Thomas, Anton Sochenov, and Anton Kaplanyan.

The researchers first created a new video dataset, called the Computer Graphics Visual Quality Dataset or CGVQD, that includes a variety of potential image quality degradations created by modern rendering techniques. The authors considered distortions arising from path tracing, neural denoising, neural supersampling techniques (like FSR, XeSS, and DLSS), Gaussian splatting, frame interpolation, and adaptive variable-rate shading. Second, the researchers trained an AI model to produce a new rating of image quality that accounts for this wide range of possible distortions: the aforementioned Computer Graphics Visual Quality Metric, or CGVQM. Using an AI model makes the evaluation and rating of real-time graphics output quality a more scalable exercise.

To ensure that their AI model's observations correlated well with those of humans, the researchers first presented their new video dataset to a group of human observers to create a ground truth regarding the perceived extremity of the various distortions within. The observers were asked to rate the different types of distortions contained in each video on a scale ranging from "imperceptible" to "very annoying."

With their quality baseline in hand, the researchers set out to train a neural network that could also identify these distortions, ideally one competitive with human observers. They chose a 3D convolutional neural network (CNN) architecture, using a residual neural network (ResNet) as the basis for their image quality evaluation model. They used the 3D-ResNet-18 model as the basis of their work, training and calibrating it specifically to recognize the distortions of interest. The choice of a 3D network was crucial in achieving high performance on the resulting image quality metric, according to the paper.
A 3D network can consider not only spatial (2D) pattern information, such as the grid of pixels in an input frame, but also temporal pattern information. In use, the paper claims that the CGVQM model outperforms practically every other similar image quality evaluation tool, at least on the researchers' own dataset. Indeed, the more intensive CGVQM-5 model is second only to the human baseline evaluation of the CGVQD catalog of videos, while the simpler CGVQM-2 still places third among the tested models.

The researchers further demonstrate that their model not only performs well at identifying and localizing distortions within the Computer Graphics Visual Quality Dataset, but also is able to generalize its identification powers to videos that aren't part of its training set. That generalizable characteristic is critical for the model to become a broadly useful tool in evaluating image quality from real-time graphics applications. While the CGVQM-2 and CGVQM-5 models didn't lead across the board on other data sets, they were still strongly competitive across a wide range of them.

The paper leaves open several possible routes to improving this neural-network-driven approach to evaluating real-time graphics output. One is the use of a transformer neural network architecture to further improve performance. The researchers explain that they chose a 3D-CNN architecture for their work due to the massive increase in compute resources that would be required to run a transformer network against a subject data set. The researchers also leave open the possibility of including information like optical flow vectors to refine the image quality evaluation. Even in its present state, however, the strength of the CGVQM model suggests it's an exciting advance in the evaluation of real-time graphics output.
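The paper's full training and calibration pipeline is more involved, but the basic shape of a full-reference, feature-based video metric is easy to sketch. The snippet below is an illustrative sketch, not Intel's CGVQM code: it contrasts plain PSNR with a distance computed in the feature space of torchvision's Kinetics-pretrained 3D ResNet-18, which stands in here for CGVQM's calibrated backbone. The clip shapes, the lack of input normalization, and the simple per-stage averaging are all assumptions made for brevity.

```python
# Illustrative sketch only -- not Intel's CGVQM implementation. It contrasts
# plain PSNR with a full-reference metric built on features from a pretrained
# 3D ResNet-18 (torchvision's Kinetics-400 weights stand in for CGVQM's
# calibrated network). Clips are float tensors in [0, 1], shaped (N, C, T, H, W).
import torch
from torchvision.models.video import r3d_18, R3D_18_Weights


def psnr(reference: torch.Tensor, distorted: torch.Tensor) -> torch.Tensor:
    """Peak signal-to-noise ratio: purely pixel-wise, blind to temporal artifacts."""
    mse = torch.mean((reference - distorted) ** 2)
    return 10.0 * torch.log10(1.0 / mse)  # peak value is 1.0 for [0, 1] inputs


class FeatureDistanceMetric(torch.nn.Module):
    """Compares reference and distorted clips in the feature space of a 3D CNN."""

    def __init__(self):
        super().__init__()
        self.backbone = r3d_18(weights=R3D_18_Weights.KINETICS400_V1).eval()
        for p in self.backbone.parameters():
            p.requires_grad_(False)

    def _features(self, clip: torch.Tensor) -> list[torch.Tensor]:
        # Collect activations from each residual stage (spatio-temporal features).
        feats = []
        x = self.backbone.stem(clip)
        for stage in (self.backbone.layer1, self.backbone.layer2,
                      self.backbone.layer3, self.backbone.layer4):
            x = stage(x)
            feats.append(x)
        return feats

    @torch.no_grad()
    def forward(self, reference: torch.Tensor, distorted: torch.Tensor) -> torch.Tensor:
        # Average per-stage feature differences into a single distance score
        # (higher = more visible distortion). CGVQM additionally calibrates such
        # distances against human ratings, which this sketch omits.
        score = torch.zeros(())
        for ref_f, dis_f in zip(self._features(reference), self._features(distorted)):
            score = score + torch.mean(torch.abs(ref_f - dis_f))
        return score / 4.0


if __name__ == "__main__":
    ref = torch.rand(1, 3, 16, 112, 112)                     # 16-frame reference clip
    dis = (ref + 0.05 * torch.randn_like(ref)).clamp(0, 1)   # mildly distorted copy
    print("PSNR (dB):", psnr(ref, dis).item())
    print("feature distance:", FeatureDistanceMetric()(ref, dis).item())
```

The key difference the sketch is meant to show: PSNR collapses everything to per-pixel error on individual frames, while the 3D backbone's activations respond to patterns across neighbouring frames, which is where artifacts like flicker and ghosting live.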
[3]
Intel's fancy new AI tool measures image quality in games in real time, so upscaling artifacts and visual nasties have nowhere to hide
Image quality perception, I've often found, varies massively from person to person. Some can't tell the difference between a game running with DLSS set to Performance and one running at Native, while others can easily ignore the blurriness of a poor TAA implementation while their peers are busy climbing the walls. Intel's new tool, however, attempts to drill down on image quality and provide a quantifiable end result to give game developers a helping hand.

The Computer Graphics Video Quality Metric (CGVQM) tool aims to detect and rate distortions introduced by modern rendering techniques and aids, like neural supersampling, path tracing, and variable rate shading, in order to provide a useful evaluation result.

The Intel team took 80 short video sequences depicting a range of visual artifacts introduced by supersampling methods like DLSS, FSR, and XeSS, and various other modern rendering techniques. They then conducted a subjective study with 20 participants, each rating the perceived quality of the videos compared to a reference version. Distortions shown in the videos include flickering, ghosting, moire patterns, fireflies, and blurry scenes. Oh, and straight up hallucinations, in which a neural model reconstructs visual data in entirely the wrong way.

I'm sure you were waiting for this part: A 3D CNN model (i.e., the sort of AI model used in many traditional AI image-enhancement techniques) was then calibrated using the participants' dataset to predict image quality by comparing the reference and distorted videos. The tool then uses the model to detect and rate visual errors, and provides a global quality score along with per-pixel error maps, which highlight artifacts -- and even attempts to identify how they may have occurred.

What you end up with after all those words, according to Intel, is a tool that outperforms all the other current metrics when it comes to predicting how humans will judge visual distortions. Not only does it predict how distracting a human player will find an error, but it also provides easily interpretable maps to show exactly where it's occurring in a scene. Intel hopes it will be used to optimise quality and performance trade-offs when implementing upscalers, and provide smarter reference generation for training denoising algorithms.

"Whether you're training neural renderers, evaluating engine updates, or testing new upscaling techniques, having a perceptual metric that aligns with human judgment is a huge advantage", says Intel. "While [CGVQM's] current reliance on reference videos limits some applications, ongoing work aims to expand CGVQM's reach by incorporating saliency, motion coherence, and semantic awareness, making it even more robust for real-world scenarios."

Cool. You don't have to look far on the interwebs to find people complaining about visual artifacts introduced by some of these modern image-quality-improving and frame rate-enhancing techniques (this particular sub-Reddit springs to mind). So, anything that allows devs to get a better bead on how distracting they might be seems like progress to me. The tool is now available on GitHub as a PyTorch implementation, so have at it, devs.
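To make those two outputs (a global score plus a per-pixel error map) concrete, here is a minimal sketch. It is not the GitHub tool's API: it uses plain pixel differences where CGVQM would use learned, perceptually calibrated features, and the function name, clip shapes, and patch size are assumptions for illustration.

```python
# Sketch of the two outputs described above -- a global quality score plus a
# per-pixel error map -- using raw pixel differences as a stand-in for CGVQM's
# learned features. Shapes and the error definition are illustrative assumptions.
import torch
import torch.nn.functional as F


def quality_report(reference: torch.Tensor, distorted: torch.Tensor, patch: int = 8):
    """reference/distorted: (C, T, H, W) float clips in [0, 1].

    Returns (global_score, error_map) where error_map has shape (T, H, W) and
    highlights where in each frame the distortion is concentrated.
    """
    diff = (reference - distorted).abs().mean(dim=0)           # (T, H, W), per-pixel error
    # Pool over small patches, then upsample back, to get a smoother "heat map"
    # of where artifacts sit -- a crude analogue of a localized quality map.
    pooled = F.avg_pool2d(diff, kernel_size=patch)
    error_map = F.interpolate(pooled.unsqueeze(1), size=diff.shape[-2:],
                              mode="bilinear", align_corners=False).squeeze(1)
    global_score = error_map.mean()                             # single number per clip
    return global_score, error_map


if __name__ == "__main__":
    ref = torch.rand(3, 16, 128, 128)
    dis = ref.clone()
    dis[:, :, 32:64, 32:64] += 0.3          # inject a localized "artifact"
    dis = dis.clamp(0, 1)
    score, emap = quality_report(ref, dis)
    print("global score:", score.item())
    print("error map peak:", emap.max().item())
```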
Intel has released a new AI-powered tool called Computer Graphics Visual Quality Metric (CGVQM) to objectively evaluate image quality in modern games, addressing issues arising from upscaling and frame generation techniques.
Intel has unveiled a groundbreaking AI-powered tool designed to objectively evaluate image quality in modern video games. The Computer Graphics Visual Quality Metric (CGVQM), now available on GitHub as a PyTorch application, aims to address the challenges posed by contemporary rendering techniques [1][2][3].
Modern games rarely render frames natively, instead relying on techniques such as upscalers (like DLSS) and frame generation. These methods can introduce various image quality issues, including ghosting, flickering, and aliasing. While these problems are often discussed qualitatively, assigning objective measurements has been challenging [1][2].
Existing metrics like peak signal-to-noise ratio (PSNR) have limitations when applied to real-time graphics output. To overcome these constraints, Intel researchers developed a two-pronged approach:
- They built the Computer Graphics Visual Quality Dataset (CGVQD), a video dataset covering distortions introduced by path tracing, neural denoising, neural supersampling (FSR, XeSS, and DLSS), Gaussian splatting, frame interpolation, and variable-rate shading [1]
- They trained the CGVQM model on that dataset to produce an objective quality rating that accounts for this wide range of distortions [1]
The CGVQM model utilizes a 3D convolutional neural network (CNN) architecture, specifically a 3D-ResNet-18 model. This choice allows the network to consider both spatial and temporal pattern information, crucial for high-performance image quality evaluation [1][2].
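To make the spatial-versus-temporal point concrete, the short sketch below (illustrative shapes, not the model's actual layer configuration) shows how a 3D convolution differs from a per-frame 2D convolution: only the 3D kernel mixes information across neighbouring frames, which is what lets a metric react to temporal artifacts such as flicker and ghosting.

```python
# A 2D convolution sees only one frame at a time, while a 3D convolution's
# kernel also spans neighbouring frames. Shapes are illustrative assumptions.
import torch
import torch.nn as nn

frames = torch.rand(1, 3, 16, 112, 112)        # (batch, channels, time, height, width)

conv2d = nn.Conv2d(3, 8, kernel_size=3, padding=1)          # spatial only
conv3d = nn.Conv3d(3, 8, kernel_size=(3, 3, 3), padding=1)  # temporal + spatial

per_frame = torch.stack([conv2d(frames[:, :, t]) for t in range(frames.shape[2])], dim=2)
spatio_temporal = conv3d(frames)

print(per_frame.shape)         # torch.Size([1, 8, 16, 112, 112]) -- no cross-frame mixing
print(spatio_temporal.shape)   # torch.Size([1, 8, 16, 112, 112]) -- each output sees 3 frames
```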
Key features of the CGVQM model include:
- A single global quality score for each evaluated clip [3]
- Per-pixel error maps that localize artifacts within a scene [3]
- Predictions of visual distortion that outperform other current metrics on the researchers' dataset [1]
- The ability to generalize to videos outside its training set [1]
To ensure the AI model's observations aligned with human perception, researchers conducted a study with 20 participants. These human observers rated various distortions in video sequences, establishing a baseline for the AI model [1][2][3].
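The rating scale the sources describe ("imperceptible" through "very annoying") is typically collapsed into numeric targets before a model is trained against it. A minimal sketch of that step follows; the intermediate labels, the 0-4 mapping, and the plain averaging are assumptions, not the paper's exact protocol.

```python
# Sketch of turning per-participant ratings into per-video quality targets.
# The five-point mapping and simple averaging are illustrative assumptions.
import statistics

RATING_SCALE = {
    "imperceptible": 0,
    "perceptible, not annoying": 1,
    "slightly annoying": 2,
    "annoying": 3,
    "very annoying": 4,
}

def mean_opinion_scores(ratings_by_video: dict[str, list[str]]) -> dict[str, float]:
    """Average each video's ratings (e.g. from 20 participants) into one target."""
    return {
        video: statistics.mean(RATING_SCALE[r] for r in ratings)
        for video, ratings in ratings_by_video.items()
    }

if __name__ == "__main__":
    study = {
        "upscaled_clip": ["slightly annoying", "annoying", "slightly annoying"],
        "reference_like_clip": ["imperceptible", "imperceptible", "perceptible, not annoying"],
    }
    print(mean_opinion_scores(study))
```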
The CGVQM tool's ability to predict human judgment of visual distortions makes it valuable for:
- Optimizing quality and performance trade-offs when implementing upscalers [3]
- Generating smarter references for training denoising algorithms [3]
- Training neural renderers, evaluating engine updates, and testing new upscaling techniques [3]
While the current version of CGVQM shows promise, there's room for improvement:
- A transformer architecture could further improve performance, though it would demand far more compute [1]
- Additional inputs such as optical flow vectors could refine the quality evaluation [1]
- Incorporating saliency, motion coherence, and semantic awareness could make the metric more robust [3]
The tool's reliance on reference videos currently limits some applications. However, Intel's researchers are working to expand CGVQM's capabilities, making it more robust for real-world scenarios [3].
Summarized by Navi