I have been developing a new application named Overmix, which attempts to improve the quality of anime screenshot stitching. This article briefly explains what stitching is, which issues affect its quality, and how Overmix tries to fix them. At the end, a short summary of the results so far is given.
One common animation technique is panning, where the camera moves over the image and shows only a part of it at any given time:
(Shot on YouTube: http://youtu.be/DsHjblyEG88?t=6m25s)
Very little movement actually happens during the shot; in fact, only the mouth is moving (presumably to reduce animation costs). This makes it possible to combine the frames into one large image, which is known as “stitching”.
The issue, however, is that more often than not the video quality isn’t that great. The video has been compressed, and visual artifacts can be quite noticeable, especially if the source is a TV transmission or webcast:
The two most significant artifacts in anime encodes are noise (shown above) and color banding/posterization (shown below).
A stitch is normally done by taking two frames, finding the offset between the two images, and then softening the edges between them to make the transition less apparent (usually by applying a gradient to the alpha channel).
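The two steps of such a conventional stitch can be sketched in a few lines. This is only an illustration under simplifying assumptions (grayscale frames stored as lists of rows, a purely horizontal pan to the right, whole-pixel offsets); the function names `find_horizontal_offset` and `blend_pair` are made up for this example and are not taken from Overmix.

```python
def find_horizontal_offset(a, b, max_shift):
    """Return the shift dx in 0..max_shift where b[y][x] best matches a[y][x + dx]."""
    height, width = len(a), len(a[0])
    best_dx, best_error = 0, float("inf")
    for dx in range(max_shift + 1):
        overlap = width - dx
        if overlap <= 0:
            break
        # Mean absolute difference over the columns both frames share.
        error = sum(
            abs(a[y][x + dx] - b[y][x])
            for y in range(height)
            for x in range(overlap)
        ) / (height * overlap)
        if error < best_error:
            best_dx, best_error = dx, error
    return best_dx


def blend_pair(a, b, dx):
    """Place b at offset dx to the right of a and cross-fade over the overlap."""
    width = len(a[0])
    overlap = width - dx
    result = []
    for row_a, row_b in zip(a, b):
        row = list(row_a[:dx])  # part only covered by a
        for x in range(overlap):
            # Alpha ramps from near 0 (mostly a) to near 1 (mostly b).
            alpha = (x + 1) / (overlap + 1)
            row.append((1 - alpha) * row_a[x + dx] + alpha * row_b[x])
        row.extend(row_b[overlap:])  # part only covered by b
        result.append(row)
    return result


# Synthetic pan: two 6-pixel-wide views of a 10-pixel-wide scene, 3 pixels apart.
scene = [[x * x for x in range(10)] for _ in range(2)]
first = [row[0:6] for row in scene]
second = [row[3:9] for row in scene]
dx = find_horizontal_offset(first, second, max_shift=5)
stitched = blend_pair(first, second, dx)
```

Keeping `max_shift` well below the frame width matters in practice, since a very small overlap can match by accident.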
Since this is a time-consuming process, as few frames as possible are normally used. The idea behind Overmix is to do the opposite and use as many frames as possible. The reason is that the artifacts are not static; they differ slightly from frame to frame. As a result, every frame carries a slightly different set of information. The goal is then to derive the original information from this set of inconsistent information.
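A simplified way to see why more frames help is to model each observation of a pixel as the true value plus an error that changes from frame to frame. The model below is an assumption made for illustration: the symbols $s$, $n_i$, $\sigma^2$ and $N$ are introduced here, and the errors are taken to be zero-mean and uncorrelated between frames.

```latex
\begin{align}
x_i &= s + n_i, & \mathbb{E}[n_i] &= 0, & \operatorname{Var}(n_i) &= \sigma^2 \\
\bar{x} &= \frac{1}{N} \sum_{i=1}^{N} x_i, & \mathbb{E}[\bar{x}] &= s, & \operatorname{Var}(\bar{x}) &= \frac{\sigma^2}{N}
\end{align}
```

Under this model the mean of $N$ frames still centres on the true value $s$, while its error variance shrinks by a factor of $N$. Real compression artifacts are only partly independent between frames, so this is an idealisation rather than a prediction.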
Just by using the average, we can get quite decent results:
(Right is a single frame, left is the average of all unique frames.)
Noise artifacts have been shown to disappear almost completely when simply averaging all frames, even when the source has a significant amount of noise. Color banding is also reduced, but by much more varying amounts.
Even with modern TV encodes, stitches see a significant improvement from this technique, and the difference can be seen at normal magnification. Surprisingly, even with good BD encodes there is usually a slight improvement, but it normally requires 2-4 times magnification to be noticeable.
It has turned out that it is often not possible to get a perfect alignment when sticking to the pixel grid. This causes the stitched image to be slightly blurrier than the original frames. This is an area which still requires work.
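The averaging step itself is simple once the offsets are known. The sketch below assumes grayscale frames of equal size, a horizontal pan and whole-pixel offsets; `average_frames` is a hypothetical name for this example and does not reflect how Overmix is implemented.

```python
def average_frames(frames, offsets):
    """Average grayscale frames placed at horizontal offsets on a shared canvas.

    frames  -- list of images, each a list of rows of numbers, all the same size
    offsets -- horizontal position of each frame on the canvas, in whole pixels
    """
    height, width = len(frames[0]), len(frames[0][0])
    canvas_width = max(offsets) + width
    totals = [[0.0] * canvas_width for _ in range(height)]
    counts = [[0] * canvas_width for _ in range(height)]

    # Accumulate every frame into the canvas and track per-pixel coverage.
    for frame, dx in zip(frames, offsets):
        for y in range(height):
            for x in range(width):
                totals[y][x + dx] += frame[y][x]
                counts[y][x + dx] += 1

    # Divide by how many frames covered each pixel; uncovered pixels are None.
    return [
        [totals[y][x] / counts[y][x] if counts[y][x] else None
         for x in range(canvas_width)]
        for y in range(height)
    ]


# Two noisy 3-pixel views of the same row, shifted by one pixel.
frames = [[[10, 20, 30]], [[22, 28, 40]]]
averaged = average_frames(frames, [0, 1])
```

Dividing by a per-pixel count rather than by the number of frames matters because the edges of a pan are covered by fewer frames than the middle.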
Using the average to derive the result is not always desirable, as the encode might contain information not related to the image. Such information could be subtitles, TV logos or simply errors in the source. See the following image for an example: the rightmost column of pixels was completely black and shows up as lines in the averaged image.
However, the currently devised algorithms have a tendency to choke on the slight misalignment mentioned previously and cause unwanted artifacts. Whether this is best solved by fixing the misalignment or by improving the algorithms is up for discussion.
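One standard way to make the result less sensitive to outliers such as the black column is to take the per-pixel median instead of the mean. The sketch below is a generic illustration of that idea and is not claimed to be one of the algorithms Overmix uses; `median_frames` is a hypothetical name, and the same assumptions apply as before (grayscale, horizontal pan, whole-pixel offsets).

```python
from statistics import median


def median_frames(frames, offsets):
    """Combine frames by taking the per-pixel median of all samples."""
    height, width = len(frames[0]), len(frames[0][0])
    canvas_width = max(offsets) + width
    # One list of samples per canvas pixel.
    samples = [[[] for _ in range(canvas_width)] for _ in range(height)]

    for frame, dx in zip(frames, offsets):
        for y in range(height):
            for x in range(width):
                samples[y][x + dx].append(frame[y][x])

    # Uncovered pixels are None, matching an empty sample list.
    return [[median(s) if s else None for s in row] for row in samples]


# Three views of the row 50, 60, 70, 80, 90 whose rightmost column is black (0).
frames = [[[50, 60, 0]], [[60, 70, 0]], [[70, 80, 0]]]
combined = median_frames(frames, [0, 1, 2])
```

In this example the third canvas pixel has the samples 0, 70 and 70, so the median recovers 70 where the mean would be pulled down to about 47. The median only helps where the good samples are in the majority, so pixels covered mostly by the defect stay wrong.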
Overmix is licensed under the GPLv3 and can be found here: Overmix on Bitbucket