Nov 18 2014

Animated stitches

Category: Overmix,Programs,SoftwareSpiller @ 00:35

Most stitches are mostly static with perhaps a little mouth movement. Some however do contain significant movement and Overmix has never intended to try to merge this into a single image in a sensible fashion.

This is because I have yet to see a program which can do this perfectly. While I have seen several making something which looks decent at first, they usually have several issues. Common issues are lines not properly connecting at places and straight lines ending up being curved.

For me, it need to be perfect. If I need to manually fix it up, or even redo it from scratch, not much is gained. The goal with Overmix was always to reach a level I would not be able to reach without its assists. Thus there is no reason to pursue a silver-bullet solution, especially if it gives worse results than doing it manually.

Instead the approach I have taken is to detect which images belongs to what movement. For example, if an arm is moving, we want to figure out which images are those where the arm has yet to start moving, the images where the arm has stopped moving, and all those in between. In other words, we end up with a group of images for each frame the animator drew.

These groups can be combined individually without any issues and the set of resulting images can be merged manually. However since we might have reduced 100 video frames to 10 animated frames, we can take advantage of the nice denoising and debanding properties of Overmix, which will improve the final quality of the stitch.

Cyclic movement

One interesting use-case which can be fully automated is cyclic movement, i.e. movement which ends the way it started, and continues to loop. This is often people doing repetitive motions such as waving goodbye.

Of course the real benefit is when the view port is moving, as it would be cumbersome to do manually. The following example was a slow pan-up over the course of 89 frames where the wind is rustling the character’s clothes and hair around, reduced to 22 frames:

animated stitch

Notice how the top part of some frames are missing, as the scene ended before the top part had been in view for each frame. The same was the case for the bottom, but since it contained no movement, any frame could fill in the missing information.

(The animation can be downloaded in FullHD resolution APNG here (98 MiB).)

Algorithm

The main difficulty is making a distinction between noise and movement. (Noise can both be compression artifacts, but also others such as TV logos, etc.) A few methods were tried, but the best and simplest one of those take advantage of the fact that most Japanese animation reduce the animation cost by using a lower frame-rate. This is typically 3 video frames for each animated frame, though it can be dynamic through the animation!

The idea is to compare the difference between the previous frame. Since there are usually 3 consecutive frames without animation, it will return a low difference. But as soon as it hits a frame which contains the next part of the animation, a high difference will appear, causing a spike to appear on the graph. Doing this for every frames gives a result like this:

Graph of frame differences

Using this, we can determine a noise threshold by drawing a line (shown in purple) which intersects as many blue lines as possible. While this is mostly tested on cyclic movement, it works surprisingly well.

The ever returning issue of sub-pixel alignment strikes back though. When the stitch contains movement in both directions, the sub-pixel misalignment can cause the difference to become large enough to cause issues. This can easily be avoided by simply using sub-pixel alignment, but as of now this is quite a bit slower in Overmix.

Once the threshold has been determined the images are separated into groups based on that threshold. If the difference between the last image in a group and the next image is below the threshold, it is added to that group. If it could not be added to any group, a new group containing that image will be created. This is done for all the images.

Further work

Notice the file size of the APNG image is nearly 100 MB. This is because each of the 22 images is rendered independently of each other and thus results in 22 completely different images. But the background is the same for each and every frame, so that means we are not taking advantage of the information about the background found in the other frames. Thus, by detecting which parts in the frames are consistent and which differs when rendering, we can both improve quality and reduce file size.

Aligning the resulting frames can be tricky when there is a lot of movement in a cyclic animation, because the images only have a little resemblance to each other. Even when this does work, sub-pixel alignment+rendering is more important than usual since otherwise the ±0.5 error will show up as a shaky animation. I have an idea to how to solve the alignment issue, but my math knowledge is currently too lacking in order to actually implement it.

Tags:


Nov 20 2011

Saving images in the correct format

Category: Software,WebdevelopmentSpiller @ 00:32

I quite often see images stored in wrong formats, see for example this website: http://www.retrofit-anime.com/index.html.

Notice the gradient, it contains ugly lines moving from top to bottom. (More visible on a dark background.) It was properly saved as a 1px high image and then stretched downwards (in order to make the file smaller), however it was saved as JPEG. There are two issues here. First of all, the compression artifacts becomes quite clear as they are repeated, however gradients like this also compresses better in a lossless format like PNG, so it is actually larger than it could have been, with worse quality.

So here is a introduction to image formats which will hopefully give you an idea of when to use one format instead of another.

Compression

File formats uses one of 3 types of compression:

  • No compression: The file is simply raw data and can be read and modified directly. The files usually ends up being rather large though.
  • Lossless compression: The format saves the data in a way that makes the data fill less on the disc, however it still contains all the data. (Like a .zip archive.)
  • Lossy compression: Saves the data, but trows away some of it in a way that the user (hopefully) wouldn’t notice.

Each type has its pros and cons. (Examples are all non-image formats.)

Not compressing at all makes it very fast to display the image when you can access the data just as fast. This is usually the case with an HDD as its reading speed is up to 100 MiB/s (and with SSD reaching speeds of 300MiB/s), however not on the web as the speed is rather slow, often something like 0.5MiB/s (which is 4Mb/s). However its simplicity is its main strength and most file formats (.txt, .exe, .doc, .html, .css, .tar and so on) are therefore uncompressed.

Lossless compression reduces file size greatly in most cases and is therefore used when file size is of importance. The downside is that it becomes slower to open and save, as it has to decompress and compress the file each time. It is also a lot harder to implement. Examples are .docx, .zip and .flac.

Lossy compression reaches file sizes which are normally much smaller than possible with lossless compression, however in the process some data is lost and is impossible to recover. While this might sound terrible, the low file size sometimes is worth the trade-off. Lossy compression usually try to trow away the data humans wouldn’t notice (so much) to avoid losing ‘important’ data. Formats which use this kind of compression is normally media formats (like sound, images and movies) as this kind of data usually is rather large. Examples are .mp3 and .h264.

Image format types

There are two different kind of ways to store an image, Raster and Vector. Raster describes a 2D grid of pixels, just like how your monitor displays it. All popular formats are Raster, that is JPEG, PNG and so on. Vector describes how drawing functions should draw on this 2D grid in order to create an image, for example: ‘draw a line from (30,40) to (100,50) and then draw a circle in (50,50) with radius 10’. Examples on vector formats are SVG and Lego Mindstorms RIC.

Since Vector images are something you create from the bottom up, focus in this post will be on the Raster formats.

Colors in images

The more colors you have to represent in an image, the larger the file becomes. So there are several ways of storing colors and switching from one to another greatly affects size. There are three important modes: indexed, grayscale and truecolor.

Truecolor

This is the normal model which contains all the colors you can display on your monitor. This is normally stored as RGB, i.e. three colors.

Grayscale

Grayscale contains only shades of gray and is stored with a single color. This makes it much smaller than Truecolor, however only in grayscale.

Indexed

Indexed is a bit different. Instead of storing a color for each pixel, it stores an index. This index then refers to a color table which contains a list of truecolors. This means it can only store a few colors (usually up to 256), but you can usually get good results anyway with dithering. Some images, like screenshots might only contain a certain subset of colors which can be indexed, reducing the size greatly sometimes without any loss.

Notice however that this index also takes up a bit of space and if your image contains less than 256 pixels, you might be better of staying in Truecolor.

Alpha

This is a special color, which can be used in addition to Truecolor, Grayscale or Indexed. Alpha specifies the transparency of an pixel which can be quite useful when combining several images together.

However in most situations it is not needed, so make sure you don’t save the alpha values in those cases.

Size of colors

In truecolor and grayscale you store levels of certain colors, like Red, Green and Blue for RGB. However the amount of different shades of the color affect file size too.

A color is usually stored as 8-bit which equals to 1 byte, meaning that it can contain 2^8 = 256 shades of that color. Grayscale uses one color, which means one byte per pixel, while Truecolor uses 3 colors, so it uses 3 bytes per color. If there is an alpha color for each pixel, it would use 1 byte more, to a total of 4 bytes per pixel.

Indexed images with a color table of 256 colors would need to store 8 bits per pixel, as 2^8 = 256. However if your color table is 250 you do still need to store 8 bits, since 7 bits only gives 2^7 = 128. So there is not as much advantage going saving a few colors, unless it will reduce the amount of bits needed per pixel. (The color table will of course be smaller though.)

As said, a color is usually sized to 8-bit, however some images might store more than that per color. 16-bit per color is often used with photography as the more levels increases accuracy when editing the images. However this also means that it uses up twice as much space per pixel, 48-bits, and you can’t display this on monitors either as most are 24-bits (or 18-bits for cheap screens). If you don’t need this, converting to 8-bit can significantly reduce file size.

Overview of image formats

Here is a table over the most common image formats (and I added Lego RIC just for fun).

The most important are PNG (lossless) and JPEG (lossy) as most of the others either compresses badly or is relatively unsupported.

The uncompressed formats like .png and .bmp should be avoided unless you have a good reason to use it, PNG provides just as good result at smaller file size.

TIFF is a bit of a joker, it has many features, however only some are implemented in certain applications. Only use it if you know what you are doing.

JPEG2000, JPEG XR and WebP are some of the popular formats for the future, however both are poorly supported at the moment.

GIF still rule animated images, however there are three alternatives out there, APNG, MNG and WebP.

Guidelines:

Saving while working with images (working copy)

Always use the applications native format if you intend to continue working with the image. This is the only way to ensure that as most data as possible is saved. Most formats doesn’t support layers or any other advanced feature your image editor supports, so all this information will get lost.

If you must save it in a genetic image format, use a lossless one like PNG. (TIFF might save more depending on the implementation, experiment!) If  you save it in a lossy format like JPEG, you will degrade image quality each time you save (and then open it again), as it each time trows a different set of information out.

Lossless versus lossy

In general I tend to use lossless whenever file size is not of great concern. HDDs have reached sizes of 3TB, the only thing not worth saving in lossless today is movies.

But lossless does compress better than lossy in some cases. JPEG works great for  photographs and can easily half the file size compared to PNG. However if the images are more simple graphics, like vector graphics or screenshots of your desktop, PNG suddenly compresses much better. Secondly, the JPEG compression artifacts ruins text readability, so PNG is a much better choice here.

To give you an example, here are two separate screenshots of my two monitors.

Example of good PNG compression

The left monitor, the PNG is 448KB and the JPEG became 777KB at acceptable quality. Try to convert the PNG I supplied and see the difference in quality for yourself.

Example of poor PNG compression

On the right monitor a large amount of my wallpaper was visible. Now the PNG suddenly became 1.95 MB while the JPEG was 801KB. (I have only supplied the JPEG here.)

So as you can see, compression depends a lot on the image. For photos use JPEG. For screenshots, your pie-charts and other similar images, use PNG. For images somewhere in between those, try saving in both PNG and JPEG and compare the sizes and quality.

Don’t convert a lossy image

I have seen it a couple times now, a JPEG image converted to PNG for absolutely no reason. The image will not magically get better by converting it to a lossless format! And even in cases where a lossless normally would compress better, it will not because the JPEG compression added artifacts which the lossless compression conserves. The first test image above became 1.5MB instead of 448KB when converting it to JPEG and back.

Trying to convert a lossy to another lossy format will only mean that you loose even more information in the image, so this is not desirable either.

So it is best to keep the lossy file as it is if you are not going to edit on it.

GIF versus PNG

If you try to save an image in both GIF and PNG, PNG might end up being quite a bit larger. However that is because GIF only supports indexed mode and that it is therefore automatically saved like this, while the PNG is saved in truecolor.

If you save PNG is indexed mode, PNG will in almost every case be smaller than GIF. So there is no reason to use GIF, as PNG compresses better and have more features (except animation).

If you are a web-developer, PNG with alpha works in IE7 and up, while PNG in indexed mode should work in IE6 AFAIK. So you should be able to replace GIFs with PNGs here too.

Index colors when possible

Indexing the colors can reduce the size quite a bit. With diagrams and the like you might not use the 256 colors up at all, so you can sometimes do this conversion without loss or with minimal changes. (In those cases you can dither the image to get prettier results, but it will hurt compression ratio. I would recommend to avoid it…)

The file comparison table a few sections above is an indexed PNG, and no colors were reduced as it only contained 140 different colors from the start. The size in truecolor is 22.5 KB, and when indexed it is 9.79 KB. (In JPEG it became ~125 KB…)

Don’t save transparent pixels

If you have some parts of the image fully transparent, the pixels color will still be saved even though it will not be displayed. This way, when you change the alpha, the original color will remain.

However if you do not need to do this anymore, you can change all completely transparent pixels to the same color. While it will not change the image’s appearance, it will reduce the file size. (Your image editor might have an option to do this for you.)

Animation

GIF is still the main format for animations around, and while it only works for indexed images, there are no real alternatives out there which have good application support.

There are currently three formats other than GIF which supports animation, APNG, MNG and WebP.

APNG is a modification of PNG which adds animation, however the issue with this format is that it uses same mime-type as PNG and thus can’t be differentiated. PNG directly disallows multiple images, so they are two different formats. However it is implemented in Firefox (the creators) and Opera.

MNG is also based on PNG, however it is a separate format. It is quite advanced and offer many possibilities than APNG does not, however its complexity is what has caused it to have nearly nonexistent support. (GIMP support, however animation support in GIMP isn’t that great anyway…)

WebP have recently announced animation support (which works similarly as GIF), and as it supports both lossless, lossy and alpha, it sounds promising. While WebP is implemented in Google Chrome (the creators) and Opera, I do not know whether animation have been added yet.

The future

Both JPEG and PNG is fairly old, PNG is from 1996 and JPEG 1992. Compared to how audio and video formats have evolved the past 10 years, images formats are still stuck in the past. I suspect that you could just replace the default compression algorithm in PNG with something like LZMA and get a 10-20% improvement…

JPEG2000 by the JPEG group was a new format which intended to replace the JPEG format. However support have not been very widespread even though the format was publicized 10 years ago. It is getting better by now though.

There are other similar formats, like JPEG XR by Microsoft, however it does not have good support either. WebP is the newest of the bunch which was publicized 1 year ago, however support is increasing and since it is based on WebM, good browser support is likely and it is already natively implemented in Chrome and Opera (and should be working in any WebM compatible browser via a JavaScript shim).

Which format that is going to get widespread is hard to tell. WebP seems to have a better chance on the Web, however I suspect that it doesn’t have as many advanced features as JPEG2000 and JPEG XR which especially photographers wants.

My guess however is that the Web is the most crucial factor, as this is where lossy compression is most important. I still however have my doubts with WebP in professional photography.

Tags: , , , , , , ,