May 23 2015

Extracting an image overlay

Category: Anime, Overmix, Programs, Software. Spiller @ 05:15

I’m getting close to having a Super-Resolution implementation, but I took a small detour, applying what I have learned to a slightly easier problem: extracting overlayed images. Here is an example:
Background merged with overlay
The character is looking into a glass cage and the reflection is added by adding the face as a semi-transparent layer. We also have the background image, i.e. the image without the overlay:
Background layer
A simple way of trying to extract the overlayed image is to subtract the background image, and indeed this somewhat improves the situation:
Extracted overlay using Gimp
Another approach is to estimate the overlayed image which, when composited on top of the background image, reproduces the merged image.
This is done by starting with a semi-random estimate and then iteratively improving it. (The initial estimate is just the merged image in this case.)
In each iteration we take our estimate and overlay it on the background. If our estimate is off, the result will obviously differ from the merged image, so we take the difference between the two and use it to improve the estimate. After enough iterations, we end up with the following:
Extracted overlay using estimation
While not perfect, this is quite a bit better. I still haven’t added regularization, which stabilizes the image and thus could improve it further, but I’m not quite sure how it affects the image in this context.
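The update loop described above can be sketched in a few lines. This is a hedged illustration, not the actual Overmix code: it assumes the overlay has a known, uniform opacity `alpha` and that the two input images are already aligned float arrays.

```python
import numpy as np

def extract_overlay(merged, background, alpha=0.5, iterations=50, step=1.0):
    """Iteratively estimate the overlay O such that
    alpha*O + (1-alpha)*background reproduces the merged image."""
    estimate = merged.astype(np.float64)  # initial estimate: the merged image
    for _ in range(iterations):
        recomposed = alpha * estimate + (1 - alpha) * background
        difference = merged - recomposed   # how far off our composite is
        estimate += step * difference      # use the difference to improve the estimate
    return np.clip(estimate, 0, 255)
```

With a uniform alpha this simply converges to the algebraic solution (merged − (1−alpha)·background) / alpha; the iterative form only becomes interesting once blur, noise and regularization enter the model.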

Super-Resolution works in a similar fashion; instead of overlaying an image, it downscales a high-resolution estimate and compares it against the low-resolution images. It just so happens that I never implemented downscaling…


May 04 2015

6×6 off-roader

Category: Lego, Mindstorms, Technics. Spiller @ 22:17

I have been meaning to get this out for a long time, but this model was characterized by delays, long delays.
Alternative front view Back view
This was supposed to be a quick attempt at making a 6-wheeled vehicle with steering on both the front and back wheels, and power to all of them. My initial model is a clear indication of my ambitions for this project:
First prototype
Somewhere along the road I decided to try adding pendular suspension, and it slowly turned into a full-blown project. Then I started delaying it, at times not even touching it for periods of up to 6 months.

The steering module is the most important part of the build, with steering and drive being controlled from each side of the axle it is suspended on. I tried to keep it small and strong while including a differential. I also tried to figure out how to route the power through the steering, but I didn’t manage to find a solution which was small enough, so I ended up using those universal joints. It is not a good solution, as only a little bit of friction actually holds the wheel in place.
Pendular module with steering Steering
To increase the travel the suspension can work with, I made the connection to the spring detachable, so it releases when the spring on the other side is being compressed.
drop suspension Suspension example, diagonal view
Suspension example, front view Suspension example, side view

To keep the overall height of the model down, I placed the NXT motors between the modules, which worked rather well. As a side effect it also gave the build a very low center of gravity. One thing which required special attention was keeping a smooth surface to prevent the modules from getting stuck on the motors.
The modules and motor drive
I really do hate the shape of the NXT motors though; it makes them nearly impossible to incorporate into a space-efficient model.

All in all, I can’t really say I’m satisfied with the build. While the model itself is very robust, the wheels can easily pop off, making the strength of the rest of the model kinda pointless. Also, the middle wheels should really have had something other than pendular suspension, as it causes the front or back wheels to lift off the ground.
One of my goals with the project was to learn how to do wireless Bluetooth communication in C++ from my computer to the NXT. I did succeed, but I never polished it up with joystick support as I wanted…


Download LDraw file here


Mar 04 2015

Real Super-Resolution

Category: Overmix, Programs, Software. Spiller @ 01:47

While looking at OpenCV, I managed to find a working implementation of one of the papers I was interested in understanding. It is in Japanese, but you can find the code, with some comments in English, here: http://opencv.jp/opencv2-x-samples/usage_of_sparsemat_2_superresolution
A small warning if you try to run it: it is very memory intensive. With a 1600×1200 image it used 10 GB of RAM on my system. It also crashes if your image’s dimensions are not a multiple of the resolution enhancement factor.

All tests are done with 16 low-resolution images and a 4× resolution increase. The image below is the result for the best case, where the images are positioned evenly. The left image is one of the 16 low-resolution (LR) images, the right is the original, and the middle is the Super-Resolution result after 180 iterations:

SR in perfect case

There are some ringing artifacts around the high-contrast edges, but notice how it manages to slightly bring out the lines in the eye, even though they look completely flat in the LR image.

Below is the same, but with the input images having been degraded by noise and errors. While it does lose a little bit of detail, the results are still fairly good, with less noise than the input images.

SR with noisy input

The last test uses random sub-pixel displacements instead of optimal ones. The optimal result is shown to the left for comparison. It is clear that the method loses its effectiveness, as the image becomes more blocky.

SR with random alignment

My plan is to use this implementation as an aid to understand the parts of the article I don’t fully understand. I would like to try this out on DVD anime sources, but this method (or at least this implementation) just wouldn’t work with 150+ images. You can wait for a slow algorithm to terminate, but memory is more of a hard limit. This method allows separate blur/scaling matrices for each LR image though, so it can probably be improved by keeping them equal.
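The core loop of this kind of Super-Resolution can be sketched with a toy observation model. This is an assumption-laden simplification, not the paper's method: each LR image is taken to be a pure decimation of the HR image at a known integer offset, with no blur matrix at all.

```python
import numpy as np

def super_resolve(lr_images, offsets, factor, iterations=30, step=0.5):
    """Iterative back-projection: repeatedly simulate each LR observation
    from the current HR estimate and push the error back into it."""
    # initial estimate: nearest-neighbour upscale of the first LR image
    estimate = np.kron(lr_images[0].astype(np.float64),
                       np.ones((factor, factor)))
    for _ in range(iterations):
        for lr, (dy, dx) in zip(lr_images, offsets):
            simulated = estimate[dy::factor, dx::factor]  # downscale the estimate
            error = lr - simulated                        # mismatch vs. this LR image
            estimate[dy::factor, dx::factor] += step * error
    return estimate
```

In the evenly-positioned case, where every sub-pixel offset occurs exactly once, this recovers the HR image; with random displacements some HR positions are never observed and keep their upscaled initial value, which is exactly the blockiness seen in the last test.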



Feb 28 2015

cgCompress getting ready for release

Category: cgCompress, Programs, Software. Spiller @ 02:55

For about a year I have been using my tool to compress VN event graphics, and I have been very pleased with the results. The usefulness of the program greatly increased when I found games having event graphics with up to 200 small variations, instead of the 2-5 I thought was the norm.

I have been putting the final touches to the program, with the two major additions being:

  • The final output is now validated against the input images, to make sure the compressor does not produce faulty output without you knowing. So far it has been very reliable.
  • The compressor can now also compress images containing transparency. The original implementation didn’t take it into account, which complicated adding it as an afterthought, but with a bit of rewriting it now works quite well.
    The importance of this is that it can now also be used for compressing character sprites.

The code is available at: cgCompress at GitHub

So why am I not making a release yet? Because I don’t think it will be very useful if it does not properly integrate with your desktop environment. So my next step is to create a decoder for the Windows Imaging Component (WIC) framework. This will make it supported in many Microsoft applications, including Windows Photo Viewer and File Explorer.

I have already been experimenting a bit with WIC, and while I can’t say I can make a high quality implementation, getting something to work shouldn’t be too hard.



Nov 18 2014

Animated stitches

Category: Overmix, Programs, Software. Spiller @ 00:35

Most stitches are mostly static, with perhaps a little mouth movement. Some however contain significant movement, and Overmix was never intended to merge this into a single image in a sensible fashion.

This is because I have yet to see a program which can do this perfectly. While I have seen several produce something which looks decent at first, they usually have several issues. Common ones are lines not connecting properly in places and straight lines ending up curved.

For me, it needs to be perfect. If I have to manually fix it up, or even redo it from scratch, not much is gained. The goal with Overmix was always to reach a level I would not be able to reach without its assistance. Thus there is no reason to pursue a silver-bullet solution, especially if it gives worse results than doing it manually.

Instead, the approach I have taken is to detect which images belong to which movement. For example, if an arm is moving, we want to figure out which images are those where the arm has yet to start moving, which are those where the arm has stopped moving, and all those in between. In other words, we end up with a group of images for each frame the animator drew.

These groups can be combined individually without any issues, and the set of resulting images can be merged manually. Moreover, since we might have reduced 100 video frames to 10 animated frames, we can take advantage of the nice denoising and debanding properties of Overmix, which improves the final quality of the stitch.

Cyclic movement

One interesting use-case which can be fully automated is cyclic movement, i.e. movement which ends the way it started, and continues to loop. This is often people doing repetitive motions such as waving goodbye.

Of course the real benefit is when the viewport is moving, as that would be cumbersome to do manually. The following example was a slow pan-up over the course of 89 frames, with the wind rustling the character’s clothes and hair, reduced to 22 frames:

animated stitch

Notice how the top part of some frames is missing, as the scene ended before the top part had been in view for every frame. The same was the case for the bottom, but since it contained no movement, any frame could fill in the missing information.

(The animation can be downloaded in FullHD resolution APNG here (98 MiB).)

Algorithm

The main difficulty is distinguishing between noise and movement. (Noise can be compression artifacts, but also other things such as TV logos, etc.) A few methods were tried, but the best and simplest of those takes advantage of the fact that most Japanese animation reduces animation cost by using a lower frame rate. Typically there are 3 video frames for each animated frame, though this can change throughout the animation!

The idea is to compute the difference between each frame and the previous one. Since there are usually 3 consecutive frames without animation, this will return a low difference. But as soon as we hit a frame which contains the next part of the animation, a high difference will appear, causing a spike on the graph. Doing this for every frame gives a result like this:

Graph of frame differences

Using this, we can determine a noise threshold by drawing a line (shown in purple) which intersects as many of the blue lines as possible. While this is mostly tested on cyclic movement, it works surprisingly well.

The ever-returning issue of sub-pixel alignment strikes back though. When the stitch contains movement in both directions, the sub-pixel misalignment can cause the difference to become large enough to cause issues. This can be avoided by simply using sub-pixel alignment, but as of now that is quite a bit slower in Overmix.

Once the threshold has been determined, the images are separated into groups based on it. If the difference between the last image in a group and the next image is below the threshold, the image is added to that group. If it could not be added to any group, a new group containing that image is created. This is done for all the images.
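The grouping step can be sketched as follows. This is a simplified stand-in for the Overmix implementation: it uses a mean absolute difference and only compares against the current group, whereas the real method also tries earlier groups, which is what lets cyclic movement collapse back into the same set of drawn frames.

```python
import numpy as np

def difference(a, b):
    """Mean absolute pixel difference between two frames."""
    return np.abs(a.astype(np.float64) - b.astype(np.float64)).mean()

def group_frames(frames, threshold):
    """Each group collects consecutive video frames belonging to the same
    animated frame; a difference spike starts a new group."""
    groups = [[frames[0]]]
    for frame in frames[1:]:
        if difference(groups[-1][-1], frame) < threshold:
            groups[-1].append(frame)   # below the noise threshold: same drawing
        else:
            groups.append([frame])     # spike: the next drawn frame has started
    return groups
```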

Further work

Notice that the file size of the APNG image is nearly 100 MB. This is because each of the 22 images is rendered independently, resulting in 22 completely different images. But the background is the same in each and every frame, which means we are not taking advantage of the information about the background found in the other frames. By detecting which parts of the frames are consistent and which differ when rendering, we could both improve quality and reduce file size.

Aligning the resulting frames can be tricky when there is a lot of movement in a cyclic animation, because the images bear only a little resemblance to each other. Even when this does work, sub-pixel alignment and rendering are more important than usual, since otherwise the ±0.5 pixel error will show up as a shaky animation. I have an idea for how to solve the alignment issue, but my math knowledge is currently too lacking to actually implement it.



Mar 16 2014

Fullscreen canvas, not as easy as it appears

Category: Software, Webdevelopment. Spiller @ 14:25

This is a rant about the erroneous information you will get when you try to find out how to implement something; you have been warned. See the end for the solution that was good enough for me.

I have been working on a WebGL-based LDraw viewer, so I can embed 3D views of my Lego models on this blog (project at GitHub: WebLDraw). Obviously I want to drool at them in glorious fullscreen, and with the HTML5 fullscreen API it should be fairly simple. You just have to call ‘requestFullscreen()‘ on your element.

Well, almost. In Firefox it applies 100% width and height automatically, but not in Chrome, resulting in a small canvas on a large black background filling your entire screen. Awesome, it uses your screen real estate as efficiently as Metro apps in Windows 8!

Applying a bit of CSS to do it manually should be fairly easy, but that just doesn’t work for a canvas, as it is rendered at a specific resolution and the CSS style just upscales it. Googling for ‘canvas resize fullscreen’ will give you something like this:

function on_fullscreen_change() {
   canvas.width = window.innerWidth;
   canvas.height = window.innerHeight;
}
document.addEventListener( 'fullscreenchange', on_fullscreen_change );

Great, now my canvas is bigger as soon as I enter fullscreen. And not only that, it is just as big as soon as I exit fullscreen, because NO, do not exit the glorious fullscreen environment, it is perfect. So after a bit more googling I ended up using ‘document.fullscreenEnabled‘, which I quickly found out just tells me whether fullscreen is supported. The correct way is to do:

if( canvas == document.fullscreenElement )

Now I can finally enter and exit fullscreen properly. Except that my canvas did not have the correct size; ‘window.innerWidth’ does not give the correct width. But Google has all the answers, you just need to do any of:

  • document.width
  • document.body.clientWidth
  • canvas.offsetWidth
  • canvas.getBoundingClientRect().width
  • canvas.style.width = window.innerWidth + “px” (CSS style, fancy)
  • screen.availWidth

None of them gives the correct result however. People don’t seem to notice because they stretch it with ‘width: 100%’ anyway. ‘screen.availWidth‘ excited me though, as it actually returned 1920, the width of my screen. Except that ‘screen.availHeight‘ returned 1160, because YES, I do have a taskbar in my desktop environment, and NO, I don’t want to know it is 40px high when it is hidden anyway because I’m in fullscreen mode…

I really wonder why web development needs to differentiate between ‘screen.availWidth’ and ‘screen.width‘, the latter giving the screen’s full width. Anyway, that was the final piece of the puzzle, and my Dart implementation ended up looking like this (most of it maps pretty closely to a JavaScript implementation):

original_width = canvas.width;
original_height = canvas.height;
canvas.onFullscreenChange.listen( (t){
  if( canvas == document.fullscreenElement ){
    canvas.width = window.screen.width;
    canvas.height = window.screen.height;
  }
  else{
    canvas.width = original_width;
    canvas.height = original_height;
  }
} );

Some of the confusion about the fullscreen API is hard to avoid, because many articles only apply to the old proprietary APIs. But 7 ways to get the monitor width, none of which are correct? I just can’t even guess how it could end up that bad…



Feb 10 2014

Compressing VN CGs

Category: Anime, cgCompress, Programs, Software. Spiller @ 06:48

I have a lot of images on my computer: random fanart from the web, screenshots of movies I have seen, etc. I recently saw that one of my image folders was 80 GB, so it is no wonder I care a lot about image compression.

I was looking through a visual novel CG collection when I thought: shouldn’t this compress well? After all, VN CGs tend to have a lot of similar images with minor modifications like different facial expressions. So I did a quick test of how well it compresses using different lossless compression algorithms:

Chart showing compression ratios

As expected, PNG is quite a bit better than simply zipping BMP images, and WebP fares even better. But what is this? Compressing BMP images with 7z kills the competition!

The giant gap from ZIP to 7z does not come from LZMA being superior to Deflate, but from the fact that ZIP only compresses files individually while 7z can treat all files as one big blob of data. This is also why a general-purpose compression algorithm can beat the ones optimized for images: PNG and WebP also compress images individually.

A note on the comparison: usually CG collections have an average of 2-3 versions of each image; here we tested an extreme case with 13 versions. This obviously exaggerates the results, but the trend still stands.
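The "one big blob" effect is easy to demonstrate with zlib (Deflate, the same algorithm ZIP uses). This is a toy setup, not the benchmark from the chart: two fake "images" of random bytes that are identical apart from a small edit, kept under Deflate's 32 KB window so the second copy can reference the first. (7z's LZMA has a far larger dictionary, which is why it works on real-sized images.)

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
base = rng.integers(0, 256, size=20_000, dtype=np.uint8).tobytes()
variant = bytearray(base)
variant[1_000:1_200] = bytes(200)      # a small "facial expression" edit
images = [base, bytes(variant)]

# ZIP-style: every file compressed on its own
individually = sum(len(zlib.compress(img, 9)) for img in images)
# 7z-style (solid): all files compressed as one stream
solid = len(zlib.compress(b"".join(images), 9))

print(individually, solid)  # the solid stream is close to half the size
```

Compressed individually, the second image costs nearly as much as the first; in the solid stream it collapses into back-references to the first copy.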

Doing it better

BMP is the superior solution? There is no way I can accept that; we need to do something about it!

If you have ever worked with GIF animations you probably know that you can reduce the size by only storing the differences between frames. That is exactly what we want to do, but using PNG or WebP to compress those differences. The problem is that we need to store the differences plus information on how they should be combined to recreate all the images, and there isn’t a good file format for that.

How to get from one CG to another using the difference

So I have created a format based on OpenRaster, a layered image format competing with PSD (Photoshop) and XCF (GIMP). I wanted to use it without modifications, but support for multiple images in one file, while planned, appears to be far into the future. (I want it now!) It is basically a ZIP file containing ordinary image files and an XML document describing layers, blend modes, etc.

The next part is automatically creating such a file from a series of images. For this I have written cgCompress (GitHub page), and while there is still a lot of work to be done, it has proven that we can do better. Fundamentally this is done by creating all the differences and then, with a greedy algorithm, selecting the ones which will add the least to the total file size. This continues frame by frame until we have recreated all the original images. I have also worked on an optimal solver, but I have not been able to get it to work with more than 3-5 images (because of time complexity).
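The primitive everything builds on can be sketched like this. To be clear, this is not cgCompress's greedy solver, just the underlying idea: represent each later image as the set of pixels that differ from an earlier one, which can then be saved as a small, mostly-transparent overlay layer.

```python
import numpy as np

def encode(images):
    """Keep the first image in full; store every later image as the
    coordinates and values of the pixels that changed."""
    encoded = [("full", images[0].copy())]
    for prev, cur in zip(images, images[1:]):
        mask = cur != prev
        encoded.append(("diff", np.argwhere(mask), cur[mask]))
    return encoded

def decode(encoded):
    """Recreate the original images by replaying the differences."""
    result = [encoded[0][1].copy()]
    for _, coords, values in encoded[1:]:
        img = result[-1].copy()
        img[tuple(coords.T)] = values   # apply the stored edits
        result.append(img)
    return result
```

cgCompress then has to choose *which* differences to keep, since chaining every image off the previous one is rarely optimal when the variations branch (expression A and expression B both derived from a base image, say); that is where the greedy selection comes in.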

Using the greedy algorithm I managed to reduce the file size by 25.6% compared to 7z-compressed BMP images (lossless WebP used internally):

Comparison of compressed BMP and cgCompress

This is a compression rate of a whopping 88.7%! Of course, this is only because we are dealing with 13 very similar images. 67.2% of the file size is the start image, and without a better image compression algorithm we can do very little to improve that. That leaves the 12 remaining images using 2.7% each (1/13 would be 7.7%), not much to work with, but I believe I can still make improvements.

This is just one case though; while uncommon, some images still need further optimization to get near-perfect results. I have tried compressing an entire CG collection of 154 images, and my results were as follows:

Chart showing compression ratios for an entire CG collection

Compared to 7z-compressed BMP there was an improvement of 24.0%, and compared to WebP it is 61.1%. On average the set contained 3.92 variations per image; cgCompress manages to store them at the cost of about 2.57 images compared to ordinary WebP. The difference between those two numbers is the overhead cgCompress requires to recreate all 3.92 variations per image, and it depends on how different the variations are. While I don’t know how low it can get, I do believe there is room for improvement here.

I included lossy WebP here as well (done at quality 95) to give a sense of the difference between lossless and lossy compression. cgCompress definitively closes the gap, but if you don’t mind losing information, lossy WebP is still the way to go. (It should be possible to use lossy compression together with the ideas used in cgCompress though.)

Conclusion

cgCompress can significantly reduce the space needed to store visual novel CG collections. While only a moderate improvement of ~25% over 7z-compressed BMP, compressed archives only work well for archiving or transferring over networks. cgCompress, being based on OpenRaster, has proper thumbnailing, viewer support and potentially metadata. Against its direct contenders, PNG and WebP, cgCompress provides a big leap in compression ratio.

On a personal note, going from concept to something I can use in 8 days is quite an achievement for me. While the cgCompress code isn’t too great, I’m still quite happy with how this turned out.



Jan 25 2014

A year of Overmixing

Category: Overmix, Programs, Software. Spiller @ 01:54

It is now about a year since I started this project, and never in my wildest dreams would I have imagined how much work I would put into it. So here is a quick overview of my progress.

(Overmix is now located at github: https://github.com/spillerrec/Overmix, download here: https://github.com/spillerrec/Overmix/releases )

Automatic aligning

While zooming and rotation are not supported (yet at least), horizontal and vertical movement is detected rather reliably and can be done to sub-pixel precision. Sub-pixel precision is rather slow and memory intensive though, as it works by upscaling the images. It still needs some work when aligning images with transparent regions, and I also believe it can be made quite a bit faster.

De-noising

Noise is removed by averaging all frames, which works very well, even on Blu-ray rips. However, since I have not been able to render while taking sub-pixel alignment into account, it blurs the image ever so slightly.
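The averaging itself is a one-liner; here is a sketch with simulated noise (the real input being aligned video frames), illustrating why it works: independent noise falls off as one over the square root of the frame count.

```python
import numpy as np

def denoise_average(frames):
    """Average aligned frames; independent noise shrinks by ~1/sqrt(N)."""
    return np.mean([f.astype(np.float64) for f in frames], axis=0)

rng = np.random.default_rng(0)
clean = np.full((64, 64), 128.0)
frames = [clean + rng.normal(0, 10, clean.shape) for _ in range(16)]
merged = denoise_average(frames)
# with 16 frames, the noise sigma drops from 10 to roughly 10/4 = 2.5
```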

10-bit raw input and de-telecine

I have made a frame dumper to grab the raw YUV data from the video streams, to ensure no quality loss happens between the video file and Overmix. This actually showed that VLC outputs in pretty low quality and appears to have some trouble with Hi10p content and color spaces. Most noticeable however is that VLC uses nearest-neighbor for chroma upsampling, which looks especially bad with reds. Having direct access to the chroma channels also opens up the possibility of doing chroma restoration using super resolution.

I have made several tools for my custom “dump” format, including the video dumper, a Windows shell thumbnailer (Vista/7/8), a Qt5 image plugin and a compressor to reduce the file size. They can be found in their separate repository here: https://github.com/spillerrec/dump-tools

De-telecine has also been added, as it is necessary in order to work with the interlaced MPEG-2 transport streams anime is usually broadcast in.

Deconvolution

Using deconvolution I have found that I’m able to make the resulting images much sharper, at the expense of a little bit of noise. It appears that anime is significantly blurred, perhaps because of the way it has been rendered, or perhaps intentionally. More on this later…

TV-logo detection and removal

It works decently, as shown in a recent post, and can also remove other static content such as credits. I have tried to do it fully automatically, but it doesn’t work well enough compared to the manual process. Perhaps some Otsu thresholding combined with dilation would work?

I intend to do further work with this method to see if I can get it to work with moving items, and to do the reverse: separating out the moving item. This is interesting as some panning scenes move the background independently of the foreground (to fake depth of field).

Steam removal

A proof of concept showed that if the steam is moving, we can choose the parts from each frame which contain the least amount of steam. It doesn’t remove the steam completely, but it can make a significant difference. The current implementation handles the colors incorrectly however, which is fixable, but not something I care to do unless requested…

Animation

Some scenes contain a repeating animation while doing vertical/horizontal movement. H-titles especially seem to have a lot of this, but it can also be found as mouth movement and similar. While still in its early stages, I have successfully managed to separate the animation into its individual frames and stitch those, using a manual global threshold.

I doubt it would currently work with minor animations, such as mouth changes; noise would probably mess it up. So I’m considering investigating other ways of calculating the difference, perhaps using an edge-detected image instead, or computing local differences.



Nov 23 2013

Too much steam in your anime?

Category: Anime, Overmix, Programs, Software. Spiller @ 03:54

Well, Overmix is here with a dehumidifier to solve your problem. Too damp? Run it once and watch as your surroundings become clearer.

Your local hot spring before:

and after:

Can’t get enough of Singin’ in the Rain? Don’t worry, just put it in reverse and experience the downpour.

Normal rainy day:

The real deal:

This is another multi-frame approach, and really just as simple as using the average. Since the steam lightens the image, all you have to do is take the darkest pixel at each position. (In other words, the lighter a pixel is, the more likely it is to be steam.) Since the steam is moving, this way you use the least steamy parts of each frame to get a stitched image with the smallest amount of steam.

If we do the opposite and take the brightest pixel, we can increase the amount of steam. That is not really that interesting in itself, but the second example shows how we can use this to bring out features that would otherwise be treated as noise. We could also combine it with the average approach using a range to deal with the real noise, but I did this for fun so I didn’t go that far.
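Both operations are per-pixel order statistics across the stack of aligned frames. A minimal sketch, assuming grayscale arrays; for color images, taking the min per RGB channel independently can mix channels from different frames and shift colors, so a real implementation should pick whole pixels.

```python
import numpy as np

def remove_steam(frames):
    """Steam brightens pixels, so the darkest value ever seen at each
    position is the least steamy one."""
    return np.min(np.stack(frames), axis=0)

def maximize_steam(frames):
    """The opposite: keep the brightest value at each position."""
    return np.max(np.stack(frames), axis=0)
```

As long as every position is steam-free in at least one frame, the minimum recovers the clean background.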

While this is a fairly simple method, it highlights that we can use multiple frames not just to improve quality, but also to analyze and manipulate the image. I have several neat ideas I want to try out, but more about those when I have something working.

