Mar 16 2014

Fullscreen canvas, not as easy as it appears

Category: Software, Webdevelopment | Spiller @ 14:25

This is a rant about the erroneous information you will get when you try to find out how to implement something; you have been warned. See the end for the solution that was good enough for me.

I have been working on a WebGL based LDraw viewer, so I can embed 3D views of my Lego models on this blog (Project at Github: WebLDraw). Obviously I want to drool at them in glorious fullscreen, and with the HTML5 fullscreen API it should be fairly simple. You just have to call ‘requestFullscreen()‘ on your element.

Well, almost. In Firefox it applies 100% width and height automatically, but not in Chrome, resulting in a small canvas on a large black background filling your entire screen. Awesome, uses your screen real estate as efficiently as Metro apps in Windows 8!

Applying a bit of CSS to do it manually should be fairly easy, but that just doesn’t work for a canvas, as it is rendered at a specific resolution, and the CSS style just upscales it. Googling for ‘canvas resize fullscreen’ will give you something like this:

function on_fullscreen_change() {
   canvas.width = window.innerWidth;
   canvas.height = window.innerHeight;
}
document.addEventListener( 'fullscreenchange', on_fullscreen_change );

Great, now my canvas is bigger as soon as I enter fullscreen. And not only that, it is just as big as soon as I exit fullscreen, because NO, do not exit the glorious fullscreen environment, it is perfect. So a bit more googling and I ended up using ‘document.fullscreenEnabled‘, which I quickly found out just told me whether fullscreen was supported. The correct way was to do:

if( canvas == document.fullscreenElement )

Now I can finally enter and exit fullscreen properly. Except that my canvas did not have the correct size; ‘window.innerWidth’ does not give the correct width. But Google has all the answers, you just need to do any of:

  • document.width
  • document.body.clientWidth
  • canvas.offsetWidth
  • canvas.getBoundingClientRect().width
  • canvas.style.width = window.innerWidth + “px” (CSS style, fancy)
  • screen.availWidth

None of them gives the correct result however. People don’t seem to notice because they stretch it with ‘width: 100%’ anyway. ‘screen.availWidth‘ excited me though, it actually returned 1920, the width of my screen. Except that ‘screen.availHeight‘ returned 1160 because YES, I do have a taskbar in my desktop environment, and NO, I don’t want to know it is 40px high when it is hidden anyway because I’m in fullscreen mode…

I really wonder why web development needs to differentiate between ‘screen.availWidth’ and ‘screen.width‘, which gives the screen’s full width. Anyway, that was the final piece in the puzzle and my Dart implementation ended up looking like this: (Most of it maps pretty closely to a JavaScript implementation.)

original_width = canvas.width;
original_height = canvas.height;
canvas.onFullscreenChange.listen( (t){
  if( canvas == document.fullscreenElement ){
    canvas.width = window.screen.width;
    canvas.height = window.screen.height;
  }
  else{
    canvas.width = original_width;
    canvas.height = original_height;
  }
} );
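
For reference, a rough JavaScript equivalent could look like the sketch below. I have left out the vendor-prefixed variants (‘webkitfullscreenchange’, ‘mozfullscreenchange’, ‘document.webkitFullscreenElement’, etc.) that current browsers still require, to keep it readable:

var original_width = canvas.width;
var original_height = canvas.height;
document.addEventListener( 'fullscreenchange', function() {
  if( document.fullscreenElement === canvas ) {
    // Entering fullscreen: render at the full screen resolution
    canvas.width = screen.width;
    canvas.height = screen.height;
  }
  else {
    // Leaving fullscreen: restore the original rendering resolution
    canvas.width = original_width;
    canvas.height = original_height;
  }
} );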

Some of the confusion about the fullscreen API is hard to avoid, because many articles only apply to the old proprietary APIs. But 7 ways to get the monitor width, none of which are correct? I just can’t even guess how it could end up that bad…



Feb 10 2014

Compressing VN CGs

Category: Anime, cgCompress, Programs, Software | Spiller @ 06:48

I have a lot of images on my computer, random fanart from the web, screenshots of movies I have seen, etc. I recently saw that one of my image folders was 80 GB, so it is no wonder I care so much about image compression.

I was looking through a visual novel CG collection when I thought: shouldn’t this be able to compress well? After all, VN CGs tend to have a lot of similar images with minor modifications like different facial expressions. So I did a quick test: how well does it compress using different lossless compression algorithms?

Chart showing compression ratios

As expected, PNG is quite a bit better than simply zipping BMP images, and WebP fares even better. However what is this, compressing BMP images with 7z literally kills the competition!

The giant gap from ZIP to 7z does not come from the fact that LZMA is superior to Deflate, but because ZIP only allows files to be compressed individually while 7z can treat all files as one big blob of data. This is also why a general purpose compression algorithm can beat the ones optimized for images, as PNG and WebP also compress images individually.

A note on the comparison: Usually CG collections have an average of 2-3 versions of each image; here we checked an extreme case with 13 versions. This obviously exaggerates the results, but the trend still stands.

Doing it better

BMP is the superior solution? There is no way I can accept that, we need to do something about that!

If you have ever worked with GIF animations you probably know that you can reduce the size by only storing the differences between each frame. That is exactly what we want to do, but using PNG and WebP to compress those differences. The problem is that we need to store the differences and information on how they should interact to recreate all the images, and there isn’t a good file format to do that.

How to get from one CG to another using the difference
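
As a minimal sketch of that first step, finding the area that actually changed between two frames could look something like this (assuming equally sized RGBA buffers; this is just an illustration, not the actual cgCompress code):

// Find the bounding box of pixels that differ between two equally sized RGBA frames
function changedArea( a, b, width, height ) {
  var minX = width, minY = height, maxX = -1, maxY = -1;
  for( var y = 0; y < height; y++ )
    for( var x = 0; x < width; x++ ) {
      var i = (y*width + x) * 4;
      if( a[i] != b[i] || a[i+1] != b[i+1] || a[i+2] != b[i+2] || a[i+3] != b[i+3] ) {
        if( x < minX ) minX = x;
        if( x > maxX ) maxX = x;
        if( y < minY ) minY = y;
        if( y > maxY ) maxY = y;
      }
    }
  if( maxX < 0 )
    return null;  // the two frames are identical
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

Only that region (plus its position) then needs to be compressed and stored, instead of the whole frame.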

So I have created a format based on OpenRaster, which is a layered image format intended to compete with PSD (Photoshop) and XCF (GIMP). I wanted to use it without modifications, but having multiple images in one file, while planned, appears to be far into the future. (I want it now!) It is basically a ZIP file which contains ordinary image files and an XML document describing layers, blend modes, etc.
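
The stack.xml inside such a file describes how the layers are composited. A simplified, made-up example (layers are listed top-most first) could look like:

<image w="800" h="600">
  <stack>
    <layer src="data/face2.png" x="210" y="64" composite-op="svg:src-over" />
    <layer src="data/base.png"  x="0"   y="0" />
  </stack>
</image>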

The next part is automatically creating such a file from a series of images. For this I have written cgCompress (Github page) and while there is still a lot of work to be done, it has proven that we can do better. Fundamentally this is done by creating all the differences and then, with a greedy algorithm, selecting the ones which will add the least to the total file size. This continues frame by frame until we have recreated all the original images. I have also worked on an optimal solver, but I have not been able to get it to work with more than 3-5 images (because of the time complexity).
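
In sketch form the greedy selection looks roughly like this; ‘encodeFull’ and ‘encodeDiff’ are hypothetical stand-ins for compressing a full frame or a difference, not actual cgCompress functions:

function greedySelect( frames, encodeFull, encodeDiff ) {
  var chosen = [ encodeFull( frames[0] ) ];  // the first frame is stored in full
  var have = [ 0 ];                          // indices of frames we can already recreate
  while( have.length < frames.length ) {
    var best = null;
    for( var target = 0; target < frames.length; target++ ) {
      if( have.indexOf( target ) >= 0 )
        continue;
      for( var f = 0; f < have.length; f++ ) {
        // e.g. a WebP of the changed area, returned as a Uint8Array
        var diff = encodeDiff( frames[ have[f] ], frames[ target ] );
        if( best === null || diff.byteLength < best.diff.byteLength )
          best = { diff: diff, target: target };
      }
    }
    chosen.push( best.diff );   // the cheapest way to recreate one more frame
    have.push( best.target );
  }
  return chosen;
}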

Using the greedy algorithm I managed to reduce the file size by 25.6% compared to 7z-compressed BMP images: (Lossless WebP used internally)

Comparison of compressed BMP and cgCompress

This is a compression rate of a whopping 88.7%! Of course, this is only because we are dealing with 13 very similar images. 67.2% of the file size is the start image, and without a better image compression algorithm we can do very little to improve that. That means the 12 remaining images use 2.7% each (1/13 is 7.7%), not much to work with, but I believe I can still make improvements.

This is just one case though; while uncommon, some images still need further optimization to get near-perfect results. I have tried compressing an entire CG collection of 154 images and my results were as follows:

Chart showing compression ratios for an entire CG collection

Compared to 7z-compressed BMP there was an improvement of 24.0%, and compared to WebP it is 61.1%. On average the set contained 3.92 variations per image; cgCompress manages to fit 2.57 times as many images into the same space as ordinary WebP. The difference between those two numbers is the overhead cgCompress requires to recreate all 3.92 variations per image, and it depends on how different the variations are. While I don’t know how low it can get, I do believe there is room for improvement here.

I included lossy WebP here as well (done at quality 95) to give a sense of the difference between lossless and lossy compression. cgCompress definitely closes the gap, but if you don’t care about your images, lossy WebP is still the way to go. (It should be possible to use lossy compression together with the ideas used in cgCompress though.)

Conclusion

cgCompress can significantly reduce the space needed to store visual novel CG collections. While it is only a moderate improvement of ~25% over 7z-compressed BMP, compressed archives only work well for archiving or transferring over networks; cgCompress, being based on OpenRaster, has proper thumbnailing, viewer support and potentially meta-data. With PNG and WebP being the direct contenders, cgCompress provides a big leap in compression ratio.

On a personal note, going from concept to something I can use in 8 days is quite an achievement for me. While the cgCompress code isn’t too great, I’m still quite happy with how this turned out.



Jan 25 2014

A year of Overmixing

Category: Overmix, Programs, Software | Spiller @ 01:54

It is now about a year since I started this project, and not in my wildest dreams would I have imagined how much work I would put into it. So here is a quick overview of my progress.

(Overmix is now located at github: https://github.com/spillerrec/Overmix, download here: https://github.com/spillerrec/Overmix/releases )

Automatic aligning

While zooming and rotating are not supported (yet at least), horizontal and vertical movement is rather reliably detected and can be done to sub-pixel precision. Sub-pixel precision is rather slow and memory intensive though, as it works by image upscaling. It still needs some work when aligning images with transparent regions, and I also believe it can be made quite a bit faster.

De-noising

Noise is removed by averaging all frames and works very well, even on Blu-ray rips. However since I have not been able to render while taking sub-pixel alignment into account, it blurs the image ever so slightly.

10-bit raw input and de-telecine

I have made a frame dumper to grab the raw YUV data from the video streams, to ensure no quality loss happens between video file and Overmix. This actually showed that VLC outputs in pretty low quality, and appears to have some trouble with Hi10p content and color spaces. Most noticeable however is that VLC uses nearest-neighbor for chroma upsampling, which looks especially bad with reds. Having access directly to the chroma channels also opens up the possibility of doing chroma restoration using super resolution.

I have made several tools for my custom “dump” format, including the video dumper, a Windows shell thumbnailer (Vista/7/8), a Qt5 image plugin and a compressor to reduce the file size. They can be found in their separate repository here: https://github.com/spillerrec/dump-tools

De-telecine has also been added, as it is necessary in order to work with the interlaced MPEG2 transport streams anime is usually broadcast in.

Deconvolution

Using deconvolution I have found that I’m able to make the resulting images much sharper, at the expense of a little bit of noise. It appears that anime is significantly blurred, perhaps because of the way it has been rendered, or perhaps intentionally. More on this later…

TV-logo detection and removal

It works decently as shown in a recent post and can also remove other static content such as credits. I have tried to do it fully automatically, but it doesn’t work well enough compared to the manual process. Perhaps some Otsu thresholding combined with dilation will work?

I’m intending to do further work on this method to see if I can get it to work with moving items, and to do the reverse, separating out the moving item. This is interesting as some panning scenes move the background independently of the foreground (for faking depth of field).

Steam removal

A proof-of-concept showed that if the steam is moving, we can choose the parts from each frame which contain the least amount of steam. Thus it doesn’t remove it completely, but it can make a significant difference. The current implementation however deals with the colors incorrectly, which is fixable, but not something I care to do unless requested…

Animation

Some scenes contain a repeating animation while doing vertical/horizontal movement. Especially H-titles seem to have much of this, but it can also be found as mouth movement and similar. While still in its early stages, I have successfully managed to separate the animation into its individual frames and stitch those, using a manual global threshold.

I doubt it would currently work with minor animations, such as changes to the mouth; noise would probably mess it up. So I’m considering investigating other ways of calculating the difference, perhaps using the edge-detected image instead, or doing local differences.



Nov 23 2013

Too much steam in your anime?

Category: Anime, Overmix, Programs, Software | Spiller @ 03:54

Well, Overmix is here with a dehumidifier to solve your problem. Too damp? Run it once and watch as your surroundings become clearer.

Your local hot spring before:

and after:

Can’t get enough of Singing in the rain? Don’t worry, just put it in reverse and experience the downpour.

Normal rainy day:

The real deal:

This is another multi-frame approach, and really just as simple as using the average. Since the steam lightens the image, all you have to do is take the darkest pixel at each position. (In other words, the lighter the pixel is, the more likely it is to be steam.) Since the steam is moving, this way you use the least steamy parts of each frame to get a stitched image with the smallest amount of steam.

If we do the opposite, taking the brightest pixel, we can increase the amount of steam. That is not really that interesting, but the second example shows how we can use this to bring out features that would otherwise be treated as noise. We could also combine it with the average approach using a range, to deal with the real noise, but I did this for fun so I didn’t go that far.
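
A minimal sketch of both directions, operating on aligned grayscale frames stored as plain arrays (not the actual Overmix code, which among other things has to handle color):

// Keep the darkest value at each position across the aligned frames,
// i.e. the least steamy sample we have seen for that pixel.
function leastSteam( frames ) {
  var out = frames[0].slice();
  for( var i = 1; i < frames.length; i++ )
    for( var j = 0; j < out.length; j++ )
      out[j] = Math.min( out[j], frames[i][j] );
  return out;
}

// The opposite: keep the brightest value, maximizing the steam/rain instead.
function mostSteam( frames ) {
  var out = frames[0].slice();
  for( var i = 1; i < frames.length; i++ )
    for( var j = 0; j < out.length; j++ )
      out[j] = Math.max( out[j], frames[i][j] );
  return out;
}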

While this is a fairly simple method, it highlights that we can use multiple frames not just to improve quality, but also to analyze and manipulate the image. I have several neat ideas I want to try out, but more about those when I have something working.


Nov 09 2013

Colorspaces and VLC

Category: Overmix, Programs, Software | Spiller @ 21:23

There are two colorspaces commonly used in video today, defined in Rec. 601 and Rec. 709 respectively. Simply speaking, Rec. 601 is mainly used for analog sources, while Rec. 709 is mainly for HD TV and BD.
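
The practical difference lies in the conversion from Y’CbCr to RGB. Simplified to full-range values (real video additionally uses studio swing, 16-235 for luma), the two variants look roughly like this:

// Convert a full-range Y'CbCr sample to R'G'B', using either Rec. 601 or
// Rec. 709 coefficients. Cb/Cr are centered at 128.
function yccToRgb( y, cb, cr, rec709 ) {
  cb -= 128;
  cr -= 128;
  if( rec709 )
    return [ y + 1.5748*cr, y - 0.1873*cb - 0.4681*cr, y + 1.8556*cb ];
  else // Rec. 601
    return [ y + 1.4020*cr, y - 0.3441*cb - 0.7141*cr, y + 1.7720*cb ];
}

Decode Rec. 709 material with the Rec. 601 coefficients and the colors shift slightly, most visibly in saturated reds and greens.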

So how does VLC handle this? It assumes everything is Rec. 601 and you get something like this:

The bottom left is from a DVD and the top right is from the BD release. In comparison, here is how it looks in Overmix, using Rec. 601 for the DVD release and Rec. 709 for the BD release:

VLC also seems to ignore the gamma difference between Rec. 601/709 and sRGB, and it handles 10-bit content in a way that reduces color accuracy to worse than 8-bit sources. Behold, the histogram from a Hi10p source:

Free stuff might be nice, but this is what you get…

EDIT: I messed up the studio-swing removal in Overmix (which is now fixed), so the colors were slightly off. It was consistent between rec.601/709 so the comparison still holds. Overmix might be nice, but this is what you get…



Nov 03 2013

First Super Resolution results

Category: Overmix, Programs, Software | Spiller @ 23:22

Just five months later… Here are some early results using artificial data.

Using Wikimedia Commons’ “picture of the day” for October 31, 2013 by Diego Delso (CC BY-SA 3.0), I created LR (Low Resolution) images which were 4 times smaller in each direction. Each LR image had its own offset, so to have one LR image for every possible offset, 16 images were created.

To detect the sub-pixel alignment afterwards, the images were upscaled to 4x their size and ordinary pixel-based alignment was used. The upscaled versions were only used for the alignment and were discarded afterwards. The final image was then rendered at 4x resolution using cubic interpolation, taking the sub-pixel alignment into account. Lastly the image was deconvolved in GIMP using the G’MIC plugin to remove blur. The results are shown below:

The left side shows the LR image (upscaled using nearest neighbor interpolation) and the original image respectively. The right side shows the SR (Super Resolution) results, using different interpolation methods. Both are cubic, however the top uses Mitchell and the bottom uses Spline. In simple terms, Spline is more blurry than Mitchell but has fewer blocking artifacts. Mitchell is usually a pretty good choice (as it is a compromise between several other cubic interpolation methods), however the blocking is pretty noticeable here. Using Spline avoids that, and since we attempt to remove blur afterwards it works pretty well. Do notice though that Mitchell recovers slightly more detail in the windows to the right.

But while Mitchell often does appear to be slightly sharper, it tends to mess up more often, which can clearly be seen on the “The power of” building to the left. The windows are strangely mixed up into each other, while they are perfectly aligned when using Spline.
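
To give an idea of what “taking the sub-pixel alignment into account” means in the rendering step, the simplest possible variant is shift-and-add: scatter every LR sample onto the 4x grid at its known offset and average whatever lands on each HR pixel. (I render with cubic interpolation instead; this sketch just shows the principle.)

function shiftAndAdd( lrImages, scale, width, height ) {
  // lrImages: [ { data: grayscale samples, dx: 0..scale-1, dy: 0..scale-1 }, ... ]
  var hw = width * scale, hh = height * scale;
  var sum = new Float64Array( hw * hh );
  var count = new Float64Array( hw * hh );

  lrImages.forEach( function( img ) {
    for( var y = 0; y < height; y++ )
      for( var x = 0; x < width; x++ ) {
        // Place the LR sample at its sub-pixel position on the HR grid
        var hx = x * scale + img.dx, hy = y * scale + img.dy;
        sum[ hy * hw + hx ] += img.data[ y * width + x ];
        count[ hy * hw + hx ]++;
      }
  } );

  var out = new Uint8ClampedArray( hw * hh );
  for( var i = 0; i < out.length; i++ )
    out[i] = count[i] ? Math.round( sum[i] / count[i] ) : 0;
  return out;
}

With 16 LR images covering all 4x4 offsets, every HR pixel receives exactly one sample; with real data some pixels stay empty and have to be interpolated.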

Conclusion

The results are much better than the LR images, however it is more a magnification of 2x than the optimal 4x. And to make matters worse, this is generated, optimal data without blur or noise.

However this is the simplest way of doing SR and I believe other methods do give better results. Next I want to try the Fourier-based approach which is also one of the early SR methods. It should give pretty good results, but it is not used much anymore because it does not work for rotated or skewed images.

Using artificial data has really shown me why I have had so little success with it so far. I’m mainly working with anime screenshots, and the amount of detail which can be restored is probably not that much. My goal is actually more to avoid the blurriness that happens when the frames are not aligned perfectly. Thus, while it should have been obvious, lesson learned: do not test on data where you are not sure whether it will give a result or not… What I did gain from this is that anime tends to be rather blurry and that image deconvolution can help a lot. When I understand this blurriness in detail I will probably write more about it.


Jun 07 2013

Overmix and Super Resolution?

Category: Anime, Overmix, Programs, Software | Spiller @ 22:43

As I was researching digital signal processing I found an interesting term: Super Resolution. Super Resolution is a field which attempts to improve the resolution of an image by using the information in one or more images. This is exactly what I was doing with Overmix, using multiple images to reduce noise.

However another aspect of Super Resolution uses sub-pixel shifts in the images to improve the sharpness of the image. This could not only solve the issue with the imperfect alignment I was having, it could outright improve the quality further than I had thought possible.

(I had actually tried to use sub-pixel alignment when I ran into the issue, and I speculated it might increase sharpness. But after much work I only managed to make it align properly without reducing the blur I was seeing even without it, so I didn’t press it further.)

Limits

Super Resolution has its limits however. First of all, as it tries to estimate the original image, it cannot magically surpass it and give unlimited precision. If the image was created in “480p”, even a 1080p BD upscale will still only give the “480p” image. If the original was blurry by nature, Super Resolution will result in a blurry image as well, unlike a sharpness filter.

And that raises the question, why is anime blurry and why does it not align on the pixel grid? With one sample, I got the same misalignment with both the 720p TV version and the 1080p BD version. If this was caused by downscaling the issue would be smaller at 1080p, however it isn’t. Most anime does not appear to push the boundaries of 1080p, but since there are misalignment issues I suspect their rendering pipeline isn’t optimal.

The other limit is the available images used for the estimation. If the images we have do not contain any hints of what the original image looks like, we can’t guess it. Thus if there are no sub-pixel shifts in an image, Super Resolution can’t do much. And that is actually an issue because most slides only move vertically, which means we only have vertical sub-pixel shifts. In those cases we can only hope to improve detail in the vertical direction.

Using all available information

Since Super resolution uses the information in the images, the more we can get the better.

First of all, the closer we can get to the source the better, as we don’t have to estimate the defects that happen on each conversion. A PNG screenshot is better than a JPEG, and the TV MPEG2 transport stream is better than a 10-bit re-encode.

One thing to notice here is that the PNG screenshot is (with all players I have tried) an 8-bit image, not 10-bit (16-bit*) as with Hi10p h264. So using PNG screenshots would lose us 2 bits.

More importantly however, PNG cannot represent an image from an MPEG stream directly. The issue is that PNG only supports RGB and MPEG uses Y’CbCr. Y’CbCr is a different color space invented to reduce the required bandwidth of images/video. The human eye is most sensitive to luminance and not so much to color, which Y’CbCr takes advantage of. MPEG then (normally) uses chroma subsampling, which is the practice of reducing the resolution of the planes containing color information. A 1280×720 encode will normally have one plane at 1280×720 and two at 640×360.
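
To put a number on how much the subsampling saves, a quick comparison of sample counts (assuming 8-bit samples):

// Samples needed per frame: packed RGB versus 4:2:0 subsampled Y'CbCr
function samplesPerFrame( width, height ) {
  var rgb = width * height * 3;                 // R, G and B at full resolution
  var ycc = width * height                      // Y' at full resolution
          + 2 * (width / 2) * (height / 2);     // Cb and Cr at half resolution in both directions
  return { rgb: rgb, ycc420: ycc };
}
// samplesPerFrame( 1280, 720 )  ->  { rgb: 2764800, ycc420: 1382400 }

That is half the raw data before the actual compression even starts.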

So to save as a PNG, the video player upscales the chroma planes and converts to RGB, losing valuable information.

Going even further, video is compressed using a combination of key- and delta-frames. Key-frames store a whole image while delta-frames only store how to get from one frame to another. The specifics of how those frames were compressed is again valuable information. (But I don’t know much about how this is done.)

Status of Overmix

Overmix now accepts a custom file format which can store 8- and 10-bit chroma subsampled Y’CbCr images. I created an application using libVLC that takes the output with minimal preprocessing  and stores it in this format. (It also makes it easier to save every frame in the slide.)

Overmix now only uses the Y’ plane to align on, instead of all 3 channels in RGB. My next goal is to redo the alignment algorithm. Currently it renders an average of all previously added images to align against, as otherwise the slight misalignment would propagate with each added frame. However I will now try a multi-pass method, where it roughly aligns all images and then does a sub-pixel alignment afterwards. Sub-pixel alignment will, at least at the start, be done by upscaling, as optical flow makes no sense to me yet.

Then I need to redo the render system, as it is currently optimized for aligned images, and this will clearly not be the case anymore.

I haven’t worked on Overmix for quite some time due to University stuff, but for the next three months I should have plenty of time, so hopefully I will get it done before they are over.



Feb 28 2013

Stitching anime screenshots in overdrive

Category: Anime, Overmix, Programs, Software | Spiller @ 00:44

I have been developing a new application named Overmix, which attempts to improve the quality of anime screenshot stitching. This article will briefly explain what stitching is, what issues affect the quality and how Overmix tries to fix those. At the end a short summary of the results of the current progress is given.

Background

One common animation technique is panning, where the camera moves/pans over the image, showing only a part of it at a given time:

animation of pan shot

(Shot on YouTube: http://youtu.be/DsHjblyEG88?t=6m25s)

Very little movement actually happens during the shot, in fact only the mouth is moving (presumably to reduce animation costs). This makes it possible to combine the frames together to one large image, which is known as “stitching”.

Source quality

The issue is however that more often than not, the video quality isn’t that great. The video has been compressed and especially if the source is a TV-transmission or webcast, visual artifacts can be quite noticeable:

Example of noise artifacts

The two most significant artifacts in anime encodes are noise (shown above) and color banding/posterization (shown below).

Example of color banding

Reducing artifacts

A stitch is normally done by taking two frames, finding the offset between the two images and then softening the edges between the images to make the transition less apparent (usually done by applying a gradient on the alpha channel).
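
That softening is essentially a linear alpha ramp across the overlap, something like this (a sketch for a single pair of overlapping pixels, not any particular tool’s code):

// Blend two overlapping pixels with a linear alpha ramp:
// at position i (0 .. overlap-1) within the overlap, the left image
// fades out while the right image fades in.
function blendSeam( leftValue, rightValue, i, overlap ) {
  var alpha = (i + 0.5) / overlap;
  return Math.round( (1 - alpha) * leftValue + alpha * rightValue );
}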

Since this is a time-consuming process, as few frames as possible are used. The idea here is to do the opposite: use as many frames as possible. The reason is that the artifacts are not static; they differ slightly for every frame. As a result, every frame carries a slightly different set of information. The goal is then to derive the original information based on this set of inconsistent information.

Just by using the average, we can get quite decent results:

Comparison between average and single

(Right is a single frame, left is the average of all unique frames.)
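
The averaging itself is as simple as it sounds; a minimal sketch over aligned grayscale frames (the real thing of course has to handle the alignment offsets and color):

// Average the pixel values of all aligned frames to suppress the random noise
function averageFrames( frames ) {
  var sum = new Float64Array( frames[0].length );
  for( var i = 0; i < frames.length; i++ )
    for( var j = 0; j < frames[i].length; j++ )
      sum[j] += frames[i][j];

  var out = new Uint8ClampedArray( sum.length );
  for( var j = 0; j < sum.length; j++ )
    out[j] = Math.round( sum[j] / frames.length );
  return out;
}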

Results

Noise artifacts have been shown to nearly disappear completely when simply averaging all the frames together, even when the source has a significant amount of noise. Color banding is also reduced, but by much more varying amounts.

Even with modern TV encodes, stitches see a significant improvement from using this technique, and the difference is visible at normal magnification. Surprisingly, even with good BD encodes there is usually a slight improvement, but it normally requires 2-4x magnification to be noticeable.

It has turned out that it is often not possible to make a perfect alignment when sticking to the pixel grid. This causes the images to be slightly more blurry than the original. It is an area which still requires work.

Using the average to derive the result is not always desirable, as the encode might contain information not related to the image. Such information could be subtitles, TV logos or simply errors in the source. See the following image as an example; the right-most column of pixels was completely black and shows up as lines in the averaged image.

Stitched with Overmix

However the currently devised algorithms have a tendency to choke on the slight misalignment mentioned previously and cause unwanted artifacts. Whether this is best solved by fixing the misalignment or by improving the algorithm is up for discussion.

Binaries and source code

Overmix is licensed as GPLv3 and can be found here: Overmix on Github

Binaries for Windows 64-bit can also be found on Github here: Overmix releases



Feb 16 2013

Headphone stand

Category: Lego, Technics | Spiller @ 17:31

The wires on my headphones kept breaking, so I decided to buy a set of wireless ones. I bought the Sennheiser RS 160, which uses a portable transmitter, instead of the larger stationary RS 170. However the RS 170 transmitter doubles as a headphone stand (and charger), and I would like to have some way of safely storing my headphones, so I built a stand in Lego Technics.

I had two ambitions: to make it look a bit more fancy than what I usually build, and to mainly use pieces that I rarely use (as otherwise I would probably disassemble it if I needed them). When I noticed the large box of angled beams I never use, it seemed perfect for this.

Headphone stand made with Lego Technics

In the end it just became two plates with 4 straight beams built using angled beams, but I still think it gives it a nice touch.

The beams are not locked and are free to move, which should make the stand very unstable, however it is not. Partially because of the friction pins, but mainly because the beams are each placed at a slightly different angle than the others. So when the top plate wants to move in one direction, one of the beams will restrict this movement as its angle is slightly off. It is not perfect, but as long as you are not rough with it, it stands.


