\Quite some time ago, way back to the time of nxtasy.org, I was writing a Bézier curve implementation in NXC. Muntoo was also writing one at the time, you can check his out on his blog, here. His implementation is generally more feature rich, like several different drawing modes, while mine is geared towards pure speed.

To start it off, here is a screenshot of the current progress:

The process is two-step, a initiation function which calculates some basic stuff which takes 11 msec, and then the drawing functions. In the screenshot there are 10 cubic curves drawn by the general Bézier algorithm which takes 419 msec in total, so ~42 msec per function call in average. (The first draw with a new amount of control points is a bit slower as some of the calculations are reused.)

Right now there is only a general Bézier algorithm, the one I optimized for cubic curves isn’t working correctly…

### Implementation – v1.1.0

The general Bézier formula looks like this:

So to draw a Bézier curve you would start with t=0, calculate the x and y positions, increment t by 1/”desired precision” and recalculate. This process is repeated until t=1. There is a lot of calculation needed for each t value, yet we might be able to do some of the calculation that doesn’t directly involve t beforehand.

We could for example calculate the binomial for each P and store it in C like this:

### Implementation v2.+

NXC is compiled into a bytecode format in which math opcodes are polymorphic. If you try to multiply an array with a scalar, the scalar will be multiplied with each of the elements in the array. This is however a lot faster than manually iterating a loop multiplying each element.

So my idea was to take advantage of this by making an array containing each value of t. Instead of iterating a loop, you would simply do the math operations on this array. The array of t values would also only be calculated once in the beginning of the program, instead of each time you need to draw the curve.

So for the Bézier algorithm there would be two arrays, one containing each value of t and another containing each value of 1-t.

However we want to optimize this further. Since we start at i=0, lets consider that case:

We will only need to calculate , so lets try finding the difference between the first iteration and the next:

We can therefore get to the next iteration by multiplying into and into the next iteration by multiplying again:

This fraction can be calculated without knowing n or i, so we can calculate this at the start of the program and reuse it. We will need to calculate each time we get a new n value (but we can keep the array, so if the next curve have same n value, we also reuse this (and when done right, even the binomial)).

However a small loop is still needed to calculate the sum since ArraySum() isn’t polymorphic. Currently there is also a little bit of multiplication in the loop, but I’m working on removing that too.

The major downside with this approach is that if you use high resolutions the arrays become rather large (and they are using float to boot) so memory usage rise quite a bit. (Some temporary arrays are also needed, so it quickly becomes a few KB.)

### Download

P.S. I should really have considered reading a tutorial about LaTeX before trying to add all that math which needs to be written in LaTeX…

February 16th, 2011 04:44

Your optimizations are really nice! (Taking advantage of NXC-exclusive

`ArraySum()`

, for instance. How much faster is it than doing it “manually”? And what about the array multiplied by scalar vs “manual methods”?)I will see if I can implement some of these optimizations in my Bezier Curve program (v0_4), when I have some more time.

February 16th, 2011 12:34

I wanted to take advantage of ArraySum(), but most of the additions in the EF can only accept simple types (according to John) so I wasn’t able to. pow did however work with arrays, which was what I needed. But if you try to use pow on a struct it will fail.

It is a bit hard to compare the time it takes with a normal loop because in a normal loop with an array there is some code to do the array, you need some more code to extract the value out of the array and then place it back into the array once you are done.

However, to give you an idea I can at least tell how long it takes to do it in this way. I just noticed today that I forgot to round the values properly, so I had to add 0.5 to all the values. This caused the total time to rise to 428 msec. That is 1000 floating point additions in 9 msec.

The interesting is though that it becomes more and more efficient when the array length increases. Here is the time with different resolutions:

10: 103 msec

25: 228 msec

50: 428 msec

100: 511 msec (And the curve looks pretty nice at 100 btw)

200: 526 msec

500: 577 msec

Using PolygonOut() also helps a lot. And I forgot to mention, there are two defines, BEZIER_TEMP_POLYGON_TYPE which contains the name of the low-level polygon struct and BEZIER_TEMP_POINTS which contains the LocationType will all the resulting points. So if you want to redraw it just copy BEZIER_TEMP_POLYGON_TYPE into a safe place (as it will be overwritten at next draw), or if you want to just use the calculation, take the values in BEZIER_TEMP_POINTS.

(If you want to calculate the points without drawing, define BEZIER_NODRAW. If you want to draw with that defined, you need to call bezier_draw() manually after a bezier_general() call.)

July 23rd, 2011 23:59

I know this is a bit late, but with my test program for your Bezier 2.4.0, I get:

(degree, repeats, resolution, total time)

1D, 10i, 50, 210 ms

2D, 10i, 50, 315 ms

3D, 10i, 50, 418 ms

4D, 10i, 50, 527 ms

With my Bezier Curves v0_3:

1D, 10i, 50, 452 ms

2D, 10i, 50, 649 ms

3D, 10i, 50, 824 ms

4D, 10i, 50, 1600 ms

(Level 3 Optimization, Next Byte Codes Compiler version 1.2 (1.2.1.5, built Wed Jul 20 07:04:19 CDT 2011); test_release20110720.zip)