byRo
an Englishman in Brazil

Posts: 138
Filters: 8
Maybe Vladimir is going to find a way to justify this, but from where I'm sitting this looks very, very wrong. :evil:

We all know that Blur components aren't the speediest, but I have been noticing that, even when composing filters that don't use them, certain configurations just seem to grind to a complete halt!
I just made one using only the "faster" components (Tile, Blend, Offset), which I gave up on after an hour of calculating. The difference was that it had four stages in cascade.

Always one for a bit of investigative fun, I made a test filter (see below). This one does use blur, so I could time it properly.

I ran the filter with just one blur component, then added another in cascade, then finally a third (as in the image attached).

For one blur component the filter took 14 seconds to finish (yes, my machine is that slow).

When adding a second blur stage, it should take the output from the first and do a new blur, i.e. it should take another 14 seconds (or less). No, it took 50 seconds.

With the third stage, my simple filter theory says total time should be around 40 seconds - but no, this simple little filter took 235 seconds (just shy of 4 minutes).

These results indicate, to me, that the Blur output is not being cached at all - at a guess I'd say it's doing something like this:

1) Blur 1;
2) Blur 1; For Blur 2: do Blur 1 again, now do Blur 2;
3) Blur 1; For Blur 2: do Blur 1 again, now do Blur 2; For Blur 3: do Blur 2 again (For Blur 2: do Blur 1 again, now do Blur 2), now do Blur 3... and some more.
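Just to put my guess into something countable, here's a tiny back-of-the-envelope sketch in Python (purely my own guess at the behaviour, nothing to do with Filter Forge's actual internals), counting blur passes the same way as the list above:

# Toy model of my guess above -- hypothetical, not Filter Forge's real evaluator.
def uncached_passes(n_stages):
    """Blur passes executed when every stage rebuilds its inputs from scratch."""
    total = 0
    for stage in range(1, n_stages + 1):
        # producing stage N's output re-runs stages 1..N all over again
        total += stage
    return total

for n in (1, 2, 3):
    print(n, "stage(s):", n, "passes if cached,", uncached_passes(n), "if not")

Even this crude model grows faster than linearly, and my actual timings (14s, 50s, 235s) grow faster still, so each repeated pass probably also costs more than the first.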

It's going to be sad if the FFFolks say that's the way it's got to be, 'coz this will severely limit filter complexity.



_________________________________
My favourite question is "Why?".
My second favourite is "Why not?"
uberzev
not lyftzev

Posts: 1890
Filters: 36
Yeah, that's weird!
onyXMaster
Filter Forge, Inc.
Posts: 350
The short answer:
Try enabling seamless mode; all blur-based components are a whole lot faster that way and won't exhibit this "strange" behavior.

The long answer:
First, about caches -- a single Blur component has _three_ internal caches, so it's much more complex than it looks on the surface.

In seamless mode, everything is okay: we know the region bounds (which are effectively equal to the seamless wrapping region), so (if we're not talking about rotated motion blur) we can effectively lock everything inside a region that is known in advance.

Now, when seamless is disabled, things get a lot more complex.
Imagine you take a part of an image sized (W)75 x (H)75 pixels (real sizes for a 600x600 image). Imagine that the blur radius (R) is 10 and the image size is 600x600, with the Size slider set to 600.
Question -- how many samples of input data do you need to calculate that part? The most obvious answer (W*H = 75*75 = 5625) is completely wrong. The absolute minimum is (Size * R / 100 * 2 + W + 2) * (Size * R / 100 * 2 + W + 2), which equals (600 * 10 / 100 * 2 + 75 + 2) * (600 * 10 / 100 * 2 + 75 + 2) = 38809. So with the specified parameters we need (38809 / 5625 = 6.8993(7)) ~6.9 times as many input samples as there are pixels in the block. So _without_ a cache, each "layer" of blur of radius 10 on the default image would be _at least_ seven times slower (actually even more).

The practical minimum also has to take into account some internal implementation quirks which make non-seamless blur possible at all (don't forget that you can ask for blurred data that lies outside your image by using Offset or any other distorter, so we need to be able to create it on demand, along with caching) -- and the practical minimum is even bigger than you might imagine, roughly 14 times larger than the original image for the specified parameters.
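If you want to check the arithmetic yourself, the minimum-samples formula above is trivial to poke at -- here's a throwaway Python snippet (just the formula restated, not our actual renderer code):

# Throwaway arithmetic for the minimum-samples formula above
# (not renderer code; the block is assumed square, W = H).
def min_input_samples(block, size, radius_pct):
    span = size * radius_pct / 100 * 2 + block + 2
    return span * span

block, size, radius = 75, 600, 10
needed = min_input_samples(block, size, radius)
print(needed)                    # 38809.0
print(needed / (block * block))  # ~6.9x the block's own pixel count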

The good thing is that I spent the last week and a half optimizing cache cell grid alignment along with cell size calculation, which led to a dramatic (30-290%) decrease in excessive sampling while still being viable for heavily supersampled filters. Along with some SSE2-based optimizations done about a month before, some blur-based filters are seeing more than 3x rendering time improvements, and the larger the radius, the better the gains generally are (the percentage gain decreases, but absolute values steadily improve).

The conclusion: Non-seamless bitmap-based effects are very complex to implement efficiently. This is the reason that typical texture generators either don't have blur and the like at all, or have it implemented terribly slowly, in a brute-force way. Filter Forge uses a novel approach based on deferred calculation along with caching, which is also multithreading-capable (I don't know of any similar algorithm which supports multithreaded processing of the same image efficiently). While this approach allows us to perform non-seamless blurs at acceptable speed, it has its drawbacks, which are difficult to overcome, and while we (I, really) put a considerable amount of effort into improving the performance of bitmap-based components (specifically blur), there is no silver bullet for all cases of blur -- its performance is highly dependent on the performance of the underlying tree, the radius, the number of blur "layers", etc.

General performance recommendation: If you have a CPU slower than a P4 3 GHz or an AMD 3000+, buy a faster one (preferably multicore, they are cheap now). Gains from dual-core CPUs are really close to 2x, with 1.87x being the average for most filters. If you're low on memory (<512 MB), get another stick of RAM; going to 1 GB or even 2 GB will help.
onyXMaster
Filter Forge, Inc.
Posts: 350
And please, post the offending filters here. We're especially interested in those that do not use Blur, Motion Blur, Sharpen or High Pass.
byRo
an Englishman in Brazil

Posts: 138
Filters: 8
Quote
onyXMaster wrote:
The short answer: Try enabling seamless mode; all blur-based components are a whole lot faster that way and won't exhibit this "strange" behavior.
Aha! Short answer = quick fix. :D

Yes, that is a LOT quicker, but (there's always a but) it seems that enabling seamless is a "user-side" option and not a filter parameter.
In other words, if (just as an example :| ) I make a three-stage blur filter, there is no way I can force its use only in seamless mode - it would require "user" intervention to select the option.

..or am I missing something?

Rô
(I'm still digesting the long answer 8) )
_________________________________
My favourite question is "Why?".
My second favourite is "Why not?"
byRo
an Englishman in Brazil

Posts: 138
Filters: 8
About the long answer:

I can see from your explanation (and in practice :D ) that the "seamless" blur is a lot quicker.
What I still don't quite get is why the execution time builds up (exponentially?) with each stage.

If I get this right, you are saying that the final image is calculated one "tile" at a time and is not cached as a whole. So when we do a blur we are fetching "pixels" from outside the available tile cache.

If that is true, then it would explain why the attached filter brings my machine to a halt. :cry:

The filter mixes up samples from all over the input image - in this case to detect a "white point". So for every output tile (pixel?!) the whole filter has to be recalculated for the entire image.

Rô

Mix.ffxml
_________________________________
My favourite question is "Why?".
My second favourite is "Why not?"
onyXMaster
Filter Forge, Inc.
Posts: 350
The final image is calculated one "tile" (block is the correct internal term) at a time and is stored in a cache cell. There is no "whole" in the non-seamless case; all the tiles that are needed are created, stored and destroyed (based on an LRU cache) automatically, so you only have something close to the "whole" after the image has been rendered.

The execution time builds up because if the first blur (600x600 image, radius 10) needs 600*600*k input pixels, where k > 1 and is close to 12, the second blur needs 600*600*(k+k2) pixels, where k2 is also close to 12, and so on. I have already worked on reducing the generic "k" value for the most common cases, and this leads to significant improvements, but it cannot be brought below 7 for radius 10. To calculate a blur you need more pixels than you can see in the output (that's obvious). To calculate a blur over a blur, you need even more, and so on...
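To put rough numbers on that build-up, here is a quick Python sketch (the k values are the approximate ones quoted above, not measured figures from the renderer):

# Rough model of the per-layer sample build-up described above
# (k values are approximations, not measured renderer figures).
IMAGE_PIXELS = 600 * 600
K = 12   # ~12x input samples per blur layer at radius 10

total_k = 0
for layer in (1, 2, 3):
    total_k += K   # k, then k + k2, then k + k2 + k3, ...
    print(layer, "blur layer(s): ~", IMAGE_PIXELS * total_k, "input samples")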

Also, I'll look into the problem in the provided filter. :)
onyXMaster
Filter Forge, Inc.
Posts: 350
Well, I took a quick look at the filter and I see your point -- my home machine, while being very far from "slow" (Athlon 64 X2 3800+, 2 GB RAM), is crawling on this filter.

The next release will contain (already implemented) improved handling of large component trees, which will make editing such filters less painful -- it will switch to such filters much faster, enter the editor a lot faster and even render them faster.

Unfortunately, since I don't have fresh sources right here (I'm at home) and the source control server is inaccessible to me right now, I'm not declaring the problem solved just yet; I will try to determine the source of the slowdown when I get to the office and have some time to spend on this (approximately next Wednesday).
byRo
an Englishman in Brazil

Posts: 138
Filters: 8
onyXMaster, thank you for taking the time to share such interesting replies.

Can't wait for the next release. :D

Rô
_________________________________
My favourite question is "Why?".
My second favourite is "Why not?"
