Crapadilla
Very nice contribution, Sphinx. Thank you!
One never stops learning!
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 16, 2007 10:41 am | ||||||||||
Crapadilla
I'm wondering whether we should start setting up broader categories for the entries. We've already got several entries on 'Blur' and 'Noise' for example...
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 16, 2007 10:44 am | ||||||||||
Sphinx.
Yeah, that might be a good idea. I have one more entry on blur, high passes and sharpens brewing.
Btw, how about merging in the "Optimizing Filters for Speed" section?
|||||||||
Posted: December 16, 2007 10:54 am | ||||||||||
Crapadilla
Alright, I'll reorganize the entries once that second entry of yours has landed.
Yup. That might be a good idea, since it's also about efficiency!
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 16, 2007 11:00 am | ||||||||||
Sphinx.
Having second thoughts on that entry. It's about using offsets to construct a low radius blur as an alternative to using the blur/highpass/sharpen component for low radius filtering (as shown, once we have blur, we also have High Pass and Sharpen). But maybe I should just check that in as a snippet instead; I think that is better. So just go ahead with the reorganization.
|||||||||
Posted: December 16, 2007 11:17 am | ||||||||||
Crapadilla
Reorganized!
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 16, 2007 12:45 pm | ||||||||||
Sjeiti
As Sphinxmorpher mentioned... it might be good to merge the 'Optimizing Filters for Speed' since all speed optimizations could be considered as DO's.
Then again it might be good to have a separate page with speed optimizations (you might not want to know about all the dont's when you're just looking for speed). There are probably lots of entries that would fall under multiple categories. For instance the Channels would also perfectly fit under Hints, tips and tricks as well as Optimizing Filters for Speed. In such cases maybe the best practice would be to put it into the page it would best fit into and cross-reference to it from other pages. So maybe Channels in 'Optimizing Filters for Speed' would simply have a header and a link to the Channels entry in 'The DOs and DON'Ts of Filter Construction'. |
|||||||||
Posted: December 16, 2007 2:17 pm | ||||||||||
Crapadilla
An interesting side-effect of collecting random blips of FF wisdom for writing this article appears to be that I'm checking out all my filters to see whether I could optimize them further. During this process I'm finding that - here and there - I haven't always exactly practised what I preached!
Looks like I'll be updating some filters soon...
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 16, 2007 5:54 pm | ||||||||||
Vladimir Golovin
Administrator |
Yes, we render a bitmap version of the source input, then blur it. There's a short article in the Help on this -- it is linked from help articles of Blur, HighPass etc. Also, I believe I explained the sampling architecture in my earlier forum posts. Try a keyword search for 'sampling' and 'sample'. |
|||||||||
Posted: December 17, 2007 9:27 am | ||||||||||
Vladimir Golovin
Administrator |
Yes, this is correct. |
|||||||||
Posted: December 17, 2007 9:31 am | ||||||||||
Vladimir Golovin
Administrator |
There's an option 'Show Elapsed Rendering Time' in Tools > Options -- use it to measure the rendering time. |
|||||||||
Posted: December 17, 2007 9:33 am | ||||||||||
Sphinx.
Oh, it's not so much the functionality of the bitmap-based components I'm asking about; it's the way I use that rasterization process to buffer the input and thereby reduce the rendering time. Look at this filter and the comments for it. If you don't fancy this approach, I don't want to promote it further.
Ah, great! I missed that option. Btw, I know this has been discussed before, but some sort of rendering "report" would be really great. This report could describe where the performance culprits are, perhaps as an option we can enable in filter construction mode, which then adds some sort of rendering time percentage to each module (including its subtree).
|||||||||
Posted: December 17, 2007 9:59 am | ||||||||||
Crapadilla
Did you check this? --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 17, 2007 10:06 am | ||||||||||
Sphinx.
Yes, but the idea is different and more complicated there, and the sample count may not express exactly how fast a filter is; it's the calculation of the sample that matters. It should be easier to sum up performance percentages from subtrees simply by timing the individual components. Let's say we have a simple filter with three equally fast perlins connected to the surface result. Each component would then have these timing labels:
result: 100%
perlin 1: 33.33%
perlin 2: 33.33%
perlin 3: 33.33%
or, perhaps more realistically:
result: 100%
perlin 1: 30%
perlin 2: 30%
perlin 3: 30%
This tells us that the result alone is responsible for 10% of the overall rendering time.
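The arithmetic behind such a report can be sketched in a few lines of Python. This is a hypothetical illustration only, not an existing Filter Forge feature: the `Node` class and `self_pct` helper are made-up names. Each component carries an inclusive percentage (itself plus its subtree), and its own share is that figure minus the inclusive figures of its direct inputs.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    inclusive_pct: float                      # time share of this component plus its subtree
    inputs: list = field(default_factory=list)

def self_pct(node):
    """Share of render time spent in the component itself, excluding its inputs."""
    return node.inclusive_pct - sum(child.inclusive_pct for child in node.inputs)

# Three perlins at 30% each feeding the result, as in the second example above.
perlins = [Node(f"perlin {i}", 30.0) for i in (1, 2, 3)]
result = Node("result", 100.0, perlins)

print(f"{result.name}: {self_pct(result):.0f}% own time")
```

With three inputs at 30% each, the result's own share comes out as 100 - 90 = 10%.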
|||||||||
Posted: December 17, 2007 10:17 am | ||||||||||
Sjeiti
Hey Crapadilla, Sphinx... maybe you could check this out...
Since that channels trick is pretty cool I thought I'd apply it in Rotated polaroid, since that one uses three rotations (offset) that could be combined into one. The weird thing is that my render time went up by ~175% after I applied the channels thing. So sometimes three offsets are faster than one offset plus some assemble/disassemble RGB.
|||||||||
Posted: December 18, 2007 2:53 am | ||||||||||
Sphinx.
Hmm, looking at that filter now. Are you talking about the cluster that creates the red and blue fringe artifacts?
|||||||||
Posted: December 18, 2007 4:13 am | ||||||||||
Sjeiti
No wait... attached.
In the original the actual rotation occurs 4 times; in this version (with help from the channels) it rotates twice.
Rotated Polaroid VI.ffxml
|||||||||
Posted: December 18, 2007 4:28 am | ||||||||||
Crapadilla
I have revisited some of my older filters and played around with the 'Channels' trick as well. Apparently it doesn't necessarily speed things up in all cases, so be careful! I'll have to experiment some more on this phenomenon. It appears you'll have to take into account the speed of the component you wish to concatenate. Offsets are 'ultra-fast' while the Channel operations are just 'fast'.
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 18, 2007 4:51 am | ||||||||||
Sphinx.
I also noted some cases where it's slowing things down. I think we should add a note about that, i.e. that you have to check whether the method is actually doing any good: the cost of setting up the channel combining, the intermediate combined processing, and finally the post channel extraction must not be higher than that of the alternative uncombined solution.
|||||||||
Posted: December 18, 2007 5:17 am | ||||||||||
Crapadilla
So, did anyone toil through all that wiki rambling so far? Any critique, comments, suggestions, wishes, hate-mail?
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 18, 2007 6:00 pm | ||||||||||
Kraellin
you lost me at the second post, though i did go look at one of the wiki pages of yours, dilla. looked good, but i just go to sleep when you start talking technical. i still have no idea how a blur speeds things up, though i notice you've been revising quite a few filters lately, so, at least you learned something
If wishes were horses... there'd be a whole lot of horse crap to clean up!
Craig |
|||||||||
Posted: December 19, 2007 3:16 pm | ||||||||||
Crapadilla
Which one?
Well, I guess that can't be helped then. The wiki article is about efficiency considerations, which will always be technical in nature. Actually, I was hoping the articles would convey a glimpse of the 'mindset' that one needs to develop as a filter author, an attention to certain details. It's always a balancing act between the technical on the one hand and the artistic on the other...
Me neither, I'm afraid.
Just optimizations mostly, but I will do a few complete overhauls here and there. --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 20, 2007 4:37 am | ||||||||||
Sphinx.
You should read the description and comments for this filter then, and if the method is still unclear after reading through that, just ask (in the comment thread please). The method will not be described in the DOs and DON'Ts before I know what Vlad/FF thinks about it.
Vlad has answered.. and it seems we need to revisit the stuff about the sample cache system. For example using two duplicated components (with same settings) can be faster than using multiple outbound connections from one, if one or more (but not all) of the incoming coordinates have been changed by offsets or distortions (remember that the coordinate flow is "backward", i.e. going in through the outputs, which then returns a sample based on this). |
|||||||||
Posted: December 20, 2007 6:04 am | ||||||||||
Crapadilla
My question would be: If indeed the coordinate doesn't match, do the calculations for new sample draw upon the sample cache to retrieve their data, or does the original component that the cache was derived from get reevaluated?
Sphinx, I'll have to admit I fail to understand how this could be the case (and what you mean by "backward" for that matter). If you ask me, this would go against the logic of the sample caching system, at least the way I understand it. An example: If you have a high-detail worley noise with six outgoing connections, the noise gets calculated once. All six subsequent operations on this noise would then draw upon one-and-the-same preprocessed sample cache, making things much speedier. However, if there were six duplicate worleys instead, it is logical that FF would need to prepare six noises instead of one. Consequently, I would dare speculate that multiple connections are always faster than duplicates. Vlad, would you shed some light on this? --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 20, 2007 6:26 am | ||||||||||
Vladimir Golovin
Administrator |
Yes -- if the sample coordinates differ from the coordinates of the cached sample, the current cache is discarded, the component is evaluated, and the result of this evaluation is stored in the cache.
Yes, this is correct as long as these six components request samples with the same coordinates.
Exactly.
Not always. Consider the case you described, a Worley Noise with six outbound connections, where each connection is a distorter such as Offset or Noise Distortion with different parameters. This way, each of the six connections will request samples from the Worley at different coordinates, thus invalidating the sample cache. |
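The rule described in these answers can be mimicked with a toy model. This is a minimal sketch under stated assumptions, not Filter Forge's actual implementation: the `Component` class, its `sample` method and the placeholder noise function are all hypothetical. A component keeps one cached sample, and a request at different coordinates discards it and triggers a new evaluation.

```python
class Component:
    """Toy source component with a one-sample cache (hypothetical model)."""
    def __init__(self, fn):
        self.fn = fn
        self.cached_xy = None
        self.cached_sample = None
        self.evaluations = 0

    def sample(self, x, y):
        if self.cached_xy != (x, y):          # coordinates differ: discard and re-evaluate
            self.cached_sample = self.fn(x, y)
            self.cached_xy = (x, y)
            self.evaluations += 1
        return self.cached_sample

def render_pixel(source, offsets, x, y):
    """Request one sample per outbound connection, each shifted by its own offset."""
    return [source.sample(x + dx, y + dy) for dx, dy in offsets]

worley = Component(lambda x, y: (x * 31 + y * 17) % 256)   # stand-in for an expensive noise

render_pixel(worley, [(0, 0)] * 6, 5, 5)
same = worley.evaluations        # six requests at identical coordinates

worley.evaluations = 0
worley.cached_xy = None
render_pixel(worley, [(i, 0) for i in range(6)], 5, 5)
different = worley.evaluations   # six requests at six different coordinates

print(same, different)
```

In this model the six identical requests cost one evaluation, while the six differently offset requests cost six, which is the same work six duplicated noises would do.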
|||||||||
Posted: December 20, 2007 7:16 am | ||||||||||
Sphinx.
This is correct when the coordinates received from the outbound connections are all the same, but if the coordinates are different, a new sample has to be calculated, and as generic behaviour the last calculated sample is always stored in the cache. That's what Vlad confirmed.
Yes, you are right, it seems counterintuitive, but let me explain how I understand it. Let's first get perfectly clear on the difference between sample flow and coordinate flow. Each module can, very simplified, be thought of as a function that produces a sample from a coordinate (x, y). In other words, a module takes a coordinate and gives a sample, in that order! We can then say that the sample flow goes from component to result, but the coordinate flow goes from result to component, because the coordinates are used to calculate the samples.
That FF uses a procedural sampling approach means that your filter is executed for each pixel in the final image. If the image you are rendering is 2x2 pixels (with no AA enabled), the filter executes 2x2 times. For the first pixel, the result component requests a sample from the connected component, let's say a perlin noise, by feeding it the coordinate (0, 0). When the perlin noise has calculated the sample value, it stores the sample in a local variable/cache along with the coordinate that produced it. Remember that this cache holds only one sample at any time: the last calculated. This goes on for the rest of the pixels, (0, 1), (1, 0) and (1, 1).
So in a simple setup like this: [PERLIN]--->[RESULT], it is evident that the sample cache system can't improve anything, since the perlin receives a given coordinate only once per pixel. We need several outbound connections for this system to do any good. If there are six outbound connections from the perlin to a multiblend, which then goes to the result, the actual sample is calculated only once in the perlin, because the following five requests from the multiblend match the coordinate associated with the perlin's cached sample.
If we have an offset in between the result and the multiblend, which offsets by 1 horizontally, then we would have this change in coordinate flow:
[RESULT]--(0, 0)-->[OFFSET X + 1]--(1, 0)-->[MULTIBLEND]-->[PERLIN]
This means that the perlin would cache the sample for the coordinate (1, 0) and not (0, 0). Again we have no problem if there are many connections between the perlin and the multiblend, because all coordinates have been offset by the same value. The problem starts to show if we move the offset in between the multiblend and the perlin for only some of the connections, because then the perlin would receive a mix of the coordinate (0, 0) for the connections without offset and (1, 0) for those going through an offset. How bad the cache flushing gets depends on the actual setup. The example filter I posted earlier in the thread shows how to construct a bad case of constant sample cache flushing.
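The backward coordinate flow and the three setups described above can be played through in a small Python model. Everything here is an assumption made for illustration: the class names, the placeholder noise formula, the use of one separate offset component per offset connection, and an offset of 5 (instead of 1) so that shifted coordinates never coincide with a neighbouring pixel in the 2x2 image.

```python
class Node:
    """Toy component: answers sample(x, y) and keeps only the last sample it produced."""
    def __init__(self):
        self.key = None
        self.value = None
        self.evaluations = 0

    def sample(self, x, y):
        if self.key != (x, y):                 # cache miss: recalculate and overwrite
            self.value = self.compute(x, y)
            self.key = (x, y)
            self.evaluations += 1
        return self.value

class Perlin(Node):
    def compute(self, x, y):
        return (x * 7 + y * 13) % 10           # placeholder, not real Perlin noise

class Offset(Node):
    def __init__(self, source, dx):
        super().__init__()
        self.source, self.dx = source, dx

    def compute(self, x, y):
        return self.source.sample(x + self.dx, y)   # shifted coordinate travels upstream

class Multiblend(Node):
    def __init__(self, inputs):
        super().__init__()
        self.inputs = inputs

    def compute(self, x, y):
        return sum(src.sample(x, y) for src in self.inputs)

def perlin_evaluations(build, size=2):
    """Render size x size pixels from the root that build(perlin) returns."""
    perlin = Perlin()
    root = build(perlin)
    for y in range(size):
        for x in range(size):
            root.sample(x, y)                  # the result requests one sample per pixel
    return perlin.evaluations

plain = perlin_evaluations(lambda p: Multiblend([p] * 6))
shifted = perlin_evaluations(lambda p: Offset(Multiblend([p] * 6), 5))
mixed = perlin_evaluations(
    lambda p: Multiblend([p, Offset(p, 5), p, Offset(p, 5), p, Offset(p, 5)]))

print(plain, shifted, mixed)
```

In this model the plain and uniformly shifted setups both evaluate the perlin 4 times (once per pixel), while the interleaved mix of direct and offset connections evaluates it 24 times, because every request flushes the sample cached by the previous one.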
|||||||||
Posted: December 20, 2007 7:25 am | ||||||||||
Crapadilla
So in that particular case, it would actually be irrelevant whether you had one worley with multiple connections or six duplicates. That's interesting.
Very interesting! So which components apart from Offsets and Noise Distortions actually DO modify sample coordinates? I'm guessing pattern components with the jumble fill mode active, and maybe Kaleidoscopes? --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 20, 2007 7:29 am | ||||||||||
Sphinx.
Actually there is a solution that could free us from worrying about this at all, an improvement to the sample cache system:
Any component that changes coordinates should attach an ID number to its sources, making the sources spawn a new sample cache reserved for incoming requests with that ID.
|||||||||
Posted: December 20, 2007 7:35 am | ||||||||||
Crapadilla
Thanks Sphinx, that lit the light bulb. The whole thing is getting much clearer now...
Now I finally understand what exactly you were testing with that filter: Cache Flushing!
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 20, 2007 7:49 am | ||||||||||
Sphinx.
Yep, and as mentioned also Refraction. I also think the bitmap-based components will change the sources' cached samples, as they buffer a large tile of samples, but it might not happen as frequently. I know too little about the actual blur rasterization to tell, but I think it's split up into small cells of samples for optimization reasons. The elevation gradient will also change the coordinates for its gradient source, as only the top row of the gradient source is used.
|||||||||
Posted: December 20, 2007 7:56 am | ||||||||||
Sphinx.
LOL! Looking back at my posts I can now see that I'm not as clear in explaining things as I'd like to think, hehe. Also my "Danglish" could be a problem (I'm Danish and my English is way too influenced by Danish sentence constructions, sigh).
Speaking of unclarity, let me just give this one another go. As we are discussing now, the cache system seems to work best with unaltered coordinates, i.e. (0, 0) should stay (0, 0) throughout the flow for the best chance of hitting a valid cached sample. What I'm proposing, then, is that components that change the coordinate should also change some sort of CacheID that all their sources receive along with the new coordinate. Using this ID, the components can write the sample to another cache index. This mechanism should effectively ensure that caches don't get flushed because one outbound connection is different from the others.
|||||||||
Posted: December 20, 2007 8:11 am | ||||||||||
Vladimir Golovin
Administrator |
I don't think this is the case.
|||||||||
Posted: December 20, 2007 8:36 am | ||||||||||
Crapadilla
Question: In case of a component having multiple (map) inputs, can we predict the order in which these inputs will be evaluated? In the case of a Multiblend we'd have 14 map inputs, for example. Are these processed from the top input (i.e. "Layer 7") to the bottom input (i.e. "Opacity 1")? From what I could glean from Sphinx' sample cache test filter, I'm guessing this actually is the case.
Knowing this "order of processing" would allow us to predict when sample cache flushing would occur, wouldn't it? --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 20, 2007 9:38 am | ||||||||||
Sphinx.
Yeah, definitely. Actually I'm experimenting with that just now, but it is getting really complicated, especially when you have "interbranch" connections (i.e. connections between two major branches/clusters of components). Predicting the order in which things are executed is not very easy, and to make things even messier, it seems that in some cases it doesn't matter which of two connections is executed first, so the execution order is the order in which you actually connected the "parallel" components. Ouch, my analysis skills start to dissolve here.
Btw, another option for optimizing the coordinate "flow" would be to do a "test run" and then, for components with many outbound connections, sort the execution order so that sample requests belonging to a given coordinate are executed in sequence (it is probably more complicated than the indexed solution, but it should be slightly more memory and performance efficient). Here's the idea: as it is now, a component with several outbound connections could receive the following sequence of coordinates (from different offsets and so on) in a given branched filter construction:
(0, 0) - first sample
(10, 3) - cache flush/new sample
(0, 0) - cache flush/new sample
(10, 3) - cache flush/new sample
(7, 3) - cache flush/new sample
(0, 0) - cache flush/new sample
"Recording" these changes in coordinate flow in a test run would then allow the optimizer to sort the execution order so that we'd have the following order instead:
(0, 0) - first sample
(0, 0) - cache match
(0, 0) - cache match
(10, 3) - cache flush/new sample
(10, 3) - cache match
(7, 3) - cache flush/new sample
effectively reducing the cache flushes to a minimum.
|||||||||
Posted: December 20, 2007 9:55 am | ||||||||||
Crapadilla
Same here. And it shows drastic speed differences (28 sec vs. 49 sec). Wow. If I get this right, it appears that - on the non-duplicated branch - both the Perlin Noise's and the Offset's sample caches get flushed alternately for each of the 28 Multiblend inputs, requiring each cache to be rebuilt 14 times!? On the duplicated branch, however, the two sample caches are created once and then stay valid for the whole evaluation... Very intriguing!
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 20, 2007 10:14 am | ||||||||||
Crapadilla
... or how about a 'console' that logs sample cache building and flushing on a per-component basis? --- Crapadilla says: "Damn you, stupid redundant feature requests!" ;) |
|||||||||
Posted: December 20, 2007 10:22 am | ||||||||||
Vladimir Golovin
Administrator |
Sphinx, the sample-based architecture already comes with inherent computational burden, which becomes especially apparent on simple filters -- in such cases the time spent on the sampling process is greater than the time spent on evaluation of the component's output. That's why we wanted to keep the sampling cache layer as thin and as fast as possible -- the current implementation has only two conditional jumps (one for each coordinate), and, in the majority of cases, only one jump is executed.
|||||||||
Posted: December 20, 2007 10:23 am | ||||||||||
Vladimir Golovin
Administrator |
This would require at least one conditional jump in the innermost cycle, which we definitely wouldn't like -- it will slow the rendering down for everyone, even for people who have the console turned off. |
|||||||||
Posted: December 20, 2007 10:28 am | ||||||||||
Sphinx.
OK! That's pretty significant. Here the numbers are 1:34 for the duplicated and 2:44 for the non-duplicated branch. So there is definitely something to gain by keeping this fact in mind when hooking up components in a complex system.
Yep, I fully understand that. But actually the ID/indexed solution would not change much regarding the execution speed of the evaluation (perhaps nothing at all), since the ID could simply be thought of as a change of the pointer to where the cached coordinate is fetched from when comparing. Let's say the unchanged coordinates have cache index = 0 by default, i.e. the first cache item in the component's internal cache array. This idx remains in use until we reach a coordinate-altering component (e.g. offset). The offset then increases the idx to 1, so that all its connected sources use a new index when storing and comparing cached samples in the offset's line of sample requests. So as you can see, it's not some type of extreme "search through a list of cached samples" solution, but rather a simple one that just switches to a new memory location; the conditionals remain the same.
Oh, btw, ditch that test run proposal. I was too fast there: it's only in rare cases that the offset remains constant for the whole rendering.
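A rough Python sketch of the indexed cache idea follows. It models the proposal as described in this post, not anything Filter Forge actually does; the `Source` class, the `offset_sample` helper, the slot count and the placeholder calculation are all assumptions. The coordinate comparison is the same as with a single cache; only the slot it looks at changes.

```python
class Source:
    """Toy component with several one-sample cache slots, selected by an index."""
    def __init__(self, slots=2):
        self.keys = [None] * slots
        self.values = [None] * slots
        self.evaluations = 0

    def sample(self, x, y, idx=0):
        if self.keys[idx] != (x, y):                     # same comparison, different slot
            self.values[idx] = (x * 7 + y * 13) % 10      # placeholder calculation
            self.keys[idx] = (x, y)
            self.evaluations += 1
        return self.values[idx]

def offset_sample(source, x, y, dx, idx=0, indexed=True):
    """Coordinate-altering request; bumps the cache index when `indexed` is on."""
    return source.sample(x + dx, y, idx + 1 if indexed else idx)

def evaluations(indexed):
    src = Source()
    for _ in range(3):                # three direct and three offset requests, interleaved
        src.sample(0, 0)
        offset_sample(src, 0, 0, dx=1, indexed=indexed)
    return src.evaluations

print(evaluations(indexed=False), evaluations(indexed=True))
```

With a single shared slot the six interleaved requests cause six calculations; with one slot per coordinate branch they cause two.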
|||||||||
Posted: December 20, 2007 10:42 am | ||||||||||
Sphinx.
Inspired by this indexed sample cache proposal, I just made an interesting observation! Try inserting an Invert component (with invert disabled) between the Perlin and the Offset in the non-duplicated branch.
|||||||||
Posted: December 20, 2007 10:56 am | ||||||||||
Crapadilla
Both branches render in 28-29 sec now!
So, should we conclude that each and every component always maintains its own sample cache, regardless of the number of outgoing connections?
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 20, 2007 11:47 am | ||||||||||
Sphinx.
Yeah, every component has its own output sample cache, but only one sample is cached: the last rendered/requested. So if you have connections with different coordinates interleaved (like in my example filter), it causes constant recalculation (and cache flushing). By introducing a new intermediate component, like the Invert, you also add a new cache, very much like the indexed cache idea. I think what is happening is that the Invert draws once from the perlin, and from there on, the Invert's own sample cache matches all the following offset requests, effectively minimizing the flushes. This is more elegant than the duplicated component solution, I think.
But since every component has its own cache, this is also why the "problem" is not more prominent than it is. Still, there are quite a few situations where you could spare a sample calculation here and there by throwing in an Invert (with "invert" unchecked) and then dragging the connections for a certain coordinate branch from that one. This is quite interesting, and I definitely have to look at some of my filters in this regard.
Another related thing: IIRC the height tree draws more samples per output sample than the other surface inputs. Is that right? If so, there could be some very significant optimizations from separating connections to components shared between the height tree input and the other surface inputs. Haven't tested this though.
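The pass-through trick can be tried out in a toy Python model. The wiring below is an assumed layout, not the actual test filter from the thread: one noise feeds a blend directly on three inputs and through three separate offsets on three others, interleaved, and a pass-through component (standing in for an Invert with "invert" unchecked) is optionally placed between the noise and the offsets. Class and function names are hypothetical.

```python
class Component:
    """Toy component: one cached sample, plus a rule for what it asks its source."""
    def __init__(self, source=None, dx=0):
        self.source, self.dx = source, dx
        self.key = self.value = None
        self.evaluations = 0

    def sample(self, x, y):
        if self.key != (x, y):
            self.evaluations += 1
            if self.source is None:                  # generator, e.g. a noise
                self.value = (x * 7 + y * 13) % 10    # placeholder calculation
            else:                                    # offset (dx != 0) or pass-through (dx == 0)
                self.value = self.source.sample(x + self.dx, y)
            self.key = (x, y)
        return self.value

def noise_evaluations(use_passthrough, size=2):
    noise = Component()
    tap = Component(noise) if use_passthrough else noise   # extra cache in front of the noise
    offsets = [Component(tap, dx=5) for _ in range(3)]
    inputs = [noise, offsets[0], noise, offsets[1], noise, offsets[2]]
    for y in range(size):
        for x in range(size):
            for src in inputs:                       # the blend requests each input in turn
                src.sample(x, y)
    return noise.evaluations

print(noise_evaluations(False), noise_evaluations(True))
```

In this model the noise is evaluated 24 times without the pass-through and 12 times with it, because the pass-through's own cache answers the repeated offset requests without disturbing the sample the noise holds for the direct connections.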
|||||||||
Posted: December 20, 2007 12:33 pm | ||||||||||
onyXMaster
Posts: 350 |
Sphinx, you're correct, the Height input in both root component (in surface filter mode) and Refraction component takes three input samples for one output sample.
Offtopic: "...idx remains in use until we reach a coordinate altering component (e.g. offset)" This rings a bell. Vlad, remember that dreaded is_distorter_present flag?
|||||||||
Posted: December 20, 2007 2:19 pm | ||||||||||
Crapadilla
Vlad, could you give us an official confirmation on this list?
* Offset
* Noise Distortion
* Bricks (jumble)
* Pavements (jumble)
* Tiles (jumble)
* Kaleidoscope
* Refraction
* Bitmap-based components (?)
* Elevation Gradient
--- Crapadilla says: "Damn you, stupid redundant feature requests!" ;)
|||||||||
Posted: December 20, 2007 4:47 pm | ||||||||||
Sphinx.
There is one more I think, Worley noises (Solid Fill Mode)
Also I'm really not sure about the elevation gradient - could be that the upper row of the gradient input is rasterized somehow.. |
|||||||||
Posted: December 20, 2007 4:59 pm |