Sunset Lake Software - Comments for "Optimizing Gaussian blurs on a mobile GPU"

<p>Drop me an email via the address at the bottom of this page, and I'll put you in contact with the people I know.</p>
pubDate: Mon, 03 Feb 2014 00:02:17 +0000 | dc:creator: Brad Larson | comment 2072

<p>We've used GPUImage in a photo app we're developing. We've written our own blur filter that is applied via a mask over the source image. The mask is calculated from a line the user draws on screen: when they lift their finger, the line autocorrects and a mask image is generated. We need to optimize the code for generating the mask. We also need to capture full-resolution images with filters applied to them.</p> <p>Would you know of someone good we could approach? We've posted the job everywhere and cannot find a good dev with these skills. We would very much appreciate it if you could give us a pointer.</p> <p>Cheers,<br /> Noel</p>
pubDate: Sun, 02 Feb 2014 11:37:28 +0000 | dc:creator: Anonymous | comment 2071

<p>I am also wondering if you can cross-compile the GPUImage library with Apportable, as it would be an awful lot more convenient for me to port my app to Android using an Objective-C codebase. I understand that this may not be possible, but I'm just double-checking to be sure.</p> <p>Regards,<br /> Christopher</p>
pubDate: Wed, 01 Jan 2014 16:40:17 +0000 | dc:creator: Christopher Doherty | comment 2049

<p>Do you have a stable build of GPUImage that works with Apportable? We tried to compile GPUImage against Apportable and it fails.
Checking for Apportable support would be much appreciated.</p>
pubDate: Fri, 27 Dec 2013 12:58:07 +0000 | dc:creator: Anonymous | comment 2026

<p>In regards to the 8 dependent texture reads, yes, it's my understanding from talking to the engineers that anything above that triggers a dependent texture read. The only real benefit you get from that point on is the fact that the texture offsets are only calculated once per vertex rather than once per fragment.</p> <p>I don't know that I'd say that the 5S can do 40 non-dependent texture reads, but as you can see from the performance numbers above, and from write-ups like Anand Lal Shimpi's excellent analysis of the 5S and iPad Air GPU, the 5S clearly does something different from all previous GPUs. For texture reads, it behaves in a manner unlike any of them. Dependent-read-heavy shaders like the unoptimized Kuwahara filter I have run nearly as fast as ones that have only a small number of non-dependent texture reads. I wonder if it has to do with the memory structure Anand points out in his latest iPad Air writeup.</p>
pubDate: Wed, 13 Nov 2013 20:07:28 +0000 | dc:creator: Brad Larson | comment 1938

<p>Thanks for your reply. Yes, I was suspecting something like that; I shall do my own testing. For image processing needs it's always useful to have the widest possible blur with the least possible resources. Just one last question: even if the compiler lets you pack them as vec4's and you unpack them in the fragment shader, are you then limited to 8 non-dependent texture reads anyway on some devices, so the benefit of passing the extra ones through from the vertex shader is lost?
How do you know, by the way, that the iPhone 5s can have about 40 non-dependent texture reads, as you say on Stack Overflow? (I can't find anything on this kind of stuff. Shaders, it seems, are a black art; just give me a device and a stopwatch!)</p> <p>Cheers,</p> <p>Gary</p>
pubDate: Wed, 13 Nov 2013 13:07:22 +0000 | dc:creator: Anonymous | comment 1937

<p>If I understand correctly what's going on, the vector types are mapped into vec4's, not packed for the most efficient use of available varyings. I see the same limit if I use vec2 types as vec4, so I believe it's just a matter of two vec2's not being packed into one vec4. I believe this can be done, but I think that's up to the driver / compiler.</p>
pubDate: Wed, 13 Nov 2013 02:44:06 +0000 | dc:creator: Brad Larson | comment 1936

<p>Thanks a lot for a very useful write-up. As you say, iOS has a hard-coded limit of 32 varying components, yet you say this only gives you about 8 blur coordinates. I presumed it would give you 16, as the texture coordinates are vec2's. Am I missing something? Thanks again for GPUImage and your SO answers.</p> <p>Cheers,</p> <p>Gary</p>
pubDate: Tue, 12 Nov 2013 22:31:21 +0000 | dc:creator: Anonymous | comment 1935

<p>Nice write-up!</p> <p>Have you considered a multi-pass blur, not just over the separable rows and columns, but on the whole image to approximate a larger blur radius with a smaller one?</p> <p>Love the reference to Stack Blur -- am sure that'll come in super-handy at some point, and looks very SIMD friendly.</p>
pubDate: Fri, 25 Oct 2013 23:39:14 +0000 | dc:creator: jpap | comment 1911

<p>Thanks for sharing, great insights!<br /> Is it possible to implement a custom, faster Gaussian blur CIFilter for OS X that uses a downsampled texture size?</p>
pubDate: Fri, 25 Oct 2013 14:35:40 +0000 | dc:creator: Raffael | comment 1910
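
<p>On jpap's multi-pass question above: repeating a small Gaussian blur does approximate a larger one, because the variances of the passes add, so n passes of sigma behave like one pass of roughly sigma * sqrt(n). The sketch below checks this numerically on 1D kernels in plain Python. It is not GPUImage code; the helper names (gaussian_kernel, convolve, kernel_variance) and the sigma and radius values are assumptions made up for this illustration.</p>

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights sampled at integer offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve(a, b):
    """Full discrete convolution of two 1D kernels (one blur pass applied to another)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def kernel_variance(kernel):
    """Variance of a normalized, symmetric kernel measured around its midpoint."""
    center = (len(kernel) - 1) / 2.0
    return sum(w * (i - center) ** 2 for i, w in enumerate(kernel))

# Hypothetical example values: a small blur applied twice.
small = gaussian_kernel(sigma=2.0, radius=8)
two_passes = convolve(small, small)

single_sigma = math.sqrt(kernel_variance(small))
effective_sigma = math.sqrt(kernel_variance(two_passes))
print(single_sigma, effective_sigma, effective_sigma / single_sigma)
```

<p>The ratio printed at the end should sit at sqrt(2), so two passes at sigma 2 act like a single pass at a sigma of roughly 2.8. Each extra pass still costs a full set of texture reads and writes, so whether this beats one wider kernel on a given GPU is something to measure rather than assume.</p>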