Sunset Lake Software - Comments for "glGenerateMipmap and raw pixel buffer" http://www.sunsetlakesoftware.com/forum/glgeneratemipmap-and-raw-pixel-buffer <p>The slow glGenerateMipmap was verified by others on the Apple Developer Forums. Since I only need grayscale pyramids, I've implemented a pyramid filter class using GPUImageFilterGroup to perform progressive Gaussian blurring, using each channel of an RGBA texture as a pyramid level -- this is along the lines of a SIFT implementation you had mentioned in earlier correspondence.</p> <p>This is something like the following MATLAB code:</p> <p><div class="geshifilter"><pre class="geshifilter-cocoa">texture(1,:,:) = I;
texture(2,:,:) = conv2(texture(1,:,:), G, 'same');        % I*G
texture(2,1:end/2,1:end/2) = texture(2,1:2:end,1:2:end);  % decimate 1-&gt;2
texture(3,:,:) = conv2(texture(2,:,:), G, 'same');        % I*G*G
texture(3,1:end/2,1:end/2) = texture(3,1:2:end,1:2:end);  % decimate 2-&gt;3
texture(4,:,:) = conv2(texture(3,:,:), G, 'same');        % I*G*G*G
texture(4,1:end/2,1:end/2) = texture(4,1:2:end,1:2:end);  % decimate 3-&gt;4</pre></div></p> <p>where I is the input grayscale image, G is a Gaussian kernel (I use a separable version in the actual implementation), and</p> <p>texture(1,:,:) = image.r;<br /> texture(2,:,:) = image.g;<br /> texture(3,:,:) = image.b;<br /> texture(4,:,:) = image.a;</p> <p>The catch is that I can't seem to find a way to perform the sub-sampling operation on each pyramid level without destroying the previous pyramid levels (color channels) in some way. As an example, the following vertex and fragment shader pair will nicely decimate the green channel, but it doesn't preserve the values in the other color channels.
</p> <p><div class="geshifilter"><pre class="geshifilter-cocoa">NSString *const kGPUImageDecimationVertexShaderString = SHADER_STRING
(
 attribute vec4 position;
 attribute vec4 inputTextureCoordinate;
 varying vec2 textureCoordinate;

 void main()
 {
     gl_Position = vec4((position.xy * 0.5), 0.0, 1.0);
     textureCoordinate = inputTextureCoordinate.xy;
 }
);

NSString *const kGPUImageShaderStringGreen = SHADER_STRING
(
 precision mediump float;
 varying vec2 textureCoordinate;
 uniform sampler2D inputImageTexture;

 void main()
 {
     gl_FragColor.g = texture2D(inputImageTexture, textureCoordinate).g;
 }
);</pre></div></p> <p>I can fix this somewhat by manually retrieving the pixel at the decimated position before overwriting the green value, then assigning the original colors along with the modified green value, as demonstrated in the modified fragment shader below:</p> <p><div class="geshifilter"><pre class="geshifilter-cocoa">reducedPixel = texture2D(inputImageTexture, reducedCoordinate);
originalPixel = texture2D(inputImageTexture, originalCoordinate);
gl_FragColor = vec4(reducedPixel.r, originalPixel.g, reducedPixel.b, reducedPixel.a);</pre></div></p> <p>but this still destroys the original higher-resolution color channels (in this case red). This could be solved by the ability to preinitialize the texture, or by the ability to write two outputs from the fragment shader, which isn't supported in OpenGL ES 2.0. So I'm hoping there is a way to achieve the former, such that a no-op fragment shader would still preserve the original texture. That way I could selectively modify a single channel and leave the other color channels intact.</p> <p>This could also be achieved by the ability to run multiple fragment shaders -- one to fill in the undecimated pixels and a second for the decimated pixels -- but it isn't clear to me whether this is possible either.</p> pubDate Fri, 27 Jul 2012 14:22:13 +0000 dc:creator headupinclouds guid false comment 1526 at http://www.sunsetlakesoftware.com
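<p>To pin down the semantics the two-read fragment shader above is aiming for, here is a minimal CPU-side sketch in plain Python (the function name and the list-of-lists image representation are illustrative, not part of GPUImage): the green channel of the top-left quadrant is filled from 2x-subsampled positions, while every other channel, and every pixel outside that quadrant, keeps its original value.</p>

```python
def decimate_green(image):
    """image: square 2D list of [r, g, b, a] pixels, one pyramid
    level per channel as in the RGBA packing described above.
    Returns a copy whose green channel, in the top-left quadrant,
    holds the 2x-decimated green plane; all other values are kept."""
    n = len(image)
    # Start from a full copy of the original -- this copy step is the
    # "preinitialize the texture" capability the comment asks about.
    out = [[pixel[:] for pixel in row] for row in image]
    for y in range(n // 2):
        for x in range(n // 2):
            # Green comes from the 2x-subsampled source position.
            out[y][x][1] = image[2 * y][2 * x][1]
    return out
```

<p>On the GPU, the copy is exactly the hard part in a single ES 2.0 pass: without a preinitialized render target, the fragments outside the quadrant (and the non-green channels inside it) have to be regenerated by the shader rather than simply left alone.</p>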
<p>Terrific. Adding forceProcessingAtSize: to the GPUImageGrayscaleFilter with nearest-power-of-two dimensions seems to satisfy the call to glGenerateMipmap. This appears to resize to the new dimensions. In my case I'll have to replace this with a border/padding operation to preserve the original image aspect ratio, but this gets me going. I need to debug the rest and will hopefully post the working code soon. The resize modification is shown below. I'll keep a lookout in case you post a pyramid filter; I use this structure pretty extensively in OpenCV.</p> <p><div class="geshifilter"><pre class="geshifilter-cocoa">#if DO_MIPMAP_IMAGE_PYRAMID
// Calculate next power of two for the glGenerateMipmap() call in setNewFrameAvailableBlock
int width2 = pow(2, ceil(log2(m_ImageWidth)));  /* 720 -&gt; 1024, 1280 -&gt; 2048 */
int height2 = pow(2, ceil(log2(m_ImageHeight)));
rawDataOutput = [[GPUImageRawDataOutput alloc] initWithImageSize:CGSizeMake(width2, height2) resultsInBGRAFormat:YES];
[gf forceProcessingAtSize:CGSizeMake(width2, height2)]; // Use filter to force dimensions to nearest power of 2
#else</pre></div></p> <p>UPDATE: The call to glGenerateMipmap(GL_TEXTURE_2D) takes about 0.56 seconds (ouch!) on an iPhone 4S running iOS 5.0 with a 1280 -&gt; 2048 texture. A few posts online suggest that this function is typically hardware accelerated, but I haven't found anything specific to the iPhone, and the benchmark certainly suggests otherwise. Unless there is a missing magic configuration parameter, the next step may be to try implementing the pyramid by iteratively low-pass filtering (as in GPUImageGaussianBlurFilter) and explicitly resizing the texture for each pyramid level.
I'm curious if you've seen similar results in your tests.</p> pubDate Wed, 25 Jul 2012 23:10:04 +0000 dc:creator headupinclouds guid false comment 1525 at http://www.sunsetlakesoftware.com <p>Are you sure that the raw data output texture is the active one that you're getting in glGetIntegerv(GL_TEXTURE_BINDING_2D, &amp;textureName)? That makes some assumptions about whatever is currently bound that might not be correct. In fact, there is no output texture for the raw data output on iOS 4.x, and the output texture used on iOS 5.x is never bound, so I don't think it will be picked up by that call. I think the texture you're getting there is the one being fed into the raw data output, which is most likely not a power of two in size.</p> <p>Rather than use a raw data output, you might be able to use a standard filter with -forceProcessingAtSize: and then grab its output texture using -textureForOutput. With processing forced to a power-of-two size, you should be able to activate mipmap generation for that texture either manually or automatically on rendering. I did something like this recently when exploring Gaussian pyramids myself. I may even create a specific filter subclass just to do this.</p> pubDate Wed, 25 Jul 2012 16:58:52 +0000 dc:creator Brad Larson guid false comment 1524 at http://www.sunsetlakesoftware.com
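<p>Both comments above round the texture dimensions up to the nearest power of two before generating mipmaps. The pow/ceil/log2 version in the Objective-C snippet works; as a side note, an integer-only formulation avoids any floating-point rounding concerns. A small sketch in Python (function name is mine, just for illustration):</p>

```python
def next_pow2(n):
    """Smallest power of two greater than or equal to n (n >= 1);
    integer-only equivalent of pow(2, ceil(log2(n))) used above."""
    return 1 << (n - 1).bit_length()
```

<p>This maps 720 to 1024 and 1280 to 2048, matching the comment in the snippet, and an exact power of two maps to itself (1024 stays 1024), which the float version can get wrong at large sizes due to log2 rounding.</p>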