I've coded a practical OpenCL example for the Nexus 10 and the code is on GitHub. The app works in six stages:
- The user clicks a button to take a picture.
- The Activity receives image data from the camera, places it in a ByteBuffer, and passes it to the native function.
- The native function sends the buffer data to the GPU, which executes the OpenCL kernel.
- The kernel performs 3x3 spatial convolution and places the result in an output buffer.
- The native function reads the GPU's result and places the data in a ByteBuffer.
- The Activity creates a bitmap from the ByteBuffer containing the filtered image.
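I tried using OpenCL's image processing routines, but the Mali GPU doesn't support the CL_ARGB format. I tried using two-dimensional work-items, but that gave me an odd error. So the kernel works on plain integer buffers with a one-dimensional index space and computes each pixel's row and column from its global ID. Here it is: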
__kernel void camera_filter(__global int *input_img,
                            __global int *out_img,
                            int width, int height) {

    int offset, red, green, blue;

    /* Derive the pixel's row and column from the 1-D global ID */
    int row = get_global_id(0)/width;
    int col = get_global_id(0) - row * width;
    int4 pixels[3], color_vec;

    /* Set filter coefficients (fourth component is unused) */
    int4 k0 = (int4)(-1, -1, 0, 0);
    int4 k1 = (int4)(-1,  0, 1, 0);
    int4 k2 = (int4)( 0,  1, 1, 0);

    /* Set filter denominator */
    int denom = 1;

    /* Skip border pixels. The column limit is width-2 because vload4
       reads four ints per row, one more than the 3x3 window needs. */
    if((row > 0) && (col > 0) && (col < width-2) && (row < height-1)) {

        /* Read input image data into private memory */
        offset = (row-1) * width + (col-1);
        pixels[0] = vload4(0, input_img + offset);
        pixels[1] = vload4(0, input_img + offset + width);
        pixels[2] = vload4(0, input_img + offset + 2*width);

        /* Compute red component */
        color_vec = ((pixels[0] & 0x00ff0000) >> 16) * k0 +
                    ((pixels[1] & 0x00ff0000) >> 16) * k1 +
                    ((pixels[2] & 0x00ff0000) >> 16) * k2;
        red = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

        /* Compute green component */
        color_vec = ((pixels[0] & 0x0000ff00) >> 8) * k0 +
                    ((pixels[1] & 0x0000ff00) >> 8) * k1 +
                    ((pixels[2] & 0x0000ff00) >> 8) * k2;
        green = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

        /* Compute blue component */
        color_vec = (pixels[0] & 0x000000ff) * k0 +
                    (pixels[1] & 0x000000ff) * k1 +
                    (pixels[2] & 0x000000ff) * k2;
        blue = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

        /* Update output pixel in global memory */
        out_img[get_global_id(0)] = 0xff000000 + (red << 16) + (green << 8) + blue;
    }
    else {
        /* Border pixels are copied through unchanged */
        out_img[get_global_id(0)] = input_img[get_global_id(0)];
    }
}
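The index and channel arithmetic in the kernel can be tried out on the CPU. Here's a minimal C sketch that mirrors it, assuming 32-bit ARGB pixels as in the kernel. The function names (index_to_row_col, red_of, green_of, blue_of, pack_argb) are hypothetical helpers for illustration and aren't part of the app:

```c
#include <assert.h>
#include <stdint.h>

/* Map a flat pixel index to (row, col), as the kernel does with get_global_id(0). */
void index_to_row_col(int index, int width, int *row, int *col) {
    assert(width > 0);
    *row = index / width;
    *col = index - (*row) * width;
}

/* Extract one 8-bit channel from a 32-bit ARGB pixel. */
int red_of(uint32_t pixel)   { return (int)((pixel & 0x00ff0000u) >> 16); }
int green_of(uint32_t pixel) { return (int)((pixel & 0x0000ff00u) >> 8); }
int blue_of(uint32_t pixel)  { return (int)(pixel & 0x000000ffu); }

/* Rebuild an opaque ARGB pixel from channels already clamped to 0..255. */
uint32_t pack_argb(int red, int green, int blue) {
    return 0xff000000u | ((uint32_t)red << 16) | ((uint32_t)green << 8) | (uint32_t)blue;
}
```

For a 5-pixel-wide image, index 13 maps to row 2, column 3, and pack_argb(0x12, 0x34, 0x56) gives 0xff123456.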
The filter's operation is determined by the coefficients in the three int4 vectors. In this case, the filter gives a 3D shadow to the image. The coefficients are defined in the kernel at the moment, but ideally, they'd be set by the user at runtime. I'll see what I can do.
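To show what runtime coefficients might look like, here's a rough CPU reference in C that takes the 3x3 coefficients and the denominator as parameters. It's only a sketch: filter_reference and clamp_int are hypothetical names, and it assumes the same ARGB layout and the same border rule as the kernel so the two outputs can be compared pixel for pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: limit v to [lo, hi], like OpenCL's clamp(). */
int clamp_int(int v, int lo, int hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* CPU reference for the 3x3 filter; k and denom are supplied by the caller. */
void filter_reference(const uint32_t *in, uint32_t *out,
                      int width, int height,
                      const int k[3][3], int denom) {
    assert(denom != 0);
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            int idx = row * width + col;
            /* Same border test as the kernel, including col < width-2 */
            if (row > 0 && col > 0 && col < width - 2 && row < height - 1) {
                int sum[3] = {0, 0, 0};  /* red, green, blue */
                for (int i = 0; i < 3; i++) {
                    for (int j = 0; j < 3; j++) {
                        uint32_t p = in[(row - 1 + i) * width + (col - 1 + j)];
                        sum[0] += (int)((p >> 16) & 0xffu) * k[i][j];
                        sum[1] += (int)((p >> 8) & 0xffu) * k[i][j];
                        sum[2] += (int)(p & 0xffu) * k[i][j];
                    }
                }
                uint32_t r = (uint32_t)clamp_int(sum[0] / denom, 0, 255);
                uint32_t g = (uint32_t)clamp_int(sum[1] / denom, 0, 255);
                uint32_t b = (uint32_t)clamp_int(sum[2] / denom, 0, 255);
                out[idx] = 0xff000000u | (r << 16) | (g << 8) | b;
            } else {
                /* Border pixels are copied through unchanged */
                out[idx] = in[idx];
            }
        }
    }
}
```

Passing {{-1,-1,0},{-1,0,1},{0,1,1}} with a denominator of 1 reproduces the kernel's coefficients. Since those coefficients sum to zero, a flat region of the image comes out black, and only edges show up.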
On a scale from 1 to 10, I'd give this project's difficulty a solid 10. The problem isn't the OpenCL itself, but interfacing Java and OpenCL through the JNI. It would be nice if Android had something like Aparapi, which translates Java code into OpenCL so that kernels can be written and launched directly from Java.