OpenCL Image Filtering on the Nexus 10

>> Wednesday, March 27, 2013

I've coded a practical OpenCL example for the Nexus 10, and the code is on GitHub. The app works in six stages (a rough sketch of the JNI bridge follows the list):

  1. The user clicks a button to take a picture.
  2. The Activity receives image data from the camera, places it in a ByteBuffer, and passes it to the native function.
  3. The native function sends the buffer data to the GPU, which executes the OpenCL kernel.
  4. The kernel performs 3x3 spatial convolution and places the result in an output buffer.
  5. The native function reads the GPU's result and places the data in a ByteBuffer.
  6. The Activity creates a bitmap from the ByteBuffer containing the filtered image.
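
The trickiest part is moving the pixel data across the JNI boundary (steps 2, 3, 5, and 6). The real Activity and native sources are in the GitHub repo; the fragment below is only a rough sketch of the idea, with a made-up package/class name and a hypothetical run_filter_kernel helper. It assumes the Activity allocates its ByteBuffers with allocateDirect, so the native side can grab their backing storage with GetDirectBufferAddress:

#include <jni.h>

/* Hypothetical helper that runs the OpenCL kernel (sketched after the kernel below) */
extern void run_filter_kernel(int *input_img, int *out_img, int width, int height);

/* Hypothetical JNI entry point -- the Activity passes two direct ByteBuffers
   (input and output) along with the image dimensions */
JNIEXPORT void JNICALL
Java_com_example_clfilter_FilterActivity_filterImage(JNIEnv *env, jobject thiz,
      jobject in_buf, jobject out_buf, jint width, jint height) {

   /* Direct ByteBuffers expose their backing storage to native code */
   int *input_img = (int*)(*env)->GetDirectBufferAddress(env, in_buf);
   int *out_img = (int*)(*env)->GetDirectBufferAddress(env, out_buf);
   if(input_img == NULL || out_img == NULL)
      return;   /* buffers weren't allocated with allocateDirect */

   /* Send the pixels through the OpenCL filter */
   run_filter_kernel(input_img, out_img, width, height);
}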

I tried using OpenCL's image-processing routines, but the Mali GPU doesn't support the CL_ARGB channel order. I tried using two-dimensional work-items, but that gave me an odd error. So here's the kernel (a rough host-side dispatch sketch follows it):
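
/* Apply a 3x3 convolution to an ARGB image; each work-item handles the
   pixel at index get_global_id(0) = row*width + col, and pixels on or
   near the border are copied through unfiltered. */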
__kernel void camera_filter(__global int *input_img,
                            __global int *out_img,
                            int width, int height) {

   int offset, red, green, blue;
   int row = get_global_id(0)/width;
   int col = get_global_id(0) - row * width;
   int4 pixels[3], color_vec;

   /* Set filter coefficients */
   int4 k0 = (int4)(-1, -1,  0,  0);
   int4 k1 = (int4)(-1,  0,  1,  0);
   int4 k2 = (int4)( 0,  1,  1,  0);

   /* Set filter denominator */
   int denom = 1;

   if((row > 0) && (col > 0) && (col < width-2) && (row < height-1)) {

      /* Read input image data into private memory */
      offset = (row-1) * width + (col-1);
      pixels[0] = vload4(0, input_img + offset);
      pixels[1] = vload4(0, input_img + offset + width);
      pixels[2] = vload4(0, input_img + offset + 2*width);

      /* Compute red component */
      color_vec = ((pixels[0] & 0x00ff0000) >> 16) * k0 +
                  ((pixels[1] & 0x00ff0000) >> 16) * k1 +
                  ((pixels[2] & 0x00ff0000) >> 16) * k2;
      red = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

      /* Compute green component */
      color_vec = ((pixels[0] & 0x0000ff00) >> 8) * k0 +
                  ((pixels[1] & 0x0000ff00) >> 8) * k1 +
                  ((pixels[2] & 0x0000ff00) >> 8) * k2;
      green = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

      /* Compute blue component */
      color_vec = (pixels[0] & 0x000000ff) * k0 +
                  (pixels[1] & 0x000000ff) * k1 +
                  (pixels[2] & 0x000000ff) * k2;
      blue = clamp((color_vec.s0 + color_vec.s1 + color_vec.s2)/denom, 0, 255);

      /* Update output pixel in global memory */
      out_img[get_global_id(0)] = 0xff000000 + (red << 16) + (green << 8) + blue;
   }
   else {
      out_img[get_global_id(0)] = input_img[get_global_id(0)];
   }
}
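
The kernel runs with one work-item per pixel, so the host launches it with a one-dimensional global size of width*height (two-dimensional work-items being the part that failed on this driver). Leaving out platform, device, and program setup plus all error checking, the dispatch looks roughly like the sketch below; the ctx, queue, and kernel globals and the run_filter_kernel name are assumptions for illustration, not the repo's actual code.

#include <CL/cl.h>
#include <stddef.h>

/* Assumed to be created during one-time setup (platform/device/context/
   program creation omitted here) */
extern cl_context ctx;
extern cl_command_queue queue;
extern cl_kernel kernel;     /* built from the camera_filter source above */

void run_filter_kernel(int *input_img, int *out_img, int width, int height) {

   size_t num_pixels = (size_t)width * height;
   size_t data_size = num_pixels * sizeof(cl_int);
   cl_int w = width, h = height;

   /* Copy the ARGB pixels to the GPU and allocate an output buffer */
   cl_mem in_buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                  data_size, input_img, NULL);
   cl_mem dst_buf = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, data_size, NULL, NULL);

   /* Arguments match camera_filter's parameter list */
   clSetKernelArg(kernel, 0, sizeof(cl_mem), &in_buf);
   clSetKernelArg(kernel, 1, sizeof(cl_mem), &dst_buf);
   clSetKernelArg(kernel, 2, sizeof(cl_int), &w);
   clSetKernelArg(kernel, 3, sizeof(cl_int), &h);

   /* One-dimensional launch: one work-item per pixel */
   clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &num_pixels, NULL, 0, NULL, NULL);

   /* Blocking read places the filtered image in the output ByteBuffer's storage */
   clEnqueueReadBuffer(queue, dst_buf, CL_TRUE, 0, data_size, out_img, 0, NULL, NULL);

   clReleaseMemObject(in_buf);
   clReleaseMemObject(dst_buf);
}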

The filter's operation is determined by the coefficients in the three int4 vectors. In this case, the filter gives a 3D shadow to the image. The coefficients are defined in the kernel at the moment, but ideally, they'd be set by the user at runtime. I'll see what I can do.
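
One possible approach (just a sketch, not what the repo does now) is to turn the coefficient vectors into kernel arguments, so the host can set them with clSetKernelArg whenever the user picks a new filter. The extended camera_filter signature in the comment and the set_filter_coefficients helper below are hypothetical:

#include <CL/cl.h>

/* Sketch: suppose camera_filter were extended to take its coefficients as
   arguments, e.g.
      __kernel void camera_filter(__global int *input_img, __global int *out_img,
                                  int width, int height,
                                  int4 k0, int4 k1, int4 k2, int denom)
   Then the host could set them at runtime from user-chosen values. */
void set_filter_coefficients(cl_kernel kernel) {

   /* The coefficients from the post (the "3D shadow" filter); any 3x3
      filter could be substituted */
   cl_int4 k0 = {{-1, -1,  0, 0}};
   cl_int4 k1 = {{-1,  0,  1, 0}};
   cl_int4 k2 = {{ 0,  1,  1, 0}};
   cl_int denom = 1;

   /* Arguments 0-3 (the buffers and dimensions) are set elsewhere */
   clSetKernelArg(kernel, 4, sizeof(cl_int4), &k0);
   clSetKernelArg(kernel, 5, sizeof(cl_int4), &k1);
   clSetKernelArg(kernel, 6, sizeof(cl_int4), &k2);
   clSetKernelArg(kernel, 7, sizeof(cl_int), &denom);
}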

On a scale from 1 to 10, I'd give this project's difficulty a solid 10. The problem isn't the OpenCL itself but interfacing Java and OpenCL through JNI. It would be nice if Android had something like Aparapi, which converts Java bytecode to OpenCL at runtime so GPU code can be written without any JNI glue.

7 comments:

Tobias Kramer March 28, 2013 at 4:18 AM  

Thanks for sharing your code and providing examples! I managed to compile and run a stripped-down computational physics OpenCL code on a Nexus 4 device using basically the "command line" and avoiding any Java code. I posted my experience and recipe here:

http://quantumdynamics.wordpress.com/2013/03/23/computational-physics-on-smartphone-gpu/

For me, this route to OpenCL on Android devices is a bit simpler, but of course it lacks a proper app interface. For benchmarking and comparisons with other GPUs, it's good enough for me.

Stefan M March 30, 2013 at 8:21 AM  

Regarding Aparapi, check out:
http://mahadevangorti.blogspot.de/

Matt Scarpino March 30, 2013 at 10:59 AM  

Thank you. That's very interesting.

Anonymous April 8, 2013 at 9:47 AM  

I am a complete noob, so this may not make sense at all, but why bother with Java at all instead of just going down the C++ and NDK route?

Matt Scarpino April 8, 2013 at 10:16 AM  

It's possible to code apps without Java, but to the best of my knowledge, none of the native functions access the camera.

Also, I'm not sure if a fully native app can be deployed to Google Play.

lmpe May 5, 2013 at 9:41 AM  

Thanks for the post! I'm looking into using the Nexus 10 to do some simple benchmarks (BLAS mainly) of the Mali-T604 GPU for a specialization project at school. Do you know anything about the quality of the bundled driver? For instance, any lack of support for functions it should support, weird quirks or anything similar?

Matt Scarpino May 9, 2013 at 7:42 PM  

The only quirks I've encountered involve the inability to execute kernels with multi-dimensional work-items or kernels that use OpenCL's image processing capabilities. Other than that, everything has worked very well.
