>> Saturday, March 12, 2011
I finally finished implementing the Conjugate Gradient (CG) algorithm in OpenCL, and judging from a casual web search, I think this is the first time it's been done. The theory isn't simple by any means, and despite Jonathan Shewchuk's excellent paper on the subject (available here), there are still a few places where I'm not satisfied.
Thankfully, Matlab provides an m-file that shows how the CG algorithm is implemented, so I was able to check my work. Last night, I tested my code with the BCSSTK05 sparse matrix from NIST's Harwell-Boeing collection, and it converged with a final residual of 0.067. Happy day.
I was also planning to implement the biconjugate gradient stabilized algorithm, also known as BiCGSTAB, in OpenCL, but when I tried the Matlab bicgstab routine with NIST's LNS_131 matrix, the algorithm didn't converge. Even after 20,000 iterations, the residual stayed above ten thousand.
This amazes me. Finite element analysis (FEA) has been around since the 1960s, so I figured all the mathematical theory had been riddled out years ago. But from what I've seen, coding sparse matrix solvers is still more of an art than a science.
GPUs excel at highly-parallel algorithms, so it might be better to have them solve sparse matrix systems using direct methods instead of iterative methods. Need to give this more thought.