Compiling and running PCL with CUDA

I have a fairly sophisticated normal extraction algorithm that I am trying to
optimize with CUDA. I am familiar with .cu files, and I have used PCL
rather extensively as well. However, I have found it very difficult to
combine the two, as I consistently get dependency errors when compiling.
I have looked into the pcl_cuda library, but I would prefer a simpler
architecture if possible.

Does anyone have experience building PCL code together with CUDA?

Thanks in advance.

