
Bug 28723 - Can't have OpenMP enabled while compiling CUDA (even without any OpenMP constructs)
Summary: Can't have OpenMP enabled while compiling CUDA (even without any OpenMP const...
Status: RESOLVED FIXED
Alias: None
Product: clang
Classification: Unclassified
Component: CUDA
Version: trunk
Hardware: PC Linux
Importance: P normal
Assignee: Unassigned Clang Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-26 15:13 PDT by David Poliakoff
Modified: 2016-07-29 14:43 PDT
CC: 5 users

See Also:
Fixed By Commit(s):


Attachments
A reduced version of the kind of parallel programming system we're writing (1.26 KB, text/rtf)
2016-07-26 16:31 PDT, David Poliakoff
Details

Description David Poliakoff 2016-07-26 15:13:10 PDT
Please excuse me if this is in the wrong tracker; it might belong in the CUDA or OpenMP section (in fact, I'll file an issue there as well).

I have a file without any OpenMP in it, and if I try

clang++ [cuda args] -x cuda -std=c++11 --cuda-gpu-arch=sm_30 -stdlib=libc++ b.cpp

it is successful. If on the other hand I try

clang++ [cuda args] -x cuda -std=c++11 --cuda-gpu-arch=sm_30 -stdlib=libc++ -fopenmp b.cpp

I get

error: The target 'nvptx64-nvidia-cuda' is not a supported OpenMP host target.

It appears that Clang only sees that I am using a CUDA backend together with the OpenMP flag, without checking whether any OpenMP code actually needs to go to the PTX backend. Much like the Sandia folks, we're writing portable parallel programming models and would really love to put Clang through its paces, but if we can't write code where CUDA sections follow OpenMP ones, we won't be able to.
Comment 1 Hal Finkel 2016-07-26 15:37:06 PDT
> Please excuse me if this is in the wrong tracker; it might belong in the CUDA or OpenMP section (in fact, I'll file an issue there as well).

This is the right one.
Comment 2 Samuel Antao 2016-07-26 16:22:02 PDT
What is going on is that -fopenmp is passed to the frontend command that generates code for CUDA. Work is under way to get a generic offloading implementation into clang; that will let the driver understand the offloading programming model and pass -fopenmp only to the OpenMP offloading code generation.

Given that OpenMP offloading support in the driver is not yet complete, a fix would be to not pass -fopenmp to any offloading toolchain. I'll post a patch fixing that.

Thanks for posting the bug.
Comment 3 David Poliakoff 2016-07-26 16:26:43 PDT
(In reply to comment #2)
> What is going on is that -fopenmp is passed to the frontend command that
> generates code for CUDA. Work is under way to get a generic offloading
> implementation into clang; that will let the driver understand the
> offloading programming model and pass -fopenmp only to the OpenMP
> offloading code generation.
> 
> Given that OpenMP offloading support in the driver is not yet complete, a
> fix would be to not pass -fopenmp to any offloading toolchain. I'll post a
> patch fixing that.
> 
> Thanks for posting the bug.

Just to clarify, we *do* have code that looks like
________________________________________
Initialize();

run_parallel(openmp, [=](int i){
   // OpenMP parallel stuff
});

run_parallel(cuda, [=] __device__ (int i){
   // CUDA parallel stuff
});
______________________________________

Will this intermediate solution of selective passing of -fopenmp work on a system like this? Thanks for responding so quickly!
Comment 4 David Poliakoff 2016-07-26 16:31:21 PDT
Created attachment 16812 [details]
A reduced version of the kind of parallel programming system we're writing
Comment 5 David Poliakoff 2016-07-26 16:33:33 PDT
Sorry, I'm used to systems where I can edit comments rather than spamming new ones, so this is my last message for a while: the full version of the parallel system is at https://github.com/LLNL/RAJA if you want to check it out, but it doesn't currently build with this Clang (due to the "device attribute placement" bug Christian pointed out).
Comment 6 Samuel Antao 2016-07-26 16:37:43 PDT
(In reply to comment #3)
> (In reply to comment #2)
> > What is going on is that -fopenmp is passed to the frontend command that
> > generates code for CUDA. Work is under way to get a generic offloading
> > implementation into clang; that will let the driver understand the
> > offloading programming model and pass -fopenmp only to the OpenMP
> > offloading code generation.
> > 
> > Given that OpenMP offloading support in the driver is not yet complete, a
> > fix would be to not pass -fopenmp to any offloading toolchain. I'll post a
> > patch fixing that.
> > 
> > Thanks for posting the bug.
> 
> Just to clarify, we *do* have code that looks like
> ________________________________________
> Initialize();
> 
> run_parallel(openmp, [=](int i){
>    // OpenMP parallel stuff
> });
> 
> run_parallel(cuda, [=] __device__ (int i){
>    // CUDA parallel stuff
> });
> ______________________________________
> 
> Will this intermediate solution of selective passing of -fopenmp work on a
> system like this? Thanks for responding so quickly!

Based on your description I think the fix would work just fine. All it does is prevent the CUDA code generation from choking on OpenMP options and directives.
Comment 7 David Poliakoff 2016-07-26 17:04:56 PDT
Fantastic, this kind of support is really appreciated. If you ping me when it's in (or comment on this bug), I'll try to run it through RAJA and either give you a reason why it's still not working or confirm that it works.
Comment 8 Samuel Antao 2016-07-28 09:30:14 PDT
Fixed in r276979.
Comment 9 David Poliakoff 2016-07-28 10:47:39 PDT
You folks are wonderful; that was an impressively fast turnaround to a candidate solution. I'll take it out for a spin and get you feedback.

Much appreciated!
Comment 10 David Poliakoff 2016-07-28 11:41:30 PDT
Verifying that this fixes this particular bug. We're going to scale up our use of LLVM by testing it on our mini-apps, and I'll report any future bugs as they arise.

Seriously impressive work, thanks again!
Comment 11 David Poliakoff 2016-07-29 14:43:31 PDT
Closing comment:

This particular bug is fixed. We're running into new ones, which I'll report once I'm sure they're not on my end and I have a good explanation of what's going on.

Thanks all for the help