Using clang and openmp built from r359012, I see the following assertion failure:

$ cat test.c
int main() {
  #pragma omp target teams num_teams(3)
  ;
  return 0;
}
$ clang -fopenmp test.c
$ ./a.out
Assertion failure at kmp_runtime.cpp(4270): new_thr->th.th_active == (!0).
OMP: Error #13: Assertion failure at kmp_runtime.cpp(4270).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://bugs.llvm.org/.
Aborted (core dumped)

If I remove num_teams(3), I see the following instead:

$ clang -fopenmp test.c
$ ./a.out
Assertion failure at z_Linux_util.cpp(1469): (__kmp_thread_pool_active_nth) >= 0.
OMP: Error #13: Assertion failure at z_Linux_util.cpp(1469).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://bugs.llvm.org/.
Aborted (core dumped)

These assertions do not fail on every run, so there's a race.

I also see all of this at r357927 but so far never at r357926, so r357927 is the likely culprit.

I built clang and openmp using:

$ clang --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ cat /etc/issue
Ubuntu 18.04.2 LTS \n \l
Issue reproduced, working on a fix.
(In reply to Andrey Churbanov from comment #1)
> Issue reproduced, working on a fix.

As discussed in the review, D61944 fixes my second reproducer (the one without `num_teams(3)`), but it does not fix the first reproducer for me. I tried with the patch applied on top of r360778.

I'm noticing now that I don't always see the same assert fail from the first reproducer. Currently, I more often see:

a.out: ../nptl/pthread_mutex_lock.c:79: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted (core dumped)

But I still frequently see the one I originally reported:

Assertion failure at kmp_runtime.cpp(4294): new_thr->th.th_active == (!0).
OMP: Error #13: Assertion failure at kmp_runtime.cpp(4294).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://bugs.llvm.org/.
Aborted (core dumped)

Again, it's racy. If I run the compiled executable in a shell while loop, I usually see the failure within just a few seconds.

If there's any more information I can provide, please let me know.
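For anyone trying to reproduce this, the shell while loop mentioned above can be sketched roughly as follows. This is a minimal sketch, not the exact command used: `run_until_failure` is a hypothetical helper name, and the run limit is an assumption added only so that the loop terminates if the race never triggers.

```shell
#!/bin/sh
# Hypothetical helper (not part of clang or the OpenMP runtime):
# run a command repeatedly until it exits non-zero or a run limit is hit.
# Usage: run_until_failure MAX_RUNS COMMAND [ARGS...]
run_until_failure() {
  max_runs=$1
  shift
  count=0
  while [ "$count" -lt "$max_runs" ]; do
    count=$((count + 1))
    status=0
    # Discard stdout; stderr is left alone so assertion messages stay visible.
    "$@" >/dev/null || status=$?
    if [ "$status" -ne 0 ]; then
      echo "run $count failed with exit status $status"
      return 1
    fi
  done
  echo "no failure in $max_runs runs"
  return 0
}
```

With the reproducer compiled as above, something like `run_until_failure 100000 ./a.out` would stop at the first run that aborts. The limit of 100000 is arbitrary; since the failure is racy, a clean pass does not prove the bug is gone.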
(In reply to Joel E. Denny from comment #2)
> (In reply to Andrey Churbanov from comment #1)
> > Issue reproduced, working on a fix.
>
> As discussed in the review, D61944 fixes my second reproducer (the one
> without `num_teams(3)`), but it does not fix the first reproducer for me. I
> tried with the patch applied on top of r360778.
>
> I'm noticing now that I don't always see the same assert fail from the first
> reproducer. Currently, I more often see:
>
> a.out: ../nptl/pthread_mutex_lock.c:79: __pthread_mutex_lock: Assertion
> `mutex->__data.__owner == 0' failed.
> Aborted (core dumped)
>
> But I still frequently see the one I originally reported:
>
> Assertion failure at kmp_runtime.cpp(4294): new_thr->th.th_active == (!0).
> OMP: Error #13: Assertion failure at kmp_runtime.cpp(4294).
> OMP: Hint Please submit a bug report with this message, compile and run
> commands used, and machine configuration info including native compiler and
> operating system versions. Faster response will be obtained by including all
> program sources. For information on submitting this issue, please see
> https://bugs.llvm.org/.
> Aborted (core dumped)
>
> Again, it's racy. If I run the compiled executable in a shell while loop, I
> usually see the failure within just a few seconds.
>
> If there's any more information I can provide, please let me know.

Yes, please provide some more info.
Which hardware are you using? Does real offloading happen in your execution, or does the target region run on the host?

To me, the pthreads failure looks like memory corruption, but I am not 100% sure. In any case, I will try to reproduce the failure once more.

Thanks,
Andrey
(In reply to Andrey Churbanov from comment #3)
> Yes, please provide some more info.
> Which HW are you using? Is there real offload happen in your execution, or
> target region runs on host?

Host (x86_64). I'm compiling with only -fopenmp. I just tried with -fopenmp-targets=nvptx64, and so far it doesn't reproduce then.
Joel,

I've just committed a second fix, for the other assertion (which was indeed a different problem). Thanks to Johnny Peyton for the investigation and for providing the fix. Please check whether it works for you when you have time.
(In reply to Andrey Churbanov from comment #5)
> Joel,
>
> I've just committed second fix for another assertion (that was indeed
> different problem). Thanks to Johnny Peyton for the investigation and the
> fix provided. Please check if it works for you, when you have time.

That assert seems to be fixed. Thanks.

However, the same test case (with `num_teams(3)`) targeting host now sometimes fails a nearby assert:

Assertion failure at kmp_runtime.cpp(4300): new_thr->th.th_active == 0.
OMP: Error #13: Assertion failure at kmp_runtime.cpp(4300).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://bugs.llvm.org/.
Aborted (core dumped)

Again, I run the executable in a shell while loop. Sometimes it fails in a few seconds. One time it took nearly 20 minutes.
The last problem is fixed by <https://reviews.llvm.org/D62251>. Thanks.