LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 20144 - strcmp() from glibc 2.20 triggers -Warray-bounds when __OPTIMIZE__ is defined
Summary: strcmp() from glibc 2.20 triggers -Warray-bounds when __OPTIMIZE__ is defined
Status: NEW
Alias: None
Product: clang
Classification: Unclassified
Component: Frontend (show other bugs)
Version: trunk
Hardware: PC All
: P normal
Assignee: Unassigned Clang Bugs
URL:
Keywords:
: 15979 21614 24025 28583 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-06-27 03:12 PDT by Pelle Johansson
Modified: 2016-07-17 12:48 PDT (History)
11 users (show)

See Also:
Fixed By Commit(s):


Attachments
Pre-processed source. (36.23 KB, text/plain)
2014-06-27 08:16 PDT, Eddy Jansson
Details
minimal repro (146 bytes, text/x-csrc)
2015-08-07 17:01 PDT, Martin Thomson
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Pelle Johansson 2014-06-27 03:12:16 PDT
clang -O2 -E generates bad code for this code:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
	if (argc > 3) {
		if (strcmp(argv[1], "-4") == 0) {
			printf("wat\n");
		}
	}
	return 0;
}

It seems to believe sizeof("-4") is 4 rather than 3.

Might be a duplicate of 11536, not sure.


Steps to reproduce:

clang -O2 -E bug.c | clang -Werror -x c -


Expected result:

bug.o generated


Actual result:

dummy.c:6:2109: error: array index 3 is past the end of the array (which contains 3 elements) [-Werror,-Warray-bounds]
  ...> 2 && __result == 0) __result = (__s1[3] - ((__const unsigned char *) (__const char *) ("-4"))[3]); } } __result; }))) : __...
                                                                                              ^      ~
1 error generated.


Version info:
clang version 3.5.0 (http://llvm.org/git/clang.git db5f8b25707a23dd5852a35539775f7e8cf8896e) (http://llvm.org/git/llvm.git 0263debd04a3a827ac0e11968082d1c6fe0a9902)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Found candidate GCC installation: /opt/rh/devtoolset-2/root/usr/lib/gcc/x86_64-redhat-linux/4.8.2
Selected GCC installation: /opt/rh/devtoolset-2/root/usr/lib/gcc/x86_64-redhat-linux/4.8.2
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64

on CentOS.
Also tested with 3.4 release.
Comment 1 Dimitry Andric 2014-06-27 07:38:16 PDT
Can you please post the preprocessed code?  I think this is caused by the following macro in glibc's /usr/include/bits/string2.h header:

# define __strcmp_gc(s1, s2, l2) \
  (__extension__ ({ __const unsigned char *__s1 =                             \
                      (__const unsigned char *) (__const char *) (s1);        \
                    register int __result =                                   \
                      __s1[0] - ((__const unsigned char *)                    \
                                 (__const char *) (s2))[0];                   \
                    if (l2 > 0 && __result == 0)                              \
                      {                                                       \
                       	__result = (__s1[1]                                   \
                                    - ((__const unsigned char *)              \
                                       (__const char *) (s2))[1]);            \
                       	if (l2 > 1 && __result == 0)                          \
                          {                                                   \
                            __result =                                        \
                              (__s1[2] - ((__const unsigned char *)           \
                                          (__const char *) (s2))[2]);         \
                            if (l2 > 2 && __result == 0)                      \
                              __result =                                      \
                               	(__s1[3]                                      \
                                 - ((__const unsigned char *)                 \
                                    (__const char *) (s2))[3]);               \
                          }                                                   \
                      }                                                       \
                    __result; }))
#endif

Normally, warnings such as you are seeing are suppressed for macros, so that unused paths like the one that accesses __s1[3] are not warned about.  But if you preprocess manually, and feed the result to another instance of clang, it will warn about all the possible forks of this particular if statement.
Comment 2 Eddy Jansson 2014-06-27 08:16:47 PDT
Created attachment 12713 [details]
Pre-processed source.
Comment 3 Eddy Jansson 2014-06-27 08:18:31 PDT
> Can you please post the preprocessed code?  I think this is caused by the
> following macro in glibc's /usr/include/bits/string2.h header:

Okay, that seems like the likely culprit then. I'm quite happy with this explanation.

> Normally, warnings such as you are seeing are suppressed for macros, so that
> unused paths like the one that accesses __s1[3] are not warned about.  But
> if you preprocess manually, and feed the result to another instance of
> clang, it will warn about all the possible forks of this particular if
> statement.

Ok. For context, we ran into this trying to build our code with ccache 3.1.x, so I guess that setup isn't really compatible then, unless maybe we use CCACHE_CPP2=1
Comment 4 Richard Smith 2014-06-27 08:51:50 PDT
(In reply to comment #3)
> > Can you please post the preprocessed code?  I think this is caused by the
> > following macro in glibc's /usr/include/bits/string2.h header:
> 
> Okay, that seems like the likely culprit then. I'm quite happy with this
> explanation.
> 
> > Normally, warnings such as you are seeing are suppressed for macros, so that
> > unused paths like the one that accesses __s1[3] are not warned about.  But
> > if you preprocess manually, and feed the result to another instance of
> > clang, it will warn about all the possible forks of this particular if
> > statement.
> 
> Ok. For context, we ran into this trying to build our code with ccache
> 3.1.x, so I guess that setup isn't really compatible then, unless maybe we
> use CCACHE_CPP2=1

I'm not sure how to get ccache to do this, but the trick will be to run the preprocessing step with -frewrite-includes. That preserves enough information that we should still be able to suppress the diagnostic.
Comment 5 Eddy Jansson 2014-06-27 10:12:25 PDT
(In reply to comment #4)
> I'm not sure how to get ccache to do this, but the trick will be to run the
> preprocessing step with -frewrite-includes. That preserves enough
> information that we should still be able to suppress the diagnostic.

Thanks for the tip, though in practice I haven't gotten it to work yet.

While adding -frewrite-includes fixes it on the original example, if we change the example to also use say 'strcasestr', the preprocessed code won't compile unless we pass -D_GNU_SOURCE on the /compile step/

$ ccache clang -frewrite-includes -D_GNU_CACHE -O2 -c dummy.c
In file included from /home/eddy/.ccache/tmp/dummy.tmp.dev214.15827.i:1:
In file included from <built-in>:1:
dummy.c:7:8: warning: implicit declaration of function 'strcasestr' is invalid in C99 [-Wimplicit-function-declaration]
                        if (strcasestr(argv[1], "apa"))
                            ^
1 warning generated.

I assume this is roughly what happens in ccache:
clang -O1 -D_GNU_SOURCE -E -frewrite-includes dummy.c > cache-file.i
(calculate hash for cache-file.i)
clang cache-file.i

and here cache-file.i doesn't compile because the define was essentially lost?

This however works:

$ clang -O1 -E -frewrite-includes dummy.c | clang -D_GNU_SOURCE -x c -
Comment 6 Richard Smith 2014-06-27 11:14:33 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > I'm not sure how to get ccache to do this, but the trick will be to run the
> > preprocessing step with -frewrite-includes. That preserves enough
> > information that we should still be able to suppress the diagnostic.
> 
> Thanks for the tip, though in practice I haven't gotten it to work yet.
> 
> While adding -frewrite-includes fixes it on the original example, if we
> change the example to also use say 'strcasestr', the preprocessed code won't
> compile unless we pass -D_GNU_SOURCE on the /compile step/

Yes, because -frewrite-includes doesn't expand macros, you'll need to pass the same -D flags to both the "preprocess" and the compile steps.

Perhaps we should translate command-line -D flags into #define directives in the output when we're in -frewrite-includes mode. (Patches welcome...)
Comment 7 Reid Kleckner 2015-07-03 10:05:28 PDT
*** Bug 21614 has been marked as a duplicate of this bug. ***
Comment 8 Reid Kleckner 2015-07-03 10:05:49 PDT
Re-titling.
Comment 9 Reid Kleckner 2015-07-03 10:07:22 PDT
*** Bug 24025 has been marked as a duplicate of this bug. ***
Comment 10 Martin Thomson 2015-08-07 17:01:15 PDT
Created attachment 14710 [details]
minimal repro

Attached is a minimal repro case.

Compiling with ``clang main.c -c -O2 -Wall -Werror`` fails.

Also, I noted a similar problem with a (crazy) call to tolower:

  map[tolower(start++)] = 1;

x.c:199:46: error: expression with side effects has no effect in an unevaluated context [-Werror,-Wunevaluated-expression]
 map[(__extension__ ({ int __res; if (sizeof (start++) > 1) { if (__builtin_constant_p (start++)) { int __c = (start++); __res = __c < -128 || __c > 255 ? __c : (*__ctype_tolower_loc ())[__c]; } else __res = tolower (start++); } else __res = (*__ctype_tolower_loc ())[(int) (start++)]; __res; }))] = 1;

This looks to be symptomatic of a general problem with macro expansion happening at the preprocessor leaving the compilation stage ignorant of where the macros came from.  I'm getting a lot of spurious -Wparentheses-equality warnings from macros in the following form:

#define M(X, Y) ((X) == (Y))
if (M(a, b)) // boom!
Comment 11 Richard Smith 2015-08-07 17:09:35 PDT
(In reply to comment #10)
> Compiling with ``clang main.c -c -O2 -Wall -Werror`` fails.

It works fine for me. As noted earlier in the bug, if you're using -E and feeding the result back into clang (maybe your 'clang' is a symlink to ccache and it's doing this for you?), you need to also use -frewrite-includes to prevent macro expansion from being performed in the "preprocessing" step.
Comment 12 Petr Špaček 2016-03-01 06:30:02 PST
*** Bug 15979 has been marked as a duplicate of this bug. ***
Comment 13 Petr Špaček 2016-03-01 06:31:11 PST
This is 100 % reproducible with:
glibc-2.22-10.fc23.x86_64
clang-3.7.0-4.fc23.x86_64

It is no longer reproducible with:
glibc-2.23.1-5.fc24.x86_64
clang-3.8.0-0.3.fc24.x86_64

I do not why, if a change happened in Glibc or Clang.
Comment 14 Dimitry Andric 2016-03-01 12:07:12 PST
The macro definitions haven't fundamentally changed:

https://sourceware.org/git/?p=glibc.git;a=blob;f=string/bits/string2.h;h=8200ef173da116589e2ff2e321984e80b1bb40e1;hb=HEAD#l797

However, maybe the "#if __GNUC_PREREQ (3, 2)" part is now chosen?  In that case it would use __builtin_strcmp() and avoid all the nastiness with comparing characters directly.
Comment 15 Reid Kleckner 2016-07-17 12:48:30 PDT
*** Bug 28583 has been marked as a duplicate of this bug. ***