This outputs (size_t)-1 in both cases with `gcc -O2` but clang gives 0 for the second case: #include <stdio.h> int main(int argc, char **argv) { printf("%zu\n", __builtin_object_size(NULL, 0)); void *p = NULL; printf("%zu\n", __builtin_object_size(p, 0)); return 0; } This is broken with glibc's _FORTIFY_SOURCE implementation for nullable parameters. It assumes that it can do comparisons like `object_size(buffer) < copy_size` after handling the unknown case, but this can result in a false positive with Clang. It's not a glibc bug because they're using GNU C as it is defined by GCC, which is where this feature comes from.
__builtin_object_size(p, 0) gives an upper bound on the number of bytes that can be accessed through pointer p, so both 0 and -1 seem like correct answers here, since LLVM explicitly assumes that there cannot exist a valid object at address 0. As you say, we may need to deliberately give a worse answer here to placate glibc. Which glibc function in particular are you having problems with, and how does it use __builtin_object_size? Whatever we do, the first case (constant-folded by Clang) and the second case (constant-folded by LLVM) should give the same answer.
Also, _FORTIFY_SOURCE + glibc + clang is not supported and does not work (for instance, it relies on __builtin_va_pack_len and friends, which we have no intention of supporting), so glibc compatibility is unlikely to be a strong motivator for a change here.
The realpath(name, resolved) function is one example. The second parameter can be NULL, in which case realpath allocates the string itself with malloc(). The glibc code has a check for (size_t)-1 and then fails if the detected size is smaller than PATH_MAX. This occurs in all of the _FORTIFY_SOURCE functions with similar nullable buffer parameters. I'm working on extending _FORTIFY_SOURCE in Android's libc (Bionic) and this ends up being annoying because the distinction between zero-size and NULL is important. It has to be handled like this: #if defined(__clang__) if (bos == 0 && !(__builtin_constant_p(resolved) && resolved != NULL)) { return __realpath_real(path, resolved); } #endif The code then checks for bos < PATH_MAX and errors out. I don't think 0 is necessarily a better answer because it makes it seem like there's a 0-size object when NULL really means there's no object at all. It's easy enough to detect NULL if you actually want to handle it separately, but I don't think that's the common case. It's harder to undo the merge of the two cases.
I know standard C doesn't actually have 0-size objects, but GNU C and other languages like Rust *do*, as does LLVM IR.
I would expect the common case to be that a null pointer is treated like a pointer to an object of size 0 -- not inherently an invalid argument, but invalid if the function requires the ability to access a nonzero number of bytes. If the function explicitly specifies that it has special behavior for a null argument, I'd expect the fortify check to also have an explicit special case for a null argument. In LLVM IR, the null pointer acts essentially like a pointer to a zero-sized object. FWIW, if we could prove that a pointer did not point to an object at all, I think 0 would be the right thing for __builtin_object_size to return when applied to that object -- the answer to the question "how many bytes can I access from this pointer?" is zero.
If it can figure out that there's a dangling pointer, 0 is definitely the right answer. I don't think it's the right way to handle NULL though.
(In reply to comment #1) > __builtin_object_size(p, 0) gives an upper bound on the number of bytes that > can be accessed through pointer p, so both 0 and -1 seem like correct > answers here From the docs: "type is an integer constant from 0 to 3. (...) The second bit determines if maximum or minimum of remaining bytes is computed." Which results in: When in doubt, "__builtin_object_size should return (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3." I think the docs are clear enough in this case. This has nothing to do with fortify or glibc.
This incompatibility just bit us in ChromeOS-land, so I'll throw out my 2c: Were I designing __builtin_object_size from the ground up, I'd totally agree that __builtin_object_size(NULL, N) should lower to 0. That said, my understanding of __builtin_object_size in clang is that it exists for compatibility with gcc. This NULL-pointer (mis-)feature is clearly relied on by glibc, and I'm uncertain how many other users of __builtin_object_size depend on it (implicitly or explicitly). Because __builtin_object_size, in my mind, exists primarily for compatibility, I don't see why we can't try to better match this corner case. I'm happy to make clang answer conservatively in this case; if I do, would you be okay with it Richard?
Re comment#7: > When in doubt, "__builtin_object_size should return (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3." But there is no doubt as to how many bytes can be accessed through a null pointer. The answer is zero bytes. > This has nothing to do with fortify or glibc. I disagree. Giving 0 for a null pointer is sound and principled, and the obvious right answer; the only reason to do something else is compatibility with GCC or with code that relies on the GCC semantics. And, to be clear, that compatibility *is* important. Re comment#8: > I'm happy to make clang answer conservatively in this case; if I do, would you be okay with it Richard? Yes, this seems like the best thing to do. But it's not clear that it's Clang you need to change; the "wrong" answer is coming from LLVM's lowering of @llvm.objectsize, and for that intrinsic, zero is definitely the correct answer for a null pointer. Ideally, we should solve this in a way that doesn't insert a branch for every non-constant use of __builtin_object_size...
> But it's not clear that it's Clang you need to change; the "wrong" answer is coming from LLVM's lowering of @llvm.objectsize, and for that intrinsic, zero is definitely the correct answer for a null pointer Yup! > Ideally, we should solve this in a way that doesn't insert a branch for every non-constant use of __builtin_object_size... I was thinking of adding a bit to @llvm.objectsize that controls how it handles NULL, similar to how it has the min/max bit already. If we make the default "null is 0 bytes", I'd imagine that should be fairly painless.
If the people on the llvm side are OK with that, it's fine by me.
(In reply to comment #10) > I was thinking of adding a bit to @llvm.objectsize that controls how it > handles NULL, similar to how it has the min/max bit already. If we make the > default "null is 0 bytes", I'd imagine that should be fairly painless. I recommend you send this as an RFC to the LLVM dev list before trying to implement it (unless it's really quick and serves as a demonstration). Changing core IR builtins should be done with a larger consensus.
Fixed by r298430+r298431. Thanks!