Why is it invalid for a union type declared in one function to be used in another function?
When I read ISO/IEC 9899:1999 (see 6.5.2.3), I saw an example like this (emphasis mine):
The following is not a valid fragment (because the union type is not visible within function f):
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 * p1, struct t2 * p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g()
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    /* ... */
    return f(&u.s1, &u.s2);
}
I found no errors or warnings when I tested.
My question is: Why is this fragment invalid?
The example attempts to illustrate the paragraph beforehand [1] (emphasis mine):
6.5.2.3 ¶6
One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one
of these structures, it is permitted to inspect the common initial
part of any of them anywhere that a declaration of the completed type
of the union is visible. Two structures share a common initial
sequence if corresponding members have compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.
Since f is declared before g, and furthermore the unnamed union type is local to g, there is no question that the union type isn't visible in f.
The example doesn't show how u is initialized, but assuming the last member written to is u.s2.m, the function has undefined behavior because it inspects p1->m without the common initial sequence guarantee being in effect.
The same goes the other way: if it's u.s1.m that was last written to before the function call, then accessing p2->m is undefined behavior.
Note that f itself is not invalid. It's a perfectly reasonable function definition. The undefined behavior stems from passing into it &u.s1 and &u.s2 as arguments. That is what's causing undefined behavior.
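For contrast, here is a minimal sketch of a variant in which the guarantee does apply. It assumes the union is given a file-scope tag (u12, a name chosen for illustration and not part of the standard's example) and that the function receives a pointer to the union and reaches both members through it, so the completed union type is visible at every access. The names f_union and g_union are likewise hypothetical.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical tag: the union is declared at file scope so that
   its completed type is visible inside f_union. */
union u12 {
    struct t1 s1;
    struct t2 s2;
};

/* Same logic as f, but every access goes through the union object. */
int f_union(union u12 *pu)
{
    if (pu->s1.m < 0)
        pu->s2.m = -pu->s2.m;
    return pu->s1.m;
}

int g_union(void)
{
    union u12 u;
    u.s1.m = -1;
    return f_union(&u);
}
```

Under these assumptions g_union returns 1: the store through s2 and the later read through s1 are both accesses to members of the same union object, so the compiler cannot treat them as unrelated.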
[1] I'm quoting n1570, the C11 standard draft. But the specification should be the same, subject only to moving a paragraph or two up/down.
Here is the strict aliasing rule in action: one assumption made by the C (or C++) compiler is that pointers to objects of different types never refer to the same memory location (i.e. never alias each other).
This function
int f(struct t1* p1, struct t2* p2);
is compiled under the assumption that p1 and p2 do not point to the same object, because they formally point to different types. As a result, the optimizer may assume that p2->m = -p2->m; has no effect on p1->m: it can first read the value of p1->m into a register, compare it with 0, perform p2->m = -p2->m; if the value is less than 0, and finally return the register value unchanged!
The union is what makes p1 and p2 hold the same address at the binary level, because all union members have the same address.
Another example:
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1* p1, struct t2* p2)
{
    if (p1->m < 0) p2->m = -p2->m;
    return p1->m;
}
int g()
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    u.s1.m = -1;
    return f(&u.s1, &u.s2);
}
What must g return? +1 according to common sense (we change -1 to +1 in f). But look at gcc's generated assembly with -O1 optimization:
f:
    cmp DWORD PTR [rdi], 0
    js .L3
.L2:
    mov eax, DWORD PTR [rdi]
    ret
.L3:
    neg DWORD PTR [rsi]
    jmp .L2
g:
    mov eax, 1
    ret
So far everything is as expected. But when we try it with -O2:
f:
    mov eax, DWORD PTR [rdi]
    test eax, eax
    js .L4
    ret
.L4:
    neg DWORD PTR [rsi]
    ret
g:
    mov eax, -1
    ret
The return value is now a hardcoded -1.
This is because f at the beginning caches the value of p1->m in the eax register (mov eax, DWORD PTR [rdi]) and does not reread it after p2->m = -p2->m; (neg DWORD PTR [rsi]) - it returns eax unchanged.
The union is used here only because all non-static data members of a union object have the same address. As a result, &u.s1 and &u.s2 refer to the same address.
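The address relationship can be checked directly. The sketch below (members_share_address is an illustrative name, not from the question) converts both member addresses to void * before comparing them, because struct t1 * and struct t2 * are not compatible pointer types and cannot be compared with == as they stand.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Returns 1 when both members of the union start at the same address. */
int members_share_address(void)
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;

    return (void *)&u.s1 == (void *)&u.s2;
}
```

The function returns 1, since a pointer to a union, suitably converted, points to each of its members.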
For anyone who does not read assembly, the effect of strict aliasing on f can be shown in C/C++:
int f(struct t1* p1, struct t2* p2)
{
    int a = p1->m;
    if (a < 0) p2->m = -p2->m;
    return a;
}
The compiler caches the value of p1->m in the local variable a (actually in a register, of course) and returns it, even though p2->m = -p2->m; changes p1->m. The compiler assumes that the memory p1 points to is not affected, because it assumes that p2 points to other memory that does not overlap with it.
So with different compilers and different optimization levels, the same source code can return different values (-1 or +1). That is undefined behavior showing itself in practice.
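For comparison, here is a sketch of the same logic with both parameters of the same type (f_same and g_same are hypothetical names). Two pointers to struct t1 are allowed to alias, so the compiler has to assume that the store through p2 may change p1->m and must re-read it before returning.

```c
struct t1 { int m; };

/* Both parameters have type struct t1 *, so they may legally alias. */
int f_same(struct t1 *p1, struct t1 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;   /* must observe the store made through p2 */
}

int g_same(void)
{
    struct t1 s;
    s.m = -1;
    return f_same(&s, &s);
}
```

Because this call is well defined, g_same returns 1 regardless of optimization level.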
One of the major purposes of the Common Initial Sequence rule is to allow functions to operate on many similar structures interchangeably. Requiring that compilers presume that any function which acts upon a structure might change the corresponding member in any other structure that shares a common initial sequence, however, would have impaired useful optimizations.
Although most code which relies upon the Common Initial Sequence guarantees makes use of a few easily recognizable patterns, e.g.
struct genericFoo {int size; short mode; };
struct fancyFoo {int size; short mode, biz, boz, baz; };
struct bigFoo {int size; short mode; char payload[5000]; };
union anyKindOfFoo {struct genericFoo genericFoo;
struct fancyFoo fancyFoo;
struct bigFoo bigFoo;};
...
if (readSharedMemberOfGenericFoo( &myUnion->genericFoo ))
    accessThingAsFancyFoo( &myUnion->fancyFoo );
return readSharedMemberOfGenericFoo( &myUnion->genericFoo );
revisiting the union between calls to functions that act on different union members, the authors of the Standard specified that visibility of the union type within the called function should be the determining factor for whether functions should recognize the possibility that an access to, e.g., the field mode of a fancyFoo might affect the field mode of a genericFoo. The requirement to have a union containing all types of structures whose address might be passed to readSharedMemberOfGenericFoo in the same compilation unit as that function makes the Common Initial Sequence rule less useful than it would otherwise be, but would at least make some patterns like the above usable.
The authors of gcc and clang thought that treating union declarations as an indication that the types involved might be involved in constructs like the above would be an impractical impediment to optimization, however, and figured that since the Standard doesn't require them to support such constructs via other means, they'll simply not support them at all. Consequently, the real requirement for code that would need to exploit the Common Initial Sequence guarantees in any meaningful fashion is not to ensure that a union type declaration is visible, but to ensure that clang and gcc are invoked with the -fno-strict-aliasing flag. Also including a visible union declaration when practical wouldn't hurt, but it is neither necessary nor sufficient to ensure correct behavior from gcc and clang.