I have some C code:
#include "stdio.h"

typedef struct num {
    unsigned long long x;
} num;

int main(int argc, char **argv) {
    struct num anum;
    anum.x = 0;
    __asm__("movq %%rax, %0\n" : "=m" (anum.x) : "rax"(2));
    printf("%llu\n", anum.x);
}
which I'm compiling and running on my (Intel) Mac laptop.
The output from the code seems to differ depending on whether I compile with (GNU) gcc or clang.
I compile with gnucc -o gnu-test test.c for gcc (I built gnucc from source on my Mac after downloading the source from https://gcc.gnu.org/install/download.html) and with clang -o clang-test test.c for clang (the built-in macOS clang).
On my Mac, with gcc, the result is 2 (which is what I expect). With clang, the result is 140701838959608.
The clang result seems wrong to me, but I'm also wondering if, perhaps, my inline assembly isn't quite correct and gcc just happens not to expose my error.
I tried the same code on godbolt.org and the output there also differs between gcc (x86-64 gcc 13.2 gives 2) and clang (x86-64 clang 16.0.0 gives 140726522786920).
I tried disassembling the clang binary with objdump -d:
clang-test: file format mach-o 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000100003f60 <_main>:
100003f60: 55 pushq %rbp
100003f61: 48 89 e5 movq %rsp, %rbp
100003f64: 48 83 ec 20 subq $32, %rsp
100003f68: 89 7d fc movl %edi, -4(%rbp)
100003f6b: 48 89 75 f0 movq %rsi, -16(%rbp)
100003f6f: 48 c7 45 e8 00 00 00 00 movq $0, -24(%rbp)
100003f77: 48 8d 45 e8 leaq -24(%rbp), %rax
100003f7b: b9 02 00 00 00 movl $2, %ecx
100003f80: 48 89 00 movq %rax, (%rax)
100003f83: 48 8b 75 e8 movq -24(%rbp), %rsi
100003f87: 48 8d 3d 16 00 00 00 leaq 22(%rip), %rdi ## 0x100003fa4 <_printf+0x100003fa4>
100003f8e: b0 00 movb $0, %al
100003f90: e8 09 00 00 00 callq 0x100003f9e <_printf+0x100003f9e>
100003f95: 31 c0 xorl %eax, %eax
100003f97: 48 83 c4 20 addq $32, %rsp
100003f9b: 5d popq %rbp
100003f9c: c3 retq
Disassembly of section __TEXT,__stubs:
0000000100003f9e <__stubs>:
100003f9e: ff 25 5c 00 00 00 jmpq *92(%rip) ## 0x100004000 <_printf+0x100004000>
The instruction at 100003f80 (48 89 00, movq %rax, (%rax)) seems to be the issue. clang has the correct value in ecx and the correct address to write to in rax, but it emits movq %rax, (%rax) instead of movq %rcx, (%rax)?
1 Answer
Clang is generating correct code, but you specified the incorrect constraint on the input operand.
The constraint ("rax") is not interpreted as a register name. Instead, each letter in the constraint specifies an allowed operand type. The first letter here, r, allows using any general register, which makes the choice of rcx valid. Your template hard-codes %rax, though, and in the clang build that register holds the address of anum.x rather than the value 2, so the address is what gets stored and printed.
To constrain the operand to the rax register, you need to use the "a" constraint. See the x86 section in the machine constraints page.
__asm__("movq %%rax, %0\n" : "=m" (anum.x) : "a"(2));
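As a minimal sketch of the fix (not the only way to write it), the two helpers below store a value through inline asm, assuming an x86-64 target with GCC or Clang. The names store_with_a and store_with_r are hypothetical, and the plain-C fallback exists only so the file still compiles on other architectures. The first helper pins the input to rax with "a". The second lets the compiler pick any general register with "r" and refers to it through the %1 placeholder, so the template never names a register itself. Both take a 64-bit value so that the operand covers the whole register that movq reads.

```c
#include <stdio.h>

/* Hypothetical helper: pins the input to RAX with the "a" constraint,
   so the hard-coded %%rax in the template really holds `value`. */
static unsigned long long store_with_a(unsigned long long value) {
    unsigned long long dst = 0;
#if defined(__x86_64__)
    __asm__("movq %%rax, %0" : "=m"(dst) : "a"(value));
#else
    dst = value; /* fallback so the sketch builds on non-x86-64 targets */
#endif
    return dst;
}

/* Hypothetical helper: lets the compiler choose any general register
   ("r") and names it via the %1 placeholder instead of hard-coding one. */
static unsigned long long store_with_r(unsigned long long value) {
    unsigned long long dst = 0;
#if defined(__x86_64__)
    __asm__("movq %1, %0" : "=m"(dst) : "r"(value));
#else
    dst = value; /* fallback so the sketch builds on non-x86-64 targets */
#endif
    return dst;
}
```

Calling store_with_a(2) or store_with_r(2) from main and printing the result with %llu should give 2 with either compiler, at any optimization level.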
@mtraceur "rax" is not treated as a register name, but as a combination of 3 different constraints: r, a, and x. The correct way to specify a constraint on rax, eax, or ax (depending on the operand size) is to use the "a" constraint instead of "rax". – interjay 15 hours ago
interjay, so the constraints are union rather than intersection, I guess. In other words, r (any register) and a (ax-type register) would be permissively r rather than restrictively a? – paxdiablo 15 hours ago
@paxdiablo Yes. The GCC documentation says: "The simplest kind of constraint is a string full of letters, each of which describes one kind of operand that is permitted." – interjay 15 hours ago
Just a side note: there is an experimental GCC port for ia86 that breaks with this tradition of one letter = one constraint. stackoverflow.com/questions/62686259/… – Michael Petch 14 hours ago
Great, thanks for clarifying. Here's a +1, and I recommend adding ""rax" is not treated as a register name, but as a combination of 3 different constraints r, a, and x" and "when combining r (any register) and a (ax-type register), r wins" to the answer itself. – mtraceur 13 hours ago
It does it correctly: godbolt.org/z/44oTGMxcY but only if you enable optimizations, as your assumptions are not valid without the optimizations. – 15 hours ago
I hate gas syntax, but I can't find the expected "movq $0, %rax" instruction in the assembly output so I think something's wrong here. – 15 hours ago
@Joshua: Where were you expecting to see a mov-immediate of 0 to a register? In a debug build (which this obviously is), anum.x = 0; compiles to movq $0, -24(%rbp). (The leaq -24(%rbp), %rax is preparing a register for (%rax) to be the addressing mode for "=m".) movb $0, %al is zeroing AL to tell the variadic printf there are zero XMM register args. xorl %eax, %eax implements the implicit return 0 at the bottom of main. There's never any reason for a compiler to emit movq $0 to a register; at most it'd use movl $0, %eax if not xor-zeroing. – 12 hours ago
@PeterCordes: I'm expecting to see the actual instruction in the asm block somewhere in the compiler output. – 12 hours ago
Ah I see. The asm template is "movq %%rax, %0", also using AT&T syntax, so it's moving RAX to whatever the %0 placeholder expands to (a lot like a printf format string, hence the %% to get a literal %). movq %rax, (%rax) is the expansion of the asm template. (The compiler picked RAX for the address, and a different register the template didn't use for the "rax"(2) input constraint.) This would be clearer if the OP looked at the compiler's asm output (clang -S) instead of disassembly, since then they could put a comment inside the asm template and make it super easy to find it. – 11 hours ago
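The suggestion in the last comment, putting a comment inside the asm template, could look roughly like the sketch below. It assumes an x86-64 target and AT&T syntax, where # starts an assembler comment. The marker text and the helper name tagged_store are made up for illustration, and the plain-C fallback is there only so the file compiles on other architectures.

```c
#include <stdio.h>

/* Hypothetical helper: the "# BEGIN"/"# END" lines are assembler
   comments, so they do not change the generated machine code; they
   only make the expanded template easy to find in the -S output. */
static unsigned long long tagged_store(unsigned long long value) {
    unsigned long long dst = 0;
#if defined(__x86_64__)
    __asm__("# BEGIN tagged-store\n\t"
            "movq %%rax, %0\n\t"
            "# END tagged-store"
            : "=m"(dst)
            : "a"(value));
#else
    dst = value; /* fallback so the sketch builds on non-x86-64 targets */
#endif
    return dst;
}
```

Compiling with clang -S (or gcc -S) and searching the resulting .s file for the marker text shows exactly which register and addressing mode were substituted into the template.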