Aligned thread-locals initialized to incorrect value
Posted: Wed Jun 15, 2022 2:10 pm
Hello! I have been debugging an issue for a while in which the initial value of __thread variables is wrong under certain circumstances (building for the 3DS). The reproduction program is the first code listing below. Notably, either of the following changes seems to resolve it:
- using ALIGN(8) or less for BUF_16
- building with -O1 or higher
From looking at objdump output, what I believe is happening (although I am no expert) is that the linker appears to be generating incorrect offsets for the thread-local variables. The constant pool at the end of main is shown in the second listing below.
The thread-local initializer data is shown in the third listing, which (if I understand correctly) would seem to indicate that the offsets should be 0x1C for BUF_16 and 0xC for BUF_4 (including the 0x8 ARM thread-local offset), rather than the 0x24 and 0x14 found in the constant pool. Hex-editing the binary to 0x1C and 0xC results in the expected behavior.
The reason I suspect the linker is that the object file itself has zero offsets for both variables (fourth listing); I presume these get filled in when the object file is relocated during linking.
I'm hoping for any ideas about what might be causing this (am I accidentally creating UB or something?), or for other workarounds to emit the proper offsets for thread-locals with alignment like this. I tried changing some ALIGN() directives in the 3dsx.ld linker script, but got inconsistent results. I'm a bit out of my depth when it comes to writing linker scripts, so I'm hoping someone here can point me in the right direction.
Code (reproduction program):
#include <3ds.h>
#include <stdio.h>
#include <string.h>

typedef ALIGN(4) struct {
    u8 inner[3];
} Align4;

typedef ALIGN(16) struct {
    u8 inner[3];
} Align16;

static __thread Align4 BUF_4 = {.inner = {2, 2, 2}};
static __thread Align16 BUF_16 = {.inner = {1, 1, 1}};

int
main(int argc, char** argv)
{
    gfxInitDefault();
    consoleInit(GFX_TOP, NULL);

    BUF_16.inner[0] = 0;

    bool reproduced = false;
    printf("[");
    for (int i = 0; i < 3; i++) {
        if (BUF_4.inner[i] != 2) {
            reproduced = true;
        }
        printf("%d, ", BUF_4.inner[i]);
    }
    printf("]\n");

    if (reproduced) {
        printf("reproduced!\n");
    }
    else {
        printf("nope");
    }

    // Main loop
    while (aptMainLoop()) {
        gspWaitForVBlank();
        hidScanInput();

        u32 kDown = hidKeysDown();
        if (kDown & KEY_START)
            break; // break in order to return to hbmenu

        // Flush and swap framebuffers
        gfxFlushBuffers();
        gfxSwapBuffers();
    }

    gfxExit();
    return 0;
}
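To help separate "is the source itself at fault?" from "is the toolchain at fault?", the same two invariants can be checked off-device. The following is only a rough sketch under stated assumptions: it swaps libctru's ALIGN() macro and __thread for the standard C11 _Alignas and _Thread_local, and the names HostAlign4, HostAlign16, host_buf_4, host_buf_16 and the three functions are hypothetical stand-ins, not part of the original program. A passing result on a desktop compiler says nothing about the 3DS toolchain; it only shows the pattern is expected to work in principle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for the post's Align4 / Align16.
   _Alignas on the member raises the alignment of the whole struct. */
typedef struct {
    _Alignas(4) uint8_t inner[3];
} HostAlign4;

typedef struct {
    _Alignas(16) uint8_t inner[3];
} HostAlign16;

static _Thread_local HostAlign4 host_buf_4 = {.inner = {2, 2, 2}};
static _Thread_local HostAlign16 host_buf_16 = {.inner = {1, 1, 1}};

/* True if both thread-locals still hold their static initializers. */
bool tls_initial_values_ok(void)
{
    for (int i = 0; i < 3; i++) {
        if (host_buf_4.inner[i] != 2 || host_buf_16.inner[i] != 1) {
            return false;
        }
    }
    return true;
}

/* True if each thread-local landed on its requested alignment. */
bool tls_alignment_ok(void)
{
    return ((uintptr_t)&host_buf_4 % 4 == 0) &&
           ((uintptr_t)&host_buf_16 % 16 == 0);
}

/* Convenience wrapper: aborts if either invariant is violated. */
void check_tls(void)
{
    assert(tls_initial_values_ok());
    assert(tls_alignment_ok());
}
```

Calling check_tls() at the top of a desktop main(), before anything writes to the buffers, exercises the same conditions the 3DS program prints.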
Code (objdump of main in the linked binary):
100d2c: eb0013c3 bl 105c40 <__aeabi_read_tp>
100d30: e1a03000 mov r3, r0
100d34: e59f2114 ldr r2, [pc, #276] ; 100e50 <main+0x148> ; example use of the offsets, in this case BUF_16
100d38: e3a01000 mov r1, #0
100d3c: e7c31002 strb r1, [r3, r2] ; BUF_16.inner[0] = 0
...
100e4c: e8bd8800 pop {fp, pc}
100e50: 00000024 .word 0x00000024 ; offset for BUF_16
100e54: 00000014 .word 0x00000014 ; offset for BUF_4
100e58: 00121000 .word 0x00121000
100e5c: 00121008 .word 0x00121008
100e60: 0012100c .word 0x0012100c
100e64: 00121018 .word 0x00121018
Code (objdump of the .tdata section):
Contents of section .tdata:
12af0c 00000000 02020200 00000000 00000000 ................
12af1c 00000000 01010100 ........
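The offset arithmetic in the post (position of each initializer within .tdata, plus the 0x8 ARM thread-local offset) can be written out as a small host-side sketch. All addresses and constants below are copied from the listings; the 0x8 figure is taken on the post's word rather than verified independently, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Figures copied from the listings in the post. */
#define TDATA_BASE     0x12af0cu /* start address of .tdata */
#define BUF_4_ADDR     0x12af10u /* where the 02 02 02 bytes begin */
#define BUF_16_ADDR    0x12af20u /* where the 01 01 01 bytes begin */
#define EMITTED_BUF_4  0x14u     /* constant pool word at 100e54 */
#define EMITTED_BUF_16 0x24u     /* constant pool word at 100e50 */

/* Assumption stated in the post: thread-local data begins 0x8 bytes
   past the thread pointer on ARM. */
#define ARM_TLS_OFFSET 0x8u

/* Offset the post expects: position inside .tdata plus 0x8. */
uint32_t expected_tp_offset(uint32_t var_addr)
{
    return (var_addr - TDATA_BASE) + ARM_TLS_OFFSET;
}

/* How far an emitted offset overshoots the expected one. */
uint32_t emitted_excess(uint32_t emitted, uint32_t var_addr)
{
    return emitted - expected_tp_offset(var_addr);
}

/* The expected offsets should equal the values that were hex-edited in. */
void check_hex_edited_values(void)
{
    assert(expected_tp_offset(BUF_16_ADDR) == 0x1Cu);
    assert(expected_tp_offset(BUF_4_ADDR) == 0xCu);
}

/* Print expected and emitted offsets side by side. */
void print_offsets(void)
{
    printf("BUF_16: expected 0x%X, emitted 0x%X\n",
           (unsigned)expected_tp_offset(BUF_16_ADDR), (unsigned)EMITTED_BUF_16);
    printf("BUF_4:  expected 0x%X, emitted 0x%X\n",
           (unsigned)expected_tp_offset(BUF_4_ADDR), (unsigned)EMITTED_BUF_4);
}
```

Both emitted offsets overshoot the expected ones by the same 0x8. One untested reading is that the 0x8 was rounded up to the 16-byte alignment of BUF_16 (giving 0x10), which would also fit the observation that ALIGN(8) or less avoids the problem; nothing in the listings confirms this.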
Code (objdump of main in the object file, before linking):
140: e24bd004 sub sp, fp, #4
144: e8bd8800 pop {fp, pc}
148: 00000000 .word 0x00000000 ; BUF_16
14c: 00000000 .word 0x00000000 ; BUF_4
150: 00000000 .word 0x00000000
154: 00000008 .word 0x00000008
158: 0000000c .word 0x0000000c
15c: 00000018 .word 0x00000018