crash debugging help

Asked by peter toth

Hi Everyone,

I'm currently investigating a crash in firmware compiled with gcc 4.2.1 for the arm7tdmi architecture (I could use 4.9.3 if needed). I'm using an LPC2387 and I'm getting wdog resets. Instead of letting the wdog reset the chip, I use the wdog interrupt, so when it would otherwise reset, it enters my handler, which saves the state and prints a whole memory dump (64k only). So basically I know the registers just before the wdog reset and have a stack showing the whole call history.

On the stack I can see loads of references to the end of the function, and I see what look like many instructions stored as data in that memory region. I think this may be the reason for the halt and the consequent wdog interrupt. Any ideas what might be happening?

I guess one cause could be dereferencing a bad function pointer, but my function seems quite straightforward. It touches many hardware registers (interrupt, peripheral enable/disable).

Like this
2015/05/27 04:45:30: addr: 4000BF2C 7FE00390 -->this is "svcvc 0x00e00390" according to gcc 4.2.1 and ".word 0x7fe00390" according to 4.9.3.

At the end of the function I see this with gcc 4.9.3:
   191d4: e89d6ff8 ldm sp, {r3, r4, r5, r6, r7, r8, r9, sl, fp, sp, lr}
   191d8: e12fff1e bx lr
   191dc: 7fe00390 .word 0x7fe00390
   191e0: 40000044 .word 0x40000044
   191e4: 00064de5 .word 0x00064de5
   191e8: 00064dfb .word 0x00064dfb
   191ec: 4000107c .word 0x4000107c
   191f0: e0028000 .word 0xe0028000
   191f4: e01fc000 .word 0xe01fc000
   191f8: 40001084 .word 0x40001084
   191fc: 4000113c .word 0x4000113c
   19200: 3800b010 .word 0x3800b010
   19204: 40002a78 .word 0x40002a78
   19208: 40002ab4 .word 0x40002ab4
   1920c: 40002aa0 .word 0x40002aa0
   19210: 40001080 .word 0x40001080
   19214: 400001a9 .word 0x400001a9
   19218: e002c000 .word 0xe002c000
   1921c: 40001134 .word 0x40001134
   19220: 00064e0e .word 0x00064e0e

It used to look like this on gcc 4.2.1:
   1953c: 7fe00390 svcvc 0x00e00390
   19540: 40000044 andmi r0, r0, r4, asr #32
   19544: 0006d74c andeq sp, r6, ip, asr #14
   19548: 0006d764 andeq sp, r6, r4, ror #14
   1954c: 400012d0 ldrmid r1, [r0], -r0
   19550: e0028000 and r8, r2, r0
   19554: e01fc000 ands ip, pc, r0
   19558: 40001390 mulmi r0, r0, r3
   1955c: 40001394 mulmi r0, r4, r3
   19560: e002c040 and ip, r2, r0, asr #32
   19564: 40002e54 andmi r2, r0, r4, asr lr
   19568: e002c068 and ip, r2, r8, rrx
   1956c: e002c000 and ip, r2, r0
   19570: 40002e90 mulmi r0, r0, lr
   19574: e002c02c and ip, r2, ip, lsr #32
   19578: 3fffc000 svccc 0x00ffc000
   1957c: 40002e7c andmi r2, r0, ip, ror lr
   19580: 3fffc0a0 svccc 0x00ffc0a0
   19584: 400012d4 ldrmid r1, [r0], -r4
   19588: 400001a1 andmi r0, r0, r1, lsr #3
   1958c: 400012d8 ldrmid r1, [r0], -r8
   19590: 0006d778 andeq sp, r6, r8, ror r7

Can someone explain to me what is at the end of the function? What are the .word regions? Why would I see pointers to this area on the stack?

Thanks,
Peter

Question information

Language: English
Status: Solved
For: GNU Arm Embedded Toolchain
Assignee: No assignee
Solved by: peter toth
Thomas Preud'homme (thomas-preudhomme) said :
#1

Hi Peter,

What you see is most likely literal pools, i.e. immediates that are used by the function. Suppose for instance that your function contains:

foo = 239559;

The number might be too big to encode directly in an instruction. One solution is to use several instructions to build (the right term is synthesize) this number, but depending on the number this can need many instructions and take a lot of space. Another solution is to put the value after the function and load it from there. I'm guessing that's what you are seeing.
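
To make the encoding limit concrete: in ARM state, a data-processing immediate is an 8-bit value rotated right by an even number of bits, so only some 32-bit constants fit in a single instruction. The sketch below checks that rule for a given value. The helper name arm_imm_encodable is hypothetical and not part of any toolchain. It is only a rough model: a compiler may also use inverted forms such as MVN or synthesize the constant in several instructions, so a value that fails this check is a candidate for the literal pool rather than a guaranteed entry.

```c
#include <stdint.h>

/*
 * Returns 1 if v can be expressed as an 8-bit value rotated right by an
 * even amount (0, 2, ..., 30), which is the ARM-state data-processing
 * immediate form. Returns 0 otherwise.
 * Hypothetical helper for illustration only.
 */
int arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a rotate right by rot. */
        uint32_t r = rot ? ((v << rot) | (v >> (32u - rot))) : v;
        if (r <= 0xFFu)
            return 1;
    }
    return 0;
}
```

For example, arm_imm_encodable(239559) and arm_imm_encodable(0x7fe00390) both return 0, whereas arm_imm_encodable(0xFF000000) returns 1.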

Best regards.

peter toth (tothphu) said :
#2

Hi Thomas

Ah, so it is just that 4.2.1 decodes them as instructions rather than leaving them as values! Good to know; I was misled by that for a long time and never understood why some functions need svc instructions and some don't. 0x7FE00000 is a memory region for me. It all makes sense now.

Relating to the actual problem I'm trying to debug: is there a chance that the processor would simply stop executing?

What I'm seeing is that the wdog interrupt reports (through the saved register state), after waiting 15 seconds, that execution stopped at a very specific instruction (reproducibly). Generally that is the last instruction of the function that enables interrupts. How could it have stayed there for 15 seconds?

Thanks,
Peter

Thomas Preud'homme (thomas-preudhomme) said :
#3

Hi Peter,

It is not possible for me to answer, as this likely depends on your exact processor and I lack the specifics. My guess is that the problem is in an interrupt handler. What might happen is that once interrupts are enabled, either a storm of interrupts occurs, or an interrupt is serviced and its handler has a bug that leads to another interrupt being raised, or perhaps the handler contains an infinite loop.

Anyway, my suggestion would be to review your interrupt handlers.
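
One inexpensive way to check for an interrupt storm or a handler that never returns is to count handler entries and exits and dump the counters from the wdog handler. The sketch below is hardware independent. All names (irq_trace_enter, irq_trace_exit, irq_trace_busiest, irq_trace_stuck, IRQ_SOURCES) are hypothetical, and the table size of 32 sources is an assumption, so adjust it to your interrupt controller.

```c
#include <stdint.h>

#define IRQ_SOURCES 32u  /* assumption: number of interrupt sources */

static volatile uint32_t irq_entries[IRQ_SOURCES];
static volatile uint32_t irq_exits[IRQ_SOURCES];

/* Call as the first statement of each interrupt handler. */
void irq_trace_enter(uint32_t src)
{
    if (src < IRQ_SOURCES)
        irq_entries[src]++;
}

/* Call as the last statement of each interrupt handler. */
void irq_trace_exit(uint32_t src)
{
    if (src < IRQ_SOURCES)
        irq_exits[src]++;
}

/* Source with the most handler entries, or -1 if nothing fired. */
int irq_trace_busiest(void)
{
    int best = -1;
    uint32_t best_count = 0;
    for (uint32_t i = 0; i < IRQ_SOURCES; i++) {
        if (irq_entries[i] > best_count) {
            best_count = irq_entries[i];
            best = (int)i;
        }
    }
    return best;
}

/* First source entered more often than it returned, or -1 if none. */
int irq_trace_stuck(void)
{
    for (uint32_t i = 0; i < IRQ_SOURCES; i++) {
        if (irq_entries[i] != irq_exits[i])
            return (int)i;
    }
    return -1;
}
```

When the counters are dumped from the wdog handler, a very large entry count for one source points to a storm, while an entry count that is higher than the exit count points to a handler that was entered and never finished.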

Best regards.

peter toth (tothphu) said :
#4

Hi Thomas,

Thanks for the answer. I'll look into what is happening around my interrupts.

Cheers,
Peter