Bug report

Asked by emblocks

A lot of the modifications revision 19380 are wrong and will give errors at certain optimization levels with certain code.

In the thumb2.md file are all the functions handling the short thumb2 (thumb inherited) instructions like:

ADDS r6,r6,#34 = 2 bytes length

However:

ADD r6,r6,#34 = thumb2 instruction, hence 4 bytes long.

As you can seen in the arm.md here below, is this instruction tagged as 2 bytes long in the arm.md file.
The result is "branch out of range" errors from the assembler.

I saw that a lot of this is changed that way (MUL,SUB etc) which are all needed to be handled by the thumb2.md (as in the mainline)

I suggest to revert that whole revision 193980 (I personally have moved to the mainline with all the embedded related stuff patched into it).

Regards

;; The r/r/k alternative is required when reloading the address
;; (plus (reg rN) (reg sp)) into (reg rN). In this case reload will
;; put the duplicated register first, and not try the commutative version.
(define_insn_and_split "*arm_addsi3"
  [(set (match_operand:SI 0 "s_register_operand" "=kind of l, r, k,r,r, k, l, r, k,r, k, r")
 (plus:SI (match_operand:SI 1 "s_register_operand" "%0, rk,k,r,rk,k, 0, rk,k,rk,k, rk")
   (match_operand:SI 2 "reg_or_int_operand" "Py,rI,rI,k,Pj,Pj,Pv,L, L,PJ,PJ,?n")))]
  "TARGET_32BIT"
  "@
   add%?\\t%0, %1, %2
   add%?\\t%0, %1, %2
   add%?\\t%0, %1, %2
   add%?\\t%0, %2, %1
   addw%?\\t%0, %1, %2
   addw%?\\t%0, %1, %2
   sub%?\\t%0, %1, #%n2
   sub%?\\t%0, %1, #%n2
   sub%?\\t%0, %1, #%n2
   subw%?\\t%0, %1, #%n2
   subw%?\\t%0, %1, #%n2
   #"
  "TARGET_32BIT
   && GET_CODE (operands[2]) == CONST_INT
   && !const_ok_for_op (INTVAL (operands[2]), PLUS)
   && (reload_completed || !arm_eliminable_register (operands[1]))"
  [(clobber (const_int 0))]
  "
  arm_split_constant (PLUS, SImode, curr_insn,
               INTVAL (operands[2]), operands[0],
        operands[1], 0);
  DONE;
  "
  [(set_attr "length" "2,4,4,4,4,4,2,4,4,4,4,16")
   (set_attr "predicable" "yes")
   (set_attr "arch" "t2,*,*,*,t2,t2,t2,*,*,t2,t2,*")]

Question information

Language:
English Edit question
Status:
Solved
For:
GNU Arm Embedded Toolchain Edit question
Assignee:
No assignee Edit question
Solved by:
emblocks
Solved:
Last query:
Last reply:
Revision history for this message
chengbin (can-finner) said :
#1

Hi emblocks,
Thanks for reporting this bug to us.
Could you please provide a test case which reveals the bug (including the command line)? The non-flag setting instructions are expected to be converted to its flag setting version in gcc machine reorg pass.

Thanks.

Revision history for this message
chengbin (can-finner) said :
#2

In general cases, such non-flag setting instructions will be converted to its flag setting version, which is 2 bytes long, in machine reorg pass. But it won't happen if such instructions interleave with other flag setting/using instruction pair, which causes the problem.
We can be conservative, setting the length of such instructions to 4 to fix the problem, or just revert the patch.

Revision history for this message
emblocks (gnugcc) said :
#3

I had this problem with optimization -O1

I think you can safely revert. As you can see in the tumb2.md there are addsi_short (something like that) which will handle all the flagged situations. Also for all the other short instructions. If I check the intermediates I see that all the flagged are tagged with the right length.

I don't see really significant changes in code size between this ARM version and the branch-7 head.
I made a branch-7 head with most of the patches that this branch is using to get further.

I have a free IDE with this GCC compiler, you can take that to see the differences between both versions.

http://www.emblocks.org (this is not commercial, just informative)

Cheers.

Revision history for this message
chengbin (can-finner) said :
#4

Well, could you provide an example program that reveals this bug?

Thanks very much.

Revision history for this message
emblocks (gnugcc) said :
#5

That's a bit difficult because it's concerning a project of a customer.

However, I first didn't realize that you guys are using different machine descriptions so I first started a bug for the branch-4.7
I have an intermediate posted there which will show you what is going on.

I was first looking in the wrong direction although I was on the right spot.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56441

Revision history for this message
emblocks (gnugcc) said :
#6

For those who are following this thread, I had a special patch from chengbin to test, thanks for that, and a part of our discussion is printed here to make this thread complete.

Chengbin:

Hi,
Firstly we are sincerely sorry about the inconvenience caused by this bug and thanks very much for reporting it to us.

After investigation, I think the bug is caused by the incorrect instruction length set in the check-in r193980, but I am having trouble to create a test case to verify. So here I attached this patch for our released GCC fixing the problem, could you please help us apply and verify whether it can fix the problem?

Since This patch does improve code size in our release by 1% when optimizing for O2/O3, we want to keep improving and upstreaming, rather than simply discard it, and we really appreciated if you could help us on this issue.

Thanks-chengbin

Answer emblocks:

Hi Bin,

You don't have to apologize, every effort for the open community should be appreciated.

I have applied those patches manually to the branch-7 head which I'm using now.
Together with your modifications in thumb1 in arm.c.

I can confirm that this actual error is now gone.

However, about your statement that this is giving 1% code reduction I can see that this is not correct.
Here are my numbers:

                                       -O2 -O3
Normal-branch 114132 133820
ARM-branch 114072 133704

Reduction 0.05% 0.08%

These numbers are noise and can't be called real improvements.

I think that it doesn't make sense to change the machine description and leaving the well known, aggressively tested, branch for a lot of uncertainty.

I like to Newlib nano approach and also the special library buildings (LIBRARY_REQUIRED).

Regards

Revision history for this message
chengbin (can-finner) said :
#7

Hi,
Thanks for verifying the patch and glad to hear that it works.

The improvement is specific on Thumb2 instructions set and for our gcc branch. Builds of official GCC trunk/branch are not included, because regrename pass is disabled by default.

Thanks again.

Revision history for this message
emblocks (gnugcc) said :
#8

Well all the tests are THUMB2 that's why I got that error in the first place.

However, an enabled register renaming pass doesn't show any significant changes between both.
I like to have a positive vibe about this but I have to admit that I'm sceptical about these special optimizations.

But I recognize the mechanism that, if you put a lot of time in something, it is hard to look objective afterwards what it brought you.

Keep up the good work.