Monday, 27 January 2014

Initial Experimentation with Cross-Platform Assembler

In this particular exercise, there were two parts to be compiled and executed on two respective platforms - x86/64 and ARM64. Trivial sounding at first, but much more intricate and procedure intensive when labored through, the tasks were to initially create a 10 iteration loop that is outputted as a character string that shows the ascending integer of the iterator. This would be followed by expanding the loop to 30 iterations and presenting the incremented result in a "double digit" format that combines both the quotient and the remainder of the number when divided by 10.

Our group ran into a handful of obstacles throughout the process of completing both parts of the lab, and I have taken the liberty of jotting them down as we encountered them:

- Throughout the first part, we constantly fiddled around with the .data variable lengths and calculations in order to insert the result byte at the correct index of the prescribed string.

- A recurring error we ran into was a segmentation fault involving the index variable, with my first guess assuming that this was attributed to the declaration syntax. Later fixed with the inclusion of parentheses.

- Stepping through the execution with the GNU debugger (gdb) gave us a weird value at register 12 at first.This was corrected by changing the use to the right register (rdx instead rcx).

- The format came out wonky at first, the newline character did not get executed. The correction to this came with Nicholas's logic to include the calculations in the loop on where to store the byte in the string.

- In the Aarch64 port, the first main problem was trying to load the "index" variable to a register, since we couldn't find a replacement for the x86 equivalent of a -b suffix eg. "r13b".

- A segmentation fault followed yet again after finding "strb" which might or might not have worked for the problem above.

- We observed some interesting output when changing line 11 to "adr" from "mov": "qemu: Unsupported syscall: 0" in the 10 iterations of the loop.

- In part 2 for the x86 concerns (which were greatly reduced):

1) Hex values were for some odd reason confused with the ASCII conversion on the divisor.

- As for Aarch64:

1) I was especially not a fan of how the "msub" instruction was explained on the wiki and therefore had some trouble implementing it. Later deduced that the structure of the instruction is as follows:

msub "register to store calculated remainder in" "register where first numeral to be divided is stored" "register where second numeral to be divided is stored" "quotient (whole number result of division)"

Below is our completed code for Part 2 for x86/64, followed by ARM64:

/* x86 */

.text

.globl _start

_start:


    mov $0, %r15 

    mov $10, %r14 /*used to store the divider */ 

    loop:


        mov %r15, %rax

        mov $0, %rdx
        div %r14
      
        add $0x30, %al /*convert to ascii */ 
        add $0x30, %dl 

        mov %al, (msg + len - 3) /*  store the byte */

        mov %dl, (msg + len - 2)

movq $len,%rdx /* message length */

movq $msg,%rsi /* message location */
movq $0x1,%rdi /* file descriptor stdout */
movq $0x1,%rax /* syscall sys_write */
syscall

        inc %r15

        cmp $0x1e, %r15
        jne loop

movq $0,%rdi /* exit status */

movq $60,%rax /* syscall sys_exit */
syscall

.data


msg: .ascii      "Loop: ##\n"

len = . - msg

=================================================================


/* aarch64 */


.text


  .global _start


_start:

  
  mov x19, 0
  mov x23, 10 /* used for dividing */
  adr x24, msg

  loop:
    mov x20, x19/* calculate the byte  */ 
    udiv x21, x20, x23       /* r21 = i / 10 */
    msub x22, x21, x23, x20  /* r22 = i - (r21 * 10) ie gets the remainder*/ 

    add x21, x21, 0x30  /* convert to ascii */

    add x22, x22, 0x30

    strb w21, [x24, len - 3]/* set the byte  */ 

    strb w22, [x24, len - 2]
   
    mov x0, 1   /* print the loop */
    adr x1, msg
    mov x2, len
    mov x8, 64
    svc 0

    add x19, x19, 1 /* if not 30 revert to the beginning of the loop */

    cmp x19, 30
    bne loop

    mov x0, 0

    mov x8, 93
    svc 0 

.data
  msg: .ascii "Loop: ##\n"

  len = . - msg

Not included here, but completed later is the suppression of the zero digits on the quotient in the first 10 iterations of the loop, which ended up being a simple case of comparing the quotient to a zero stored in a register and skipping the register containing the 0 before output depending on whether or not the condition is true. This is was done like so:

x86 -  cmp $0x00, %rax
           je skip_10s

aarch64 - cmp x21, 0x00
                beq skip

No comments:

Post a Comment