Skip to main content

Shift on Mips-X

Mips-X


Mips-X as described earlier on the blog was a Stanford University grad project. A 32bit RISC CPU with some unique features for one, it had 2 delay slots for control change instructions, branches and jumps. I am not aware of any other processor that has that. We had a visit from John Hennessy (Stanford Mips project faculty lead and ultimately university president) one day (not Mips-X related) and I asked him, "why two delay slots?" his paraphrased answer was "It was a graduate project, we were just trying things out".

The Shifter

Mips-X had a barrel shifter and exposed it to the programmer via these opcodes:

asr    rSRC,rDST,#1..32
rotlb  rSRC1,rSRC2,rDST
rotlcb rSRC1,rSRC2,rDST
sh     rSRC1,rSRC2,rDST,#1..32


Via a combination of the above, all the needed shift operations could be done. Observe though there is no variable shift, just fixed # shift values.

My Shift function

Now here is a good puzzle for the reader to parse my variable shift func for lsr.s.

r0 == 0 -- can be a src or dst
r24 is the code segment offset (allows for position independent code off of r24).
r4 is the value to be shifted.
r5 has the #<shift>
r2 is the result.
r31 is the return address

.text
.noreorg
shift_table:
        mov     r4,r2
        lsr     r4,r2,#1
        lsr     r4,r2,#2
        lsr     r4,r2,#3
        lsr     r4,r2,#4
        lsr     r4,r2,#5
        lsr     r4,r2,#6
        lsr     r4,r2,#7
        lsr     r4,r2,#8
        lsr     r4,r2,#9
        lsr     r4,r2,#10
        lsr     r4,r2,#11
        lsr     r4,r2,#12
        lsr     r4,r2,#13
        lsr     r4,r2,#14
        lsr     r4,r2,#15
        lsr     r4,r2,#16
        lsr     r4,r2,#17
        lsr     r4,r2,#18
        lsr     r4,r2,#19
        lsr     r4,r2,#20
        lsr     r4,r2,#21
        lsr     r4,r2,#22
        lsr     r4,r2,#23
        lsr     r4,r2,#24
        lsr     r4,r2,#25
        lsr     r4,r2,#26
        lsr     r4,r2,#27
        lsr     r4,r2,#28
        lsr     r4,r2,#29
        lsr     r4,r2,#30
        lsr     r4,r2,#31
.globl ___lshrsi3
___lshrsi3:
        nop
        add     r24,r5,r1
        jspci   r1,#shift_table,r0
        jspci   r31,#0,r0
        nop
        nop
.end


Look at the two jspci's above. A jspci in the delay slot of a jspci! What happens? Also observe the nop at function entry. Why is that there? Well, this func's caller could have had a LD of r5 in the second delay slot of the jspci. In that case, if add were the first instruction, r5 would be stale as LD's have a one instruction hazard.


        jspci   r24,#___lshrsi3,r0
        nop
        ld      0[r29],r5
___lshrsi3:
        add     r24,r5,r1

That is a hazard as r5 is still in transit in the pipeline when the add goes to use it. Thus the nop.

Comments

Popular posts from this blog

The Air32F103

 Bluepills Its no surprise that Bluepills, such as they were, a smoking sub $2 deal would dry up eventually, taken over by clones, some subtle, some overt. With supply issues still plaguing ST parts for hobbyists, what about these cheap Chinese clone parts? Lets dive into the Air32F103. Air32F103 I bought 5 of these $1.90 boards. They look like this: Some notables vs a stock bluepill: 1) Castellated pins with flat underside for use as a module 2) 3 LEDs R/G/B (vs just the G for a bluepill) 3) Clone of ST's peripherals 4) over clockable to 256Mhz (spec is 216Mhz) (original STM32F103CB is 72Mhz) 5) 32K of ram. But, via secret regs, 97K(!) 6) top and bottom debug pads, top for JLINK, bottom, legacy STLINK SWD 7) USB C vs USB mini 8) BOOT as a button vs jumper 9) 2 12bit DACs (STM32F103CB's don't have that) 10) QSPI (hidden support) 11) Undocumented crypto block from MegaHunt (includes: AES/DES/3DES/SHA/SM[1,3,4,7]) Clocking As mentioned 256Mhz is possible. One Q is how do they...

Ada on a $2 eBay 'Bluepill' board (STM32F103C8T6)

For about $2 a 'bluepill' board can be obtained from eBay. Taking its name from the PCB patina this small board is used in the Arduino community. The SoC itself per ST's early docs on it was 64k flash and 20k data, that said, all STM32F103C8's you buy nowadays are 128k flash. The CPU is a Cortex-M3 which can be run at 72Mhz maximum. There is a good website that shows the schematic and pinout of the bluepill: http://wiki.stm32duino.com/index.php?title=Blue_Pill If you are an Arduino programmer, the link above will take you where you want to go. But suppose you wanted to try programming and using the bluepill a different way? Well that is what this blog entry is about. An Ada port was done via AdaCores Ada_Drivers_Library and the Libre GNAT toolchain to the bluepill. This is preliminary work but it is able to generate working code that the author already is using as of yesterday (a garage parking measurement sensor).  The port of the library is derivative, ...

The STM32MP257F-EV1

 This is the follow on to the STM32MP1 series. This time a dual core Cortex-A35 a Cortex-M33 and a Cortex-M0+. As shipped, they provide a 16G micro SD card with Linux on it. That card is partitioned with a lot of odd small partitions. The important ones are at the end, p10 and p11. Unfortunately p10, the / partition is only 4G so immediately there is a storage crisis when you start populating the usual tools. A quick dd from that card to a 256G card I had lying around and we now have a start of something useful. Next is to run gparted on my RPi5 and first move p11 which is /usr/local to 1MiB from the end. Don't leave that as 0 or the move will fail. Once moved, you can resize p10 to be 200+G. Now we have some useful space. I added a user with sudoers privs. There is no date/time setup when it boots, that needs to get rectified with a ntp callout.  Now... there is no C compiler or binutils. binutils got installed. Lots of apt update -> apt install xyz's. Here is an interesti...