### Mips-X

Mips-X, as described earlier on the blog, was a Stanford University graduate project: a 32-bit RISC CPU with some unique features. For one, it had two delay slots for control-change instructions (branches and jumps). I am not aware of any other processor that has that. We had a visit from John Hennessy (Stanford Mips project faculty lead and ultimately university president) one day (not Mips-X related), and I asked him, "Why two delay slots?" His paraphrased answer was, "It was a graduate project; we were just trying things out."

### The Shifter

Mips-X had a barrel shifter and exposed it to the programmer via these opcodes:

asr rSRC,rDST,#1..32

rotlb rSRC1,rSRC2,rDST

rotlcb rSRC1,rSRC2,rDST

sh rSRC1,rSRC2,rDST,#1..32

Via a combination of the above, all the needed shift operations could be done. Observe, though, that there is **no variable shift**, just fixed # shift values.
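To make "a combination of the above" a little more concrete, here is a minimal Python sketch of how a funnel-style shift can synthesize the usual shifts. It assumes `sh` behaves like a generic funnel shifter (concatenate two 32-bit sources, shift right by a fixed 1..32, keep 32 bits); that is an assumption about the instruction, not something taken from the Mips-X manual, and the names `funnel_shift`, `lsr`, `lsl` and `rotr` are hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift(hi, lo, n):
    """Assumed model of a funnel shift: bits [n+31 : n] of the 64-bit value hi:lo."""
    assert 1 <= n <= 32, "fixed shift amounts only, matching the #1..32 encoding"
    wide = ((hi & MASK32) << 32) | (lo & MASK32)
    return (wide >> n) & MASK32


def lsr(x, n):
    """Logical shift right: zeros are funneled in from the top."""
    return funnel_shift(0, x, n)


def lsl(x, n):
    """Logical shift left by n (0..31): funnel right by 32 - n over a zero low word."""
    return funnel_shift(x, 0, 32 - n)


def rotr(x, n):
    """Rotate right: the same word is supplied as both halves."""
    return funnel_shift(x, x, n)
```

Note that the shift amount is still a constant in every helper, which mirrors the limitation described above: picking the amount at run time needs something extra.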

### My Shift function

Now here is a good puzzle for the reader: parse my variable shift func for lsr's.

r0 is always 0; it can be used as a src or dst.

r24 is the code segment offset (allows for position independent code off of r24).

r4 is the value to be shifted.

r5 has the #<shift>

r2 is the result.

r31 is the return address

.text

.noreorg

shift_table:

mov r4,r2

lsr r4,r2,#1

lsr r4,r2,#2

lsr r4,r2,#3

lsr r4,r2,#4

lsr r4,r2,#5

lsr r4,r2,#6

lsr r4,r2,#7

lsr r4,r2,#8

lsr r4,r2,#9

lsr r4,r2,#10

lsr r4,r2,#11

lsr r4,r2,#12

lsr r4,r2,#13

lsr r4,r2,#14

lsr r4,r2,#15

lsr r4,r2,#16

lsr r4,r2,#17

lsr r4,r2,#18

lsr r4,r2,#19

lsr r4,r2,#20

lsr r4,r2,#21

lsr r4,r2,#22

lsr r4,r2,#23

lsr r4,r2,#24

lsr r4,r2,#25

lsr r4,r2,#26

lsr r4,r2,#27

lsr r4,r2,#28

lsr r4,r2,#29

lsr r4,r2,#30

lsr r4,r2,#31

.globl ___lshrsi3

___lshrsi3:

nop

add r24,r5,r1

jspci r1,#shift_table,r0

jspci r31,#0,r0

nop

nop

.end
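For readers who want to check their reading of the listing above, here is a toy Python model of its control flow. It is a sketch under stated assumptions, not a Mips-X simulator: every jump takes effect after exactly two more instructions have executed, instruction addresses count in whole instructions (so r5 indexes the table directly), `shift_table` sits at address 0 with r24 = 0, and `HALT` is a hypothetical return address standing in for the caller. The saved PC is not modeled because both jumps link into r0.

```python
HALT = 1000  # hypothetical return address held in r31


def build_program():
    """Lay out shift_table at addresses 0..31 and the function entry at 32."""
    prog = {0: ("mov", 4, 2)}
    for n in range(1, 32):
        prog[n] = ("lsr", 4, 2, n)
    entry = 32
    prog[entry + 0] = ("nop",)
    prog[entry + 1] = ("add", 24, 5, 1)   # r1 = r24 + r5
    prog[entry + 2] = ("jspci", 1, 0)     # jump to r1 + shift_table (0 here)
    prog[entry + 3] = ("jspci", 31, 0)    # return jump, sitting in a delay slot
    prog[entry + 4] = ("nop",)
    prog[entry + 5] = ("nop",)
    return prog, entry


def run(value, shift):
    """Return (r2, list of executed addresses) for one call."""
    assert 0 <= shift <= 31
    prog, pc = build_program()
    regs = [0] * 32
    regs[4], regs[5], regs[31] = value & 0xFFFFFFFF, shift, HALT
    pending = []  # jumps in flight as [delay_slots_left, target]
    trace = []
    while pc != HALT:
        op = prog[pc]
        trace.append(pc)
        for jump in pending:          # this instruction fills one delay slot
            jump[0] -= 1
        if op[0] == "mov":
            regs[op[2]] = regs[op[1]]
        elif op[0] == "lsr":
            regs[op[2]] = regs[op[1]] >> op[3]
        elif op[0] == "add":
            regs[op[3]] = regs[op[1]] + regs[op[2]]
        elif op[0] == "jspci":
            pending.append([2, regs[op[1]] + op[2]])
        if pending and pending[0][0] == 0:
            pc = pending.pop(0)[1]    # oldest jump has used both slots
        else:
            pc += 1
    return regs[2], trace
```

Calling `run(0x80, 3)` in this model executes the addresses 32, 33, 34, 35, 36 and then 3: the table entry itself lands in the second delay slot of the return jump, so exactly one table instruction runs before control goes back to the caller.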

Look at the two jspci's above.

**A jspci in the delay slot of a jspci!** What happens? Also observe the nop at function entry. Why is that there? Well, this func's caller could have had a LD of r5 in the second delay slot of its jspci. In that case, if the add were the first instruction, r5 would be stale, as LDs have a one-instruction hazard:

jspci r24,#___lshrsi3,r31

nop

ld 0[r29],r5

___lshrsi3:

add r24,r5,r1

That is a hazard as r5 is still in transit in the pipeline when the add goes to use it. Thus the nop.
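The stale-register effect can be shown with a small Python sketch. It assumes the simplest model consistent with the description above: a load's result reaches the register file one instruction late, and nothing in hardware stalls the instruction in between. The `execute` helper, the register values and the memory contents are all hypothetical and chosen only for illustration.

```python
def execute(sequence, regs, memory):
    """Run instructions in order; a load's value lands one instruction late."""
    regs = dict(regs)
    in_flight = None  # (dst, value) issued by the previous instruction, if a load
    for op in sequence:
        arriving, in_flight = in_flight, None
        if op[0] == "ld":        # ("ld", base, offset, dst)
            in_flight = (op[3], memory[regs[op[1]] + op[2]])
        elif op[0] == "add":     # ("add", a, b, dst) reads before the load arrives
            regs[op[3]] = regs[op[1]] + regs[op[2]]
        # "nop" does nothing
        if arriving is not None:
            regs[arriving[0]] = arriving[1]
    if in_flight is not None:    # drain a load issued by the last instruction
        regs[in_flight[0]] = in_flight[1]
    return regs


# Hypothetical state: r5 holds a stale 99, the real shift count 3 is in memory.
start = {"r1": 0, "r5": 99, "r24": 0, "r29": 0x100}
memory = {0x100: 3}

load = ("ld", "r29", 0, "r5")      # ld 0[r29],r5 in the caller's delay slot
add = ("add", "r24", "r5", "r1")   # add r24,r5,r1 in the callee

without_nop = execute([load, add], start, memory)
with_nop = execute([load, ("nop",), add], start, memory)
```

In this model `without_nop["r1"]` is computed from the stale 99, while `with_nop["r1"]` sees the loaded 3, which is the reason for the nop at function entry.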