Monday, May 25, 2020

SVD file optimization


Optimizing SVD files

SVD files, what are they?


This webpage has a good overview; after all, its authors defined the format:

https://www.keil.com/pack/doc/CMSIS/SVD/html/index.html

Basically, it's an XML file that describes an SoC from peripherals to registers to individual fields. It is typically sucked in by a debugger to get a meaningful view into an SoC. It has also been used as an input to some very useful tools, such as...

svd2ada

When I started looking at Ada for embedded ARM hacking some years back, AdaCore had an early library, Ada_Drivers_Library. It had drivers for all the peripherals in some STM32F4 series parts. How it did this was interesting: underneath each driver was a description of the HW in a bunch of .ads files that were... automatically generated. AdaCore had written a tool, svd2ada, that would parse the SVD file provided by the vendor, in this case STMicroelectronics, and produce a detailed specification of each peripheral in the part along with record types for each register and its enclosed fields. Quite eye-opening to me, having waded through many vendors' C .h files that were festooned with #define masks all over the place, describing the encoding of fields in regs in a flat, non-hierarchical fashion.

Example

Here is a peripheral from the STM32L562, the on-the-fly decryption engine. We see the name of the peripheral, a description and the all-important base address:

<peripheral>
      <name>OTFDEC1</name>
      <description>On-The-Fly Decryption engine</description>
      <groupName>OTFDEC</groupName>
      <baseAddress>0x420C5000</baseAddress>

Let's look at a register from this peripheral:
        <register>
          <name>R1CFGR</name>
          <displayName>R1CFGR</displayName>
          <description>OTFDEC region x configuration
          register</description>
          <addressOffset>0x20</addressOffset>
          <size>0x20</size>

and another:
        <register>
          <name>R2CFGR</name>
          <displayName>R2CFGR</displayName>
          <description>OTFDEC region x configuration
          register</description>
          <addressOffset>0x50</addressOffset>
          <size>0x20</size>

See a pattern? Each register name embeds a number, and the offset rises from 0x20 to 0x50, a step of 0x30 bytes, which is far more than the 32-bit (4-byte) register size. We will return to that observation later.

Now traditionally, svd2ada would process this as you see it in the XML and the stm32_svd-otfdec.ads would look like this:

      R1CFGR      at 16#20# range 0 .. 31;
      R1STARTADDR at 16#24# range 0 .. 31;
      R1ENDADDR   at 16#28# range 0 .. 31;
      R1NONCER0   at 16#2C# range 0 .. 31;
      R1NONCER1   at 16#30# range 0 .. 31;
      R1KEYR0     at 16#34# range 0 .. 31;
      R1KEYR1     at 16#38# range 0 .. 31;
      R1KEYR2     at 16#3C# range 0 .. 31;
      R1KEYR3     at 16#40# range 0 .. 31;

...
      R4CFGR      at 16#B0# range 0 .. 31;
      R4STARTADDR at 16#B4# range 0 .. 31;
      R4ENDADDR   at 16#B8# range 0 .. 31;
      R4NONCER0   at 16#BC# range 0 .. 31;
      R4NONCER1   at 16#C0# range 0 .. 31;
      R4KEYR0     at 16#C4# range 0 .. 31;
      R4KEYR1     at 16#C8# range 0 .. 31;
      R4KEYR2     at 16#CC# range 0 .. 31;
      R4KEYR3     at 16#D0# range 0 .. 31;

Now, I think we can see there is a lot of commonality in those reg groups. So the question becomes: is there a more compact way to describe the layout of these repetitive groupings?

Well, it turns out there is. SVD files have some other nomenclature that permits descriptions of this type of repetitive grouping. The terms SVD uses are cluster and dim (or <cluster> and <dim> in XML). These terms allow specification of such groups in a form that is indexable by software. Fortunately for us, svd2ada already supports <cluster> and <dim>. Super news, if only there were a way to automatically emit these compressive constructs so we don't have to sift through 500k XML files performing hand edits.

svdopt.rb

If there were a tool that could parse the SVD file, identify those groupings and rewrite the SVD file with those changes, then the indexable records ought to have a more compact form, which should reduce the amount of code an Ada driver needs to work with the peripheral. Taking the example above, if left unchanged, you would have code to handle R1CFGR... R2... R3... R4 where, realistically, RxCFGR would do if you had an array of records.

So a tool was crafted. It accepts an SVD file as input and produces an SVD file as output. It tries to be automatic in its processing, but we will get to special cases later.

For now let's look at the definition for R1CFGR which we saw above. The cluster below describes all the regs in the group. Observe that R is taken as the cluster name, as all the RxY registers use R as the lead-in to the element. The <cluster> has a dim of 4, which matches the HW description. There is a new field I added, <dimOffset>, that shows 1. Given SVD files are C oriented, they assume arrays begin at 0. Well, Ada doesn't have to do that, and neither do the reference manual and vendor SVD. They start this register group at 1, so we have a syntax to allow that also. Observe also that this <cluster> has arrays embedded inside each element: the RxNONCERy and RxKEYRy registers. The tool correctly identifies this from the description and emits an embedded <dim> accordingly.

<cluster>
  <dim>4</dim>
  <dimIncrement>0x30</dimIncrement>
  <dimOffset>1</dimOffset>
  <name>R[%s]</name>
  <addressOffset>0x20</addressOffset>
  <register>
    <name>CFGR</name>
    <displayName>CFGR</displayName>
    <description>OTFDEC region x configuration register</description>
    <addressOffset>0x0</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      ...
    </fields>
  </register>
  <register>
    <name>STARTADDR</name>
    <displayName>STARTADDR</displayName>
    <description>OTFDEC region x start address register</description>
    <addressOffset>0x4</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
    </fields>
  </register>
  <register>
    <name>ENDADDR</name>
    <displayName>ENDADDR</displayName>
    <description>OTFDEC region x end address register</description>
    <addressOffset>0x8</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      <field>
        <name>REGx_END_ADDR</name>
        <description>Region AXI end address</description>
        <bitOffset>0</bitOffset>
        <bitWidth>32</bitWidth>
      </field>
    </fields>
  </register>
  <register>
    <dim>2</dim>
    <dimIncrement>4</dimIncrement>
    <dimOffset>0</dimOffset>
    <name>NONCER[%s]</name>
    <addressOffset>0xc</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      <field>
        <name>REGx_NONCE</name>
        <description>REGx_NONCE</description>
        <bitOffset>0</bitOffset>
        <bitWidth>32</bitWidth>
      </field>
    </fields>
  </register>
  <register>
    <dim>4</dim>
    <dimIncrement>4</dimIncrement>
    <dimOffset>0</dimOffset>
    <name>KEYR[%s]</name>
    <addressOffset>0x14</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      <field>
        <name>REGx_KEY</name>
        <description>REGx_KEY</description>
        <bitOffset>0</bitOffset>
        <bitWidth>32</bitWidth>
      </field>
    </fields>
  </register>
</cluster>
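
To make the grouping step concrete, here is a minimal Ruby sketch of how the cluster parameters above could be derived from a flat register list. This is not the actual svdopt.rb code: split_name and derive_cluster are hypothetical helpers, the name pattern (prefix, index, member) is an assumption that only fits names shaped like R1CFGR, and the sample offsets are generated from the 0x20 + 0x30 * (x - 1) rule for this peripheral.

```ruby
# Split a name such as "R1CFGR" into prefix "R", index 1 and member "CFGR".
# Returns nil for names that do not fit this (assumed) shape.
def split_name(name)
  m = /\A([A-Z]+?)(\d+)([A-Z]\w*)\z/.match(name)
  m && { prefix: m[1], index: m[2].to_i, member: m[3] }
end

# Given [name, byte offset] pairs, return the cluster parameters, or nil
# when the registers do not form one evenly spaced, identical grouping.
def derive_cluster(regs)
  parsed = regs.map do |name, offset|
    parts = split_name(name)
    return nil unless parts
    parts.merge(offset: offset)
  end
  return nil unless parsed.map { |r| r[:prefix] }.uniq.size == 1

  groups = parsed.group_by { |r| r[:index] }
  indexes = groups.keys.sort
  return nil unless indexes.each_cons(2).all? { |a, b| b == a + 1 }

  # Every instance must start the same distance after the previous one.
  bases = indexes.map { |i| groups[i].map { |r| r[:offset] }.min }
  strides = bases.each_cons(2).map { |a, b| b - a }.uniq
  return nil unless strides.size == 1

  # Every instance must hold the same members at the same relative offsets.
  layouts = indexes.zip(bases).map do |i, base|
    groups[i].map { |r| [r[:member], r[:offset] - base] }.sort_by(&:last)
  end
  return nil unless layouts.uniq.size == 1

  { name: "#{parsed.first[:prefix]}[%s]", dim: indexes.size,
    dim_increment: strides.first, dim_offset: indexes.first,
    address_offset: bases.first, members: layouts.first }
end

# Sample data: three registers per region, four regions.
regs = (1..4).flat_map do |x|
  base = 0x20 + 0x30 * (x - 1)
  [["R#{x}CFGR", base], ["R#{x}STARTADDR", base + 4], ["R#{x}ENDADDR", base + 8]]
end
p derive_cluster(regs)
```

The returned hash carries the same values as the <dim>, <dimIncrement>, <dimOffset>, <name> and <addressOffset> elements of the cluster above. Detecting the nested NONCER and KEYR arrays would need a second pass over the member names, which this sketch leaves out.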

svd2ada result

Well, what do we get from the above description? Do <cluster> and <dim> improve svd2ada .ads generation?

   type OTFDEC_Peripheral is record
...
      R   : aliased R_Clusters;

   for R_Cluster use record
      CFGR      at 16#0# range 0 .. 31;
      STARTADDR at 16#4# range 0 .. 31;
      ENDADDR   at 16#8# range 0 .. 31;
      NONCER    at 16#C# range 0 .. 63;
      KEYR      at 16#14# range 0 .. 127;
   end record;

   type R_Clusters is array (1 .. 4) of R_Cluster;

Here the description is far more compact and will produce less code, as the driver need only access the elements as an array rather than through a case statement. I would also wager the code will be clearer, as it matches how the reference manual treats replicated elements. In the RMs, the shorthand for the address computation is written like so:

OTFDEC region x configuration register(OTFDEC_RxCFGR)
Address offset: 0x20 + 0x30 * (x -1) (x = 1 to 4)
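
The reference manual shorthand maps directly onto the cluster fields. As a small Ruby sketch (cluster_reg_offset is a hypothetical helper name, and its keyword arguments simply mirror the <addressOffset>, <dimIncrement> and <dimOffset> values shown earlier):

```ruby
# Byte offset of one register inside a dimensioned cluster:
#   cluster offset + increment * (x - first index) + offset within the cluster
def cluster_reg_offset(cluster_offset:, dim_increment:, dim_offset:, x:, reg_offset: 0)
  cluster_offset + dim_increment * (x - dim_offset) + reg_offset
end

otfdec1_base = 0x420C5000 # <baseAddress> of OTFDEC1 from the example above

(1..4).each do |x|
  off = cluster_reg_offset(cluster_offset: 0x20, dim_increment: 0x30,
                           dim_offset: 1, x: x)
  puts format('R%dCFGR offset 0x%02X, address 0x%08X', x, off, otfdec1_base + off)
end
```

With x = 2 this gives 0x50, the R2CFGR offset from the vendor SVD. Adding reg_offset: 0x14 + 4 * 3 for x = 4 gives 0xD0, which is where R4KEYR3 sits in the flat listing.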

Special cases

What fun would programming be without a myriad of special cases and arcane detail to deal with? As with any good programming problem, there are loads of these issues. Let's take a look at some of them.

cluster naming

The example I presented above was a good one, in that the tool can make an educated guess that the cluster name is... R. Can we get so lucky that this 'rule' holds for all such groupings? Sadly, no. Let's see another case from the DMA controller. For each of the 8 channels, 5 registers make up the cluster:

0x00000008 CCR1
0x0000000c CNDTR1
0x00000010 CPAR1
0x00000014 CM0AR1
0x00000018 CM1AR1
...
0x00000094 CCR8
0x00000098 CNDTR8
0x0000009c CPAR8
0x000000a0 CM0AR8
0x000000a4 CM1AR8

Here we see an interesting layout. The cluster number is the last digit. A possible array index is the first digit, as in CM0AR1, CM1AR1. More disturbingly, there is no implied grouping in the names other than that they all start with a C (not too meaningful). These are really DMA channel regs in a group that is dimensioned 1..8. So we need to a) help the tool identify this issue and b) allow the grouping to be named. The tool has a command-line syntax that permits this naming; it requires some help from the user to place the rename. For this group, it looks like this:

~/ruby/svdopt.rb -C  DMA1:8:CH,DMA2:8:CH ...other options...

This says that when working on peripheral DMA1 or DMA2, at offset 8, use CH as the cluster name.
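
As an illustration of the PERIPHERAL:OFFSET:NAME format, here is one way such an argument could be turned into a lookup table in Ruby. This is only a sketch: parse_cluster_names is a hypothetical helper, not the actual svdopt.rb option handling, and it assumes the offset is written in decimal or with a 0x prefix.

```ruby
# Parse "PERIPH:OFFSET:NAME,..." into { [periph, offset] => name }.
# Integer() accepts "8" as well as "0x8"; note it treats a leading 0 as octal.
def parse_cluster_names(spec)
  spec.split(',').each_with_object({}) do |entry, table|
    periph, offset, name = entry.split(':', 3)
    if [periph, offset, name].any? { |s| s.nil? || s.empty? }
      raise ArgumentError, "bad cluster spec: #{entry.inspect}"
    end
    table[[periph, Integer(offset)]] = name
  end
end

names = parse_cluster_names('DMA1:8:CH,DMA2:8:CH')
puts names[['DMA1', 0x8]] # cluster name to use for DMA1 at offset 0x8
```

Keying the table on the peripheral and offset pair means a rename only applies to the one grouping the user pointed at, and every other cluster still gets its automatically guessed name.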
In the output you then get this:

<cluster>
  <dim>8</dim>
  <dimIncrement>0x14</dimIncrement>
  <dimOffset>1</dimOffset>
  <name>CH[%s]</name>
  <addressOffset>0x8</addressOffset>
  <register>
    <name>CCRx</name>
    <displayName>CCRx</displayName>
    <description>channel x configuration register</description>
    <addressOffset>0x0</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      ...
    </fields>
  </register>
  <register>
    <name>CNDTRx</name>
    <displayName>CNDTRx</displayName>
    <description>channel x number of data
    register</description>
    <addressOffset>0x4</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      ...
    </fields>
  </register>
  <register>
    <name>CPARx</name>
    <displayName>CPARx</displayName>
    <description>channel x peripheral address
    register</description>
    <addressOffset>0x8</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      ...
    </fields>
  </register>
  <register>
    <dim>2</dim>
    <dimIncrement>4</dimIncrement>
    <dimOffset>0</dimOffset>
    <name>ARy</name>
    <addressOffset>0xc</addressOffset>
    <size>0x20</size>
    <access>read-write</access>
    <resetValue>0x00000000</resetValue>
    <fields>
      ...
    </fields>
  </register>
</cluster>

Finally svd2ada yields this:

   type DMA_Peripheral is record
...
      CH    : aliased CH_Clusters;
...
   end record;

and

   for CH_Cluster use record
      CCRx   at 16#0# range 0 .. 31;
      CNDTRx at 16#4# range 0 .. 31;
      CPARx  at 16#8# range 0 .. 31;
      ARy    at 16#C# range 0 .. 63;
   end record;

   type CH_Clusters is array (1 .. 8) of CH_Cluster;


split fields

Yes, sounds bad, and it is. Let's take a look at AES from the same SoC:

0x00000010 KEYR0
0x00000014 KEYR1
0x00000018 KEYR2
0x0000001c KEYR3
0x00000020 IVR0
0x00000024 IVR1
0x00000028 IVR2
0x0000002c IVR3
0x00000030 KEYR4
0x00000034 KEYR5
0x00000038 KEYR6
0x0000003c KEYR7

Take a look at that reg layout. It looks like the original HW only supported 128-bit AES. Who needs more than 128 bits, they thought? Well, time moves on and now 256-bit AES keys are commonplace. But what of legacy code that uses 128-bit keys and expects the IV to be right after the key? Well, let's just stuff the rest of the key after the IV, which leaves a hole in the middle of KEYR if you were to look at it as a contiguous array 0..7.

How svdopt processes regs

Internally, svdopt looks at regs like so:

['KEYR', :x] or ['KEYR', 1] ... ['KEYR', 7] etc.

So a natural grouping of KEYR above would be to identify it as an array 0..7. There is a safety check in svdopt to ensure that the address gap from one numbered element to the next equals the base register size. At KEYR4 a discontinuity is detected (the offset jumps from 0x1c to 0x30 instead of 0x20) and some messy logic takes over to split the array into 2 new arrays.

In the rewrite, the array becomes 2 new ones, KEYRA and KEYRB. Observe that KEYRB starts at index 4, as you would expect:

<register>
  <dim>4</dim>
  <dimIncrement>4</dimIncrement>
  <dimOffset>0</dimOffset>
  <name>KEYRA[%s]</name>
  <addressOffset>0x10</addressOffset>
  <size>0x20</size>
  <access>read-write</access>
  <resetValue>0x00000000</resetValue>
  <fields>
    <field>
      <name>KEY</name>
      <description>Cryptographic key, bits[31:0]</description>
      <bitOffset>0</bitOffset>
      <bitWidth>32</bitWidth>
    </field>
  </fields>
</register>
<register>
  <dim>4</dim>
  <dimIncrement>4</dimIncrement>
  <dimOffset>0</dimOffset>
  <name>IVR[%s]</name>
  <addressOffset>0x20</addressOffset>
  <size>0x20</size>
  <access>read-write</access>
  <resetValue>0x00000000</resetValue>
  <fields>
    <field>
      <name>IVI</name>
      <description>initialization vector register (LSB IVR
      [31:0])</description>
      <bitOffset>0</bitOffset>
      <bitWidth>32</bitWidth>
    </field>
  </fields>
</register>
<register>
  <dim>4</dim>
  <dimIncrement>4</dimIncrement>
  <dimOffset>4</dimOffset>
  <name>KEYRB[%s]</name>
  <addressOffset>0x30</addressOffset>
  <size>0x20</size>
  <access>read-write</access>
  <resetValue>0x00000000</resetValue>
  <fields>
    <field>
      <name>KEY</name>
      <description>Cryptographic key, bits
      [159:128])</description>
      <bitOffset>0</bitOffset>
      <bitWidth>32</bitWidth>
    </field>
  </fields>
</register>

The svd2ada output then is:

   for AES_Peripheral use record
      CR    at 16#0# range 0 .. 31;
      SR    at 16#4# range 0 .. 31;
      DINR  at 16#8# range 0 .. 31;
      DOUTR at 16#C# range 0 .. 31;
      KEYRA at 16#10# range 0 .. 127;
      IVR   at 16#20# range 0 .. 127;
      KEYRB at 16#30# range 0 .. 127;
      SUSPR at 16#40# range 0 .. 255;
   end record;

   type KEYRA_Registers is array (0 .. 3) of HAL.UInt32;

   type IVR_Registers is array (0 .. 3) of HAL.UInt32;

   type KEYRB_Registers is array (4 .. 7) of HAL.UInt32;
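
The gap check and the split can be sketched in a few lines of Ruby. This is an illustration under stated assumptions rather than the logic inside svdopt: split_runs and describe_runs are hypothetical helpers, and the A, B suffix scheme is simply copied from the KEYRA and KEYRB output above.

```ruby
# [index, byte offset] pairs for the AES KEYRn registers listed above.
keyr = [[0, 0x10], [1, 0x14], [2, 0x18], [3, 0x1C],
        [4, 0x30], [5, 0x34], [6, 0x38], [7, 0x3C]]

# Cut the list wherever the index or the address stops being contiguous.
def split_runs(elements, reg_bytes: 4)
  sorted = elements.sort_by { |index, _offset| index }
  sorted.slice_when do |(i0, o0), (i1, o1)|
    i1 != i0 + 1 || o1 - o0 != reg_bytes
  end.to_a
end

# Describe each run as a <dim> array; runs get A, B, ... suffixes when split.
def describe_runs(base, runs, reg_bytes: 4)
  suffixes = runs.size > 1 ? ('A'..'Z').first(runs.size) : ['']
  runs.zip(suffixes).map do |run, suffix|
    { name: "#{base}#{suffix}[%s]", dim: run.size, dim_increment: reg_bytes,
      dim_offset: run.first[0], address_offset: run.first[1] }
  end
end

describe_runs('KEYR', split_runs(keyr)).each do |d|
  puts format('%-9s dim=%d dimOffset=%d addressOffset=0x%x',
              d[:name], d[:dim], d[:dim_offset], d[:address_offset])
end
```

For the KEYR data this yields two runs, one starting at index 0 and offset 0x10 and one starting at index 4 and offset 0x30. A family with no gap, such as IVR, comes back as a single run and keeps its plain name.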

conclusion

This problem was quite nasty, as you never know where the cluster index or the array index is. Vendors are inconsistent in how they write the cluster candidate, if clustering was ever a consideration at all. In some cases the array index comes first:

CM0AR8

In my tool that gets broken out as:

['CM', 0, 'AR', 8] and also: ['CM', :x, 'AR', :y]

The bookkeeping needs to be flexible to not assume that :x or :y above are array or cluster indexes just by their placement. Only through analysis of the regs can this be deduced.
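
To show what that analysis might look like, here is a rough Ruby sketch. The helper names tokenize, pattern and slot_strides are hypothetical, not taken from svdopt. It splits names into the token form shown above and then measures the address stride seen when one numeric slot advances by one while the others stay fixed. The channel 2 offsets in the sample are derived from the 0x14 channel spacing in the DMA listing.

```ruby
# "CM0AR8" -> ["CM", 0, "AR", 8]
def tokenize(name)
  name.scan(/\d+|\D+/).map { |t| t.match?(/\A\d+\z/) ? t.to_i : t }
end

# ["CM", 0, "AR", 8] -> ["CM", :x, "AR", :y] (up to three numeric slots)
def pattern(tokens)
  n = -1
  tokens.map { |t| t.is_a?(Integer) ? %i[x y z].fetch(n += 1) : t }
end

# Map each numeric token position to the strides observed when only that
# position changes by one between two registers.
def slot_strides(regs)
  by_tokens = regs.to_h { |name, offset| [tokenize(name), offset] }
  strides = Hash.new { |h, k| h[k] = [] }
  by_tokens.each do |tokens, offset|
    tokens.each_index do |i|
      next unless tokens[i].is_a?(Integer)
      neighbour = tokens.dup
      neighbour[i] += 1
      other = by_tokens[neighbour]
      strides[i] << other - offset if other
    end
  end
  strides.transform_values(&:uniq)
end

regs = [['CM0AR1', 0x14], ['CM1AR1', 0x18], ['CM0AR2', 0x28], ['CM1AR2', 0x2C]]
p pattern(tokenize('CM0AR8'))
p slot_strides(regs)
```

In this sample the first numeric slot moves in steps of 4 bytes, the register size, so it behaves like an array index. The last slot moves in steps of 0x14, the size of a whole channel group, so it behaves like the cluster index. The position of a number in the name does not tell you which is which; the strides do.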

The tool is currently experimental. I am still evaluating it and will be checking it in soon.




