Tuesday, November 20, 2018

California fires + Air Quality + Ada + Bluepill + PMS7003

Camp fire

The Camp fire has had a big effect on the Air Quality Index (AQI) in the Bay Area over the past week or so. AQI is typically measured as an entry in one of 7 bands:


To compute the AQI (typically cited from the PM2.5 #) a formula is used. See the wiki page above.
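For reference, the formula is a piecewise linear interpolation between concentration breakpoints. Below is a minimal sketch of it in Python (it is not part of the Ada project). The breakpoint table is my transcription of the US EPA PM2.5 table in use in 2018, and the function name is made up for illustration, so check the numbers against the wiki page before relying on them.

```python
# PM2.5 breakpoints in tenths of a µg/m³ -> AQI range.
# Assumption: transcribed from the US EPA table current in 2018.
PM25_BREAKPOINTS = [
    (0, 120, 0, 50),         # Good
    (121, 354, 51, 100),     # Moderate
    (355, 554, 101, 150),    # Unhealthy for Sensitive Groups
    (555, 1504, 151, 200),   # Unhealthy
    (1505, 2504, 201, 300),  # Very Unhealthy
    (2505, 3504, 301, 400),  # Hazardous
    (3505, 5004, 401, 500),  # Hazardous
]


def pm25_to_aqi(conc_ug_m3):
    """Map a PM2.5 concentration in µg/m³ to an AQI value (hypothetical helper)."""
    if conc_ug_m3 < 0:
        raise ValueError("concentration cannot be negative")
    # Truncate to 0.1 µg/m³ and work in integer tenths to dodge float edge cases.
    tenths = int(conc_ug_m3 * 10 + 1e-9)
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= tenths <= c_hi:
            # Linear interpolation inside the matching band.
            return round((i_hi - i_lo) * (tenths - c_lo) / (c_hi - c_lo) + i_lo)
    return 500  # beyond the table: clamp to the top of the scale
```

Working in integer tenths keeps the band edges exact, so a reading of 35.4 lands in the Moderate band and 35.5 in the next one.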

The PM2.5# is in µg/m³. Fortunately there is a Chinese sensor, the PMS7003, that can provide PM1, PM2.5 and PM10 in those units.


So we have a sensor, and we already have a Bluepill running Ada with an SPI screen to display the results.
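To give an idea of what reading the sensor involves, here is a minimal Python sketch of decoding one PMS7003 data frame. The frame layout is an assumption taken from my reading of the Plantower datasheet: 32 bytes, a 0x42 0x4D header, big-endian 16-bit words, and a checksum that is the sum of the first 30 bytes. The function name is hypothetical and this is not the Ada code from the project.

```python
import struct


def parse_pms7003_frame(frame):
    """Decode one 32-byte PMS7003 frame (layout assumed from the datasheet)."""
    if len(frame) != 32:
        raise ValueError("expected a 32-byte frame")
    if frame[0:2] != b"\x42\x4d":
        raise ValueError("bad start bytes")
    # After the header: frame length, 13 data words, checksum (all big-endian).
    words = struct.unpack(">15H", frame[2:32])
    length, data, checksum = words[0], words[1:14], words[14]
    if length != 28:
        raise ValueError("unexpected frame length field")
    if sum(frame[:30]) & 0xFFFF != checksum:
        raise ValueError("checksum mismatch")
    # Data words 4..6 are the 'atmospheric environment' values in µg/m³.
    return {"pm1_0": data[3], "pm2_5": data[4], "pm10": data[5]}
```

The `pm2_5` value is the number that would feed an AQI calculation.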


The sensor is a $20 eBay purchase. Bluepills are < $2 as documented on this blog, screen is a little over $3. So for $25 or so you can build your own AQI meter.

Code is in: 



Thursday, May 10, 2018

Bluepill+: Diagnosing bad solder joints


If you're like me... a FW engineer attempting to do soldering, then this might sound familiar. I have been replacing Bluepill STM32F103C8s with STM32L443CC chips. It's a hack the way I do it: I use hot air for the removal and then the residual pad solder to stick down the replacement.

Not surprisingly, some pads don't get good adhesion. Worse, some pads don't show this lack of adhesion until much later when you need the pad to do something useful.

The STM32 series are quite good, I think all user pads can be made into GPIOs. What if we wrote a program that toggled each pad every second and then walked the pins with a meter? So that program was written.

https://github.com/morbos/STM32/blob/master/L/L443/pinny/src/pinny.adb

That's great! Take a look at that code for the abstraction on the GPIOs that the Ada Device Library provides with

type GPIO_Points is array (Positive range <>) of GPIO_Point;

The magic decl in pinny is:

BP_Points : GPIO_Points :=
  PB0 & PB1 & PB3 & PB4 & PB5 & PB6 & PB7 & PB8 &
  PB9 & PB10 & PB11 & PB12 & PB13 & PB14 & PB15 &
  PA0 & PA1 & PA2 & PA3 & PA4 & PA5 & PA6 & PA7 &
  PA8 & PA9 & PA10 & PA11 & PA12 & PA15 &
  PC13 & PC14 & PC15;


& concatenates items together, in this case GPIO_Point values, forming the initializer of the array BP_Points.

After that, we can just say:

Toggle(BP_Points);

The lib has code to accept the array and it will walk each element and toggle each one.

So that is every useful pin on a Bluepill, listed above in the decl. With the pins abstracted into a single array, initialization and group changes become trivial and take only a small amount of user code.

A corridor conversation with a co-worker, Tyson Leistiko, followed. As I described my software he mentioned I should group the GPIOs in such a way that neighboring pins toggle asymmetrically, and in that way look for pins sitting at neither 3v nor 0v, indicating some form of pin bridging. So pinny was easily adapted to add two hand picked sets, each made of pins whose neighbors are in the other set.

As so:

Set1 : GPIO_Points :=
  PC13 & PC15 & PA1 & PA4 & PA6 & PB0 & PB10 & PB12 &
  PB14 & PA8 & PA10 & PA12 & PB3 & PB5 & PB7 & PB9;

and:

Set2 : GPIO_Points :=
  PC14 & PA0 & PA2 & PA3 & PA5 & PA7 & PB1 & PB11 &
  PB13 & PB15 & PA9 & PA11 & PA15 & PB4 & PB6 & PB8;

After those decls we can preload the sets like so:


Clear (Set1);
Set (Set2);

Then both pinny programs go into a loop allowing you to probe all the pins for the 3v -> 0v -> 3v .. pattern.

loop
   Toggle (BP_Points);
   delay until Clock + Milliseconds (1000);
end loop;

https://github.com/morbos/STM32/blob/master/L/L443/pinny2/src/pinny2.adb
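When walking the pins with a meter, the check boils down to a simple rule: a healthy pin reads close to one rail or the other, and anything in between is suspect. Here is a tiny Python sketch of that rule. The 3.3 V supply and the 0.3 V tolerance are assumptions for illustration, not measured values, and the function name is hypothetical.

```python
def classify_reading(volts, vdd=3.3, tolerance=0.3):
    """Classify one meter reading taken on a toggling pin (illustrative thresholds)."""
    if volts <= tolerance:
        return "low"        # solidly at 0v
    if volts >= vdd - tolerance:
        return "high"       # solidly at the supply rail
    return "suspect"        # in between: possible bridge to a neighboring pin


# Example: readings noted down while probing, keyed by pin name.
readings = {"PB0": 3.28, "PB1": 0.02, "PA6": 1.61}
suspects = [pin for pin, v in readings.items() if classify_reading(v) == "suspect"]
```

With the example readings above, only PA6 ends up in `suspects`.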

It should be said my friend Eric Schlaepfer mentioned that the EE way of doing this is to probe the ESD diodes and look for pad to pin connectivity that way. My solution: a FW engineer caused the problem, a FW way was used to repair the problem 😀

Saturday, April 21, 2018

Tracing a Bluepill

Did you know the lowly STM32F103C8 has non-invasive trace logic along for the ride? Did you also know that via program control:

1) Trace can be enabled out to a pin on the board (PB3).
2) 32 channels can be used to emit 32, 16 or 8bit (char) values from running code. Useful to monitor values from your code.
3) A cycle count can be emitted
4) Data Watchpoints can also be reported. See a picture below.
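To make the channel idea concrete, here is a rough Python sketch of pulling stimulus (software) packets out of a raw SWO byte stream. The header layout used (port number in bits 7..3, bit 2 marking a hardware source, bits 1..0 giving the payload size) is my understanding of the ARMv7-M ITM packet protocol. The function name is hypothetical, and this is not Petteri's decoder, just an illustration that skips over timestamp and extension packets.

```python
def decode_itm(stream):
    """Return a list of (kind, port, value) events from raw ITM bytes (sketch)."""
    events = []
    i, n, zero_run = 0, len(stream), 0
    while i < n:
        header = stream[i]
        i += 1
        if header == 0x00:
            zero_run += 1                    # padding or part of a sync packet
            continue
        if header == 0x80 and zero_run >= 5:
            zero_run = 0
            events.append(("sync", None, None))
            continue
        zero_run = 0
        size_code = header & 0x03
        if size_code:
            size = (1, 2, 4)[size_code - 1]  # 01 -> 1 byte, 10 -> 2, 11 -> 4
            payload = stream[i:i + size]
            if len(payload) < size:
                break                        # truncated capture
            i += size
            kind = "hw" if header & 0x04 else "sw"
            events.append((kind, header >> 3, int.from_bytes(payload, "little")))
        elif header == 0x70:
            events.append(("overflow", None, None))
        elif header & 0x80:
            # Timestamp or extension packet: skip its continuation bytes.
            while i < n and stream[i] & 0x80:
                i += 1
            i += 1
    return events
```

For example, the bytes `82 FE 1F` decode as a 16-bit value 0x1FFE written to channel 16.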



Showing SysTick exception handling:

Showing a sensor range value being emitted on channel 16 (that's the max 1FFE in mm's btw)

Showing a watchpoint on the read modify write of the Green LED on PC13 (HW watchpoint on: BSRR @0x40011010).

Also an Overflow is seen (first time I have seen that actually). Can be mitigated by bumping the output rate.


Bluepill shown with a VL53L0X range sensor. The trace is the yellow wire connected to TRACESWO PB3 and the green wire is connected to GPIO1 from the sensor. That is plumbed up to go low when a new sample is ready. Based on the range value, the green LED is lit or not. (A version of this is used in my garage to detect if the car is far enough along to close the door w/o striking the rear of the vehicle.)

Use

So a board that can cost $1.27 on Taobao has all this and more.

One omission on such an inexpensive design is ETM (Embedded Trace Macrocell). This is featured on many of the larger ST designs. It's easy to check if it exists, as the register space at E0041000 will have non-zero contents. On a Bluepill, that space is all zero.

No matter, the Instrumentation Trace Macrocell (ITM) is available. This can report the PC of the running program at a sub-sampled rate.

ITM also reports when the code dips into the exception handler as shown above.

Usually though when configured, the diet is a steady stream of PC's.

Some observations of this technology.

1) You need background/sample code. A Hackaday post in 2015 covers the great work by Petteri Aimonen:
a) Showing how to enable the Trace Port Interface Unit (TPIU), ITM and ETM
b) Adding .py scripts to Pulseview to permit GUI viewing of trace output as seen above in the images.
2) ETM: Petteri used an F1 Value Line part per his blog. I had no luck getting ETM data out of that part.

Some observations:
1) You must have a debug connection to the target to get SWO to emit trace.
2) It's possible to muck around with rate settings. You can get a rate that is unreal: I had a Bluepill+ with trace @80Mhz coupled with a minimally subdivided PC. You get a firehose of PCs coming out, perhaps one every 20-30 instrs or so. Amazing.
3) My code is a translation of Petteri's C example into Ada.

https://github.com/morbos/STM32/F/F103/vl53l0x_trace_f103


Using Pulseview

The pretty picture above is done with Petteri's work. You can install Pulseview for Linux or Windows (you can also build it from source, as I did on a Raspberry Pi3). The trick to decoding the trace is to have a UART capture of PB3 off of the TRACESWO port. Get that UART capture into Pulseview. For my path, I use an unsupported logic capture device, the Saleae Logic Pro 16. No matter, it exports binary and then it's a simple matter to convert it to a .sr file that Pulseview can read:

Hedley@FASTER ~/trace
$ split --numeric-suffixes=1 --suffix-length=1 --bytes=4M untitled.bin logic-1-

Hedley@FASTER ~/trace
$ zip trace.sr version metadata logic-1-*
updating: version (stored 0%)
updating: metadata (deflated 8%)
updating: logic-1-1 (deflated 97%)
updating: logic-1-2 (deflated 97%)
updating: logic-1-3 (deflated 97%)
updating: logic-1-4 (deflated 97%)
updating: logic-1-5 (deflated 97%)
updating: logic-1-6 (deflated 97%)
updating: logic-1-7 (deflated 100%)
updating: logic-1-8 (deflated 100%)
updating: logic-1-9 (deflated 100%)

Hedley@FASTER ~/trace
$ cat metadata
[global]
sigrok version=0.3.0

[device 1]
capturefile=logic-1
total probes=8
samplerate=50 MHz
probe1=SWO
probe2=GPIO1
unitsize=1
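The same split-and-zip steps can be scripted. Below is a Python sketch that builds the .sr archive in one go using the metadata shown above. Two things are assumptions: the `version` file containing `2` (its contents are not shown in the transcript) and the function name, which is made up for illustration.

```python
import io
import zipfile

# Same metadata as in the transcript above.
METADATA = """[global]
sigrok version=0.3.0

[device 1]
capturefile=logic-1
total probes=8
samplerate=50 MHz
probe1=SWO
probe2=GPIO1
unitsize=1
"""


def build_sr(raw, out, chunk_size=4 * 1024 * 1024):
    """Write raw 1-byte-per-sample logic data into a .sr zip (sketch)."""
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("version", "2")          # assumption: session format version 2
        zf.writestr("metadata", METADATA)
        # Mirror `split --bytes=4M`: logic-1-1, logic-1-2, ...
        for n, off in enumerate(range(0, len(raw), chunk_size), start=1):
            zf.writestr("logic-1-%d" % n, raw[off:off + chunk_size])


# Usage sketch with an in-memory buffer; pass a file name such as "trace.sr" instead.
buf = io.BytesIO()
build_sr(b"\x00" * 10, buf, chunk_size=4)
```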

So once in Pulseview we can add a decoder to the UART. Just add ARM ITM and set the baudrate to your capture speed (in my case, 8Mhz). Then let Pulseview's decoder show you the trace. 

Notes:

https://hackaday.com/2015/03/09/execution-tracing-on-cortex-m-microcontrollers/

Friday, February 23, 2018

The Bluepill+

The STM32L443CC 'Bluepill+'



Following up from the work I did on getting Ada going on a Bluepill.

 http://www.hrrzi.com/2017/11/ada-on-2-ebay-bluepill-board.html

Is a Bluepill a fixed entity?

Looking at the SoC the Bluepill uses we see it is a Cortex-M3 running at 72Mhz with 128k of flash and 20k of ram. There are newer STM32 chips all the way out to Cortex-M7's. I have this tool from ST called the ST MCUFinder. You can screen all their parts by different parameters. I had already hacked Ada onto the STM32L432 for the Make-with-Ada contest last year so the L series was still quite fresh in my mind. So, looking at the package of the STM32F103C8 on a stock Bluepill, was there a newer, pin-compatible STM32 CPU across the family that could drop in and work in place of the original? I will spare the dear reader the suspense: indeed there are a bunch of parts that have the same footprint.

The candidate CPU for Bluepill+ STM32L443CCT6

That part uses a Cortex-M4F, runs at 80Mhz with 256k of flash and 64k of ram. Plus it has more peripherals than the STM32F103C8. This 443CC has two 12-bit DACs. I like having 2 DACs since you can make oscilloscope vectors with that HW.

Specs

  • Core: ARM® 32-bit Cortex®-M4 CPU with FPU, Adaptive real-time accelerator (ART Accelerator™) allowing 0-wait-state execution from Flash memory, frequency up to 80 MHz, MPU, 100DMIPS and DSP instructions
  • Performance benchmark
    • 1.25 DMIPS/MHz (Dhrystone 2.1)
    • 273.55 CoreMark® (3.42 CoreMark/MHz @ 80 MHz)
  • Energy benchmark
    • 176.7 ULPBench® score
  • Clock Sources
    • 4 to 48 MHz crystal oscillator
    • 32 kHz crystal oscillator for RTC (LSE)
    • Internal 16 MHz factory-trimmed RC (±1%)
    • Internal low-power 32 kHz RC (±5%)
    • Internal multispeed 100 kHz to 48 MHz oscillator, auto-trimmed by LSE (better than ±0.25 % accuracy)
    • Internal 48 MHz with clock recovery
    • 2 PLLs for system clock, USB, audio, ADC
  • Up to 83 fast I/Os, most 5 V-tolerant
  • RTC with HW calendar, alarms and calibration
  • LCD 8× 40 or 4× 44 with step-up converter
  • Up to 21 capacitive sensing channels: support touchkey, linear and rotary touch sensors
  • 11x timers: 1x 16-bit advanced motor-control, 1x 32-bit and 2x 16-bit general purpose, 2x 16-bit basic, 2x low-power 16-bit timers (available in Stop mode), 2x watchdogs, SysTick timer
  • Memories
    • 256 KB single bank Flash, proprietary code readout protection
    • 64 KB of SRAM including 16 KB with hardware parity check
    • Quad SPI memory interface
  • Rich analog peripherals (independent supply)
    • 1× 12-bit ADC 5 Msps, up to 16-bit with hardware oversampling, 200 μA/Msps
    • 2x 12-bit DAC, low-power sample and hold
    • 1x operational amplifier with built-in PGA
    • 2x ultra-low-power comparators
  • AES: 128/256-bit key encryption hardware accelerator
  • 17x communication interfaces
    • USB 2.0 full-speed crystal less solution with LPM and BCD
    • 1x SAI (serial audio interface)
    • 3x I2C FM+(1 Mbit/s), SMBus/PMBus
    • 4x USARTs (ISO 7816, LIN, IrDA, modem)
    • 1x LPUART (Stop2 wake-up)
    • 3x SPIs (4x SPIs with the Quad SPI)
    • CAN (2.0B Active) and SDMMC interface
    • SWPMI single wire protocol master I/F
    • IRTIM (Infrared interface)
  • 14-channel DMA controller
  • True random number generator
  • CRC calculation unit, 96-bit unique ID

Costs

If you look on eBay for the price of a Bluepill, it's just a tidge over $2. Last I checked, $2.13. Take a look at the 10k quantity price for STM32F103C8s btw. It's at that price(!!!). So a Bluepill has:
1) a pcb
2) STM32F103C8
3) 8Mhz xtal
4) 32khz xtal
5) 5v - 3v reg
6) 2 LEDs
7) reset button
8) USB header

All for the 10k unit quantity of 2) above...

Anyway, the donor CPU, an STM32L443CCT6, is about $2.61 in 10,000 unit qty. For me buying 5 or so was about $5 to $6 each so really a Bluepill+ costs about $8 after surgery. Even at $8 this board is pretty good for what you get.

The donor

A $2 bluepill board was sacrificed to be a donor for Bluepill+ updating. Basically, since the parts are pin compatible you need only use a cheap hot air station (these are sub $100 on eBay) and some tweezers. Get a helper to spread the hot air and gently tweeze the package away from the board as the solder liquefies. I had also tin foiled the rest of the board to avoid stray heat. Once off the board, the donor IC is placed on the footprint (carefully!) and then using an iron, tack down the legs to the liberated pads. The residual solder is enough to make good contact I found even after cleaning the pads up a little bit post removal.

Porting Ada

So, we have a new CPU. Let's get Ada going on it! We will need some low level code changes to get the board initialized to be useful. Things like the PLL has to get locked to the right clock, other L series SoC IP needs to be enabled etc. Fortunately, there are ample examples in Ada_Drivers_Library on how other F4 and F7 series cores are handled. The STM32L series is modern relative to the F1 and F4 and shares some commonality with the F7. So we look closely at the F7 when in doubt about init seqs.

One easy thing to get going is to grab the SVD files from ST for the L443. These are right on the SoC's ST webpage. Once you have them, you can run svd2ada to generate the headers, one set for Ada_Drivers_Library and one for the embedded runtimes. I know that this method has been modernized but I have not yet moved over to bb_runtimes at this time.

I also wrote a tool in Ruby that sucks in the SVD file and generates a gdb script. That script is personalized for your SoC; it generates reg dump log files of the entire SoC and saves them to disk. Subsequently, another tool pulls in the svd and the logs and generates a human readable hierarchy dump of each reg. This is documented on the www.hrrzi.com blog.

So once the svd files are generated, porting begins for the various IP blocks in the design. The key ones to port are GPIO, SPI, I2C, Timers, ADC, DMA. This work took some time but I leveraged my work for Make with Ada (an STM32L432). At this point there is parity between my Bluepill port and the Bluepill+ port. All the code with examples is in my STM32 tree:


with libraries:


and


Also the Ruby SVD reader is here:

https://github.com/morbos/ruby
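To illustrate the idea (this is not the Ruby tool in the repo above, just a small Python sketch with a hypothetical function name), an SVD file can be walked to emit one gdb memory-examine command per register. It assumes the usual SVD layout of `peripheral` elements with `name`, `baseAddress` and nested `register` elements with `name` and `addressOffset`, and it ignores peripherals that use `derivedFrom`.

```python
import xml.etree.ElementTree as ET


def svd_to_gdb_script(svd_xml):
    """Turn SVD XML text into gdb commands that dump every register (sketch)."""
    root = ET.fromstring(svd_xml)
    lines = []
    for periph in root.iter("peripheral"):
        periph_name = periph.findtext("name")
        base = int(periph.findtext("baseAddress"), 0)
        for reg in periph.iter("register"):
            reg_name = reg.findtext("name")
            offset = int(reg.findtext("addressOffset"), 0)
            # Label the value, then examine one 32-bit word in hex.
            lines.append("echo %s.%s\\n" % (periph_name, reg_name))
            lines.append("x/1xw 0x%08X" % (base + offset))
    return "\n".join(lines) + "\n"


# Tiny inline SVD fragment for demonstration only.
EXAMPLE_SVD = """<device><peripherals><peripheral>
  <name>GPIOC</name><baseAddress>0x40011000</baseAddress>
  <registers><register><name>BSRR</name><addressOffset>0x10</addressOffset></register></registers>
</peripheral></peripherals></device>"""
script = svd_to_gdb_script(EXAMPLE_SVD)
```

The resulting text can be sourced from gdb with logging enabled to capture the dump to disk.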

One technique I found helpful in my Ada port was that ST, via its CubeMX and free SW dev tools, has lots of examples. If you get stuck with some IP that's not working in Ada, build a CubeMX example that initializes or uses the IP you need to port. Once that's up and running on your board, use the Ruby scripts I mentioned to dump the ST example code's reg settings. First you can use that as a reference, and later use logs2dump and diff your port against the example code. Where there are diffs might explain why your port is not working.
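The diff step itself is simple once both dumps are loaded. Here is a Python sketch that compares two register dumps held as plain dictionaries of register name to value. The dictionary form, the example values and the function name are assumptions for illustration; nothing here depends on the log format the Ruby tools write.

```python
def diff_reg_dumps(reference, port):
    """Return (name, reference_value, port_value) for every register that differs."""
    diffs = []
    for name in sorted(set(reference) | set(port)):
        ref_val = reference.get(name)    # None when the register is missing
        port_val = port.get(name)
        if ref_val != port_val:
            diffs.append((name, ref_val, port_val))
    return diffs


# Made-up values: the reference example enabled a clock that the port did not.
cube_example = {"RCC.APB1ENR1": 0x00200000, "I2C1.CR1": 0x00000001}
ada_port = {"RCC.APB1ENR1": 0x00000000, "I2C1.CR1": 0x00000001}
mismatches = diff_reg_dumps(cube_example, ada_port)
```

Only the registers that differ are reported, which is usually a short list to work through.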


Openocd

Openocd 0.10.0 needed some minor mods to get attached to the Bluepill+. Basically, the flash code needed to be told about the STM32L443's flash size based on the unknown part number. Also the target script needed some changes. I will add more to this entry later about this.



Monday, January 1, 2018

Sunday, November 5, 2017

Ada on a $2 eBay 'Bluepill' board (STM32F103C8T6)


For about $2 a 'bluepill' board can be obtained from eBay. Taking its name from the PCB patina this small board is used in the Arduino community. The SoC itself per ST's early docs on it was 64k flash and 20k data, that said, all STM32F103C8's you buy nowadays are 128k flash. The CPU is a Cortex-M3 which can be run at 72Mhz maximum. There is a good website that shows the schematic and pinout of the bluepill:

http://wiki.stm32duino.com/index.php?title=Blue_Pill



If you are an Arduino programmer, the link above will take you where you want to go. But suppose you wanted to try programming and using the bluepill a different way? Well that is what this blog entry is about. An Ada port was done via AdaCore's Ada_Drivers_Library and the Libre GNAT toolchain to the bluepill. This is preliminary work but it is able to generate working code that the author already is using as of yesterday (a garage parking measurement sensor).

The port of the library is derivative, starting from the STM32F4 series. The STM32F1 is a 10yr old design and, as such, the F4 is an extension of the F1. The F1 has some idiosyncrasies that seem to have been addressed in the F4 (eg: GPIO setup). The port fixes the SoC freq at 72Mhz. No low power work has been done yet but I assume in a WFI setting with peripherals not enabled, it should not draw too much current even with a 72Mhz functional speed. The embedded runtimes library has 2 flavours of runtime, a full and a small footprint (sfp). With only 20k of ram, even sfp is too dear for a bluepill so a new runtime that is very close to sfp was created, called ssfp (super small footprint). This runtime strips down the stack sizes to minimums; hopefully this won't bother users too much.

There are two libraries and a development folder that are needed for development and they are on github:



Development folder:


That folder has a structure of:

F
  F103
  F722
L
  L432
  L476
  L496

Each family is a subfolder.

The development folder has F103 projects such as:

adcdma_f103 ADC using DMA
adcgpio_f103 polled ADC on GPIO
bno_f103 test of I2C to a component
crc_f103 port of AdaCore's crc test to the bluepill
lsm9ds1_f103 my preliminary 9dof sensor start. I don't plan on using it with a bluepill.
spi_f103 Interfacing SPI to an Atmel 25640 (a 64k byte EEprom).
timer_f103 Timer interrupt
try_f103 my first test prob a good start for project0
vl53l0x_f103 My garage distance project.


For debug I use openocd-0.10 and a donor Stlink v2-1 from any discovery or nucleo board.
Just the other day I got a $2 Stlink v2 usb stick from China. For that one it's a 4 wire attachment:

bluepillF103.cfg: (v2 shown, invert the comments for v2-1)


#source [find interface/stlink-v2-1.cfg]
source [find interface/stlink-v2.cfg]
transport select hla_swd
source [find target/stm32f1x.cfg]
reset_config srst_nogate




Bluepill running the try_f103 program connected to the STlinkv2. This is using the Raspberry Pi3 as a Wifi openocd server as is documented on this blog.

Saturday, September 23, 2017

Stanford Mips-X @IIT

This doc may be helpful to have on hand to understand this post:
http://i.stanford.edu/pub/cstr/reports/csl/tr/86/289/CSL-TR-86-289.pdf

Integrated Information Technology, or as that mouthful was abbreviated, IIT (now 8x8), was founded by Chi-Shin Wang and Y.W. Sing, two Weitek veterans. Its original product line was floating point coprocessors, which lived in a niche before the giant sucking sound was heard of Intel vacuuming up peripherals such as floating point. At that point, a new product was needed. From 1989 a small group was working on a Vision Processor (VP). This SIMD(4) engine took a year or so to build. Its first customer was Compression Labs (CLI). The VP had a large reg file: 64 16bit regs times 4 SIMD instances. This was so CLI could execute the 16x16 DCT. An 8x8 DCT is far more modest in reg consumption. Another customer for the part was AT&T with the videophone 2500 in 1992. So at this point, IIT had a reasonable video business with customers that had application processors that would use the VP as a video engine. About this time a search was underway for IIT's very own controller. Chi-Shin found a 'bargain' from Stanford, Mips-X for $15K! What do you get for $15K? You get a tape from Stanford's tech licensing group. Paul Chow (now EECS prof at Toronto) sat with us for a week to go over the tape. I worked with him to understand the SW tools offering. For one item, there was no C compiler for it; there was however an assembler written in Modula-2. Over the years I wrote and maintained our port of GCC to Mips-X. Having 2 delay slots broke the reorg pass of GCC on numerous occasions, with the usual email chain from me to Richard Kenner and workarounds to deal with reorg issues. Finally I wrote our own 8x8 pass in GCC that fixed up cases where a hazard was in slot1 and slot2 used it. Rather than use the Modula-2 assembler, I wrote one in C. The linker, nld, I wrote in C++ (props to Jeff Loomans at a poker game who told me to learn C++, the linker was my first C++ program). From a HW standpoint, there was no schematic(!). There was a layout, not sent out(!). So, IIT sent the layout to the fab.
It came back and had severe voltage range issues. Also, it was a black box, i.e. provide a clock, take it out of reset, and it fetched a few instructions and was dead, lifeless. Observing the code, it was clear that it was a cache issue, since the first few fetches worked until a loop branched back. Luckily for us, there was a pin to disable the cache, then, it worked.
Management at IIT then reversed the layout and sucked out the design to a schematic. From there, subsequent tapeouts could use a proper core. Work then went on to produce the Vision Controller (VC), which was IIT's version of what CLI and AT&T bolted on around the VP. So this VC had a DMA controller, a video circuit, etc. Externally, it used a Brooktree part for the analogue to drive the monitor. VC came back from the fab and we could see colour bars on the screen using a polled interrupt test. When that polling became a real interrupt... all hell broke loose and Mips-X crashed. The standard crash was a vertically rolling screen. Without the processor there to manage the DMA for the active portion of the screen, the DMA just kept shipping out a rolling rectangle. Doug Neubauer, who was instrumental in the HW design of VC, was about to go on a fishing trip and (paraphrasing) said "I am going fishing, if this is not fixed when I return, it's game over". So during his trip a logic analyzer was hauled in and clipped onto the Mips-X code SRAMs. When the crash happened, the processor was in the interrupt handler, never able to leave. Mips-X has an odd interrupt pipeline. Everything is exposed to the programmer, so when you get an interrupt, the 5 stage pipe (IF RF ALU MEM WB) allows MEM and WB to finish (look out for volatiles there btw). You get 3 PC's to restart at IF, RF, ALU and the last 2 will redo the MEM and WBs that were in flight at the interrupt. If an interrupt should come in whilst an interrupt was being serviced, the final code of an interrupt handler will be:

JPCRS
<---------------- New interrupt.
JPC
JPC

The new interrupt interferes with the pipeline restart and you are back in the handler with a half baked restart of the prior interrupt return. In the PSW is an e bit. The bug we found was that the exception began chain shifting again, making the e bit and recovery using the chain regs impossible.

How to then return from interrupts?

Here is where I nominate a reg as the best register I have ever seen. Having worked in this space between HW & SW, to nominate one reg as the best means this reg is pretty special. I believe Matt Cressa was the HW engineer who crafted it. Even its reason for being is odd. There was no need for this reg really, nor for its semantics. From the ERS:

RIFACE_IRQSUPPRESS    (0x20004014)                       (4 bits RD/WR)
 -----------------------------------------------------------------------------
|    |    |    |    |    |    |   |   |    |    |    |    |    |    |    |    |
|  x | x  | X  | X  | X  |  X | X | X | X  | X  | X  | x  | S3 | S2 | S1 | S0 |
|    |    |    |    |    |    |   |   |    |    |    |    |    |    |    |    |
 -----------------------------------------------------------------------------
  b15                                                                      b0

RIFACE_IRQSUPPRESS - Suppress Interrupts for S3-S0 Instructions
    0x00 = 15 instructions
    0x0e =  1 instruction
    0x0f =  0 instruction
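Read as arithmetic, the table says the suppression count is 15 minus the low nibble of the value written. A one-line Python sketch of that mapping (the function name is made up, and only the three table entries above come from the ERS):

```python
def suppressed_instructions(reg_value):
    """Instructions for which interrupts are held off after writing reg_value."""
    # Only S3..S0 (the low 4 bits) matter, and the field is inverted.
    return 15 - (reg_value & 0x0F)
```

So writing 0 gives the longest window, 15 instructions, and writing 0x0f gives none.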

Let's look at that reg closely: observe the inversion of S3..S0, writing 0 gets the maximum suppression. How is that useful? Let's look at this tail end of an interrupt handler. Observe R0 being assigned to the reg.



7609c: 96c04028 st 0x4014[r27], r0
760a0: 877e0000 ld 0x0[r29], r31
760a4: 60000019 nop
760a8: f8000003 jpcrs
760ac: e8000003 jpc
760b0: e8000003 jpc


The effect is that for the next 15 instructions, interrupts are suppressed. This is enough to escape the interrupt handler without being sent back into the handler in a cooked state.

Without this we cannot get back cleanly from an interrupt. Since the reg's original purpose was unknown, as the problem had not manifested itself yet, it really was pure serendipity that it exists at all. Also, the inversion... Had it not been inverted, with a linear map of write N get N suppress, then that too would have failed. You see, at the end of the interrupt handler the regs have to go back to the values they had at the time of the interrupt; only R0 on the Mips-X, which is hardwired to 0, can be dependably used. Amazing.