CH32V003 driving WS2812B LEDs with SPI – Part 4

1 March 2025

I am both displeased and a little disappointed to find my experiment still running today, with over 40,000 successful loops and zero errors.

In order to help convince myself that this is not just a statistically unlikely run of “good luck”, I will eliminate the delays between color changes that I had inserted into the code to make the color changes more obvious to us slow(er)-brained humans.

Once upon a time, a long, long time ago, I worked on a project that needed to work perfectly every time, and be able to check that it had worked perfectly every time. We were running millions of encryption and decryption cycles on blocks of data and were seeing some very rare cases of mistakes creeping in. We were able to catch the mistakes, but were ever so curious as to what was causing them. Sound familiar? So I set up a test to run over the weekend and let the test machine run full speed ahead. Returning on Monday, we found it had caught five errors in just over 50,000,000 transactions. Unacceptable!

We contacted the manufacturer of one of the critical components of the device and were told that, because of the way we had configured the circuit, the component could encounter a “race condition” and double-clock itself, if a particular signal arrived within a one nanosecond window that varied plus or minus five nanoseconds over temperature. That’s a tiny window! But we were hitting it on a reproducible scale.

The solution in that case was to clock the device synchronously by providing our own clock signal to the chip instead of depending on its internal clock. That way all the transactions would have been rigidly in step and not exploring the space of all possible timing combinations. Unfortunately, we had already committed to a PCB design and were on the verge of production when the whole outfit went south. Printed circuit board design and production were things to be Taken Seriously back in the day.

And by “went south” I mean the owner cleaned out the bank account and disappeared, literally leaving us at the office saying, “Stay here and I’ll go get your paychecks from the bank myself.”

But away from past mistakes and back to present mistakes. Over five million loops with no errors seems to indicate to me that the code, when properly enhardwared, works as designed. Now I need to run it up on the lift and swap the original circuit back in and see if we can continue to reproduce the error states we were previously seeing.

So at first it looked like the impossible was happening: everything now worked and yet I had changed nothing. But patience won out: I took a break and came back a few minutes later. Just as I was sitting down, I saw a run of errors being logged on the console.

So the next variation on the testing got underway. I wanted to try disconnecting the SPI output from the PA2 line and driving a different WS2812B external to the board. I set about finding another suitable LED module and building another little test cable for it. Again, when I sat back down at the desk, another run of errors was simultaneously occurring. What are the odds? one might ask.

A judicious tap-tap-tapping on the little breadboard circuit rewarded me with my answer: 100% guaranteed to fail when vigorously agitated. An intermittent connection is somehow to blame for all this mess.

Here is my list of possible candidates for where the issue lies, in decreasing order of probability (in my mind):

1.  Janky test cables made from exceedingly economical jumper wires
2.  Interconnects in the no-name solderless breadboard hosting the circuit
3.  My soldering of the header pins to the PCB
4.  Manufacturing error or tolerances in the board itself

Before I completely disassemble this prototype and build it up again in a more resilient form factor, I will go ahead and try the new LED module. Same problem: errors continue to be encountered, even with the PA2-PC6 bridge disconnected. Disconnecting the new LED module completely, while leaving the onboard LED disconnected as well, still produces errors. I really thought it would have no measurable effect, and I seem to be right this one time.

Usually in these situations, the first thing I look for is some sort of power interruption or brown-out condition on the power supply. The reason I don’t think this is the culprit is because the chip does not seem to be resetting itself when these errors occur, as both the loop and error counts seem to persist across these error states. Additionally, I feel that both the CH32V003 and the WS2812B are correctly and adequately decoupled using their respective manufacturers’ suggested values of capacitors.

Now one thing I have not yet done is to activate the chip’s inbuilt power monitoring circuitry. Perhaps that could tell me if there are variations in the chip’s internal power distribution large enough to cause individual peripherals to misbehave without triggering a complete system reset.

Reading about the power control peripheral, I see that it can monitor the system voltage and potentially trigger an interrupt if certain parameters are exceeded. Using the SDK, I see the first reasonable thing to do is to ‘de-initialize’ the peripheral using the PWR_DeInit() function, declared in /Peripheral/inc/ch32v00x_pwr.h and defined in /Peripheral/src/ch32v00x_pwr.c. Here’s what the function does:

RCC_APB1PeriphResetCmd(RCC_APB1Periph_PWR, ENABLE);  // put the PWR peripheral into reset
RCC_APB1PeriphResetCmd(RCC_APB1Periph_PWR, DISABLE); // and release it again

But… wait a minute. Isn’t that the “brick myself so hard” sequence I found earlier? Let’s find out!

And the answer is… yes. Yes, it does. Recovery consists of the following steps.

In the MRS2 IDE, go to the Flash -> Download Configuration menu option.
In the “Download Parameters” panel, go to the bottom and check (enable) these options:

1.  Turn off WCH-Link Power Output
2.  Clear CodeFlash by Power-Off
3.  Disable MCU Code-Protect

Now take that PWR_DeInit() function call out of your program! Your program should download properly and run again.

Now having configured and enabled the “programmable voltage detector” circuit (and don’t forget to enable the PWR peripheral’s clock, as I initially forgot to do!), I see that the chip thinks its supply voltage is just fine. I set the detection threshold to the highest available level, ~4.4 VDC, and actually measured 4.75 VDC at the board. The chip is rated to run at full speed all the way down to 2.7 V, or 2.8 V if’n you’re wanting ADC function, so the PVD has plenty of headroom to flag a sag long before the chip itself is in any danger. Of course, the next step is to make the voltage monitoring an asynchronous process and have it trigger an interrupt, but we both know it’s my wiring.
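
Here, for the record, is a minimal sketch of that configuration. The function and flag names follow the usual WCH SDK naming (borrowed from the SPL style), but the exact spellings — particularly the level constant, written here as PWR_PVDLevel_4V4 — are assumptions worth checking against ch32v00x_pwr.h:

RCC_APB1PeriphClockCmd(RCC_APB1Periph_PWR, ENABLE); // enable the PWR clock -- the step I forgot

PWR_PVDLevelConfig(PWR_PVDLevel_4V4); // highest detection threshold (constant name assumed)
PWR_PVDCmd(ENABLE);                   // turn on the programmable voltage detector

if(PWR_GetFlagStatus(PWR_FLAG_PVDO) == SET) { // SET means VDD has dipped below the threshold
    printf("supply voltage below PVD threshold!\r\n");
}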

I’ll wire up a more robust test fixture on the morrow.

CH32V003 driving WS2812B LEDs with SPI – Part 3

28 February 2025

So running overnight, there were over 40,000 loops of the blinking demonstration program, and it was still going. What I shoulda but didna do was add in an error counter that updated with each loop. There’s still time!

Moving the experimental apparatus over to the “official” WCH CH32V003 development board was simple enough. I built another programming cable for power, programming and serial communications, as well as a little cable for another WS2812B module I had in the WS2812B bucket. Building bespoke, modular cables for these little devices takes a little bit of time but saves so much more time in their subsequent reuse.

And it works! Well, I expected it to work, at least as well as it was working previously, which was “mostly”. But I really do need to add that error counter to the program so that if it’s not immediately going to fail, I can leave it running overnight and see what happened in the morning.

The error count is being kept (correctly, I strongly suspect) and printed alongside each loop message. It has gone through several hundred loops by now and no errors have occurred.
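
For the curious, the bookkeeping is nothing fancier than a pair of counters printed once per pass. A sketch, with the names being mine and run_color_cycle() standing in for the eight-color demo sequence (the error counter itself is incremented inside the SPI send/recovery path):

uint32_t loop_count = 0;           // completed passes through the color cycle
volatile uint32_t error_count = 0; // bumped by the SPI timeout/recovery code

while(1) {
    run_color_cycle(); // hypothetical stand-in for the eight-color demo sequence
    loop_count++;
    printf("loop %lu, errors %lu\r\n", (unsigned long)loop_count, (unsigned long)error_count);
}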

Using a “known good device” is a proven trouble-shooting stratagem, when that is possible. It is, however, not an “apples to apples” comparison, at least the way I have it set up. The WCH board actually does have a 24 MHz quartz crystal mounted on it, even though my code is still telling the clock control unit to use the 24 MHz HSI oscillator, turbo’d up to 48 MHz by the magic of a phase-locked loop. The WCH board hosts a CH32V003F4P6, which is a 20 pin TSSOP20 package (thin shrink small outline package), while my little board has the 20 pin QFN20 (quad flat no lead) package. And the WS2812B LED is different, although it’s not clear to me how that could affect the outcome, but I list it for completeness.

I am going to be both displeased and a little disappointed if the code works perfectly on the “known good” board and not on mine. While I have already designed and shipped several different PCB-based products using this family of chips, I’m the first one to admit that I still have a lot to learn. Lay some wisdom on me, little chips!

CH32V003 driving WS2812B LEDs with SPI – Part 2

27 February 2025

I firmly believe that the only reason this article is not finished and already published on my web site (well, for you, Dear Reader, it is, but for me, your Humble Narrator, it is yet to be) is that I listed two goals in my introduction and only achieved the first. In retrospect, and following the also-applicable “one topic per email” rule of writing, I should have edited myself, restated my goal (singular) and been done with it.

But here we are. The first goal, as you recall, was to use the hardware SPI on the chip to create a suitable wave form to drive the WS2812B addressable LED on my little development board. That goal is mostly achieved, in that I have seen it working and verified the signal using an oscilloscope. Mostly, but not completely, as it seems to “hang up” from time to time in a most frustratingly random manner.

The second, and arguably less critical goal was to be able to adjust the apparent brightness of the LED in real time for demonstration purposes. Two completely different things that very well could have been two completely different articles, although I feel that the first goal outweighs the second in value and practicality. We’ll get to that second goal today, I hope.

Another mystery presented itself yesterday and I was tempted to just ignore it, but I think you know “how I am” about these things. When verifying that the SPI output signal was not conflicting with the supposedly high-impedance default state of PA2, I was debugging the program and saw that the GPIO configuration register for GPIOA was set to all zeros. Now the RM explicitly states that the reset value is supposed to be 0x44444444, indicating all eight pins of GPIOA are configured as inputs with no pull-up or pull-down resistors. Being all zeros, or 0x00000000, this represents a configuration of all “analog inputs”, which is a different thing. But this profound mystery will have to wait for its investigation as “randomly hanging up” is not a thing I can tolerate at all.

And by “randomly” I mean very randomly. I happened to notice the LED “not blinking” as it was supposed to be cycling through eight basic combinations of red, green and blue. Then I added a debug message on the console for each pass through the entire loop. As there is a 250 ms delay after each LED combination is set, that means the loop takes two seconds (or so) to complete. I left it running overnight and it stalled at loop #27,719. That means that everything worked splendidly for at least 55,438 seconds, which is more easily comprehended as almost 16 hours.

I had left it running under the debugger, so that when (or if) it should misbehave, I would be able to examine its state. I was able to do so, and discovered that it was hanging up at the only place that it possibly could, assuming as I always do that it was my code that was causing the problem. This was in the spi_send() function that first waits for the transmit register to be empty before sending the next byte out the SPI port. And sure enough, the TXE bit of the SPI’s STATR status register is reading a solid zero, meaning that the transmit register is not empty and that more waiting is indicated. Something is amiss here.

Assuming that the SPI is still receiving clock pulses from its prescaler, anything “transmitted” should clock itself out in eight bit times, or roughly 1.333 us. I’m not using any sort of handshaking controls or other possibly interfering mechanisms here.

Now it has hung up after only 31 loops. It’s bad. Really bad.

So at the moment it seems the only logical thing to do in this situation is to add a timeout feature to the spi_send() function. How long to wait before declaring a ‘mayday’ and implementing Directive Omega? We should know within 2 us if there is a problem, given any eight bit byte should transmit completely in eight cycles of the 6 MHz clock. The little chip can only execute at 48 MHz, and even if it were executing one instruction in every clock cycle, those eight SPI bit times would only amount to 64 clock cycles. It’s not, because at system clocks of 24 MHz or over, an additional wait state is introduced for every flash memory access. It’s not entirely clear to me how that maps to the final cycles-per-second equation, but it’s got to be in there somewhere.

So a very safe and humanly undetectable amount of time would be a maximum of 64 iterations of the wait loop. If this were a more time-critical matter, we could enlist the help of the system timer, which is a 32 bit counter that can be clocked by the system clock either directly or after being divided by eight. It is in many ways almost identical to the SysTick peripheral in ARM Cortex devices.

But again, we’re blinking an LED and not landing on the moon or anything of material impact, so ‘close enough’ on this fail-safe device is sufficient.

Now that we’ve calculated a reasonable time frame for the transmit register to report itself empty and ready for new data, what exactly do we do when (not “if”, it seems) this failure occurs?

The only thing that seems to work with things like this is to turn it off and on again. “Have you tried turning it off and on again?” is a classic for a reason. We can simply re-initialize the SPI device and start over. To be safe, it would also be prudent to send a protocol reset signal, i.e., a low-level signal of ~50 us, before resuming our attempts to transmit.

I originally coded the SPI initialization code within the main() function, as I had originally only ever intended to execute it once. Now it is its own little function, which I lovingly named spi_init(), which in no way conflicts with the SDK-provided SPI_Init() function.

Well, I almost fell into a trap here. By adding the ‘reset’ function to the end of the recovery procedure, my little function would have been, in effect, calling itself, as the ws2812b_reset() function in turn calls the spi_send() function. Now we’re talking about an exceptional condition here, not something that is guaranteed to happen every time. But the one thing we know about this situation is that we don’t know what is causing it (yet) or why it is happening, much less if or when it will recur.
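
While we wait, here is a sketch of the reworked spi_send() as I understand it should look, with my assumptions spelled out: a give-up point of roughly 64 polling iterations, recovery by calling my spi_init(), and a plain Delay_Us() pause standing in for the ~50 us protocol reset so that the recovery path never calls back into the WS2812B layer (and therefore never back into itself). The error_count name is mine.

extern volatile uint32_t error_count; // hypothetical counter, printed with each loop message

void spi_send(uint8_t data) { // send 8-bit data out via SPI, now with a timeout

    uint32_t patience = 64; // roughly 64 polling iterations before declaring 'mayday'

    while((SPI1->STATR & SPI_I2S_FLAG_TXE) == 0) {
        if(--patience == 0) {
            error_count++; // note the failure for the statistics
            spi_init();    // re-initialize the SPI peripheral from scratch
            Delay_Us(60);  // the idle (low) data line for > 50 us doubles as the
                           // WS2812B 'reset', without recursing into ws2812b_reset()
            break;         // then fall through and attempt the transmission anyway
        }
    }

    SPI1->DATAR = data;
}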

And now we wait, while the code ‘tests itself’. In the meantime, I’ll describe the original code that I was using to break down the transmission protocol into manageable chunks.

You’ll recall that at the lowest level, we were using the SPI to generate some arbitrary wave forms for us. A short-ish pulse was emitted when we transmitted a 0x60 via the SPI port, and that represented a zero, while a longer-ish pulse was created by shifting out 0x7E, to be interpreted as a one. I wrote a function called ws2812b_bit() which took a single argument, either a zero or something other than a zero and transmitted the appropriate value via the spi_send() function.

Then on top of that, I wrote a function to send the eight bits in a byte by sending the MSB of the byte via the ws2812b_bit() function, then shifting the entire byte to the left, so as to move the next most significant bit up into the MSB position. This happened a total of eight times and the single byte was transmitted.

The top layer was a function called ws2812b_rgb() which took three eight-bit values for the red, green and blue components of the signal, and called the ws2812b_byte() function for each, in green, red, then blue order.

The application could use the ws2812b_rgb() function to send out a string of RGB values to a string of LEDs, even a string of only one LED. After all the values had been sent, the ws2812b_reset() function would confirm their election and shift all the transmitted data values to the appropriate departments within each LED and start to display them accordingly.

It was totally working and we could have totally gotten away with it, had I not turned the blinding spotlight of the oscilloscope on the signal. The signal was nowhere near running at the throughput I had hoped for. There were biiiig gaps between the individual pulses, and while it still met the ever-so-relaxed requirements of the LED, it was only running at about 250 kHz, and not the 750 kHz theoretical maximum we should have seen, given our SPI clocking constraints.

So I played with about a bazillion combinations of different timing setups, including “unrolling” my functions to eliminate any excessive call overhead, all to no avail. Then I discovered, by re-reading the reference manual for the tenth time, that I was relying on the SPI’s ‘busy’ flag instead of the ‘TXE’ flag. You go read the RM and tell me how clear that would have been to you. Here’s what it says about the ‘busy’ flag:

Busy flag. This flag is set and cleared by hardware.
1:SPI is busy in communication or Tx buffer is not empty.
0:SPI (or I2S) not busy.

And here is what it says about the ‘TXE’ flag:

Transmit buffer empty.
1:Tx buffer empty.
0:Tx buffer not empty.

Not interchangeable! And now I know. Well, I think I know. Something is still very messed up. Continued testing has revealed multiple failures after only 256 loops. And these are sequential errors, occurring right after the SPI reboot. Sometimes it’s four or five errors, and sometimes it’s more than I can count, as the error messages scroll off the top of the screen.

The good news is that it always, eventually, recovers and starts playing nice again.

As this is my first real exposure to this chip’s SPI hardware, it’s not entirely unreasonable that my expectations and its actual behavior have diverged. But I really think that I am asking the ‘bare minimum’ from this peripheral. It’s not expecting any sort of input at all and we’re not even using the clock signal that it is providing. I just don’t know what else could be causing these randomly-spaced events to occur. Yet.

As a sanity check, I will try this again on the official WCH CH32V003F4 development board, with just a single WS2812B LED attached directly to PC6, without all this also-connected-to-PA2 nonsense, and see if this happens there as well.

CH32V003 driving WS2812B LEDs with SPI – Part 1

26 February 2025

After thinking about the WS2812B driver (if you can call it that) for the CH32V003 chip that I described a few days ago, I determined to make a couple of small improvements:

1.  Use the hardware SPI to deliver a full-speed bit stream to the addressable LED
2.  Be able to adjust the overall brightness of the demo program in real time

I created a new MounRiver Studio 2 (MRS2) project called, imaginatively, “F4-WS2812B-SPI”. This time I adjusted the system clock to the full 48 MHz, but using the internal HSI oscillator as the base instead of the external quartz crystal that is still not there.

In the MRS2-supplied file, system_ch32v00x.c, I un-commented the desired setting, like this:

//#define SYSCLK_FREQ_8MHz_HSI    8000000
//#define SYSCLK_FREQ_24MHZ_HSI   HSI_VALUE
#define SYSCLK_FREQ_48MHZ_HSI   48000000
//#define SYSCLK_FREQ_8MHz_HSE    8000000
//#define SYSCLK_FREQ_24MHz_HSE   HSE_VALUE
//#define SYSCLK_FREQ_48MHz_HSE   48000000

I find the best test of your system operating frequency is a serial terminal. If your USART is setting the baud rate based on the assumed clock frequency, you’re going to find out quickly if it is right or not. The generic, boiler-plate code created by the MRS2 new project wizard for this chip family (-003) sets up USART1 to be able to use the printf() family of console output functions. It also prints out the “System Clock” value and the unique Chip ID before entering the main loop of the application. I added the program name announcement to this list just so I can keep track of which program is actually running on the terminal. So I normally get this output every time the chip is either re-programmed or reset:

SystemClk:48000000
ChipID:00310510
F4-WS2812B-SPI

So I have confirmation that the system clock is somewhere in the neighborhood of 48 MHz. First, it told me itself. Second, I can actually read what it wrote, so that’s another Good Sign.

Now I’m having some curiosity spring up around exactly how “unique” this “ChipID” really is. But perhaps I can follow up on that in the near future. It’s not looking altogether unique at this very moment.

So to talk to the WS2812B addressable LED with a ‘serial peripheral interface’ (SPI), um, peripheral, I should warn you that we are not going to use the SPI as it was originally intended. You already know that the WS2812B uses its own proprietary bit stream protocol, which I vaguely described in a very hand-wavy manner in the previous article. It’s certainly not SPI-compliant, on the face of it.

But since SPI is a protocol of Very Little Brain, we can use it more as a ‘waveform generator’ than strictly a data transmission protocol. Any eight bit byte that you transmit through the SPI emerges as a sequence of bits from a single pin, along with a synchronized clock signal on another pin. We will not be using the clock pin at all, just the data line.

Now the SPI is a versatile beastie with ever so many options for configuring the data stream. This works out well because there are ever so many different SPI-enabled devices and every one of them has its own idea of what is a right and proper configuration.

As a peripheral of the first rank on this chip, it gets an entire chapter (Chapter 14) in the Reference Manual (RM). And here we see again the lingering legacy of “master” and “slave” devices. I’ve described my opinion on this topic in the past, so I will be referring to these two roles as “coordinator” and “participant” from now on. Our chip will coordinate the data flow and the LED will participate in this activity.

The SPI peripheral, which is unfortunately but irrevocably redundant, has access to up to four (4) input and output pins, depending on the required configuration. As previously stated, we will only need one, which is the output data line, called “MOSI”, which translates to “coordinator out, participant in”. Other chips from other manufacturers sometimes refer to this pin simply as “SDO”, for ‘serial data out’. This pin is routed to PC6 (GPIO port C, pin 6), which is pinned out on the CH32V003F4U6 package on physical pin 13.

Now while the CH32V device is housed in a tiny (3x3mm) square plastic package with teensy weensy pads on the bottom of it, I had the foresight to route all the signals to the correspondingly numbered pins of a 20 pin DIP package, which is the form factor of the little development board I’m using on this project. So pin 13 on the QFN20 (quad flat no leads, 20 pins) maps directly to pin 13 on the dual in-line (DIP) footprint of the board.

Of course, before I go too far on congratulating myself on what a great job I did on laying out this board, let’s consider that I routed the output to the LED on the wrong pin entirely. I picked PA2 only because I had used that pin in the past as an output in a similar project. Now I need to figure out how to “correct” this error and get the signal from the SPI output to the LED.

Well, it’s not at all hard to do. Since the default state of most of the device pins is a high-impedance input, there should be no conflict if I just short PC6 to PA2 using a short jumper wire. I might mention at this point that I have installed the little DIP prototype development board onto a small solderless breadboard. Adding more components and attaching them to the device becomes very easy. Also, I don’t have to do any micro-circuit-surgery on the little board.

The down side is that I won’t be able to use PA2 for anything else.

So now let’s configure the SPI for our purposes. This begins with setting up PC6 as an “alternate function, push-pull output”, i.e., an output driven by one of the internal peripherals and not by the GPIO port. Then configure the SPI port to blast out those bits. Here is the configuration code:

// configure SPI

RCC_APB2PeriphClockCmd(RCC_APB2Periph_SPI1 | RCC_APB2Periph_GPIOC, ENABLE); // enable SPI1 and GPIOC peripheral clocks

GPIO_InitTypeDef GPIO_init_struct = { 0 }; // GPIO initialization parameter structure

GPIO_StructInit(&GPIO_init_struct); // set default values
GPIO_init_struct.GPIO_Pin = GPIO_Pin_6; // PC6 is SDO
GPIO_init_struct.GPIO_Speed = GPIO_Speed_10MHz; // need 6 MHz
GPIO_init_struct.GPIO_Mode = GPIO_Mode_AF_PP; // alternate function, push-pull output
GPIO_Init(GPIOC, &GPIO_init_struct); // initialize PC6
GPIO_WriteBit(GPIOC, GPIO_Pin_6, Bit_RESET); // clear PC6

SPI_InitTypeDef SPI_init_struct = { 0 }; // SPI initialization parameter structure

SPI_I2S_DeInit(SPI1); // reset peripheral
SPI_StructInit(&SPI_init_struct); // set default values
SPI_init_struct.SPI_Direction = SPI_Direction_1Line_Tx; // one line for output only
SPI_init_struct.SPI_Mode = SPI_Mode_Master; // or 'coordinator', if you prefer
SPI_init_struct.SPI_DataSize = SPI_DataSize_8b; // 8 bits
SPI_init_struct.SPI_BaudRatePrescaler = SPI_BaudRatePrescaler_8; // 48 MHz / 8 = 6 MHz SPI clock
SPI_init_struct.SPI_FirstBit = SPI_FirstBit_MSB; // MSB first
SPI_Init(SPI1, &SPI_init_struct); // initialize SPI
SPI_Cmd(SPI1, ENABLE); // enable SPI

So to send the individual ‘wave forms’ that make up the binary ones and zeros that the WS2812B understands, we’ll shift out a few ones as a zero and a few more ones as a one. Yes? Yes!

For example, to send the code for a zero, we send a shorter high-level pulse, followed by a longer low-level pulse. I use a 0x60 byte value, or 01100000 in binary. To send a one, I use the value 0xFC, or 11111100 in binary, instead.

I wrote a simple function that sends the data byte out the SPI port, while waiting for any previously-transmitted bytes to clear first. It looks like this:

void spi_send(uint8_t data) { // send 8-bit data out via SPI

    while((SPI1->STATR & SPI_I2S_FLAG_TXE) == 0) {
        // wait for transmit register to be empty
    }

    SPI1->DATAR = data;
}
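
With spi_send() in place, the per-bit layer reduces to a one-liner. A minimal sketch, using the 0x60 / 0xFC patterns just described (the function name is my own choice):

void ws2812b_bit(uint8_t bit) { // emit one WS2812B bit as one SPI byte

    spi_send(bit ? 0xFC : 0x60); // 11111100 = longer pulse (one), 01100000 = shorter pulse (zero)
}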

Now if you’ve done the math, and I know you’ve done the math, you’ll quickly figure out that the timing is still not exactly right on these transmissions. This is due to the limited number of SPI clock prescalers available. The system is running at 48 MHz, and we are only provided with powers-of-two for clock divisors. For our purposes, we use “/8” so that we get a 6 MHz clock running the SPI. This means that each “bit” in the eight bit byte that gets sent out occupies ~167 ns, and eight of them add up to 1.33 us, which is longer than the nominal 1.25 us bit cell duration. So we’re getting 750 kHz instead of 800 kHz. Not perfect, and not 100% of what is possible, but much better than before.

So that’s the first of my two goals accomplished. Now to “adjust” the apparent brightness of the LEDs in real time for demonstration purposes.

CH32V003 Driving WS2812B Addressable LEDs

23 February 2025

I’d like to be able to control some WS2812B addressable RGB LEDs using the CH32V003 chips. I’ve designed a little development board with a CH32V003F4U6 QFN20 device and it has a microscopically tiny WS2812B-compatible LED on it.

When I designed this board, I connected the data pin of the WS2812B LED to PA2, but only because I had written some earlier code that already used that pin to drive the LEDs. In retrospect, I should have used PC6, as it is also the SPI MOSI output, which is ideal for shifting out bits in a serial fashion, which is what the WS2812B type of LEDs want.

But here we are and I already have a stack of these boards here to play with, so play with them I will.

I’m using the new MounRiver Studio 2 (MRS2) IDE for this project. I created a new project called F4-WS2812B because that’s how creatively I name things.

I made a few tweaks to the project settings. Instead of attempting to crank up a 24 MHz quartz crystal attached to PA1 and PA2 that is not there, I set the system clock to 8 MHz. This is coincidentally the default system clock for the CH32V003 chips in their native state, before MRS2 decides these things for you. The chip powers up with the high-speed internal (HSI) oscillator running at ~24 MHz and sets up a prescaler of 3 on the system clock. For the speed crazy fans out there, you can speed up this little chip to 48 MHz, no problem. But for this project, I am currently estimating (i.e., totally guessing) that 8 MHz should be sufficient for driving the serial bitstream out to the LED in accordance with its timing constraints.

I also switched the default compiler from GCC8 to GCC12 and made a handful of other, less significant changes to the project settings.

So the first thing to do is to configure PA2 as a push-pull output with modest speed capabilities. The slowest setting is 2 MHz, and the target signal specification is 800 kHz. I’m not 100% clear on what this setting changes in the output pin drive circuitry, but I assume it reduces some of the EMI that might otherwise be emitted if driven at the maximum speeds. Here is the code to do that, using the supplied HAL library:

// configure PA2 as push-pull output, 2 MHz max
RCC_APB2PeriphClockCmd(RCC_APB2Periph_GPIOA, ENABLE);
GPIO_InitTypeDef  GPIO_InitStructure = {0};
GPIO_InitStructure.GPIO_Pin = GPIO_Pin_2;
GPIO_InitStructure.GPIO_Speed = GPIO_Speed_2MHz;
GPIO_InitStructure.GPIO_Mode = GPIO_Mode_Out_PP;
GPIO_Init(GPIOA, &GPIO_InitStructure);
GPIO_WriteBit(GPIOA, GPIO_Pin_2, Bit_RESET); // PA2 low

The next thing to do is to start sending out some pulses and see how close we can get to the WS2812B’s timing requirements. Running at 8 MHz, each instruction cycle lasts 125 nanoseconds, and the RISC-V CPU in this chip executes most instructions in a single cycle.

There are three types of pulses that we need to send to talk to this LED. A short high pulse followed by a longer low pulse counts as a zero. A longer high pulse followed by a shorter low pulse counts as a one. A long low pulse of at least 50 microseconds acts as a ‘reset’ signal, telling the LED to latch in any data that has been shifted into it and using that data to light up the red, green or blue LEDs accordingly.

Now much has been said about the strictness of these timing requirements. The only thing that is really critical is the difference in the “short” and “long” periods of the high portion of the pulses. There can be quite a bit of variability on the low portion of the pulse, as long as it’s not so long as to be interpreted as the reset signal.

My simple adaptation of this timing protocol has pretty good control over the high part of the pulse, but the low parts tend to go on just a bit too long – it still works just fine, though. The downside is that the overall bit frequency is only about 250 kHz, much lower than the maximum of 800 kHz. Right now I’m only trying to light up a single addressable LED, so this works fine, but if I was wanting to talk to a lengthy string of LEDs, this would seriously limit the maximum update rate for the entire string.
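
To put rough numbers on that: each LED wants 24 bits, so at ~250 kHz that is about 96 us per LED, or roughly 9.6 ms for a 100-LED string – around 100 full updates per second. At the full 800 kHz the same string would refresh in about 3 ms, better than 300 updates per second.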

At the lowest level, I created a function called ws2812b_pulse() that takes a single argument. For a short pulse, you send it a zero. For the longer pulse, send it a 1. To send the reset signal, send it a 2. Here is the code:

void ws2812b_pulse(uint8_t length) { // send out a pulse

    // note:  system clock assumed to be 8 MHz

    // 0 = short pulse
    // 1 = long pulse
    // 2 = 'reset' signal

    switch(length) {
    case 0: // short pulse, 250 ns
        GPIOA->BSHR = GPIO_Pin_2; // high
        GPIOA->BCR = GPIO_Pin_2; // low
        break;
    case 1: // long pulse, 750 ns
        GPIOA->BSHR = GPIO_Pin_2; // high
        __asm__("nop"); // extend that pulse by 125 ns
        __asm__("nop"); // extend that pulse by 125 ns
        __asm__("nop"); // extend that pulse by 125 ns
        __asm__("nop"); // extend that pulse by 125 ns
        GPIOA->BCR = GPIO_Pin_2; // low
        break;
    case 2: // reset, > 50 us
        GPIOA->BCR = GPIO_Pin_2; // low
        Delay_Us(50);
        break;
    }
}

Next up the chain, I wrote a function that repeatedly calls the ws2812b_pulse() function with the data bits of a single byte, starting with the most significant bit (MSB) and going down to the least significant bit (LSB), as this is how the WS2812B listens for bytes. Here is the code:

void ws2812b_byte(uint8_t byte) { // send a byte one bit at a time, MSB first

    uint8_t i; // bit counter

    for(i = 0; i < 8; i++) { // loop through all the bits in the byte
        if(byte & 0x80) { // send a 1
            ws2812b_pulse(1);
        } else { // send a 0
            ws2812b_pulse(0);
        }
        byte <<= 1; // shift all the bits
    }
}

The actual protocol of the WS2812B is to get three bytes worth of data, in green, red, blue order. If you send in another three bytes, it will shift out the previous bits to the next LED. Once you’re ready to commit, you send the reset signal, and the chips latch all their data and start showing the corresponding colors with their LEDs.

Here is the code to send all three bytes in the proper order:

void ws2812b_rgb(uint8_t red, uint8_t green, uint8_t blue) { // send RGB data to LED

    ws2812b_byte(green); // green data
    ws2812b_byte(red); // red data
    ws2812b_byte(blue); // blue data
}

To latch in that data, I created a macro that sends the reset pulse:

#define ws2812b_reset() ws2812b_pulse(2) // send 'reset' pulse to latch data

As a demonstration of all the possible colors at the lowest possible brightness, I wrote this simple loop that repeats endlessly:

while(true) { // an endless loop

    ws2812b_rgb(0, 0, 0); // black
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(1, 0, 0); // red
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(1, 1, 0); // yellow
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(0, 1, 0); // green
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(0, 1, 1); // cyan
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(0, 0, 1); // blue
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(1, 0, 1); // magenta
    ws2812b_reset(); // reset
    Delay_Ms(250);

    ws2812b_rgb(1, 1, 1); // white
    ws2812b_reset(); // reset
    Delay_Ms(250);
}

You’ll find that these little LEDs are quite bright when told to shine at their utmost capacity. In fact, you need to be careful when you are working with more than just a very few of these in a string, as the current consumption goes way up way quick, and they tend to produce a good amount of heat in the process. But one little LED on my little dev board is going to continue to behave itself and blink merrily along into the night.

Getting getchar() to Get Characters

22 February 2025

Yesterday I was poking around inside the GPIO ports of the CH32V003 chip, and “printing” out the results to the “console”. The default application created by the MounRiver Studio 2 “new project wizard” sets up a nice mechanism whereby us old-school programmers can use the printf() function from the standard I/O library, stdio. I tried, unsuccessfully, to use the corresponding getchar() function to read a single character back from the console, but it flat didn’t work at all.

I totally guessed yesterday, albeit correctly, that this was due to a lack of a lower-level function to redirect console input from the USART. Today, I did some research and discovered that it takes that lower-level function and an additional incantation to get the thing to work as I wish it to work.

That lower-level function is called _read(), with a leading underscore, and it is expected to have the following prototype:

int _read(int file, char *result, size_t len);

Since I’m not using a bunch of files and other niceties, I can just ignore the first parameter. If, in the future, I wanted to support “getting characters” from other sources, I could go back and match up the file number specified with the various sources. Today I skip it. If you like to compile with lots of error checking, you will most likely get an “unused parameter” error, so you might need to flag it as used. The default setup provided by the MRS2 project is quite forgiving in this regard.
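
If you do need to keep a picky compiler quiet, one common idiom is a do-nothing cast at the top of the function:

    (void)file; // deliberately unused; silences the 'unused parameter' warning without changing anything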

The “result” parameter is a pointer to a buffer where we will store the incoming bytes. The final parameter indicates how many characters are wanted by the caller.

The little “helper function” needs to send back a return code that either represents the number of characters actually read or -1 in the case of an error.

Here’s what mine ended up looking like:

int _read(int file, char *result, size_t len) { // support stdio.h getchar(), etc.
    
    int return_code = 0; // code we will return
    size_t bytes_to_return = len; // capture number of requested bytes to return

    if(len == 0) return 0; // that was easy

    while(bytes_to_return) {

        if((USART1->STATR & USART_STATR_RXNE) == USART_STATR_RXNE) {

            // there is a character ready to be read from the USART

            *result = USART1->DATAR; // read character from USART data register and store it in the requested buffer
            result++; // advance buffer address
            bytes_to_return--; // decrement the number of bytes still to be read
            return_code++; // count how many bytes have been returned so far

        } else {
            // probably should add some sort of time-out mechanism here
        }
    }

    return return_code; // number of bytes returned;  no error states to report as yet
}

You’ll note a special case at the beginning where if the caller asks for exactly zero bytes, we throw up our hands in joy and say, “Your wish is granted!” and just return. After all, we did everything that was asked.

Next the code goes into a loop to retrieve the requested number of characters, one at a time. Within the loop, we just look at the status bit RXNE (receive register not empty) in the USART STATR status register. If it is set, we read the character from the USART data register DATAR and stuff it into the receive buffer. If it is not set, we effectively wait forever until something comes in.

The number of bytes still to be read is tracked in bytes_to_return, while return_code counts the number of bytes actually returned. The loop eventually exits and the return_code (the number of bytes returned, in this case) is handed back to the caller.

This code by itself is not enough to make the getchar() function get any characters; well, at least not in the way you would expect it to. Since all these standard I/O functions are inherited from Linux/UNIX and were originally written for much larger systems, the default behavior of the function is to gather up a bunch of characters and buffer them before actually returning to the caller. This makes sense when your program is being run alongside many other programs as well as an operating system. On the scale of our little CH32V chip, maybe not so much.

So here is the incantation I promised you:

setvbuf(stdin, NULL, _IONBF, 0); // disable buffering on stdin

As you see explained in my little code comment, this otherwise cryptic and mysterious function disables buffering on ‘stdin’, the default input device for the imaginary console we are using.

Now we can use the very handy getchar() function to wait for and get a character from the USART, which is in turn connected via a system of tubes and hoses to our host development system, somehow.
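
Putting the pieces together, a minimal test looks something like this (assuming the usual MRS2 USART/printf setup has already run):

setvbuf(stdin, NULL, _IONBF, 0); // unbuffered, so getchar() returns one key at a time

while(1) {
    int c = getchar(); // blocks inside _read() until the USART receives a byte
    printf("got: %c (0x%02X)\r\n", (char)c, (unsigned)c & 0xFFu);
}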

Future improvements to the very basic _read() function demonstrated here would include both a time-out mechanism and the ability to differentiate between the various possible sources for input to our little chip. Perhaps you can think of some more improvements, as well. Please do share them in the comments.

An Experiment to Satisfy My Curiosity

21 February 2025

I was setting up a small CH32V003 demo project to see which GPIO pin got toggled in the default MounRiver Studio 2 application. On the CH32X035 default application, it’s PA0 (GPIO port A, pin 0). But there’s no PA0 on the CH32V003, even on the largest package. So which pin is it?

The answer surprised me.

It turns out that the default application generated by MRS2 does not blink an LED or toggle a GPIO pin at all. It sets up the USART to receive and echo characters via the virtual serial port on the WCH-LinkE programming adapter. Technically, it doesn’t faithfully echo the character; it inverts all the bits of any received character then transmits that back to the console.

So as not to feel entirely shut down, I plowed ahead and made it blink an LED. No PA0, you say? No problem, I answer. I see that there is a PA1, which is pin 2 on the CH32V003F4U6 I’m using, and that should do just as well.

I added the requisite code to enable GPIOA and set up PA1 as a push-pull output of modest speed:

    RCC_APB2PeriphClockCmd(RCC_APB2Periph_GPIOA, ENABLE); // enable GPIOA peripheral clock

    GPIO_InitTypeDef GPIO_InitStructure = { 0 };

    GPIO_InitStructure.GPIO_Pin = GPIO_Pin_1; // PA1
    GPIO_InitStructure.GPIO_Mode = GPIO_Mode_Out_PP; // output, push-pull
    GPIO_InitStructure.GPIO_Speed = GPIO_Speed_2MHz; // doesn't have to be fast

    GPIO_Init(GPIOA, &GPIO_InitStructure);

Actually, I cut and pasted that code from the existing, pre-generated code from the project that sets up the USART. Now since the default mapping of the USART’s transmit (PD5) and receive (PD6) pins belongs entirely to GPIOD, not GPIOA, it was a while before I noticed that my initialization code was wrong. This only took an embarrassingly long time to find by single-stepping the code and looking at the bits in the configuration register for the GPIOA peripheral.

So once I had discovered that I was, in fact, re-initializing GPIOD, at least pin 1 in any case, I assumed it would just start working. I had attached one of my very favorite LEDs, a 5mm blue LED from waaay back. Indeed, it might be one of the first blue LEDs I ever obtained. I still remember the moment that Billy Gage of BG Micro showed these to me. It was an amazing experience.

So I’ve attached the blue LED with its requisite 270Ω resistor between PA1 and ground. You know the steps: save the file, recompile and download. Blinky blue goodness?

Goodness, no. Still no blinkage. I’m getting just a little exasperated at this point. Blinking an LED is the “Hello, world!” of the embedded development world. It is both a rite of passage and a trivial accomplishment at the same time. I would have assumed at this point in my career and at this particular point on my learning curve of these devices that I would be seeing a blinking blue LED. I had done it before, countless times. I would do it again!

The only other thing I could think of was that the PA1 pin had been re-mapped to a different function. These new microcontrollers have so many internal peripherals that not all of them get their very own pins. Scarce resources must be thoughtfully allocated. I looked up the re-mapping options for this pin in the CH32V003 data sheet. Yes, it can be re-purposed as the external crystal input, OSCI. But I hadn’t asked it to do that.

Cranking up the debugger again (how would I even survive without this tool?), I look at the remap register PCFR1 in the AFIO peripheral, and there is PA12_RM. Now that’s not the best possible name for it, is it? It’s the ‘remap option bit for PA1 and PA2’, but it sure looks like they are referring to a mysterious PA12, which totally does not exist on this chip.

And yes, the bit is set, meaning that the function of PA1 (and PA2) has been shifted over to quartz crystal oscillator duty, and not GPIO function, as I intended.

Now this is not the default state of this re-mapping option. Someone, somewhere, was sneaking in during the night, replacing everything with an exact duplicate and setting that bit in blatant contradiction to my wishes.

Something told me to review the clock options located in the “system_ch32v00x.c” source file, which is created by the MRS2 new project wizard. Sure enough, it had selected “#define SYSCLK_FREQ_48MHz_HSE 48000000” as the default clock for the system. The HSE is the “High Speed External” oscillator. My circuit has no quartz crystal attached to PA1 and PA2. You might remember that I have a very special blue LED attached to PA1. PA2 happens to have a WS2812 programmable LED attached to it, but I wasn’t even going to play with that (yet).

Changing the selection to “#define SYSCLK_FREQ_8MHz_HSI 8000000”, saving, recompiling and downloading finally gave me the blinky blue triumph I felt that I deserved at this point. Whew!

Now you may be asking, “How could the system even run at all with no crystal attached, if that was how it was configured to run?” And that would be an excellent question. The answer is that it goes through a sequence of steps to get to that point, and when any of those steps fail, it just continues on. There is an internal variant of the high speed oscillator, properly named the HSI oscillator, that is always present and is on by default when the chip first powers up. It runs at a nominal 24 MHz, but can be divided by a selection of integer prescalers (1, 2, 3, 4, 5, 6, 7, 8, 16, 32, 64, 128 and 256). It divides the 24 MHz signal by 3 to give me my selected 8 MHz clock, once I specified it correctly. So it was previously running at 24 MHz, clocked from the HSI oscillator, since the HSE failed to start, and so it never even tried to enable the built-in phase-locked loop to double the frequency to 48 MHz. Additionally, the mechanism to switch system clocks will just silently ignore your request if the required signal is not available and stable.

So now I have my blinking blue LED and all is well with the world. I should stop here, right? Always quit a winner, they say.

Well, of course not. Now is the time to answer all the other nagging questions I have had about certain aspects of this chip, and specifically some of the functions of the pins.

Having first been introduced to this family of chips by the smaller, eight pin packaged CH32V003J4, I had struggled to understand the availability of pins and functions. That particular beastie has multiple GPIO pins tied to each physical pin – but not all of them! PD7, GPIO port D, pin 7, which can double as the external reset signal NRST, was not pinned out at all. On the expansive F4U6 package (QFN20, quad flat no leads 20 pins) sitting before me, PD7 is brought out to pin 1. Now what will it take to actually be able to use this pin as a chip reset?

The answer might surprise you.

Nothing, actually. It’s already set up from the factory to be the reset input signal. In fact, you would have to go into the ‘user option bytes’ and change the configuration of the RST_MODE field to allow PD7 to be used as a GPIO pin. Then you would have to reset the chip for the new setting to take place.

I set out to confirm this theory by connecting a momentary push button switch between PD7 and ground. When I press the button, the chip resets. If I hold the button down, the chip does nothing at all.

Now a clever sort of developer could enable PD7 as a GPIO pin, then connect an external interrupt to it, so that it could be an ‘intelligent’ reset input, while still being completely asynchronous. The interrupt handler would consider the ‘request for reset’ and decide, based on what was important at the time, whether to reset or not. Resetting the chip from code can actually be done in a number of ways. How many do you know? Share your favorites in the comments.
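
For what it’s worth, the most direct of those ways is a plain software reset. Assuming the SDK’s core_riscv.h provides NVIC_SystemReset() for this family, as it does for WCH’s other parts, the ‘considered’ reset inside such a handler boils down to something like:

if(reset_request_is_legitimate()) { // hypothetical application-level sanity check
    NVIC_SystemReset();             // software reset via the interrupt controller; never returns
}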

So that experiment was quick and satisfying for me. PD7, which is available on every package except the -J4 SOP8, is a perfectly cromulent nRST input, and works exactly as one would expect it to work.

So does that wrap up all the experimentation for today? Can you not see the scroll bar on the right side of this screen? Of course it doesn’t!

Reviewing the pinout of the F4U6 package, I see that GPIO port A has only two pins present, PA1 and PA2, while ports C and D both have eight pins each. Now, still thinking that these devices are bigger on the inside than they are on the outside, as far as available peripheral connections to available physical pins are concerned, it seemed odd to me that GPIOA only had two pins to it. It probably wasn’t even going to have any pins, as those two pins would or could be allocated to an external crystal for system clocking purposes. But it would seem to be a waste of two perfectly good pins if the end-application did not require the exquisitely precise timing that a quartz-based oscillator can provide. So they wisely put in another GPIO port on the chip.

But does it really only have two pins in it? Or is it, and this was my suspicion, actually an exact copy of the other two ports, GPIOC and GPIOD, with a total of eight ‘pins’ internally and only two of those pins brought out to physical pins on the package?

Now without decapsulating the device and taking some pictures through a microscope, which is completely a reasonable thing to do in someone else’s laboratory (not mine), how could we determine if those phantom pins exist or not?

One way would be to write various bit patterns to the output register, OUTDR, then read them back in and see if they all toggled on and off in unison. So just to be exhaustive, I wrote a short loop that wrote all 256 combinations of ones and zeros to GPIOA->OUTDR, then read them back in and compared the results. If they all matched, it meant that all eight bits were realized internally and just not pinned out. If there were mismatches, it would indicate that some or all of the other bits were, in fact, unimplemented.
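
The probe itself is short. A sketch, assuming GPIOA’s peripheral clock has already been enabled:

uint32_t pattern, readback;

for(pattern = 0; pattern < 256; pattern++) {
    GPIOA->OUTDR = pattern;         // write the test pattern to the output data register
    readback = GPIOA->OUTDR & 0xFF; // read it straight back from the same register
    if(readback != pattern) {
        printf("wrote %02X, read %02X\r\n", (unsigned)pattern, (unsigned)readback);
    }
}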

So I got tons and tons of mismatches. But since I was writing them out one line at a time on the serial terminal, the first results scrolled past too fast to examine.

I added a dummy ‘getchar()’ call to wait for the user (me) to hit a key on the keyboard after every 16 lines, for a very simple sort of pagination of the output.

For some reason that I have yet to investigate, the getchar() function simply returns without ‘getting’ any ‘chars’ at all. It probably has something to do with the fact that I have not provided a low-level read() function for the stdio library to use to let it know whence the aforementioned characters. An experiment for a future day.

Since I already had the USART initialized for the console output using the printf() function, et al., I just called the SDK-provided function to wait for a character to arrive, then read and discard said character. Pagination accomplished.

Now while my progression of output test patterns went from 0x00 to 0xFF in the expected order, the returned values that were read back in consisted only of 0x00, 0x02, 0x04 and 0x06. These values represent the four possible states in binary of bit positions 1 and 2, or PA1 and PA2 as we know them.

The conclusion I reach at this point is that only the two published GPIO pins, PA1 and PA2, are actually implemented on this chip. Do you agree or disagree with my conclusion? What other testing methodology should I apply to dive deeper into this Important Scientific Investigation?

Just to help me feel better about the testing I had done on GPIOA, I repeated the same test on both GPIOC and GPIOD. In all 256 cases, each port read back the exact expected value as had been written to it. All eight bits of GPIOC and GPIOD are implemented, which is not surprising at all as they have all been routed to different pins on the package. But it does give me a positive result to help me have a little confidence in my testing strategy.

What I found especially interesting about the testing on GPIOD was that it ‘succeeded’ even when some of the pins were being used for other functions, such as the USART (PD5 and PD6) and the nRST input (PD7).

But you may be asking, “Wait a minute… what happened to GPIO port B?” And that would be an excellent question. So I set out to try to discover if there was any vestige of a GPIO port B on the chip.

The first thing I did was to try to set the ‘peripheral reset’ bit for GPIOB in the RCC peripheral. There are bits defined to reset the GPIO ports A, C and D, as well as the AFIO (alternate function input output controller) peripheral. There is a suspiciously ‘reserved’ spot between the IOPARST and IOPCRST bits in the APB2PRSTR register within RCC. I fudged my own definition for this missing bit, as well as the corresponding IOPBEN peripheral clock enable bit, like this:

#define RCC_IOPBRST (1 << 3) // not defined in device header (for a reason)
#define RCC_IOPBEN  (1 << 3) // not defined in device header (for a reason)

If you write all ones into this peripheral reset register (there are two of them, actually), then read back that register, you will find that only some of the bits still have ones in them. Those ones represent peripherals that are 1) implemented and 2) able to be reset. GPIOB, as represented by my completely fake RCC_IOPBRST bit, was a zero.

Now remember to release all those peripherals that you just reset or they will remain in a reset state in perpetuity.

You can do the same thing with the peripheral clock enable registers (there are three of these in total). Again, GPIOB fails to stick when writing a one to the RCC_IOPBEN bit.
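
Here is a narrower, console-friendly variant of that probe which only pokes the suspect GPIOB bits (so the USART configuration survives), using the fudged definitions above:

uint32_t rst_sticks, en_sticks;

RCC->APB2PRSTR |= RCC_IOPBRST;             // try to assert the would-be GPIOB reset bit
rst_sticks = RCC->APB2PRSTR & RCC_IOPBRST; // does it read back as set?
RCC->APB2PRSTR &= ~RCC_IOPBRST;            // release it either way

RCC->APB2PCENR |= RCC_IOPBEN;              // same game with the would-be clock enable bit
en_sticks = RCC->APB2PCENR & RCC_IOPBEN;
RCC->APB2PCENR &= ~RCC_IOPBEN;

printf("IOPBRST %s, IOPBEN %s\r\n",
       rst_sticks ? "sticks" : "does not stick",
       en_sticks  ? "sticks" : "does not stick");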

So there really is no GPIOB implemented on this chip.

Now we know.

Notes on RISC-V Assembly Language Programming – Part 19

14 February 2025

I spent some more time debating with myself about expanding my coverage of these lovely Hershey fonts to some of the other sets, but have decided for the moment to pause with the plain and simplex versions of the Roman set. These more than meet my immediate requirements for the present project and are nice to look at as well.

Now I need to go beyond plotting a single character in the center of the screen and write a little code to send them out to the screen in a more utilitarian manner. For now I’m going to use the ‘native’ resolution of 21 ‘raster units’ in height and see if I can get three lines of readable type on the screen at once.

Without scaling the fonts, the most I can get is two lines of text. But that is when I don’t accommodate the ‘tall bois’, like the ‘[‘ and ‘]’ brackets and, surprisingly, the lower case ‘j’. Expanding all the margins so that everything actually fits only allows a single line of text, sometimes with as few as 4 characters, for important messages such as “mmmm” or “—-“.

Revisiting the ever-so-fascinating statistics of a few days ago, we see where this is coming from:

Statistic   Value  Glyph  Char
---------   -----  -----  ----
Max width    30    613    m
Max x        11    613    m
Min x       -11    613    m
Max y        16    607    g
Min y       -16    719    $

Well, there’s our friend, the expansive ‘m’ and the other titans of the simplex set.

So now it’s time to scale the fonts and see if I can get a more useful number of characters on the screen at the same time and still have them be legible.

Notes on RISC-V Assembly Language Programming – Part 18

12 February 2025

I can fit the scalar values for each glyph into a one-dimensional array. Then I need an array of pointers to variable-length arrays of coordinates. Others have been able to do all this with a single array, but I see a lot of wasted space in there.

I’m trying to decide ahead of time if I need to reproduce the left column and right column values in the representation array, or if I can just get away with character widths. Or do I even need to keep track of the character widths? I could just treat these as monospaced characters and just pick a number.

Here are the leftest and rightest columns from the plain set:

Max left = (-2, 9)
Min left = (-8, 1241)
Max right = (9, 1273)
Min right = (2, 9)

And here are the same statistics from the simplex set:

Max left = (-4, 509)
Min left = (-15, 613)
Max right = (15, 613)
Min right = (4, 509)

After I’m ‘done’ with these scalable fonts, there’s one more bit-mapped font trick I want to try. I can take my existing 5×8 font and double or triple it in size, giving a blocky character. That might better represent the types of letters and numbers seen on temporary highway signs, as those still tend to be composed of 5×7 (or so) LED matrices. But when am I ever ‘done’ with anything?
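Doubling a bit-mapped glyph is dead simple: stretch each pixel into a 2×2 block by widening every row and writing it out twice. Here’s a rough sketch, with the caveat that the storage format (one byte per row, glyph in the low five bits) is only my assumption for the example:

#include <stdint.h>

// Double a 5x8 glyph into a blocky 10x16 version.
// src: 8 rows, glyph in the low 5 bits of each byte.
// dst: 16 rows, doubled glyph in the low 10 bits of each halfword.
static void double_glyph(const uint8_t src[8], uint16_t dst[16])
{
    for (int row = 0; row < 8; row++) {
        uint16_t wide = 0;
        for (int col = 0; col < 5; col++) {
            if (src[row] & (1u << col)) {
                wide |= 3u << (2 * col);    // one source pixel becomes two columns
            }
        }
        dst[2 * row]     = wide;            // ...and two rows
        dst[2 * row + 1] = wide;
    }
}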

So I am going to assume that I need all this data for now, incorporate it into some data structures, port it over to the project, and see if I can plot some nice looking characters onto the little OLED screen.

The first array uses the ASCII value of the character as its index, so that value doesn’t need an actual slot in the data file. In reality, since the first 32 ASCII characters are control codes and technically unprintable, our array index [0] points to ASCII value 32, the space, which, ironically, while a ‘printable’ character, does not print anything. This offset is just something that has to be remembered.

Each entry in the array will be a typedef’d structure containing the requisite information:

Information         Plain       Simplex
------------------  --------    ---------
Number of vertices  (0, 38)     (0, 56)
Left hand column    (-8, -2)    (-15, -4)
Right hand column   (2, 9)      (4, 15)
Coordinate index
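In C, I picture each record looking something like this. It’s only a sketch: the field names and widths are my own choices, sized from the ranges in the table above.

#include <stdint.h>

typedef struct {                // one record per printable ASCII character
    uint8_t  num_vertices;      // tops out at 56 in the simplex set
    int8_t   left_column;       // as small as -15 in the simplex set
    int8_t   right_column;      // as large as +15 in the simplex set
    uint16_t coordinate_index;  // where this glyph's vertices start in the big array
} GLYPH_t;

// ASCII 32 (space) sits at index [0], so a look-up is simply glyphs[c - 32]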

Note that these sampled data values only represent the two subsets, roman plain and roman simplex. Using any of the other styles will have different values. Just for completeness, here are the statistics for the entire occidental glyph set:

Statistic           Value   Character
------------------- ------  ---------
Max vertices        143     3323
Max left            0       197
Min left            -41     907
Max right           41      907
Min right           0       197
Max character width 82      907
Max x               41      907
Min x               -41     907
Max y               41      907
Min y               -48     2411
Max dx              40      796
Min dx              -29     2825
Max dy              78      2405
Min dy              -80     2411
------------------- ------
Total vertices      47,465

Just looking at the total number of vertices, and remembering that each vertex will require a minimum of two bytes of storage, we see that this little device with its 62K of flash memory will not be big enough to render every one of these characters without adding an external memory device of some sort; 47,465 vertices at two bytes apiece is over 90K before any code is written. So for now, I’ll content myself with the plain and simplex roman variations.

The vertex encoding gets tantalizingly close to a single byte per coordinate pair. However, I also want to encode the ‘pen up’ information, which I use to distinguish ‘move to’ and ‘draw to’ commands. If I felt like running histograms on these data sets, I might be able to see a further pattern or trend that would allow me to use a look-up table for these values. But I am going to leave that as an exercise for you, my Dear Reader. I have to draw the line somewhere.

So it looks like our benefactor, Dr. Hershey, was on to something when he originally encoded his coordinates as pairs of single digits. I’m not going to use his precise technique, although it will still end up as 16 bits of data per vertex. I’m just folding the out-of-band ‘pen up’ condition into each coordinate pair.

Reviewing the summary, it looks like our friend character 907 is bringing home all the gold medals. It’s the ‘very large circle’ glyph, and I’m going to disqualify it for being an outlier. This is the one that broke my Python script and simplistic transmission encoding. It’s a lovely pentacontagon, or fifty-sided polygon, and therefore the smoothest of the approximated circles in the repertory.

Statistic           Value   Character
------------------- ------  ---------
Max vertices        143     3323
Max left            0       197
Min left            -27     2411
Max right           24      2381
Min right           0       197
Max character width 46      992
Max x               22      906
Min x               -24     2411
Max y               39      2403
Min y               -48     2411
Max dx              40      796
Min dx              -29     2825
Max dy              78      2405
Min dy              -80     2411
------------------- ------
Total vertices      47,415

So for the vertex array, each vertex will be a typedef’d struct holding the x and y coordinates as signed integers, as well as a boolean ‘pen up’ flag to distinguish ‘move to’ from ‘draw to’. Since the x axis shows a slightly smaller range of values, I’ll squeeze the ‘pen up’ flag into the x side, perhaps like this:

typedef struct { // vertex data
    int         x:7;        // x coordinate
    PEN_UP_t    pen_up:1;   // 'pen up' flag
    int         y:8;        // y coordinate
} VERTEX_t;
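To make the flag earn its keep, the plotting loop treats a ‘pen up’ vertex as a ‘move to’ and anything else as a ‘draw to’. Something along these lines (a sketch; draw_line() stands in for whatever the OLED driver actually provides, and the y flip assumes screen coordinates grow downward):

extern void draw_line(int x0, int y0, int x1, int y1);  // hypothetical OLED line routine

// Walk one glyph's vertex list, placing its origin at (ox, oy) on the screen.
static void plot_glyph(const VERTEX_t *v, int count, int ox, int oy)
{
    int px = ox, py = oy;               // current pen position
    for (int i = 0; i < count; i++) {
        int nx = ox + v[i].x;           // font x grows to the right
        int ny = oy - v[i].y;           // font y grows up, screen y grows down
        if (v[i].pen_up) {
            px = nx;                    // 'move to': reposition without drawing
            py = ny;
        } else {
            draw_line(px, py, nx, ny);  // 'draw to': line from the previous point
            px = nx;
            py = ny;
        }
    }
}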

So I’ll need to add some more to my little Python script to generate the data for these two arrays, then emit it in a close approximation of my C coding style.

It took a bit of fiddling and some back-and-forth to get the data structures ‘just right’, but I was able to port over both the plain and simplex roman character sets and have them plot on the OLED screen. One thing that tripped me up was the vertex count: the original definition file’s ‘vertex count’ included the left and right column data as an additional vertex, and it also counted, as it should, the ‘PEN_UP’ codes. These two little deviations that I introduced into the True Form sure made things look weird on the little screen for a while. But I eventually realized the error of my ways and corrected the code. Now it runs through either the plain set or the simplex set with the greatest of ease. Drawing a single character at a time happens so quickly that it seems almost instantaneous. I’ll have to try printing out a whole screen of text and see if I can tell how long that takes.

Next I’ll need to see about scaling these ‘scalable’ fonts to fit my imagined sizes for the different formats I’d like to support. I also need to look at the big-blocky font I suggested previously.

Posted on Leave a comment

Notes on RISC-V Assembly Language Programming – Part 17

11 February 2025

Now I can focus on compacting the vector data for the glyphs I need for the project. But first, I have to identify them. This has already been done many times in the past by many people, but I feel that I have to do it myself. Unless I change my mind, which is something I can totally do.

A clever collection of interesting Hershey font information has been published by Paul Bourke:

https://paulbourke.net/dataformats/hershey/

Included in this archive, dated 1997, are two files, romanp.hmp (Roman Plain) and romans.hmp (Roman Simplex). These files contain the ASCII mapping data for the plain and simplex varieties, respectively.

The ‘plain’ subset consists of the smaller glyphs. There are no lower case versions (minuscules); the upper case glyphs (majuscules) are repeated in their stead. Some statistics I gathered from the plain subset include:

Statistic       Value   Character
--------------  -----   ---------
Max vertices    38      1225 {
Max width       17      1273 @
Max x           7       1273 @
Min x           -6      1246 ~
Max y           10      1223 [
Min y           -10     1223 [
                -----
Total vertices  764

These glyphs can be encoded with 4 bits for the x coordinate and 5 bits for the y coordinate.

The ‘simplex’ subset contains the larger glyphs, including upper and lower case, numerals and punctuation. They are also much more detailed. Here are the same statistics from the simplex set:

Statistic       Value   Character
--------------  -----   ---------
Max vertices    56      2273 @
Max width       30      613  m
Max x           11      613  m
Min x           -11     613  m
Max y           16      607  g
Min y           -16     719  $
                -----
Total vertices  1303

These larger glyphs can be encoded using 5 bits for the x coordinate and can almost squeeze the y coordinate into 5 bits… almost. A signed 5-bit field spans -16 to +15, and the simplex y values run from -16 all the way up to +16, so that topmost value misses by one.

So far we’ve only been using absolute coordinates for these mappings. I wonder how much space we could save by using a relative distance from point to point? Start with an absolute coordinate and then just specify relative motion along each axis?
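Just to pin down what I mean, the transformation itself is trivial (a sketch that ignores the ‘pen up’ bookkeeping; the names are made up):

#include <stdint.h>

// Convert absolute coordinates to deltas: the first vertex stays absolute,
// each later one stores its offset from the previous vertex.
static void delta_encode(const int8_t x[], const int8_t y[], int n,
                         int8_t dx[], int8_t dy[])
{
    if (n == 0) {
        return;
    }
    dx[0] = x[0];                   // anchor point stays absolute
    dy[0] = y[0];
    for (int i = 1; i < n; i++) {
        dx[i] = x[i] - x[i - 1];    // relative motion along each axis
        dy[i] = y[i] - y[i - 1];
    }
}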

For the plain set, we get these statistics for relative distances:

Statistic   Value   Character
---------   -----   ---------
Max dx      10      809  \
Min dx      -12     1246 ~
Max dy      20      1223 [
Min dy      -20     1223 [

For the simplex set, we get these numbers:

Statistic   Value   Character
---------   -----   ---------
Max dx      18      724 -
Min dx      -18     720 /
Max dy      32      720 /
Min dy      -32     733 #

So the answer is no: the relative values have a greater range than the absolute values. I find this result entirely counter-intuitive, though in hindsight it makes sense: a single stroke can jump from one extreme to the other, as the ‘[’ does when it hops from y = -10 to y = +10 for a dy of 20, double the absolute span.