Gas station without pumps

2017 August 15

Beacon detector SPI interface

Filed under: Robotics,Uncategorized — gasstationwithoutpumps @ 12:53
Tags: , , , ,

In Beacon detector board, I introduced the hardware for the IR beacon detector, and in Digital tone detection with Goertzel’s algorithm, I discussed the algorithm used for detecting a 2kHz signal in a noisy background.  In this post I’ll talk about the software for interfacing the board to microcontrollers as if it were a standard peripheral device.

When I was designing the hardware, I had to choose between providing a UART interface, an SPI interface, or an I2C interface. At one point, I considered including all three, since I had just enough pins to do that and plenty of board space for connectors.

My son urged me to use a UART interface, since asynchronous communication is much easier work with than the two synchronous interfaces—there are no tight timing constraints. UART communications tends to be slow (115200 baud is a common top speed, resulting in a maximum of about 11.5–12.8kbytes/second transferred), and the two sides must agree on the baud rate to within 2.5%. But there are no latency constraints—you can respond to a request whenever you are finally ready—so there are no worries if you need to wait for a timer task to complete before communicating. The main downside of a UART interface (besides the usually low speed) is that it is a one-to-one interface. Two pins are needed for each UART a microcontroller uses, while the synchronous interfaces provide shared bus communication.

I knew that I could do a UART interface and for the small amounts of data I needed to communicate (10–75 bytes every 60th of a second, or 600–4500 bytes/s) it would be fast enough. But I wanted a challenge that would help me learn something new, so I decided to do a synchronous interface. Of the two choices supported by the Teensy LC hardware, SPI and I2C, the SPI interface looked the simpler to implement in software, as the timing seemed a little less challenging, and there was a FIFO that could be loaded with several bytes of information to transfer, reducing the timing demands. (In retrospect, I2C may have been easier—I’ve not tried it, though, so there may be some unexpected challenges.)

I created the board with only one I/O port (other than the USB port used for programming the board), the SPI0 4-wire interface using pins CS=D10=PTC4, MOSI=D11=PTC6, MISO=D12=PTC7, SCK=D13=PTC5.  I used a 6-pin right-angle header at the edge of the board, so that I would be able to use a ribbon connector to connect to another microcontroller. The order of the pins is the same as Digilent uses for their PMOD connector—an arbitrary choice, but there does not seem to be a commonly accepted standard for SPI connections.

The choice of pins was the first mistake I made in the SPI design—I should have used SPI1 instead of SPI0.  In reading the reference manual for the KL26 chip that is the basis for the Teensy LC, the SPI FIFO is only implemented for SPI1, and not for SPI0.  Also, the legal SPI clock speeds are different for the two interfaces:  SPI0 is clocked off the bus clock (default 24MHz on the Teensy LC) and SPI1 off the system clock (default 48MHz). The external clock SCK is allowed to go from 0Hz to ¼ the module clock, so 6MHz for SPI0 and 12MHz for SPI1.

The lack of a FIFO on SPI0 made the timing for loading the transmit buffer a little tricky.  The hardware has two registers for the transmit: the shift register used for serializing a byte and the transmit buffer that holds the next byte to transfer to the serial register.  If the transmit buffer is not loaded in time for the serial shifting, then the SPI interface will duplicate the previous byte, messing up communication.

The Teensyduino library contains SPI master software, but not SPI slave software.  I was wondering about the reason for that before I started my coding, but I quickly realized why—the timing for SPI is almost entirely under control of the master and the slave is constrained to respond at times specified by the master.  The most common SPI protocols for peripherals consist of the master sending a 1-byte (or longer) command to the slave and getting a response back in the next byte slot(s).  This gives the slave about half a clock period to interpret the command and prepare the response. With a 1MHz SCK, that gives the processor about 500ns to set up the response. That’s plenty of time for an all-hardware system, but is a very tight constraint for a software implementation.  In fact, given that the interrupt latency on the KL26 can be about 800ns, it is an impossible constraint—I don’t even know that I need to send a byte, much less figure out what it is, in the available time.  The more complicated I2C interface allows a slave to stretch out the clock when it can’t respond in time, but the SPI interface doesn’t allow the slave to change the clock rate.

So I couldn’t use the most common SPI interface protocol of command followed by immediate response. I had a few choices of workarounds for the latency problem:

  • Have a standardized packet of material that is sent independent of what the master does.  This seems to be the solution used by SPI slave libraries others have written for the Teensy boards (not part of the standard Teensyduino library).  It allows high throughput, especially if DMA is used to load the transmit information, but is not very flexible.  It is not a commonly used protocol on SPI peripherals.
  • Require the master to wait before asking for the response bytes.  This is very flexible and allows the SPI bus to run at maximum speed during transfers, but is not really in the spirit of a synchronous interface—the slave should respond at a standardized time, and the master should not have to know how long it will take the slave to be ready to send.
  • Send a dummy byte in the slot after the request so that I have a full byte time (8 SCK periods) instead of half a bit time to get the right information ready. A disadvantage is that the throughput is reduced—the dummy bytes take up bus time without communicating useful information. If the protocol calls for a 1-byte command and a 1-byte response, then each transfer would take 3 byte times.  (A 1-byte command and an n-byte response takes n+2 byte times.)

I ended up using a slight variant of the dummy-byte approach: I allow the master to send new commands before the first response comes back.  Dummy bytes are only added when there are no real bytes to send, so a series of n commands requesting 1-byte responses could be sent in a row, with the total time for the transfer taking n+2 byte times—the first two slots of which the slave is sending dummy bytes, and the next n are the requested data.

To make this protocol work, I had to ensure that the SPI interrupt priority is higher than any other interrupts in the program, in particular, higher than the PIT timer that I was using for the 15kHz sampling frequency.  This means that the SPI transmissions will introduce jitter into the ADC sampling—I’ve not yet determined how much of a problem this will be, but I suspect it won’t be much of a problem, since it will only affect one or two samples in the 250-sample filter block.  The default interrupt priority on the KL26 processor in the Teensy LC is for all interrupts to be at the highest priority, so I had to lower the PIT priority rather than raising the SPI priority.

The basic notion is that I keep the SPI transmit buffer always full.  When there is a transmit-buffer-empty interrupt, I load the buffer from a software FIFO I maintain, or with a dummy byte (0xff) if there is nothing in the FIFO.  The responses to a command are loaded into the FIFO.  As long as the SPI interrupt response (including interpreting the command and any FIFO loading or popping) takes less than the time of one byte, the protocol runs smoothly.  With commands that can send up to 4-byte responses, I was able to run SCK up 1.25MHz without losing synchronization.

My current, minimal command set for the beacon detector has the following commands:

  1. Freeze the information from filter so that all subsequent commands refer to the same set of data.  Return 1 on success, 0 on failure.  (The failure is only possible if the request is sent too soon after a reset of the beacon detector board.)
  2. Report which channel has the highest power (1-byte: 0xff if no beacon detected, otherwise 0..7)
  3. Report the estimated angle of the beacon in 0.1° units (2-bytes: 0..3599)
  4. Report the power in the max-power channel (4-bytes: unsigned int)

I could also report the power in specific channels and the power from an alternative (3kHz instead of 2kHz) filter for each channel, which would be another 64 bytes of information, but I’ve not implemented those commands yet.  The minimal set I currently have responds with 8 bytes of information, so takes 10 byte times for a burst of communication. With the highest clock rate I’ve gotten to work (SCK=1.25MHz), a full packet takes 64µs, which would interfere with only one or two samples of the 15kHz sampling.

I may play around with different command sets after I get another board programmed as a master to communicate with the beacon detector and determine what information I really need from the beacon detector.  One possibility is to use just a single 2-byte response that encodes everything but the power: that would allow a 4-byte-time packet (25.6 µs).

Setting up the SPI software ran into all the usual problems with setting up a peripheral on a Kinetis ARM processor—there are many registers that need to be set up correctly, and nothing works until they are all right:

  • Turn on the SPI module clock: SIM_SCGC4 |= SIM_SCGC4_SPI0
  • Set the SPI control registers SPI0_C1 and SPI0_C2 for 4-wire SPI with CPOL=1, CPHA=1, 8-bit mode, interrupt on receiving a byte or on transmit buffer empty.
  • Set up the isr to interpret the recieved bytes and keep the transmit buffer full.                                           
  • Configure the 4 pins used to be ALT 2 (SPI0) rather than the default GPIO.
  • Set the priorities using NVIC_SET_PRIORITY(IRQ_SPI0, 0) and NVIC_SET_PRIORITY(IRQ_PIT_CH1, 64).  The KL26 processors have only 4 interrupt priority levels that are multiples of 64.  Lower numbers are higher priority.
  • Enable the interrupts with NVIC_ENABLE_IRQ(IRQ_SPI0)

I debugged the SPI interface using the protocol and logic analyzer tools in the Analog Discovery 2.  I found the user documentation for these tools really awful (almost no information), but I managed to get things working by trial and error (except for the “sensor” script options for the SPI protocol tool—that always resulted in the spinning beachball and force-quitting Waveforms 2015).  I initially tested with the “Master” tab of the protocol tool, but I fairly quickly switched to using the “Custom” tool which allowed me to set up more complicated sequences of reads and writes, though eventually I simplified back to writes that could be done with the “Master” tab.

The “debug” mode of the protocol tool sets up the logic analyzer to see the signals, but then doesn’t provide the decoding of the protocol in the protocol tool—one has to go to the logic analyzer to set up the triggering, switch to the protocol tool to send the stimulus, then switch back to the logic analyzer to see the results.  It is possible to get the protocol tool and the logic analyzer tool in different windows (double-click on the tab for the tool you want in a new window), but my screen is too small for showing both windows in sufficient resolution at once.

To see when the interrupt service routine for the SPI was active I added CORE_PIN1_PORTSET = CORE_PIN1_BITMASK to the beginning of the isr and CORE_PIN1_PORTCLEAR = CORE_PIN1_BITMASK, and used some wirewrap wire to get the pin 1 signal out to the logic analyzer. I’d not added test points for the unused pins, but because the Teensy board is socketed with female headers, it was easy to wrap a wire around the pin before pushing the board into the female headers, and use a loose header pin to connect the other end of the wire to DIO6 of the logic analyzer.

Here is a result of a test of sending the 4 commands in a row in an 11-byte frame, to make sure that the dummy bytes occur in the right places (in the first 2 and last slots of the 11-byte frame):

The MISO line shows the correct responses from the beacon detector: 0xff, 0xff, 1 (freeze ok), 0 (channel 0), 0x00e1=225=22.5°, power=0x08436020=138633248.

As it turns out, the end of the isr for the 4-byte response comes 0.4µs later than it should if there had not been any other bytes already in the queue. Indeed, if I start with the “4” command, things just barely work, because the isr routine starts a little sooner. The 6.8µs delay to the end of the isr routine suggests to me that the maximum safe rate for SCK is only about 1.15MHz, not 1.25MHz.  I’ll have to do a lot of testing with many data packets, to see what rate is really safe without intermittent errors.  That will have to wait until I’ve written a program to act as the master (unless I can figure out why the “sensor” script keeps crashing Waveforms 2015).

1 Comment »

  1. […] much of it to cut away for the ball launcher(s), which I’ve not designed.  I also have the digital beacon detector done, though I think I want to retweak the code for it, as I messed it up trying to get more […]

    Pingback by Progress report on mechatronics | Gas station without pumps — 2017 November 15 @ 19:27 | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: