BlueNRG-2 wake up problem
2018-08-01   Piotr Romaniuk, Ph.D.
Protocol over SPI
The first test
Going to sleep and waking up - observation
Statistical testing of the problem
Further problem analysis
How to recover from the problem?
The device consists of an STM32L4xx host processor and a BlueNRG-2 chip, which communicate over SPI.
The STM32L4xx is responsible for the main device functions, while the BlueNRG-2 provides wireless communication in the Bluetooth Low Energy standard. The BlueNRG-2 works in the Network Coprocessor configuration. It executes
software provided by the chip manufacturer (STMicroelectronics) with a small modification.
The DTM project was used (from the software development kit BlueNRG-1_2 DK 2.6.0_2, project: Project/BLE_Examples/DTM); the upgrade functionality was removed from the original code to keep things simple.
The project was compiled in Atollic TrueSTUDIO for STM32 9.0.0.
The host processor runs its own code, based on an example from the Development Kit for that configuration (i.e. communication over SPI with the BlueNRG-2).
During long, regular communication with the BlueNRG-2 over SPI a timeout appears: the chip does not respond to a command that was sent (various library functions return BLE_STATUS_TIMEOUT = 0xFF).
Statistically the problem occurs about once per hour of correct communication, but it can appear after 15 minutes or after 2 hours.
The instants at which the problem appears seem to be non-deterministic. The timeout status code is
returned from a function that had worked before. After the problem occurs, further communication with the BlueNRG-2
is not possible: every library function called on the host processor (STM32L4xx) returns a timeout.
The structure of the software is illustrated in the following diagram:
Fig. 1. Software structure including split between processors.
The host part of the Bluetooth LE Library is a kind of adapter that connects
the library API (functions provided to the application layer) with the SPI communication.
The adapter encapsulates communication over the hardware interface.
This part of the library is not very complex; it mainly
repacks function arguments into messages sent and received on the serial interface.
The communication is implemented as a dialog in REQ-ACK fashion, plus events sent asynchronously.
Inside the BlueNRG-2 (Cortex-M0 processor) there is a Transport Layer, an analogous adapter on
the other side of the SPI link. It is not very complex either. The main functionality of the Bluetooth LE stack
is in a layer provided in binary form - the Bluetooth LE ST Library (controller part).
The chip also contains a small piece of software that puts it to sleep
when nothing is being sent. It also controls the wake-up process, which is triggered by a hardware request or when the host processor signals
that it needs to transmit a message.
Protocol over SPI
A detailed description of the protocol is in the Development Kit (Docs/SPI_protocol_specification/SPI_protocol_specification.html).
The main features are:
- 5 hardware lines (CS, SCK, MISO, MOSI, IRQ)
- the host processor is the SPI master
- the BlueNRG is the slave
- the IRQ line has two functions: acknowledgement and signalling a send request
- if the host processor has a message to send, it activates CS (low level) and waits for an acknowledgement, a high level on the IRQ line, which means that the BlueNRG is awake and ready to exchange messages
- if the BlueNRG needs to send something to the host processor, it signals the request by raising the IRQ line
and waits for the transmission
- each message starts with a header that encodes the direction, the amount of data and
the remaining capacity of the receive buffer on the BlueNRG side
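The host-initiated handshake from the list above (activate CS, then wait for a high level on IRQ) can be sketched roughly as follows. This is only an illustration, not the code from the Development Kit: the `spi_link_hw_t` hooks, `spi_link_request()` and the `sim_*` helpers are hypothetical names, and the simulated hardware exists only so that the sketch can be run on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hardware hooks; a real port would map them to GPIO and timer code. */
typedef struct {
    void     (*cs_set)(bool active);   /* true = drive CS low (active) */
    bool     (*irq_is_high)(void);     /* current level of the IRQ line */
    uint32_t (*now_us)(void);          /* free-running microsecond counter */
} spi_link_hw_t;

/* Activate CS and wait for the acknowledgement on IRQ.
 * Returns true if IRQ went high within timeout_us.
 * On timeout CS is released and false is returned. */
bool spi_link_request(const spi_link_hw_t *hw, uint32_t timeout_us)
{
    uint32_t start = hw->now_us();

    hw->cs_set(true);
    while (!hw->irq_is_high()) {
        /* Unsigned subtraction stays correct across counter wrap-around. */
        if ((uint32_t)(hw->now_us() - start) >= timeout_us) {
            hw->cs_set(false);
            return false;
        }
    }
    return true;
}

/* --- Simulated hardware, only to exercise the sketch off-target --- */
static uint32_t sim_time_us;     /* advances by 1 us on every read */
static uint32_t sim_ack_at_us;   /* time at which the simulated slave raises IRQ */
static bool     sim_cs_active;

static void     sim_cs_set(bool active) { sim_cs_active = active; }
static bool     sim_irq_is_high(void)   { return sim_cs_active && sim_time_us >= sim_ack_at_us; }
static uint32_t sim_now_us(void)        { return sim_time_us++; }

const spi_link_hw_t sim_hw = { sim_cs_set, sim_irq_is_high, sim_now_us };
```

With `sim_ack_at_us` set to a value that is never reached, `spi_link_request()` returns false; this is the situation that the library reports as a timeout on its API.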
The first test
The timeout is observed on the library API, between the application code and the host part of the BLE library.
This is only a manifestation of the problem, observed at the top level.
But where does the problem arise, and what is its source?
A brief analysis of the source code points to a missing BlueNRG response to CS activation.
To confirm that observation, the logic signals on SPI were recorded with a logic analyser.
Fig. 2. Captured problem case in which the timeout appears.
Although CS has been activated, there is no response from the BlueNRG. No acknowledgement is observed on the IRQ line -
the line stays low. The conclusion is that the problem is in the BlueNRG.
Going to sleep and waking up - the observation
According to the schematics, the CS line is connected to pin IO11 (WAKE_UP) of the BlueNRG chip and
can wake it up with a low level.
A general diagram of signal routing inside the chip can be found in the BlueNRG-2 documentation, chapter 3.5.1 Reset management.
Analysis of the source code leads to the conclusion that the chip goes into sleep (deep sleep)
and the processor state (registers etc.) is stored in RAM.
The wake-up process is very similar to a hardware reset - the processor starts executing
instructions from the beginning (HandlerReset). The reason for the restart can be determined
by reading two hardware registers (see function SysCtrl_GetWakeupResetReason()).
The values of these registers make it possible to distinguish between a wake-up and a hardware reset.
This way the DTM code detects that the chip has been woken up and then restores the processor state,
which makes it possible to continue the program from the point where it went to sleep.
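The decision made at start-up can be illustrated with a small sketch. It is not the implementation of SysCtrl_GetWakeupResetReason(): the two arguments stand for the two hardware registers, and their encoding (zero or non-zero) is purely an assumption made for illustration, as are the names `boot_kind_t` and `classify_boot()`.

```c
#include <stdint.h>

typedef enum {
    BOOT_HARDWARE_RESET,   /* start from scratch */
    BOOT_WAKEUP            /* restore the state saved in RAM and continue */
} boot_kind_t;

/* Hypothetical encoding, assumed for this sketch only:
 *  - reset_reason  is non-zero after a real hardware reset,
 *  - wakeup_reason is non-zero when a wake-up source (e.g. an IO pin) fired. */
boot_kind_t classify_boot(uint32_t reset_reason, uint32_t wakeup_reason)
{
    if (reset_reason == 0u && wakeup_reason != 0u) {
        return BOOT_WAKEUP;
    }
    return BOOT_HARDWARE_RESET;
}
```

In a start-up routine of this kind, the BOOT_WAKEUP branch would restore the saved registers and jump back to the sleeping point, while the other branch would run the normal initialization.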
Unfortunately the hardware debug port is disabled when the processor is in sleep mode, so JTAG cannot be
used to examine the processor state when the problem is detected.
Further testing was performed with other methods, mainly by signalling selected events on a hardware pin.
The sleep state of the processor can be detected indirectly, just by checking the level of a pin.
This is possible because the chip has pull-down resistors and stops driving the pins in the sleep state. Figure 3 shows two analogous cases (with the same messages on SPI):
a correct one and an erroneous one. Channel 0 of the recording reflects the sleep mode because
of the mentioned pull-down resistors: a low level means that the BlueNRG is in sleep mode.
Fig. 3. Comparison of correct and erroneous case.
Since the problem is related to the sleep state and to waking up, the question arises:
why is the wake-up usually correct, but sometimes not? What is the factor that distinguishes
these two cases? The purpose is to understand the problem and to propose a work-around
if the problem cannot be solved.
Measurements of several cases showed that the difference lies in the time between going to sleep
and the request to wake up. Multiple recordings were made to check this observation.
In all problem cases this time was equal to 30us.
The correct cases had various times, smaller or larger than 30us (30us was
recorded once) - see figure 4.
The problem arises when the wake-up request comes 30us
after the BlueNRG-2 went to sleep.
Fig. 4. Statistics of the time between going to sleep and the wake-up request
(red dot=error, blue=correct behavior).
Further testing of the problem
To understand the problem better, software tracing was performed.
This was done with markers inserted into the code. Execution of a marker is signalled on
an external pin of the chip. Because only a limited number of hardware pins can be used for this purpose,
a serial interface was used.
Each marker was assigned a different character, which was transmitted.
The markers were inserted at key locations in the software, and the serial line was recorded together
with the SPI lines by the logic analyser.
Additionally, interrupts (execution of the ISR code) were shown on another pin.
No differences in program flow and no anomalies were observed between the correct and incorrect cases.
Fig. 5. Testing the program flow in the neighbourhood of the problem
(3=entering the state-saving procedure just before going to sleep, 4=chip goes into sleep, 10=wake-up request,
5=restoring the state after waking up, but before interrupts are enabled)
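The marker mechanism described above can be sketched as follows. This is a minimal illustration, not the code used in the tests: `trace_init()`, `trace_mark()` and the byte sink are hypothetical names, and the capture buffer stands in for the serial port so that the sketch can run on a PC. On the target the sink would write one byte to the UART transmit register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte sink: on target it would push one byte to the UART. */
typedef void (*trace_sink_t)(uint8_t byte);

static trace_sink_t trace_sink;

void trace_init(trace_sink_t sink)
{
    trace_sink = sink;
}

/* Emit one marker byte; cheap enough to place at key locations in the code. */
static inline void trace_mark(uint8_t id)
{
    if (trace_sink != NULL) {
        trace_sink(id);
    }
}

/* --- Capture buffer standing in for the serial line (PC-only helper) --- */
static uint8_t captured[16];
static size_t  captured_len;

static void capture_sink(uint8_t byte)
{
    if (captured_len < sizeof captured) {
        captured[captured_len++] = byte;
    }
}

/* One sleep/wake cycle, using the marker numbers from Fig. 5. */
void example_sleep_cycle(void)
{
    trace_init(capture_sink);
    trace_mark(3);   /* entering the state-saving procedure */
    trace_mark(4);   /* chip goes into sleep */
    trace_mark(5);   /* state restored after waking up */
}
```

Comparing the recorded marker sequence of a correct and an erroneous cycle is then a matter of comparing two short byte strings.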
How to recover from the problem?
The easiest way is to perform a reset. The RESET pin is connected to the host processor anyway
and can be controlled by it.
If the host detects the timeout, it must reset the chip and re-configure it (set up services, characteristics etc.).
This process can be reduced to tens of milliseconds, which can be acceptable given the specific features of
Bluetooth LE (time scale, etc.).
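A minimal sketch of this recovery path is shown below. The value of BLE_STATUS_TIMEOUT is the one quoted earlier; BLE_STATUS_SUCCESS = 0x00 is assumed here, and `ble_recovery_t`, `ble_call_with_recovery()` and the `sim_*` helpers are hypothetical names introduced only for illustration. The simulated chip lets the sketch run on a PC.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLE_STATUS_SUCCESS 0x00u   /* assumed value for this sketch */
#define BLE_STATUS_TIMEOUT 0xFFu   /* value quoted in the article */

typedef uint8_t ble_status_t;
typedef ble_status_t (*ble_call_t)(void *arg);

/* Hypothetical recovery hooks provided by the application. */
typedef struct {
    void         (*pulse_reset)(void);   /* toggle the RESET pin of the BlueNRG-2 */
    ble_status_t (*reconfigure)(void);   /* set up services, characteristics etc. */
} ble_recovery_t;

/* Run a library call; on timeout reset the chip, re-configure it and retry once. */
ble_status_t ble_call_with_recovery(const ble_recovery_t *r, ble_call_t call, void *arg)
{
    ble_status_t status = call(arg);

    if (status != BLE_STATUS_TIMEOUT) {
        return status;
    }
    r->pulse_reset();
    status = r->reconfigure();
    if (status != BLE_STATUS_SUCCESS) {
        return status;
    }
    return call(arg);   /* single retry after recovery */
}

/* --- Simulated chip, only to exercise the sketch off-target --- */
static int  sim_resets;
static bool sim_stuck;   /* true = chip locked in the sleep state */

static void         sim_pulse_reset(void) { sim_resets++; sim_stuck = false; }
static ble_status_t sim_reconfigure(void) { return BLE_STATUS_SUCCESS; }
static ble_status_t sim_command(void *arg)
{
    (void)arg;
    return sim_stuck ? BLE_STATUS_TIMEOUT : BLE_STATUS_SUCCESS;
}

const ble_recovery_t sim_recovery = { sim_pulse_reset, sim_reconfigure };
```

The retry is deliberately limited to one attempt, so a chip that stays unresponsive after the reset still surfaces as an error to the application.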
A more advanced solution may consist in detecting the problematic moment and skipping over it.
The sleep state can be signalled with the method mentioned above - using the pull-down resistors.
A hardware timer in the host processor can be used to measure the time since the chip went to sleep.
The value of the timer would be checked just before CS activation. If the value is close to
the critical 30us, a small delay (e.g. 10us) would be added before CS is activated.
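The check made just before CS activation could look roughly like this. It is a sketch under stated assumptions: the 30us value and the 10us delay come from the text above, while the width of the guard window (GUARD_US) and all names are hypothetical and would have to be tuned on real hardware.

```c
#include <stdint.h>

#define CRITICAL_US    30u   /* problematic sleep-to-wake-up time from the measurements */
#define GUARD_US        5u   /* assumption: half-width of the window treated as risky */
#define EXTRA_DELAY_US 10u   /* example delay mentioned in the article */

/* Given the time elapsed since the BlueNRG-2 went to sleep (read from a
 * hardware timer), return the extra delay to insert before activating CS. */
uint32_t wake_guard_delay_us(uint32_t since_sleep_us)
{
    const uint32_t window_start = CRITICAL_US - GUARD_US;
    const uint32_t window_end   = CRITICAL_US + GUARD_US;

    if (since_sleep_us >= window_start && since_sleep_us <= window_end) {
        return EXTRA_DELAY_US;
    }
    return 0u;
}
```

With these values a request that would land 25us to 35us after the chip went to sleep is postponed to 35us to 45us, that is, at least 5us away from the critical point.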
Please note that the mechanism that locks the BlueNRG in the sleep state is not known.
I suppose that it may be related to a hardware error in synchronization inside the wake-up controller of
the chip. This may cause a hazard that manifests itself at this 30us point.
It is interesting that 30us is very close to one period of the slow clock (SXTAL) in the BlueNRG.
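As a quick check of that coincidence, assuming the slow clock runs at the nominal 32.768 kHz of a typical low-speed crystal (the frequency is an assumption here; it was not measured in these tests), one period is:

```latex
T_{\mathrm{slow}} = \frac{1}{f_{\mathrm{slow}}}
                  = \frac{1}{32768\ \mathrm{Hz}}
                  \approx 30.5\ \mu\mathrm{s}
```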
The problem described here was posted at community.st.com, and I also tried to report it directly to STMicroelectronics.
Unfortunately, the manufacturer of the chip showed no interest and offered no help.
I regret that, because the BlueNRG is a very promising chip with many features.
 Manufacturer site of the BlueNRG-2 chip
- there is documentation of the chip
 Development Kit - Bluetooth LE Library with examples for BlueNRG, provided by ST Microelectronics
 Discussion on community.st.com, with posts in chronological order as they appeared during the analysis of the problem