0xWS2812 STM32 driver for WS2812(B) RGB LEDs
0xWS2812 pronounced "hex-WS2812"
This code aims at providing a basic interface to the WS2812(B) individually addressable RGB LEDs by WorldSemi.
The code outputs 16 parallel data streams to 16 parallel strings of LEDs. This allows the MCU to drive a large number of LEDs in a fairly memory-efficient way, although RAM size is still the limiting factor for the number of LEDs that can be driven.
The number of LEDs that can be driven by this library can be approximated by the following formula (it won't be exactly that many as the library needs some RAM, too): Number of LEDs = (RAM size in bytes / 48) * 16
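The formula can be written as a small helper. This is only a sketch: the function name is made up for illustration and is not part of the library, and the result is an upper bound because the library needs some RAM of its own.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the library): upper bound on the
 * number of LEDs for a given amount of RAM available to the frame buffer.
 * 16 LEDs (one per channel) need 48 bytes, as in the formula above. */
static uint32_t ws2812_max_leds(uint32_t ram_bytes)
{
    return (ram_bytes / 48u) * 16u;
}
```

For example, 8192 bytes of RAM would give (8192 / 48) * 16 = 2720 LEDs with integer division.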
Whaaa? This code is crap and incomplete! WTF did you think calling this a library?!
Calm your finite state machines.
This code is a work in progress, and I admit that in its current form it's a bit of a hassle to adapt it to different MCUs. I will get around to working more on this when I have time, as I'm also busy with school finals. If you find a bug, please report it and, if you can and are willing to, provide a fix.
At some point this might actually become a full grown library that supports STM32F100, STM32F4 etc.
Why does this library exist?
Due to the non-standard NRZ protocol used to control these LEDs the correct timing of the data stream is very important and is not easily achievable with standard MCU peripherals like SPI/USART/I2C.
How does it work?
The approach used here is similar to the approach of the OctoWS2812 library for the Teensy.
This library makes use of the output compare features of the STM32's general-purpose timers and the DMA (Direct Memory Access) controller. The DMA controller transfers data from memory to a peripheral register, in this case a GPIO port, quickly and without involving the CPU. The CPU can therefore prepare the next frame while the current frame is still being transmitted.
The idea to create 16 parallel 800kBit/s data streams is the following:
- Use a timer to create an 800kHz time base and a DMA request every 1.25us.
- Use 2 compare modules to create DMA requests at the low bit time (350ns) and the high bit time (700ns).
- Step 1: the 1.25us DMA request sets all bits of the GPIO port high.
- Step 2: the 350ns DMA request transfers the data from the frame buffer to the GPIO port. If the bit is a 0, the GPIO pin will go low, otherwise it will stay high.
- Step 3: the 700ns DMA request sets all GPIO pins low.
- Repeat steps 1 to 3 until all bits have been transmitted.
This creates a stream of pulses with a pulse period of 1.25us and a pulse width of either 350ns or 700ns depending on the bit value the pulse represents.
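The three port writes per bit period can be modelled on a PC to check the idea. The following is a rough host-side sketch, not driver code; the function names are invented for illustration.

```c
#include <stdint.h>

/* Host-side model of one 1.25us bit period on the 16-bit GPIO port.
 * phase[0] = port value after the write at the start of the period,
 * phase[1] = port value after the 350ns write of the frame buffer half word,
 * phase[2] = port value after the 700ns write. */
static void ws2812_model_bit_period(uint16_t frame_word, uint16_t phase[3])
{
    phase[0] = 0xFFFF;      /* all pins high */
    phase[1] = frame_word;  /* pins carrying a 0 go low, the others stay high */
    phase[2] = 0x0000;      /* all pins low */
}

/* Pulse width in ns seen on one channel (0..15) during that bit period. */
static unsigned ws2812_model_high_ns(uint16_t frame_word, unsigned channel)
{
    uint16_t phase[3];
    ws2812_model_bit_period(frame_word, phase);
    return ((phase[1] >> channel) & 1u) ? 700u : 350u;
}
```

With a frame buffer half word of 0x0001, channel 0 sees a 700ns pulse (a 1) while channel 1 sees a 350ns pulse (a 0).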
Transferring the data via DMA to the GPIO port means that per 16 LEDs one half word (two bytes) is needed per bit. At 24 bits per LED that makes 24 half words (48 bytes) per 16 LEDs.
The frame buffer is transmitted MSBit first in the order G-R-B.
How do I use it?
Currently you have to fill the frame buffer with 24 half words (48 bytes) per 16 LEDs, i.e. per LED position across the 16 strips, and then call WS2812_sendbuf(24 * number of LEDs per strip).
Licensing
This code is licensed under the MIT License, see LICENSE for more info.
/* 0xWS2812 16-Channel WS2812 interface library
*
* Copyright (c) 2014 Elia Ritterbusch, http://eliaselectronics.com
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/

#include <stm32f10x.h>

/* this define sets the number of TIM2 overflows to append to the data frame
 * for the LEDs to load the received data into their registers */
#define WS2812_DEADPERIOD 19

uint16_t WS2812_IO_High = 0xFFFF;
uint16_t WS2812_IO_Low = 0x0000;

/* transmission complete flag: 1 = idle, 0 = transmission in progress */
volatile uint8_t WS2812_TC = 1;
volatile uint8_t TIM2_overflows = 0;

/* WS2812 framebuffer
 * buffersize = (#LEDs / 16) * 24 */
uint16_t WS2812_IO_framedata[ 48 ];

/* Array defining 12 color triplets to be displayed */
uint8_t colors[ 12 ][ 3 ] =
{
    { 0xFF, 0x00, 0x00 },
    { 0xFF, 0x80, 0x00 },
    { 0xFF, 0xFF, 0x00 },
    { 0x80, 0xFF, 0x00 },
    { 0x00, 0xFF, 0x00 },
    { 0x00, 0xFF, 0x80 },
    { 0x00, 0xFF, 0xFF },
    { 0x00, 0x80, 0xFF },
    { 0x00, 0x00, 0xFF },
    { 0x80, 0x00, 0xFF },
    { 0xFF, 0x00, 0xFF },
    { 0xFF, 0x00, 0x80 }
};

/* simple delay counter to waste time, don't rely on for accurate timing */
void Delay( __IO uint32_t nCount )
{
    while ( nCount-- )
    {
    }
}

void GPIO_init( void )
{
    GPIO_InitTypeDef GPIO_InitStructure;

    // GPIOA Periph clock enable
    RCC_APB2PeriphClockCmd( RCC_APB2Periph_GPIOA, ENABLE );

    // GPIOA pins WS2812 data outputs
    GPIO_InitStructure.GPIO_Pin = 0xFFFF;
    GPIO_InitStructure.GPIO_Mode = GPIO_Mode_Out_PP;
    GPIO_InitStructure.GPIO_Speed = GPIO_Speed_50MHz;
    GPIO_Init( GPIOA, &GPIO_InitStructure );
}

void TIM2_init( void )
{
    TIM_TimeBaseInitTypeDef TIM_TimeBaseStructure;
    TIM_OCInitTypeDef TIM_OCInitStructure;
    NVIC_InitTypeDef NVIC_InitStructure;

    uint16_t PrescalerValue;

    // TIM2 Periph clock enable
    RCC_APB1PeriphClockCmd( RCC_APB1Periph_TIM2, ENABLE );

    // run the counter at 24 MHz: 30 ticks per 1.25us bit period
    PrescalerValue = (uint16_t) ( SystemCoreClock / 24000000 ) - 1;

    /* Time base configuration */
    TIM_TimeBaseStructure.TIM_Period = 29; // 800kHz
    TIM_TimeBaseStructure.TIM_Prescaler = PrescalerValue;
    TIM_TimeBaseStructure.TIM_ClockDivision = 0;
    TIM_TimeBaseStructure.TIM_CounterMode = TIM_CounterMode_Up;
    TIM_TimeBaseInit( TIM2, &TIM_TimeBaseStructure );

    TIM_ARRPreloadConfig( TIM2, DISABLE );

    /* Timing Mode configuration: Channel 1 */
    TIM_OCInitStructure.TIM_OCMode = TIM_OCMode_Timing;
    TIM_OCInitStructure.TIM_OutputState = TIM_OutputState_Disable;
    TIM_OCInitStructure.TIM_Pulse = 8; // 8 ticks at 24 MHz, approx. 350ns
    TIM_OC1Init( TIM2, &TIM_OCInitStructure );
    TIM_OC1PreloadConfig( TIM2, TIM_OCPreload_Disable );

    /* Timing Mode configuration: Channel 2 */
    TIM_OCInitStructure.TIM_OCMode = TIM_OCMode_PWM1;
    TIM_OCInitStructure.TIM_OutputState = TIM_OutputState_Disable;
    TIM_OCInitStructure.TIM_Pulse = 17; // 17 ticks at 24 MHz, approx. 700ns
    TIM_OC2Init( TIM2, &TIM_OCInitStructure );
    TIM_OC2PreloadConfig( TIM2, TIM_OCPreload_Disable );

    /* configure TIM2 interrupt */
    NVIC_InitStructure.NVIC_IRQChannel = TIM2_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init( &NVIC_InitStructure );
}

void DMA_init( void )
{
    DMA_InitTypeDef DMA_InitStructure;
    NVIC_InitTypeDef NVIC_InitStructure;

    RCC_AHBPeriphClockCmd( RCC_AHBPeriph_DMA1, ENABLE );

    // TIM2 Update event
    /* DMA1 Channel2 configuration ----------------------------------------------*/
    DMA_DeInit( DMA1_Channel2 );
    DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t) &GPIOA->ODR;
    // the DMA needs the address of the constant, not its value
    DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) &WS2812_IO_High;
    DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralDST;
    // the transfer count is set in WS2812_sendbuf()
    DMA_InitStructure.DMA_BufferSize = 0;
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Disable;
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Word;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
    DMA_InitStructure.DMA_Priority = DMA_Priority_High;
    DMA_InitStructure.DMA_M2M = DMA_M2M_Disable;
    DMA_Init( DMA1_Channel2, &DMA_InitStructure );

    // TIM2 CC1 event
    /* DMA1 Channel5 configuration ----------------------------------------------*/
    DMA_DeInit( DMA1_Channel5 );
    DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t) &GPIOA->ODR;
    DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) WS2812_IO_framedata;
    DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralDST;
    DMA_InitStructure.DMA_BufferSize = 0;
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Word;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
    DMA_InitStructure.DMA_Priority = DMA_Priority_High;
    DMA_InitStructure.DMA_M2M = DMA_M2M_Disable;
    DMA_Init( DMA1_Channel5, &DMA_InitStructure );

    // TIM2 CC2 event
    /* DMA1 Channel7 configuration ----------------------------------------------*/
    DMA_DeInit( DMA1_Channel7 );
    DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t) &GPIOA->ODR;
    DMA_InitStructure.DMA_MemoryBaseAddr = (uint32_t) &WS2812_IO_Low;
    DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralDST;
    DMA_InitStructure.DMA_BufferSize = 0;
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Disable;
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Word;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
    DMA_InitStructure.DMA_Priority = DMA_Priority_High;
    DMA_InitStructure.DMA_M2M = DMA_M2M_Disable;
    DMA_Init( DMA1_Channel7, &DMA_InitStructure );

    /* configure DMA1 Channel7 interrupt */
    NVIC_InitStructure.NVIC_IRQChannel = DMA1_Channel7_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init( &NVIC_InitStructure );
    /* enable DMA1 Channel7 transfer complete interrupt */
    DMA_ITConfig( DMA1_Channel7, DMA_IT_TC, ENABLE );
}

/* Transmit the framebuffer with buffersize number of half words to the LEDs
 * buffersize = (#LEDs / 16) * 24 */
void WS2812_sendbuf( uint32_t buffersize )
{
    // transmission complete flag, indicate that transmission is taking place
    WS2812_TC = 0;

    // clear all relevant DMA flags
    DMA_ClearFlag( DMA1_FLAG_TC2 | DMA1_FLAG_HT2 | DMA1_FLAG_GL2 | DMA1_FLAG_TE2 );
    DMA_ClearFlag( DMA1_FLAG_TC5 | DMA1_FLAG_HT5 | DMA1_FLAG_GL5 | DMA1_FLAG_TE5 );
    DMA_ClearFlag( DMA1_FLAG_HT7 | DMA1_FLAG_GL7 | DMA1_FLAG_TE7 );

    // configure the number of half words to be transferred by the DMA controller
    DMA_SetCurrDataCounter( DMA1_Channel2, buffersize );
    DMA_SetCurrDataCounter( DMA1_Channel5, buffersize );
    DMA_SetCurrDataCounter( DMA1_Channel7, buffersize );

    // clear all TIM2 flags
    TIM2->SR = 0;

    // enable the corresponding DMA channels
    DMA_Cmd( DMA1_Channel2, ENABLE );
    DMA_Cmd( DMA1_Channel5, ENABLE );
    DMA_Cmd( DMA1_Channel7, ENABLE );

    // IMPORTANT: enable the TIM2 DMA requests AFTER enabling the DMA channels!
    TIM_DMACmd( TIM2, TIM_DMA_CC1, ENABLE );
    TIM_DMACmd( TIM2, TIM_DMA_CC2, ENABLE );
    TIM_DMACmd( TIM2, TIM_DMA_Update, ENABLE );

    // preload counter with 29 so TIM2 generates UEV directly to start DMA transfer
    TIM_SetCounter( TIM2, 29 );

    // start TIM2
    TIM_Cmd( TIM2, ENABLE );
}

/* DMA1 Channel7 Interrupt Handler gets executed once the complete framebuffer has been transmitted to the LEDs */
void DMA1_Channel7_IRQHandler( void )
{
    // clear DMA7 transfer complete interrupt flag
    DMA_ClearITPendingBit( DMA1_IT_TC7 );
    // enable TIM2 Update interrupt to append 50us dead period
    TIM_ITConfig( TIM2, TIM_IT_Update, ENABLE );
    // disable the DMA channels
    DMA_Cmd( DMA1_Channel2, DISABLE );
    DMA_Cmd( DMA1_Channel5, DISABLE );
    DMA_Cmd( DMA1_Channel7, DISABLE );
    // IMPORTANT: disable the DMA requests, too!
    TIM_DMACmd( TIM2, TIM_DMA_CC1, DISABLE );
    TIM_DMACmd( TIM2, TIM_DMA_CC2, DISABLE );
    TIM_DMACmd( TIM2, TIM_DMA_Update, DISABLE );
}

/* TIM2 Interrupt Handler gets executed on every TIM2 Update if enabled */
void TIM2_IRQHandler( void )
{
    // Clear TIM2 Interrupt Flag
    TIM_ClearITPendingBit( TIM2, TIM_IT_Update );

    /* check if certain number of overflows has occurred yet
     * this ISR is used to guarantee a 50us dead time on the data lines
     * before another frame is transmitted */
    if ( TIM2_overflows < (uint8_t) WS2812_DEADPERIOD )
    {
        // count the number of occurred overflows
        TIM2_overflows++;
    }
    else
    {
        // clear the number of overflows
        TIM2_overflows = 0;
        // stop TIM2 now because dead period has been reached
        TIM_Cmd( TIM2, DISABLE );
        /* disable the TIM2 Update interrupt again
         * so it doesn't occur while transmitting data */
        TIM_ITConfig( TIM2, TIM_IT_Update, DISABLE );
        // finally indicate that the data frame has been transmitted
        WS2812_TC = 1;
    }
}

/* This function sets the color of a single pixel in the framebuffer
 *
 * Arguments:
 * row = the channel number/LED strip the pixel is in from 0 to 15
 * column = the column/LED position in the LED string from 0 to number of LEDs per strip
 * red, green, blue = the RGB color triplet that the pixel should display
 */
void WS2812_framedata_setPixel( uint8_t row, uint16_t column, uint8_t red,
                                uint8_t green, uint8_t blue )
{
    uint8_t i;
    // each pixel occupies 24 half words: 8 x green, 8 x red, 8 x blue, MSBit first
    for ( i = 0; i < 8; i++ )
    {
        // clear the data for pixel
        WS2812_IO_framedata[ ( ( column * 24 ) + i ) ] &= ~( 0x01 << row );
        WS2812_IO_framedata[ ( ( column * 24 ) + 8 + i ) ] &= ~( 0x01 << row );
        WS2812_IO_framedata[ ( ( column * 24 ) + 16 + i ) ] &= ~( 0x01 << row );
        // write new data for pixel
        WS2812_IO_framedata[ ( ( column * 24 ) + i ) ] |= ( ( ( ( green << i ) & 0x80 ) >> 7 ) << row );
        WS2812_IO_framedata[ ( ( column * 24 ) + 8 + i ) ] |= ( ( ( ( red << i ) & 0x80 ) >> 7 ) << row );
        WS2812_IO_framedata[ ( ( column * 24 ) + 16 + i ) ] |= ( ( ( ( blue << i ) & 0x80 ) >> 7 ) << row );
    }
}

/* This function is a wrapper function to set all LEDs in the complete row to the specified color
 *
 * Arguments:
 * row = the channel number/LED strip to set the color of from 0 to 15
 * columns = the number of LEDs in the strip to set to the color from 0 to number of LEDs per strip
 * red, green, blue = the RGB color triplet that the pixels should display
 */
void WS2812_framedata_setRow( uint8_t row, uint16_t columns, uint8_t red,
                              uint8_t green, uint8_t blue )
{
    // 16-bit counter so that strips with more than 255 LEDs terminate the loop
    uint16_t i;
    for ( i = 0; i < columns; i++ )
    {
        WS2812_framedata_setPixel( row, i, red, green, blue );
    }
}

/* This function is a wrapper function to set all the LEDs in the column to the specified color
 *
 * Arguments:
 * rows = the number of channels/LED strips to set the row in from 0 to 15
 * column = the column/LED position in the LED string from 0 to number of LEDs per strip
 * red, green, blue = the RGB color triplet that the pixels should display
 */
void WS2812_framedata_setColumn( uint8_t rows, uint16_t column, uint8_t red,
                                 uint8_t green, uint8_t blue )
{
    uint8_t i;
    for ( i = 0; i < rows; i++ )
    {
        WS2812_framedata_setPixel( i, column, red, green, blue );
    }
}

int main( void )
{
    uint8_t i;

    GPIO_init( );
    DMA_init( );
    TIM2_init( );

    while ( 1 )
    {
        // set two pixels (columns) in the defined row (channel 0) to the
        // color values defined in the colors array
        for ( i = 0; i < 12; i++ )
        {
            // wait until the last frame was transmitted
            while ( !WS2812_TC )
                ;
            // this approach sets each pixel individually
            WS2812_framedata_setPixel( 0, 0, colors[ i ][ 0 ], colors[ i ][ 1 ],
                                       colors[ i ][ 2 ] );
            WS2812_framedata_setPixel( 0, 1, colors[ i ][ 0 ], colors[ i ][ 1 ],
                                       colors[ i ][ 2 ] );
            // this function is a wrapper that achieves the same thing and tidies up the code
            //WS2812_framedata_setRow(0, 2, colors[i][0], colors[i][1], colors[i][2]);
            // send the framebuffer out to the LEDs (2 LEDs per strip * 24 half words)
            WS2812_sendbuf( 48 );
            // wait some amount of time
            Delay( 500000L );
        }
    }
}
Light_WS2812 library V2.0 – Part I: Understanding the WS2812
WS2812 LEDs are amazing devices – they combine a programmable constant current controller chip
with an RGB LED in a single package.
Each LED has one data input and one data output pin.
By connecting the data output pin to the data input pin of the next device,
it is possible to daisy chain the LEDs to theoretically arbitrary length.
Unfortunately, the single-line serial protocol is not supported by standard microcontroller periphery.
It has to be emulated by re-purposing suitable hardware
or by software timed I/O toggling, also known as bit-banging.
Bit-banging is the preferred approach on 8 bit microcontrollers.
However, this is especially challenging with low clock rates due to the relatively high data rate of the protocol.
In addition, there are many different revisions of data sheets with conflicting information about the protocol timing.
My contribution to this was the light_ws2812 library V1.0 for AVR and Cortex-M0, which was published a while ago.
A V2.0 rewrite of the lib was in order due to various reasons.
And, to do it right, I decided to reverse engineer and understand the WS2812 LED protocol to make sure the lib works on all devices.
As of now, there are two different revisions of the WS2812 on the market:
The original 6 pin WS2812(S) and the newer 4 pin WS2812B.
The data sheets can be downloaded from the website of world-semi, the original manufacturer, here and here.
The data transmission protocol itself is relatively simple:
a digital “1” is encoded as a long high-pulse, “0” as a short pulse on “Din”.
When the data line is held low for more than 50µs, the device is reset.
After reset, each device reads the first 24 bit (GRB 8:8:8) of data into an internal buffer.
All consecutive bits after the first 24 go through internal data reshaping
and are then forwarded via “Dout” to the next device.
The internal buffer is written to the PWM controller during the next reset.
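To make the forwarding behaviour concrete, here is a small host-side model of a chain after a reset. It is only a sketch of the protocol as described above; the function name and the byte-wise stream representation are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of a chain of devices after a reset: device n keeps the n-th
 * 24-bit word (G, R, B, 8 bits each) and forwards everything after it.
 * stream holds one byte per color in transmission order (G, R, B, G, ...).
 * latched[n] receives the 24 bits kept by device n as 0x00GGRRBB.
 * Returns the number of devices that received a complete 24 bits. */
static size_t ws2812_model_chain(const uint8_t *stream, size_t stream_len,
                                 uint32_t *latched, size_t num_devices)
{
    size_t n = 0;
    while (n < num_devices && (n + 1) * 3 <= stream_len) {
        const uint8_t *p = stream + n * 3;
        latched[n] = ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
        n++;
    }
    return n;
}
```

Sending seven bytes into a chain of four devices, for instance, fills the first two devices and leaves the other two unchanged, since the third device never sees a complete 24 bits.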
So far so good.
This is where things get confusing.
I copied the timing specification from both datasheets above.
As you can see, both devices have slightly different timing for the encoding of the “1”.
Furthermore, the tolerances for the “data transfer time” are completely different and are in conflict with the “voltage time”.
So what are the real tolerances and can we find a set of timing parameters that fits both devices?
Luckily there is a relatively easy way to probe the inner workings of the device:
When data is forwarded, it is passed through the internal reshaping mechanism.
Therefore we can exercise Din and verify the correct interpretation of the input data by comparing it to Dout.
To do this, I hooked a single WS2812 to an ATtiny85, which took the role of a signal generator.
I then monitored both Din and Dout with a Saleae logic analyzer.
There are some issues with aliasing, since the maximum sampling speed is only 24 MHz,
but the data seemed still sufficient to understand the WS2812.
In my first experiment I tried to determine the minimum time needed to reset the LED.
My program emitted blocks of 48 bits with increasing delay time in between the blocks.
As you can see above on the left side, all input data is forwarded to the output if the reset delay is too short.
Once a certain delay threshold is reached, a reset is issued
and data forwarding will only start after the first 24 bits, as seen on the right side.
For the WS2812 under test here, the minimum reset length was 8.95 µs, way below the specifications.
The suggested reset time of 50 µs is therefore more than sufficient to reset the LEDs.
On the other hand, it means that no more than 9 µs of idle time may occur during data transfer,
or a reset may mistakenly be issued.
In the next step I looked at the data timing itself.
The image above shows an exemplary measurement of input and reshaped output waveforms.
Both waveforms can be described by two parameters each:
The duration of the hi pulse and the total period.
I programmed the microcontroller to cycle through all possible pulse input combinations
between 62.5 ns (1 CPU cycle at 16 MHz) and 4 µs with a granularity of 62.5 ns.
You can find the code here.
My original intention was to perform an automatic evaluation of the captured data to create a shmoo plot.
However, I quickly noticed that the behavior was quite regular and instead opted to analyze the data manually.
One of the first observations was that the delay between the leading edge of the input pulse
and the leading edge of the output pulse, T_delay_in_out, was constant regardless of the timing of the input pulse.
The image above shows a variation of T_hi_in for a constant T_period_in.
The period length, called “total data transfer time” in the datasheet, was set to the specification value of 1250 ns.
As is obvious, there are only two states of the output signal:
A short pulse for a “0” and a long pulse for a “1”.
Even the shortest input pulse (62.5 ns) is identified as “0”,
while even the longest input pulse (1250-62.5=1187.5 ns) is identified as a “1”.
The threshold between “0” and “1” is somewhere between 563 and 625 ns.
The LED brightness changes accordingly, suggesting that the observations
from the output signal are indeed consistent with the internal state of the LED.
Next, I varied T_period_in.
When the period time of the input signal was much shorter than 1250 ns,
the WS2812 started to reject input pulses.
As can be seen for 333 ns, only about every fifth input pulse is replicated in the output pulses.
The shortest pulse period time where all input pulses appeared on the data output was 1063 ns.
Below that the input pulses were partially or fully rejected.
Above this threshold all input pulses were interpreted correctly
and the period of the output signal reflected the period of the input signal up to 9 µs when the reset condition was met.
This is an interesting observation, because it means that while there is a strict lower limit for the period time of the input signal,
there is no real upper limit. For practical purposes, this allows relaxed timing in the software driver.
The table above summarizes my findings from the WS2812 and WS2812B each.
It is possible that there are significant differences between production batches of both types,
therefore these numbers can only serve as a rough indication.
All timings seem to be a bit shorter on the WS2812.
This is consistent with the data-sheet which indicates a longer pulse time for the “1” on the WS2812B.
An interesting observation is that the timing values for both LEDs are multiples of a smaller number,
~208 ns for the WS2812B and ~166 ns for the WS2812.
It appears that the internal controller circuit is actually a clocked design – possibly realized by a small state machine.
This becomes much more obvious with the diagram above, which normalizes the timing to “WS2812 cycles”.
The internal WS2812 state machine only needs to sample the input twice per bit:
First, it waits for a rising edge of the input.
This will initiate the sequence above.
The input is latched again after cycle 2.
The voltage of the input pin at this point determines whether a ‘1’ or a ‘0’ is read.
Depending on whether the LED already has received 24 bits or not,
this value will either be loaded into an internal shift register or decide whether a 2 or 4 cycle ‘hi’ level signal is emitted.
The sequence ends after cycle 5 and repeats again with the next rising edge.
So, what did we learn from this?
- A reset is issued as early as at 9 µs, contrary to the 50 µs mentioned in the data sheet. Longer delays between transmissions should be avoided.
- The cycle time of a bit should be at least 1.25 µs, the value given in the data sheet, and at most ~9 µs, the shortest time for a reset.
- A “0” can be encoded with a pulse as short as 62.5 ns, but should not be longer than ~500 ns (maximum on WS2812).
- A “1” can be encoded with pulses almost as long as the total cycle time, but it should not be shorter than ~625 ns (minimum on WS2812B).
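These rules of thumb are easy to turn into a checker for a planned driver timing. The sketch below only encodes the limits from the list above; the names are made up, and since the measured values may vary between production batches, the result should be read as a plausibility check rather than a guarantee.

```c
#include <stdbool.h>

/* Limits from the summary above, in nanoseconds. */
#define WS_ZERO_HI_MIN_NS   62.5    /* shortest pulse that was tested */
#define WS_ZERO_HI_MAX_NS   500.0   /* maximum on WS2812 */
#define WS_ONE_HI_MIN_NS    625.0   /* minimum on WS2812B */
#define WS_PERIOD_MIN_NS    1250.0  /* data sheet value */
#define WS_PERIOD_MAX_NS    9000.0  /* roughly where a reset is issued */

/* True if the bit period is long enough and short enough to avoid a reset. */
static bool ws_period_ok(double period_ns)
{
    return period_ns >= WS_PERIOD_MIN_NS && period_ns < WS_PERIOD_MAX_NS;
}

/* True if a pulse of hi_ns should be read as "0" on both LED types. */
static bool ws_zero_ok(double hi_ns, double period_ns)
{
    return ws_period_ok(period_ns)
        && hi_ns >= WS_ZERO_HI_MIN_NS && hi_ns <= WS_ZERO_HI_MAX_NS;
}

/* True if a pulse of hi_ns should be read as "1" on both LED types. */
static bool ws_one_ok(double hi_ns, double period_ns)
{
    return ws_period_ok(period_ns)
        && hi_ns >= WS_ONE_HI_MIN_NS && hi_ns < period_ns;
}
```

A driver using 350 ns for “0” and 700 ns for “1” at a 1250 ns period passes both checks, while a 563 ns “0” pulse would be rejected as too close to the threshold.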
Light_WS2812 library V2.0 – Part II: The Code
After investigating the timing of the WS2812 protocol in the previous part,
the question is now how to use this knowledge for an optimized software implementation of a controller.
An obvious approach would be to use an inner loop that uses a switch statement
to branch into separate functions to emit either a “0” symbol or a “1” symbol.
But, as is often the case, there is another solution that is both more elegant and simpler.
The image above shows the timing of both the “0” and the “1” code.
The cycle starts at t0, the rising edge, for both symbols.
The output has to be set high regardless of the symbol.
At t1, the output has to be set to low for a “0” and can be unchanged for a “1”.
At t2 the output goes low for the “1”. Since it is already low for a “0” we can set the output to low,
regardless of the symbol.
Finally, at t3 the complete symbol has been sent and the output can be left unchanged.
So, in the end there is only one point in time where the output is influenced by the symbol type,
t1. Everything else remains unchanged.
This means that special case handling can be limited to a very small part of the code.
This is what I ended up with in AVR assembler code:
    ldi   %0,8        // Loop 8 times for one byte
loop:
    out   %2,%3       // [01] - t0 Set output Hi
...wait1...
    sbrs  %1,7        // [02/03] - Skip t1 if bit 7 is set
    out   %2,%4       // [03] - t1 Set output Low
...wait2...
    lsl   %1          // [04] - Shift out next bit
    out   %2,%4       // [05] - t2 Set output Low
...wait3...
    dec   %0          // [06]
    brne  loop        // [08] - t3 Loop
This code outputs one byte of data, which has to be loaded into %1 (The C compiler will take care of this).
Since the protocol sends data msb first, bit 7 is tested. If it is “1”, the out instruction at t1 is skipped.
That’s it, as simple as that, only 7 instructions needed in the inner loop.
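For readers who do not speak AVR assembler, the control flow of the loop can be modelled in portable C. This is only a structural sketch: plain C gives no cycle-accurate timing, so it is not a replacement for the assembler code, and the callback type and trace helper are hypothetical names used for illustration.

```c
#include <stdint.h>

/* Hypothetical pin-access callback: level is 1 for high, 0 for low. */
typedef void (*ws_set_pin_fn)(int level, void *ctx);

/* Same structure as the assembler loop: one conditional write at t1,
 * unconditional writes at t0 and t2. */
static void ws_send_byte(uint8_t data, ws_set_pin_fn set_pin, void *ctx)
{
    for (int i = 0; i < 8; i++) {
        set_pin(1, ctx);          /* t0: output high for both symbols */
        /* wait1 */
        if (!(data & 0x80)) {
            set_pin(0, ctx);      /* t1: output low only for a "0" */
        }
        /* wait2 */
        data <<= 1;               /* next bit, msb first */
        set_pin(0, ctx);          /* t2: output low for both symbols */
        /* wait3 */
    }
}

/* Example callback that just counts the writes, for testing on a PC. */
struct ws_trace {
    int writes;      /* total number of pin writes */
    int low_writes;  /* number of writes that set the pin low */
};

static void ws_trace_pin(int level, void *ctx)
{
    struct ws_trace *t = (struct ws_trace *)ctx;
    t->writes++;
    if (!level) {
        t->low_writes++;
    }
}
```

Sending 0x80 through the trace callback produces 23 writes: two for the leading “1” and three for each of the seven “0” bits.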
What is left now is to correct the timing. To do that, nops have to be inserted at positions wait1..wait3.
As shown in the previous part, the most critical timing is that of the “0”
where the delay between t0 and t1 may not exceed 500 ns.
The minimum achievable delay, when no nops are inserted at wait1, is two cycles.
This equals 500 ns at 4 MHz and less at higher clock speeds.
All other timings may exceed the minimum timing required from the data sheet.
This means that even this simple loop is able to control WS2812 LEDs at only 4 MHz!
This is quite an achievement, since it was previously considered to be difficult to control WS2812 LEDs even at 8 MHz.
Note that the 500 ns is safe on the WS2812B, but may be critical on the WS2812(S). It worked with my devices, though.
To make the final implementation as flexible as possible, I
opted to calculate the exact number of nops to insert at compile time from the F_CPU define,
which is usually set to the CPU clock speed in the AVR-GCC toolchain.
You can find the implementation here. The C-code tries to adjust the timing according to the following rules,
which considers at least 150 ns margin for both the WS2812 and the WS2812B timing:
ns < t1-t0 <= ns
ns <= t2-t0
ns <= t3-t0
The outer loop is implemented in pure C, since it can be safely assumed not to take more than 5 µs. This way maximum flexibility is retained.
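The conversion from a time in nanoseconds to a number of CPU cycles is the core of that calculation. The sketch below shows the arithmetic with ordinary functions; it is not the library's implementation, which does this at compile time from F_CPU, and all names are hypothetical.

```c
#include <stdint.h>

/* Smallest number of CPU cycles that lasts at least ns nanoseconds. */
static uint32_t ws_ns_to_cycles_ceil(uint32_t f_cpu_hz, uint32_t ns)
{
    uint64_t num = (uint64_t)f_cpu_hz * ns;
    return (uint32_t)((num + 999999999ull) / 1000000000ull);
}

/* Largest number of CPU cycles that lasts at most ns nanoseconds. */
static uint32_t ws_ns_to_cycles_floor(uint32_t f_cpu_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)f_cpu_hz * ns) / 1000000000ull);
}

/* Number of nops needed so that a code section which already takes
 * fixed_cycles lasts at least target_ns. */
static uint32_t ws_nops_needed(uint32_t f_cpu_hz, uint32_t target_ns,
                               uint32_t fixed_cycles)
{
    uint32_t want = ws_ns_to_cycles_ceil(f_cpu_hz, target_ns);
    return (want > fixed_cycles) ? (want - fixed_cycles) : 0;
}
```

At 4 MHz, 500 ns corresponds to exactly two cycles, which matches the minimum delay between t0 and t1 mentioned above. Taking the loop as 8 cycles without nops, as the cycle counts in the listing suggest, a 1250 ns bit period needs 12 additional cycles in total at 16 MHz and none at 4 MHz.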
OctoWS2811 LED Library
OctoWS2811 is a high performance WS2811 & WS2812 & WS2812B LED library,
written by Paul Stoffregen, featuring simultaneous update to 8 LED strips
using efficient DMA-based data transfer (technical details below).
Minimal CPU impact and double buffering allow complex animation.
A VideoDisplay example is included, capable of scaling to extremely large LED installations
using multiple Teensy 3.0 or 3.1 boards with a frame sync signal for precise refresh timing.
Download : OctoWS2811 (Version 1.2)
WS2811 Idle Power
When the LED is off, the WS2811 chip consumes approximately 0.9 mA of current.
For battery powered LEDs, this current can easily drain the battery.
A P-channel MOSFET transistor or similar switch may be needed to disconnect power from the LEDs,
if the battery remains connected when the LEDs are not in use.
This circuit was recommended by David Beaudry to manage the power.
David also tested Vishay SUP75P03-07-E3 and SI4465ADY-T1-E3 transistors, which are able to power more LEDs.
OctoWS2811 Technical Details
OctoWS2811 is designed for highly efficient data output to WS2811-based LEDs,
able to scale to very large LED arrays.
The WS2811 requires very specific waveform timing. Each LED uses 24 bits,
each 1.25 µs, for a total of 30 µs per LED in the strip.
Driving 8 LED strips simultaneously allows each strip to be only 1/8th the length.
All LEDs update 8X faster than driving only a single long strip.
1000 LEDs can be updated in 3.8 ms, which allows a theoretical update rate of 240 Hz.
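The update time follows directly from the bit timing. Here is a minimal sketch of the arithmetic, assuming the LEDs are spread evenly over the strips; the function name is made up, and the reset gap between frames is not included.

```c
#include <stdint.h>

#define WS_BIT_TIME_NS   1250u  /* 1.25 µs per bit */
#define WS_BITS_PER_LED  24u

/* Time in ns to shift out one frame when total_leds are spread evenly
 * over num_strips strips that are driven in parallel.
 * The reset gap between frames is not included. */
static uint64_t ws_frame_time_ns(uint32_t total_leds, uint32_t num_strips)
{
    uint32_t leds_per_strip = (total_leds + num_strips - 1u) / num_strips;
    return (uint64_t)leds_per_strip * WS_BITS_PER_LED * WS_BIT_TIME_NS;
}
```

For 1000 LEDs on 8 strips this gives 125 LEDs per strip and 3.75 ms of data; with a 50 µs reset gap added, that comes to the 3.8 ms quoted above. A single strip of 1000 LEDs would need 30 ms for the same data.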
The VideoDisplay example implements a Frame Sync signal,
allowing many Teensy 3.0 boards to work together, each driving 1000 LEDs.
The boards precisely synchronize their update, even if the USB delivers data to the many boards with some varying latency.
Fast update times are preserved when scaling up to extremely large LED arrays.
OctoWS2811 uses Direct Memory Access (DMA) to create the WS2811 waveforms with nearly zero CPU usage.
Because the CPU is free and interrupts remain enabled, the processor is free to receive data
or perform computations in preparation for the next frame of display,
while the previous one is still being transferred to the LEDs.
The 8X faster update and free CPU time are the key differences between OctoWS2811 and other libraries,
which create the WS2811 waveforms for a single strip using carefully timed software.
DMA is a special hardware feature which allows data to be automatically
moved between memory and I/O registers in response to hardware events,
without any CPU usage (other than initially configuring parameters).
OctoWS2811 uses 3 DMA channels to synthesize the WS2811 waveforms.
The hardware events which trigger the DMA channels are a pair of PWM waveforms,
corresponding to the WS2811 bit low and high waveforms.
The rising edge (both PWM rise at the same moment) triggers channel #1,
which copies a fixed byte (0xFF) to an I/O register which sets all 8 output bits,
causing the WS2811 waveform to begin each bit.
The first falling edge triggers DMA channel #2, which copies one byte of the actual frame buffer data to all 8 pins.
The bits which are low transition to low at the correct time to create a zero bit to each WS2811 LED.
The bits that are high have no effect, because channel #1 already set all 8 pins high.
The 3rd DMA channel triggers at the second falling PWM edge, causing all the WS2811 bits to be written to zeros.
The pins which were left high by channel #2, become low, as required by the WS2811 timing for a one bit.
The pins which were already low are not changed.
Together, these 3 I/O updates create a WS2811 waveform automatically without any CPU activity.
The ARM-based chip from Freescale used on Teensy 3.0 has a crossbar switch and dual-bus RAM,
which allows the DMA and ARM CPU to work together very efficiently.