Smart Speaker
Objective and Design Philosophy
The smart speaker is intended as one of the primary elements of the SALE system. The Smart TV is one of two components that provide entertainment to the user, and with the Smart Mobile Device, it is also one of two that provide a large number of options for the user to interact with. The speaker is intended to provide high-quality audio playback, and is carefully designed to achieve this goal. It also interfaces with the Smart Hub to provide voice assistant functionality, with an inbuilt microphone that feeds the speech recognition functions on the hub.
Core Components
The system is built around the following hardware and software elements:
- Microcontroller: originally Raspberry Pi Pico W (RP2040 core), later Arduino Nano ESP32 (ESP32-S3 core)
- External DAC: LTC1655LCN8
- Audio Amplifier: LM384N
- Microphone with integrated preamp and compressor: Adafruit 1713
- Voltage Converter: LM2675N-5 and associated components
Implementation Strategy
The smart speaker is built on a custom single-board PCB design, which integrates several distinct subsystems. It includes a from-scratch 5V power supply circuit, a microphone, a discrete DAC and audio amplifier, and a microcontroller. The board was originally designed for a Raspberry Pi Pico WH, but due to time constraints, an Arduino Nano ESP32 was substituted in its place. This substitution allowed the use of the pre-existing Voice Assistant module of the open-source ESPHome software suite. The suite includes many existing software modules that let microcontrollers drive various pieces of hardware. However, as I discovered along the way, the primary hardware components I used in my design were not supported directly by ESPHome. These were the external DAC, which provides the speaker output, and the ADC integrated into both microcontrollers, which provides the speaker's microphone input. Without these components the speaker would not work, so I began writing drivers for them in the ESPHome ecosystem.
The ESP32 supports driving DAC components via I2S, a protocol in which a stream of signed integers is sent to compatible chips, and ESPHome already includes a driver for this. Unfortunately, the chip I selected does not provide an I2S interface, which meant I had to improvise. I2S has a few variants, one of which is almost identical to the SPI interface that my DAC provides. As ESPHome does not have any existing driver for continuously sending SPI data, I decided to build on the existing I2S driver instead. After a number of false starts, I tracked the remaining problem down to a signed/unsigned mismatch: the I2S driver sends signed samples, while the DAC expects unsigned values. I wrote a driver that performs the conversion. The driver also intercepts volume changes and implements them itself, since multiplying signed integers yields very different results from multiplying unsigned ones.
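The conversion and volume handling described above can be illustrated with a minimal Python sketch. It is not the ESPHome driver itself: the function names are hypothetical, and it assumes 16-bit samples and a DAC that takes an unsigned 16-bit code with silence at midscale.

```python
def signed_to_offset_binary(sample: int) -> int:
    """Map a signed 16-bit sample (-32768..32767) to an unsigned code (0..65535)."""
    if not -32768 <= sample <= 32767:
        raise ValueError("sample out of int16 range")
    # Adding 32768 is the same as flipping the top bit of the
    # two's-complement word: silence (0) lands at midscale (32768).
    return sample + 32768


def scale_then_convert(sample: int, volume: float) -> int:
    """Apply the volume to the signed sample first, then convert for the DAC."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    scaled = int(round(sample * volume))  # scale while zero still means silence
    return signed_to_offset_binary(scaled)
```

Scaling before the conversion keeps silence at midscale for every volume setting, which is one way to see why the volume step and the conversion have to live in the same driver.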
In addition to writing the custom DAC driver to perform these conversions and carefully configuring the I2S driver to behave in an SPI-compatible way, I also wrote a custom driver for the ADC built into the ESP32. Surprisingly, this was not yet supported by ESPHome, and it was significantly more complicated to write effectively, since it required me to understand the interfaces of the ESP32's DMA and ADC modules. However, I successfully wrote this custom code, and the data now streams successfully. From here, I used the ESPHome interface to configure and connect the voice assistant, based on other implementations. ESPHome supports over-the-air updates, as well as many other features, which I configured on the device.
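On the input side the conversion runs in the opposite direction: the ADC produces unsigned readings, while audio pipelines generally work with signed samples. The sketch below shows the arithmetic only. The 12-bit reading width, the default midpoint of 2048, and the function name are assumptions made for illustration, and the real DC bias of the microphone output would have to be measured.

```python
def adc_to_pcm16(raw: int, midpoint: int = 2048) -> int:
    """Convert an unsigned 12-bit ADC reading to a signed 16-bit PCM sample."""
    if not 0 <= raw <= 4095:
        raise ValueError("raw reading out of 12-bit range")
    centered = raw - midpoint  # remove the assumed DC bias so silence is 0
    scaled = centered * 16     # widen from 12 bits to 16 bits
    # Clamp in case the assumed midpoint is not exactly half of full scale.
    return max(-32768, min(32767, scaled))
```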
The smart speaker functions as a media player, capable of playing any audio saved to the hub. It allows volume control, as well as pausing and playing. It functions as a satellite for the voice assistant, streaming microphone audio to the hub for voice processing and playing back the responses. It has a wide dynamic range, with carefully selected speaker drivers to maximize the audio quality.
Engineering Challenges and Solutions
- Potentiometer had too high a total resistance, which added noise; swapped the potentiometer, though this damaged the PCB and required a second board
- Mis-routed power pins (VCC not connected); fixed with a jumper
- Complexity of Home Assistant streaming; fixed with ESPHome
- Inadequate ESPHome support for RP2040; migrated to ESP32-S3
- Missing ADC driver; wrote one
- Signed/unsigned confusion; wrote a custom driver
- Low speaker impedance limiting undistorted sound output; added a second speaker in series
Performance
The figure below shows how a sine wave is distorted when the signed-to-unsigned conversion, which both the ADC and DAC drivers need in order to work, is simply ignored. A tone with the correct fundamental frequency is still audible, but it is massively distorted.
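This failure mode can be reproduced numerically. The sketch below assumes 16-bit samples and a DAC that reads every word as an unsigned code; the helper names and the amplitude are chosen only for illustration.

```python
import math


def sine_int16(n: int, amplitude: int = 20000) -> list:
    """One period of a sine wave as n signed 16-bit samples."""
    return [int(round(amplitude * math.sin(2 * math.pi * i / n))) for i in range(n)]


def reinterpret_as_unsigned(sample: int) -> int:
    """The code an unsigned DAC sees when handed the raw two's-complement bits."""
    return sample & 0xFFFF


wave = sine_int16(8)
seen_by_dac = [reinterpret_as_unsigned(s) for s in wave]

# Split the codes by the sign of the original sample.
positive_half = [c for s, c in zip(wave, seen_by_dac) if s > 0]
negative_half = [c for s, c in zip(wave, seen_by_dac) if s < 0]
```

The positive half of the wave passes through unchanged in the lower part of the code range, while the negative half wraps around to the upper part. Each zero crossing therefore becomes a jump of nearly full scale, which preserves the period of the tone but adds heavy distortion.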

The figure below shows the same sine wave where the conversion to unsigned values is done properly, but the volume scaling is applied as though the values were still signed. In other words, it shows the result of multiplying the wave by a constant as though it were a signed integer when it has already been converted to unsigned.
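One plausible reading of this second failure mode is sketched below: the unsigned code is reinterpreted as a signed 16-bit value, multiplied by the volume, and written back as raw bits. The function names are hypothetical and the 16-bit width is an assumption. A version that scales about midscale is included for comparison.

```python
def as_int16(code: int) -> int:
    """Reinterpret a 16-bit pattern as a signed two's-complement value."""
    return code - 65536 if code >= 32768 else code


def volume_as_if_signed(code: int, volume: float) -> int:
    """Incorrect: scale an unsigned code as though it were still signed."""
    scaled = int(as_int16(code) * volume)
    return scaled & 0xFFFF  # written back to the DAC as raw bits


def volume_about_midscale(code: int, volume: float) -> int:
    """Correct for unsigned codes: scale the distance from midscale (32768)."""
    return 32768 + int(round((code - 32768) * volume))
```

At half volume the incorrect version sends silence (code 32768) to 49152 and the code just below it (32767) to 16383, so the output jumps by about half of full scale at every zero crossing. The midscale version leaves silence at 32768 regardless of the volume.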

