Bring choice to voice
Your voice never leaves your home and the processing is pretty accurate, but is hardware-intensive.
Your voice is processed on a private cloud, allowing Assist to run fast and very accurately on low-powered hardware.
Your voice is processed privately on Home Assistant Cloud, allowing Assist to run very accurately on low-powered hardware.
Fully open software, firmware, and hardware allow you to make it work best for your needs. It includes a Grove port for connecting sensors and a 3.5 mm headphone jack for connecting external speakers, and a dedicated community is customizing it and adding functionality.
Be part of building Assist and preview the future of voice control in the home.
- 84x84x21 mm
- 96 g
- 94x94x30 mm
- 120 g
- Injection-molded Polycarbonate plastic
- White and semi-transparent
- Multipurpose button
- Rotary dial for volume and other input
- Mute switch that physically cuts power to the microphone
- Internal speaker
- Internal dual-mic array
- 3.5mm audio output
- Grove port to connect sensors or other accessories
- Easy to open - no clips, only screws to access internals
- Exposed pads on PCB for modding
- ESP32-S3 SoC with 16 MB of FLASH storage
- 8 MB octal PSRAM
- XMOS XU316
- Featuring:
  - Echo cancellation
  - Stationary noise removal
  - Auto gain control
- Dedicated I2S lines for audio in and out
- USB-C, 5 V DC, 2 A
- 2.4 GHz Wi-Fi
- Bluetooth 5.0
- 3.5 mm (⅛”) stereo headphone jack
- Digital to analog converter (DAC):
  - TI AIC3204
  - 48 kHz sampling rate
- ESPHome preloaded
- Fully open-source firmware for both the ESP32 and XMOS chip
- Indoor use only
- 0 °C to 30 °C | 32 °F to 86 °F
- Humidity: non-condensing
- Keep in a dry, not excessively dusty environment, as moisture or dust can damage the unit
questions?
It is our vision to make open, local, and private voice assistants a reality in any language. While we have made great strides in realizing this, it is such a massive undertaking that we need our worldwide community to participate in its development. An essential ingredient for the community to drive the project forward is a standardized hardware platform for voice, built for Home Assistant from the ground up: Home Assistant Voice Preview Edition.
While for some, the current state of our voice assistant may be all they need, we think there is still more to do before it is ready for every home in every country, and until then, we’ll be selling this Preview of the future of voice assistants. Taking back our privacy isn’t for everyone - it’s a journey - and we want as many people as possible to join us and make it better.
Yes, provided your language is supported and you have hardware powerful enough to run local text-to-speech and speech-to-text models at a speed that is acceptable to you. Speech-to-text is the main limiting factor for many languages to run locally, as it has mixed results and often requires powerful hardware.
We recommend using at least an Intel N100 or equivalent processor; this will allow you to use OpenAI’s Whisper Base model for speech-to-text locally. This model runs reasonably fast for languages that have large public datasets to train on, such as English and Spanish. However, for languages with less data available, you will need Whisper’s Small or Large models, which require significantly more powerful hardware to run. For some languages, no public datasets yet exist for OpenAI to train local models on; until such datasets exist and models are trained on them, you will not be able to run those languages fully locally.
You do not need Home Assistant Cloud. However, if you are running less powerful hardware, like Home Assistant Green or a Raspberry Pi 4, we believe Home Assistant Cloud provides the best experience. It lets you use our cloud for speech processing, with better accuracy and faster performance than is possible on a low-powered device.
Additionally, some languages have poor or non-existent support by the local speech-to-text software we leverage (OpenAI’s Whisper), but are well-supported by the speech processing used by Home Assistant Cloud.
Home Assistant Cloud has been designed from the ground up to uphold the core values of the Home Assistant project, with privacy being one of our highest priorities. Home Assistant Cloud leverages the enterprise services of Microsoft Azure for its industry-leading speech processing, which, unlike many consumer offerings, is bound by commercial terms and conditions and does not retain or store your data. In addition, Home Assistant Cloud itself does not keep any record of your voice, data, or commands.
Three separate parts are needed for a language to be supported in local operation. Reliable, local speech-to-text models must be available to turn what is said into text commands that can be sent to Home Assistant. Home Assistant then needs to have sentence support for that language, so it knows which actions to perform for each command.
Finally, a local text-to-speech model has to be available for your language, so it can reply to you. If any of these three parts are not available locally, your language is not yet supported. Currently, there is one part that holds back our language support more than the others, and that’s local speech-to-text.
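The three-part requirement described above can be sketched as a simple check. This is only an illustration: the `LocalSupport` class, its field names, and the example values are hypothetical and do not reflect Home Assistant's actual code or real language coverage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalSupport:
    """Hypothetical record of which parts exist locally for one language."""
    speech_to_text: bool   # a reliable local speech-to-text model exists
    sentences: bool        # Home Assistant has sentence support for the language
    text_to_speech: bool   # a local text-to-speech model exists


def missing_parts(support: LocalSupport) -> list[str]:
    """Return the names of the parts that are not available locally."""
    parts = {
        "speech-to-text": support.speech_to_text,
        "sentences": support.sentences,
        "text-to-speech": support.text_to_speech,
    }
    return [name for name, available in parts.items() if not available]


def is_supported_locally(support: LocalSupport) -> bool:
    """A language is supported locally only if all three parts are available."""
    return not missing_parts(support)


# Made-up example: sentences and text-to-speech exist, speech-to-text does not.
example = LocalSupport(speech_to_text=False, sentences=True, text_to_speech=True)
print(is_supported_locally(example))  # False
print(missing_parts(example))         # ['speech-to-text']
```

Reporting the missing parts, rather than a bare yes or no, mirrors the point made above: one missing part (most often local speech-to-text) is enough to hold a language back.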
There can be a number of reasons why a language is not supported by Home Assistant Cloud, but most often it is because the sentences have not been translated by a member of our community. If you would like to help translate these sentences and have your language added, please visit here.
We need your help to improve or add support for your language. Through the help of our global community, ultimately, we aim to support every language possible. In our documentation, we have a list of various ways you can help us advance our open, local, and private voice assistant. Thank you for helping us make voice better in your language.
Yes, Home Assistant can be configured to use any speech-to-text integration that supports the Wyoming protocol. Currently, the only locally-run speech-to-text add-on available for Home Assistant users is OpenAI’s Whisper. It has mixed results, lacks some languages, and is hardware-intensive.
That led us to build an alternative: Rhasspy Speech, which can run locally and accurately on lower-power hardware, though it does not provide full speech-to-text capabilities. Based on the Rhasspy project, it is able to create local speech-to-text models, but is limited to predefined sentences aimed at controlling your home, and will not be able to process general speech. For instance, it could turn on a device, but will not be able to add something to your shopping list. We expect to share the first version of Rhasspy Speech during the next Voice livestream in 2025.
Out of the box, the device can listen for “Okay Nabu,” “Hey Jarvis,” or “Hey Mycroft” as wake words. This is provided by the on-device wake word engine called microWakeWord. Creating these wake words requires very powerful hardware and large datasets to train, which is not realistic for most users.
In time, we will work with the community to create more wake words, but for now we are focused on improving the existing wake words so they work across a wide variety of accents and voice registers.
A wake word should be uncommon in everyday conversations at home or in media, such as music or TV, to minimize the risk of the device activating unintentionally. “Nabu”, “Jarvis”, and “Mycroft” are fairly distinctive words, as opposed to generic terms such as “computer” or “assist”. That makes these microWakeWord models perform well for most users.
In the future, we intend to match and then surpass the Big Tech voice assistants, but for now, this Preview Edition cannot yet do everything those devices can. For some, the current capabilities of our voice assistant will be all they need, especially those who just want to set timers, manage their shopping list, and control their most used devices. For others, we understand they want to ask their voice assistant to make whale sounds or to tell them how tall Taylor Swift is - our voice assistant doesn’t do those things… yet.
Yes, if you plug an external speaker into the 3.5mm audio port. The built-in speaker is meant for voice feedback and is not optimized for listening to music, but the included DAC is capable of playing lossless audio on a suitable external speaker. We recommend using Music Assistant to control music playback.
Yes, if you have paid access to a supported cloud LLM or have a local LLM running on suitable hardware, it is possible to either fully replace our voice assistant’s conversation agent with an LLM or use it as a fallback for commands that Home Assistant does not understand natively.
The benefit of this is being able to ask nearly any query that comes to mind, and speak commands in natural language. Just note, we consider the use of AI in the smart home to be experimental, and would recommend caution when letting it control important aspects of your home. Get started with AI and Assist.
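The fallback arrangement described above, where an LLM only handles commands the native conversation agent does not understand, can be sketched as follows. This is a minimal illustration under stated assumptions: `respond`, `toy_native`, and `toy_llm` are hypothetical names, and none of this is Home Assistant's actual API.

```python
from typing import Callable, Optional

# Hypothetical stand-ins: a native agent returns a response, or None when it
# does not understand the command; the LLM agent always returns some text.
NativeAgent = Callable[[str], Optional[str]]
LlmAgent = Callable[[str], str]


def respond(command: str, native: NativeAgent, llm: Optional[LlmAgent] = None) -> str:
    """Try the native agent first; fall back to the LLM only if needed."""
    answer = native(command)
    if answer is not None:
        return answer
    if llm is not None:
        return llm(command)
    return "Sorry, I didn't understand that."


def toy_native(command: str) -> Optional[str]:
    """Toy matcher that only knows one predefined sentence."""
    if command.lower() == "turn on the kitchen light":
        return "Turned on the kitchen light."
    return None


def toy_llm(command: str) -> str:
    """Placeholder for a call to a cloud or local LLM."""
    return f"(LLM reply to: {command})"


print(respond("Turn on the kitchen light", toy_native, toy_llm))
print(respond("How tall is a giraffe?", toy_native, toy_llm))
```

In this sketch, fully replacing the conversation agent would amount to calling the LLM directly for every command, while the fallback ordering keeps known home-control commands on the native path.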
No, the device does not come with a USB-C charger and cable. Sustainability is a core value of the Home Assistant project, and we do not wish to send more chargers or cables into the world when most users already own enough of these.
You can find an overview of everything you can say to the device in our documentation. You may need to expose some devices manually to Assist, in order to let this device control them.