[HN Gopher] The era of open voice assistants
       ___________________________________________________________________
        
       The era of open voice assistants
        
       Author : _Microft
       Score  : 658 points
       Date   : 2024-12-20 00:29 UTC (22 hours ago)
        
 (HTM) web link (www.home-assistant.io)
 (TXT) w3m dump (www.home-assistant.io)
        
       | jfim wrote:
       | That's a pretty timely release considering Alexa and the Google
       | assistant devices seem to have plateaued or are on the decline.
        
         | IgorPartola wrote:
         | Curious what you mean by that.
        
           | oaththrowaway wrote:
           | For me the Alexa devices I own have gotten worse. Can't do
           | simple things (setting a timer used to be instant, now it
           | takes 10-15 seconds of thinking assuming it heard properly),
           | playing music is a joke (will try to play through Deezer even
           | though I disabled that integration months ago, and then will
           | default to Amazon Music instead of Spotify which is set as
           | the default).
           | 
           | And then even simple skills can't understand what I'm asking
           | 60% of the time. The first maybe 2 years after launch it
           | seemed like everything worked pretty good but since then it's
           | been a frustrating decline.
           | 
           | Currently they are relegated to timers and music, and it
           | can't even manage those half the time anymore.
        
             | interludead wrote:
             | That aligns with some of the frustration I've heard from
             | others. It's surprising (and disappointing) how these
             | platforms, which seemed to have so much potential early on,
             | have started to feel more like a liability.
        
             | lelag wrote:
             | It is, I think, a common feeling among Echo/Alexa users.
             | Now that people are getting used to the amazing
             | understanding capabilities of ChatGPT and the likes, it
             | probably increases the frustration level because you get a
             | hint of how good it could be.
             | 
             | I believe it boils down to two main issues:
             | 
             | - The narrow AI systems used for intent inference have not
             | scaled with the product features.
             | 
             | - Amazon is stuck and can't significantly improve it using
             | general AI due to costs.
             | 
             | The first point is that the speech-to-intent algorithms
             | currently in production are quite basic, likely based on
             | the state of the art from 2013. Initially, there were few
             | features available, so the device was fairly effective at
             | inferring what you wanted from a limited set of
             | possibilities. Over time, Amazon introduced more and more
             | features to choose from, but the devices didn't get any
             | smarter. As a result, mismatches between actual intent and
             | inferred intent became more common, giving the impression
             | that the device is getting dumber. In truth, it's probably
             | getting somewhat smarter, but not enough to compensate for
             | the increasing complexity over time.
             | 
             | The second point is that, clearly, it would be relatively
             | straightforward to create a much smarter Alexa: simply
             | delegate the intent detection to an LLM. However, Amazon
             | can't do that. By 2019, there were already over 100 million
             | Alexa devices in circulation, and it's reasonable to assume
             | that number has at least doubled by now. These devices are
             | likely sold at a low margin, and the service is free. If
             | you start requiring GPUs to process millions of daily
             | requests, you would need an enormous, costly
             | infrastructure, which is probably impossible to justify
             | financially--and perhaps even infeasible given the sheer
             | scale of the product.
             | 
             | My prediction is that Amazon cannot save the product, and
             | it will die a slow death. It will probably keep working for
             | years but will likely be relegated by most users to a
             | "dumb" device capable of little more than setting alarms,
             | timers, and providing weather reports.
             | 
             | If you want Jarvis-like intelligence to control your home
             | automation system, the vision of a local assistant using
             | local AI on an efficient GPU, as presented by HA, is the
             | one with the most chance of succeeding. Beyond the privacy
             | benefits of processing everything locally, the primary
             | reason this approach may become common is that it scales
             | linearly with the installation.
             | 
             | If you had a cloud-based solution using Echo-like devices,
             | the problem is that you'd need to scale your cloud
             | infrastructure as you sell more devices. If the service is
             | good, this could become a major challenge. In contrast, if
             | you sell an expensive box with an integrated GPU that does
             | everything locally, you deploy the infrastructure as you
             | sell the product. This eliminates scaling issues and the
             | risks of growing too fast.
        
               | freedomben wrote:
               | It seems ridiculous to me that this comment is so down
               | voted. It's a thoughtful and interesting comment, and
               | contains a reasonable and even likely explanation for
               | what we've seen, once one puts aside the notion that
               | Amazon is just evil, which isn't a useful way to think if
               | you truly want to understand the world and motivations.
               | 
               | I'm guessing people reflexively down vote because they
               | hate Amazon and it could read like a defense. I hate
               | Amazon too, but emotional voting is unbecoming of HN. If
               | you want emotional voting reddit is available and
               | enormous.
        
               | imiric wrote:
               | I didn't downvote it, but claiming that Echo/Alexa are
               | behind because of financial reasons is misguided at best.
               | 
               | Amazon is one of the richest companies on the planet,
               | with vast datacenters that power large parts of the
               | internet. If they wanted to improve their AI products
               | they certainly have the resources to do so.
        
               | thanksgiving wrote:
               | How do you justify to your manager to spend (and more
               | importantly commit to spending for a long time) hundreds
               | of millions of dollars in aws resources every year? Sure,
               | you already have the hardware but that's a different org,
               | right? You can't expect them to give you those resources
               | for free. Also, voice needs to be instant. You can't say
               | "Well, the AWS instances are currently expensive. Try
               | again when my spot prices are lower."
               | 
               | I am sure you know this but maybe some don't know that
               | basically only the hot word detection is on device. It
               | needs to be connected to the Internet for basically
               | everything else. It already costs Amazon.com some money
               | to run this infrastructure. What we are asking will cost
               | more and you can't really charge the users more. I
               | personally would definitely not sign up for a paid
               | subscription to use Amazon Alexa.
        
               | baq wrote:
               | Alexa is probably a cool billion under or something. They
               | never figured out how to make money with it.
        
               | gorbachev wrote:
               | Even the richest company in the world doesn't run
               | unprofitable projects forever.
               | 
               | Just see Killed by Google.
        
               | imiric wrote:
               | That depends on the company. There is precedent of large
               | companies keeping unprofitable projects alive because
               | they can make up for it in other ways, or it's good for
               | marketing, etc. I.e. the razor and blades business model.
               | 
               | Perhaps Echo/Alexa entice users to become Prime members,
               | and they're not meant to be market leaders. We can only
               | speculate as outsiders.
               | 
               | My point is that claiming that a product of one of the
               | richest companies on Earth is not as subjectively good as
               | the competition because of financial reasons is far-
               | fetched.
        
               | stavros wrote:
               | I think the economics here are wrong by orders of
               | magnitude. It doesn't make sense to deploy to the home an
               | expensive GPU that will sit idle 99% of the time, unless
               | running an LLM gets much cheaper, computationally. It's
               | much cheaper to run it on-premise and charge a
               | subscription, otherwise nobody would pay for ChatGPT and
               | would have an LLM rig at home instead.
        
               | lelag wrote:
               | You are right, but that's not my point. The point is that
               | it's difficult to scale cloud products that require lots
               | of AI workloads.
               | 
               | Here, home assistant is telling you: you can use your own
               | infra (most people won't) or you can use our cloud.
               | 
               | It works because most likely the user base will be rather
               | small and home assistant can get cloud resources as if it
               | was infinite on that scale.
               | 
               | If their product was amazing, and suddenly millions of
               | people wanted to buy the cloud version, they would have a
               | big problem: cloud infrastructure is never infinite at
               | scale. They would be limited by how much compute their
               | cloud provider is able/willing to sell them, rather than
               | how many of those small boxes they could sell, possibly
               | losing the opportunity to corner the market with a great
               | product.
               | 
               | If you package everything, you don't have that problem
               | (you only have the problem of being able to make the
               | product, which I agree is also not small). But in terms of
               | energy efficiency, it also does not have to be that bad: the
               | apple silicon line has shown that you can have very
               | efficient hardware with significant AI capabilities, if
               | you design a SOC for that purpose, it can be energy
               | efficient.
               | 
               | Maybe I'm wrong that the approach will become common, but
               | the fact that scaling AI services to millions of users is
               | hard stands.
        
               | stavros wrote:
               | But here you're assuming that your datacenter can't
               | provide you with X GPUs, but you can manufacture 100X,
               | which is dictated by 1% utilization.
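
The utilization arithmetic in this exchange can be sketched as follows. This is a back-of-the-envelope illustration, not anyone's real capacity model: the function names and the household count are hypothetical, and the only figure taken from the thread is the roughly 1% utilization of a home GPU. It also assumes requests are spread evenly over time, which ignores peak load.

```python
def pooled_gpus_needed(households: int, utilization_percent: int) -> int:
    """GPUs a shared pool needs if each household keeps one GPU busy
    utilization_percent of the time and load is spread perfectly evenly
    (an idealized assumption; real traffic has peaks)."""
    # Ceiling division using integers only, to avoid float rounding.
    return -(-households * utilization_percent // 100)


def local_to_pooled_ratio(households: int, utilization_percent: int) -> float:
    """How many times more GPUs the one-box-per-home approach ships."""
    return households / pooled_gpus_needed(households, utilization_percent)


if __name__ == "__main__":
    homes = 1_000_000  # hypothetical install base
    print(pooled_gpus_needed(homes, 1))
    print(local_to_pooled_ratio(homes, 1))
```

Under these assumptions the one-box-per-home approach ships about 100 times as many GPUs as a fully utilized pool would need, which is the "X versus 100X" comparison made above.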
        
               | IgorPartola wrote:
               | This is very well thought out but I think your premise is
               | a bit wrong. I have about a dozen Echos of various
               | generations in my house. The oldest one is the very
               | original from the preview stage. They still do everything
               | I want them to and my entire family still uses them daily
               | with zero frustration.
               | 
               | Local GPU doesn't make sense for some of the same reasons
               | you list. First, hardware requirements are changing
               | rapidly. Why would I spend say $500 on a local GPU setup
               | when in two years the LLM running on it will slow to a
               | crawl due to limited resources? Probably would make more
               | sense to rent a GPU on the cloud and upgrade as new
               | generations come out.
               | 
               | Amazon has the opposite situation: their hardware and
               | infra is upgraded en masse so different economies. Also
               | while your GPU is idling at 20-30W while you aren't home
               | they can have 100% utilization of their resources because
               | their GPUs are not limited to one customer at a time.
               | Plus they can always offload the processing by
               | contracting OpenAI or similar. Google is in an even
               | better position to do this. Running a local LLM today
               | doesn't make a lot of sense, but it probably will at some
               | point in like 10 years. I base this on the fact that the
               | requirements for a device like a voice assistant are
               | limited so at some point the hardware and software will
               | catch up. We saw this with smartphones: you can now go 5
               | years without upgrading and things still work fine. But
               | that wasn't the case 10 years ago.
               | 
               | Second, Amazon definitely goofed. They thought people
               | would use the Echos for shopping. They didn't. Literally
               | the only uses for them are alarms and timers, controlling
               | lights and other smart home devices, and answering trivia
               | questions. That's it. What other requirements do you have
               | that don't fall in this category? And the Echos do this
               | stuff incredibly well. They can do complex variations
               | too, including turning off the lights after a timer goes
               | off, scheduling lights, etc. Amazon is basically giving
               | these devices away but the way to pivot this is to
               | release a line of smart devices that connect to the
               | Echos: smart bulbs and switches, smart locks, etc. They
               | do have TVs which you can control with an Echo fairly
               | well (and it is getting better). An ecosystem of smart
               | devices that seamlessly interoperate will dwarf what HA
               | has to offer (and I say this as someone who is firmly on
               | HA's side). And this is Amazon's core competency:
               | consumer devices and sales.
               | 
               | If your requirement is that you want Jarvis, it's not the
               | voice device part of it that you want. You want what it
               | is connected to: a self driving car you can summon,
               | DoorDash you can order by saying "I want a pizza", a
               | phone line so it can call your insurance company and
               | dispute a claim on your behalf.
               | 
               | Now the last piece here is privacy and it's a doozy. The
               | only way to solve this for Amazon is to figure out some
               | form of encrypted computation that allows for your voice
               | prompts to be processed without them ever hearing clear
               | voice versions. Mathematically possible, practically not
               | so much. But clearly consumers don't give a fuck
               | whatsoever about it. They trust Amazon. That's why there
               | are hundreds of millions of these devices. So in effect
               | while people on HN think they are the target market for
               | these devices, they are clearly the opposite. We aren't
               | the thought leaders, we are the Luddites. And again I say
               | this as someone who wishes there was a way to avoid the
               | privacy issue, to have more control over my own tech,
               | etc. I run an extensive HA setup but use Echos for the
               | voice control because at least for now they are the best
               | value. I am excited about TFA because it means there
               | might be a better choice soon. But even here a $59 device
               | is going to have a hard time competing with one that
               | routinely goes on sale for $19.
        
             | IgorPartola wrote:
             | That's interesting because I have a bunch of Echos of
             | various types in my house and my timers and answers are
             | instant. Is it possible your internet connection is wonky
             | or you have a slow DNS server or congested Wi-Fi? I don't
             | have the absolute newest devices but the one in my bedroom
             | is the very original Echo that I got during their preview
             | stage, the one in my kitchen is the Echo Show 7" and I have
             | a bunch of puck ones and spherical ones (don't remember the
             | generations) around the house. One did die at one point
             | after years of use and got replaced but it was in my kids
             | room so I suspect it was subject to some abuse.
        
               | creeble wrote:
               | I too get pretty consistent response and answers from
               | Alexa these days. There has been some vague decline in
               | the quality of answers (I think sometime back they
               | removed the ability to ask for Wikipedia data), but have
               | no trouble with timers and the few linked wemo switches I
               | have.
               | 
               | I'm also the author of an Alexa skill for a music player
               | (basic "transport" control mostly) that I use every day,
               | and it still works the same as it always did.
               | 
               | Occasionally I'll get some freakout answer or abject
               | failure to reply, but it's fairly rare. I did notice it
               | was down for a whole weekend once; that's surely related
               | to staffing or priorities.
        
             | mrweasel wrote:
             | Amazon also fired a large number of people from the Alexa
             | team last year. I don't really think Alexa is a major
             | priority for Amazon at this point.
             | 
             | I don't blame them, sure there are millions of devices out
             | there, but some people might own five devices. So there
             | aren't as many users as there are devices and they aren't
             | making them any money once bought, not like the Kindle.
             | 
             | Frankly I know shockingly few people who use
             | Siri/Alexa/Google Assistant/Bixby. It's not that voice
             | assistants don't have a use, but it is a much, much smaller
             | use case than initially envisioned and there's no longer the
             | money to fund the development, the funds went into
             | blockchain and LLMs. Partly the decline is because it's not
             | as natural an interface as we expected, secondly: to be
             | actually useful, the assistants need access to control
             | things that we may not be comfortable with, or which may
             | pose a liability to the manufacturers.
        
           | bdavbdav wrote:
           | GH is basically abandonware at this stage it seems. They just
           | seem to break random things, and there haven't been any major
           | updates / features for ages (and Gemini is still a way off
           | for most).
        
             | cachvico wrote:
             | Google Home's Nest integration is recent and top-notch
             | though.
             | 
             | Hopefully in a year they'll have rolled out the Gemini
             | integration and things will be back on track.
        
           | lolinder wrote:
           | On the Google side it's become basically useless for anything
           | beyond interacting with local devices and setting timers and
           | reminders (in other words, the things that FOSS should be
           | able to do very easily). Its only edge over other options
           | used to be answering questions quickly without having to pull
           | out a screen, but now it refuses to answer anything (likely
           | because Google Search has removed their old quick answers in
           | favor of Gemini answers).
        
           | stickfigure wrote:
           | I was an early adopter of google home, have had several
           | generations (including the latest). I quite like the devices,
           | but the voice recognition seems to be getting worse not
           | better. And the Pandora integration crashes frequently.
           | 
           | In addition, it's a moron. I'm not sure it's actually gotten
           | dumber, but in the age of chatgpt, asking google assistant
           | for information is worse than asking my 2nd grader. Maybe it
           | will be able to quote part of a relevant web page, but half
           | the time it screws that up. I just want it to convert my
           | voice to text, submit it to chatgpt or claude, and read the
           | response back to me.
           | 
           | All that said, the audio quality is good and it shows
           | pictures of my kid when idle. If they suddenly disappeared I
           | would replace them.
        
           | throwawayq3423 wrote:
           | Google and Amazon refuse to put GenAI into their existing
           | speakers (which barely function). No doubt they want a new
           | product launch to charge more.
        
       | frognumber wrote:
       | I don't fully understand the cloud upsell. I have a beefy GPU. I
       | would like to run the "more advanced" models locally.
       | 
       | By "I don't fully understand," I mean just that. There's a lot of
       | marketing copy, but there's a lot I'd like to understand better
       | before plopping down $$$ for a unit. The answers might be
       | reasonable.
       | 
       | Ideally, I'd be able to experiment with a headset first, and if
       | it works well, upgrade to the $59 unit.
       | 
       | I'd love to just have a README, with a getting started tutorial,
       | play, and then upgrade if it does what I want.
       | 
       | Again: None of this is a complaint. I assume much of this is
       | coming once we're past preview edition, or is perhaps there and
       | my search skills are failing me.
        
         | trb wrote:
         | Finding microphones that look nice, can pick up voice at high
         | enough quality to extract commands and that cover an entire
         | room is surprisingly hard.
         | 
         | If this device delivers on audio quality it's totally worth it
         | at $59.
        
           | bdavbdav wrote:
           | 100%. For a lot of users that have WAF and time available to
           | contend with, this is a steal.
           | 
           | Bear in mind that a $50 google home or Alexa mini(?) is
           | always going to be whatever google deem it to be. This is an
           | open device which can be whatever you want it to be. That's a
           | lot of value in my eyes.
        
           | alias_neo wrote:
           | I've found it quite hard to find decent hardware with both
           | the input capability needed for wakeword and audio capture at
           | a distance, whilst also having decent speaker quality for
           | music playback.
           | 
           | I started using the Box-3 with heywillow which did amazing
           | input and processing using ML on my GPU, but the speaker is
           | awful. I built a speaker of my own using a raspberry pi Z2W,
           | dac and some speakers in a 3d printed enclosure I designed,
           | and added a shim to the server so that responses came from my
           | speaker rather than the cheap/tiny speaker in the box-3. I'll
           | likely do the same now with the Voice PE, but I'm hoping that
           | the grove connector can be used to plonk it on top of a
           | higher quality speaker unit and make it into a proper music
           | player too.
           | 
           | As soon as I have it in my hands, I intend to get straight to
           | work looking at a way to modify my speaker design to become
           | an addon "module" for the PE.
        
         | nickthegreek wrote:
         | The cloud sale is easy if you are an HA user already. If you
         | don't use Home Assistant right now, you probably aren't the
         | target audience. I purchase the yearly cloud service as it's an
         | easy way to support HA development. It also gives you remote
         | access to your system without having to do any setup. It
         | provides an https connection which allows you to program esp32
         | devices through Chrome. And now they added the ability to do
         | TTS and STT on someone else's hardware. HA even allows you to
         | set up a local LLM for house control commands but route other
         | queries directly to the cloud.
        
           | frognumber wrote:
           | I don't mind paying for hardware. I do mind my privacy, and
           | don't want that kind of information in the cloud, or even
           | traces from encryption I haven't audited myself.
        
         | Jarwain wrote:
         | I can't speak to home assistant specifically, but the last time
         | I looked at voice models, supporting multiple languages and
         | doing it Really Well just happens to require a model with a
         | massive amount of RAM, especially to run at anything resembling
         | real-time.
         | 
         | It'd be awesome if they open sourced that model though, or
         | published what models they're using. But I think it unlikely to
         | happen because home assistant is a sorta funnel to nabu casa
         | 
         | That said, from what I can find, it sounds like Assist can be
         | run without the hardware, either with or without the cloud
         | upgrade. So you could definitely use your own hardware,
         | headset, speakers, etc. to play with Assist
        
           | frognumber wrote:
           | _shrug_ whisper seems to do well on my GPU, and faster than
           | realtime.
        
             | Jarwain wrote:
             | Found what I was thinking of [1]
             | 
             | Part of my misremembering is I was thinking of smaller/iot
             | usecase which, alongside the 10GB VRAM requirements for the
             | large multilingual model, felt infeasible -shrug-
             | 
             | [1] https://git.acelerex.com/automation/opcua.ts/-/project_
             | membe...
        
         | antonyt wrote:
         | You can do exactly that - set up an Assist pipeline that glues
         | together services running wherever you want, including a GPU
         | node for faster-whisper. The HA interface even has a screen
         | where you can test your pipeline with your computer's
         | microphone.
         | 
         | It's not exactly batteries-included, and doesn't exercise the
         | on-device wake word detection that satellite hardware would
         | provide, but it's doable.
         | 
         | But I don't know that the unit will be an "upgrade" over most
         | headsets. These devices are designed to be cheap, low-power,
         | and have to function in tougher scenarios than speaking
         | directly into a boom mic.
        
           | ilaksh wrote:
           | Does it use Node-RED for the pipeline?
        
             | haddonist wrote:
             | No, all of the voice parts are either inbuilt or direct
             | addons.
        
           | frognumber wrote:
           | It's an upgrade mostly because putting on a headset to talk
           | to an assistant means it's not worth using the assistant.
        
         | choffee wrote:
         | This device is just the mic/speaker/wakeword part. It connects
         | to home-assistant to do the decoding and automation. You can
         | test it right now by downloading home-assistant and running it
         | on a pi or a VM. You can run all the voice assist stuff locally
         | if you want. There are services for the voice to text, text to
         | voice and what they call intents which are simple things like
         | "turn off the lights in the office". The cloud offering from
         | Nabu Casa not only funds the development of Home Assistant but
         | also gives remote access if you want it. As part of that you can
         | choose to offload some of the voice/text services to their
         | cloud so that if you are just running it on a Pi it will still
         | be fast.
        
       | thumbsup-_- wrote:
       | We need more projects like home assistant. I started using it
       | recently and was amazed. They sell their own hardware but the
       | whole setup is designed to work on any other hardware. There are
       | detailed docs for installation on your own hardware. And, it
       | works amazingly well.
       | 
       | Same for their voice assistant. You can buy their hardware and
       | get started right away or you can place your own mics and
       | speakers around home and it will still work. You can buy your own
       | beefy hardware and run your own LLM.
       | 
       | The possibilities with home assistant are endless. Thanks to this
       | community for breaking the barriers created by big tech
        
         | mkagenius wrote:
         | I am working on automation of phones (open source) -
         | https://github.com/BandarLabs/clickclickclick
         | 
         | I haven't been able to quite get the Llama vision models
         | working but I suppose with new releases in future, it should
         | work as good as Gemini in finding bounding boxes of UI
         | elements.
        
         | lokar wrote:
         | It's a great project overall, but I've been frustrated by how
         | anti-engineer it has been trending.
        
           | thfuran wrote:
           | How so?
        
           | sofixa wrote:
           | Do you mean the move away from YAML first configs?
           | 
           | I was originally somewhat frustrated, but overall, it's much
           | better (let's be honest, YAML sucks) and more user friendly
           | (by that I mean having a form with pre-filled fields is
           | easier than having to copy paste YAML).
        
             | philjohn wrote:
             | It's worse though when you need to add a ton of custom
             | sensors at once, e.g., for properly automating a Solar PV +
             | Battery solution.
        
               | ncallaway wrote:
               | But like, isn't YAML still available for configuring
               | things?
               | 
               | Have they gotten rid of any YAML configs, with things
               | that are now UI only? My understanding was that they've
               | just been building more UI for configuring things and so
               | now default recommend people away from YAML (which seems
               | like the right choice to me).
        
               | sofixa wrote:
               | > But like, isn't YAML still available for configuring
               | things?
               | 
               | For most, yes. But for some included integrations it's
               | UI-only (all of those I've had to migrate, it's been a
               | single click + comment out lines, and the config has been
               | a breeze (stuff like just an api key/IP address + 1-2
               | optional params)).
        
               | lolinder wrote:
               | Where and how are those configs stored? There has to be a
               | backing representation somewhere, right?
        
               | sofixa wrote:
               | In the Home assistant database (which is SQLite IIRC).
        
               | iamjackg wrote:
               | UI-generated configs are not stored in the database, they
               | end up in a collection of JSON files in a .storage
               | directory inside your config directory.
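A minimal sketch of how one might inspect those files, assuming only what the comment above states: that UI-generated configs are JSON files in a `.storage` directory inside the config directory. The function name `summarize_storage` and the `/config` path in the usage note are hypothetical, and nothing here relies on any particular file name or schema.

```python
import json
from pathlib import Path


def summarize_storage(config_dir):
    """Map each JSON file in <config_dir>/.storage to its sorted top-level keys."""
    storage = Path(config_dir) / ".storage"
    summary = {}
    for path in sorted(storage.iterdir()):
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip anything that is not valid JSON text
        if isinstance(data, dict):
            summary[path.name] = sorted(data)
    return summary
```

Calling `summarize_storage("/config")` (path assumed) only reads files, so it should be safe to run against a copy of the config directory before attempting any scripted changes.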
        
               | lokar wrote:
               | And there is no real API for you to interact with it. I
               | would build my own config system if I could, but they
               | don't seem interested.
        
               | lolinder wrote:
               | SQLite is highly automatable if you can deal with
               | downtime to do your migrations.
               | 
               | I'm sure there are things they could do to _better_
               | support the power-user engineer use case, but at the end
               | of the day it's a self-hosted web app written in Python
               | that has strong support for plugins. There should be very
               | few things that an engineer couldn't figure out how to do
               | between writing a plugin, tweaking source code, and just
               | modifying files in place. And in the meantime I'm glad
               | that it exists and apparently has enough traction to pay
               | for itself.
        
               | ramses0 wrote:
               | For "integrated" stuff, their stance is "UI Must Work".
               | Tracing down the requirements, here:
               | https://design.home-assistant.io/#concepts/home
               | https://developers.home-assistant.io/docs/configuration_yaml_index
               | https://github.com/home-assistant/architecture/blob/master/adr/0010-integration-configuration.md
               | 
               | ...usually there's YAML kicking around the backend, but
               | for normal usage, normal users, the goal is to be able to
               | configure all (most) things via UI.
               | 
               | I've had to drop to YAML to configure (eg) writing stats
               | to InfluxDB/Grafana vs. sqlite (or something), or maybe
               | to drop in or update an API_KEY or non-standard
               | host/port, but 99% of the time the config is baroque, but
               | usable via the web-app.
        
               | philjohn wrote:
               | Yes - for now. I think the ultimate end-goal is to get
               | rid of the YAML config files, which, makes sense for the
               | median user, but not for power users.
               | 
               | For example, I have my config on GitHub and share various
               | YAML blueprints with a friend who also has the same
               | Solar+Battery system as I do.
        
             | lokar wrote:
             | Yes, config is a major part of it. But also a lack of good
             | APIs, very poor dev documentation, not great logging. A
             | general "take it or leave it" attitude, not interested in
             | enabling engineers to build.
        
             | cryptoegorophy wrote:
             | Oh thank god. Just started using HA a few months ago and all
             | this YAML is so confusing when I try to code it with
             | ChatGPT: constant syntax or some other random errors.
        
           | gerdesj wrote:
           | Install the Node-RED add on. I use that to do the tricky
           | stuff.
           | 
           | Install the whole thing on top of stock Debian "supervised"
           | then you get a full OS to use.
           | 
           | You get a fully integrated MQTT broker with full provisioning
           | - you don't need a webby API - you have an IoT one instead!
           | 
           | This is a madly fast moving project with a lot of different
           | audiences. You still have loads of choice all tied up in the
           | web interface.
        
             | paradox460 wrote:
             | Or the Digital alchemy addon. Lets you write your
             | automations using typescript
        
         | interludead wrote:
         | Completely agree! Home Assistant feels like a breath of fresh
         | air in a space dominated by big tech's walled gardens.
        
         | PhilippGille wrote:
         | > We need more projects like home assistant
         | 
         | Isn't openHAB an existing popular alternative?
         | 
         | https://www.openhab.org/
        
           | btreecat wrote:
           | HA long ago blew past OpenHAB in functionality and community.
           | 
           | Unless you have a hard-on for JVM services, HA is the better
           | XP these days.
        
             | diggan wrote:
             | > HA long ago blew past OpenHAB in [...] community.
             | 
             | Home Assistant seems insurmountable to beat at that
             | specific metric, seems to be the single biggest project in
             | terms of contributions from a wide community. Makes sense,
             | Home Assistant tries to do a lot of things, and succeeds at
             | many of them.
        
             | yurishimo wrote:
             | When I was evaluating both projects about 5 years ago, I
             | went with openHAB because they had native apps with native
             | controls (and thus nicer design imo). At the time, HA was
             | still deep in YML config files and needed validation before
             | saving etc etc. Not great UX.
             | 
             | Nowadays, HA has more of the features I would want and
             | other external projects exist to create your own dashboards
             | that take advantage of native controls.
             | 
             | Today I'm using Homey because I'm still a sucker for design
             | and UX after a long day of coding boring admin panels in
             | the day job, but I think in another few years when the
             | hardware starts to show its age that I will move to home
             | assistant. Hell, there exists an integration to bring HA
             | devices into Homey but that would require running two hubs
             | and potentially duplicating functionality. We shall see.
        
           | tedivm wrote:
           | I think they meant "projects with a culture and mindset like
           | homeassistant", not just a competitor to the existing
           | project.
        
       | Jarwain wrote:
       | I'm actually really excited for this!
       | 
       | I noticed recently there weren't any good open source hardware
       | projects for voice assistants with a focus on privacy. There's
       | another project I've been thinking about where I think the
       | privacy aspect is Important, and figuring out a good hardware
       | stack has been a Process. The project I want to work on isn't
       | exactly a voice assistant, but same ultimate hardware
       | requirements
       | 
       | Something I'm kinda curious about: it sounds like they're
       | planning on a sorta batch manufacturing by resellers type of
       | model. Which I guess is pretty standard for hardware sales. But
       | why not do a sorta "group buy" approach? I guess there's nothing
       | stopping it from happening in conjunction
       | 
       | I've had an idea floating around for a site that enables group
       | buys for open source hardware (or 3d printed items), that also
       | acts like or integrates with github wrt forking/remixing
        
         | Brendinooo wrote:
         | I invested in Mycroft and it flopped. Here's hoping some others
         | can go where they couldn't.
        
           | bdavbdav wrote:
           | I guess the difference here is that HA has a huge community
           | already. I believe the estimate was around 250k installations
           | running actively. I suspect a huge chunk of the HA users venn
           | diagram slice fits within the voice users slice.
        
             | balloob wrote:
             | Our estimates are more than a million active instances
             | https://analytics.home-assistant.io/
        
               | emsixteen wrote:
               | More than a million? It says on the page: "424,548 Active
               | Home Assistant Installations"
               | 
               | Am I missing something? Is it that these are just those
               | you know are sharing details, and you can scale that up
               | by a known percentage? :)
        
               | schnapsidee wrote:
               | > Analytics in Home Assistant are opt-in and do not
               | reflect the entire Home Assistant userbase. We estimate
               | that a third of all Home Assistant users opt in.
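The scaling implied by the two comments above can be worked through in a few lines. This is only a back-of-the-envelope sketch: the 424,548 figure and the one-third opt-in rate are the numbers quoted in the thread, and the function name is made up for illustration.

```python
def estimate_total_installs(opted_in, opt_in_fraction):
    """Scale an opt-in count up to an estimated total user base."""
    if not 0 < opt_in_fraction <= 1:
        raise ValueError("opt_in_fraction must be in (0, 1]")
    return round(opted_in / opt_in_fraction)


# Figures quoted in the thread: 424,548 opted-in installs, about a third opting in.
print(estimate_total_installs(424_548, 1 / 3))  # 1273644
```

That lands a little under 1.3 million, which is consistent with the "more than a million" estimate, with the caveat that the one-third rate is itself an estimate.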
        
               | alias_neo wrote:
               | I'm a big fan of home assistant, and use it to control a
               | LOT of my home, have done for years, have tonnes of
               | hardware dedicated to and for it, and I've also ordered
               | some of these Voice devices.
               | 
               | I'm also opted OUT of the analytics.
        
           | bronco21016 wrote:
           | I think Mycroft was unfortunately just ahead of its time. STT
           | was just becoming good enough but NLU wasn't quite there yet.
           | Add in you're up against Apple Google and Amazon who were
           | able to add integrations like music and subsidize the crap
           | out of their products.
           | 
           | I just think this time around is different. OpenAI's Whisper
           | gives them amazing STT and LLMs can far more easily be
           | adapted for the NLU portion. The hardware is also dirt cheap
           | which makes it better suited to a narrow use case.
        
           | geerlingguy wrote:
           | IIRC one of the main devs behind this device came from
           | Mycroft.
        
             | dole wrote:
             | OP's username checks out.
        
             | robotfelix wrote:
             | Yep, Mike Hansen was on the live stream launching the new
             | device. He also notably created Rhasspy [1], which is open-
             | source voice assistant software for Raspberry Pi (when
             | connected to a microphone and speaker).
             | 
             | [1] https://rhasspy.readthedocs.io/en/latest/
        
           | tacticalturtle wrote:
           | I believe Mycroft was killed in part due to a patent troll:
           | 
           | https://www.theregister.com/AMP/2023/02/13/linux_ai_assistan...
           | 
           | Hopefully the troll is no longer around
        
             | NoNotTheDuo wrote:
             | I think another part is that there is a failure mechanism
             | on their boards that was recently identified:
             | https://community.openconversational.ai/t/sj-201-sj201-failu...
             | 
             | The short version, from the post, is that there are 4
             | capacitors that are only rated for 6.3v, but the power
             | supply is 12v. Eventually one of these capacitors will
             | fail, causing the board to stop working entirely.
             | 
             | It would be hard for a company to stay in business when
             | they are fighting a patent troll lawsuit and having to
             | handle returns on every device they sold through
             | kickstarter.
        
         | IgorPartola wrote:
         | A group buy for an existing product makes sense. Want to buy a
         | 24TB Western Digital hard drive? It's $350. But if you and your
         | 1000 closest friends get together the price can be $275.
         | 
         | But for a first time unknown product? You get a lot fewer
         | interested parties. Lots of people want to wait for tech
         | reviews and blog posts before committing to it. And group buys
         | being the only way to get them means availability will be
         | inconsistent for the foreseeable future. I don't want one voice
         | assistant. I want 5-20, one for every space in my house. But I
         | am not prepared to commit to 20 devices of a first run and I am
         | not prepared to buy one and hope I'll get the opportunity to
         | buy more later if it doesn't flop. Stability of the supply
         | chain is an important signal to consumers that the device won't
         | be abandoned.
        
           | bhaney wrote:
           | > I am not prepared to buy one and hope I'll get the
           | opportunity to buy more later
           | 
           | As long as this thing works and there's demand for it, I
           | doubt we'll ever run out of people willing to connect an
           | XU316 and some mics to an ESP32-S3 and sell it to you with
           | HA's open source firmware flashed to it, whether or not HA
           | themselves are still willing to.
        
             | Jarwain wrote:
             | I agree! I mean, just look at the market for Meshtastic
             | devices! So many options! Or devices with WLED pre-
             | installed! It'll take a Lot for Esp32 to go out of style
        
           | ascorbic wrote:
           | Kickstarter shows that a lot of people feel differently.
        
             | IgorPartola wrote:
             | Kickstarter isn't a group buy. Similar, but not the same.
        
           | esperent wrote:
           | > But for a first time unknown product? You get a lot fewer
           | interested parties. Lots of people want to wait for tech
           | reviews and blog posts before committing to it.
           | 
           | I used to think so too. But then Kickstarter proved that
           | actually, as long as you have a good advertising style,
           | communicate well, and get lucky, you can get people to
           | contribute literal millions for a product that hasn't even
           | reached the blueprints stage yet.
        
             | IgorPartola wrote:
             | Kickstarter isn't a group buy.
        
               | yunohn wrote:
               | Kickstarter is often basically a group buy. Project
               | owners make MVPs and market/pitch it, get funding from
               | the public, and then commission a large batch run.
        
           | burningChrome wrote:
           | >> I want 5-20, one for every space in my house.
           | 
           | I don't have a small house, but I'm trying to think why I
           | would need even 5 of these, let alone 20. The majority of the
           | time my family spends together is in the open layout on our
           | main floor where the kitchen flows into the living room with
           | an adjacent sun room off the living room.
           | 
           | I'm genuinely curious why you need so many of these.
           | 
           | I do agree that if you do have a legit use case for so many,
           | buying so many in essentially a first run is a risky thing.
           | Coupled with the ability for this to be supported for more
           | than a fleeting couple of years is also a huge risk.
        
             | Jarwain wrote:
             | Just using where I might want it in childhood home as an
             | example:
             | 
             | - master bedroom
             | 
             | - master bathroom
             | 
             | - grandma's room
             | 
             | - my room
             | 
             | - brother's room
             | 
             | - upstairs bathroom
             | 
             | - upstairs loft?
             | 
             | - office room
             | 
             | - living room/diningroom
             | 
             | - kitchen/kitchentable/familyroom
             | 
             | - garage?
             | 
             | 9-14 devices for a 5 person household. May be a stretch
             | since I'm not sure if my grandma could even really use it.
             | Bathroom's a stretch but I'm imagining being in the shower
             | and wanting to note multiple showerthoughts
        
         | interludead wrote:
         | Your idea about group buys is really intriguing. I wonder if
         | the community might organically set something like that up once
         | there's enough interest
        
         | choffee wrote:
         | Not really sure what the benefit of group buy would be here.
         | Nabu Casa, the company that supports the development of home
         | assistant and developed this product, already has a few
         | products they sell. They had this stocked all over the world
         | for the announcement and it sold out. I assume they had already
         | made a few thousand. They will get more stock now and it will
         | sell just like the other things they make. Any profit from this
         | will go back into development of Home Assistant.
        
           | Jarwain wrote:
           | Heh thus far I've been an excited spectator of HomeAssistant,
           | and wasn't aware of Nabu Casa until doing research for a
           | different comment on the thread. I do love and appreciate
           | their model here
           | 
           | I guess the benefits that came to mind are:
           | 
           | - alternative crowdsourced route for sourcing hardware, to
           | avoid things like that raspberry pi shortage (although if
           | it's due to broader supply chain issues then this doesn't
           | necessarily help)
           | 
           | - hardware forks! If someone wanted a version with a more
           | powerful ESP32, or a GPS, or another mic, or an enclosure for
           | a battery and charging and all that, took the time to fork
           | the design to add these features, and found X other users
           | interested in the fork to get it produced... (of course I
           | might be betraying my ignorance on how easy it is to set up
           | this sort of alternative manufacturing chain or what unit
           | amounts are necessary to make this kind of forking
           | economical)
        
         | pimeys wrote:
         | I'm also very excited. I've had some ESP32 microphones before,
         | but they were not really able to understand the wake word,
         | sometimes even when it was quiet and you were sitting next to
         | the mic.
         | 
         | This one looks like it can recognize your voice very well, even
         | when music is playing.
         | 
         | Because... when it works, it's amazing. You get that Star Trek
         | wake word (KHUM-PUTER!), you can connect your favorite LLM to
         | it (ChatGPT, Claude Sonnet, Ollama), you can control your home
         | automation with it and it's as private as you want.
         | 
         | I ordered two of these, if they are great, I will order two
         | more. I've been waiting for this product for years, it's
         | hopefully finally here.
        
           | nine_k wrote:
           | As a side note, it always slightly puzzles me when I see
           | "voice interface" and "private" used together. Maybe it takes
           | living alone to issue voice commands and feel some privacy.
           | 
           | (Yes, I do understand that "privacy" here is mostly about not
           | sending it for processing to third parties.)
        
             | staunton wrote:
             | > Yes, I do understand that "privacy" here is mostly about
             | not sending it for processing to third parties.
             | 
             | Then why does it puzzle you?
        
               | entropicdrifter wrote:
               | Because you wouldn't ask it deeply private questions in
               | front of your mom, for instance
        
               | xandrius wrote:
               | There are levels of privacy. Because I'm not going to ask
               | deeply private questions, it doesn't mean that I want
               | everyone to be snooping into what I'm planning to eat
               | tonight.
        
             | iteria wrote:
             | I don't like these interfaces because unless they are button
             | activated or something, they must be always listening and
             | sending sound from where you are to a 3rd party server. No
             | thanks. Of course this could be happening with my phone,
             | but at least it would have to be a malicious action to
             | record me 24/7
        
               | pimeys wrote:
               | How these ESP32-systems work is that you send a wake word
               | to the device itself. It can detect the word without an
               | internet connection, the device itself understands it and
               | wakes up. After the device is woken up, it sends your
               | speech to home assistant, which either
               | 
               | - handles it locally, if you have fast enough computer
               | 
               | - sends it to home assistant cloud, if you set it up
               | 
               | - sends it to chatgpt, claude sonnet etc. if you set it
               | up
               | 
               | I'm planning on building a proxmox rack server next year,
               | so I'm probably going to just handle all the discussions
               | locally. The home assistant cloud is quite private too,
               | at least that's what they say (and they're in EU, so I
               | think there might be truth in what they say)...
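The flow described above (wake word detected on the device, and only the speech after it handed to a configured backend) can be modeled with a toy sketch. Everything here is an assumption for illustration: words stand in for audio frames, and the function names and backend labels are hypothetical rather than anything from Home Assistant or ESPHome.

```python
BACKENDS = {"local", "home_assistant_cloud", "external_llm"}


def frames_after_wake_word(frames, wake_word="computer"):
    """Return only the frames that follow the wake word; nothing before it leaves the device."""
    for i, frame in enumerate(frames):
        if frame == wake_word:
            return frames[i + 1:]
    return []  # no wake word heard, so nothing is forwarded


def handle_utterance(frames, backend="local", wake_word="computer"):
    """Forward post-wake-word speech to the chosen backend, or None if never woken."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    forwarded = frames_after_wake_word(frames, wake_word)
    if not forwarded:
        return None
    return (backend, " ".join(forwarded))
```

For example, `handle_utterance(["blah", "computer", "lights", "on"])` forwards only "lights on", while a conversation that never contains the wake word forwards nothing, which is the privacy property the comment is describing.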
        
             | pimeys wrote:
             | Private meaning that a big American corporation is not
             | listening and using my voice to either track me or teach
             | their own AI service with it.
        
       | jauntywundrkind wrote:
       | Not super convinced the XMOS audio processing chip is really
       | gonna buy a lot. Trying to do audio input processing feels like a
       | dynamic task, requiring such adaption. XMOS is the most well
       | known audio processor and a beast, but not sure it's really gonna
       | help here!
       | 
       | I really hope we see some open-source machine-learned systems
       | emerge.
       | 
       | I saw Insta360 announce their video conferencing solution today.
       | Optics looks pretty medium, nothing wild, but Insta360 is so good
       | at video that I expect it'll be great. But there's a huge 14
       | microphone array on it, and that's the hard job; figuring out how
       | to get good audio from speakers in a variety of locations around
       | a room. It really made me wish for more open source footing here,
       | some promising start, be it the conference room or open living
       | space. I've given all of 60s to look through this, and was kinda
       | hopeful because heck yeah Home Assistant, but my initial read
       | isn't super promising, isn't that this is starting the proper
       | software base needed to listen well to the world.
       | 
       | https://petapixel.com/2024/12/17/the-insta360-connect-is-a-2...
        
         | choffee wrote:
         | They showed a video at the end of their broadcast last night
         | comparing what the raw microphone hears and what comes out of
         | the XMOS chip and you can hear a much clearer voice all the
         | time even when there is noise or you are far away from the
         | device. It is also used to cancel out the music if you are
         | using its speaker output. I don't think it's doing any voice
         | processing but it's cleaning up the audio a lot which makes the
         | job of the wake word processor and the speech to text a lot
         | easier. Up until now this was missing from a lot of the home
         | made voice assistants and I think it's why Alexa can understand
         | you from the next room but my home made one struggles with all but
         | quiet conditions.
        
           | summm wrote:
           | Alexa Echo Dot has 6 or 7 microphones. I'd expect that makes
           | it much easier to filter out voices directionally than only
           | the 2 microphones this hardware has. I hope they release a
           | version with more microphones.
        
       | nickthegreek wrote:
       | And on back order everywhere. I just spent the last 2 weeks
       | getting an esp32-s3-box set up to do this but its lack of audio out
       | really irks me.
        
         | joshstrange wrote:
         | And the mic is not all that great either. I have a couple of
         | them but they just weren't reliably picking up my voice and I
         | couldn't hear the reply either (when it did hear me). I figured
         | it would be easy to add a speaker to them but that sent me down
         | a rabbit hole that I gave up on and put them in a drawer. I'll
         | buy this for sure though because when the ESP32 box thing
         | worked it worked really well and I loved being able to swap out
         | parts of the assist pipeline.
        
           | nickthegreek wrote:
           | I ended up modding the s3 yaml to turn off the internal
           | speaker and to forward all voice responses to a google hub.
        
           | alias_neo wrote:
           | To be fair, the issue with the Box-3 is HA's implementation;
           | I used it with heywillow.io and it was incredible, I could
           | speak to it from another room and it would pick up perfectly.
           | 
           | The audio out is terrible so I wrote a shim-server that
           | captures the request to the TTS server for heywillow and sends
           | it to a speaker I built myself running MPD on a Pi with a
           | nice DAC and has it play the responses instead of the
           | box-3's tiny speaker.
           | 
           | I don't expect the audio-out on this to be much better with
           | its tiny speaker, but at least it has a 3.5mm jack.
           | 
           | I'm going to look into what that Grove port can do too and
           | perhaps build a new speaker "module" that the Voice PE can
           | sit on top of to make it a proper music device.
        
         | yzydserd wrote:
         | > And on back order everywhere.
         | 
         | I just clicked through to my large country and the first vendor
         | and was able to buy 2 for delivery tomorrow. So it says. So
         | maybe not on back order everywhere.
        
         | sofixa wrote:
         | If it's an ESP32-S3-BOX-3, there is audio out (assuming you
         | mean being able to send arbitrary audio to it to play). Due to
         | the framework used it's not available, but there's an
         | alternative firmware available on GitHub that uses the newer
         | framework and it exposes a media player entity you can send any
         | audio to.
        
           | nickthegreek wrote:
           | I didn't have the -3 version. Learned the hard way after
           | loading up that alt framework last week and the screen went
           | blank. I did end up implementing that same solution on my
           | hardware though.
        
       | mkagenius wrote:
       | Though a separate hardware helps - I believe voice and automation
       | can be integrated more seamlessly to our existing devices
       | (phones/laptops) with high compute built in.
       | 
       | Llama and whisper are already public so that should help
       | innovation in this area.
        
         | antonyt wrote:
         | With existing phones and laptops, there's either activation
         | friction (pressing the "listen to me" button) or the device has
         | to be always listening, which requires a lot of trust in your
         | hardware vendors.
         | 
         | With an open source and potentially local-only device, you can
         | have your voice assistant and keep your privacy.
        
         | throwawaymaths wrote:
         | last i checked open source whisper does not support streaming
         | or diarization out of the box. you really need both for a good
         | voice assistant experience
        
         | joshstrange wrote:
         | You can use your phone to text or talk to HA's assistant. I've
         | done that a number of times when Alexa fails. Having dedicated
         | hardware is a huge step up for me. I've tried their ESP32 mini
         | cube assistant thing before and it showed a lot of promise but
         | the hardware (speaker and mic, processor was fine) was lacking.
         | This seems to be a good mic and speaker wrapped around a
         | similar core so I'm super excited for it.
        
         | alias_neo wrote:
         | The voice input can really be done however you like, the
         | benefit of a device like the Voice PE is the wake word
         | detection on-device.
         | 
         | I have an office-style desk-phone (SNOM) connected to a SIP
         | server and I can pick the receiver up and talk to the
         | Assistant, but you can plug in any way you like to get the
         | audio to/from HA.
         | 
         | With your phone, wake words are usually locked down by
         | Apple/Google so you can't really have it hands-free, and that's
         | the problem this device is solving; not the audio input itself,
         | but the wake-word/hands-free input.
         | 
         | On an Android phone, you can replace the Google Assistant with
         | the Home Assistant one, but you still have to activate it the
         | usual way, press a button or launch the app etc.
        
       | shepherdjerred wrote:
       | Home Assistant is such a fantastic project. I've been waiting for
       | something like this for a long time; I just pre-ordered three.
       | 
       | My only remaining wish is that I can replace Siri with this
       | (without needing some workaround)
        
       | hoppp wrote:
       | If it runs fully on premise that would be great. I'm still not
       | comfortable buying a device that records everything I say and
       | uploads it to a cloud
        
         | haddonist wrote:
         | Fully on-prem can be done if you've got the LLM compute power
         | in place.
        
       | catmanjan wrote:
       | All I want is a voice assistant that I can call "computer" like
       | Star Trek, I don't want to have to say a brand name, thank you!
        
         | dartos wrote:
         | You could've always set Alexa to respond to "Computer" instead.
        
           | catmanjan wrote:
           | Ah I admit I haven't looked into it for several years, good
           | to see they added the feature - I might have to grab one
        
             | bigstrat2003 wrote:
             | The problem is that it will go off every single time you
             | watch Star Trek.
        
           | imp0cat wrote:
           | Can confirm, this works fabulously!
        
         | antonyt wrote:
         | If you run openWakeWord, "computer" is one of very many
         | pretrained models the community has made:
         | https://github.com/fwartner/home-assistant-wakewords-collect...
        
       | lxe wrote:
       | Here's what I'm looking for in a voice assistant:
       | 
       | - Full privacy: nothing goes to the "cloud"
       | 
       | - Non-shitty microphones and processing: i want to be able to be
       | heard without having to yell, repeat, or correct
       | 
       | - No wake words: it should listen to everything, process it, and
       | understand when it's being addressed. Since everything is private
       | and local, this is now doable
       | 
       | - Conversational: it should understand when I finished talking,
       | have ability to be interrupted, all with low latency
       | 
       | - Non-stupid: it's 2024, and alexa and siri and google are
       | somehow absolutely abysmal at doing even the basics
       | 
       | - Complete: i don't want to use an app to get stuff configured. I
       | want everything to be controlled via voice
        
         | danparsonson wrote:
         | > No wake words: it should listen to everything, process it,
         | and understand when it's being addressed
         | 
         | Even humans struggle with this one - that's what names are for!
        
           | antonyt wrote:
           | Yeah, I'm having a hard time imagining how no-wake-word could
           | work in practice.
        
             | fragmede wrote:
             | after setting up the system, if I say "turn the ceiling
             | lights to 20%", who else would be changing the lights?
             | 
             | But also, post-fix wake word would also be natural if it
             | was recording all the time. "turn on the lights, Google",
             | for instance
        
               | TheCoelacanth wrote:
               | Someone in a TV show that you're watching?
        
             | ethbr1 wrote:
             | Like that really annoying friend who jumps in every other
             | sentence with "Well actually..."
        
               | marcosdumay wrote:
               | I have a coworker that set up an Alexa a year or so ago,
               | I don't know what was the issue, but it would jump into
               | Teams meetings after every noise in his house.
        
             | lukifer wrote:
             | This is one advantage of a system with a constrained set of
             | commands/grammars, as opposed to the Alexa/Siri model of
             | trying to process all arbitrary text while in active mode.
             | It can simply ignore/discard any invocations which don't
             | match those specific grammars (and no need to wait to
             | confirm that the device is awake).
             | 
             | "Computer, turn lights to 50%" -> "turn lights to fifty
             | percent" -> {action: "lights", value: 50}
             | 
             | "My new computer has a really beefy graphics card" -> "has
             | a really beefy graphics card" -> {action: null}
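             | 
             | A minimal Python sketch of that idea, assuming one
             | hypothetical grammar ("turn [the] lights to N%") and a
             | transcript that already has digits in it. The wake word,
             | the regex and the function name are made up for
             | illustration, not taken from any real assistant:

```python
import re

# Hypothetical wake word and grammar, for illustration only.
WAKE_WORD = "computer"
LIGHTS = re.compile(r"turn (?:the )?lights to (\d{1,3})\s?(?:%|percent)")


def parse(utterance: str) -> dict:
    """Return an action dict, or {"action": None} when nothing matches."""
    text = utterance.lower()
    # Keep only what follows the first occurrence of the wake word.
    _, found, rest = text.partition(WAKE_WORD)
    if not found:
        return {"action": None}
    command = rest.strip(" ,.!?")
    # The whole command must match the grammar; anything else is discarded.
    match = LIGHTS.fullmatch(command)
    if match is None:
        return {"action": None}
    value = int(match.group(1))
    if value > 100:
        return {"action": None}
    return {"action": "lights", "value": value}
```

             | Because anything outside the grammar is dropped, a
             | sentence that merely contains the word "computer" falls
             | through to {"action": None} with no confirmation step.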
        
         | wild_egg wrote:
         | How much are you willing to pay though? Full privacy means
         | powerful enough hardware to do everything else on the list on-
         | device and _quickly_. I don't know that most people have the
         | budget for that
        
         | nissarup wrote:
         | Looks like you are in the market for a butler.
         | 
         | Especially your last point will, IMO, not be possible for a
         | long time.
        
         | Lanolderen wrote:
         | I'd imagine with 1-2 TVs constantly talking, general
         | conversations and other random noises it'd get expensive quick.
         | Definitely closer to a rack than a RaspPi or old laptop
         | hardware wise. Also add to that more/better mics for coverage
         | and the complexity of it guessing when you're asking _it_ to
         | remind you to buy toothpaste or your SO... It can probably be
         | done by tracking who's home, who's in the room with the
         | speaker, who the speaker is, etc but it's all cost..
        
         | micromacrofoot wrote:
         | without a wake word that's a lot of compute unless you live
         | alone and don't watch tv or listen to music
         | 
         | they even used a wake word in star trek fwiw
        
       | IG_Semmelweiss wrote:
       | Can someone describe the use case here? I don't quite understand
       | what its purpose is.
       | 
       | Is this a fully-private, open source alternative to Alexa, that
       | by definition requires a CPU locally to run ?
       | 
       | Is the device supposed to be the nerve center of IoT devices ?
       | 
       | Can it access the Wifi to do web crawls on command (music,
       | google, etc)?
        
         | IvyMike wrote:
         | If you have home automation, surely you've run into this
         | situation when Comcast flakes (or similar):
         | 
         | "OK, Google, turn lights on" "Check your connection and try
         | again"
         | 
         | As far as I can tell, if you have Home Assistant + this new
         | device, you've fixed that problem.
        
         | antonyt wrote:
         | The nerve center would be your Home Assistant instance, which
         | is not this device. You can run Home Assistant on whatever
         | hardware you like, including options sold by Nabu Casa.
         | 
         | This device provides the microphone, speaker, and WiFi to do
         | wake-word detection, capture your input, send it off to your HA
         | instance, and reply to you with HA's processed response.
         | Whether your HA instance phones out to the internet to produce
         | the response is up to you and how you've configured it.
        
       | jve wrote:
       | While we are getting shoveled AI keyword everywhere, I'm actually
       | disappointed I don't see it here.
       | 
       | The first thought I had when encountering LLM was that it can
       | finally make these devices understand you and make them finally
       | useful... and I don't need to know some prescripted keywords.
        
         | antonyt wrote:
         | You can actually integrate LLMs with Assist pipelines, it's
         | just orthogonal to this hardware announcement. Check out
         | https://www.home-assistant.io/blog/2024/06/05/release-20246/...
        
           | pimeys wrote:
           | It's also really cool. You can make it so that Home
           | Assistant itself first tries to understand what you want,
           | like turning on the living room lights or setting the bathroom
           | temperature to 21.5 degrees Celsius. If the assistant
           | pipeline does not understand what you are asking for, it can
           | send your question to the LLM of your choice. You can also
           | make the LLM control the lights, heat etc, but at least
           | for now ChatGPT is pretty bad at that. So let Home
           | Assistant do the home automation, and then let ChatGPT
           | answer your questions about the most popular ruler in
           | 19th-century France.
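           | 
           | Roughly, the fallback order looks like the sketch below.
           | This is not Home Assistant code: LOCAL_COMMANDS,
           | local_intent and handle are hypothetical names, and the
           | LLM is just whatever callable you pass in.

```python
from typing import Callable, Optional, Tuple

# Hypothetical stand-in for the built-in sentence matching.
LOCAL_COMMANDS = {
    "turn on the living room lights": "Turned on the living room lights.",
    "set the bathroom temperature to 21.5": "Bathroom set to 21.5 degrees.",
}


def local_intent(text: str) -> Optional[str]:
    """Return a reply if the text is a known command, else None."""
    return LOCAL_COMMANDS.get(text.lower().strip(" .!?"))


def handle(text: str, ask_llm: Callable[[str], str]) -> Tuple[str, str]:
    """Try the local matcher first; only fall back to the LLM on a miss."""
    reply = local_intent(text)
    if reply is not None:
        return reply, "local"
    return ask_llm(text), "llm"
```

           | The second element of the result only records which path
           | answered, which is handy when checking how often the LLM
           | is actually needed.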
        
       | joshstrange wrote:
       | It's too bad it's sold out everywhere. I've tried the ESP32
       | projects (little cube guy) for voice assistants in HA but its
       | mic/speaker weren't good enough. When it did hear me (and I heard
       | it) it did an amazing job. For the first time I talked to a voice
       | assistant that understood "Turn off office lights" to mean "Turn
       | off all the lights in the office" without me giving it any
       | special grouping (like I have to do in Alexa and then it randomly
       | breaks). It handled a ton of requests that are easy for any human
       | but Alexa/Siri trip up on.
       | 
       | I cannot wait to buy 5 or more of these to replace Alexa. HA is
       | the brain of my house and up till now Alexa provided the best
       | hardware to interact with HA (IMHO) but I'd love something first-
       | party.
        
         | bdavbdav wrote:
         | How did you find it for music tasks?
        
           | joshstrange wrote:
           | I didn't test that. I normally just manually play through my
           | Sonos speaker groups on my phone. I don't like the sound from
           | the Echos so I'm not in the habit of asking them to do
           | anything related to music.
           | 
           | Right now I only use Alexa for smart house control and
           | setting timers
        
         | moffkalast wrote:
         | I'm definitely buying one for robotics, having a dedicated unit
         | for both STT and TTS that actually works and integrates well
         | would make a lot of social robots more usable and far easier to
         | set up and maintain. Hopefully there's a ROS driver for it
         | eventually too.
        
       | shaklee3 wrote:
       | As someone not that familiar with haas, can someone explain why
       | there's not a clear path to replace Alexa or Google home? I
       | considered using haas recently to get a gpt like response after
       | being frustrated with Google home, but it seems this is a
       | complete mess. is there a way to get this yet?
        
         | joshstrange wrote:
         | > explain why there's not a clear path to replace Alexa or
         | Google home?
         | 
         | There is. I've used HA with their default assist pipeline
         | (Cloud HA STT, Cloud HA LLM, Cloud HA TTS) and I've also
         | plugged in different providers at each step (both remote and
         | local for each part: STT/LLM/TTS) and it's super cool. Their
         | default LLM isn't great but it works, plugging in OpenAI made
         | it work way better. My local models weren't great in speed but
         | I don't have hardware dedicated for this purpose (currently),
         | seeing an entire local pipeline was amazing for the promise of
         | it in the future. It's too slow (on my hardware) but we are so
         | close to local models (STT/TTS could be improved as well but
         | they are much easier to do already locally).
         | 
         | If this new HA hardware comes even close to performing as well
         | as the Echo's in my house (low bar) I'll replace them all.
        
           | jazzyjackson wrote:
           | What does it use LLMs for?
        
             | joshstrange wrote:
             | Taking the text of what you said and figuring out what you
             | want to do. It sends what you said plus a list of
             | devices/states and a list of functions (to turn off/on, set
             | temp, etc of devices). The LLM takes "Turn off basement
             | lights" and turns that into "{function: "call_service",
             | args: ['lights.off', 'entity-id-123']}" (<- Completely made
             | up but it's something like that) that it passes back to HA
             | along with what to say back to the user ("Lights turned
             | off" or whatever) and HA will run the function and then do
             | TTS to respond to you.
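             | 
             | To make that concrete, here is a small Python sketch of
             | the "LLM returns a function call, HA runs it" step. The
             | JSON shape, the "say" field and the in-memory device
             | table are all assumptions for illustration, just as made
             | up as the example above:

```python
import json

# Hypothetical in-memory device table standing in for real entity state.
devices = {"light.basement": "on"}


def call_service(service: str, entity_id: str) -> None:
    """Apply a service call to the toy device table."""
    if entity_id not in devices:
        raise KeyError(f"unknown entity: {entity_id}")
    if service == "light.turn_off":
        devices[entity_id] = "off"
    elif service == "light.turn_on":
        devices[entity_id] = "on"
    else:
        raise ValueError(f"unsupported service: {service}")


def run_llm_reply(raw: str) -> str:
    """Parse the LLM's JSON reply, run the call, return the text to speak."""
    reply = json.loads(raw)
    if reply["function"] != "call_service":
        raise ValueError(f"unsupported function: {reply['function']}")
    service, entity_id = reply["args"]
    call_service(service, entity_id)
    return reply["say"]
```

             | The returned string is what would then be handed to TTS.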
        
       | tomqueue wrote:
       | I am very excited for this. One question I couldn't find an
       | answer for though is whether the hardware is open enough to be
       | usable with other home automation systems. I am using OpenHAB and
       | they too have an integrated voice assistant. I looked into
       | migrating to HA a couple times but eventually gave up, primarily
       | because it felt like such a waste of time to migrate a fully
       | working environment with dozens of rules and scripts to yaml
       | files.
        
         | interludead wrote:
         | Moving a fully functional setup with complex rules and scripts
         | is a daunting task
        
         | choffee wrote:
         | It's all open and so should be able to work with OpenHAB as
         | well but it would need somebody to either write a firmware
         | that's compatible with the OpenHAB endpoints or add ESPHome
         | integration into OpenHAB. Somebody might have already done
         | that for their voice stuff. There is not much yaml in home
         | assistant now unless you want it. I'd give it a go in a VM and
         | see what it finds on your network :)
        
       | interludead wrote:
       | I think in some ways it could redefine how we think about voice
       | control... taking it from the cloud and putting it back into
       | users' hands, like literally
        
       | Simon_O_Rourke wrote:
       | A good emphasis in the summary, that certain other companies will
       | only focus on monetization at the expense of features and
       | functionality.
        
       | delijati wrote:
       | Perfect, will dig more into it. Currently I like to have a
       | Spotify client without UI for my kids ;)
        
       | ahaucnx wrote:
       | It's not clear to me from the description if this is also
       | completely open source hardware. Are the schematics, BoM,
       | firmware published under a permissible license? If so, where are
       | they accessible?
       | 
       | And if not, I would be curious to know why it hasn't been fully
       | open sourced.
        
         | choffee wrote:
         | I would think so in the end. They talked about the case design
         | being open. The software and firmware are all open already and
         | they said that they really wanted people to be able to take
         | these components and make new devices.
         | 
         | They have released the designs for the Yellow so I assume it
         | will all come. https://github.com/NabuCasa/yellow
        
       | fx1994 wrote:
       | What I don't like is that most voice assistants perform really
       | badly in my native language so I don't use them at all. For English
       | speakers yes, but for all others not so much. I guess it will get
       | better.
        
         | choffee wrote:
         | That is one of the major things that Home Assistant are trying
         | to fix. They have groups working on most languages and are
         | adding them to their open as they improve. https://www.home-
         | assistant.io/voice_control/contribute-voice
        
       | leeoniya wrote:
       | anyone tried https://getleon.ai/ ?
        
         | lukifer wrote:
         | I tried years ago, I don't think I got it working, ended up
         | using Rhasspy/voice2json instead (TIL: the creator of both is
         | now the Voice Eng Lead for Home Assistant).
         | 
         | Looks like the GitHub is still somewhat active, although their
         | roadmap links to a dead Trello: https://github.com/leon-ai/leon
        
       | solarkraft wrote:
       | RIP Mycroft. A tad too early.
        
         | choffee wrote:
         | Nabu Casa employ one of the Mycroft devs now and i think some
         | of the tech came from that project so it's not all gone :)
        
       | singularity2001 wrote:
       | sorry if this question takes away from the great strides the team
       | went through but wouldn't it be much easier (hardware wise) to
       | jailbreak one of the existing great hardware thingies like Apple
       | HomePod or the Google one or Alexa?
        
         | choffee wrote:
         | I don't think they are that easy to jail break but I may be
         | wrong. I think they wanted to create an open device that people
         | could build from rather than just a hacked up alexa.
        
         | robotfelix wrote:
         | I picked up an Echo Dot a few years ago when Amazon were
         | practically giving them away, thinking that surely someone
         | would have jailbroken it by now to allow it to be used with
         | Home Assistant.
         | 
         | It was only after researching later that I discovered that this
         | wasn't currently possible and the recommended approach was to buy
         | some replacement internals that cost more than the device
         | itself (and if I recall correctly, more than the new Home
         | Assistant Voice Preview Edition).
        
         | alias_neo wrote:
         | The fact that it hasn't (widely?) been done yet suggests the
         | answer is "no".
         | 
         | The hardware in those devices is generally better, most of them
         | have much better speakers, but they're locked down, the wake-
         | word detection hardware isn't open or accessible so changing it
         | to do what we need would be difficult, and you're just hoping
         | there's a way in.
         | 
         | Existing examples of opening them (as in freedom) replace the
         | PCB entirely, which puts you back to square one of needing open
         | hardware.
         | 
         | This feels like the right approach to me; I've been building my
         | own devices for this purpose with off-the-shelf parts, and
         | designing enclosures, but this is much sleeker; I just hope an
         | add-on or future version comes with much better audio out
         | (speakers) because that's where it and things like it (e.g. the
         | S3-Box-3) are really lacking.
        
         | singularity2001 wrote:
         | or maybe find cheap Chinese smart speaker which is hackable?
        
       | Havoc wrote:
       | Had to laugh a bit at the caveat about powerful hardware. Was
       | bracing myself for GPU and then it says N100 lol
        
         | moooo99 wrote:
         | I mean, comparatively many people are hosting their home
           | Assistant on a Raspberry Pi so it is relatively powerful :D
        
           | geerlingguy wrote:
           | And the CM5 is nearly equivalent in terms of the small models
           | you run. Latency is nearly the same, though you can get a
           | little more fancy if you have an N100 system with more RAM,
           | and "unlocked" thermals (many N100 systems cap the power draw
           | because they don't have the thermal capacity to run the chip
           | at max turbo).
        
             | moffkalast wrote:
             | If we're being fair you can more like, walk models, not run
             | them :)
             | 
             | An 125H box may be three times the price of an N100 box,
             | but the power draw is about the same (6W idle, 28W max,
             | with turbo off anyway) and with the Arc iGPU the prompt
             | processing is in the hundreds, so near instant replies to
             | longer queries are doable.
        
       | fons wrote:
       | I wonder how this compares to the Respeaker 2
       | https://wiki.seeedstudio.com/ReSpeaker_Mic_Array_v2.0/
       | 
       | The respeaker has 4 mics and can easily cancel out the noise
       | introduced by a custom external speaker
        
         | stavros wrote:
         | I don't just want the hardware, I want the software too. I want
         | something that will do STT on my speech, send the text to an
         | API endpoint I control, and be able to either speak the text I
         | give it, or live stream an audio response to the speakers.
         | 
         | That's the part I can't do on my own, and then I'll take care
         | of the LLMs myself.
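           | 
           | For the "API endpoint I control" half, something like this
           | stdlib-only Python sketch is what I have in mind. The JSON
           | fields ("text" in, "speak" out) and the function names are
           | my own assumptions, not any device's actual protocol:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def reply_for(text: str) -> str:
    """Placeholder for your own logic (an LLM call, a lookup, etc.)."""
    return f"You said: {text}"


class TranscriptHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body carrying the transcribed speech.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"speak": reply_for(payload.get("text", ""))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), TranscriptHandler)
```

           | Calling make_server(8080).serve_forever() would start it;
           | the device side would POST the transcript and speak
           | whatever comes back in "speak".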
        
           | alias_neo wrote:
           | All of these components are available separately or as add-
           | ons for Home Assistant.
           | 
           | I currently do STT with heywillow[0] and an S3-Box-3 which
           | uses an LLM running on a server I have to do incredibly fast,
           | incredibly accurate STT. It uses Coqui XTTS for TTS, with
           | very high quality LLM based voice; you can also clone a voice
           | by supplying it with a few seconds of audio (I tested cloning
           | my own with frightening results).
           | 
           | Playback to a decent speaker can be done in a bunch of ways;
           | I wrote a shim that captures the TTS request to Coqui and
           | forwards it to a Pi based speaker I built, running MPD which
           | then requests the audio from the TTS server (Coqui) and plays
           | it back on my higher quality speaker than the crappy ones
           | built in to the voice-input devices.
           | 
           | If you just want to use what's available in HA, there's all of
           | the Wyoming stuff, openWakeword (not necessary if you're
           | using this new Voice PE because it does on-device wakeword),
           | Piper for TTS, or MaryTTS (or others) and Whisper (faster-
           | whisper) for STT, or hook in something else you want to use.
           | You can additionally use the Ollama integration to hook it
           | into an Ollama model running on higher end hardware for
           | proper LLM based reasoning.
           | 
           | [0]heywillow.io
        
             | stavros wrote:
             | I do the same, Willow has been unmaintained for close to a
             | year, and calling it "incredibly fast" and "incredibly
             | accurate" tells me that we have very different experiences.
        
               | alias_neo wrote:
               | It's a shame it's been getting no updates, I noticed
               | that, but their secret sauce is all open stuff anyway so
               | just replace them with the upstream components; their
               | box-3 firmware and the application server is really the
               | bit they built (as well as the "correction" service).
               | 
               | If it wasn't fast or accurate for you, what were you
               | running it on? I'm using the large model on a Tesla GPU
               | in a Ryzen 9 server, using the XTTS-2 (Coqui) branch.
               | 
               | The thing about ML based STT/TTS and the
               | reasoning/processing is that you get better performance
               | the more hardware you throw at it; I'm using nearly £4k
               | worth of hardware to do it; is it worth it? No, is it
               | reasonable? Also no, but I already had the hardware and
               | it's doing other things.
               | 
               | I'll switch over to Assist and run Ollama instead now
               | there's some better hardware with on-device wake-word
               | from Nabu.
        
         | robotfelix wrote:
         | It's worth noting that product is listed in the "Discontinued
         | Products" section of the linked wiki.
         | 
         | Both of the ReSpeaker products in the non-discontinued section
         | (ReSpeaker Lite, ReSpeaker 2-Mics Pi HAT) have only 2 mics, so
         | it appears that things are converging in that direction.
        
           | alias_neo wrote:
           | The S3-Box-3 also only has two mics, and I found I can talk
           | to that from another room of the house and it detects what I
           | said perfectly fine.
        
       | cranberryturkey wrote:
       | https://linuxvoice.ai
        
       | hamilyon2 wrote:
       | I had great trouble simply connecting Bluetooth speaker to use it
       | as voice input and for sound output. The overall state of sound
       | subsystem for diy voice assistant feels third-class at best.
        
       | albybisy wrote:
       | i don't wanna talk to a computer
        
         | cheema33 wrote:
         | > i don't wanna talk to a computer
         | 
         | You are in luck. You can get a human butler. But not for $59.
        
       | bsdice wrote:
       | Majel Barrett voice please.
        
       | ragmondo wrote:
       | My plea / request : Make a home assistant a DROP IN replacement
       | for a standard light switch. It has power, it adds functionality
       | from the get-go (smart lighting), it's placed in a convenient
       | position for the room and no extra wires etc required.
        
         | Carrok wrote:
         | Look at Shelly light switches.
        
           | NegativeK wrote:
           | Agreed.
           | 
           | They sell UL rated models, have an option for cloud
           | connectivity but zero requirement, your switch still works if
           | the Shelly loses connectivity with whatever home automation
           | server you have, and it's a small box that you wire in behind
           | the switch.
        
             | Carrok wrote:
             | They also make drop in replacement dimmer switches. Even
             | easier than the small box style.
             | https://us.shelly.com/products/shelly-plus-wall-dimmer
        
           | timdiggerm wrote:
           | You've misunderstood what they're asking for. They're asking
           | for Home Assistant hardware (microphone, speaker, wifi) that,
           | instead of being a standalone box taking up space on the
           | counter/table/etc, fits into the hole in the hall where they
           | currently have a lightswitch.
        
             | Carrok wrote:
             | I guess I did misunderstand, because that request seems
             | strange to me. I'm assuming they have more than one switch.
             | Which one should have Home Assistant on it? Seems like an
             | odd deployment strategy. A pi isn't that big..
        
               | hn92726819 wrote:
               | No I don't think that's it either. Home assistant runs on
               | a server somewhere still.
               | 
               | What the top level comment is asking for, completely
               | unrelated to the article mind you, is to have a smart
               | device in the form factor of a light switch that you can
               | hook into your home assistant system.
               | 
               | The problem they likely have (I have it too) is that you
               | set HA up and it can control smart plugs, smart
               | thermostats, etc, but it can't control 99% of the
               | existing lights in your house because they are wired to
               | dumb lightswitches. Instead of some mechanical finger
               | flicking a switch or something, why not uninstall the
               | existing light switch and replace it with a smart one.
        
               | Carrok wrote:
               | So my original comment was not a misunderstanding. They
               | are smart switch drop in replacements.
        
               | hn92726819 wrote:
               | Yeah, you're right. That is a weird request then, or I
               | don't understand it either. I didn't realize something
               | like [1] goes _inside_ your switch. I was expecting a
               | switch with a faceplate combined.
               | 
               | 1: https://us.shelly.com/products/shelly-1-gen3
        
               | Carrok wrote:
               | They also make what you're describing.
        
               | moffkalast wrote:
               | Not the home assistant controller, but a peripheral. A
               | light switch you can toggle manually or through the
               | assistant.
               | 
               | I think the problem with this setup is that it needs to
               | be wifi connected, and if you embed an esp32 inside a
               | wall it will get exactly zero signal. Maybe with external
               | antennas hidden in the switch outer case.
        
               | jazzyjackson wrote:
               | ? I have my house packed to the brim with tplink Wi-Fi
               | smart switches, they work fine.
               | 
               | https://www.tp-link.com/us/home-networking/smart-switch/
        
               | moffkalast wrote:
               | Ah right I forget I'm talking to Americans on an American
               | site, who all have walls made out of wood and gypsum. Try
               | that with brick and steel reinforced concrete lol.
        
               | jazzyjackson wrote:
               | :)
        
               | Carrok wrote:
               | The switches I linked are esp32. They live inside the
               | wall. They get great signal.
        
               | jazzyjackson wrote:
               | Not OP but if I have to have a CPU and microphone for
               | voice commands anyway it doesn't sound crazy to throw a
               | whole pi/relay node into every room of the house that I
               | want to have control of. Pi zero 2 is fifteen bucks and
               | can run Whisper speech2text iirc, throw ChatScript on
               | there for local command decoding and call it a day. I
               | think I'd pay 50 to 100 per room for the convenience,
               | paying a premium to not have my voice surveilled by Alexa
               | just to set timers.
        
               | ragmondo wrote:
               | Without trying to digress, but why not make it modular
               | too ? I.e. base model is a smart switch, one unit is the
               | "base" unit and the rest talk to that. Possibly even add
               | further switches, dials (thermostat or dimmer etc).
               | Perfect placement in my opinion.
        
               | jazzyjackson wrote:
               | Suppose I have a bias for meshnet vs hub and spoke. Seems
               | to me having full power cpu on every mic is going to be
               | better experience latency and glitchwise than streaming
               | audio feeds around. Of course they would still talk to
               | each other to pass commands around.
        
             | ragmondo wrote:
             | Yes - exactly this. If there are multiple needed, then some
             | can be smarter/ more capable than others, but this removes
             | the "just another box and cable(s)" issue.
        
         | throwaway4220 wrote:
         | Would a zigbee or z wave switch fit your needs? It's "offline"
         | but does need a hub
        
         | sirtaj wrote:
         | The now 8-year-old blog post titled "Perfect Home
         | Automation"[1] on the HA website agrees with you from the first
         | heading, and is borne out by my personal experience too. Nobody
         | in your house should need to retrain to do things they are
         | already doing.
         | 
         | 1. https://www.home-assistant.io/blog/2016/01/19/perfect-
         | home-a...
        
       | bradly wrote:
       | Are there any MacOS software versions of this? I've been looking
       | for open source wake-word for a "Hey Siri"-like integration, but
       | I'm very apprehensive of anything, malicious or not, monitoring
       | the sound input for a specific word in an efficient way.
        
         | silentOpen wrote:
         | OpenWakeWord has worked well for me especially using well-
         | trained models like "Hey, Mycroft".
        
       | unshavedyak wrote:
       | Well shoot. Now i want to record everything in my house and
       | transcribe it for logs. I already wanted to do that but didn't
       | think there was a sane way.. assuming this lets me create a
       | custom pipeline, that's wicked
        
       | dboreham wrote:
       | It isn't even one year since the press stories about how dumb a
       | product Alexa was and how it makes no money and all the devs are
       | getting laid off. Something changed now?
        
         | eightysixfour wrote:
         | It was a bad product at making money for Amazon, but they are
         | useful for smart homes. Home Assistant is pretty squarely in
         | the smart home category.
         | 
         | I bought two the second they were announced, I already use the
         | software stack with the m5 atoms and they are terrible devices,
         | but the software works well enough for me.
        
         | iamjackg wrote:
         | Well, the various Echo devices were allegedly built as loss
         | leaders in the hope people would use them to make orders on
         | Amazon. This is backed by the most active open source project
         | on GitHub, which already has extensive support for voice
         | pipelines both with and without LLMs, and is likely priced
         | sensibly.
         | 
         | A lot has changed in the open source ecosystem since commercial
         | assistants were first launched. We have reliable open source
         | wakeword detectors, and cheap/free LLMs can do the intent
         | parsing, response generation, and even action calling.
        
         | weird-eye-issue wrote:
         | Huh? Being able to do things like turn off lights or change the
         | TV volume with your voice is actually quite a nice convenience
        
         | marcosdumay wrote:
         | If it's not clear, the Home Assistant business plan is
         | different from the Amazon one for Alexa... and the Home
         | Assistant open source project is even more different.
        
         | sirtaj wrote:
         | I've been using the HA cloud voice assistant on my phone for
         | the past few weeks, and it's such a great change from Alexa,
         | because integrating new services and adding sentences is
         | actually possible.
         | 
         | Alexa, on the other hand, won't even allow a third party app to
         | read its shopping list. It's no longer clear to me why Alexa
         | even exists any more except as a kitchen timer.
        
           | baq wrote:
           | They _must_ be working on an LLM backend for it so it isn't
           | dumb as a rock.
           | 
           | Nothing makes sense otherwise, agreed.
        
         | jmuguy wrote:
         | Amazon lost 25 billion dollars on Alexa (between 2017 and 2021,
         | from WSJ https://archive.is/uMTOB). Some of that was selling the
         | hardware at a loss, and I imagine a bigger portion was the
         | thousands of people they had working in that division.
         | 
         | So yeah, Alexa is a dumb product... for Amazon. No one uses
         | Alexa to buy anything from Amazon because the only way you can
         | be sure of what you're ordering from Amazon is to be looking at
         | the site. Otherwise you might get cat food from "JOYFUNG BEST
         | Brand 2024" and not Purina.
         | 
         | Voice Assistants for Home Automation, like what Home Assistant
         | is offering, are awesome. And this in particular is exciting
         | exactly because of Alexa's failure as a product. Amazon clearly
         | does not care about Alexa now; it's been getting worse as they
         | try to shoehorn in more and more monetization strategies.
        
           | causal wrote:
           | > "We worried we've hired 10,000 people and we've built a
           | smart timer," said a former senior employee.
           | 
           | How the hell did Amazon hire that many people to develop such
           | low-tech devices?
        
           | drewcoo wrote:
           | > the only way you can be sure of what you're ordering from
           | Amazon is to be looking at the site
           | 
           | Ah . . . an optimist!
        
       | nailer wrote:
       | You should talk to Sonos about partnering with them. They
       | currently have a very limited Sonos voice assist, plus Google
       | Voice and Alexa, but the latter two are limited pre-LLM
       | assistants.
       | 
       | I'm assuming they eventually want to create their own LLM and
       | something privacy focused would be a good match for their
       | customers. I don't know how they feel about open source, though.
        
       | skyde wrote:
       | How does this compare to the ESP32-S3-BOX-3B?
        
       | sreejithr wrote:
       | Genuine question - How hackable is this? Can I have the voice
       | commands redirected to my backend server where I can process them
       | as I please?
        
         | balloob wrote:
         | This is Home Assistant. Everything is hackable.
         | 
         | Inside Home Assistant the processing is delegated to
         | integrations providing Speech-to-Text, command processing,
         | Text-to-Speech. You can make custom integrations for all of
         | them.
        
         | entropicdrifter wrote:
         | It's fully open-source. I think the default use-case is to have
         | the voice commands processed locally
        
         | throwawayq3423 wrote:
         | Probably as much as any other smart speaker without having to
         | give your data away.
        
       | zbrozek wrote:
       | Is anyone aware of an effort to repurpose Echo hardware to do HA
       | voice?
        
         | drdaeman wrote:
         | I've looked into this, and found nothing. One can surely
         | repurpose the case and speakers, but the microphones are
         | soldered on-board, and the board is not hackable and needs to
         | go. To the best of my knowledge, there is no way to load custom
         | firmware on a newer Echo device - they're locked down pretty
         | tight.
        
       | IshKebab wrote:
       | Looks great! The biggest issue I see is music. 90% of my use is
       | "play some music" but none of the major streaming music providers
       | offer APIs for obvious reasons. I'm not sure how you can get
       | around that really.
        
         | antonyt wrote:
         | To do this in Home Assistant, you'd probably want to run Music
         | Assistant and integrate it in. Looks like they manage to
         | support some streaming providers, not entirely sure how:
         | https://music-assistant.io/music-providers/
         | 
         | Getting it to play the right thing from voice commands is a bit
         | of a rabbit hole: https://music-assistant.io/integration/voice/
        
       | steelframe wrote:
       | If it's possible for the hardware to facilitate a use case, the
       | employees working on the product will try to push the limits as
       | far as they possibly can in order to manufacture interesting and
       | challenging problems that will get them higher performance
       | ratings and promotions. They will rationalize away privacy
       | violations by appealing to their "good intentions" and their
       | amazing ability to protect information from nefarious actors. In
       | their minds they are working for "the good guys" who will surely
       | "do the right thing."
       | 
       | At various times in the past, the teams involved in such projects
       | have at least prototyped extremely invasive features with those
       | in-home devices. For example, one engineer I've visited with from
       | a well-known in-home device manufacturer worked on classifiers
       | that could distinguish between two people having sex and one
       | person attacking another in audio captured passively by the
       | microphones.
       | 
       | As the corporate culture and leadership shift over time I have
       | marginal confidence that these prototypes will perpetually remain
       | undeveloped or on-device only. Apple, for instance, has decided
       | to send a significant amount of personal data to their "Private
       | Cloud" and is taking the tactic of opening "enough" of its
       | infrastructure for third-party audit to make an argument that the
       | data they collect will only be used in a way that the user is
       | aware of and approves of. Maybe Apple can get something like that to
       | a good enough state, at least for a time. However, they're
       | inevitably normalizing the practice. I wonder how many
       | competitors will be equally disciplined in their
       | implementations.
       | 
       | So my takeaway is this: If there exists a pathway between a
       | microphone and the Internet that you are not in 100% control
       | over, it's not at all unreasonable to expect that anything and
       | everything that microphone picks up at any time will be captured
       | and stored by someone else. What happens with that audio will --
       | in general -- be kept out of your knowledge and control so long
       | as there is insufficient regulatory oversight.
        
         | comradesmith wrote:
         | Open source
        
           | gh02t wrote:
           | Yeah, OP is comparing this to Google/Amazon/Apple/etc devices
           | but this is being developed by the nonprofit that manages
           | development on Home Assistant and in cooperation with their
           | large community of users. It's a _very_ different attitude
           | driving development of voice remotes for Home Assistant vs.
           | large corporations. They've been around for a while now and
           | have a proven track record of being actual, serious advocates
           | for data privacy and user autonomy. Maybe they won't be
           | forever, but then this thing is open source.
           | 
           | The whole point is that you control what these things do, and
           | that you can run these things fully locally if you want with
           | no internet access, and run your own custom software on them
           | if that's what you want to do. This is a product for the Home
           | Assistant community that will probably never turn much of a
           | profit, nor do I expect it is intended to.
        
       | gigel82 wrote:
       | What is a good GPU to put in a home server that can run the TTS /
       | STT and the local LLM required to make this shine?
       | 
       | A 3090 is too expensive and power hungry. Maybe a 3060 12GB? Is
       | there anything in the "workstation" lineup that is more efficient
       | especially since I don't need the video outs?
        
       | afh1 wrote:
       | My experience with Home Assistant's voice pipeline is that nothing
       | works and STT is terrible. I'll have to wait and see the reviews.
        
       | ryukoposting wrote:
       | My wife and I have been very happy with Home Assistant so far.
       | The one thing we're missing is voice control, and until now it
       | seemed like there just wasn't a clean solution for HA voice
       | control. You were stuck doing some hobbyist shenanigans and hand-
       | writing boatloads of YAML, or you were hooking up a HomeKit/Alexa
       | which defeats the purpose of HA. This is a game-changer.
       | 
       | They recommend an N100 in the blog post, but I might buy one
       | anyway to see if my HA box's Celeron J3455 will do the job.
        
       | Animats wrote:
       | Nice. A totally local voice assistant.
       | 
       | This makes sense for cars, where there's much local stuff to
       | control. But for a home unit, what do you want to do that is
       | entirely local? Turning the heat up and down gets boring after a
       | while. If it does entertainment selection or shopping, it needs
       | outside world connections.
       | 
       | (Today's rant: I recently purchased a humidifier. It's just a
       | little unit with a water tank, a water-softening filter, and an
       | ultrasonic vaporizer. That part works fine. Then there are the
       | controls.
       | 
       | All this thing really needs is an on-off switch and a humidity
       | knob, and maybe lights for power, humidification, and water tank
       | empty. But no. It has five touch buttons and a round display
       | about four inches across. The display is on even if the unit is
       | off. Pressing the on/off button turns it on. If it's humidifying,
       | there's a whole light show. The tank lights up purple. Swooping
       | arcs of blue run up both edges of the round display. It's very
       | impressive, especially in a dark bedroom. If you press and hold
       | the second button for two seconds, about half the light show is
       | suppressed.
       | 
       | There are three fan speeds, and a button for that. Only the
       | highest one will propel the water vapor high enough to avoid it
       | hitting the floor and uselessly condensing before it mixes with
       | the air. So that feature was not necessary.
       | 
       | The display shows one number. It's usually the current humidity,
       | but if you press the humidity set button, the number displayed
       | becomes the setting, which is changed upwards by successive
       | presses until it wraps around. After a few seconds, the display
       | reverts to current humidity.
       | 
       | Turning the unit off or removing the water tank resets all
       | settings to the default.
       | 
       | This is the low-end unit. The next step up comes with an IR
       | remote. It's one way - the remote has buttons but no display.
       | Since you have to be close to the display to use the buttons
       | effectively, that doesn't help much. The step up after that is,
       | inevitably, a cloud-based phone app.
       | 
       | So this thing could potentially be interfaced to a voice
       | assistant. That's only useful if there's enough information
       | coming back from the device that the assistant software knows
       | what the device is doing, and the assistant software understands
       | that device status. If all it does is send remote button pushes,
       | the result will be frustration.
       | 
       | So you need some degree of intelligence at both ends - the end
       | that talks to the human, and the end that talks to the device. If
       | the user says "House, it's too dry in here", the assistant system
       | needs to be able to check the status of the humidifier. Has
       | power? Talking? On? Humidity setting reasonable? Fan running?
       | Tank not empty? If it can't do that, it's part of the problem,
       | not part of the solution.)
        
         | meragrin_ wrote:
         | > what do you want to do that is entirely local?
         | 
         | Keeping my daily life from becoming public? These companies
         | can't be trusted with the information they have. Why should I
         | give them more that they can leak?
        
         | acidburnNSA wrote:
         | Self-hosted isn't always local only. I have a VPN server on my
         | home router and control my Home Assistant worldwide. No
         | corporation controls my access or data.
        
       ___________________________________________________________________
       (page generated 2024-12-20 23:00 UTC)