[HN Gopher] Voice assistants are not doing it for big tech
       ___________________________________________________________________
        
       Voice assistants are not doing it for big tech
        
       Author : rntn
       Score  : 241 points
       Date   : 2022-11-23 09:33 UTC (13 hours ago)
        
 (HTM) web link (www.theregister.com)
 (TXT) w3m dump (www.theregister.com)
        
       | derN3rd wrote:
       | I personally use Alexa for some light control as well as smart
       | blinds I have installed. Sometimes some basic questions ("What is
       | XY?") and besides that nothing more.
       | 
       | The amount of wrong answers or just telling me something it has
       | found on the web, which has nothing to do with the topic is just
       | annoying. Even basic questions seem to not work reliably and
       | looking at the whole Alexa ecosystem it doesn't even feel close
       | to "smart" home.
       | 
       | Lack of context awareness, connection to basic things like parcel
       | delivery, washing machine and similar that would make my life a
       | little easier (because I'm too lazy to open an app to look when my
       | parcels will arrive or when the washing machine will finish) just
       | ruin the whole thing for me.
       | 
       | Besides that, I've tried setting the male voice, but for some
       | responses it just switches back to female Alexa??
        
       | jbverschoor wrote:
       | Well there are so many problems with it (Siri).
       | 
       | 1) Doesn't always understand what I'm saying (voice recognition)
       | 
       | 2) If it does understand (voice recognition), it actually changes
       | it to something else (super dumb, no, idiotic "AI", changes
       | someone I call almost daily into someone I never call)
       | 
       | 3) If it asks me what I want to do with a contact, it doesn't
       | understand "call"
       | 
       | 4) There's no way to 'correct' what has been said.
       | 
       | 5) I can't open "podcasts", because it will open Deezer (I don't
       | use Deezer anymore, but it's still on my phone).
       | 
       | It's really just a really, really dumb command line interface. And
       | if it would just be that without the 'smart' things, it would be
       | a lot more helpful. There are only a few things I use it for:
       | 
       | 1) Call XXX
       | 
       | 2) Alarm at XXX
       | 
       | 3) Open google maps. Sometimes I say "route to xxx using google
       | maps", and that works
       | 
       | Even these three common use cases fail about 10-20% of the time.
       | 
       | When I'm in the car and want to change route or add a poi, want
       | to open an app, or text something, I try, but I get so frustrated
       | which is more distracting than just typing it in your phone. So
       | that's what I do.
       | 
       | Google's voice recognition is a lot better and faster. Also their
       | search understands more context.
       | 
       | Voice assistants are like a skinner experiment. Solution: make it
       | very rigid in terms of operations. People are quick to adapt to
       | this. Somehow the ai-crowd doesn't understand that a UI works
       | best when the things you operate with are at the same location
       | and always respond and work in the same way.
       | 
       | I'd compare the voice assistant experiment as a GUI where the
       | buttons, their function, and labels always change.
        
       | eschneider wrote:
       | I find Siri mostly useful for things like taking messages and
       | playing music when my hands and eyes are occupied with other
       | things, but as many people have pointed out, figuring out what
       | exactly one has to say to get Siri to do its thing is often
       | hard, even when you realize you've got a parsing problem. I mean,
       | ever try to get Siri to play a song by "Them"? Unpossible by band
       | name. :/ What I wouldn't do for a manual or keyword guide...
        
       | shellfishgene wrote:
       | I've had Google home for quite a while now, and, apparently like
       | everyone else, I only really use it to start and stop music, as a
       | kitchen timer, and to make animal noises when kids are visiting.
       | There was no obvious improvement over the years. However with
       | GPT3, DALL-E and all the other amazing stuff coming out I had
       | just assumed Google/Amazon must be working on a big update that
       | really makes Assistant/Alexa 10 fold better. Is this hope in
       | vain?
        
       | jfoster wrote:
       | They're half-baked and I sense consumers starting to lose
       | interest. Third party developers have to do a big deal with
       | Amazon to have a "skill", rather than having an open marketplace
       | where consumers can choose what their device will do? The devices
       | don't really cover all the things consumers might want from them.
       | Years ago, I had assumed Google or Amazon would launch a
       | marketplace akin to Google Play and whoever got there first would
       | be the category winner. Instead, the category became a race to
       | the bottom. Big tech cynically chasing the big deals rather than
       | opening up the platforms.
        
       | stephc_int13 wrote:
       | The harsh reality is that the AI powering those assistants is
       | simply not smart enough to converse with.
       | 
       | Using voice to send commands to a machine is not practical or
       | pleasant for everyone.
       | 
       | This should not be a surprise and it could have been easily
       | predicted, but hype and FOMO pushed big companies to sink
       | billions into this tech.
       | 
       | This is not the first time, and it will happen again, especially
       | because anything touching AI leads to highly inflated/magical
       | expectations.
        
       | ozzythecat wrote:
       | This is an interesting thread :)
       | 
       | I see lots of conjecture on the nature of voice assistants, or
       | why they haven't taken off.
       | 
       | As someone who was at Amazon, and close to this area, let me
       | offer a simpler explanation.
       | 
       | Alexa has stagnated because its leadership has little to no
       | direction. The incentive structure driving the product teams and
       | tech falls under two categories: 1. rest and vest 2. or write
       | documents to build your promotion portfolio and get promoted.
       | Then leave the org to find a job in AWS.
       | 
       | In fact, Alexa could be used as a textbook study in empire
       | building.
       | 
       | * Myriad teams that maintain or increase head count each year,
       | with absolutely no meaningful deliverables.
       | 
       | * Services that could be maintained by a 2-3 person on call that
       | have an entire team of 6-8 engineers.
       | 
       | * Tech directors, Sr. SDMs, and SDMs in a race to build their
       | empire to get promoted to the next level. SDMs solely hiring
       | head count for the sake of having more reports. Other SDMs hiring
       | SDMs below them, even though there's no need for additional
       | management (team is idle), so they can show a larger footprint
       | and get promoted to the next level.
       | 
       | Voice assistant tech may well be hitting a brick wall for
       | technical or human computer interaction gaps that need to be
       | thought much more in depth. But saying this was just an insanely
       | hard technical problem is misleading and missing the bigger
       | picture.
       | 
       | If you build an organization as a pet project and then throw
       | money at it to do whatever the hell it likes, with practically no
       | accountability, of course you won't get meaningful results.
       | 
       | In recent years, Alexa became a place to chill and wait for your
       | RSUs to vest. I personally know great engineers in Alexa, who are
       | now stressed out of their mind about their work and visa
       | situation because of the layoffs. But look - you decided to join
       | a team where you basically just hang out at work, have a series
       | of useless meetings, or half the time, your manager is
       | complaining for some reason that his team doesn't all show up to
       | daily stand ups. A ton of extremely talented folks became
       | extremely lazy, and so here we are.
       | 
       | To root cause: misaligned incentives, lack of accountability and
       | a poor work ethic, shockingly poor given at least some other
       | parts of the company are still on the other, opposite extreme.
        
       | aflag wrote:
       | The main issue for me is that they are not stateful. Perhaps the
       | main thing in the role of an assistant is to keep state. You want
       | someone or something that understands you and what you want, so
       | that you don't have to put too much thought into it.
       | 
       | If you tell it you want more coffee, it should know what you like
       | and suggest a mixture of brands you bought before and new ones
       | you may enjoy. If you tell it you're hungry, depending on the
       | time of day it could suggest you some takeaway you've ordered
       | previously or something else you may like. If you say the same
       | some other time it may suggest recipes based on what you have at
       | home or it may suggest nearby restaurants. It should keep track
       | of your friends and otherwise and tell you when their birthdays
       | are coming and it would be nice if it could even suggest some
       | presents based on things you've told the assistant before, or
       | their wishlist on amazon or something else.
       | 
       | There are a lot of things assistants could do, but it needs to
       | know you. The model where everyone has the same assistant doesn't
       | quite work out.
        
         | somehnacct3757 wrote:
         | Complicating the situation is that I don't trust any of the
         | companies making virtual assistants with this level of personal
         | data; so the first thing I do on a new device is block the
         | assistant's access to my location or any other behavior
         | learning functionalities.
        
           | boplicity wrote:
           | I think that lack of trust is the single biggest factor
           | holding back this technology.
           | 
           | I imagine that a truly top-notch virtual assistant would
           | always be listening and aware of your behavior and context.
           | However, the level of trust required for such monitoring is
           | usually reserved only for one or two people in our lives, and
           | even then, it's quite incomplete awareness on their part.
           | 
           | I don't know how a for-profit company can reconcile this
           | disconnect, though I imagine that someone will eventually
           | try.
        
           | rubyfan wrote:
           | I believe tech companies are aware of the creepiness factor
           | associated with surprising customers with too much context
           | about them. I think they cripple some of the more context
           | aware features that they could be doing because it draws a
           | lot of attention on just how much data they have about you.
        
             | aflag wrote:
             | That could maybe be gradually introduced like that story
             | about the frog which doesn't feel the water getting warm.
             | What put people off is having it knowing everything up
             | ahead through inference on data you were not even aware you
             | had shared.
             | 
             | Instead, you could go easy, by first suggesting the user to
             | set up your assistant by linking it to amazon,
             | deliveroo/uber eats/etc, facebook, and verbally sharing
             | information as it asks. Then, over time, it could spread
             | its inference further and further and you won't be sure
             | whether you already shared that information or not, but you
             | will just assume you did, as you share everything anyway.
             | For people who are not so open, it could stick to inferring
             | less and being less useful.
        
               | Workaccount2 wrote:
               | I would want a company where I pay them for their
               | assistant, knowing that everything it holds is not shared
               | with anyone and completely under my control.
               | 
               | The issue, of course, like paying for YouTube, is that
               | consumers will wonder why they would want to pay for an
               | assistant.
        
               | rubyfan wrote:
               | The market for something you pay for is small. The market
               | for free things is seemingly never ending.
        
             | ghaff wrote:
             | >creepiness factor
             | 
             | This is a complicated social construct even with people. If
             | a public relations person (or whoever in a professional
             | context) reaches out to me, I hope they've done some basic
             | research on what my interests are. I saw you were at
             | $CONFERENCE last month? Sure, probably. Start asking me
             | about my vacation last month that they found photos from on
             | Flickr? Probably getting over a line if I don't know them.
        
         | hadrien01 wrote:
         | Cortana on Windows Phone had a "notebook" of everything it had
         | learned and you could modify it to your liking. For example it
         | detected my home and work locations and transit hours and
         | displayed every weekday at 17h15 bus information before I left,
         | and I could modify that info if it was wrong.
        
           | aflag wrote:
           | Google also identifies commute, it's actually a maps feature
           | and not the assistant's, though.
        
         | est wrote:
         | Yeah I wanted something like a "voice notes" like I gave my
         | kids credit scores to see who earns the most each week, I want
         | my smart speaker to have a simple voice activated k-v store
         | ready, can increase or decrease, get and reset.
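The counter store described above can be sketched in a few lines. This is a minimal illustration, not any smart speaker's actual API: the `ScoreStore` class, the `handle` function, and the "<verb> <name> [by <n>]" phrasing are all assumptions, and the input is taken to be text that has already been transcribed from speech.

```python
import re
from collections import defaultdict


class ScoreStore:
    """In-memory counters keyed by name (hypothetical, for illustration only)."""

    def __init__(self):
        self._scores = defaultdict(int)

    def increase(self, key, amount=1):
        self._scores[key] += amount
        return self._scores[key]

    def decrease(self, key, amount=1):
        self._scores[key] -= amount
        return self._scores[key]

    def get(self, key):
        # Unknown names read as 0 without creating an entry.
        return self._scores.get(key, 0)

    def reset(self, key):
        self._scores[key] = 0
        return 0


# Assumed fixed phrasing: "<verb> <name>" with an optional "by <n>".
_COMMAND = re.compile(r"^(increase|decrease|get|reset)\s+(\w+)(?:\s+by\s+(\d+))?$")


def handle(store, utterance):
    """Apply one transcribed utterance; return the new value, or None if unrecognized."""
    match = _COMMAND.match(utterance.strip().lower())
    if match is None:
        return None
    verb, key, amount = match.groups()
    if verb in ("increase", "decrease"):
        return getattr(store, verb)(key, int(amount) if amount else 1)
    return getattr(store, verb)(key)
```

With a store created as `store = ScoreStore()`, a call such as `handle(store, "increase alice by 3")` adds three points to the `alice` counter, and anything outside the four verbs is rejected rather than guessed at.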
        
       | pipeline_peak wrote:
       | Yeah, cuz it's not 2011 anymore...
       | 
       | Kids in India have smartphones now, no one's impressed
        
       | omega3 wrote:
       | "Someone has to say it:" It's weird that they are pretending to
       | do some cutting edge/against-the-grain journalism/research in
       | this article.
        
       | [deleted]
        
       | anothernewdude wrote:
       | I'd be full on board for a voice assistant, but I control the
       | server and the traffic, so Amazon is a non-starter.
        
         | draugadrotten wrote:
         | Try https://mycroft.ai/
        
       | rndmio wrote:
       | Conversational interfaces just aren't any good, and until they're
       | perfect they won't be useful. The primary issue I've seen (having
       | worked with slack bots) is discoverability, how do you find out
       | what a bot can do without asking lots of open questions? The most
       | useful bots I saw were those that didn't try to be
       | conversational but had a fixed grammar for commands and
       | questions (along with a good help response). And at least with
       | chatbots you've got the textual context you can scroll back on
       | and read as opposed to trying to keep track of what's happening
       | in a conversation with a non human who you can't see.
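A fixed command grammar with a help response, as described in the comment above, might look roughly like the following. It is a sketch under stated assumptions: the `timer` and `status` commands, their handlers, and the reply strings are invented for illustration and do not come from any real bot framework.

```python
# Hypothetical handlers: each returns a reply string, or None if the arguments are invalid.
def _timer(args):
    if len(args) != 1 or not args[0].isdigit():
        return None
    return f"timer set for {args[0]} minutes"


def _status(args):
    return "all systems normal" if not args else None


# The whole grammar lives in one table: verb -> (usage string, handler).
COMMANDS = {
    "timer": ("timer <minutes>", _timer),
    "status": ("status", _status),
}


def help_text():
    """List every accepted command, so users can discover what the bot does."""
    usages = sorted(usage for usage, _ in COMMANDS.values())
    return "Commands: " + "; ".join(usages + ["help"])


def respond(message):
    """Answer one message using only the fixed grammar; never guess at intent."""
    words = message.strip().lower().split()
    if not words or words[0] == "help":
        return help_text()
    entry = COMMANDS.get(words[0])
    if entry is None:
        return f"Unknown command '{words[0]}'. " + help_text()
    usage, handler = entry
    reply = handler(words[1:])
    return reply if reply is not None else f"Usage: {usage}"
```

The design choice here is that every failure path answers with the usage string or the full command list, which addresses the discoverability problem directly: a wrong guess teaches the user the grammar instead of producing an unrelated answer.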
        
       | robg wrote:
       | Love the utility, but there's clearly little business value.
       | Great example where "making something people want" doesn't alone
       | support the bottom line.
        
       | rmac wrote:
       | They did take off. Everyone in here uses them. People all over
       | are using transcription (speech to text) to dictate instead of
       | typing on the terribly small mobile phone keyboards.
       | 
       | The issue apparently is no one knows how to make money
        
       | pgrote wrote:
       | Alexa is an important tool for our family. My hope is amazon
       | doesn't abandon Alexa, but limit what it does and charge a
       | subscription. We use it for alarms, timers, drop ins (inside home
       | and outside), announcements, weather, news, music on Amazon prime
       | and making calls. Simple things that don't require add on skills
       | or advanced commands. Works solid for our needs and it would be
       | difficult to replace. We never buy anything using voice commands.
       | 
       | The problem is a family like us. We haven't bought a new device
       | in 2 years and still use some of the original pucks. Amazon will
       | need to make money somehow so I think a yearly subscription would
       | work. Then again, if it isn't a growth segment I cannot see a
       | tech company keeping something in today's world.
        
         | ghaff wrote:
         | For me, subscriptions need to really earn their keep. I suppose
         | I'd miss my devices if Amazon started charging a subscription,
         | but I certainly wouldn't pay and maybe I'd replace them with
         | something else simpler.
        
       | nunodonato wrote:
       | https://news.ycombinator.com/item?id=33662020
        
       | Robotbeat wrote:
       | I wish voice assistants like Siri could just do functions that
       | are obvious, even if it requires learning commands. Actually
       | especially for in-car voice assistants. I should be able to do
       | any of the critical commands like turning on wipers, etc, via
       | voice commands without looking at a screen or even feeling around
       | for tactile controls with a free hand. If this was reliable and
       | consistent, even if being reliable required learning a list of
       | commands, that'd be a big help while driving.
        
       | jacooper wrote:
       | Good, I hate these spyware devices.
        
       | pmontra wrote:
       | I'm really not surprised. Some sample items I bought in the last
       | months in order of difficulty both for me and (I think) for a
       | voice assistant:
       | 
       | 1. Hey Alexa/Google/Whatever, buy me a saddle, Selle Italia SMP
       | Extra, white. I expect to get a proposal for the best price +
       | shipment and that's it. Easy.
       | 
       | 2. Buy me a replacement head for my Parktool pump. Oops, it's a
       | PFP-3 pump. Ah, the shop selling the saddle doesn't have one... I
       | eventually discovered that that shop has an identical part for a
       | pump of a different brand, probably the same Chinese
       | manufacturer. I think this would be hard for an assistant,
       | borderline to impossible.
       | 
       | 3. Buy me a magnetic mosquito screen of at least X x Y cm, not
       | adhesive or velcro. Oops, I need a frame to mount it on? I'll
       | spare you the discoveries I made along the process. Either the
       | voice assistant is equivalent to a professional installer and can
       | see my door or I'll always do a search with a browser, and watch
       | many videos too.
        
       | userabchn wrote:
       | They may not be very useful at the moment, but if progress in
       | large language models continues at the current pace then I can
       | imagine that conversational AI might be useful in several years.
       | Investments by companies to establish and maintain voice
       | assistant market share might thus eventually pay off.
        
       | tinus_hn wrote:
       | Perhaps it would be a useful experiment to pick a representative
       | group, couple them to real human assistants with complete control
       | to complete requests and collect all the requests that are made.
        
       | ggm wrote:
       | Voice assistants are shit. The number of times my friends have
       | got alexa to turn the light on first time is functionally zero.
       | 
       | And, they don't really explain the syntax constraints. Which are
       | massive.
       | 
       | Try ab initio without knowing how to do it, to get OK google to
       | open an arbitrary google authored app and direct it to do
       | something. Compared to learning how to use the OS UI keyboard
       | shortcuts or applescript. (Which btw like Windows is basically
       | fully documented because all the libraries are self documenting
       | for their call structures)
       | 
       | The voice interfaces are universally badly designed because
       | spoken command sentences are not well understood as a modality of
       | command, distinct from mouse, gesture, touch or keyboard.
       | 
       | Until voice is baked in with a documented syntax in "man" format,
       | I won't believe it's first class.
       | 
       | How do I even know for any arbitrary app what voice directives it
       | uses? How do they correlate to any other command input? How
       | consistent is this with other commands in other apps? Does "stop
       | now" always mean the same thing between a mapping routing app,
       | and a tape backup app? Isn't "stop" contextually defined in a way
       | ^C isn't?
        
         | eternityforest wrote:
         | I bet you could sell a lot of fully offline voice assistants
         | that just did timers and maybe reminders that sync to the phone
         | or other smart speakers with bluetooth. No privacy concern if
         | there's no WiFi at all!
         | 
         | I'll stick to Google assistant for the extra occasionally
         | useful stuff, but the idea of a device that won't stop working
         | if the server goes away is pretty cool.
        
         | zamalek wrote:
         | > The number of times my friends have got alexa to turn the
         | light on first time is functionally zero.
         | 
         | Do your friends have accents? Alexa frequently fucks up due to
         | my South African/Zimbabwean accent. I joke that Alexa is
         | racist. Google seems to handle it _way_ better. Here's hoping
         | that Mycroft does better (which I plan to switch to in the
         | coming months).
        
         | dmitriid wrote:
         | > The number of times my friends have got alexa to turn the
         | light on first time is functionally zero.
         | 
         | I have a room called "Study" and the only lights there are
         | called "Main Lights".
         | 
         | - Hey, Siri, study lights 100%
         | 
         | - Did you mean these lights: "Study: Main Lights"
         | 
         | :facepalm:
        
           | r0m4n0 wrote:
           | I don't disagree with the stupidity sometimes but I have
           | probably upwards of 50 light bulbs/plugs connected to my echo
           | speakers and use them exclusively for turning on my lamps,
           | dimming lights etc. so I can say confidently, it makes
           | mistakes sometimes but once you learn the quirks and set
           | things up with proper names it's actually pretty impressive.
           | And you can learn the oddities, you just have to be
           | dedicated, sorta like learning anything else.
           | 
           | I've learned that I can't say "what's the temperature"
           | because it will tell me the temperature my Dyson fan is set
           | to lol. It's not wrong, it's just missing some context (me
           | holding my jacket wondering if I should wear it) that I can
           | provide going forward. Maybe I should just ask it "should I
           | wear my jacket today"
        
             | dmitriid wrote:
             | That's the issue though: there are too many quirks even for
             | the most simple of interactions. You have to be constantly
             | aware of quirks, and issues, and workarounds. It's like
             | walking on broken glass.
        
             | spookthesunset wrote:
             | > set things up with proper names
             | 
             | The hard part isn't naming the stuff, it's remembering what
             | each thing is named. I could call the lamp next to the
             | couch several different things depending on the context. If
             | you don't remember _exactly_ what it is named you are gonna
             | have a rough time. I've been tempted to just put a label
             | with the device name on everything controlled by Alexa.
             | 
             | And good luck getting a guest to turn anything on or off.
        
               | dmitriid wrote:
               | > I could call the lamp next to the couch several
               | different things depending on the context.
               | 
               | Or just call it dude :)
               | https://reddit.com/r/tumblr/comments/548b3p/everything_is_du...
        
           | ggm wrote:
           | Yea, a single member set shouldn't require full enumeration,
           | it's detectably the only operating device in context.
           | 
           | If it was the only "main light" across all rooms I'd hesitate
           | to say the same mind you, nesting scopes should be respected.
           | 
           | What would you want it to do when you add a second light
           | bulb? Tell you the old command is no longer unique or turn on
           | both?
        
             | dmitriid wrote:
             | > What would you want it to do when you add a second light
             | bulb? Tell you the old command is no longer unique or turn
             | on both?
             | 
             | I think when the command is generic "lights" it should turn
             | on all lights. But both are valid behaviours.
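The scoping behaviour discussed in this sub-thread (a generic "lights" request acts on every light in the named room, with no confirmation prompt when there is only one) can be sketched as below. The `Light` type and `resolve` function are hypothetical names for illustration; this is not how Siri, Alexa, or Google Home actually resolve devices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Light:
    room: str
    name: str


def resolve(lights, room, name=None):
    """Return the lights a request targets, scoped to one room.

    With no name (a generic "lights" request) every light in the room
    matches, so a single-light room needs no disambiguation and adding a
    second bulb later simply widens the match. With a name, only the
    light with that exact name in that room matches.
    """
    in_room = [light for light in lights if light.room.lower() == room.lower()]
    if name is None:
        return in_room
    return [light for light in in_room if light.name.lower() == name.lower()]
```

Under this rule "study lights 100%" maps to `resolve(lights, "study")` and returns the one "Main Lights" entry directly. Because the lookup is always scoped to a room, a "Main Lights" device in another room never makes the request ambiguous.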
        
       | kozikow wrote:
       | I have had a voice assistant for a few years (Google Home). I tried to
       | use it for many things, but in the end I settled only for playing
       | music, setting alarm, find my phone, what's the time, and "what
       | sound does a monkey make" when I am holding a baby.
        
       | wiradikusuma wrote:
       | "consumers were just as happy to sit down and click away until
       | they had the basket they wanted" -- same reason why although many
       | people hate lifting their butt to go shopping, some would happily
       | do. Nobody likes checkout lines though, that's universal.
        
       | harshaw wrote:
       | There is a monetization path, but not likely easy like selling
       | direct to consumers. As others have commented - voice (and google
       | glass!!) are useful when you don't have hands. For example, a
       | dentist frequently needs to look at the computer screen or have
       | the computer screen change a display while they are in the chair
       | working on a patient. Making a command via voice seems useful
       | (change to the next screen, etc) and/or also potentially showing
       | some data in a heads up display. Another vertical could be
       | hospital - being able to use a generic voice UI to turn on the
       | lights and summon help seems useful.
       | 
       | Is the monetization path hard - yes. Trying to break into
       | specific industries takes a ton of work. Additionally - you
       | already have staff in these cases. Would a dentist get rid of her
       | assistant for some voice commands? not likely. A hospital still
       | needs nurses.
        
       | totalhack wrote:
       | My favorite part is when my Google Home stops the music upon my
       | request and then tells me in a complete and unnecessary sentence
       | that it has stopped the music.
        
       | thrwawy74 wrote:
       | What annoys me the most is this:
       | 
       | "Hey Google, turn off the den light switch in 30 minutes"
       | 
       | "Sorry, for safety reasons we cannot...."
       | 
       | It's a light. Because it heard "switch" it thinks there might be
       | some power tool connected to it and won't let me set delayed
       | actions. I want to be intelligent with it like "Hey Google, turn
       | the lights on when the sun comes up every day" but no one has
       | gone to that next step.
       | 
       | Or how about "Hey Google, turn off the tone played when you say
       | Hey Google". These settings aren't accessible from the voice
       | interface itself.
       | 
       | Can't wait for Alexa to fail so my SmartTV will stop nagging me
       | to use integrations I will never use. Anti-competitive but
       | whatevs.
        
         | joncrocks wrote:
         | I think you can set 'routines' now, might be able to help with
         | sunrise actions -
         | https://support.google.com/assistant/answer/7672035?hl=en-GB...
        
         | thethethethe wrote:
         | You can change the device type in Google home to work around
         | the scheduling issue
        
         | twelvedogs wrote:
         | some of the stuff google don't put effort into is super weird,
         | like the whole if the device is in a different "room" it has to
         | loudly announce what it's doing but if it's in the same room it
         | turns down the volume on all of the devices for 10 seconds
         | 
         | really makes me think they almost want to discontinue it but
         | it's so integrated with android and the chromecast they can't
         | really kill it
        
       | JamesGriffin wrote:
       | U
        
       | licebmi__at__ wrote:
       | At least for my family, the voice recognition seems to be getting
       | worse and worse with each update. Nowadays google doesn't seem to
       | turn off the alarms or the music after several tries so my son
       | just comes and unplugs it (touch controls are also a fucking
       | mystery), my wife tries to make her phone ring so many times she
       | can find it without help in the meantime. Alexa doesn't understand
       | what I try to order and it's easier to just find my phone and
       | type it. I thought that somehow the devices were just having
       | physical issues with time, but even a new one has the same
       | problems. So far it seems like I would get a better experience by
       | rolling my own stuff and training it with our specific voices.
        
       | OtomotO wrote:
       | I feel like voice assistants sound super awesome in theory and a
       | lot of nerds, me included, first think about "Computer, replicate
       | me some Wiener Schnitzel", but reality is really something
       | completely different, even when ignoring the replication part.
        
       | albertzeyer wrote:
       | Following the argument that the biggest problem is natural
       | language understanding, I wonder what happens when you put in
       | some GPT3-like model, give it all relevant context like past
       | conversations, other user information, etc. That should be much
       | more capable in understanding what the user wants than current
       | systems.
       | 
       | There are obviously still many open research questions. Like how
       | this is then combined with actually performing the commands. But
       | there are also solutions to this.
       | 
       | This is still somewhat open active research. But given such
       | powerful models, I think in principle we could make such devices
       | much more useful.
        
       | OtomotO wrote:
       | I use voice assistants for one task:
       | 
       | Setting a timer.
       | 
       | The oven is in another area of the house, so when I come back to
       | squeeze in some work, I often just say " _Voice Assistant_ , set
       | a timer for 10 minutes!"
       | 
       | And that's about it.
       | 
       | Apart from that I worked on some chat-bots in the past and it's
       | the very same thing to me, just even a bit worse because of
       | audio.
       | 
       | Natural language processing simply isn't there AND there are just
       | a few very niche use cases in my eyes.
       | 
       | So if they go away again, I won't cry or rather my tears will dry
       | fast.
        
       | headsoup wrote:
       | Perhaps what value they're getting out of voice services is not
       | what you think it is.
       | 
       | Data, training, pattern recognition, language.
        
         | Sakos wrote:
         | But then you have news articles talking about how Alexa lost
         | $10 billion and I'm wondering if these companies even know what
         | they're getting out of it.
        
       | JCM9 wrote:
       | I'd argue this space was technically successful but not
       | commercially so. We love our "smart speakers" and have integrated
       | them into home automation. Does anyone make any money when we use
       | them? No.
        
       | jleyank wrote:
       | Admittedly I only skimmed the comments but "how will they make
       | money from it?" wasn't acknowledged. Hell of a lot of effort to
       | build up an inference engine for natural language processing and
       | they're going to want to profit from it. Unless, as some have
       | noted, surveillance will be their funding.
        
       | zmmmmm wrote:
       | I find it perplexing that voice assistants actually seem worse at
       | basic functions than they were ten years ago. I remember being
       | amazed how well it could interpret requests to set reminders with
       | flexible language options. Now google gets it wrong so often I've
       | nearly stopped using it.
        
       | 2sk21 wrote:
       | Despite the tremendous amount of effort that's gone into creating
       | large language models, there is still no way to hold a goal-
       | directed conversation with voice assistants. There is a lot of
       | implicit context in normal human speech that needs to be inferred
       | or clarified. None of these speech assistants can handle
       | anything more than the most rudimentary clarification dialog.
        
       | djhworld wrote:
       | I bought the original Echo back when it first came out.
       | 
       | It was a neat internet radio device and Bluetooth speaker but
       | outside of making it play the radio I never really used the voice
       | functionality much.
       | 
       | I remember one of the places where I buy groceries from offering
       | an incentive if you ordered your shopping via the Alexa skill, it
       | was so painful to use that I added everything to my basket using
       | the website and then just added 1 unambiguous item using the
       | skill and checked out. They gave me the discount/incentive but I
       | never used it again.
       | 
       | I moved house but never plugged it back in and don't really miss
       | it.
       | 
       | I also remember going to AWS re:invent in 2016 and they gave
       | every attendee an echo dot, with a lot of fanfare to encourage
       | Devs to make skills. I tried to make one but it was so
       | convoluted, I gave up in the end and just gave the Dot to my
       | parents. It broke after about a year, they never replaced it
       | either
        
       | [deleted]
        
       | nmca wrote:
       | What's entertaining is that they were just too soon. Give it at
       | the very most a couple more years and the state of the field will
       | be very, very interesting.
        
       | asim wrote:
       | They were too early. A decade from now when NLP is bullet proof
       | and AI has made significant advances we'll revisit this.
       | Pragmatically it makes sense at this juncture to focus on the API
       | behind the voice interface that lets you issue commands. That's
       | the true power. That's effectively a global catalog of services
       | that can do anything, driven through one API. We can revisit
       | voice but this is where the starting line is IMO.
        
       | jhoelzel wrote:
       | I think, or at least for me, one of the major frustration points
       | with voice ai is how dumb it has been for the last two decades.
       | 
       | When you tell a Mercedes to navigate to "exact road and address"
       | it barely gets half of it right, and usually that's just the
       | number.
       | 
       | Now I don't have a speech impairment and have tried in 3
       | languages, none of them work to my sophistication.
       | 
       | The worst thing though, personally, is all the shitty patents
       | around voice activation keywords. It's disgusting.
       | 
       | Finally i am also an IOT guy with a lot of toys, and i would have
       | written my own personal assistant if it would not be for the
       | patents. After all the hardware is surprisingly cheap.
       | 
       | All the problems are solved too. but if you believe i would write
       | free code for megacorp amazon without pay, you might be heavily
       | mistaken.
        
       | imdsm wrote:
       | As Shank said, there needs to be a GPT element to voice
       | assistants. At the moment, Alexa is a voice interface, but is
       | "dumb" essentially, there's no understanding, intonation, memory,
       | etc. A voice assistant that had those would be a fully fledged AI
       | that grows over time, as with most Sci Fi smart houses. I think
       | it's possible -- that we'll receive units that have base AIs that
       | then grow with time, but we're not there yet, not close. The
       | current problem with GPT-like AIs is that you can't trust what
       | they're going to say. They're interesting, useful, but there
       | still lacks that feeling that it's anything except mathematical
       | probability based auto-completion.
        
       | kybernetyk wrote:
       | Voice assistants are infuriating. Every day I feel like Siri and
       | Alexa understand less and become worse.
        
         | taviso wrote:
         | I got into the habit of asking things like "what is the weather
         | like today" every morning before work, it worked fine for
         | months and I quite liked it - while I'm occupied with something
         | else it can give me some useful information. Then one day it
         | started giving me the dictionary definition of the word
         | "weather". That continued for over a week, so I just lost the
         | habit and don't think I've really used it since.
        
       | eunos wrote:
       | Now this is not the most politically correct statement. But I
       | wonder what the situation is like in, say, China or other major
       | non-Western countries? I think in China voice AI is more hyped.
        
       | TEP_Kim_Il_Sung wrote:
       | I don't like the centralized nature of them.
       | 
       | If I could download it and run it on a private server in the
       | basement, without any ties to the cloud, and with my own settings
       | for privacy, then I'd be more willing to use the things.
        
         | encryptluks2 wrote:
         | Same. I actually really enjoy using it for basic functions but
         | I am too concerned about privacy to use one for anything more
         | advanced. I don't know if I would ever trust a Siri or Alexa or
         | Google Assistant even if they claimed to work offline because
         | all providers have had a terrible track record. If the Linux
         | Foundation or reputable open source project had a solid open
         | source solution that promised to work 80% as well and had a
         | fairly straightforward installer, then I'd be more apt to
         | interact with it regularly. One other issue I've found with
         | Alexa and others is their integration with proprietary lock-in
         | ecosystems for calendars, reminders, lists, etc. Ideally with
         | an open source solution you could do things like set a write
         | only calendar that uses your CalDav calendar and CardDav for
         | contacts, or some non-invasive solution for messaging and
         | calling.
        
           | troyvit wrote:
           | Mycroft.ai has some of what you need. They still host some
           | functionality but it's on their road map to make their voice
           | assistant offline capable.
        
         | sramsay wrote:
         | This is exactly how I feel. On the one hand, this technology
         | seems completely miraculous (I remember watching Star Trek TNG
         | back in the day and thinking the "computer" was about the
         | coolest thing imaginable).
         | 
         | Now it's here, it works amazingly well (considering what it's
         | doing), and . . . I'm talking to Apple, Google, and Amazon? No
         | thanks . . .
        
       | craigmcnamara wrote:
       | No matter how free they make it to use, I just do not want a
       | robot listening to me all the time.
        
         | twstdzppr wrote:
         | Muahahaha, there's no escape! Maybe dial down the paranoia.
         | It's a bit unhealthy.
        
       | corobo wrote:
       | I hope this means people will be looking at jailbreaking these
       | devices. I'd do it myself but that stuff is beyond me at this
       | time.
       | 
       | I'd love to repurpose my Alexas into satellite Rhasspy[1] devices
       | if Alexa retires.
       | 
       | 1: on HN the other day
       | https://news.ycombinator.com/item?id=33705938
       | 
       | & direct link to the satellite info
       | https://rhasspy.readthedocs.io/en/latest/tutorials/#server-w...
        
       | lifeisstillgood wrote:
       | My hot take is this is the start of vastly (better/worse)
       | surveillance- that can be a force for good or bad. Let's start
       | with business meetings - the tech is close enough to be able to
       | record, transcribe and summarise all decisions and action points
       | - which sounds great and may get put into place. Then it can
       | assess how well a manager was "coaching" the meeting - getting
       | appropriate feedback from all participants - then were all
       | necessary participants there ? Then we can see training sessions
       | geared specifically towards manager A and their problem in
       | encouraging positive performance from their team with vague
       | goals.
       | 
       | Wouldn't we all like better managers with clearer goals?
       | 
       | Roll this out to a doctor's bedside manner, or the mistaken
       | command given to a nurse that was perhaps misheard and the
       | wrong medicine given
       | 
       | or ...
       | 
       | We can surveil all our waking lives - and almost the only thing
       | social sciences get right is large population statistics- we can
       | find out who leads happiest best lives and be taught how to do it
       | better.
       | 
       | Or we can stomp out individuality and live in a totalitarian
       | nightmare.
       | 
       | But we don't get to not play the game
        
       | cik wrote:
       | I've been using these ever since they came out. I've tried..
       | enough of them, including the open source, self hosted things. I
       | have a tonne of them all over the place, and after almost 8 years
       | of this, here are all of my use cases. To be clear, I just don't
       | get why people care:
       | 
       | 1. Play me music or a podcast (this fails a shockingly high
       | amount)
       | 
       | 2. Call person X
       | 
       | 3. Convert unit to unit
       | 
       | 4. Turn devices on or off
       | 
       | 5. Timer set
       | 
       | 6. Reminder set
       | 
       | 7. What time does Shabbat (my Sabbath) start.
       | 
       | All of the above can be done with my phone, so there's really no
       | point in them - for me.
        
       | locusofself wrote:
       | I have a stereo pair of OG homepods, and it never ceases to amaze
       | me how stupid they are. Songs I played yesterday are mysteriously
       | unavailable, or I'll ask for a song I've played a million times
       | for my daughter and it will decide to play a totally different,
       | highly inappropriate rap song just because it also has the word
       | "toothbrush" in the title or something.
       | 
       | I use my Alexa as a timer only, and when it started saying things
       | to me like I have notifications or "did you know you can use
       | Alexa to order such and such" when I would say "Alexa, stop the
       | timer", I almost threw it out the window.
        
       | transfire wrote:
       | Well guess what happens without open platforms? We need open tech
       | to allow anyone to create voice driven "skills" -- like websites.
       | 
       | Not everything can live on ad dollars.
        
       | PeterCorless wrote:
       | Has anyone also considered that people have voice recognition
       | systems in their smartphone and thus a standalone voice
       | recognition device isn't necessary?
       | 
       | Or that the results on a smartphone can be visual, whereas these
       | home devices don't have direct IO apart from voice? [And yes,
       | while they can be hooked to exogenous devices like a television,
       | that's an extra step of configuration to finagle...]
        
       | antirez wrote:
       | It's not the technology. Voice transcription works great, and
       | from the point of view of extracting meaning, even pattern
       | matching would do better than the embarrassing failure of today's
       | voice assistants. It's a matter of product. We are living in an
       | IT world where there are no great people able to turn the
       | technological potential into useful things.
        
         | ggerganov wrote:
         | I agree, but I think things will likely change soon. The main
         | thing holding back progress in this field seems to be privacy
         | issues and sending data to the cloud. However, we already have
         | human-level voice transcription that you can run on low-to-moderate
         | hardware at home, so all it takes is for the average developer
         | to be able to run GPT-like LLMs on their machine. At this point
         | I think the quality will improve really fast.
        
       | deafpolygon wrote:
       | Computer, earl grey, hot.
        
       | ericol wrote:
       | This (for me) just shows a normal trend in tech.
       | 
       | Somebody develops something, gets a lot of buzz, doesn't deliver,
       | buzz dies slowly, and then several years in the future somebody
       | else actually builds the tech stack needed for it, and it takes
       | off again.
       | 
       | Voice assistance, chat robots [1]... the metaverse.
       | 
       | [1] Several years back I attended a chat in my city where some
       | people from IBM showed us how to implement a chatbot I think with
       | Watson.
       | 
       | Beyond the trivial examples, it was nearly impossible to
       | implement anything (Or the people giving the chat didn't know how
       | to implement those) but the documentation was nowhere to be found
       | beyond those trivial examples.
        
       | nailer wrote:
       | I use Alexa on a house filled with Sonos speakers every day. She
       | barely works and is the worst advertisement for Amazon.
       | 
       | "Alexa turn on my FireTV" (doesn't work, weirdly FireTV and Alexa
       | speakers don't seem to get along)
       | 
       | "Alexa play The Killers" (responds with "Playing the killers on
       | Spotify" then silence)
       | 
       | "Alexa play The Killers on Spotify" (works)
       | 
       | "Alexa play The Killers" (sometimes tells me to start scanning
       | devices and install a skill for Sonos)
       | 
       | "Alexa wikipedia 'history of Bulgaria'" (works but asks me if I
       | want to continue after every sentence, so I give up)
       | 
       | "Alexa stop" (doesn't stop for some apps, Amazon should have
       | support for silencing as a minimum requirement for apps and
       | implement worst-case scenarios if an app doesn't respond to stop
       | in time)
       | 
       | "Alexa rewind 30 seconds" (doesn't rewind for some apps, Amazon
       | should have support for rewinding played recordings as a minimum
       | requirement for apps)
        
       | Crosseye_Jack wrote:
       | My issue with trying to use voice assistants to purchase items
       | for me is that I've found the experience not that great when
       | doing so and had to resort to using my phone/computer anyway, so
       | it wasn't really worth it for something other than adding an item
       | to my cart as a reminder that I wanted to purchase that item when
       | I wasn't at my desk.
       | 
       | For me, the utility I get out of my Echos is home automation
       | control to control heating, lighting, audio when my hands are
       | full or their controls are out of reach. But you are never going
       | to make a fortune out of house voice control when you sell the
       | voice assistants at or near cost (I don't think I've ever paid
       | full price for any of my Echos, and I tell people considering
       | getting one to wait till they go on sale, as Amazon always seem
       | to be offering a deal/discount on them every few weeks).
       | 
       | But what does frustrate me is that Amazon don't seem to uniformly
       | "monetise" things across the Alexa platform, for example when I
       | found out about the Samuel Jackson voice for Alexa I wanted to
       | buy it even if it was just a novelty, but it's not available in
       | the U.K.
        
       | whywhywhywhy wrote:
       | Just don't think they have improved in any meaningful way since
       | launch and in fact some of the experiences have gotten worse.
       | 
       | At launch I could reliably "Hey Siri" a timer across the room,
       | now it just doesn't work because presumably Apple have downgraded
       | the long range microphone tech at some point to save costs.
       | 
       | Eventually just stopped bothering and just set them manually.
        
       | meindnoch wrote:
       | Speaking to a voice assistant is like speaking to a toddler. They
       | have limited vocabulary, limited comprehension and limited
       | ability to perform what you've asked them. The only difference is
       | that a toddler stops being a toddler after a year or so, while
       | voice assistants stay perpetually dumb.
        
       | mordymoop wrote:
       | A huge source of subjective user frustration with voice
       | assistants is the lack of responsiveness to both commands and
       | interruptions.
       | 
       | Siri in particular will wait as long as a full second to "ding"
       | letting you know she's ready to process a command ... often
       | interrupting your attempt to give the command. Alexa will very
       | often fail to hear me say "Alexa" despite the audio conditions
       | being quite favorable.
       | 
       | Both Siri and Alexa will misunderstand what you asked for and be
       | completely indifferent to your attempts to correct them unless
       | you interrupt them _correctly._ Afaict Siri can't be interrupted
       | by voice alone and Alexa can only be interrupted by something
       | starting with "Alexa". Once you are already "in" an interaction
       | session with either of these tools you should be able to simply
       | say "no, not that" or "no, <correction>" and the assistant
       | should stop talking immediately, instantly, and not continue to
       | blather on with the previous wrong interpretation.
       | 
       | No matter how useful these tools are in the hands of a power
       | user, Siri and Alexa give the impression of being morons largely
       | because of unresponsiveness, not lack of capability.
        
       | [deleted]
        
       | silon42 wrote:
       | I would never choose to talk to the machine unless there's no
       | alternative...
       | 
       | There's a reason voicemail has largely been replaced by email,
       | etc...
       | 
       | Then there's the spying issue.
        
       | andrewstuart wrote:
       | They managed to control the tech so tightly that no potential is
       | fulfilled and they're boring.
       | 
       | Truly opened to developers they could have been really
       | interesting and fun.
       | 
       | This is what happens when big companies develop technologies and
       | think they are too valuable to share.
        
         | lixtra wrote:
         | This also pissed me off. Why not start the day with a weather
         | report, short news and some trivia. When I checked long time
         | ago I didn't find any way to integrate stuff like that
         | together.
        
           | laidoffamazon wrote:
           | It's possible to do that with a morning routine in Alexa. I
           | know this because it tells me the weather (and other things)
           | after I kill my alarm in the morning.
        
           | stirlo wrote:
           | If you check again now this is an out-of-the-box feature on
           | Google Assistant. You just say "good morning" and it tells
           | you the time, weather, news, and more.
        
       | MarkLowenstein wrote:
       | Voice assistants _do_ have a lot of potential due to the
       | fantastic ergonomics. "Voice command line" is used as a
       | pejorative here but it's true and a _good_ thing. The original
       | visionaries were right to pursue this idea.
       | 
       | To me where it's gone wrong is that like many other things the A
       | team conceives it but the B team is responsible for its
       | development. The A team will ask "what will users want to say to
       | it"? But Team B says "well if they say 'blah' how do we _know_
       | that they mean 'blah'"? Mediocrity creeps in.
       | 
       | As an example, here is a typical dialog between me and my Alexa.
       | 
       | "Alexa, play Brahms C Minor Piano Quartet."
       | 
       | "Playing: music like Coldplay."
       | 
       | "NO"
       | 
       | [music like Coldplay]
       | 
       | "Alexa NO"
       | 
       | [music like Coldplay]
       | 
       | "STOP"
       | 
       | [music like Coldplay]
       | 
       | "ALEXA STOP STOP"
        
       | galaxyLogic wrote:
       | The problem is simply that voice is a 1-D (linear) interface
       | whereas a computer screen is 2-D.
       | 
       | So when I interact with the screen it can show me all my possible
       | choices at the same time at each situation. I click a menu-item
       | and a sub-menu pops up.
       | 
       | With a voice-assistant I would have to wait for the assistant to
       | list all possible choices but if they are many that takes a long
       | time. You can only hear one word at a time. Combine that with
       | inaccuracies of voice-recognition and it is clear that a computer
       | screen + mouse is a much better interface.
       | 
       | Hey it's the same difference as with Radio vs. TV.
       | 
       | A computer screen-interface might be usefully augmented with
       | voice recognition. But working with voice alone is like working
       | blind. Gee who could've thought voice is not the killer-tech of
       | 3rd millennium.
        
       | butterNaN wrote:
       | What are some privacy respecting voice assistants?
        
       | aaron695 wrote:
        
       | schnitzelstoat wrote:
       | It reminds me of the chatbot fever there was a few years back, it
       | seems like every company wanted a chatbot. Even though as a user
       | I much prefer a normal menu.
       | 
       | I think voice assistants will continue to have good use cases in
       | cars, kitchens etc. where you can't use your hands so the trade-
       | off is worthwhile.
        
       | yashg wrote:
       | I own a 3rd gen Echo Dot. I am not sure where it is right now. It
       | needs to be plugged in, so useless as a portable speaker. I don't
       | control anything via voice in my home. The only time I use voice
       | commands is while driving to tell my phone to play some specific
       | song or call someone. I have a few smart lights in my home and I
       | always find it much quicker to change color etc by using the app
       | since the phone is always within reach than call out a voice
       | command.
        
       | mark_l_watson wrote:
       | While I get why Amazon, Microsoft, and Google (?) might want
       | to reduce development costs for voice interfaces, it seems like
       | Apple is locked in to supporting Siri since one of its products,
       | the Apple Watch, really needs Siri to get full value from it. I
       | like to go about my day without carrying a phone (if I don't need
       | to take pictures) and the Apple Watch is a great compromise that
       | allows getting calls and text messages, but at the same time is
       | not as intrusive as an iPhone. No one sits in a restaurant staring
       | lovingly at content on their Apple Watch, ignoring nearby people
       | and their environment.
       | 
       | Pardon my getting a bit off topic, but it is interesting to see
       | the "belt tightening" by FANGs: it seems like everyone is cutting
       | excess staff and looking hard at which products make money and
       | which are money losers. This may seem like a good thing except
       | for newer product categories like AR/VR that will need a lot of
       | experimentation to get right.
        
       | samwillis wrote:
       | I think it's fairly clear now that the only time a voice based UI
       | is better is when the user is unable to use their hands. Driving
       | or in the kitchen when cooking seem to be the most successful.
       | There are barely any other strong use cases.
       | 
       | On top of that the general distrust of the privacy of these
       | systems has stopped a significant number of people (myself
       | included) from wanting to use them at all. I don't have an in home
       | device, and have turned off Siri on my Apple devices.
        
         | red_Seashell_32 wrote:
         | Or in countries where we get snow during Autumn. Getting
         | messages read out loud, and responding with voice-to-text, is
         | great too
        
         | goosedragons wrote:
         | I don't think that's true. It's way faster for things like
         | smart lights and playing music. The problem is it's still so
         | limited for other things. For example I can tell Alexa to play
         | The Simpsons on Fire TV, it will do so on Disney+ but always
         | the first episode even though the last watched one was in
         | Season 15. It also can't seem to find my purchased episodes
         | from iTunes (watchable with the Apple TV app on Fire TV).
         | Simple searches also have a high chance of being misunderstood
         | still with poor results.
         | 
         | I think if the accuracy was better and more content/things were
         | available through voice it would be a pretty good input method
         | for any scenario where you don't need visual feedback.
        
           | phh wrote:
           | This is a real problem, though the problem is not technical,
           | it's purely contract/legal: Disney+, AppleTV (all content
           | providers actually) refuse to allow third parties to know
           | what you've watched, because viewer data is closely protected.
           | 
           | That's why an open source ToS-violating assistant has a chance
           | to work better than legal ones: it can just scrape all
           | that info off the internet. But then, once you go into that
           | grey area, you just end up pirating content already.
        
           | samwillis wrote:
           | Selecting content via voice only works when you can either
           | name exactly the content you want, or you are choosing a
           | general category to play.
           | 
           | Browsing content, or looking it up, via voice is slow and
           | painful, it always will be. Who wants to be saying "next",
           | "scroll down", or have a full on conversation with an AI to
           | try and work out what you want to play? Our fingers on our
           | hands have evolved to be incredible at interacting with
           | things, we are good at using them. Touch screens or physical
           | UIs will never be superseded by voice.
           | 
           | So yes, there is a small use case for voice for controlling
           | music/tv, or controlling a few things in the home (heating,
           | blinds, lights) but that's it, I don't believe there is this
           | massive opportunity to expand it into our everyday lives
           | where we are constantly interacting with devices via voice.
        
             | goosedragons wrote:
             | You don't need to name content exactly. Saying "Alexa, play
             | the sandwich song from Frozen" for example will correctly
             | play Love is an Open Door. Ideally this kind of thing would
             | work for TV and movies too, web searches as well.
             | 
             | Humans evolved language to communicate ideas, wants and
             | desires to others for thousands of years. Obviously voice
             | UI is not there now but maybe someday the experience won't
             | be much different than asking the movie rental store clerk
             | for their recommendations for a romantic comedy.
        
               | samwillis wrote:
               | > asking the movie rental store clerk for their
               | recommendations for a romantic comedy
               | 
               | The only people who ever did that were in a romantic
               | comedy.
        
               | simiones wrote:
               | That is still an example of knowing the content that you
               | want to play exactly, you just don't remember its
               | official name - and yes, search engines have gotten
               | pretty good at this.
               | 
               | The more relevant use case is "hmm, I'd like to listen to
               | some prog rock, let me browse what Spotify has and see
               | what takes my fancy". Sure, I could say "VA, play prog
               | rock", but I don't want it to choose for me: I want to
               | browse the available content to remind myself what are my
               | options, and choose one when I see one that looks
               | interesting.
        
               | dwighttk wrote:
               | >Saying "Alexa, play the sandwich song from Frozen" for
               | example will correctly play Love is an Open Door.
               | 
               | I bet ya some engineer at Amazon hooked that up manually
               | when they saw a bunch of requests failing, so that's only
               | gonna work for popular fuzzy naming conventions. I don't
               | want to have to think "is this a way lots of people are
               | gonna request this song?" before saying it that way.
        
           | red_Seashell_32 wrote:
           | To be fair, that's an issue with siri integration, not siri
           | itself. Kinda sad that Apple's own products don't implement
           | that properly :/
        
         | dmitriid wrote:
         | > Driving or in the kitchen when cooking seem to be the most
         | successful.
         | 
         | Since the voice assistants are incredibly stupid I find it
         | extremely stressful and distracting to ask them for anything
         | while driving.
        
           | enobrev wrote:
           | Google (android auto) was significantly better at this early
           | on than it is now. I used to be able to search random topics
           | by voice while driving, and it would read me excerpts and
           | results. I used it often. Now it's map-specific, messaging,
           | or music-specific and nothing else.
        
           | swores wrote:
           | Figuring out what works well or not while driving isn't a
           | great idea, but using the ones that work well seems fine for
           | most people.
           | 
           | Saying "Hey Siri, text Fred <pause> I'm on my way but stuck
           | in traffic, eta 4 o'clock" or something along those lines
           | nearly always works fine for me and is no more distracting
           | than having a conversation with somebody in the car with me.
           | If Siri gets some of the message wrong I'll either send a new
           | one using clearer speech or wait until I'm not driving to fix
           | it if the mistake isn't important.
           | 
           | Sure, it would be possible to then allow myself to get
           | distracted by focussing too much on some weird aspect of it,
           | but equally it would be possible to get so emotional in a
           | conversation with somebody sat next to you that you stop
           | paying attention to the road. And we (most people at least)
           | don't say "it's not safe to talk at all while driving", we
           | just make sure not to go over that line of getting too
           | distracted by the conversation.
        
             | dmitriid wrote:
             | > or something along those lines nearly always works fine
             | for me and is no more distracting than having a
             | conversation with somebody in the car with me.
             | 
             | Until you have several Freds in your contact list. Until
             | you have friends with foreign/uncommon names. As long as
             | you have near-perfect American pronunciation. As long as...
             | 
             | There are too many variables to consider and think of.
             | Sometimes I can't get Siri to reliably understand what
             | music I want (and my English is pretty darn good), much
             | less anything more advanced.
        
               | scarface74 wrote:
               | > As long as you have near-perfect American
               | pronunciation.
               | 
               | There isn't such a thing as an American accent. Ask
               | anyone who is not a native speaker, hasn't been in the
               | US that long, and tries to understand my natural deep
               | southern accent. I can adjust my accent if needed and if
               | I think about it.
        
               | simiones wrote:
               | There is an accent that is typically called "standard
               | American" or something along those lines, which is what
               | you'll hear on things like national news programs. I'm
               | not sure how many Americans actually speak with this
               | accent, but it's usually the one that all of these
               | devices target initially.
        
               | dmitriid wrote:
               | Yeah, I was thinking along the lines of "General
               | American"/"California English":
               | 
               | - General American
               | https://www.babbel.com/en/magazine/united-states-of-
               | accents-...
               | 
               | - California English
               | https://www.babbel.com/en/magazine/the-united-states-of-
               | acce...
        
               | JohnFen wrote:
               | The "Standard American" accent is the west coast (mostly
               | California) accent. It is far from universal.
        
               | whimsicalism wrote:
               | It is Midwestern and, while not universal, it is spoken by
               | the majority of people on the coasts and by people I have
               | met from the Midwest.
        
               | ghaff wrote:
               | It's essentially how a homogenized middle+ class of
               | people born/raised in/around metropolitan areas that
               | aren't in the South/Texas speak. (In general,
               | stereotypical accents associated with various cities are
               | mostly more of a working class thing, e.g. Southie in
               | Boston.) The South is the main outlier. Colleagues I work
               | with from and living in North Carolina generally have a
               | distinct southern accent albeit a mostly slight one. But,
               | yeah, historically we'd have called it Midwestern.
        
               | ghaff wrote:
               | I think historically we'd probably have said it was
               | Midwestern. But, yeah, however the network news anchors
               | speak.
        
             | simiones wrote:
             | > Hey Siri, text Fred <pause> I'm on my way but stuck in
             | traffic, eta 4 o'clock
             | 
             | That only works well if you have an accent it recognizes,
             | if your speech is clear (not slurred, not lisping etc),
             | if you don't stammer, if you don't have any verbal tics
             | that you don't want to show up in the message, and if
             | "Fred" is actually a simple unambiguous name.
             | 
             | Otherwise, at best when you want to send a message to
             | "Ioana" it may end up sending a message to "Anna" that says
             | "I'm, ummm, oh my way! and stalking traffic ate a what was
             | it like 4 like maybe 4 and you know what <pause>" (followed
             | by the "4 o'clock" that will no longer be included).
        
         | wintermutestwin wrote:
         | >Driving
         | 
         | While driving, I wanted to have Siri read a lengthy webpage to
         | me. I pulled up the page, got in the car and asked Siri to
         | "speak screen." Siri says it can't do that when I am driving!
         | What idiot thought that was a necessary safety measure? What if
         | I were the passenger?
         | 
         | Overall, I am stunned at how bad Siri is at things that don't
         | even require AI. It's almost as if this insanely profitable
         | company failed to invest a tiny bit of money into researching
         | ways that people would like to use Siri.
        
         | kevsim wrote:
         | Totally agree. "Hey Siri, start a timer for X minutes" and "Hey
         | Siri, play Y on Spotify" are the entire extent of my voice
         | assistant interactions.
        
           | aflag wrote:
           | And I'm always annoyed that it doesn't tell me how long I set
           | each timer to when I have multiple timers, just the remaining
           | time, which makes it hard for me to know which is which.
        
             | cmckn wrote:
             | It's not encouraging that even the most common use cases
             | for these systems still have rough edges.
             | 
             | The Google ones do support named timers, so you can say
             | "start a pasta timer" and later ask "what's left on my
             | pasta timer" etc. I thought Siri added this at some point,
             | but I wouldn't be surprised if not.
        
               | sbuk wrote:
               | It does. "Hey Siri, set a timer for 45 minutes called
               | pasta bake", "Hey Siri, how long left on pasta bake?"
               | works.
        
               | aflag wrote:
               | Actually, it seems that google is now able to cope with
               | things a bit better. I've just tried and I can now ask
               | for "how long left on my 45-minute timer?" and it does
               | finally answer. It actually also shows up in the UI. That
               | is sort of recent, though, as I've had that problem
               | earlier this year. It seems that siri is now unable to
               | set multiple timers, though.
               | 
               | Anyway, it does seem to have improved, but I wonder why
               | that stuff wasn't in from day one. It seems pretty
               | obvious to me.
        
               | sbuk wrote:
               | Nope, Siri can also do multiple timers. It asks for a
               | name if an unnamed timer of the same length already
               | exists.
        
               | Tagbert wrote:
               | Weirdly, the ability to set multiple timers varies by the
               | device you are using.
               | 
               | A HomePod can set multiple timers. A Watch, iPhone, or
               | iPad can only set one timer. There is no obvious
               | technical reason for it. It just seems like only the
               | HomePod team thought it was an important feature.
               | 
               | This becomes annoying if you have multiple devices set to
               | respond to "hey Siri" and the wrong one picks up the
               | request and then refuses to comply.
        
               | s1mon wrote:
               | This used to be true, but at least the Apple Watch lets
               | me do this all the time. I forget which OS update added
               | the capability.
        
               | aflag wrote:
               | Oddly enough, on iPadOS 15.7.1 at least, if I say "Hey
               | siri, set a timer for 20 minutes" it will say "20
               | minutes, starting now". Then if I say "Hey siri, set a
               | timer for 5 minutes" it says "there's already a 20 minute
               | timer. Replace it?"
               | 
               | If I say "set a timer for 20 minutes called A" it just
               | ignores the "called A" part.
        
               | amake wrote:
               | If I do "Hey Siri set a timer for Foo for 5 minutes" I get a
               | timer named Foo, and I can then set concurrent timers
               | named Bar, Baz, etc. without replacement.
        
               | ghaff wrote:
               | Yep, that seems to work which brings us back to the need
               | to memorize fairly precise incantations.
        
               | noelsusman wrote:
               | It even plays a little Italian jingle if you start a
               | pasta timer.
        
           | nonanonymo wrote:
           | I discovered recently that you can set a timer just by saying
           | "Hey Siri, four minutes," or however long you want, and she
           | will set a timer to that length. Not that I'm doing anything
           | with the extra second I've saved myself, but it feels good
           | anyway.
        
         | algesten wrote:
         | And then it typically interacts and fails without feedback.
         | I've tried so many times to tell Siri "Send a message to x that
         | I'm 10 mins away", only to realize much later that "message
         | delivery failed".
         | 
         | No clear feedback, and weird timing issues where it just stalls
         | and shows the message it's about to send in case it got it wrong.
         | 
         | It's just a terrible UX all around.
        
           | causi wrote:
           | I've had to stop using Google Assistant to send messages. It
           | used to ask the user to choose the correct word when it
           | misheard something. Now it just makes a wild-ass guess and
           | sends it on. It's caused me to send some very odd messages to
           | people and/or look like an idiot.
        
         | bryanrasmussen wrote:
         | >the only time a voice based UI is better is when the user is
         | unable to use their hands. Driving
         | 
         | my observation of people on the road has led me to conclude
         | that Driving is an activity where people think they can do
         | absolutely anything else while engaged in it.
        
           | bluGill wrote:
           | When you put it like that you will get a lot of upvotes.
           | However for almost all specifics you will get a ton of
           | downvotes when you name them, and probably someone saying no
           | that is okay. (these days if you say "you can't drink alcohol
           | while driving" you are safe, 30 years ago if there had been
           | an internet people would have said you are wrong)
        
         | thiht wrote:
         | > Driving
         | 
         | Siri never triggers when I'm driving, it just doesn't hear me.
         | I think it's because of the noise of the car or because of my
         | music, but it doesn't work. I have to move my face closer to my
         | phone so that it can hear me, but that's even more dangerous
         | than using the controls.
         | 
         | Same when I'm in the shower and I ask it to change the music,
         | it doesn't hear me, I have to shout and get angry every time.
        
           | balfirevic wrote:
           | Is your phone mounted close to the air vent? Siri hears me
           | perfectly in my car, even when the phone is in my pocket but
           | it doesn't hear me at all when I mount the phone on the air
           | vent.
        
             | thiht wrote:
             | Yes, it is. Maybe it doesn't hear me because of that.
             | 
             | For what it's worth, it doesn't work either when it's in my
             | pocket. When I come home and ask it to turn the lights on,
             | it doesn't answer if it's not in my hand.
        
               | Tagbert wrote:
               | You might try blowing the microphone holes with a can of
               | air. It may be clogged with pocket lint.
        
               | balfirevic wrote:
               | That might be a separate issue - you have to specifically
               | enable a setting which will make the phone listen for Hey
               | Siri when it's face down (or in the pocket).
               | 
               | See here: https://support.apple.com/en-
               | gb/guide/iphone/iphaff1d606/ios...
        
         | genocidicbunny wrote:
         | Even while driving, it's useless past basic commands.
         | 
         | The most egregious example of this for me is that there's a
         | grocery store near me that the Google assistant is incapable of
         | finding because of a few people in my contacts list. Whenever I
         | try to ask it for directions to that store, it picks (at pretty
         | much random) one of three of my contacts instead. This is
         | even though the only thing said contacts' names and the
         | grocery store's name have in common is that they all start
         | with the same letter.
         | 
         | Basically, imagine asking for directions to Albertsons, and the
         | assistant giving you directions to Andrew.
        
         | college_physics wrote:
         | > when the user is unable to use their hands
         | 
         | this is still potentially a huge domain. one _could_ imagine a
         | benign scenario where voice assistants enhance people's
         | abilities to interact with each other (and digital devices)
         | when a more potent UI is not within reach
         | 
         | privacy concerns (->controversial business models) and
         | technical ability to deliver a desirable service (that people
         | would pay for) might indeed prevent this vision from catching
         | on in the short term
         | 
         | another factor that may complicate adoption might be just
         | cultural / perceptions. It is a somewhat odd thing to be
         | shouting at devices - especially in the presence of other
         | people. User interfaces that interfere strongly with
         | communication habits and behaviors established over millennia
         | (see also wearing VR goggles) might have a harder time seeing
         | adoption outside very specific scenarios
        
         | fxtentacle wrote:
         | Fully agree on the privacy distrust.
         | 
         | BTW, another use case for speech recognition is when you're
         | carrying a baby around.
        
           | newaccount74 wrote:
           | I've gotten pretty good at doing chores one handed...
        
           | maweki wrote:
           | I'd say my baby preferred me playing around with my phone
           | to speaking up. Doubly so when they were carried around
           | sleeping.
        
       | kwash wrote:
       | I worked on a project where some funny guy was trying to sell
       | a voice assistant for a company selling LPG to housewives in
       | Brazil, and of course nobody ever used this.
        
       | bhouston wrote:
       | I love Google Assistant but it doesn't recognize my wife's or
       | daughter's voices well. Thus they find it annoying and do not use it.
       | 
       | I use it primarily to turn on and off the lights (multiple times
       | per day), play music, turn off the TV (but not to play things on
       | TV, too unreliable) and to raise/lower the temperature on the
       | Nest.
       | 
       | We have tried to use the Nest Cams with the Hubs as a baby
       | monitor but the camera feeds freeze and don't tell you so it is
       | actually dangerous.
        
       | Semaphor wrote:
       | I think Amazon is one of the worst platforms possible for voice
       | shopping. For something that can barely understand what you want,
       | you want a small store, that doesn't have multiple products or
       | sellers per type. I could imagine using voice ordering at my low-
       | carb store, when I ask for "white almond flour", there will be
       | exactly one product they can deliver. At Amazon? I might as well
       | order from AliExpress, it's a marketplace, you need human-level
       | intelligence to order something, not a robot that can't even
       | figure out what band to play if there are 2 similar-ish sounding
       | ones.
        
       | inferense wrote:
       | The reason for voice assistants not taking off is quite simple
       | imo. You don't know what its capabilities are beyond the well
       | known use cases ("set up a timer for 15min") but are well aware
       | that it's not going to understand everything. Its limitations
       | are obscure and you've been misunderstood in the past, so why
       | waste more time? With a human, who understands everything you
       | say, you can use voice with full confidence.
        
         | iandanforth wrote:
         | The only thing missing from the experience is occasionally
         | being eaten by a grue.
        
       | Andrew_nenakhov wrote:
       | Not surprising.
       | 
       | What we were promised: a personal well-trained butler.
       | 
       | What we got: a voice input field that you _know_ will be
       | incorrectly interpreting your intents far too often.
        
       | psychomugs wrote:
       | I've been dabbling with Talon to reduce the burden on my precious
       | arm tendons. It's the most useful voice interface I've used due
       | to its customizability, but with that great customizability comes
       | a great learning curve that it seems only techies would bother
       | with.
        
       | billbrown wrote:
       | I liked Gary Marcus' post about this -
       | https://garymarcus.substack.com/p/how-come-smart-assistants-...
        
       | gw98 wrote:
       | It's useful for trivial unambiguous tasks where you have your
       | hands full or don't want to touch your device or it's dangerous
       | to. That's all I can muster mine for.
       | 
       | "Hey Siri, add more toilet paper to the shopping list" (while
       | pooping)
       | 
       | "Hey Siri, shuffle my music" (while driving)
       | 
       | "Hey Siri, countdown 10 minutes" (while shoving a pizza in the
       | oven)
       | 
       | Anything else is a shit show. Anything where trust or accuracy is
       | involved i.e. mutating data, spending money, absolutely no way
       | can I trust it at all and never will.
        
         | Shank wrote:
         | > "Hey Siri, add more toilet paper to the shopping list" (while
         | pooping)
         | 
         | This is the main reason why I have an Echo in my bathroom! The
         | one advantage Alexa has over everything else is that you can
         | voice shop -- "alexa buy more toilet paper" solves the problem
         | _that much faster_ than a reminder for later.
        
           | gw98 wrote:
           | I don't want that to happen because the price variation in
           | toilet paper is huge based on deals and offers available, and
           | Amazon is rarely the cheapest provider these days, so it's
           | actually worth me spending a few minutes on it to save some
           | money.
           | 
           | The reason Alexa exists is to sell you Amazon's prices, not
           | necessarily a good deal.
        
             | aflag wrote:
             | Also, I think I'd rather just add stuff to my shopping list
             | so that I can at a later time order everything together,
             | rather than have multiple deliveries.
        
             | ghaff wrote:
             | The sort of consumables I might order on Amazon on a
             | regular basis--like those that the Amazon Dash buttons were
             | intended to address--can vary a fair bit in price and
             | quantity. I'm not going to have Amazon just ship whatever.
             | 
             | And it's not even a very frequent thing. Mostly, every few
             | months, I look through what consumables need replenishing
             | and I fill up the car with plus-size packages from Walmart.
        
         | ciaron wrote:
         | Agreed, but I find for even these simple tasks it's hit-and-
         | miss for accuracy. My Google device will randomly not know what
         | a "shopping list" is, or the interactions go something like
         | this:
         | 
         | "Hey Google, put dishwasher salt on the shopping list" "OK, I
         | added 'put dishwasher salt'" (strangely, this particular bug
         | only manifests for dishwasher salt).
         | 
         | Timers are useful, but sometimes they can't be shut off by
         | voice command.
        
           | gw98 wrote:
           | Yeah it doesn't always work well. I say _"hey siri add green
           | milk to the shopping list"_. I want _"green milk"_ added to
           | the shopping list which in the UK is semi-skimmed milk. What
           | does it do? Adds _"green"_ and _"milk"_ because it thinks
           | I'm a weed smoker...
        
         | sambeau wrote:
         | Lights on lights off is also useful, especially when in bed or
         | carrying a basket-full of washing.
        
         | jayelbe wrote:
         | Mmhmm, I never handle my phone while pooping, no siree.
        
         | sanitycheck wrote:
         | Trust and accuracy are involved in the first and last of your
         | examples - I'd end up having to check that the TP was actually
         | added to the list, and that the timer had actually begun and
         | was set to 10 mins.
         | 
         | Shuffling music, turning lights on, yes fine - because
         | confirmation that the right thing has happened is instant and
         | effortless. Anything else, I'll use a button or a screen.
        
           | muspimerol wrote:
           | Not really - adding toilet paper to a shopping list is not
           | clicking the "buy" button. And if you set up a timer you get
           | quick confirmation that it has been set. If the timer is
           | accidentally set for 100 mins it's easily corrected.
        
             | sanitycheck wrote:
             | I just asked Alexa to set a timer for 2 mins, and you're
             | right - she did then ponderously state that a timer for 2
             | mins was starting. Then she asked me if I'd like to hear
             | tips about using timers? No. Then she told me I had two
             | notifications, would I like to hear them? No.
             | 
             | Then I timed myself setting a timer on my phone, which took
             | 9 seconds from pocket to running.
             | 
             | Adding to a shopping list isn't clicking the "buy" button,
             | no - but if it's not on the list I won't buy it and then I
             | will have no toilet paper. I would not need a list if I
             | could simply remember everything.
        
               | mikestew wrote:
               | _Then she asked me if I'd like to hear tips about using
               | timers? No. Then she told me I had two notifications,
               | would I like to hear them? No._
               | 
               | Are you saying this for comedic effect, or does the Alexa
               | really do this? (I'd look it up myself, but good luck
               | with _that_ query...) To each their own, but I'd throw
               | the device into the street if it pulled a stunt like
               | that.
               | 
               | _Then I timed myself setting a timer on my phone, which
               | took 9 seconds from pocket to running._
               | 
               | To the Homepod or my Apple Watch: "hey, siri, tea timer
               | for three minutes".
               | 
               | "Three minute tea timer, starting now."
               | 
               | I didn't think a product could screw that up. I would
               | suppose it's a design decision between "assistant" and
               | "servant that carries out my command without backtalk".
               | There are times that I wish the Apple product were more
               | "assistant" than "servant", but the Alexa product just
               | sounds pushy.
        
               | Crosseye_Jack wrote:
               | I use Alexa for shopping lists, I get a "toilet paper
               | added to your shopping list" confirmation after adding
               | items to my list.
               | 
               | It's not perfect though, for example when trying to add
               | fruit and fibre cereal it will often add two items,
               | "fruit" and "fibre". But it's close enough that when I get
               | to the store and check the list I know what I intended to
               | add to the list.
        
             | ht_th wrote:
             | I think the parent meant that you need to check if these
             | commands are executed properly, otherwise you get into
             | trouble later. For example, if the toilet paper isn't added
             | to the shopping list, and you go shopping with this list
             | the next day trusting it contains everything you need,
             | you're not buying the toilet paper. Similarly, if the timer
             | is accidentally set to 100, you only notice it after, say,
             | 20 minutes when there's black smoke coming out of the oven.
        
           | [deleted]
        
           | gw98 wrote:
           | Definitely agree with this. You get that confirmation with
           | siri. I mostly use my watch for it and it will show me what
           | it did on the screen without having to touch anything.
           | 
           | Confirmation is required when dealing with humans as well ...
           | https://www.youtube.com/watch?v=11fCIGcCa9c (this reminds me
           | of Alexa)
        
           | eternityforest wrote:
           | Google is pretty good about that. It will say "Ok, your alarm
           | is set for 7 hours and 40 minutes from now" and similar.
        
       | martythemaniak wrote:
       | It's really shocking how bad all voice assistants are considering
       | how amazing LLMs are. There must be major effort underway in all
       | companies to upgrade all of them to LLMs in the backend.
       | 
       | OpenAI charges 2 cents per about 750 words (for their best
       | model), or roughly 1 cent per minute of talking. Maybe they can
       | add LLMs as a premium feature, $3 a month for an actually smart
       | home assistant seems like a deal.
        
       | jaysinn_420 wrote:
       | i have seen a lot of people with physical challenges and
       | disabilities discussing how critical these devices have become.
       | they may not solve mainstream problems, but they have contributed
       | materially to the quality of life for a marginalized community,
       | and i hate the idea that they will disappear because big tech
       | hasn't found a way to monetize them. i hope that the remnants of
       | these projects can be open-sourced or sold to companies that will
       | run with them and build and maintain products to support those
       | who may have mobility or physical manipulation challenges, that
       | are benefiting from voice control.
        
       | woeirua wrote:
       | Can we lump chatbots in with this too? Chatbots have totally
       | failed to live up to their hype. It seems like we spent billions
       | of dollars chasing chatbots and voice assistants only to realize
       | that you need strong AI to have something truly useful.
        
       | exabrial wrote:
       | Voice assistants are useless tech that has been shoved down the
       | users' throats to attempt to gain adoption. Google attempted to
       | hold me hostage to this with android auto. Previously: voice
       | recognition worked just fine. At some point they decided to
       | integrate their stupid voice assistant, which probably was the
       | same back end anyway... but now it required all sorts of
       | permissions (like web and search history) just to do basic things
       | like send a text message.
       | 
       | Lost yet another user to iOS.
        
       | lvl102 wrote:
       | I think it has to do with privacy and the limitations on training
       | data. Or am I being naive?
        
       | scotty79 wrote:
       | No wonder it's not doing great. There's zero progress in this
       | sphere. Google Assistant, if anything, feels worse today than it
       | did many years ago.
        
       | booleandilemma wrote:
       | The technology just isn't there yet. A lot of progress has been
       | made, but I still need to repeat myself too often. It's more
       | frustrating than using an app. We'll nail it in another 20 years
       | though, and the voices will be indistinguishable from human ones.
        
       | DevX101 wrote:
       | A lot of the more useful information retrieval tasks involve a
       | feedback loop. If I'm shopping for a product, I may enter a
       | generic term. Then the system sends me images of products
       | matching that term. Then I tell the system which product image is
       | closest to what I want. Then the system sends me reviews of that
       | product. Then I read the reviews and realize this is not what I
       | actually want...repeat until I'm satisfied.
       | 
       | I can't do this loop with modern voice assistants.
       | 
       | A voice assistant with contextual conversation skills, and access
       | to an "always on" visual monitor (home projector or AR glasses)
       | would definitely increase utility by 10x or more.
        
       | CrypticShift wrote:
       | Conversational Computing is just not ready. Even a better AI,
       | alone, will not solve it 100%.
       | 
       | But because of the 2010s meteoric rise of FAANG (in the public
       | imagination and stock market) they think they can quickly push
       | into mainstream all these immature paradigms like IVA, AR and now
       | VR.
       | 
       | All these technologies have existed for decades, but they are
       | not ready! Billions of dollars of investment and marketing are
       | not enough to make them so.
       | 
       | Even a leap as small as a tactile portable device took us almost
       | 20 years to reach a conclusive mainstream form (the iPhone).
       | 
       | We need to accept that IVA/AR/VR are exponentially larger leaps
       | and should remain side-shows for a very long time to come.
       | 
       | For example, Microsoft is finally acknowledging this, with
       | HoloLens being now just "to help you solve real business
       | problems".
        
       | anon20221123 wrote:
       | Voice assistants are the most useful new UI to me since
       | smartphones.
       | 
       | I use them to get travel time estimates, reference facts, music
       | and podcasts, rewind / forward / pause / resume / next /
       | previous, timers, lights, and occasionally broadcast messages on
       | home speaker devices.
       | 
       | I use voice UI while cooking, running, mowing, raking, driving,
       | changing diapers, putting on my kids' clothes, and while sitting
       | with my wife.
       | 
       | My two young children (5 and almost 2) love voice UI. We ask
       | about the animal of the day, what does this animal sound like,
       | play that song. My older child is beginning to set timers with
       | it. My wife recently said having a nearby voice assistant is
       | important in a home configuration discussion.
       | 
       | My family and I sometimes have frustrating experiences with voice
       | UI, like failed hotwords, shifting syntax, answering on an
       | unexpected device, and slow responses. But we still use it
       | frequently, and our overall sentiment with voice UI is positive.
        
         | bambax wrote:
         | It's quite funny that none of the things you mention involve
         | shopping, or any activity that would bring revenue to the voice
         | assistant provider.
        
           | anon20221123 wrote:
           | Voice UI could be a decisive feature for profitable
           | equipment, or generate advertising revenue.
           | 
           | Because it lacks VUI, my family recently didn't buy a new
           | ~$400 product. We bought an older version that was marked
           | down and otherwise inferior, but has a built-in voice
           | assistant.
        
       | ageitgey wrote:
       | They still just don't work very well unless you memorize very
       | specific exact commands.
       | 
       | The other day, we had to remember to book a school thing for the
       | kid. I said "Hey Google, set a reminder for 9pm to [book the
       | thing]".
       | 
       | Google replied "Here are web search results for set a
       | reminder...."
       | 
       | When they fail constantly at the most basic tasks, usage is going
       | to drop way off.
        
         | hengheng wrote:
         | They also refuse to ever show a reference so that you can learn
         | what you can say, or teach you any other way. Refusing to show
         | and teach your users feels arrogant to me. They had to promise
         | that you can "just speak naturally" and now they can't roll it
         | back.
         | 
         | Meanwhile my success rate for the way I speak is below 20%.
        
           | willhinsa wrote:
           | they do chime in and tell you all sorts of things "by the
           | way, i can also play music while you're getting ready for
           | bed" while i'm trying to concentrate and live my life lol.
           | 
           | just compile a list of all commands into an email and send
           | that to me. i don't actually want to hear you talk, siri
        
       | Shank wrote:
       | Voice assistants are basically just mainstream non-visual
       | command-lines, and it's unsurprising to me that something that
       | relies heavily on memorization and extremely specialized "skills"
       | isn't quite taking off in the way it was imagined. A voice system
       | that can do literally everything one can do with a keyboard and a
       | mouse would be magical, but no system offers that.
       | 
       | Instead, it's a guessing game about syntax and semantics, and
       | frequently a source of frustration. There are many failure
       | points: it can "hear" you wrong, it can miss the wake word, it
       | can hear correctly but interpret wrong, miss context clues, or
       | simply be unable to process whatever the request is. In my
       | experience, most normal people relegate voice commands to
       | ultra-specific tasks, like timers, weather, and music, and that's
       | that. Google and Alexa are relatively good at "trivia" questions,
       | but Siri is a complete failure. All systems have edge cases that
       | make them brittle.
       | 
       | I think there's potential here. Cortana was the most promising:
       | an assistant that's integrated into the OS and can change any
       | setting or perform anything on-screen would, again, be really
       | awesome. We just don't have that. I think maybe OS-wide + GPT 4
       | (or later) might get closer to what we expect, but it's just not
       | great right now. I really want to be able to say something as
       | unstructured as "hey siri, create alarms every 5 minutes starting
       | at 6am tomorrow" or "hey siri, when I get home every day, turn on
       | all of the lights, change my focus to personal, and turn on the
       | news". There /is/ power to-be-had, but nobody has really tapped
       | it.
        
         | phkahler wrote:
         | >> A voice system that can do literally everything one can do
         | with a keyboard and a mouse would be magical, but no system
         | offers that.
         | 
         | And even then, a voice assistant is essentially a user
         | interface, not a product or service.
         | 
         | It could be a service if you could reliably say "Alexa, plan my
         | trip to customer X the week of the 30th and send me my
         | itinerary". But for now they are an alternative to a phone UI.
        
           | ghaff wrote:
           | The reality is that even a human personal assistant can
           | rapidly devolve to being more of a hindrance than a help if
           | they're not very good once you get beyond simple mechanical
           | tasks. Even with all the knowledge about the world that most
           | adults carry around in their heads. Yes, a poor human
           | assistant can fall down in other ways such as forgetting to
           | do something--but they have a _lot_ of context.
           | 
           | This seems a really high bar for voice assistants aspiring to
           | do much more than set alarms or turn the odd light etc. on or
           | off.
        
             | bluGill wrote:
             | These days few people have personal secretaries, but back
             | when they were common they really were personal - once you
             | got a personal secretary she (nearly always she, I feel
             | like we should acknowledge sexism even though it is
             | irrelevant to my point) would follow you (nearly always
             | male) as you moved job to job and up the ladder. She went
             | with you because once you spent a few years training her in
             | how you worked, a new secretary would greatly limit your
             | effectiveness.
             | 
             | These days a computer can do a large part of what people
             | relied on secretaries for, and faster, so only at the
             | highest levels do you see them. There are still secretaries
             | at the low levels, but not nearly as many, and they are not
             | doing the same tasks.
        
               | ghaff wrote:
               | That's pretty much it. We call them executive admins
               | these days where they exist.
               | 
               | And, yeah, assistants shared with a bunch of other people
               | --as with travel agents in general--aren't really all
               | that useful. If I'm mostly just giving fairly mechanical
               | instructions to execute, it's probably easier for me to
               | go online and figure out the options myself.
               | 
               | A secretary made a lot more sense when you dictated memos
               | for inter-office mail and retrieving information often
               | involved making multiple phone calls.
        
             | phkahler wrote:
             | >> This seems a really high bar for voice assistants
             | aspiring to do much more than set alarms or turn the odd
             | light etc. on or off.
             | 
             | That's kind of my point. A voice assistant is just a fancy
             | UI until they reach the level of AGI, and I don't see the
             | point in spending billions of dollars on them to be a
             | simple UI as Amazon seems to be doing.
        
             | enobrev wrote:
             | If that voice assistant were self hosted in the little
             | device, I agree. But those simple interfaces are connected
             | directly to a significantly larger machine that literally
             | knows everything about you and half of everyone you know.
             | It's not unheard of to expect it to be more useful than
             | setting timers and playing music.
        
               | ghaff wrote:
               | They "know" a bunch of discrete facts. They don't know
               | that if you book me on a red-eye unnecessarily to save
               | $100 I'll be hunting you down. Or any of a zillion other
               | flexible preferences--some of which I'm not even very
               | consistent about.
        
               | enobrev wrote:
               | I don't know about you personally, but google definitely
               | knows I've never booked a red-eye and that I haven't
               | booked a layover since the early aughts. I'm fairly sure
               | Google could easily figure out not only where I'd be
               | interested in flying to in the next few months, but when
               | and for how long, and at what price points I'd consider
               | upgrading my flight.
               | 
               | I know they know this about me not only because of my
               | Gmail account but also because I use Google flights to
               | find the flights before I book them.
               | 
               | Unfortunately they're not using this data to help me.
               | Rather they're using it to target advertising to me. But
               | they definitely have the data and the machinery to be
               | more useful to me with more than just a few facts
        
               | ghaff wrote:
               | Maybe my travel is more complicated but I even not
               | infrequently get annoyed with "past me" for various
               | travel-related decisions. I avoid red-eyes but at some
               | price point I won't--or maybe only if it's someone else's
               | money. And maybe I don't have a choice based on my
               | schedule or just what flights are available. Normally I
               | won't do an unnecessary layover but maybe I will to fly
               | my preferred airline.
               | 
               | It gets complicated in a hurry and for the cases where it
               | is relatively simple (and when it gets into very complex
               | international travel a voice interface is going to be
               | completely useless), I can look up my options pretty
               | quickly on a computer.
        
         | 7952 wrote:
         | I think the biggest potential is with Microsoft Teams in
         | business. It is ubiquitous in people's work lives, has
         | access to data and is integrated with everything. And adding
         | Cortana to calls would be an easy step for people to understand
         | and learn. People would say "Cortana, share my screen". People
         | would learn phrases from each other.
        
           | happymellon wrote:
           | But Teams hasn't figured out how to send text in a coherent
           | way.
           | 
           | It's used because companies can cheap out on buying a license
           | for other communication applications; it is fundamentally
           | worse than anything else on every other metric. If voice lets
           | me respond to a message without hunting for the hidden reply
           | because Teams shoves it below the bottom of the screen, then
           | it could be a win. Considering how poor the UX is for Teams,
           | I doubt it will.
        
         | PurpleRamen wrote:
         | The potential would be there, if they would focus on the
         | assistant-part, and take the voice just as a means to interact
         | with the assistant, besides other means like clicking, typing,
         | showing complex information on a screen, etc.
         | 
         | Voice alone sucks, it's just too limited to be useful on a
         | grand scale. Similarly, command lines suck too. The shell in
         | general has the same problems that Voice assistants have, just
         | that they have more value and had decades to mature into
         | something actually useful. And today we have unix-shells which
         | reduce the problematic parts by many levels, and still receive
         | constant improvements. This is missing for voice assistants,
         | because unix-shells are growing and improving in an open space,
         | where everyone can add their own things. This is not happening
         | in big tech.
        
         | bogdanstanciu wrote:
         | I think these assistants just need to give the user a way to
         | edit interpretations.
         | 
         | A 'debug' area that lets you ask a command, see what was
         | interpreted - and immediately edit or click "that's not what I
         | wanted". But not an afterthought and not a cumbersome process
         | like setting up an automation that is triggered by specific
         | commands.
         | 
         | Imagine telling your voice assistant "You're wrong, as usual"
         | and instead of it giving you the boilerplate "I'm sorry", it
         | actually offered a way to improve itself.
        
         | spookthesunset wrote:
         | To me the hardest problem is simply remembering what every
         | light on my network is named. Did I call the light next to my
         | desk "desk light" or did I call it "office light"? If I don't
         | get the name exactly right, I cannot control the light.
         | Multiply that by every other light in the house and it becomes
         | a lot to remember. I have probably 15 lights controlled by
         | Alexa and I can only remember the name of like three of them.
         | Thus most of the time it is just "Alexa turn on the lights" so
         | it can turn everything on in a room.
         | 
         | If these voice assistants were smarter about "alternative"
         | names for every device it might be easier to use. But as it
         | stands, it's kind of a pain because the way you phrase each
         | request is so unforgiving...
         | 
         | Oh yeah, and god help you if your device name is similar to
         | your room name. If your room is "office" (or did I name it "the
         | office"?) and your light is "office light" Alexa is gonna have
         | a bad time telling the two apart.
         | 
         | I have no clue how to fix this...
         | 
         | PS: this is why I question steering-wheel-free self-driving
         | cars. How will we tell these things exactly where to go when we
         | cannot even reliably tell our voice assistants exactly what
         | light to turn on?
        
         | bambax wrote:
         | > _I think there's potential here._
         | 
         | But how? Even if those interfaces were actually working, it's
         | still extremely inconvenient to talk when you can click. You
         | have to be somewhere where talking out loud doesn't disturb the
         | people around you. That excludes most situations: open space
         | offices, restaurants, coffee shops, public transport, cars with
         | passengers, and most places in the home except maybe the
         | bathroom.
         | 
         | And even if you're all alone in a silent place, giving
         | instructions out loud takes more time than configuring a
         | screen, and will always be error prone, because the feedback
         | will always be ambiguous and imprecise.
         | 
         | Except maybe if the feedback is on a screen, but then if
         | there's already a screen, why not use it.
        
           | stephc_int13 wrote:
           | If the assistant AI was advanced enough for pleasant
           | conversations to occur, it would be useful.
           | 
           | It would be trivial to use the interface on screen when
           | appropriate, and a truly smart assistant should be able to
           | follow the context and be aware of your preferences and mood.
           | 
           | This is not fundamentally impossible, we're simply not there
           | yet.
        
           | pmontra wrote:
           | > But how? Even if those interfaces were actually working,
           | it's still extremely inconvenient to talk when you can click
           | 
           | Working from home changes that. I can see many more
           | opportunities for a multimodal input interface. Examples:
           | 
           | 1. My fingertips now are closer to the "reply" button below
           | this text area than they are even to the touchpad. Touching
           | "reply" is half a second, moving one hand to the touchpad,
           | aiming the pointer at the button and clicking takes longer.
           | With a mouse: much longer. Anyway, my screen is not a
           | touchscreen. I'll click.
           | 
           | 2. Or, with an assistant, I could have said "Click reply",
           | provided that the assistant knows where the focus is and that
           | it can read the form I'm typing in.
        
             | jorams wrote:
             | Your fingertips while typing are even closer to the Tab and
             | Enter keys on your keyboard, which, if pressed in sequence,
             | have the exact same effect. Much simpler and much faster
             | than either of your options.
        
               | pmontra wrote:
               | Faster, don't know. Simpler, I didn't even think about
               | it. However I'm doing it now. Thanks.
        
           | papito wrote:
           | Well, you are not trying to operate heavy machinery with
           | Amazon Echo - hopefully. Voice as a common interface - I
           | agree with all of that, but to me the everyday utility of
           | being able to add something to my shopping list or my TODO
           | list without having to fire up an APP greatly increases my
           | quality of life. That part is magical, but I don't expect a
           | lot more from it.
        
             | ghaff wrote:
             | I used to use Alexa for my shopping list. I guess over time
             | I came to the conclusion that adding something to a steno
             | pad or my whiteboard was even easier.
        
           | t-sauer wrote:
           | I think the best use cases for voice assistants are when you
           | don't have free hands. I have two scenarios where I use voice
           | assistants: setting a timer while cooking and changing the
           | music while showering. Both could be done by other means as
           | well but they wouldn't be more convenient.
        
             | jfoster wrote:
             | Asking the time whilst getting ready.
        
               | rightbyte wrote:
               | Seems like a perfect fit for a clock?
        
               | [deleted]
        
               | bambax wrote:
               | Or a watch?
        
               | HPsquared wrote:
               | Apple watch does have Siri, I suppose. They could be
               | really bold and remove the screen.
        
               | rightbyte wrote:
               | Both or either would suffice.
        
             | teambob wrote:
             | Also when driving but Siri / Google assistant are more
             | applicable for that use case
        
             | palebluedot wrote:
             | Exactly. For instance, in the mornings Google Assistant has
             | been really useful for when I say "OK Google, Good
             | Morning". It then runs through and tells me:
             | 
             | * Current time, and weather forecast for the day
             | 
             | * Upcoming meetings today
             | 
             | * My current commute time to work, including traffic
             | 
             | * NPR news podcast
             | 
             | So during my routine of letting the dogs out, starting the
             | coffee, etc. in the morning, I get the daily "essential"
             | info.
        
           | ClumsyPilot wrote:
           | > But how? Even if those interfaces were actually working,
           | it's still extremely inconvenient to talk when you can click.
           | 
           | Smart home light/etc while hands are occupied like with a
           | baby. But usecases are quite limited
        
           | Shank wrote:
           | > But how? Even if those interfaces were actually working,
           | it's still extremely inconvenient to talk when you can click.
           | You have to be somewhere where talking out loud doesn't
           | disturb the people around you. That excludes most situations:
           | open space offices, restaurants, coffee shops, public
           | transport, cars with passengers, and most places in the home
           | except maybe the bathroom.
           | 
           | I would separate out the two, actually. There's a "natural
           | language control system for the entire OS" and then there's
           | the actual voice part. Voice is often mostly useful for
           | accessibility purposes -- hands full, running, driving, etc.
           | However, the other side is that a text-based NL assistant
           | would also be profoundly useful. On iOS, you can enable
           | "Type-to-siri" and you can just type sentences and Siri will
           | respond back in text.
           | 
           | If we make progress on NL-driven command-lines, we can
           | actually make progress on voice-assistants, and vice versa.
           | The catch is that the voice side still needs recognition
           | work.
        
         | matthewmacleod wrote:
         | _"hey siri, when I get home every day, turn on all of the
         | lights, change my focus to personal, and turn on the news"_
         | 
         | I think the problem with that is that even I, as a human,
         | struggle to know for sure what you want.
         | 
         | You want to turn all the lights on in the house? Does that
         | include the lamps in the bedroom? How about new lights that you
         | add later? Or the ones in the garden? It's full of ambiguity.
         | What device do you want to watch the news on? Or did you mean
         | the radio? Do you want this to apply when you get back at 2am
         | one night, meaning your family gets woken up when you turn on
         | all the lights and start playing the news in their bedrooms?
         | 
         | I think that's probably why voice interfaces aren't likely to
         | work well for anything beyond direct, specific, well-scoped
         | requests: turn on the lights in the bedroom; turn off the
         | heating at home; roll up the blinds; what's the weather like
         | today; what's the remaining range on my car. They really
         | struggle to deal with anything more complex - not so bad in
         | theory, but really incredibly irritating when they make the
         | wrong decision.
         | 
         | If you had some kind of 24-hour live-in assistant (a butler,
         | maybe?), then they probably have the knowledge and intuition to
         | make sensible decisions in response to fairly unstructured
         | requests. But I think we're miles off getting a voice assistant
         | to do it - not because they can't, necessarily, but because if
         | they mess it up at all it's infuriating.
        
           | bombcar wrote:
           | You can do _some_ of this with shortcuts, and then use Siri
           | to trigger the shortcut. But that involves thinking; the
           | magic of Jeeves is that he knows what you want even before
           | _you_ do.
        
             | bluGill wrote:
             | The problem is there are more different combinations I
             | might want as a shortcut than I have time to
             | program/remember. I can remember something like a dozen
             | commonly used shortcuts. However when 5 years from now I
             | arrive home at 2am (for the first time in several decades,
             | but it will probably happen at some point again in my life)
             | will I remember the correct shortcut - and assuming I do,
             | is it up to date with whatever changes have been made to my
             | house?
             | 
             | What about the shortcut for when I need to leave at 3am for
             | some reason? Then a different shortcut for when it isn't
             | just me, but my whole family leaving at 3am. And still
             | another for my son having to leave that early.
             | 
             | Jeeves can figure it out when I arrive at 2am so I don't
             | need to program it.
        
               | matthewmacleod wrote:
               | You've reminded me of some aspects of these platforms
               | that I like in a more general sense - like for example
               | the way the Apple Watch will automatically ring the alarm
               | on my phone if I forget to put my watch on, or if I get
               | up before my alarm goes off the watch will notice and ask
               | if I want to skip the alarm for the day. This stuff
               | genuinely feels almost like magic sometimes - the risk is
               | that when anything like this goes wrong it's awful.
        
               | bombcar wrote:
               | Yeah these are graceful - and the watch will start out
               | very light buzzing and then get louder.
        
         | antupis wrote:
         | If I were in this space I would just build a voice assistant
         | for very specific situations where you cannot type, like
         | driving, cooking, doing some sport etc. There is lots of
         | potential but big players are kinda trying to build a generic
         | tool for every situation, which is a super hard problem.
        
           | dmitriid wrote:
           | You want utility. The big players want a product that can be
           | monetized and milked for revenue.
        
             | Mistletoe wrote:
             | My Alexa asked me today if I wanted an Avatar theme. No I
             | really do not, Alexa. I was reminded of the article a few
             | days ago how they can't monetize this well and are somehow
             | losing $10 billion. :)
        
         | amelius wrote:
         | > There /is/ power to-be-had
         | 
         | This is not power. This is just first-world problems.
        
         | eternityforest wrote:
         | Timers and reminders alone are enough to make them a pretty
         | nice thing to have though.
         | 
         | I don't really want them to be all that much more powerful,
         | because natural language can be imprecise, and... there's just
         | not much that I want to automate in a home setting beyond
         | some real simple timers for lights and stuff.
         | 
         | What if I had a bad day and didn't want to see depressing news?
         | Or what if I came home and was talking on the phone when it
         | turned the news on?
         | 
         | True automation as opposed to just telemetry and remote control
         | can easily be annoying more than helpful.
         | 
         | I like the idea of automation... but I don't actually...
         | automate anything aside from timers and reminders.
        
           | ghaff wrote:
           | I think that's generally true though playing music is a
           | little more freeform. (And, guess what? Voice assistants tend
           | to be worse at that.)
           | 
           | The problem is that many many billions of dollars have been
           | sunk into making these devices about more than setting alarms
           | and timers. There's actually been a lot of pretty amazing
           | progress. But it's yet another one of those things where
           | getting to 90% isn't good enough for anyone but techies who
           | want to fiddle with their smarthome stuff or otherwise play
           | with the technology.
        
             | eternityforest wrote:
             | They might have a sudden increase in usefulness when
             | smarthome stuff is more common, although smart bulbs are a
             | bit of a hassle in most switched outlets, because the
             | switch is usually more convenient.
             | 
             | Maybe they'll add an app that lets you browse possible
             | commands so it's more discoverable.
        
               | ghaff wrote:
               | It's probably true that a well-integrated smarthome would
               | benefit from voice control.
               | 
               | But I'd observe that I'm going up to my brother's
               | tomorrow and he has all manner of timers and other WiFi-
               | connected stuff and none of it has any sort of
               | centralized control and that's pretty normal even for
               | people who have a lot of that sort of thing.
               | 
               | And, yeah, the only smart light thing I have at home is
               | one thing that doesn't have a controlling light switch
               | and I used X10 for it for years before I got an Alexa.
        
         | 4b11b4 wrote:
         | Talon voice can do everything a keyboard and mouse offers, plus
         | more (contextual awareness, higher level abstraction). Very
         | powerful in combination with modal editing. I'm not affiliated,
         | just a user.
         | 
         | Granted, this is for a specific user base and yes, not in
         | coffee shops.
        
         | Sakos wrote:
         | The big issue is that there's no clearly defined interface for
         | users. What commands are possible? Nobody knows. So people
         | default to the most obvious things like setting a timer. Is it
         | possible to set up your own commands and build your own
         | workflows? AFAIK, no. So the tech is essentially dead in the water
         | until companies fundamentally rethink what they're trying to do
         | with voice assistants.
        
           | jasmer wrote:
           | Yup. At the risk of being glib I would say this is 90% of the
           | issue. Or more like 'the big blocking issue' at the moment.
           | 
           | Voice can do way more than we know, but we have no idea what
           | it does or how to use it.
           | 
           | Standardizing the interface and providing tutorials would
           | possibly change things dramatically.
           | 
           | And this goes for the back-end protocols as well.
           | 
           | The tech is way, way ahead of the UI and integration.
           | 
           | Imagine getting the power of 'git' with no tutorial and not
           | really an understanding of what it does? Good luck with that.
           | 
           | 90% of us would be using it in the car to do a lot of things
           | if we really knew how to do it:
           | 
           | You: "Siri: Command. Open. Mail. Prompt. Recipients starting
           | with S"
           | 
           | Siri: "Sarah, Sue, Sundar"
           | 
           | You: "Stop. Command. Message. To: Sundar. Thanks for the note.
           | Stop. Send without Review"
           | 
           | Some of this already exists, but it's product-specific, etc.
           | There needs to be some kind of natural universal interface -
           | or we have to wait until the AI is really, really that good.
        
         | sliken wrote:
         | I tried Amazon's Alexa, the top end model with a display. Often
         | it would taunt you about new/interesting things on the screen,
         | but I could never get them to work. I had to memorize things
         | to get even the basics working. Ended up unplugging it.
         | 
         | However Google's Assistant in comparison worked great, no
         | memorization, and very useful. Sure time, weather, set timers,
         | and alarms worked great with a very flexible set of natural
         | language queries. Even more complex things like what will be
         | the temperature tomorrow at 10pm, simple calculations and unit
         | conversions. But also things like IMDB-like queries about
         | directors, actors, which movies someone was in, etc generally
         | worked well. It seemed to really understand things, not just "A
         | web search returned ...". Even more complex things like the
         | wheelbase of a 2004 WRX would return an answer, not a search
         | result.
         | 
         | With all that said I'm looking for a non-cloud/on site
         | solution, even if it requires more work, most recently noticed
         | https://github.com/rhasspy/rhasspy
        
         | iquerno wrote:
         | I would think that a good command-line is one that responds to
         | me within milliseconds on a crapbox i386 machine, and I can
         | COMMAND it what to do. A good command-line is not a binary blob
         | that cannot parse simple instructions correctly.
         | 
         | At the same time, Siri seems to be getting slower and fatter
         | every iteration so perhaps it is becoming more human ;)
        
         | sokoloff wrote:
         | > "hey siri, create alarms every 5 minutes starting at 6am
         | tomorrow"
         | 
         | "OK, I've created an infinite number of alarms, every five
         | minutes, starting at 6 AM tomorrow!"
         | 
         | (As a native English speaker, I'm not sure what specific
         | outcome you _want to happen_ from that request. That's the one
         | that makes the most sense.)
        
           | ghaff wrote:
           | As a native English speaker, that seems a profoundly odd
           | request but that is what you asked for.
           | 
           | And you now have me wondering how open-ended calendar
           | requests are actually implemented given that they can't
           | literally have entries out to infinity. (I assume they go out
           | some finite period and some background process periodically
           | re-populates future entries.)
        
             | mercutio2 wrote:
             | A recurrence rule is added to a start event, then an
             | occurrence cache is either generated on the fly for periods
             | of interest, or, yes, a rolling cache a year or two in the
             | future is maintained and updated daily.
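
The "generated on the fly for periods of interest" approach described above can be sketched roughly as follows. This is a minimal illustration, assuming a rule that is nothing more than a start time plus a fixed interval; the function name `occurrences_between` is hypothetical, and real calendar systems use much richer recurrence rules.

```python
from datetime import datetime, timedelta
from typing import List


def occurrences_between(
    start: datetime,
    interval: timedelta,
    window_start: datetime,
    window_end: datetime,
) -> List[datetime]:
    """Expand an open-ended rule only inside [window_start, window_end)."""
    # Number of whole intervals to skip so that we land on the first
    # occurrence at or after window_start (ceiling division on timedeltas).
    skip = max(0, -((start - window_start) // interval))
    current = start + skip * interval

    result = []
    while current < window_end:
        result.append(current)
        current += interval
    return result


# "Every 5 minutes starting at 6am" never needs infinitely many records:
# only the stored rule plus whatever window a query asks about.
rule_start = datetime(2022, 11, 24, 6, 0)
every_five = timedelta(minutes=5)
morning = occurrences_between(
    rule_start, every_five,
    datetime(2022, 11, 24, 6, 7), datetime(2022, 11, 24, 6, 25),
)
```

Because the first occurrence is computed arithmetically rather than by stepping from the start event, a query far in the future costs about the same as a query for tomorrow.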
        
               | ghaff wrote:
               | Perhaps trivial, but actually seems like an interesting
               | question given you have to potentially trade off RPCs for
               | routine queries (and the number of database records) vs.
               | being wrong for the random "Am I free on this day three
               | years from now?" query. Of course, the answer may be
               | that, in general, the differences don't really matter.
        
         | SheinhardtWigCo wrote:
         | > There /is/ power to-be-had, but nobody has really tapped it.
         | 
         | This kind of thing can't be built for modern mainstream
         | operating systems because they generally prevent subjugation of
         | the OS components and other programs, even if the user wants
         | that, ostensibly for security reasons.
         | 
         | Unlike a human operator, an assistant "app" can only operate
         | within the bounds of APIs defined by the OS vendors and third-
         | party developers. Gone are the days of third-party software
         | that extends the operating system in ways that the overlords
         | couldn't (or wouldn't) dream of.
        
           | sdf4j wrote:
           | That's not entirely true. Accessibility APIs on macOS, for
           | example, would let you control so many aspects of the OS from
           | user land apps given that permissions are granted. But voice
           | assistants are not up to the task.
        
         | Eleison23 wrote:
         | Me and voice assistants are like me on the ballroom dance
         | floor. I loved to take the lessons and learn all sorts of moves
         | and chain them all together and look impressive, but when I got
         | onto the floor with a partner, I just wouldn't know what to do
         | or where to start. I kept to the "basic" steps and maybe a
         | timid little turn once in a while.
         | 
         | Maybe it's possible to learn a working vocabulary and know how
         | to command a voice assistant. I know my way around several
         | command lines, but I have no idea what to say to Hey Google.
        
           | Avicebron wrote:
           | it almost sounds like you are describing how it feels to
           | learn a new language. And if that's the case and people need
           | to learn "voice assistant" to communicate with their device
           | effectively, hasn't it utterly failed as a natural language
           | processor?
           | 
           | Also I know this is true in other domains as well, obviously
           | there is a common "google-ese" that people learn to narrow
           | down their searches.
        
         | brycehalley wrote:
         | Voice assistants have reached the Unhelpful Valley stage.
         | 
         | When they were a novelty I recall the excitement of trying new
         | commands and layering in context, after many failures I've been
         | conditioned to now only attempt and expect success with generic
         | queries.
        
         | sublinear wrote:
         | I don't think this is actually reliably possible due to the
         | fact that while grammar does tend to follow patterns sometimes,
         | we're fundamentally dealing with an exponential number of ways
         | to say things to a voice assistant.
         | 
         | In the spirit of the title of this post, someone else also has
         | to say something.
         | 
         | If your argument is that this is a "non-visual command line"
         | there's slim hope of the layperson learning a whole secret
         | grammar without even a goddamn man page just to do their menial
         | tasks.
        
           | ianai wrote:
           | I really doubt *nix would have made it so far if the cli were
           | audio based, too. It's a fundamentally slower and lower
           | bandwidth communication channel.
        
             | zozbot234 wrote:
             | *nix was optimized for low-bandwidth channels. That's why
             | the command names and options are extremely terse and
             | typically return trivial output on success. OTOH it was
             | assumed that input would be reliable, so there's no
             | confirmation required for potentially dangerous commands. A
             | "*nix for voice" would need to address that, at the very
             | least.
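
One way a voice shell might address the unreliable-input problem is to read dangerous commands back before running them. The sketch below is purely illustrative: the `DANGEROUS` word list, the `handle` function, and the `confirm` callback are all hypothetical names, not part of any existing assistant.

```python
from typing import Callable

# Hypothetical set of verbs that should never run on a single mishearing.
DANGEROUS = {"delete", "format", "shutdown"}


def handle(utterance: str, confirm: Callable[[str], bool]) -> str:
    """Run a transcribed command, asking for confirmation if it is risky."""
    words = utterance.lower().split()
    if not words:
        return "ignored"
    # Read the command back and require an explicit yes for risky verbs.
    if words[0] in DANGEROUS and not confirm(f"Did you say: {utterance}?"):
        return "cancelled"
    return "running: " + " ".join(words)
```

Safe commands stay terse, as on a traditional command line, and only the risky ones pay the cost of an extra spoken round trip.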
        
               | ianai wrote:
               | I'd sure be lost if I had to listen to the entirety of a
               | manpage or dmesg output or /var/log/messages read out by
               | voice. Some of those could take hours to read out.
               | Nothing actually trivial about *nix command output. Just
               | sometimes terse.
        
         | gspencley wrote:
         | I might be in the minority, but I also don't want to add things
         | to my life that make my environment noisier or that require me
         | or others living with me to speak more. As much of a Star Trek
         | fan as I am, I never found "The Computer" to be appealing, and
         | always thought of it more as an artistic device. It's a lot
         | easier to communicate a character's intent / action if they are
         | vocalizing it for performance. Even in scenes where they are
         | "typing" something into the computer, they will inevitably be
         | communicating to the captain or another character what they are
         | doing.
         | 
         | In practical reality these interfaces feel, to me, extremely
         | inefficient. As someone who doesn't particularly like to speak,
         | and prefers silent environments, these interfaces require more
         | energy from me to use. Unless they are serving someone who has
         | a physical impairment then I don't see what problems exist that
         | these solve, but I can identify lots of problems that they
         | introduce (not only noise but privacy / security
         | vulnerabilities etc.)
         | 
         | Personal preference.
        
         | bistable wrote:
         | I think you're identifying some of the right problems here. All
         | voice assistants are based on turn-taking, and when the VoiceAI
         | hits one of those failure points and just comes back with "I
         | didn't get that" it leaves the user in a frustrating state
         | trying to debug what's wrong.
         | 
         | I work at SoundHound where we've been worried about these
         | issues. (I'm going to plug our recent work...) Our new approach
         | is to do natural language understanding in real-time instead of
         | at the utterance (turn) taking level. That way we can give the
         | user constant feedback in real-time. In the case of a screen
         | that means the user sees right away that they are understood,
         | and if not, a better hint of what went wrong. For example a
         | likely mistake is an ASR mistranscription for a word or two.
         | 
         | We still need to prove this is a better paradigm for VoiceAI in
         | products that people can try for themselves, and are working
         | towards that goal. I hope that voice interfaces that were
         | clunky with turn-taking will finally be more naturally usable
         | with real-time NLU.
         | 
         | https://www.youtube.com/watch?v=5WLYH1qHfq8
        
         | serial_dev wrote:
         | > Instead, it's a guessing game about syntax and semantics, and
         | frequently a source of frustration
         | 
         | My biggest frustration with Alexa is getting it to play the
         | podcasts I want to listen to. Even popular podcasts with
         | English names are hard to get just right for Alexa. The same
         | goes for song titles and bands that are not popular, or they
         | are in other languages.
         | 
         | Usually when I want to take a shower, I try to get the
         | podcasts/music to play for 2 minutes, then sigh, give up and
         | just say "Alexa play Britney Spears".
        
           | ghaff wrote:
           | And discoverability. For a long drive I probably want to pick
           | out some specific podcast episodes rather than play whatever.
           | I'm just not a whatever background sound sort of person. The
           | interfaces aren't really good enough to present me with some
           | options with voice control only. So I end up mostly pre-
           | populating a "Car" playlist.
        
         | _dain_ wrote:
         | >Voice assistants are basically just mainstream non-visual
         | command-lines, and it's unsurprising to me that something that
         | relies heavily on memorization and extremely specialized
         | "skills" isn't quite taking off in the way it was imagined.
         | 
         | This got me thinking. Voice recognition is basically a
         | commodity now .. there are open source AI engines that can do
         | it offline really well. So the recognition part is solved, you
         | can just grab it from your distro's package manager. Now
         | there's just the language part.
         | 
         | Thing is, I _don't_ want to speak to my computer using
         | English. Aside from the enormous practical problems in natural
         | language processing you've outlined, I just find the idea
         | creepy[1].
         | 
         | What I want is to unambiguously tell it to do arbitrary things.
         | I.e. use it as an actual computer, not a toy that can do a few
         | tricks. I.e. actually program it. In some kind of Turing
         | complete shell language that is optimized for being spoken
         | aloud. You would speak words into the open source voice
         | recognizer, it writes those to stdout, then an interpreter
         | reads from stdin and executes the instructions.
         | 
         | Is there any language like this? What should it look like?
         | 
         | And yeah that would take effort to learn to use it right, just
         | like any other programming language; so be it. This would be a
         | hobbyist thing.
         | 
         | [1] https://i.kym-cdn.com/photos/images/original/002/054/961/748...
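
The pipeline described above (recognizer writes words to stdout, interpreter reads them and executes) might look something like this in miniature. It is a toy sketch, not an existing language: the three-command vocabulary (`set`, `add`, `say`) and the `run` function are invented for illustration, and it makes no claim to Turing completeness.

```python
from typing import Dict, Iterable, List

# Spoken number words the toy grammar accepts (an arbitrary small set).
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_number(word: str) -> int:
    """Accept either a spoken number word or digits; raise ValueError otherwise."""
    if word in NUMBER_WORDS:
        return NUMBER_WORDS[word]
    return int(word)


def run(lines: Iterable[str]) -> List[str]:
    """Interpret one transcribed command per line and collect the replies."""
    variables: Dict[str, int] = {}
    output: List[str] = []
    for line in lines:
        words = line.lower().split()
        if not words:
            continue
        try:
            if len(words) == 4 and words[0] == "set" and words[2] == "to":
                variables[words[1]] = parse_number(words[3])
            elif len(words) == 4 and words[0] == "add" and words[2] == "to":
                variables[words[3]] = variables.get(words[3], 0) + parse_number(words[1])
            elif len(words) == 2 and words[0] == "say":
                output.append(f"{words[1]} is {variables.get(words[1], 0)}")
            else:
                output.append(f"not understood: {line.strip()}")
        except ValueError:
            # The number slot held something that is not a number.
            output.append(f"not understood: {line.strip()}")
    return output


# To wire this to a recognizer that prints one utterance per line, the same
# function could be fed sys.stdin instead of a list of strings.
```

Every utterance either matches a fixed pattern or is rejected outright, which is the unambiguous behaviour the comment asks for, at the cost of having to learn the patterns.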
        
           | Shank wrote:
           | > Voice recognition is basically a commodity now .. there are
           | open source AI engines that can do it offline really well. So
           | the recognition part is solved, you can just grab it from
           | your distro's package manager.
           | 
           | I personally don't consider this a fully-solved problem. The
           | best transcription system I've used is OpenAI Whisper, and it
           | doesn't work in realtime. Maybe it's fine on small amounts
           | but it's still not perfect. You really need error to be
           | driven down dramatically. Zoom auto-captions are a joke in
           | terms of how badly they work for me, and Live Text (beta) on
           | macOS is equally dreadful. YouTube auto-captions suck. All of
           | these use industry-leading APIs. If I'm speaking a voice
           | command and one single word is wrong, usually the whole thing
           | fails.
           | 
           | There's an entirely separate issue about things that are
           | Proper Nouns that don't exist. For example, "Todoist" is
           | often misunderstood by Siri. Thus, people started saying "Two
           | doist (where doist rhymes with joist)" to fool it into
           | understanding "Todoist". Media like anime with strange titles
           | from other languages often flat out trolls these
           | transcription systems. ("Hey Siri, remind me to watch Kimetsu
           | no Yaiba tomorrow".)
        
           | Aramgutang wrote:
           | That reminds me of the handwriting recognition approach [1]
           | used in old Palm Pilot devices. Even though the shapes it
           | expected you to draw resembled the corresponding letters, you
           | would never draw them like that if you were writing on paper.
           | 
           | You knew that you were drawing something designed for a
           | computer to recognise as unambiguously as possible, while
           | being efficient to draw quickly and easy to learn for you. I
           | feel like that's the kind of notion that voice interfaces
           | should somehow expand upon.
           | 
           | [1] https://en.wikipedia.org/wiki/Graffiti_(Palm_OS)
        
           | simiones wrote:
           | > Voice recognition is basically a commodity now .. there are
           | open source AI engines that can do it offline really well. So
           | the recognition part is solved, you can just grab it from
           | your distro's package manager.
           | 
           | This is potentially far from true, depending on how exactly
           | you draw the line between "voice recognition" and "language".
           | I've looked at quite a few transcription services, and they
           | fail a lot of the time for most people - those who either
           | have a non-native accent (even if very slight!) or those who
           | do any amount of stammering or other vocal tics.
        
             | ghaff wrote:
             | I find the ML transcription services, given 2 people
             | speaking English with high quality sound and without heavy
             | accents/a lot of jargon, to be adequate for having a
             | skimmable record--such as for extracting quotations (and
             | just go back to the recording to confirm the exact words if
             | it's not obvious). But if I'm publishing a transcript I get
             | a human transcription. Cleaning up the ML stuff takes way
             | too much time and I wouldn't publish a transcript without
             | cleaning it up.
        
               | simiones wrote:
               | I was in fact looking at some transcriptions of my recent
               | meetings, and found one that captures how even small
               | mistakes can make for completely not-understandable
               | transcripts, unless they are manually cleaned up.
               | 
               | Manual transcription:
               | 
               | > So no: long story short, Slum is basically the way we
               | can have an individual [, uhhh,] instance that carries
               | all the licenses.
               | 
               | (Slum is a project name in this case)
               | 
               | Computer transcription (MS Teams):
               | 
               | > So no.
               | 
               | > A long story shorts. Love is basically the way we can
               | have an individual.
               | 
               | > OHS instance that carries all the license.
        
           | viraptor wrote:
           | > So the recognition part is solved
           | 
           | If you're using an averaged American voice - maybe. But it's
           | really not solved for everyone. Google assistant can't set
           | the right timer for me 1/10 times. And that's before we get
           | to heavy accent Scots and others.
        
             | bambax wrote:
             | Obligatory reference:
             | https://www.youtube.com/watch?v=NMS2VnDveP8
        
             | arethuza wrote:
             | Even my "affected Edinburgh accent", as someone once
             | described it, causes no end of trouble with voice
             | recognition.
        
           | draugadrotten wrote:
           | > And yeah that would take effort to learn to use it right,
           | just like any other programming language; so be it. This
           | would be a hobbyist thing.
           | 
           | There are quite a few hobbyists working on local, on-prem,
           | privacy-focused voice assistants with conversation support.
           | 
           | https://www.home-assistant.io/integrations/#voice
           | https://www.home-assistant.io/integrations/conversation/
           | 
           | Have fun. It is a rabbit hole.
        
         | mc32 wrote:
         | To me what's interesting is that MS smelled that it was a
         | problem a while ago and pulled the plug before it ate a hole in
         | their wallet but Amazon and Google keep plugging along
         | ploughing money into a bottomless pit. Apple has a different
         | play and looks like they are controlling their losses there
         | quite well and may act as a slight loss leader for other
         | products.
        
           | foobarian wrote:
           | I can't fathom how they managed to spend so much on it,
           | though. The product has been around for quite a while, as
           | well, so it's not some initial ramp-up cost. $3B/quarter
           | $10B/year? Wow.
           | 
           | Edit: Maybe things like this happen because there are various
           | nerds who lead these products and are good at talking the
           | businesspeople into funding it. Maybe this was only possible
           | at the big tech growth stage while business wasn't that good
           | at telling the value proposition. So end result, lots more
           | engineers get paid which is great in my book :-)
        
         | freeone3000 wrote:
         | Your queries continue to be money-sinks -- even in your _ideal_
         | case, you aren't buying anything! This query costs them money
         | but earns them nothing. This is useless.
        
         | 1MachineElf wrote:
         | Another pitfall of most voice assistants is that they are
         | really designed first with the corporation in mind rather than
         | the user. Most are proxies for surveillance, advertising, or
         | are just steering consumers back to a preferred set of walled-
         | garden services.
        
         | gernb wrote:
         | > an assistant that's integrated into the OS and can change any
         | setting
         | 
         | That sounds like a security nightmare. Someone walks by and
         | starts changing your system settings? No thank you
        
         | qsort wrote:
         | The problem isn't voice, it's natural language.
         | 
         | Natural language is a fundamentally wrong vehicle to convey
         | information to a computer. It can be useful for some specific
         | tasks, automated Q/A, simple interfaces to databases, stuff
         | where I can't be properly f_ed to remember the syntax or the
         | shortcut like IDE commands.
         | 
         | But the idea it can replace formal language is fundamentally
         | and dangerously incorrect. I agree with Dijkstra's quip, we
         | shouldn't regard formal language as a burden, but rather as a
         | privilege.
        
           | Ajedi32 wrote:
           | I'm just waiting for someone to finally release a voice
           | assistant built around an actual language model, like GPT-3
           | or LaMDA.
           | 
           | It would be more error prone in a lot of ways, which is
           | probably why nobody's done it yet, but it would also be a
           | _lot_ more powerful, and fulfill the vision of conversational
           | AI in a way the current rules-based assistants do not.
           | 
           | I think if powerful language models were easily accessible to
           | normal people (in an inexpensive and completely unrestricted
           | fashion, like with Stable Diffusion) we'd already see this
           | happening in the open source world. Companies are going to be
           | a lot more hesitant to try it though until they have a way to
           | 100% prevent the models from making mistakes that could
           | reflect poorly on the company, which is going to take _way_
           | longer to achieve.
        
           | darkerside wrote:
           | The problem is both
        
           | version_five wrote:
           | Right - natural language works for people because we have
           | minds that are communicating. A virtual assistant has a list
           | of things it can do, and uses language as an interface to
           | them. So the language just becomes obfuscation instead of
           | allowing clarification.
           | 
           | I've said before, I would prefer a voice assistant that
           | optimized for traversing its menu system, in response to
           | unambiguous noises (could be high and low pitch hums or
           | whatever) that lets me bypass the guessing game and use the
           | menu it's hiding
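
A menu driven by two unambiguous sounds could be sketched as below, assuming a "low" hum moves to the next option and a "high" hum selects the current one. The `MENU` contents and the `navigate` function are hypothetical; detecting the pitch of a hum is outside the scope of this sketch.

```python
from typing import List, Sequence

# Hypothetical menu tree: nested dicts, with None marking a leaf action.
MENU = {
    "timers": {"set timer": None, "cancel timer": None},
    "music": {"play": None, "pause": None},
    "lights": {"on": None, "off": None},
}


def navigate(menu: dict, signals: Sequence[str]) -> List[str]:
    """Walk the menu: 'low' advances to the next option, 'high' selects it."""
    path: List[str] = []
    current = menu
    index = 0
    for signal in signals:
        options = list(current)
        if signal == "low":
            # Wrap around so the user can cycle through the options.
            index = (index + 1) % len(options)
        elif signal == "high":
            choice = options[index]
            path.append(choice)
            if current[choice] is None:
                return path  # reached a leaf action
            current = current[choice]
            index = 0
        else:
            raise ValueError(f"unknown signal: {signal!r}")
    return path
```

For example, `navigate(MENU, ["low", "high", "low", "high"])` walks to music and then pause without any guessing about phrasing.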
        
             | klibertp wrote:
             | Like this: https://www.youtube.com/watch?v=8SkdfdXWYaI ?
             | Here you traverse the AST, but the idea is similar, I
             | think.
        
           | 4b11b4 wrote:
           | An example backing this is voice assistants that DO work,
           | e.g. Talon voice. But these require defining a language, and
           | then they are very accurate and powerful.
           | 
           | I don't see why a voice assistant for the masses couldn't
           | "train its own users", for example by suggesting the language
           | it does expect. But even then, most times people are talking
           | in noisy environments or talk too fast or don't have an
           | understanding of how the machine might work. Regardless, who
           | cares. They ruin the audio environment of a home. They're
           | good for setting timers while you're cooking, that's about
           | it.
        
             | Thlom wrote:
             | Only thing I use Siri for as well.
        
             | Al-Khwarizmi wrote:
             | They're also fantastic at playing soothing music while your
             | hands are busy holding a crying baby.
        
             | tsss wrote:
             | Car voice assistants do this, but they're still clunky and
             | it takes them forever to list their options. Voice
             | interfaces, just like CLIs, suffer from extremely bad
             | discoverability and presentation compared to GUIs and thus
             | will always be limited to specialty applications. CLIs at
             | least have a league of try-hards and hobby linux users to
             | keep them alive.
        
           | RupertEisenhart wrote:
           | Are you trying to say, Alexa should be funding the synthetic
           | language nerds over at Lojban[0] or the Universal Networking
           | Language[1]???
           | 
           | That would be a fun universe.
           | 
           | [0] https://mw.lojban.org/index.php?title=Lojban&setlang=en-US
           | 
           | [1]
           | https://en.wikipedia.org/wiki/Universal_Networking_Language
        
           | foobarian wrote:
           | The problem is that it doesn't make money.
           | 
           | Otherwise, it works great :-) We love the hands-off usage
           | mode because we cook a lot, so adding things to shopping
           | lists or looking stuff up doesn't require cleaning hands in
           | the middle of prep. Also the speakers are pretty darn good
           | for the size and work well for music.
           | 
           | Doing complicated things is right out though. But the simple
           | stuff works fine.
        
           | albertzeyer wrote:
           | On the other side, humans have been fine using natural
           | language to delegate commands to each other.
           | 
           | So maybe it's just that the subfield of natural language
           | understanding is still too early to be really useful. Speech
           | recognition itself has gotten really good but then
           | understanding the context, the intent, etc, all that is
           | natural language understanding, and that is often the
           | problem.
        
             | moffkalast wrote:
             | > have been fine
             | 
             | Citation needed; there are a lot of disagreements and
             | misunderstandings (some have cost lives) that could've been
             | avoided if we didn't have 10 different ways to say the same
             | vague thing that can be interpreted in 20 ways. You think
             | the military uses a phonetic alphabet and specifically
             | structured communications for fun? Or the way planes talk
             | to ATC for example. Where precision and unambiguity are
             | crucial, natural language always gets ditched for something
             | more formal.
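
As a small illustration of trading natural phrasing for something more formal, here is a sketch that spells a word using the NATO phonetic alphabet mentioned above. The `spell` helper is a hypothetical name; the code words use the standard spellings, including "Alfa" and "Juliett".

```python
# NATO phonetic alphabet code words, keyed by lowercase letter.
NATO = {
    "a": "Alfa", "b": "Bravo", "c": "Charlie", "d": "Delta", "e": "Echo",
    "f": "Foxtrot", "g": "Golf", "h": "Hotel", "i": "India", "j": "Juliett",
    "k": "Kilo", "l": "Lima", "m": "Mike", "n": "November", "o": "Oscar",
    "p": "Papa", "q": "Quebec", "r": "Romeo", "s": "Sierra", "t": "Tango",
    "u": "Uniform", "v": "Victor", "w": "Whiskey", "x": "X-ray",
    "y": "Yankee", "z": "Zulu",
}


def spell(word: str) -> str:
    """Spell a word letter by letter; other characters pass through unchanged."""
    return " ".join(NATO.get(ch, ch) for ch in word.lower())
```

Each code word was chosen to be hard to confuse with the others over a noisy channel, which is the same property a spoken command vocabulary would need.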
        
               | groestl wrote:
               | Thanks for that. A lot of energy is currently sunk
               | because of natural language, and I'd argue gains from
               | employing software (instead of human processes) for
               | various tasks is in part due to scaling up the results of
               | many confusing discussions in natural language about what
               | a specific process actually comprises.
        
               | numpad0 wrote:
               | There is a widely accepted and straightforward view
               | that humans have ideas, which are expressed in languages,
               | and that languages being ambiguous is problematic. I'm
               | starting to have doubts about this.
               | 
               | Maybe we don't have clear intentions in the first place,
               | maybe languages are not just ambiguous, but only meant to
               | narrow realms of valid interpretations down to a desired
               | precision, rather than intended to form logically fully
               | constrained statements. Maybe this is why intelligent
               | entities are needed to "correctly" interpret natural
               | language statements, because an act of interpretation
               | is itself decision making and an action.
               | 
               | Just my thoughts but I do think there is more to be said
               | than "natural languages are ambiguous".
        
               | bbarnett wrote:
               | This is part of the reason Google search sucks more and
               | more.
               | 
               | Around when Android appeared, and the first voice
               | searches began, Google suddenly started to alias
               | everything.
               | 
               | Search for 'Andy', 'Andrew' appears. Search for 'there',
               | and 'they're' appears.
               | 
               | This has been taken further, now silly aliases such as
               | debian .. ubuntu exist, and as google happily drops words
               | in your search, to find a match, this makes precision
               | impossible.
               | 
               | But, that's the only way to make voice search remotely
               | work, so...
        
               | jefftk wrote:
               | I don't think this is to support voice search: Google
               | generally knows whether a query was initiated by voice or
               | typing. Instead, I think it's because most users find
               | what they're looking for faster with it.
               | 
               | If you have terms you don't want interpreted broadly you
               | can put them in quotes.
        
               | bluGill wrote:
               | Most people are not precise enough in their terminology.
        
               | Zach_the_Lizard wrote:
               | Google "helpfully" ignores the quotes sometimes too.
               | They're not the hard and fast rule they used to be.
               | 
               | I preached the Gospel of Google when the competition was
               | composed of web rings and Altavista, but Google in its
               | infinite wisdom has abandoned the advanced user with
               | changes of this nature.
        
               | jvolkman wrote:
               | Pretty sure quote support has improved recently.
               | 
               | https://blog.google/products/search/how-were-improving-
               | searc...
        
               | thfuran wrote:
               | So what is the gospel du jour, or are we forsaken in
               | these benighted times?
        
               | galaxyLogic wrote:
               | I find voice-assistant often useful for using the phone
               | such as opening a given setting, say make the display
               | brighter. Trying to navigate the settings pages is very
               | error-prone. There seems to be no universal standard as
               | to where each setting should be found.
        
               | shanebellone wrote:
               | This is actually an interesting point. In the Army, we
               | used terms that limited ambiguity thereby increasing
               | efficiency. Even if one eliminates the complexity of
               | language, there's still a specification problem.
               | 
               | I only use voice assistants to set alarms. I cannot
               | imagine voice as a primary input. Then again, many have
               | opted out of owning desktops and laptops in favor of
               | mobile phones. That also seems terribly inefficient.
        
               | ghaff wrote:
               | >Then again, many have opted out of owning desktops and
               | laptops in favor of mobile phones. That also seems
               | terribly inefficient
               | 
               | A lot of people don't _need_ computers in the general
               | purpose sense. I admit my mind boggles a bit when co-
               | workers tell me their kids don't want a computer to do
               | their school papers because their phone is fine. But,
               | then, I'm used to keyboards and what we think of as a
               | "computer" and have been using one for decades--and grab
               | one when I can for any remotely complex or input-heavy
               | task.
        
               | em500 wrote:
               | > A lot of people don't need computers in the general
               | purpose sense. I admit my mind boggles a bit when co-
               | workers tell me their kids don't want a computer to do
               | their school papers because their phone is fine.
               | 
               | I grew up in the 1980s, when handwritten papers were
               | still the norm. I do see the advantages of using a word-
               | processor for writing papers, but don't see why it would
               | be a necessity (at least, until University).
        
               | icapybara wrote:
               | I think the implication is that the kids use a word
               | processor on their phone.
        
               | [deleted]
        
               | moffkalast wrote:
               | It sounds ridiculous, but I'll admit that when you've got
               | something like Dex that lets you dock the phone for usb
               | and hdmi out and gives you close to a full desktop OS I'd
               | imagine it really is enough for the casual user.
        
               | ghaff wrote:
               | I certainly know colleagues in the industry who travel
               | with just a tablet and external keyboard. No, they're not
               | running IDEs etc., but they find it OK for emails,
               | editing docs, taking notes, etc. Personally I'll spend
               | the extra few pounds to also carry along a laptop. But I
               | can imagine not needing/wanting a dedicated laptop when I
               | travel at some point.
        
               | iso1631 wrote:
               | Is a tablet and keyboard really much lighter than a
               | laptop?
               | 
               | https://www.theverge.com/2020/4/20/21227741/apple-ipad-
               | pro-m...
               | 
               | Suggests a keyboard and large tablet is heavier than a
               | laptop
        
               | everdrive wrote:
               | The obsession with being lighter definitely has
               | diminishing returns. At some point another few ounces
               | doesn't make any difference in a real, practical sense. I
               | think we have just started to associate "lightness" ==
               | "better" despite there being no actual benefit past a
               | certain threshold.
        
               | galaxyLogic wrote:
               | Right at some point. But at the current point my tablet
               | is too heavy to hold in hand for more than 20 secs
               | perhaps. Phone is ok. Tablet is not (for me). I only use
               | tablet by placing it on table or a stand. Then actually
               | using a laptop is much better than a tablet.
               | 
               | The killer-tech will be when we have a tablet that is as
               | light as phone.
        
               | ghaff wrote:
               | I'm usually carrying a tablet anyway though for
               | entertainment/reading purposes. So it's usually a choice
               | of tablet + laptop vs. tablet + keyboard. (I admittedly
               | don't really have a weight optimized travel laptop these
               | days either.)
               | 
               | I actually do wish there were good Mac or Chromebook
               | choices for a travel 11" or so laptop but the market
               | seems to have settled on a thin 13" as the floor and,
               | admittedly, the weight/size difference isn't huge.
        
               | mark_l_watson wrote:
               | While I am mostly a Mac person, for travel I often prefer
               | a tiny and cheap Lenovo Chromebook that does everything
               | (a bit poorly): Linux containers for lightweight
               | programming and writing, consume media like books,
               | audiobooks, and streaming.
               | 
               | In response to a grandparent comment about weight for
               | tablets: I prefer Apple's old folio style of
               | cases/keyboards because of weight. I have one for both my
               | small and large iPad Pros. Whenever I travel, I usually
               | just take one of my iPads if I don't need a dev
               | environment [1].
               | 
               | [1] but with GitHub Codespaces and Google Colab,
               | development on an iPad is sort of OK.
        
               | moffkalast wrote:
               | I still don't see the point of tablets. It's just a
               | smartphone with a larger screen, and practically all
               | people already carry phones.
               | 
               | Might as well go for the laptop at that point given that
               | it can actually do far more imo, unless you ditch the
               | phone and go for one of those half phone half tablets I
               | guess.
        
               | ghaff wrote:
               | I'd rather watch movies, read, play certain games, etc.
               | on my tablet than on a phone. (Obviously there are also
               | specific use cases like digital art.) That said, I mostly
               | use my tablet when traveling and it's a distant third in
               | necessity compared to either a laptop or a phone--and
               | only somewhat more useful than a smartwatch.
        
               | everdrive wrote:
               | Watching movies on a tablet is terrible, though. All
               | methods for propping the device up so you can watch the
               | movie are inferior to the way a laptop screen props
               | itself up via hinges and a base.
        
               | ghaff wrote:
               | On a plane I'd rather use the tablet in my lap than have
               | to put the tray table down. And in a hotel room I'm
               | watching on the couch if there is one. (I do also have an
               | attachment for my tablet that will let you prop it up on
               | a table but I mostly don't use it because it adds
               | weight.)
               | 
               | For reading, I'm probably bringing my Kindle along if I
               | don't bring my tablet.
        
               | pfdietz wrote:
               | How old are you? Because larger screens become really
               | nice as your eyes go bad. And I don't need the full size
               | of a laptop for things I'd want to do on a tablet.
        
               | mod wrote:
               | I bought a surface for that reason. I like the
               | portability, and it is just a normal PC with a pretty bad
               | keyboard.
        
             | psadri wrote:
             | I agree with this. We have evidence that natural language
             | works well enough to run most of the world. AI will
             | eventually get there.
        
             | denton-scratch wrote:
             | > humans have been fine using natural language to delegate
             | commands to each other.
             | 
             | Not always resulting in unambiguous instructions:
             | 
             | "Lord Raglan wishes the cavalry to advance rapidly to the
             | front, follow the enemy, and try to prevent the enemy
             | carrying away the guns." ~Lord Raglan, Balaclava
             | 
             | "I wish him to take Cemetery Hill if practicable." ~Robert
             | E. Lee, Gettysburg
        
             | heavyset_go wrote:
             | > _On the other side, humans have been fine using natural
             | language to delegate commands to each other._
             | 
             | On the other hand, legalese exists and is the lingua franca
             | of telling people what to do, and math exists.
        
             | missjellyfish wrote:
             | > On the other side, humans have been fine using natural
             | language to delegate commands to each other.
             | 
             | And that's why all of aviation has moved to a tight
             | phraseology, such that delegated commands are universally
             | understood and their meaning is set in stone.
             | 
             | Natural language has cost many lives.
        
             | stubish wrote:
             | > On the other side, humans have been fine using natural
             | language to delegate commands to each other.
             | 
             | Using language to instruct humans goes wrong all the time.
             | Just a short while ago on British Bakeoff I saw 2 of the
             | contestants make white chocolate feathering on their
             | biscuits by making actual feathers out of white chocolate
             | and placing them on their biscuits. And I'm sure that will
             | confuse quite a few people reading this too. It certainly
             | confuses image searches. Language is a fuzzy interface.
             | Compare that to an interface like clicking on a button that
             | does the thing I want done.
        
               | Closi wrote:
               | How would you (easily) describe the concept of chocolate
               | feathering to a computer without using natural language?
               | (e.g. if you wanted the computer to generate an image, or
               | search for an image of / recipe with chocolate
               | feathering).
        
             | marcosdumay wrote:
             | > humans have been fine using natural language to delegate
             | commands to each other.
             | 
             | Every time we try to minimize errors, we formalize a
             | language. I don't even think people use natural language to
             | issue commands often. Commanding people is often considered
             | rude.
        
             | ska wrote:
             | > On the other side, humans have been fine using natural
             | language to delegate commands to each other.
             | 
             | I think this is really a characterization. Mostly human
             | communication is full of errors and problems.
             | 
             | What is true is that when it is important enough, humans
             | have come up with ways that minimize communication errors
             | and frameworks to deal with ambiguity - mostly these
             | involve training and effort though, it really doesn't come
             | naturally.
        
               | ska wrote:
               | "really a problematic characterization"...
        
           | pjc50 wrote:
           | The problem is that it's not actually a conversation. To
           | significantly improve it, you'd want to:
           | 
           | - identify users by voice
           | 
           | - ask them clarifying questions
           | 
           | - remember the answers on a per-user basis
           | 
           | - understand "no, that was the wrong answer"
           | 
           | If you're going to provide a formal interface to the
           | computer, you also have to provide teaching in that formal
           | interface, which is far more of a burden to the user than the
           | cost of the device. And we've completely moved away from that
           | model (not necessarily a good thing, but that's what the
           | market has chosen).
        
             | enobrev wrote:
             | Calling it a burden is an assumption that ignores and
             | belittles the end user. Sure, there are people who won't
             | want to train their personal ai.
             | 
             | But I imagine there are significantly more who would
             | appreciate clarifying requests by a teachable assistant
             | capable of interacting with the entire digital world on
             | their behalf, efficiently and intelligently.
        
             | michaelbuckbee wrote:
             | I think you're right. There are glimpses of this in the
             | voice interfaces right now. For example, Alexa will
             | distinguish between voices and preferentially take actions
             | for me, saying "Play Music" plays Spotify, and for my kids,
             | it plays Amazon music.
        
           | bombcar wrote:
           | I'd be perfectly happy with a list of Siri commands that _I_
           | would have to learn to be able to do things. I don't care if
           | I ended up sounding like:
           | 
           | Hey Siri
           | 
           | Turn lights on 50 percent
           | 
           | For one hour
           | 
           | Dim over that time
           | 
           | Play music.
           | 
           | I can learn what I need to do; JUST LET ME KNOW THE MAGIC
           | WORDS!
        
             | LooseMarmoset wrote:
             | It's like playing Zork all over again.
        
               | toxik wrote:
               | But with the added complexity that sometimes the speech-
               | to-text will just crap out completely.
        
               | LooseMarmoset wrote:
               | Alexa, turn on lights
               | 
               | ...I don't know how to do that
               | 
               | Alexa, turn lights on
               | 
               | ...What do I turn the lights with?
               | 
               | Alexa, activate lights
               | 
               | ...I don't know what you mean
               | 
               | ...It is pitch black. You are likely to be eaten by a
               | grue.
               | 
               | ALEXA TURN ON THE DAMN LIGHTS
               | 
               | ...I don't know the word "lights"
               | 
               | ...Oh no! You have walked into the slavering fangs of a
               | grue!
               | 
               | ** You have died **
        
               | bombcar wrote:
               | Siri, turn on bathroom lights.
               | 
               | Downstairs or upstairs bathroom?
               | 
               | Downstairs.
               | 
               | Sorry, I didn't understand. Downstairs or upstairs
               | bathroom?
               | 
               | Downstairs bathroom.
               | 
               | Sorry, I didn't understand. Downstairs or upstairs
               | bathroom?
               | 
               | Cancel.
               | 
               | Ok. Cancelling.
               | 
               | Siri turn on downstairs bathroom lights.
               | 
               | (Turns off all lights)
        
               | gernb wrote:
               | For me, about once a week it's
               | 
               | "hey siri?"
               | 
               | (no response, no icon),
               | 
               | "hey siri?"
               | 
               | (no response, no icon),
               | 
               | "hey siri?" (louder)
               | 
               | (no response, no icon),
               | 
               | "hey siri?" (louder and slower)
               | 
               | (no response, no icon),
               | 
               | reboot iphone 13 pro
               | 
               | "hey siri?"
               | 
               | works
        
               | spookthesunset wrote:
               | "Did you mean 'bathroom LED' or 'bathroom'?"
               | 
               | Because god help you if your device names are similar to
               | your room names...
        
               | bombcar wrote:
               | I've taken to naming my lights things like Greg, The
               | Beacons, etc.
               | 
               | And I added scenes so I can say "Gondor calls for aid"
               | and the beacons will light.
        
               | ghaff wrote:
               | Yes. And it may be worth noting that Zork is literally
               | something like 50 year old parser technology.
        
               | guestbest wrote:
               | A lisp compiler in a voice assistant would seem like an
               | improvement in that the user could define objects and
               | then express the actions to be performed in the same
               | room. But these assistants seem to drop objects between
               | commands making them hard to program conversationally.
               | 
               | I guess a Lisp-like language would be ideal and the
               | pauses would be like parentheses
        
             | wkdneidbwf wrote:
             | i don't know that you can do exactly all these things, but
             | this is the use case for custom routines in the amazon
             | ecosystem.
             | 
             | you create the prompt and add one or more actions to take.
        
             | everdrive wrote:
             | I highly doubt there is "a" magic list. I'll bet the magic
             | list changes constantly.
        
               | bombcar wrote:
               | I noticed a drop in usability about the time they went
               | with ML.
        
               | ASalazarMX wrote:
               | Same with the predictive keyboard, it feels more random
               | now.
        
             | ics wrote:
             | Not to take away from your point (I'd like the magic list
             | too) but to some degree, this can be worked around using
             | Shortcuts. If you use inputs, Siri will prompt for them
             | which is a bit slow but you could even use a dictate text
             | and parse yourself if desired.
        
           | ClumsyPilot wrote:
           | > we shouldn't regard formal language as a burden, but rather
           | as a privilege
           | 
           | What the hell? Is riding public transport or riding a bike
           | either a burden or a privilege? Is driving a car?
           | 
           | I am trying to control shit in my home, it should be neither.
        
             | duggan wrote:
             | Dijkstra's full essay[1] is a bit more illuminating, but
             | essentially it's about how, for example, developing a
             | system of symbols and formal language around mathematics
             | has allowed "school children [to] learn to do what in
             | earlier days only genius could achieve".
             | 
             | 1: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06
             | xx/E...
        
               | em500 wrote:
               | I think his argument even generalizes to literacy in
               | general. Remember that reading and writing skills don't
               | develop naturally (as opposed to spoken language). They
               | require a large educational investment, and used to be
               | reserved for the wealthy and the privileged.
        
           | gernb wrote:
           | Natural language conveys information to other people just
           | fine. So the problem isn't that "Natural language is a
           | fundamentally wrong vehicle to convey information to a
           | computer". The problem is getting the computer to understand
           | natural language to the same level as a human.
        
       | adamsmith143 wrote:
       | They just can't do useful things for me. I can do things like
       | turn off a light or get the weather which are only marginally
       | better than just flipping a switch or pressing one button on my
       | phone. It's not a large gain in efficiency.
        
       | vintermann wrote:
       | That's right, they're doing it for big surveillance.
       | 
       | Though, as much as these services are apparently bleeding, the
       | in-kind payments from big surveillance had better really be
       | worth it.
        
       | backtoyoujim wrote:
       | i don't want my voice assistant to be a sock puppet for a
       | corporate identity proctologist.
       | 
       | can you imagine hiring a production assistant that gave
       | everything you asked them to do to a corporate competitor ?
       | 
       | Why would I want to do that with my life and Amazon and Apple ?
        
       | [deleted]
        
       | nkrisc wrote:
       | I wrote off Siri when I was driving and said "play episode 6 of
       | XYZ podcast" and it was completely incapable. If it can't do
       | something like that, then what's the point? It's no different
       | than those hands-free Bluetooth adapters for your car that my dad
       | uses for his old android phone.
       | 
       | There are many other seemingly simple tasks it has failed at. All
       | I use it for now is sending texts and turning on navigation when
       | I'm driving.
        
       ___________________________________________________________________
       (page generated 2022-11-23 23:02 UTC)