[HN Gopher] Hold on there: WPA3 connections fail after 11 hours
       ___________________________________________________________________
        
       Hold on there: WPA3 connections fail after 11 hours
        
       Author : zdw
       Score  : 179 points
       Date   : 2024-01-25 21:21 UTC (2 days ago)
        
 (HTM) web link (rachelbythebay.com)
 (TXT) w3m dump (rachelbythebay.com)
        
       | HenryBemis wrote:
       | "It's not a bug, it's a feature". This way we know you are really
       | serious about keeping the thing going, by making sure that every
       | 11 hours you are THERE by "turning it off and on again" (to quote
       | The IT Crowd)
        
         | develatio wrote:
         | This is the hardware version of Netflix's "Are you still
         | watching?"
        
       | pierat wrote:
       | Or another reason to go with a real computer on eBay vs a RPi and
       | loads of other equipment.
        
         | Jabrov wrote:
         | According to the article, this has nothing to do with Pi and is
         | more about WPA3 itself
        
           | vlovich123 wrote:
           | Unclear if WPA3 or this specific Infineon chip. Most likely
           | it's a chip issue where some counter overflows or something
           | and suddenly the crypto starts failing.
        
             | MBCook wrote:
             | My read was it's the chip/driver combo.
             | 
             | And the Pi is not the only thing to use that chip, it's
             | just very popular.
             | 
             | So even if one's attitude is "get a real computer" like the
             | first post in this thread you may still find yourself with
             | this same chip.
        
               | tialaramex wrote:
               | I can easily buy either
               | 
               | 1. The chip works perfectly but it's documented badly/
               | wrongly, e.g. doesn't spell out the whole strategy needed
               | for a periodic key renewal or does so confusingly -
               | driver is written to match the docs or at least the best
               | understanding of these docs, and unfortunately this
               | defect arises after 10 hours but devs with the actual
               | docs have never tested for that long.
               | 
               | 2. The chip has a bug, it's not possible to use it
               | correctly per se. Every few hours, just reset it and
               | start over, thus a "good" driver should do resets when
               | quiescent. e.g. After 2 hours, if you did no work for 60
               | seconds, reset it, after 4 hours make that 15 seconds,
               | after 6 hours, 5 seconds, after 7 hours immediately on
               | quiescence, and after 8 hours just reset immediately
               | rather than wait for the bug.
               | 
               | [Edited to fix numerous typographical errors]
        
         | tamimio wrote:
         | Although I don't think this issue relates to the Pi, I agree:
         | some people think Pis will do anything and everything just
         | because you managed to hack some components together. Sure, it
         | might work, but it isn't reliable. Build a proper computer for
         | that task and save yourself the trouble.
        
         | Levitating wrote:
         | Yes, curse these low power cheap system on a chips! All
         | embedded software should be run on x86! Routers on x86, phones
         | on x86, smart thermostats on x86. Just buy them on eBay am I
         | right.
        
           | aleph_minus_one wrote:
           | > All embedded software should be run on x86! Routers on x86,
           | phones on x86, smart thermostats on x86.
           | 
           | In 2013, Intel made an attempt to make x86 suitable for
           | embedded purposes with the Intel Quark microcontroller:
           | 
           | > https://en.wikipedia.org/wiki/Intel_Quark
           | 
           | and branded developer boards for it under the name Intel
           | Galileo:
           | 
           | > https://en.wikipedia.org/wiki/Intel_Galileo
           | 
           | As far as I am aware, the market preferred existing
           | microcontroller solutions over Intel's offering.
        
             | fifteen1506 wrote:
             | Maybe they'll try again with x86s.
        
             | topspin wrote:
             | MCUs are all about peripherals. The ISA barely matters. The
             | peripherals are frequently ported forward unchanged or
             | enhanced in forward compatible ways from older MCUs with
             | different ISAs because they are specialized, proprietary,
             | carefully refined circuits and developers can port their
             | working C peripheral drivers with minimal change. You can't
             | helicopter in to that market with your half baked, brand
             | new peripheral suite and expect everyone to adopt it.
             | 
             | Intel did go far in set top boxes and smart TVs with their
             | atom/canmore (SoC, not MCU) platform in ~2008 and onward. I
             | ported some media software to it. I think a lot of that has
             | been given back to ARM and Android since, however.
        
           | postepowanieadm wrote:
           | "cheap"
        
           | toast0 wrote:
           | Intel kind of abandoned embedded recently, but 80186 was x86
           | for embedded before it was cool. And they've tried to make it
           | work from time to time.
           | 
           | x86 routers are great. x86 phones aren't awful, and could
           | have been amazing if Intel didn't cancel them right before
           | Microsoft released Continuum.
           | 
           | Not sure if I've seen a smart thermostat on x86, but I dunno,
           | sure, as long as it's got a common wire so it can pull power,
           | should be fine.
        
         | NoZebra120vClip wrote:
         | "Here's a nickel, kid."
         | http://www.miketaylor.org.uk/tech/eta/doc/dilbert.gif
        
       | throw0101d wrote:
       | > _The thread runs for close to a year, and then just stops cold
       | in August 2022 with no resolution._
       | 
       | "Who were you, DenverCoder9? _What did you see?_ "
       | 
       | * https://xkcd.com/979/
        
         | toyg wrote:
         | Some datascientist out there should crunch the numbers and give
         | us the probability that an XKCD link will appear in any given
         | HN post. I bet it's fairly close to 1.
         | 
         | One day we will have an "XKCD number", to represent how far
         | from an XKCD reference any existing concept is.
        
           | chaboud wrote:
           | Maybe the useful XKCD number would be the proportion of
           | comments before and after the first XKCD reference? There's
           | also the "true XKCD percentage", which could exceed 1 for
           | highly XKCD'd topics.
        
             | toyg wrote:
             | _> the useful XKCD number would be the proportion of
             | comments before and after the first XKCD reference_
             | 
             | That would be XKCD velocity, I guess.
        
               | ploum wrote:
               | Velocity is indeed a nice idea.
               | 
               | According to XKCD's law, the probability to have an xkcd
               | reference is proportional to the number of comments (see
               | https://ploum.net/xkcds-law/index.html )
               | 
               | But adding velocity is really nice. Should think more
               | about it...
        
           | amenghra wrote:
           | I don't consider myself a datascientist but I did some HN x
           | XKCD analysis a while ago. The HN dataset is hosted on Google
           | BigQuery and it is included in some TB per month of free tier
           | processing you get. I.e. you can answer your question quite
           | easily. See my stuff if you are interested:
           | https://www.quaxio.com/hackernews_xkcd_citations/
        
             | gavinhoward wrote:
             | Great post!
             | 
             | You should submit a Hacker News post for it!
        
               | amenghra wrote:
               | [delayed]
        
         | layer8 wrote:
         | The even better case is when the user posts "nevermind, found
         | the solution", but doesn't post the solution, and marks the
         | thread as "[RESOLVED]".
        
       | throw0101d wrote:
       | One hypothesis: the cipher needs regular key rotation? E.g. (not
       | what WPA3 actually does):
       | 
       | > _Encryption keys should be changed (or rotated) based on a
       | number of different criteria: (...) After the key has been used
       | to encrypt a specific amount of data. This would typically be
       | 2^35 bytes (~34GB) for 64-bit keys and 2^68 bytes (~295 exabytes)
       | for 128 bit keys._
       | 
       | * https://security.stackexchange.com/questions/259808/why-shou...
       | 
       | Though this would be bit-dependent and not time-dependent.
       | 
       | I'm guessing that this is likely some kind of 'uptime bug':
       | 
       | * https://en.wikipedia.org/wiki/Integer_overflow
        
         | genewitch wrote:
         | I highly doubt they sent 34GB in 11 hours - or, put another
         | way, sent 34GB consistently every 11 hours such that the keys
         | were exhausted after the same amount of time. This isn't a
         | smart washing machine we're talking about here, just [a]
         | raspberry pi(s)
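A rough sanity check of the doubt above, as a sketch only: it assumes the 2^35-byte figure from the quoted text and a perfectly steady transfer over exactly 11 hours, neither of which is claimed by the article. The names `LIMIT_BYTES`, `WINDOW_S` and `required_rate_bytes_per_s` are made up for illustration.

```python
# Back-of-the-envelope: what sustained rate moves 2^35 bytes in 11 hours?
LIMIT_BYTES = 2 ** 35      # rotation threshold taken from the quoted text
WINDOW_S = 11 * 3600       # observed failure interval, in seconds


def required_rate_bytes_per_s(limit_bytes: int = LIMIT_BYTES,
                              window_s: int = WINDOW_S) -> float:
    """Average bytes per second needed to reach the limit within the window."""
    return limit_bytes / window_s


rate = required_rate_bytes_per_s()
print(f"{LIMIT_BYTES / 1e9:.1f} GB over {WINDOW_S} s")
print(f"{rate / 1e3:.0f} kB/s, roughly {rate * 8 / 1e6:.1f} Mbit/s sustained")
```

Even if a device did sustain that rate, three devices hitting the limit at the same moment, every time, would still need explaining.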
        
           | Jtsummers wrote:
           | In exactly 11 hours on three devices, in particular. On one
           | device, the first time it fails after 11 hours could be a
           | coincidence (unlikely, but possible) but repeatedly happening
           | across all three devices suggests something entirely
           | different.
        
         | DannyBee wrote:
         | GTK rekey interval is time dependent. Most AP's set it to 3600
         | seconds by default.
        
         | londons_explore wrote:
         | Why are we allowing anyone to use 64 bit keys in 2024? Why is
         | it even part of the WPA3 spec?
         | 
         | To be honest, whoever wrote that bit of the spec should have
         | 'probably part of a three letter agency, don't trust what they
         | say' added to the top of their Wikipedia page.
        
           | throw0101d wrote:
           | > _Why are we allowing anyone to use 64 bit keys in 2024? Why
           | is it even part of the WPA3 spec?_
           | 
           | "We" are not, and it is not. That is just a general example
           | of the concept of needing key rotation.
        
           | 38 wrote:
           | because many programming languages don't support anything
           | beyond 64 bit integers.
        
             | Jtsummers wrote:
             | Maximum integer size on a CPU or in a language has nothing
             | to do with key sizes for serious (non-toy) cryptographic
             | systems. Use multiple integers, likely in an array, if
             | needed, as has been done for a very long time.
        
               | tialaramex wrote:
               | Yes, it's not usual for the rest of the software to think
               | about these as integers at all, they're just a bunch of
               | bits, like a JPEG so yes, the 128-bit key would be e.g.
               | Rust's [u8; 16] exactly 16 contiguous bytes.
               | 
               | The encryption algorithms themselves, _if_ they're even
               | written in a high level language rather than supplied as
               | machine code, perhaps using hardware acceleration, may
               | treat this some other way, but that's completely
               | irrelevant to you in the rest of the code. Maybe it sees
               | this as [u16; 8] or as [[u32; 2]; 2] for some reason, you
               | don't care.
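The same point can be illustrated outside Rust. This is a minimal Python sketch, not taken from any real Wi-Fi stack: a 128-bit key is held as 16 opaque bytes, and the 16-bit and 32-bit "views" are only reinterpretations of those bytes. The big-endian word layout is an arbitrary choice made for the example.

```python
import secrets
import struct

# A 128-bit key is just 16 opaque bytes, never a single "integer".
key = secrets.token_bytes(16)

# Two alternative views of the same bits, analogous to [u16; 8] or four u32s.
as_u16 = struct.unpack(">8H", key)   # eight big-endian 16-bit words
as_u32 = struct.unpack(">4I", key)   # four big-endian 32-bit words

# Round-tripping shows that nothing depends on which view is used.
assert struct.pack(">8H", *as_u16) == key
assert struct.pack(">4I", *as_u32) == key
print(len(key) * 8, "bits held in", len(key), "bytes")
```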
        
           | Jtsummers wrote:
           | Turns out that that text is incorrect and it's 64-bit and
           | 128-bit block sizes. The text on the cheat sheet page has
           | been partially corrected to now say 128-bit block size, but
           | still says 64-bit key size.
           | 
           | The current version of that page reads as:
           | 
           | > This would typically be 2^35 bytes (~34GB) for 64-bit keys
           | and 2^68 bytes (~295 exabytes) for 128-bit block size.
           | 
           | So it's sloppy writing that's the issue, not that people are
           | still using 64-bit keys. (I had a similar question reading
           | the quote above and followed the link where this was pointed
           | out.)
        
             | scharman wrote:
             | I'm not a crypto geek! But, I thought the block size had to
             | be smaller than the key size?
        
               | Dylan16807 wrote:
               | There's no rule that you need to mix a raw key bit with
               | every data bit. Block ciphers usually expand their key
               | into a bunch of subkeys to use in different rounds, and
               | you can stretch that expansion as far as you desire.
               | 
               | And if you squint, a stream cipher is just a block cipher
               | with a stupidly large block.
        
       | dmorgan81 wrote:
       | 11 hours = 39,600 seconds
       | 
       | Max signed int16 = 32,767
       | 
       | Maybe a signed overflow?
        
         | stevewodil wrote:
         | What happened to the other two hours
        
       | rubatuga wrote:
       | Just a reminder that the original ath9k driver supports WPA3 and
       | 802.11s mesh without any hacks. The open source wifi ecosystem is
       | not cursed... although we are stuck with old hardware.
        
       | natch wrote:
       | But is there an automatable workaround?
       | 
       | Even a 'no' to this question would be a useful addition to this
       | article.
        
         | genewitch wrote:
         | yes, but it's relatively uninteresting; you could just reboot
         | every 10 hours and 59 minutes, for example. You could check the
         | output of a ping command and if it stops giving new sequence
         | numbers or whatever ifupdown the interface, and so on.
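The ping-based idea above could look roughly like the sketch below. Everything specific here is an assumption for illustration: the gateway address `192.168.1.1`, the interface name `wlan0`, the three-failure threshold, and the use of Linux `ping` (iputils flags) and `ip link`. Whether bouncing the link actually forces a fresh association depends on how the network is managed on the device, so treat it as a starting point, not a fix.

```python
import subprocess
import time


def link_is_up(host: str, run=subprocess.run) -> bool:
    """Return True if a single ping to `host` succeeds."""
    # -c 1: one echo request; -W 2: wait at most 2 seconds for the reply.
    result = run(["ping", "-c", "1", "-W", "2", host],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def bounce_interface(iface: str, run=subprocess.run) -> None:
    """Take the interface down and bring it back up (needs root)."""
    run(["ip", "link", "set", iface, "down"], check=False)
    run(["ip", "link", "set", iface, "up"], check=False)


def watchdog_step(host: str, iface: str, failures: int,
                  max_failures: int = 3, run=subprocess.run) -> int:
    """Run one check and return the updated consecutive-failure count."""
    if link_is_up(host, run):
        return 0
    failures += 1
    if failures >= max_failures:
        bounce_interface(iface, run)
        return 0
    return failures


def main(host: str = "192.168.1.1", iface: str = "wlan0",
         period_s: int = 30) -> None:
    """Loop forever, checking the link every `period_s` seconds."""
    failures = 0
    while True:
        failures = watchdog_step(host, iface, failures)
        time.sleep(period_s)
```

Calling `main()` as root starts the loop. The `run` parameter exists only so the logic can be exercised without touching a real interface.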
        
           | natch wrote:
           | So say a scripted reboot, like in a cron job, is that what
           | you're talking about? Automatable is important. I suppose the
           | script would have to run as root or with suid root? And
           | thanks for the reply.
        
             | simcop2387 wrote:
             | I would probably use a systemd timer instead because it has
             | a feature that most crons don't. You can tell it to start
             | the reboot 10 hours and 55 minutes after the timer was
             | started, regardless of time of day or other clock concerns.
             | That means you don't need valid global time or anything to
             | schedule it properly and it'll be built in to the already
             | there raspberry pi os (and most other distros for the pi)
             | so there's no new software to install or setup.
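This is not a systemd timer, but the property it relies on (elapsed time rather than wall-clock time) can be sketched in Python. The sketch assumes Linux, where `/proc/uptime` reports seconds since boot, and assumes the Wi-Fi association starts at roughly boot time, which may not hold. The function names are invented for the example, and actually triggering the reboot is left to the caller.

```python
REBOOT_AFTER_S = 10 * 3600 + 55 * 60   # 10 h 55 min, the figure from the comment


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Seconds since boot on Linux; not affected by wall-clock changes."""
    with open(path) as f:
        return float(f.read().split()[0])


def seconds_until_reboot(uptime_s: float,
                         limit_s: float = REBOOT_AFTER_S) -> float:
    """How much longer to wait before a reboot is due (0.0 if overdue)."""
    return max(0.0, limit_s - uptime_s)
```

A caller would sleep for `seconds_until_reboot(read_uptime_seconds())` and then request the reboot by whatever means the system provides.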
        
           | JoBrad wrote:
           | Or could you just establish a new connection on the same
           | cadence?
        
         | clhodapp wrote:
         | Posting workarounds is quite fraught because as soon as you say
         | that there's a workaround, upstream responses tend to
         | immediately become "has workaround, wontfix"
        
       | jandrese wrote:
       | Damn, I was hoping this was going to be a thread about the guy
       | who tracked down the bug in that damn chip and fixed it with some
       | binary patch even though the vendor was completely useless.
        
         | redundantly wrote:
         | I'd like to read that.
        
           | LoganDark wrote:
           | I know, right?
        
       | DannyBee wrote:
       | Reading that thread, it's failing to rekey properly.
       | 
       | The rekey interval is probably set to 3600 seconds, something
       | dumb breaks after the 10th rekey (36000 seconds), and so when the
       | 11th rekey interval comes around at the 11th hour (39600 seconds)
       | and it hasn't re-key'd, the authentication ends up no longer
       | valid and it disconnects.
       | 
       | Easy way to test this theory:
       | 
       | Set the re-key interval on the AP to like 10 seconds, see if it
       | disconnects after 110 seconds.
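The arithmetic behind that test can be written down as a small sketch. It only encodes the hypothesis stated above (rekeys stop succeeding after the 10th, and the connection drops when the next one comes due); the function name and the `last_good_rekey` parameter are made up for illustration.

```python
def predicted_disconnect_s(rekey_interval_s: int,
                           last_good_rekey: int = 10) -> int:
    """Time at which the first missed rekey comes due, under the hypothesis."""
    return rekey_interval_s * (last_good_rekey + 1)


for interval in (3600, 10):
    print(f"rekey every {interval} s -> "
          f"disconnect expected at {predicted_disconnect_s(interval)} s")
```

If the drop tracks the rekey interval this way, the rekey path is implicated; if it still happens at 11 hours regardless, a time-based counter looks more likely.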
        
         | k_sze wrote:
         | sounds like a 16-bit int overflow?
        
           | Jtsummers wrote:
           | A signed 16-bit integer rolls over after 32767, which is a
           | little after 9 hours and 6 minutes (if we're tracking
           | seconds). An unsigned 16-bit integer rolls over at 18:12 and
           | change. Neither fits the observed 11 hour mark.
        
             | vlovich123 wrote:
             | HW often used counters to represent multiple seconds so I
             | was thinking something like a 32 bit counter representing
             | the number of 10us elapsed which comes out to just over 11
             | hours.
             | 
             | Hard to say though since no obvious counter comes out to 11
             | hours. That being said, a counter of some kind going wrong
             | would be the most obvious reason to experience an issue.
             | Maybe there's a WiFi frame counter that participates in
             | ratcheting the key used to encrypt each frame or something.
        
               | Jtsummers wrote:
               | Same issue.
               | 
               | A 32-bit unsigned integer would roll over after ~42949
               | seconds used as a 10 us counter. That's just shy of 12
               | hours at 11:55 and change, not just over 11.
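The rollover figures discussed in this subthread can be reproduced with a few lines. This is a sketch of the arithmetic only; the counter widths and tick sizes are the ones guessed at in the comments, not anything known about the chip.

```python
OBSERVED_S = 11 * 3600  # the observed failure time: 39,600 s


def rollover_seconds(bits: int, tick_s: float, signed: bool = False) -> float:
    """Seconds until a counter of `bits` width, ticking every `tick_s`, wraps."""
    max_value = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return max_value * tick_s


def hms(seconds: float) -> str:
    """Format a duration as h:mm:ss, truncating fractional seconds."""
    s = int(seconds)
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"


candidates = {
    "signed 16-bit, 1 s ticks": rollover_seconds(16, 1.0, signed=True),
    "unsigned 16-bit, 1 s ticks": rollover_seconds(16, 1.0),
    "unsigned 32-bit, 10 us ticks": rollover_seconds(32, 10e-6),
}
for name, secs in candidates.items():
    print(f"{name}: {hms(secs)} (observed {hms(OBSERVED_S)})")
```

None of the three lands on 11:00:00, which matches the conclusion above that a plain overflow of these counters does not explain the timing.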
        
       | Thri4895o wrote:
       | > Broadcom/Cypress/Infineon CYW stuff
       | 
       | > My conclusion: this entire ecosystem is deeply cursed
       | 
       | There is no curse, but cheap junk sold for premium price! Those
       | components are designed for cheap laptops and tvs. Building
       | networks and servers out of this is foolish!
       | 
       | If you want stable WiFi, get Intel card connected over Pcie, not
       | USB! It is like 20 USD with official WPA3 support!
        
         | mardifoufs wrote:
         | Uh? I thought Broadcom and Infineon made very solid stuff...
        
           | Thri4895o wrote:
           | Pi is built from the cheapest components!
           | 
           | Broadcom is famous for their terrible Linux support (typical
           | android style binary firmware blob dump). Many drivers are
           | unofficial, only reverse engineered. Some commenter here even
           | mentions patching binary blobs to fix issues!
           | 
           | Murata Type1GC mentioned here, goes back to 2015. It also has
           | several compromises to keep size small. Hardly stellar
           | config!
        
           | bpye wrote:
           | Broadcom at least are certainly capable of making reliable
           | network devices, their switch ASICs [0] are very common.
           | 
           | [0] - https://www.broadcom.com/products/ethernet-
           | connectivity/swit...
        
       | drewg123 wrote:
       | I have seen issues with Apple (Mac and iPhone) devices losing
       | connectivity on WPA2 after each rekey. The symptom is that the
       | device drops offline for several seconds, sometimes connecting to
       | another network if one is available. This wreaks havoc with
       | FaceTime calls. I've also seen iPhones drop off wifi entirely if
       | another network is not available, until you disable and re-enable
       | wifi.
       | 
       | I found that extending the rekey interval makes it happen less. I
       | think I set the interval to something very large.
        
         | poyu wrote:
         | I've always thought this is due to Wi-Fi Assist [0] being less
         | than helpful? Never tested any further though
         | 
         | [0]: https://support.apple.com/en-us/102228
        
       | Scene_Cast2 wrote:
       | I recently found out that WPA3 and especially WPA2/WPA3 mixed
       | were having issues with WiFi roaming - as in, with 2 access
       | points, devices were not switching from one to another unless one
       | of them just drops the client to force a reassociation.
       | 
       | I ended up just using WPA2, as I don't have any 6GHz or WiFi 7
       | APs.
        
       | barbegal wrote:
       | I suspect that this patch
       | https://w1.fi/cgit/hostap/commit/?id=b0f457b6191aa8ee329331c...
       | may fix the issue although the default key lifetime is 12 hours
       | not 11 but maybe there is something which runs every hour and
       | checks which keys will expire within the next hour. New
       | technology like this is often horrifically under tested and there
       | must be so many bugs for those who go hunting.
        
       | nikanj wrote:
       | It's an interesting failure mode of our current economic system
       | that the fix for this issue will inevitably come from a lone
       | frustrated outsider.
       | 
       | The employees of the company are much better equipped with all
       | the insider knowledge, blueprints, source codes et cetera. But
       | the incentive models for companies essentially guarantee that
       | they'll never go through the trouble of solving a persistent
       | problem for an already-on-the-market product, if the problem is
       | not big enough to cause a recall or substantial numbers of
       | product returns
       | 
       | So our only hope is some amateur sleuth operating from home,
       | stabbing blindly at the card with caffeine and hatred as the only
       | tools in their disposal. But they'll probably solve it, and we'll
       | get to read a cool blog post about it!
        
       | roboman wrote:
       | I've worked as an engineer in the Wi-Fi industry. Here's my
       | advice: stay at least one or, even better, two generations behind
       | the current Wi-Fi standards.
       | 
       | Vendors care about only two things: (1) Cost (2) the Gbps they
       | can print on the box.
       | 
       | So today that means 802.11ac and WPA2, unfortunately.
       | 
       | Stability of the software is not a consideration.
        
         | itronitron wrote:
         | I'm the proud owner of a major brand WiFi router, and there is
         | a typo on the admin login screen. Not a good look, but it's the
         | best we've got.
        
         | silisili wrote:
         | I think ax is now 2 generations behind, if we count 6e and 7,
         | and a pretty nice upgrade over ac.
         | 
         | The nice thing about the newer 2 is 6ghz, for people in highly
         | congested areas. I live on an acre and pick up tons of 2.4ghz,
         | some 5ghz. I can't imagine how bad it would be in a modern
         | tract home development, or worse, a large apartment complex.
        
       | spr-alex wrote:
       | There's a misunderstanding in the article and the comments worth
       | clarifying for people on SAE/WPA3 and the 6Ghz/6-E Bands.
       | 
       | WPA3 is only required when operating on the 6ghz bands.
       | 
       | For 802.11ac/ax it's not required on the 5ghz bands.
       | 
       | So it is possible to spin up an AP on 6ghz with SAE only, and
       | another AP on 5ghz in WPA2/3 mixed mode. It's not an all or
       | nothing choice for the pi which first of all only supports
       | 802.11ac and secondly does not support 6ghz with the builtin wifi
       | card.
       | 
       | In terms of range and capabilities we've found the mt76 series to
       | be very reliable and support wifi6 and 6-e on the pis with speeds
       | easily reaching 600mbps.
       | 
       | These are the A8000 cards from netgear and the ALFA AWUS036AXML.
       | They're pretty good for what they are. I only wish they were Dual
       | Band Dual Channel.
       | 
       | WiFi 7 will see clients gaining Multi Link Operation capability,
       | they'll be able to simultaneously hit 5ghz and 6ghz and 2.4ghz
       | bands for 2-3x throughput.
        
       | zamalek wrote:
       | An alternative to the rekeying interval is potentially a DFS
       | channel being used and a radar detection causing a channel switch.
        
       ___________________________________________________________________
       (page generated 2024-01-27 23:00 UTC)