[HN Gopher] What we know about the Apple Neural Engine
___________________________________________________________________
What we know about the Apple Neural Engine
Author : SerCe
Score : 265 points
Date : 2023-03-25 11:04 UTC (11 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| jeffrallen wrote:
| Ane means donkey in French.
|
| Just sayin'.
| marginalia_nu wrote:
| Alright, then, Apple Semantic System.
| eastbound wrote:
| No, those initials were already taken by Atlassian Software
| Systems. They seem to have lodged the paperwork with that
| name in 2002 and to have dropped it later on (they went with
| TEAM instead when they IPO'd in 2015), but back in 2010 when
| I applied, there was a book (a collection of news articles)
| in the waiting room for candidates titled "Atlassian
| Software Systems".
|
| Great guys.
|
| https://youtu.be/VfyUbuFoiBU
| marginalia_nu wrote:
| Can't make this shit up.
| dylan604 wrote:
| Don't forget the Advanced SubStation Alpha subtitle format
| mrweasel wrote:
| Which, shortened, also means donkey. Brilliant.
| ls612 wrote:
| Does anyone know if the neural engine on the new M1/M2 Max is
| directly hooked up to the unified memory the way the GPU is?
| wmf wrote:
| Define directly I guess.
| ls612 wrote:
| My understanding is that the CPU and GPU both have DMA to the
| memory at some incredible speed since it's all on the same
| chip. Does the ANE have that same DMA speed and latency?
| jamiek88 wrote:
| I believe so, as it's used by Adobe among others. This came
| up in a convo with an Adobe engineer gushing about UMA/DMA
| and what an improvement it was over the fan-whirring,
| jet-engine end of the Intel era.
|
| I can't find any documentation about it, though; everyone is
| just working under that assumption.
| anentropic wrote:
| Do we think Apple are going to provide more info and maybe a
| public API over time?
|
| Or are they keeping it obscure for commercial reasons?
|
| Or are they just not very competent / don't they care?
|
| It seems weird to have these amazing chips and only blunt
| tools.
| my123 wrote:
| CoreML.
|
| Directly exposing the ANE wouldn't make much sense, as it's an
| IP block that changes between generations in incompatible ways.
| brookst wrote:
| This is the answer. CoreML gives you an abstraction over
| different generations and sizes of underlying NPU.
|
| You might not _want_ the abstraction, but love it or hate it,
| that's kind of the Apple way.
|
| It will be very interesting to see what their next chips look
| like since we're getting to the point where HW designs will
| reflect the rise of the, uh, transformers.
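|
| To make that concrete, here's a minimal Swift sketch of what
| the Core ML path looks like from the app side. "MyModel" and
| the input feature provider are placeholders, and requesting
| the Neural Engine is only a hint: Core ML decides where each
| layer actually runs.
|
|     import CoreML
|
|     // Hypothetical helper: load a bundled model and run one
|     // prediction, asking Core ML to prefer the Neural Engine.
|     func predict(on inputs: MLFeatureProvider) throws -> MLFeatureProvider {
|         // "MyModel.mlmodelc" is a placeholder compiled model.
|         let modelURL = Bundle.main.url(forResource: "MyModel",
|                                        withExtension: "mlmodelc")!
|         let config = MLModelConfiguration()
|         // A hint, not a guarantee: Core ML falls back to GPU/CPU
|         // for layers (or hardware) the ANE can't handle.
|         config.computeUnits = .cpuAndNeuralEngine  // iOS 16+/macOS 13+; use .all earlier
|         let model = try MLModel(contentsOf: modelURL, configuration: config)
|         return try model.prediction(from: inputs)
|     }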
| sebzim4500 wrote:
| Can this really be everything publicly known about the ANE?
| It sounds hard to believe; I would have thought someone
| would have reverse engineered _something_ about it by now.
| SomeHacker44 wrote:
| See the other comment above about geohot's analysis, which
| is much more in-depth.
| detrites wrote:
| My question too. This semi-answer on the page seems to
| contradict itself (source: https://github.com/hollance/neural-
| engine/blob/master/docs/p... ):
|
| "> Can I program the ANE directly?
|
| Unfortunately not. You can only use the Neural Engine through
| Core ML at the moment.
|
| There currently is no public framework for programming the ANE.
| There are several private, undocumented frameworks but
| obviously we cannot use them as Apple rejects apps that use
| private frameworks.
|
| (Perhaps in the future Apple will provide a public version of
| AppleNeuralEngine.framework.)"
|
| The last part links to this bunch of headers:
|
| https://github.com/nst/iOS-Runtime-Headers/tree/master/Priva...
|
| So might it be more accurate to say you can program it
| directly, but won't end up with something that can be
| distributed on the app store?
| saagarjha wrote:
| Correct. (It is also unlikely that Apple exposes the Neural
| Engine directly.)
| djoldman wrote:
| geohot's findings:
|
| https://github.com/geohot/tinygrad/tree/master/accel/ane
| ljlolel wrote:
| https://news.ycombinator.com/item?id=35302833
| detrites wrote:
| Ok, this is much more like what I expected from the OP.
|
| Anyone disappointed, here be full details on everything.
| bitL wrote:
| It's really terrible that Apple markets this as the next big
| thing but forgets to include detailed documentation so people
| have to experiment and figure out what works...
| lonelyasacloud wrote:
| Partly Apple's docs haven't been great for a while, partly
| that's just how they roll, and partly they're trying (like
| most everyone) to figure out what their strategy is going to
| be in a post-GPT-4 world [0].
|
| [0] Whether to persist with their own models running
| locally, how much to integrate with the rest of the OS while
| maintaining the privacy moral high ground, that sort of
| thing.
| barkingcat wrote:
| Apple didn't "forget"; they never want to release Apple
| proprietary docs. It's their competitive advantage/moat.
| saagarjha wrote:
| People don't have to do anything. You use CoreML to program it.
| cynicalsecurity wrote:
| Proprietary software, dude. It really sucks.
| m3kw9 wrote:
| Can LLMs run on it?
| egman_ekki wrote:
| maybe https://github.com/apple/ml-ane-transformers
| enzomanico wrote:
| Yes, of course.
| simonw wrote:
| So my phone and my laptop both have the capability to perform 15
| trillion operations per second, just in the neural engine?
|
| What kind of things are taking advantage of this right now? It's
| gotta be more than just Face ID right?
|
| What's my laptop likely to be doing with that?
| k_bx wrote:
| They're putting it everywhere they can: from Notes, to
| pausing a video in QuickTime or Safari and instantly copying
| text from a frame.
| gedy wrote:
| I can imagine something like Siri running on-device much more
| effectively against local content. The cynic in me doesn't want
| to hope too much for cloudless services like this, but one can
| hope.
| sroussey wrote:
| This has been true since iOS 15 moved Siri on-device.
| sroussey wrote:
| https://www.engadget.com/ios-15-siri-on-device-app-
| privacy-1...
| IIAOPSW wrote:
| Siri wasn't a product. She was an emergent feature they
| couldn't extinguish.
| conradev wrote:
| It's used for a variety of things:
|
| - Biometrics (Face ID and Touch ID)
|
| - Image analysis (face matching, aesthetic evaluations, etc)
|
| - Text to speech and speech to text (smaller models on device,
| used for privacy/latency/reliability)
|
| - Small ad-hoc models like Raise to Speak on Apple Watch, the
| Hey Siri detector
| (https://machinelearning.apple.com/research/hey-siri)
|
| These things have been in phones for 5 years now and have
| been used from day one.
| simonw wrote:
| Right, but do any of those things really need 15 trillion
| operations per second? Have they been getting noticeably
| better with upgraded phone models?
| jamiek88 wrote:
| Yes, definitely.
|
| I could only find a blurry YouTube video of the instruction
| manual for an old, old heater in my house.
|
| I paused the video on the bit I needed, which the guy had
| zoomed into, and was able to copy and paste text I could
| barely read into a Notes doc.
|
| There's no one splashy thing, just lots of little
| quality-of-life improvements.
| burnished wrote:
| I got one recently and generally think the phone is garbage,
| but the OCR built into pictures is really something else. I
| took a photo of a label for a barcode when I couldn't see it
| myself but could get my hand nearby. It was at an odd angle,
| but when I pressed my finger to the text I was interested
| in, the phone captured it immediately, highlighted it, and I
| copied it nice as you please.
| blululu wrote:
| No, but the first-party uses should not consume all the
| compute on the chip. The bigger the margin, the better for
| the device. The other aspect of this is speed and power
| consumption (battery life is a top-3 phone feature across
| pretty much all consumers).
| secretsatan wrote:
| ARKit makes use of it on the phone: there's plane detection
| and classification, image and object detection, segmentation
| for people occlusion, and probably more behind the scenes
| (see the sketch below).
|
| I find it a little frustrating that we aren't using the
| built-in capabilities of iPhones more in our company. I
| still kinda think Apple tech is a pariah in some circles, so
| we have to run with stuff in the cloud that costs us money
| over, heaven forbid, something you could run on an iPhone.
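|
| A rough Swift sketch of those ARKit features; the API names
| are real, but whether each one runs on the ANE rather than
| the GPU/ISP isn't documented, and the view is a placeholder:
|
|     import ARKit
|
|     // Hypothetical setup: plane detection plus people-occlusion
|     // segmentation, where the hardware supports it.
|     func startSession(on sceneView: ARSCNView) {
|         let configuration = ARWorldTrackingConfiguration()
|         configuration.planeDetection = [.horizontal, .vertical]
|         if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
|             configuration.frameSemantics.insert(.personSegmentationWithDepth)
|         }
|         sceneView.session.run(configuration)
|     }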
| kmeisthax wrote:
| There's an app called Draw Things, for iOS/iPadOS/macOS/etcOS,
| that uses the ANE to run Stable Diffusion on your
| phone/tablet/laptop.
| fauigerzigerk wrote:
| I don't know for sure, but things like text recognition (Live
| Text) or object recognition in Photos (Visual Look Up) are
| obvious candidates.
|
| I think the Neural Engine is absolutely key to Apple's strategy.
| They want people to buy expensive devices and they don't want
| to process user data on their servers.
|
| Users get privacy. Apple gets money. It's a pretty coherent
| business model.
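|
| Third-party apps can reach the same on-device text
| recognition through the Vision framework. A minimal Swift
| sketch (the image parameter is a placeholder; whether Vision
| dispatches to the ANE is up to the system, not the caller):
|
|     import Vision
|
|     // Hypothetical helper: recognize text in an image, entirely on device.
|     func recognizeText(in image: CGImage) throws -> [String] {
|         let request = VNRecognizeTextRequest()
|         request.recognitionLevel = .accurate
|         request.usesLanguageCorrection = true
|
|         let handler = VNImageRequestHandler(cgImage: image, options: [:])
|         try handler.perform([request])
|
|         let observations = request.results ?? []
|         return observations.compactMap { $0.topCandidates(1).first?.string }
|     }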
| jjoonathan wrote:
| Privacy isn't the only benefit of local compute, users also
| get colossal bandwidth, tiny latency, and high reliability.
| crazygringo wrote:
| On the other hand, it kills your battery.
|
| Back when dictation was done in the cloud, I could dictate
| all day on my iPhone no problem.
|
| Now that it's on-device it kills my battery in a couple of
| hours.
|
| The latency is absolutely improved, and continuous
| dictation (not stopping every 30s) is a godsend.
|
| But it does absolutely destroy your battery life.
| bibanez wrote:
| Don't worry too much, because there is Moore's Law to the
| rescue. NPUs benefit from new processes.
| mcculley wrote:
| Moore's Law makes it a good long term strategy for Apple.
| The GP is complaining about his battery life today.
| lucideer wrote:
| I hope it's not disrespectful to point this out, less
| than 24 hours after his passing, but I don't think Gordon
| would object to my pointing out that Moore's Law has a
| finite length. Some have argued it expired up to 13 years
| ago; Moore himself predicted another 2 years or so.
| nwienert wrote:
| I built an always-on local OCR system that used ML on
| CPU/GPU a few years ago, and I can say with confidence it
| doesn't use much. We literally scanned your entire screen
| every two seconds and it used less than 1% in total, and
| this was before CoreML, which is far more efficient. I think
| it's FUD that it is that significant.
| fauigerzigerk wrote:
| Agreed.
|
| On the downside, we have to acknowledge that it is hugely
| inefficient for everyone to own expensive hardware that has
| to sit idle most of the time because it would otherwise
| drain the battery.
|
| Where low latency is not an absolute necessity, the
| economic pull of the cloud will be tremendous, especially
| if mobile networks become ubiquitous and fast.
| vinay_ys wrote:
| That's a weak argument. Lots of hardware sits idle in the
| cloud as well. And on your phone it's not expensive. In
| fact, the $/TFLOP is cheaper on a phone than in the cloud -
| the cloud has to deal with all kinds of complexity that you
| assume away in your local single-tenant phone context.
| fauigerzigerk wrote:
| I wouldn't be so sure. A quick web search brings up
| average server utilisation numbers for large-scale cloud
| providers between 45% and 65%. That's probably an order
| of magnitude or two higher than what you could do on a
| mobile device without absolutely annihilating the
| battery.
| flutas wrote:
| > They want people to buy expensive devices and they don't
| want to process user data on their servers.
|
| > Users get privacy. Apple gets money.
|
| Apple also gets users to subsidize the cost of compute
| indefinitely (by buying the expensive phone), rather than
| using their servers.
| blululu wrote:
| It's not a subsidy. It's a pricing structure for a
| commercial transaction. Fundamentally, a business cannot
| just give out free compute. In the long run, the user of the
| computation needs to pay for it. It's a question of whether
| people feel more satisfied paying for it in a lump sum
| bundled with a device or through a subscription plan in the
| cloud. For frequent, on-demand, low-latency applications I
| would suspect that people will always be happier running the
| computations locally.
| saagarjha wrote:
| Apple also runs an OS on that device, so they can't just
| offload infinite computation for it: it would use too much
| battery.
| iamgopal wrote:
| 15 trillion operations per second? Of what kind? Addition?
| Isn't that mind-blowing?
| selectodude wrote:
| Matrix multiplication
| amelius wrote:
| Of what size??
| sebzim4500 wrote:
| I know you probably didn't mean this, but in case anyone is
| confused: the ANE is not doing 15 trillion matrix
| multiplications per second. It is doing 15 trillion scalar
| operations in order to multiply a much smaller number of
| matrices.
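|
| As a rough back-of-the-envelope (counting each multiply and
| each add as one operation): multiplying two 512x512 matrices
| takes about 2 x 512^3, roughly 2.7 x 10^8 scalar ops, so the
| quoted ~15 TOPS would allow on the order of 55,000 such
| multiplications per second at full utilization, which real
| workloads never reach.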
| blululu wrote:
| To my knowledge this is mostly used by internal tools,
| though a number of common 3rd-party APIs (QR code scanning)
| use HW acceleration under the hood. Internally there is a
| ton of ML running on the device; the most obvious is the
| touch screen and inputs, and the camera. 3rd-party
| developers have access to this via CoreML, but unless
| latency is critical it is usually easier to develop and run
| ML in the cloud. For camera apps using ML, this chip is
| going to be used either explicitly or implicitly.
| simonw wrote:
| Oh, the touch screen! That's fascinating. Is that definitely
| running stuff on the neural engine?
| blululu wrote:
| If you think about it, a capacitive touch sensor provides a
| noisy grayscale image, and the goal is to detect and
| classify blobs as touch gestures as quickly and accurately
| as possible. Since it is running at all times and latency
| really burns the UX, this has always been done on a HW
| accelerator.
| ManuelKiessling wrote:
| Running Microsoft Teams. Barely.
| waboremo wrote:
| Scene analysis in photos, image captions, and machine
| translations are also done using ANE. CoreML also utilizes it
| when possible.
| mmaunder wrote:
| Has anyone done any work on using a model for transcription
| on the local device using the ANE? I've heard it kills the
| battery. Having to transcribe voice in the cloud is a
| serious impediment to end-to-end encryption for certain
| applications.
| intalentive wrote:
| This is close: https://github.com/ggerganov/whisper.cpp
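|
| For the built-in route (as opposed to whisper.cpp), the
| Speech framework has had an on-device mode since iOS 13;
| whether it actually uses the ANE isn't documented. A minimal
| Swift sketch, with the file URL as a placeholder and
| permission handling reduced to a comment:
|
|     import Speech
|
|     // Hypothetical helper: transcribe a local recording without
|     // sending audio to Apple's servers. (A real app must first
|     // call SFSpeechRecognizer.requestAuthorization.)
|     func transcribe(fileAt audioURL: URL) {
|         guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
|               recognizer.supportsOnDeviceRecognition else { return }
|
|         let request = SFSpeechURLRecognitionRequest(url: audioURL)
|         request.requiresOnDeviceRecognition = true  // keep it local
|
|         _ = recognizer.recognitionTask(with: request) { result, _ in
|             if let result = result, result.isFinal {
|                 print(result.bestTranscription.formattedString)
|             }
|         }
|     }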
| ah- wrote:
| There's also basic ANE support for Asahi now:
|
| https://github.com/eiln/ane
|
| https://github.com/eiln/anecc
|
| https://github.com/AsahiLinux/m1n1/pull/296/files
| rowanG077 wrote:
| That's misleading. It's much more apt to say it's being worked
| on. This is not available in any Asahi release at this time.
___________________________________________________________________
(page generated 2023-03-25 23:00 UTC)