[HN Gopher] Apple's On-Device and Server Foundation Models
       ___________________________________________________________________
        
       Apple's On-Device and Server Foundation Models
        
       Author : 2bit
       Score  : 112 points
       Date   : 2024-06-10 21:42 UTC (1 hour ago)
        
 (HTM) web link (machinelearning.apple.com)
 (TXT) w3m dump (machinelearning.apple.com)
        
       | GaggiX wrote:
       | It would be cool to understand when the system will use one or
       | the other (the ~3 billion parameter on-device model or the
       | bigger one on Apple servers).
        
         | swatcoder wrote:
         | Conceivably, they don't have precise answers for that yet, and
         | won't until after they see what real-world usage looks like.
         | 
         | They built out a system that's ready to scale to deliver
         | features that may not work on available hardware, but they're
         | also incentivized to minimize actual reliance on that cloud
         | stuff as it incurs per-use costs that local runs don't.
        
           | GaggiX wrote:
           | Yeah, this is probably right. If it works well enough during
           | real-world usage it will use the on-device model; if not,
           | there is the bigger one on the servers. There is also
           | GPT-4o, so they have 3 different models to use depending on
           | the task.
        
       | kmeisthax wrote:
       | > We train our foundation models on licensed data, including data
       | selected to enhance specific features, as well as publicly
       | available data collected by our web-crawler, AppleBot. Web
       | publishers have the option to opt out of the use of their web
       | content for Apple Intelligence training with a data usage
       | control.
       | 
       | And, of course, nobody knew to opt out by blocking Applebot-
       | Extended until after the announcement, by which point they'd
       | already pirated shittons of data.
       | 
       | In completely unrelated news, I just trained a new OS
       | development AI on every OS Apple has ever written. Don't worry,
       | there's an opt-out; Apple just needed to know to put these magic
       | words in their installer image years ago. I'm sure Apple legal
       | will be OK with this.
        
         | mdhb wrote:
         | So it's built on stolen data, essentially.
        
           | bigyikes wrote:
           | Does that imply I just stole your comment by reading it?
           | 
           | No snark intended; I'm seriously asking. If the answer is
           | "no" then where do you draw the line?
        
             | mdhb wrote:
             | I don't actually think this is complicated. Reading a
             | comment is not the same thing as scraping the internet,
             | and you obviously know that.
             | 
             | A few factors that come to mind would be:
             | 
             | - scale
             | 
             | - informed consent, of which there was none in this case
             | 
             | - how you are going to use that data. For example, using
             | everybody else's work so the world's richest company can
             | make more money from it while giving back nothing in
             | return is a bullshit move.
        
               | cwp wrote:
               | Reading a comment is exactly the same thing as scraping
               | the internet, you just stop sooner.
        
               | llamaimperative wrote:
               | I think it's even simpler than that: incentives. The
               | entire premise of copyright law (and all IP law) is to
               | protect the incentive to create new stuff, which is
               | often a very risky and highly time- or capital-intensive
               | endeavor.
               | 
               | So here's the question:
               | 
               | Does my reading your comment destroy the incentive for
               | you to post it? No. In fact, it is the only thing that
               | _produces_ the incentive for you to post it. People post
               | here when they want that thing to be read by someone
               | else.
               | 
               | Does a model sucking up all the artistic output of the
               | last 400 years and using that to produce an image
               | generator model destroy the incentive of producing and
               | sharing said artistic output? _Yes._
               | 
               | Of course you have plenty of people who aim to benefit
               | from this incentive-destruction claiming it does no
               | such thing, but I personally tend to put more credence
               | in the words of people who have historically been
               | incentivized by said incentives (i.e. artists) who
               | generally seem to perceive this as _destructive_ to their
               | desire to create and share their work.
        
             | cush wrote:
             | Reading, no. Selling derivative works based on it, yes.
        
               | cwp wrote:
               | If I read your comment, then write a reply, is it a
               | derivative work?
        
             | renewiltord wrote:
             | Data gets either stolen or freed depending on whether the
             | guy who copied it is someone you dislike or like.
             | Personally, I think that Apple is giving the data more
             | exposure which, as I've been informed many times here, is
             | much more valuable than paying for the data.
        
           | threeseed wrote:
           | Web scraping is legal.
           | 
           | And if you run a website and want to opt out, simply add a
           | robots.txt.
           | 
           | That's been the standard way of preventing bots for 30
           | years.
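           | 
           | For example, blocking Apple's crawler outright takes two
           | lines of standard robots.txt syntax:
           | 
           |   User-agent: Applebot
           |   Disallow: /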
        
             | mdhb wrote:
             | How are people supposed to block it when they stole all
             | the data first, and only then decided to tell anyone what
             | they were planning to do with it?
        
         | bigyikes wrote:
         | > just trained a new OS development AI on every OS Apple has
         | ever written.
         | 
         | ...is there publicly visible source code for every OS Apple has
         | ever written?
        
           | tensor wrote:
           | Partially:
           | 
           | https://github.com/apple-oss-distributions/distribution-
           | macO...
           | 
           | https://github.com/apple-oss-distributions/distribution-iOS
           | 
           | I'm not sure how it all fits together, but people have even
           | made an open-source distribution of the base of Darwin, the
           | underlying OS:
           | 
           | https://github.com/PureDarwin/PureDarwin
        
         | multimoon wrote:
         | Apple just did more than literally anyone else to date to
         | make this a privacy-focused feature rather than just a data
         | mine, and still people complain.
         | 
         | Public content on the internet is public content on the
         | internet - I thought we had all agreed years ago that if you
         | don't want your content copied, you shouldn't make it freely
         | available and unlicensed on the internet.
        
           | advael wrote:
           | No, they _said_ they did. Huge difference.
        
             | threeseed wrote:
             | It was mentioned in the keynote that they allow researchers
             | to audit their claims.
        
               | advael wrote:
               | And as soon as independent sources support this claim it
               | will be more than a claim. I actually am impressed by the
               | link I missed and was provided elsewhere in this thread,
               | and I hope to also be impressed when this claim is
               | actually realized and we have more details about it
        
           | AshamedCaptain wrote:
           | What? What did they do? It's literally yet another
           | inscrutable online service with terms of use that boil down
           | to "trust us, we do good", plus the half-baked promise that
           | some of the data may not leave your device because, sure,
           | we have some vector processing hardware on it (... which
           | hardware announced this year doesn't do that?).
           | 
           | Frankly, I tried a Samsung device, which I would have
           | assumed is the worst offender here, and the promises are
           | exactly the same. They show you two prompts, one for
           | locally processed services (e.g. translation), and one when
           | data is about to leave your device, and you can accept or
           | reject them separately. But both of them are basically
           | unverifiable promises and closed-source services.
        
           | layer8 wrote:
           | Public content is still subject to copyright, and I doubt
           | that AppleBot only scrapes content carrying a suitable
           | license.
        
           | kmeisthax wrote:
           | Oh no, don't get me wrong. I _like_ the privacy features;
           | it's already way better than OpenAI's "we make it
           | proprietary so we can spy on you" approach.
           | 
           | What I don't like is the hypocrisy that basically every AI
           | company has engaged in, where copying my shit is OK but
           | copying theirs is not. The Internet is _not_ public domain,
           | as much as Eric Bauman and every AI research team would say
           | otherwise. Even if you don't like copyright[0], you should
           | care about copyleft, because denying valuable creative work
           | to the proprietary world is how you get them to concede. If
           | you can shove that work into an AI and get the benefits of
           | that knowledge without the licensing requirement, then
           | copyleft is useless as a tactic to get the proprietary world
           | to bend the knee.
           | 
           | [0] And I don't.
           | 
           | My opinion is that individual copyright ownership is a bad
           | deal for most artists and we need collective negotiation
           | instead. Even the most copyright-respecting, 'ethical' AI
           | boils down to Adobe dropping a EULA roofie in the Adobe Stock
           | Contributor Agreement that lets them pay you pennies.
        
         | zer00eyz wrote:
         | > publicly available data collected
         | 
         | "Data" implies factual information, and you cannot copyright
         | factual information.
         | 
         | The fact that I use the word "appalling" to describe the
         | practice of doing this results in some vector relationship
         | between the words. That's the data, the fact, not the writing
         | itself.
         | 
         | There are going to be a bunch of interesting court cases
         | where the courts will have to backtrack on copyrighting
         | facts. Or we're going to have to get some really odd legal
         | interpretations of how LLMs work (and buy into them). Or
         | we're going to have to change the law (giving everyone else a
         | first-mover advantage).
         | 
         | Based on how things have been going, I am betting on the last
         | one, because it pulls up the ladder.
        
           | cush wrote:
           | > Data, implies factual information. You can not copyright
           | factual information
           | 
           | Where on Earth did you get that from?
        
         | threeseed wrote:
         | > And, of course, nobody has known to opt-out by blocking
         | AppleBot-Extended until after the announcement where they've
         | already pirated shittons of data
         | 
         | This is wrong. The AppleBot identifier hasn't changed:
         | https://support.apple.com/en-us/119829
         | 
         | There is no AppleBot-Extended. And if you blocked it in the
         | past, it remains blocked.
        
           | fotta wrote:
           | From your own link:
           | 
           | > Controlling data usage
           | 
           | > In addition to following all robots.txt rules and
           | directives, Apple has a secondary user agent, Applebot-
           | Extended, that gives web publishers additional controls over
           | how their website content can be used by Apple.
           | 
           | > With Applebot-Extended, web publishers can choose to opt
           | out of their website content being used to train Apple's
           | foundation models powering generative AI features across
           | Apple products, including Apple Intelligence, Services, and
           | Developer Tools.
        
             | threeseed wrote:
             | Might want to actually read it:
             | 
             | Applebot-Extended does not crawl webpages.
             | 
             | They gave this as an _additional_ control to allow
             | crawling for search while blocking use in models.
        
               | fotta wrote:
               | > There is no AppleBot-Extended. And if you blocked it in
               | the past it remains blocked.
               | 
               | You said there is no Applebot-Extended. The link says
               | otherwise.
        
             | ziml77 wrote:
             | But it also says that Applebot-Extended doesn't crawl
             | webpages; instead, this marker is only used to determine
             | what can be done with the pages that were visited by
             | Applebot.
             | 
             | Not that I like an opt-out system, but based on the wording
             | of the docs it is true that if you blocked Applebot then
             | blocking Applebot-Extended isn't necessary.
        
               | fotta wrote:
               | Yeah, that is true, but I suspect that most publishers
               | who want their content to appear in search but not be
               | used for model training will not have blocked Applebot
               | to date.
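               | 
               | Per Apple's documentation, that split would look
               | roughly like this in robots.txt (a sketch: Applebot
               | does the crawling; Applebot-Extended only governs
               | whether fetched content may be used for training):
               | 
               |   User-agent: Applebot
               |   Allow: /
               | 
               |   User-agent: Applebot-Extended
               |   Disallow: /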
        
         | scosman wrote:
         | There will be further versions of this model. Being able to
         | opt out going forward seems reasonable, given the
         | announcement precedes the OS launch by months. Not sure if
         | they will retrain before launch, but it seems feasible given
         | the model's size (3B params).
        
       | ddxv wrote:
       | Will these smaller on-device models lead to a crash in GPU
       | prices?
        
         | htrp wrote:
         | X to doubt.
        
         | sooheon wrote:
         | Prices fall when supply outpaces demand -- this is adding more
         | demand.
        
           | wmf wrote:
           | This isn't adding GPU demand.
        
       | htrp wrote:
       | > Our foundation models are fine-tuned for users' everyday
       | activities, and can dynamically specialize themselves on-the-fly
       | for the task at hand. We utilize adapters, small neural network
       | modules that can be plugged into various layers of the pre-
       | trained model, to fine-tune our models for specific tasks. For
       | our models we adapt the attention matrices, the attention
       | projection matrix, and the fully connected layers in the point-
       | wise feedforward networks for a suitable set of the decoding
       | layers of the transformer architecture.
       | 
       | >We represent the values of the adapter parameters using 16 bits,
       | and for the ~3 billion parameter on-device model, the parameters
       | for a rank 16 adapter typically require 10s of megabytes. The
       | adapter models can be dynamically loaded, temporarily cached in
       | memory, and swapped -- giving our foundation model the ability to
       | specialize itself on the fly for the task at hand while
       | efficiently managing memory and guaranteeing the operating
       | system's responsiveness.
       | 
       | This kind of sounds like LoRAs...
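       | 
       | A minimal sketch of the adapter idea in PyTorch (illustrative;
       | Apple hasn't published implementation details beyond the
       | description above):
       | 
       |   import torch
       |   import torch.nn as nn
       | 
       |   class LoRALinear(nn.Module):
       |       """Frozen base layer plus a low-rank task adapter."""
       | 
       |       def __init__(self, base, rank=16):
       |           super().__init__()
       |           self.base = base
       |           for p in self.base.parameters():
       |               p.requires_grad = False  # base stays fixed
       |           d_out, d_in = base.weight.shape
       |           # 16-bit factors add 2 * d * rank params per
       |           # adapted matrix; dozens of those across a ~3B
       |           # model is how you land at "10s of megabytes"
       |           self.A = nn.Parameter(
       |               0.01 * torch.randn(rank, d_in).half())
       |           self.B = nn.Parameter(
       |               torch.zeros(d_out, rank).half())
       | 
       |       def forward(self, x):
       |           h = x.to(self.A.dtype)
       |           # the update starts at zero since B is zero-init
       |           return self.base(x) + (h @ self.A.T) @ self.B.T
       | 
       | Swapping a task adapter is then just loading a different small
       | A/B state dict while the base model stays resident in memory.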
        
         | cube2222 wrote:
         | The article explicitly states they're LoRAs.
        
         | alephxyz wrote:
         | The A in LoRA stands for adaptation (low-rank adaptation).
        
       | advael wrote:
       | I'm disappointed that they make the fundamental claim that
       | their cloud service is private with respect to user inputs
       | passed through it and don't talk even a little bit about how
       | that's accomplished. Even just an explanation of what
       | guarantees they make, and how, would be much more interesting
       | than explanations of their flavor of RLHF or whatever nonsense.
       | I read the GAZELLE* paper when it came out and wondered what it
       | would look like if a large-scale organization tried to deploy
       | something like it.
       | 
       | Of course, Apple will never give adequate details about
       | security mechanisms or privacy guarantees. They are in the
       | business of selling you security as something that must be
       | handled by them and them alone, and the idea that knowing how
       | they do it would somehow make it less secure (this is the
       | opposite of how security actually works, but Apple loves
       | doublespeak, and 1984 allusions have been their brand since at
       | least 1984). I view that, like any claim by a tech company that
       | they are keeping your data secure in any context, as security
       | theater. Vague promises are no promises at all. Put up or shut
       | up.
       | 
       | * https://arxiv.org/pdf/1801.05507
        
         | killingtime74 wrote:
         | Don't they do it in this linked article?
         | https://security.apple.com/blog/private-cloud-compute/
        
           | advael wrote:
           | Whoa, good catch! Maybe they're doing better about at least
           | being concrete about it, though I still have to side-eye
           | "Users control their devices" (even with root on MacBooks I
           | don't have access to everything running on them). However,
           | the section that promises to open-source the cloud software
           | is impressive and, if true, gives them more credibility
           | than I assumed. I would still look out for places where
           | devices they do control could pass them keys in still-
           | proprietary parts of the stack: even if we can verify the
           | cloud container OS in its entirety, a backchannel for keys
           | that a hypervisor could use would still be a backdoor. But
           | they are at least seemingly making a real effort here.
        
       | epipolar wrote:
       | It would be interesting to see how these models impact battery
       | life. I've tried a few local LLMs on my iPhone 15 Pro via the
       | PrivateLLM app, and the battery charge plummets after just a
       | few minutes of usage.
        
         | urbandw311er wrote:
         | Likely they'll be able to take advantage of the hardware
         | Neural Engine and be far more power-efficient. Apple has
         | demonstrated this is something it takes pretty seriously.
        
           | brcmthrowaway wrote:
           | So iOS LLM apps don't use the Neural Engine? Lol
        
             | renewiltord wrote:
             | Probably not. The CoreML LLM stuff only works on Macs
             | AFAIK. Probably the phone app uses the GPU.
        
         | bradly wrote:
         | During my time at Apple, the bigger issue with personalized,
         | on-device models was the file size. At the time, each model
         | was a significant amount of data to push to a device, and
         | with lots of teams wanting an on-device model and the desire
         | to update them regularly, it was definitely a big discussion.
        
       | cube2222 wrote:
       | Halfway down, the article has some great charts with
       | comparisons to other relevant models, like Mistral-7B for the
       | on-device model, and both GPT-3.5 and GPT-4 for the server-side
       | models.
       | 
       | They include data about the ratio of outputs human graders
       | preferred (for server-side it's better than 3.5, worse than 4).
       | 
       | BUT, the interesting chart to me is "Human Evaluation of Output
       | Harmfulness", which is much, much "better" than for the other
       | models. Both on-device and server-side.
       | 
       | I wonder if that's part of wanting to have GPT as the "level
       | 3": making their own models much more cautious, and using
       | OpenAI's models in a way that makes it clear "it was ChatGPT
       | that said this, not us".
       | 
       | Instruction-following accuracy seems to be really good as well.
        
       | TheRoque wrote:
       | Why isn't there a comparison with Llama 3 8B in the
       | "benchmarks"?
        
       | ra7 wrote:
       | > _Our foundation models are trained on Apple 's AXLearn
       | framework, an open-source project we released in 2023. It builds
       | on top of JAX and XLA, and allows us to train the models with
       | high efficiency and scalability on various training hardware and
       | cloud platforms, including TPUs and both cloud and on-premise
       | GPUs._
       | 
       | Interesting that they're using TPUs for training, in addition to
       | GPUs. Is it both a technical decision (JAX and XLA) and a hedge
       | against Nvidia?
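       | 
       | The technical side, at least, is concrete: jitted JAX code
       | lowers through XLA to whatever backend is present. A toy sketch
       | (plain JAX, not AXLearn):
       | 
       |   import jax
       |   import jax.numpy as jnp
       | 
       |   def loss(w, x, y):
       |       return jnp.mean((x @ w - y) ** 2)
       | 
       |   # jax.jit compiles through XLA, so the same step runs
       |   # on CPU, GPU, or TPU - whatever jax.devices() reports
       |   grad_step = jax.jit(jax.grad(loss))
       | 
       |   print(jax.devices())  # e.g. [CpuDevice(id=0)]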
        
       | Isuckatcode wrote:
       | >By fine-tuning only the adapter layers, the original parameters
       | of the base pre-trained model remain unchanged, preserving the
       | general knowledge of the model while tailoring the adapter layers
       | to support specific tasks.
       | 
       | From my (an ML noob's) understanding of this, does this mean
       | that the final matrix is regularly fine-tuned instead of fine-
       | tuning the main model? Is this similar to how ChatGPT now
       | remembers memory[1]?
       | 
       | [1] https://help.openai.com/en/articles/8590148-memory-faq
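       | 
       | My mental model of the quote, as a PyTorch sketch (toy sizes;
       | illustrative only, not Apple's code): freeze the base weights
       | so only the adapter gets gradient updates.
       | 
       |   import torch
       |   import torch.nn as nn
       | 
       |   base = nn.Linear(64, 64)  # stands in for a pre-trained layer
       |   adapter = nn.Linear(64, 64, bias=False)  # task-specific part
       |   for p in base.parameters():
       |       p.requires_grad = False  # general knowledge preserved
       | 
       |   # the optimizer only ever sees the adapter's parameters
       |   opt = torch.optim.AdamW(adapter.parameters(), lr=1e-4)
       |   x = torch.randn(8, 64)
       |   (base(x) + adapter(x)).sum().backward()
       |   opt.step()  # base unchanged, adapter specialized
       | 
       | (Whereas ChatGPT's memory, as I understand it, stores facts for
       | retrieval at inference time rather than changing any weights.)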
        
       ___________________________________________________________________
       (page generated 2024-06-10 23:00 UTC)