[HN Gopher] We hacked Gemini's Python sandbox and leaked its sou...
       ___________________________________________________________________
        
       We hacked Gemini's Python sandbox and leaked its source code (at
       least some)
        
       Author : topsycatt
       Score  : 626 points
       Date   : 2025-03-28 18:12 UTC (1 days ago)
        
 (HTM) web link (www.landh.tech)
 (TXT) w3m dump (www.landh.tech)
        
       | sneak wrote:
       | > _However, the build pipeline for compiling the sandbox binary
       | included an automated step that adds security proto files to a
       | binary whenever it detects that the binary might need them to
       | enforce internal rules. In this particular case, that step wasn't
       | necessary, resulting in the unintended inclusion of highly
       | confidential internal protos in the wild !_
       | 
       | Protobufs aren't really these super secret hyper-proprietary
       | things they seem to make them out to be in this breathless
       | article.
        
         | daeken wrote:
         | Yeah, this is honestly super interesting as a journey, but not
         | as a destination. The framing takes away from how cool the work
         | really is.
        
         | ratorx wrote:
         | Yup, there's no reason to believe that the proto files (which
         | are definitions rather than data) are any more confidential
         | than the Gemini source code itself.
        
         | film42 wrote:
         | No, but having the names to the fields, directly from Google,
         | is very helpful for further understanding what's available from
         | within the sandbox.
        
           | kingforaday wrote:
           | Reminds me of this HN article from a month ago with lots of
           | commentary on whether a database scheme is proprietary.
           | 
           | https://news.ycombinator.com/item?id=43175628
        
             | film42 wrote:
             | Yeah there are some interesting similarities. However, the
             | biggest difference is Google has the right to keep source
             | proprietary, and companies like Unity are allowed to
             | provide source code with a reference only license (still
             | proprietary), but the US has FOIA to help push information
             | into the open. Does a DB schema fall under FOIA scope? I
             | think a better question is, can (or is) a db schema being
             | used to conceal information? Is the law attempting to
             | reinforce this barrier?
             | 
             | In other words, it should not be about the intent of the
             | requester, but the intent of its owner; and in the case of
             | that article, either by bias in narrative, or the fact that
             | it rhymes with events of the past, there is some tomfoolery
             | about.
        
         | ipsum2 wrote:
         | Yes, there's a lot of internal protos from Google that are
         | leaked on the internet. If I recall correctly, it was a hacker
         | News comment that linked to it.
         | 
         | Edit: I don't know why the parent comment was flagged. It is
         | entirely accurate.
        
           | kccqzy wrote:
           | You are probably thinking of the Google search ranking leak.
           | That leak was the leak of the generated documentation from
           | proto files.
        
         | whatevertrevor wrote:
         | The protos in question are related to internal authn/z so it's
         | conceivable that having access to that structure would be
         | valuable information to an attacker.
        
           | rurban wrote:
           | The protos were already available. See above.
           | 
           | A valuable information would be able to run those RPC calls
           | as Principal (their root user)
        
       | topsycatt wrote:
       | That's the system I work on! Please feel free to ask any
       | questions. All opinions are my own and do not represent those of
       | my employer.
        
         | Mindwipe wrote:
         | Does anyone at Google care that you're trying to replace
         | Assistant with this in the next few months and it can't set a
         | timer yet?
         | 
         | (I mean it will tell you it's set a timer but it doesn't talk
         | to the native clock app so nothing ever goes off if you
         | navigate away from the window.)
        
           | hnuser123456 wrote:
           | I doubt the guy working on the code sandbox can do anything
           | about the overall resource allocation towards ensuring all
           | legacy assistant features still work as well as they used to.
           | That being said, I was trying to navigate out of an
           | unexpected construction zone and asked google to navigate me
           | home, and it repeatedly tried to open the map on my watch and
           | lock my phone screen. I had to pull over and use my thumbs to
           | start navigation the old fashioned way.
        
           | arebop wrote:
           | The Assistant can't reliably set timers either, though I
           | guess 80% is considerably better than 0. Still, I think it
           | used to be better back before Google caught a glimpse of a
           | different squirrel to chase.
        
           | 7bit wrote:
           | It can't do shit, especially in some EU countries, where it
           | can do even less shit.
           | 
           | Setting timers reminders, calendar events. Nothing. If they
           | kill the assistant, I'll go Apple, no matter how much I hate
           | it.
        
           | iury-sza wrote:
           | I keep reading people complaining about this but I can't
           | understand why. Gemini can 100% set timers and with much more
           | subtle hints than assistant ever could. It just works. I
           | don't get why people say it can't.
           | 
           | It can also play music or turn on my smart lamps, change
           | their colors etc. I can't remember doing any special
           | configuration for it to do that either.
           | 
           | Pixel 9 pro
        
             | jdiff wrote:
             | I certainly can't get it to reliably play music on my Pixel
             | 8. Mostly it summons YT Music, only occasionally do I get
             | my music player, and sometimes I merely get "I'm an LLM, I
             | can't help you with that."
             | 
             | And you used to be able to say "Find my phone" and it would
             | chime and max screen brightness until found. Tried that
             | with Gemini once, and it went on with very detailed
             | instructions on using Google or Apple's Find My Device
             | website (depending on what type of phone I owned), maybe
             | calling it from another device if it's not silenced, or
             | perhaps accepting that my device was lost or stolen if none
             | of the above worked. Did find it during that lengthy
             | attempt at being helpful though.
             | 
             | Another fun example, weather. When Gemini's in control,
             | "What's the weather like tonight?" gets a short ramble
             | about how weather depends on climate, with some examples of
             | what the weather might be like broadly in Canada, Japan, or
             | the United States at night.
             | 
             | Unlike Assistant where you could learn to adapt to its
             | unique phrasing preferences, you just flat out can never
             | reliably predict what Gemini's going to do. In exchange for
             | higher peak performance, the floor dropped out the bottom.
        
           | dgunay wrote:
           | I dislike Google's (mis)management of Assistant as much as
           | the next guy, but this just has not been my experience. I can
           | tell Gemini on my phone to set timers and it works just fine.
        
           | ChadNauseam wrote:
           | I have a rooted pixel with a flashed custom android ROM,
           | which should be a nightmare scenario for gemini, and it can
           | set timers just fine (and the timers show up in the native
           | clock app)
        
           | nosrepa wrote:
           | I just want the assistant voice. I hate the Gemini ones.
        
             | whatevertrevor wrote:
             | I'm with you on that. I prefer a human trying to sound like
             | a robot instead of a robot trying to sound human.
        
         | hnuser123456 wrote:
         | Is the interactive python sandbox incompatible with thinking
         | models? It seems like I can only get the interactive sandbox by
         | using 2.0 flash, not 2.0 flash thinking or 2.5 pro.
        
           | topsycatt wrote:
           | That's a good question! It's not incompatible, it's just a
           | matter of getting the flow right. I can't comment too much on
           | that process but I'm excited for the possibilities there.
        
             | hnuser123456 wrote:
             | Oh, I see Gemini can run code as part of the thinking
             | process. I suppose the sandbox that happens in was the
             | target of this research, while code editing in Gemini
             | Canvas just has a button to export to Colab for running.
             | The screenshots in the research show a "run" button for
             | generated code in the chat, but I'm not seeing that exact
             | interface.
             | 
             | In any case, I share your excitement.
        
               | topsycatt wrote:
               | Canvas actually has a mix of this sandbox (with a
               | different container) and fully client-side.
               | 
               | The "run" option for generated code was removed due to
               | underutilization, but the sandbox is still used for
               | things like the data analysis workflow and running
               | extensions amongst other things. It's really just a
               | general purpose sandbox for running untrusted code
               | server-side.
        
               | hnuser123456 wrote:
               | Is there a way for you to campaign to return the run
               | button for common queries for code examples? It's
               | probably the most powerful educational tool ever
               | invented, to be able to see how the human language
               | description turns into strange computer code which turns
               | into resulting output. If you guys can get it secure
               | enough, it's a killer feature.
        
               | sans_souse wrote:
               | +1 vote here
        
               | sans_souse wrote:
               | Talk about indirect gas-lighting, I can never find info
               | on deprecated functions like this one, to the point I
               | convinced myself I imagined it. I guess now I know who to
               | ask
        
             | TechDebtDevin wrote:
             | Have you by chance read this paper: https://agent-
             | gen.github.io/
        
         | seydor wrote:
         | you re the hacker or the google?
        
           | topsycatt wrote:
           | The google
        
             | onemoresoop wrote:
             | Question: how does it feel inside google in terms of losing
             | their lunch to OpenAi? Losing here is very loose, I don't
             | think OpenAI won yet but seems to have made a leap ahead of
             | google in terms of marker share and we know google was
             | sitting on tons of breakthroughs and research. Any
             | panicking or internal discontent at google's product
             | policies? No need to answer if you're uncomforable that
             | your employer may hold you responsible for what you write
             | here.
        
               | MyelinatedT wrote:
               | From my perspective (talking very generally about the
               | mood and environment here), it's important to remember
               | that Google is a very, very big company with many
               | products and activities outside of AI.
               | 
               | As far as I can see, there is a mix of frustration at the
               | slowness of launching, optimism/excitement that there are
               | some really awesome things cooking, and indifference from
               | a lot of people who think AI/LLMs as a product category
               | are quite overhyped.
        
               | dataflow wrote:
               | > Google is a very, very big company with many products
               | and activities outside of AI.
               | 
               | Profit is what matters though, not number of products.
               | The consumer perception is that Search rakes in the
               | largest profits, so if they lose that, it doesn't matter
               | what else is there. Thoughts?
        
               | fennecbutt wrote:
               | Idk, I used to want to work for Google but I'm not so
               | sure anymore. They built an awesome landscaper next to my
               | office in London.
               | 
               | But the UX and general functionality of their apps and
               | services has been in steep decline for a long time now,
               | imo. There are thousands of examples of the most basic
               | and obvious mistakes and completely uninspired, sloppy
               | software and service design.
        
               | MoonGhost wrote:
               | > obvious mistakes and completely uninspired, sloppy
               | software and service design.
               | 
               | That's something you can work on to improve.
               | 
               | A few years back I wanted to work for FAANG big company.
               | Now I don't after working for smaller but with 'big'
               | management. There are rats races, dirty tricks. And
               | engineers don't have much control on what and how they
               | are doing. Many things decided by incompetent managers.
               | Architect position is actually a manager's title, no
               | brain or skills required.
               | 
               | Today I rather go to a small company or startup where the
               | results are visible and appreciated.
        
               | nikcub wrote:
               | Nobody serious believes this. OpenAI may be eating up
               | consumer mindshare - but Google are providing some of the
               | most capable, best, cheapest and fastest models for dev
               | integration.
        
               | bitexploder wrote:
               | As the hype dies down, Goliath shakes off the
               | competition. AI models are now a game of inches and those
               | inches cost billions every inch, but it matters in the
               | long run.
        
               | lanyard-textile wrote:
               | I'm honestly shocked to hear anyone defend gemini,
               | respectfully :)
               | 
               | What casts it as most capable?
        
               | mediaman wrote:
               | This is an unusual opinion in industry, although common
               | with consumers.
               | 
               | Currently, Google has the most cost effective model
               | (Flash 2) for tons of corporate work (OCR, classifiers,
               | etc).
               | 
               | They just announced likely the most capable model
               | currently in the market with Gemini 2.5.
               | 
               | Their small open source models (Gemma 3) are very good.
               | 
               | It is true that they've struggled to execute on product,
               | but the actual technology is very good and getting
               | substantial adoption in industry. Personally I've moved
               | quite a few workloads to Google from OpenAI and
               | Anthropic.
               | 
               | My main complaint is that they often release impressive
               | models, but gimp them in experimental mode for too long,
               | without fully releasing them (2.5 is currently in this
               | category).
        
               | snoman wrote:
               | How does Flash compare to Nova Lite? The latter looks
               | less expensive. I haven't really used either (used Nova
               | Pro and it was good)
        
               | luke-stanley wrote:
               | They just released a SOTA model (Gemini 2.5 Pro) that
               | beats all models on most benchmarks, it's a great
               | comeback from the model side but IMO they are less strong
               | on the product side, they pioneered the sticky ecosystem
               | of web app products model, though kinda like the
               | Microsoft Office suite that (originally) had to be
               | downloaded, ironically building on XML HTTP request
               | support the IE5 introduced for Outlook.
        
             | larodi wrote:
             | "im the google" is definitely a top 3 chart synthpop song
             | by ladytron .)
        
             | sans_souse wrote:
             | Can a Mod please change thread title to I'm The Google.
             | AMA.
        
         | fragmede wrote:
         | Do you think "hacked Gemini and leaked its source code" is an
         | accurate representation of what happened here?
        
           | topsycatt wrote:
           | I'm on the Google side of the equation. I think the title is
           | a bit sensationalized, but that's the author's prerogative.
        
             | devdudect wrote:
             | When are we going to be able to run sandboxed php code?
        
               | topsycatt wrote:
               | We could, it's just not high up on the priority list. Any
               | particular reason you want php?
        
               | alienbaby wrote:
               | Possibly they are mildly insane
        
               | egeozcan wrote:
               | Next step is gemini hosting Personal Home Pages.
        
               | 0xbadcafebee wrote:
               | >75% of the web's server-side code is php. most of that
               | is WordPress, but lots of people customize it, and being
               | able to write your own themes, plugins, etc is a big deal
        
               | ipaddr wrote:
               | Why would you want to run anything else?
        
               | simonw wrote:
               | You can run PHP in ChatGPT Code Interpreter today if you
               | upload the right binary (also Deno and Lua and more):
               | https://til.simonwillison.net/llms/code-interpreter-
               | expansio...
        
             | koakuma-chan wrote:
             | > but that's the author's prerogative
             | 
             | You submitted this.
        
               | topsycatt wrote:
               | I submitted this HN link with a title that exactly
               | matches the one on the article, but I didn't write the
               | title on the article. AFAIK HN posts should match the
               | title of the article they link to.
        
               | koakuma-chan wrote:
               | > AFAIK HN posts should match the title of the article
               | they link to.
               | 
               | I am not aware of such rule's existence.
               | 
               | Also "should" not "must."
               | 
               | To be clear: I don't have a problem with you submitting
               | this, but the title appears to be completely false.
        
               | dang wrote:
               | Actually the rule is designed to let you correct
               | misleading titles:
               | 
               | " _Please use the original title, unless it is misleading
               | or linkbait; don 't editorialize._" -
               | https://news.ycombinator.com/newsguidelines.html
               | 
               | I've done that now
               | (https://news.ycombinator.com/item?id=43509103).
               | 
               | I appreciate your scruples though! Because even though
               | you would have been on the right side of HN's rules to
               | correct a misleading (and/or linkbait) title, the fact
               | that you work for Google would have opened you to the
               | usual gotcha attacks about conflict of interest. This way
               | we avoided all of that, and it's still a good submission
               | and thread!
        
               | topsycatt wrote:
               | Thank you very much dang!
        
               | bitexploder wrote:
               | Dang, you are cool. :)
        
               | gorlilla wrote:
               | Can you run the country too?
        
               | wil421 wrote:
               | Even better, OP shared something OP didn't write but
               | thought it was interesting.
        
               | marcellus23 wrote:
               | From the HN guidelines:
               | 
               | > Otherwise please use the original title, unless it is
               | misleading or linkbait; don't editorialize.
               | 
               | Arguably this is misleading or clickbait, but safer to
               | err on the side of using the original title.
        
         | enoughalready wrote:
         | Have you contemplated running the python code in a virtual
         | environment in the browser?
        
         | KennyBlanken wrote:
         | Can you get someone to fix the CSS crap on the website? When I
         | have it open it uses 40-50% of my GPU (normally ~5% in most
         | usage)...and when I try to scroll, the scrolling is jerky mess?
        
         | ryao wrote:
         | I imagine you need to make and destroy sandboxed environments
         | quite often. How fast does your code create a sandboxed
         | environment?
         | 
         | Do you make the environments on demand or do you make them
         | preemptively so that one is ready to go the moment that it is
         | needed?
         | 
         | If you make them on demand, have you tested ZFS snapshots to
         | see if it can be done even faster using zfs clone?
        
           | luke-stanley wrote:
           | I use ZFS, but isn't the situation the sandbox is in totally
           | different? Why would it be optimal?
        
             | RunningDroid wrote:
             | I believe they were referring to the use of ZFS snapshots
             | for a Copy-on-Write type setup
        
             | ryao wrote:
             | If you are making sandboxes, you need to put the files in
             | place each time. With ZFS clones, you can keep referencing
             | the same files repeatedly, so the amount of changes to
             | memory needed to create an environment are minimized. Let's
             | say the sandbox is 1GB and each clone operation does less
             | than 1MB of memory writes. Then you have a >1000x reduction
             | in writing needed to make the environment.
             | 
             | Furthermore, ZFS ARC should treat each read operation of
             | the same files as reading the same thing, while a sandbox
             | made the traditional way would treat the files as unique,
             | since they would be full copies of each other rather than
             | references. ZFS on the other hand should only need to keep
             | a single copy of the files cached for all environments.
             | This reduces memory requirements dramatically.
             | Unfortunately, the driver has double caching on mmap()'ed
             | reads, but the duplication will only be on the actual files
             | accessed and the copies will be from memory rather than
             | disk. A modified driver (e.g. OSv style) would be able to
             | eliminate the double caching for mmap'ed reads, but that is
             | a future enhancement.
             | 
             | In any case, ZFS clones should have clear advantages over
             | the more obvious way of extracting a tarball every time you
             | need to make a new sandbox for a Python execution
             | environment.
        
               | o11c wrote:
               | It's worth noting that if you go down a layer, LVM
               | snapshots are filesystem-independent.
        
               | ryao wrote:
               | You need to preallocate space on LVM2 for storing changes
               | and if it fills, bad things happen. You have write
               | amplification of 4MB per write by default on LVM2, while
               | ZFS just writes what is needed, since LVM2 isn't aware of
               | the filesystem structures. All of the advantages WRT
               | cache are gone if you use LVM2 too. Correct me if I am
               | wrong.
               | 
               | That said, if you really want to use block devices, you
               | could use zvols to get something similar to LVM2 out of
               | ZFS, but it is not as good as using snapshots on ZFS'
               | filesystems. The write amplification would be lower by
               | default (8KB versus 4MB). The page cache would still
               | duplicate data, but the buffer cache duplication should
               | be bypassed if I recall correctly.
        
           | blixt wrote:
           | Seconding this. Also curious if this is done with
           | microkernels (I put Unikraft high on the list of tech I'd use
           | for this kind of problem, or possibly the still-in-beta
           | CodeSandbox SDK - and maybe E2B or Fly but didn't have as
           | good experiences with those).
        
           | dullcrisp wrote:
           | What's ZFS? That doesn't sound like a Google internal tool
           | I've ever heard of.
        
             | 2OEH8eoCRo0 wrote:
             | Oh boy. Get ready for the zealots
        
             | x-complexity wrote:
             | https://en.wikipedia.org/wiki/ZFS
             | 
             | It's a filesystem, to put it simply.
        
         | wunderwuzzi23 wrote:
         | That's cool. I did something similar in the early days with
         | Google Bard when data visualization was added, which I believe
         | was when the ability to run code got introduced.
         | 
         | One question I always had was what the user "grte" stands
         | for...
         | 
         | Btw. here the tricks I used back then to scrape the file
         | system:
         | 
         | https://embracethered.com/blog/posts/2024/exploring-google-b...
        
           | jemfinch wrote:
           | grte is probably "google runtime environment", I would
           | imagine.
        
           | flawn wrote:
           | It says in the article - Google Runtime Environment
        
           | waych wrote:
           | The "runtime" is a google internal distribution of libc +
           | binutils that is used for linking binaries within the
           | monolithic repo, "google3".
           | 
           | This decoupling of system libraries from the OS itself is
           | necessary because it otherwise becomes unmanageable to ensure
           | "google3 binaries" remain runnable on both workstations and
           | production servers. Workstations and servers each have their
           | own Linux distributions, and each also needs to change over
           | time.
        
             | saagarjha wrote:
             | Of course, this meant that some tools got stuck on some old
             | glibc from like 2007.
        
         | jwlake wrote:
         | Is there any reason it's not documented?
        
         | ed_elliott_asc wrote:
         | This is why hacker news is so cool
        
       | fpgaminer wrote:
       | Awww, I was looking forward to seeing some of the leak ;) Oh
       | well. Nice find and breakdown!
       | 
       | Somewhat relatedly, it occurred to me recently just how important
       | issues like prompt injection, etc are for LLMs. I've always
       | brushed them off as unimportant to _me_ since I'm most interested
       | in local LLMs. Who cares if a local LLM is weak to prompt
       | injection or other shenanigans? It's my AI to do with as I
       | please. If anything I want them to be, since it makes it easier
       | to jailbreak them.
       | 
       | Then Operator and Deep Research came out and it finally made
       | sense to me. When we finally have our own AI Agents running
       | locally doing jobs for us, they're going to encounter random
       | internet content. And the AI Agent obviously needs to read that
       | content, or view the images. And if it's doing that, then it's
       | vulnerable to prompt injection by third party.
       | 
       | Which, yeah, duh, stupid me. But ... is also a really fascinating
       | idea to consider. A future where people have personal AIs, and
       | those AIs can get hacked by reading the wrong thing from the
       | wrong backalley of the internet, and suddenly they are taken over
       | by a mind virus of sorts. What a wild future.
        
         | 20after4 wrote:
         | > reading the wrong thing from the wrong backalley of the
         | internet, and suddenly they are taken over by a mind virus of
         | sorts. What a wild future.
         | 
         | This already happens to people on the internet.
        
           | tcoff91 wrote:
           | Yeah, the way some people lose it from the internet reminds
           | me of Snow Crash.
        
       | paxys wrote:
       | Funny enough while "We hacked Google's AI" is going to get the
       | clicks, in reality they hacked the one part of Gemini that was
       | NOT the LLM (a sandbox environment meant to run untrusted user-
       | provided code).
       | 
       | And "leaked its source code" is straight up click bait.
        
         | HenryBemis wrote:
         | Click and cash (for the great trio).
        
         | dang wrote:
         | Ok, we put the sandbox in the title above. Thanks!
         | 
         | (Submitted title was "We hacked Google's A.I Gemini and leaked
         | its source code (at least some part)")
        
           | topsycatt wrote:
           | Thanks!
        
           | infinghxsg wrote:
           | Instead of sandbox can you just make sure people know it was
           | not a meaningful hack?
           | 
           | I mean I "hacked" this site too by those standards.
        
             | dang wrote:
             | What would be a more accurate and neutral wording?
        
               | xnx wrote:
               | We uncovered some internal details of the Gemini Python
               | sandbox
        
         | IshKebab wrote:
         | They didn't even hack it.
        
       | ein0p wrote:
       | They hacked the sandbox, and leaked nothing. The article is
       | entertaining though.
        
         | kccqzy wrote:
         | They leaked one file in the sandbox that contained lots of
         | internal proto files. The security team reviewed everything in
         | the sandbox and thought nothing in it is sensitive and gave the
         | green light; apparently the review didn't catch this in the
         | sandbox.
         | 
         | I guess this is a failing of the security review process, and
         | possibly also how the blaze build system worked so well that
         | people forgot a step existed because it was too automated.
        
           | charcircuit wrote:
           | >that contained lots of internal proto files
           | 
           | So does Google Chrome.
        
             | kccqzy wrote:
             | No it's not the same level of internal. There are internal
             | proto files specific to Chromium and its API endpoints, and
             | then there are internal proto files for google3. The latter
             | can divulge secrets about Google's general server side
             | architecture. The former only divulges secrets about server
             | side components relevant to Chromium.
        
       | simonw wrote:
       | I've been using a similar trick to scrape the visible internal
       | source code of ChatGPT Code Interpreter into a GitHub repository
       | for a while now: https://github.com/simonw/scrape-openai-code-
       | interpreter
       | 
       | It's mostly useful for tracking what Python packages are
       | available (and what versions): https://github.com/simonw/scrape-
       | openai-code-interpreter/blo...
        
         | Zopieux wrote:
         | Meanwhile they could just decide to publish this list in a
         | document somewhere and keep it automatically up to date with
         | their infra.
         | 
         | But not, secrecy for the sake of secrecy.
        
           | aleksiy123 wrote:
           | Tbh I doubt this is secrecy.
           | 
           | More likely just noone has taken the time and effort to do
           | it.
        
           | 12345hn6789 wrote:
           | What would the benefit of doing this be?
        
             | simonw wrote:
             | It's documentation. Makes it much easier for people to know
             | what kind of problems they can solve using Code
             | Interpreter.
             | 
             | It's a bit absurd that the best available documentation for
             | that feature exists in my hacky scraped GitHub repository.
        
         | fudged71 wrote:
         | I just used this package list (and sandbox limitations) to
         | synthesize a taxonomy of capabilities:
         | https://gist.github.com/trbielec/a00a58fa97a232bef8984cc8d01...
        
       | theLiminator wrote:
       | It's actually pretty interesting that this shows that Google is
       | quite secure, I feel like most companies would not fare nearly as
       | well.
        
         | kccqzy wrote:
         | Yes and especially the article mentions "With the help of the
         | Google Security Team" so it's quite collaborative and not
         | exactly black box hacking.
        
       | jll29 wrote:
       | Running the built-in "strings" command to extract a few file
       | names from a binary is hardly hacking/cracking.
       | 
       | Ironically, though, getting the source code of Gemini perhaps
       | wouln't be valuable at all; but if you had found/obtained access
       | to the corpus that the model was pre-trained with, that would
       | have been kind of interesting (many folks have many questions
       | about that...).
        
         | dvt wrote:
         | > but if you had found/obtained access to the corpus that the
         | model was pre-trained with, that would have been kind of
         | interesting
         | 
         | Definitionally, that input gets compressed into the weights.
         | Pretty sure there's a proof somewhere that shows LLM training
         | is basically a one-way (lossy) compression, so there's no way
         | to go back afaik?
        
           | jdiff wrote:
           | Not the original, but a lossy facsimile that's Good Enough
           | for almost anything. And as the short history of LLMs and
           | other nets has shown us, they're often not even all that
           | lossy.
        
       | tgtweak wrote:
       | The definition of hacking is getting pretty loose. This looks
       | like the sandbox is doing exactly what it's supposed to do and
       | nothing sensitive was exfiltrated...
        
       | jeffbee wrote:
       | I guess these guys didn't notice that all of these proto
       | descriptors, and many others, were leaked on github 7 years ago.
       | 
       | https://github.com/ezequielpereira/GAE-RCE/tree/master/proto...
        
       | bluelightning2k wrote:
       | Cool write up. Although it's not exactly a huge vulnerability. I
       | guess it says a lot about how security conscious Google is that
       | they consider this to be significant. (You did mention that you
       | knew the company's specific policy considered this highly
       | confidential so it does count but it feels a little more like
       | "technically considered a vulnerability" rather than clearly
       | one.)
        
       | parliament32 wrote:
       | > resulting in the unintended inclusion of highly confidential
       | internal protos in the wild
       | 
       | I don't think they're all that confidential if they're all on
       | github: https://github.com/ezequielpereira/GAE-
       | RCE/tree/master/proto...
        
         | saagarjha wrote:
         | I mean, those were also disclosed via a vulnerability.
        
           | Brian_K_White wrote:
           | But it still means they aren't guilty of leaking/disclosing
           | them.
           | 
           | It's not a valid point of criticism. The escape did not in
           | fact "result" in the leak of confidential photos. That
           | already happened somewhere else. This only resulted in the
           | republishing of something already public.
           | 
           | Or another way, it's not merely that they were already public
           | elsewhere, the imortant point is that the photos were not
           | given to the ai in confidence, and so re-publishing them did
           | not violate a confidence, any more than say github did.
           | 
           | I'm no ai apologist btw. I say all of these ais are
           | committing mass copyright violation a million times a second
           | all day every day since years ago now.
        
       | qwertox wrote:
       | Super interesting article.
       | 
       | > but those files are internal categories Google uses to classify
       | user data.
       | 
       | I really want to know what kind of classification this is. Could
       | you at least give one example? Like "Has autism" or more like "Is
       | user's phone number"?
        
         | StephenAmar wrote:
         | The latter. Like is it a public ID, an IP, user input, ssn,
         | phone number, lat/long...
         | 
         | Very useful for any scenario where you output the proto, like
         | logs, etc...
        
       | commandersaki wrote:
       | _Their "LLM bugSWAT" events, held in vibrant locales like Las
       | Vegas, are a testament to their commitment to proactive security
       | red teaming._
       | 
       | I don't understand why security conferences are attracted to
       | Vegas. In my opinion its a pretty gross place to conduct any
       | conference.
        
         | numbsafari wrote:
         | You answered your own question.
        
         | zem wrote:
         | relatively cheap event space and hotels. it's hard to find a
         | city to host a large conference.
        
         | hashstring wrote:
         | Real, I feel the exact same way.
        
         | desmosxxx wrote:
         | What don't you understand. Vegas is literally built for
         | conferences.
        
         | scudsworth wrote:
         | reinvent is in vegas
        
         | lmm wrote:
         | Excluding uptight scolds is a feature not a bug. There's a lot
         | of overlap between people who find Vegas objectionable and
         | people who find red teaming objectionable (because why would
         | any decent person know attacking/exploiting techniques).
        
           | commandersaki wrote:
           | The irony is that Vegas takes a dim view of those that take
           | advantage of their gaming venues. The institutions that run
           | it are quite aggressive when it comes to being attacked.
           | 
           | Anyways, security conferences such as BSides run all over the
           | world in various cities where red teaming type activities is
           | embraced. IMO it'd be nice to diversify from Vegas,
           | preferably places with more scenery/greenery like Boulder or
           | something.
        
       | b0ner_t0ner wrote:
       | Very distracting background/design on desktop; had to toggle
       | reader view.
        
       | lqstuart wrote:
       | So by "we hacked Gemini and leaked its source code" you really
       | mean "we played with Gemini with the help of Google's security
       | team and didn't leak anything"
        
       | Cymatickot wrote:
       | Probably best text I've seen in AI train ride recently:
       | 
       | """"" As companies rush to deploy AI assistants, classifiers, and
       | a myriad of other LLM-powered tools, a critical question remains:
       | are we building securely ? As we highlighted last year, the rapid
       | adoption sometimes feels like we forgot the fundamental security
       | principles, opening the door to novel and familiar
       | vulnerabilities alike. """"
       | 
       | There this case and there many other cases. I worry for copy &
       | paste dev.
        
       | mr_00ff00 wrote:
       | Slightly irrelevant, but love the color theme on the python code
       | snippets. Wish I knew what it was.
        
       ___________________________________________________________________
       (page generated 2025-03-29 23:01 UTC)