hngopher.com

       Launch HN: Cua (YC X25) - Open-Source Docker Container for
       Computer-Use Agents
        
       Hey HN, we're Francesco and Alessandro, the creators of c/ua
       (https://www.trycua.com), a Docker-style container runtime that
       lets AI agents drive full operating systems in lightweight,
       isolated VMs. Our entire framework is open-source
       (https://github.com/trycua/cua), and today we're thrilled to have
       our Launch HN!
        
       Check out our demo to see it in action:
       https://www.youtube.com/watch?v=Ee9qf-13gho, and for more examples
       - including Tableau, Photoshop, CAD workflows - see the demos in
       our repo: https://github.com/trycua/cua.
        
       For Computer-Use AI agents to be genuinely useful, they must
       interact with your system's native applications. But giving full
       access to your host device is risky. What if the agent's process
       gets compromised, or the LLM hallucinates and leaks your data? And
       practically speaking, do you really want to give up control of
       your entire machine just so the agent can do its job?
        
       The idea behind c/ua is simple: let agents operate in a mirror of
       the user's system - isolated, secure, and disposable - so users
       can fire-and-forget complex tasks without needing to dedicate
       their entire system to the agent. By running in a virtualized
       environment, agents can carry out their work without interrupting
       your workflow or risking the integrity of your system.
        
       While exploring this idea, I discovered Apple's
       Virtualization.Framework and realized it offered fast and
       lightweight virtualization on Apple Silicon. This led us to build
       a high-performance virtualization layer and, eventually, a
       computer-use interface that allows agents to interact with apps
       just like a human would - without taking over the entire system.
        
       As we built this, we decided to open-source the virtualization
       core as a standalone CLI tool called Lume (Show HN here:
       https://news.ycombinator.com/item?id=42908061). c/ua builds on top
       of Lume, providing a full framework for running agent workflows
       inside secure macOS or Linux VMs, so your system stays free for
       you to use while the agent works its magic in the background.
        
       With Cua you can build an AI agent within a virtual environment
       to:
       - navigate and interact with any application's interface;
       - read screen content and perform keyboard/mouse actions;
       - switch between applications and self-debug when needed;
       - operate in a secure sandbox with controlled file access.
        
       All of this occurs in a fully isolated environment, ensuring your
       host system, files, and sensitive data remain completely secure,
       while you continue using your device without interruption.
        
       People are using c/ua to:
       - Bypass CryptoJS-based encryption and anti-bot measures to
         interact with modern web apps reliably;
       - Automate Tableau dashboards and export insights via Claude
         Desktop;
       - Drive Photoshop for batch image editing by prompt;
       - Modify 3D models in Fusion 360 with a CAD Copilot;
       - Extract data from legacy ERP apps without brittle
         screen-scraping scripts.
        
       We're currently working on multi-VM orchestration for parallel
       agentic workflows, Windows and Linux VM support, and episodic and
       long-term memory for CUA Agents.
        
       On the open-source side, c/ua is 100% free under the MIT license -
       run it locally with any LLM you like. We're also gearing up a
       hosted orchestration service for teams who want zero-ops setup
       (early access sign-ups opening soon).
        
       We'd love to hear from you. What desktop or legacy apps do you
       wish you could automate? Any thoughts, feedback, or horror stories
       from fragile AI automations are more than welcome!
        
       Author : frabonacci
       Score  : 157 points
       Date   : 2025-04-23 15:55 UTC (1 day ago)
        
        
       | mountainriver wrote:
       | This is cool! We built a similar thing with AgentDesk
       | https://github.com/agentsea/agentdesk
       | 
       | Would love to chat sometime!
        
         | reindent wrote:
         | That's great.
         | 
         | Also built something on top of Browser Use (Nanobrowser) and
         | Docker.
         | 
         | https://github.com/reindent/nanomachine
         | 
         | Just finished planning and shell capabilities.
         | 
         | Let's chat: @reindentai (X)
        
           | frabonacci wrote:
           | Sure - just followed you back!
        
         | frabonacci wrote:
         | I love AgentDesk's take on Kubernetes - it's something we had
         | considered as well, but it didn't make much sense for macOS
         | since you can only spin up two macOS VMs at a time due to
         | Apple's licensing restrictions.
         | 
         | Feel free to join our Discord so we can chat more:
         | https://discord.com/invite/mVnXXpdE85
        
           | mountainriver wrote:
           | That's a fantastic way to get your IP banned :)
        
         | abshkbh wrote:
         | https://github.com/abshkbh/arrakis Also building in this space
         | using MicroVMs. Currently working on a Mac port. Would love to
         | connect - abshkbh AT gmail.com
        
       | jameskuj wrote:
       | A superfan of this product!
        
         | frabonacci wrote:
         | Thank you - your support means a lot to us!
        
       | zwenbo wrote:
       | Amazing product! Congrats on the launch!
        
         | frabonacci wrote:
         | Thank you so much - we truly appreciate your support!
        
       | tomatohs wrote:
       | Would love to use this for TestDriver, but it needs to support
       | Windows :*(
        
         | frabonacci wrote:
         | Windows host support is on our roadmap - we're currently
         | exploring virtualization options with KVM/QEMU. Please join the
         | discussion on our Discord:
         | https://discord.com/invite/mVnXXpdE85
        
       | brene wrote:
       | will this also be available as a hosted service? Or do you have
       | instructions on how to manage a fleet of these manually while
       | you're building the orchestration workflows?
        
         | frabonacci wrote:
         | Yes, we're currently running pilots with select customers for a
         | hosted service of Cua supporting macOS and Windows cloud
         | instances. Feel free to reach out with your use case at
         | founders@trycua.com
        
       | ekarabeg wrote:
       | Congrats on the launch! Awesome product!
        
         | frabonacci wrote:
         | Thanks -- we really appreciate your support!
        
       | rahimnathwani wrote:
       | I tried this three times. Twice a few days ago and once just now.
       | 
       | First time: it opened a macOS VM and started to do stuff, but it
       | got ahead of itself and started typing things in the wrong
       | place. So now that VM has a Finder window open, with a recent
       | file that's called
       |     plt.ylabel('Price(USD)').sh
       | 
       | The second and third times, it launched the VM but failed to do
       | anything, showing these errors:
       | 
       |     INFO:cua:VM run response: None
       |     INFO:cua:Waiting for VM to be ready...
       |     INFO:cua:Waiting for VM macos-sequoia-cua_latest to be ready (timeout: 600s)...
       |     INFO:cua:VM status changed to: stopped (after 0.0s)
       |     DEBUG:cua:Waiting for VM IP address... Current IP: None, Status: stopped
       |     DEBUG:cua:Waiting for VM IP address... Current IP: None, Status: stopped
       |     DEBUG:cua:Waiting for VM IP address... Current IP: None, Status: stopped
       |     INFO:cua:VM status changed to: running (after 12.4s)
       |     INFO:cua:VM macos-sequoia-cua_latest got IP address: 192.168.64.2 (after 12.4s)
       |     INFO:cua:VM is ready with IP: 192.168.64.2
       |     INFO:cua:Initializing interface for macos at 192.168.64.2
       |     INFO:cua.interface:Logger set to INFO level
       |     INFO:cua.interface.macos:Logger set to INFO level
       |     INFO:cua:Connecting to WebSocket interface...
       |     INFO:cua.interface.macos:Waiting for Computer API Server to be ready (timeout: 60s)...
       |     INFO:cua.interface.macos:Attempting WebSocket connection to ws://192.168.64.2:8000/ws
       |     WARNING:cua.interface.macos:Computer API Server connection lost. Will retry automatically.
       |     INFO:cua.interface.macos:Still waiting for Computer API Server... (elapsed: 10.0s, attempts: 11)
       |     INFO:cua.interface.macos:Still waiting for Computer API Server... (elapsed: 20.0s, attempts: 21)
       |     INFO:cua.interface.macos:Still waiting for Computer API Server... (elapsed: 30.0s, attempts: 31)
       |     WARNING:cua.interface.macos:Computer API Server connection lost. Will retry automatically.
       |     INFO:cua.interface.macos:Still waiting for Computer API Server... (elapsed: 40.0s, attempts: 41)
       |     INFO:cua.interface.macos:Still waiting for Computer API Server... (elapsed: 50.1s, attempts: 51)
       |     ERROR:cua.interface.macos:Could not connect to 192.168.64.2 after 60 seconds
       |     ERROR:cua:Failed to connect to WebSocket interface
       |     DEBUG:cua:Computer initialization took 76856.09ms
       |     ERROR:agent.core.agent:Error in agent run method: Could not connect to WebSocket interface at 192.168.64.2:8000/ws: Could not connect to 192.168.64.2 after 60 seconds
       |     WARNING:cua.interface.macos:Computer API Server connection lost. Will retry automatically.
       | 
       | This was using the gradio interface, with the agent loop provider
       | as OMNI and the model as gemma3:4b-it-q4_K_M
       | 
       | These versions:
       | 
       |     cua-agent==0.1.29
       |     cua-computer==0.1.23
       |     cua-core==0.1.5
       |     cua-som==0.1.3
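
The log in the comment above shows a readiness wait: repeated connection attempts against the VM's address until a 60-second deadline passes. A minimal sketch of that general pattern follows. It is not cua's implementation: the function name `wait_for_port` and its parameters are assumptions made for illustration, and it only checks that a TCP port accepts connections, which is weaker than completing a WebSocket handshake.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> int:
    """Poll a TCP port until it accepts a connection or the deadline passes.

    Hypothetical helper for illustration only. Returns the number of
    attempts made; raises TimeoutError if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            # create_connection raises OSError on refusal or per-attempt timeout.
            with socket.create_connection((host, port), timeout=interval):
                return attempts
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"Could not connect to {host}:{port} "
                    f"after {timeout:.0f} seconds ({attempts} attempts)"
                )
            time.sleep(interval)
```

Running a check like this against the VM's IP and port from a plain terminal can help separate a networking problem from a model or agent-loop problem.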
        
         | frabonacci wrote:
         | Thanks for trying out c/ua! We still recommend pairing the Omni
         | loop configuration with a more capable VLM, such as Qwen2.5-VL
         | 32B, or using a cloud LLM provider like Sonnet 3.7 or OpenAI
         | GPT-4.1. While we believe that in the coming months we'll see
         | better-performing quantized models that require less memory for
         | local inference, truth is we're not quite there yet.
         | 
         | Stay tuned - we're also releasing support for UI-Tars-1.5 7B
         | this week! It offers excellent speed and accuracy, and best of
         | all, it doesn't require bounding box detection (Omni) since
         | it's a pixel-native model.
        
           | rahimnathwani wrote:
           | Thanks. I'll try that, but right now it's not working at all,
           | i.e. cua can't interact with the VM at all. That's not a
           | model issue.
        
             | frabonacci wrote:
             | If you're running Cua from VS Code or Cursor, have you
             | checked out this issue?
             | https://github.com/trycua/cua/issues/61
             | 
             | Feel free to ping me on Discord (I'm francesco there) -
             | happy to hop on a quick call to help debug:
             | https://discord.com/invite/mVnXXpdE85
        
       | brap wrote:
       | Congrats on the launch!
       | 
       | I don't know if this is a problem you've faced, but I'm curious:
       | how do LLM tool devs handle authn/authz? Do host apps normally
       | forward a token or something? Is there a standard commonly used?
       | What if the tool needs some permissions to act on the user's
       | behalf?
        
         | alexchantavy wrote:
         | There are companies like https://www.keycard.sh/ taking this
         | on. There are other competitors too but I can't think of them
         | atm
        
         | frabonacci wrote:
         | Good question! Specifically around computer-use agents (CUAs),
         | I haven't seen much exploration yet - and I think it's an area
         | worth exploring for vertical products. For example, how do you
         | securely handshake between a CUA agent and an API-based agent
         | without exposing credentials? If everything stays within a
         | local cluster, it's manageable, but once you start scaling out,
         | authn/authz becomes a real headache.
         | 
         | I'm also working on a blog post that touches on this -
         | particularly in the context of giving agents long-term and
         | episodic memory. Should be out next week!
        
       | SkylerJi wrote:
       | This is insane y'all
        
         | frabonacci wrote:
         | Thank you - we appreciate it!
        
       | farazmsiddiqi wrote:
       | i love this -- isolation and permissioning for computer use
       | agents. why can't i use regular docker containers to deploy my
       | computer use agent?
        
         | frabonacci wrote:
         | Glad you love it! Right now, we're relying more on the Lume CLI
         | and its API server rather than a full Docker setup. However,
         | we'll soon be shipping a Docker interface that'll handle VNC
         | and model hosting (through docker model runner). Stay tuned for
         | that!
        
       | winwang wrote:
       | Congrats! How do you guys deal with SOC2/HIPAA/etc.? Or are those
       | separate concerns?
        
         | frabonacci wrote:
         | Thanks! Great question - those are definitely relevant, but
         | they depend a lot on the deployment model. Since CUAs often run
         | locally or in controlled environments (e.g. a user's own VM or
         | cluster), we can sidestep a lot of traditional SOC2/HIPAA
         | concerns around centralized data handling. That said, if you're
         | running agents across org boundaries or processing sensitive
         | data via cloud APIs, then yeah - those frameworks absolutely
         | come into play.
         | 
         | We're designing with that in mind: think fine-grained
         | permissioning, auditability, and minimizing surface area. But
         | it's still early, and a lot of it depends on how teams end up
         | using CUAs in practice.
        
       | xdotli wrote:
       | THIS IS FIRE been wanting this for ages
        
         | frabonacci wrote:
         | Thank you for your support!
        
       | swanYC wrote:
       | Love this!
        
         | frabonacci wrote:
         | Thank you - we appreciate it!
        
       | throw03172019 wrote:
       | This is precisely what I am looking for but for Windows. We need
       | to automate some Windows native apps.
       | 
       | In the meantime, I'll give this a shot on macOS tonight.
       | Congrats!
        
         | shykes wrote:
         | Check out pig: https://pig.dev
         | 
         | (I am not affiliated)
        
           | throw03172019 wrote:
           | I do recall looking at it before but was concerned about
           | HIPAA if they are storing data on their servers as well.
           | 
           | Also, is the project still active? No commits for 2 months is
           | odd for a YC startup in current batch :)
        
         | frabonacci wrote:
         | Yes - pig.dev is a great product! You should definitely check
         | it out.
         | 
         | Also, let us know on Discord once you've tried out c/ua locally
         | on macOS: https://discord.com/invite/mVnXXpdE85
        
       | 3s wrote:
       | this is really cool! congrats on the launch
        
         | frabonacci wrote:
         | Thank you - we appreciate your support!
        
       | taikon wrote:
       | How's it different from e2b computer use?
        
         | orliesaurus wrote:
         | Active development of CUA, according to GitHub
        
         | frabonacci wrote:
         | We're still figuring things out in public, but a few key
         | differences:
         | 
         | - Open-source from the start. Cua's built under an MIT license
         | with the goal of making Computer-Use agents easy and accessible
         | to build. Cua's Lume CLI was our first step - we needed fast,
         | reproducible VMs with near-native performance to even make this
         | possible.
         | 
         | - Native macOS support. As far as we know, we're the only ones
         | offering macOS VMs out of the box, built specifically for
         | Computer-Use workflows. And you can control them with a
         | PyAutoGUI-compatible SDK (cua-computer) - so things like click,
         | type, scroll just work, without needing to deal with any inter-
         | process communication.
         | 
         | - Not just the computer/sandbox, but the agent too. We're also
         | shipping an Agent SDK (cua-agent) that helps you build and run
         | these workflows without having to stitch everything together
         | yourself. It works out of the box with OpenAI and Anthropic
         | models, UI-Tars, and basically any VLM if you're using the
         | OmniParser agent loop.
         | 
         | - Not limited to Linux. The hosted version we're working on
         | won't be Linux-only - we're going to support macOS and Windows
         | too.
        
       | orliesaurus wrote:
       | Bravi! The future is the Agent OS. How robust is the UI element
       | detection and interaction across different apps and when
       | navigating complex menus? Is it resistant to UI changes? That's
       | often where these automations get brittle.
       | 
       | thank you e forza Cua
        
         | frabonacci wrote:
         | UI detection's a big focus - we use visual grounding +
         | structured observations (like icons, OCR, app metadata, window
         | state), so the agent can reason more like a user would. It's
         | surprisingly robust even with layout shifts or new themes
        
       | gavinbains wrote:
       | Legendary. This is going to be very helpful, and the TAM is
       | getting bigger. Thank you guys for this, and for all the
       | learnings in-batch -- I'm excited for the future!
       | 
       | I reckon I could run this for buying fashion drops, is this a use
       | case y'all have seen?
        
         | frabonacci wrote:
         | Appreciate that a lot! Yep - buying fashion drops, limited
         | releases, ticketing, etc. are all great fits. Cua can also
         | bypass CryptoJS-based encryption and other anti-bot measures,
         | so it plays nicely with modern web apps out of the box.
        
       | sagarpatil wrote:
       | Love your accent!
        
         | frabonacci wrote:
         | Thank you!!
        
       | dhruv3006 wrote:
       | One-shot VM would be nice: an ephemeral VM spins up, the agent
       | runs its task, the VM is deleted - perfect for CI pipelines.
        
         | frabonacci wrote:
         | 100% - ephemeral VMs are on the roadmap. Perfect for CI: spin
         | up, run the agent, nuke it
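
The spin up, run, nuke lifecycle described above maps naturally onto a context manager that guarantees teardown even when the task fails. The following is a minimal sketch under stated assumptions: `VMBackend`, `ephemeral_vm`, and `FakeBackend` are hypothetical names invented here, not part of cua or Lume, and the fake backend only tracks IDs in memory.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class VMBackend(Protocol):
    """Hypothetical backend interface (an assumption, not a cua/Lume API)."""

    def create(self, image: str) -> str: ...
    def delete(self, vm_id: str) -> None: ...


@contextmanager
def ephemeral_vm(backend: VMBackend, image: str) -> Iterator[str]:
    """Create a VM, hand its ID to the caller, and always delete it afterwards."""
    vm_id = backend.create(image)
    try:
        yield vm_id
    finally:
        # Runs on success and on exceptions, so no VM is left behind.
        backend.delete(vm_id)


class FakeBackend:
    """In-memory stand-in used to exercise the lifecycle without a real VM."""

    def __init__(self):
        self.live = set()
        self._counter = 0

    def create(self, image: str) -> str:
        self._counter += 1
        vm_id = f"{image}-{self._counter}"
        self.live.add(vm_id)
        return vm_id

    def delete(self, vm_id: str) -> None:
        self.live.discard(vm_id)
```

In a CI job, the body of the `with ephemeral_vm(...)` block would run the agent task; the `finally` clause is what makes the VM disposable.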
        
           | jeol_wa wrote:
           | perfect
        
       | badmonster wrote:
       | Congrats on the launch! love this idea. How does the LLM interact
       | with the VM--screen+metadata as JSON, or higher-level planning?
        
         | frabonacci wrote:
         | Thanks, really appreciate it!
         | 
         | The LLM interacts with the VM through a structured virtual
         | computer interface (cua-computer and cua-agent). It's a high-
         | level abstraction that lets the agent act (e.g., "open
         | Terminal", "type a command", "focus an app") and observe (e.g.,
         | current window, file system, OCR of the screen, active
         | processes) in a way that feels a lot more like using a real
         | computer than parsing raw data.
         | 
         | So under the hood, yes, screen+metadata are used (especially
         | with the Omni loop and visual grounding), but what the model
         | sees is a clean interface designed for agentic workflows -
         | closer to how a human would think about using a computer.
         | 
         | If you're curious, the agent loops (OpenAI, Anthropic, Omni,
         | UI-Tars) offer different ways of reasoning and grounding
         | actions, depending on whether you're using cloud or local
         | models.
         | 
         | https://github.com/trycua/cua/tree/main/libs/agent#agent-loo...
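
The act/observe split described above can be illustrated with a toy loop. This is a sketch under stated assumptions, not the cua-computer or cua-agent API: `Observation`, `Action`, `FakeComputer`, `run_agent`, and `scripted_policy` are all hypothetical names, the "computer" is an in-memory fake, and the policy is a hard-coded function standing in for a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    """What the agent sees each step (fields are illustrative)."""
    active_window: str
    screen_text: str


@dataclass
class Action:
    kind: str           # "open_app", "type", or "done"
    argument: str = ""


class FakeComputer:
    """In-memory stand-in for a VM; hypothetical, not a real SDK class."""

    def __init__(self):
        self.active_window = "Desktop"
        self.typed: List[str] = []

    def observe(self) -> Observation:
        return Observation(self.active_window, " ".join(self.typed))

    def act(self, action: Action) -> None:
        if action.kind == "open_app":
            self.active_window = action.argument
        elif action.kind == "type":
            self.typed.append(action.argument)
        else:
            raise ValueError(f"unknown action: {action.kind}")


def run_agent(computer: FakeComputer,
              policy: Callable[[Observation], Action],
              max_steps: int = 10) -> List[Action]:
    """Observe, ask the policy for an action, apply it; stop on "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(computer.observe())
        if action.kind == "done":
            break
        computer.act(action)
        history.append(action)
    return history


def scripted_policy(obs: Observation) -> Action:
    """Stand-in for a model: open Terminal, type a command, then stop."""
    if obs.active_window != "Terminal":
        return Action("open_app", "Terminal")
    if "ls" not in obs.screen_text:
        return Action("type", "ls")
    return Action("done")
```

Swapping `scripted_policy` for a call to a model is where the different agent loops would differ; the loop itself stays the same.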
        
           | baritone wrote:
           | First off, this is great, and I think there are use-cases for
           | this. Being able to even partially isolate could be helpful.
           | 
           | Second, as a user, you'd want to handle the case where some
           | or all of these have been fully compromised. Surreptitiously,
           | super-intelligently, and partially or fully autonomously, one
           | container or many may have access to otherwise isolated
           | networks within homes, corporate networks, or some device in
           | a high security area with access to nuclear weapons,
           | biological weapons, the electrical grid, our water supply,
           | our food supplies, manufacturing, or even some other key
           | vulnerability we've discounted, like a toy.
           | 
           | While providing more isolation is good, there is no amount of
           | caution that can prevent calamity when you give everyone a
           | Pandora's box. It's like giving someone a bulletproof jacket
           | to protect them from fox tapeworm cancer or hyper-
           | intelligent, time-traveling, timespace-manipulating super-
           | Ebola.
           | 
           | That said, it's the world we live in now, where we're in a
           | race to our demise. So, thanks for the bulletproof jacket.
        
       | contr-error wrote:
       | This is amazing, especially if it helps facilitate astroturfing,
       | such as these comments made by fresh users, all with AI-generated
       | responses from frabonacci:
       | 
       | https://news.ycombinator.com/threads?id=SkylerJi
       | 
       | https://news.ycombinator.com/threads?id=zwenbo
       | 
       | https://news.ycombinator.com/threads?id=ekarabeg
       | 
       | https://news.ycombinator.com/threads?id=jameskuj
        
         | TehCorwiz wrote:
         | The dead internet is real.
         | 
         | Seriously though, this kind of behavior should be considered a
         | violation of the social contract.
        
         | dang wrote:
         | It's not intentional - it's YC founders excitedly telling their
         | friends (and especially their YC batchmates) that they launched
         | on HN. They didn't ask anyone to vote or comment, and the
         | responses were not AI-generated. (That last point should be
         | obvious btw; no one needs AI to write "Thank you - we
         | appreciate it", and frabonacci was obviously just being
         | polite.)
         | 
         | Here's what you guys need to understand:
         | 
         | (1) Not everyone spends hours on Hacker News--many casual users
         | have no idea about the culture of this place re voting rings,
         | booster comments, and so on.
         | 
         | (2) Many people enjoy congratulating their friends when they
         | reach a major milestone.
         | 
         | (3) Other sites have a culture where this kind of thing is
         | fine.
         | 
         | HN is different, of course, and we tell founders to stop this
         | from happening. In fact, I basically yell it at them in the
         | Launch HN guide: https://news.ycombinator.com/yli.html#noboost.
         | I also yell it at them in person every chance I get--I do my
         | best to scare them! But if you think that including something
         | in a list of rules plus repeating it over and over in person is
         | sufficient to get a message across, may I introduce you to the
         | Measure Zero Effect: no matter how often you repeat something,
         | the set of users who receive the message has measure zero
         | (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...)
         | 
         | As it happens, I saw those comments in the thread (mostly the
         | same ones you listed), marked them offtopic, and emailed the
         | founders as soon as I could:
         | 
         | "_Btw, did you send a message to batchmates/friends about
         | this thread? I'm seeing a lot of booster comments in there now.
         | This is not good for you! (See
         | https://news.ycombinator.com/yli.html.)_
         | 
         | _Fortunately though, there are a lot of organic comments as
         | well so I can just move the booster ones lower down and they
         | shouldn't harm anything. Still, if you have a way to tell your
         | friends not to do that, it would be good. Send them to
         | https://news.ycombinator.com/yli.html as well, if you like :) -
         | the text about that is repeated and in a bold font for a
         | reason!_"
         | 
         | They replied that their Discord was probably spreading word of
         | the launch and they'd add a message asking people to stop.
         | After that, it mostly stopped.
        
       | suninsight wrote:
       | Very cool product!
       | 
       | We, at NonBioS.ai [AI Software Dev], built something like this
       | from scratch for Linux VMs, and it was a heavy lift. We could
       | have used you guys if we had known about it. But I can see this
       | being immediately useful at a ton of places.
        
         | frabonacci wrote:
         | Thank you, really appreciate that! Curious - did you end up
         | using QEMU for your Linux VMs? And are you running your system
         | locally or in the cloud?
         | 
         | We're currently focused on macOS but planning to support Linux
         | soon, so I'd love to hear more about your use case. Feel free
         | to reach out at founders@trycua.com - always great to learn
         | from others building in this space.
        
           | suninsight wrote:
           | No, we don't use QEMU - never heard of it till now. We built
           | our own software from scratch - using Ubuntu - for AI. We are
           | completely on the cloud. Every user gets a full Ubuntu Cloud
           | VM for his NonBioS AI Engineer to work on.
           | 
           | We covered this a fair bit on our blogs:
           | - https://www.nonbios.ai/post/why-nonbios-chose-cloud-vms-for-...
           | - https://www.nonbios.ai/post/private-linux-vms-for-every-nonb...
        
       | gitroom wrote:
       | man this is insane - being able to spin up secure agent vms this
       | easy would save me so much pain lmao
        
         | frabonacci wrote:
         | Thanks! I'd love to hear more about your use case!
        
       | jeol_wa wrote:
       | Amazing, I was thinking of implementing something like this after
       | taking a course on Building Code Agents with Smolagents from
       | Deeplearning.ai
       | 
       | I wanted to look at a Docker alternative to e2b
        
         | frabonacci wrote:
         | Thank you! If you're looking for a Docker alternative to
         | something like e2b, we're planning to ship a containerized
         | version of c/ua that also handles VNC and model hosting. Right
         | now we're using the Lume CLI
         | (https://github.com/trycua/cua/tree/main/libs/lume) with an API
         | server on the host as a lightweight alternative, but the Docker
         | setup will make it easier to self-host and extend. Would love
         | to hear what kind of workloads or use cases you had in mind!
        
       ___________________________________________________________________
       (page generated 2025-04-24 23:01 UTC)