[HN Gopher] Launch HN: Mosaic (YC W25) - Agentic Video Editing
___________________________________________________________________
Launch HN: Mosaic (YC W25) - Agentic Video Editing
Hey HN! We're Adish & Kyle from Mosaic (https://edit.mosaic.so,
https://docs.mosaic.so/, https://mosaic.so). Mosaic lets you create
and run your own multimodal video editing agents in a node-based
canvas. It's different from traditional video editing tools in two
ways: (1) the user interface and (2) the visual intelligence built
into our agent.

We were engineers at Tesla and one day had a fun idea to make a
YouTube video of Cybertrucks in Palo Alto. We recorded hours of cars
driving by, but got stuck on how to scrub through all this raw
footage to edit it down to just the Cybertrucks. We got frustrated
trying to accomplish simple tasks in video editors like DaVinci
Resolve and Adobe Premiere Pro. Features are hidden behind menus,
buttons, and icons, and we often found ourselves Googling or asking
ChatGPT how to do certain edits.
We thought that surely now, with multimodal AI, we could accelerate
this process. Better yet, an AI video editor could automatically
apply edits based on what it sees and hears in your video. The idea
quickly snowballed and we began our side quest to build "Cursor for
Video Editing". We put together a prototype and, to our amazement,
it was able to analyze and add text overlays based on what it saw or
heard in the video. We could now automate our Cybertruck counting
with a single chat prompt. That prototype is shown here:
https://www.youtube.com/watch?v=GXr7q7Dl9X0.

After that, we spent a chunk of time building our own timeline-based
video editor and making our multimodal copilot powerful and
stateful. In natural language, we could now ask chat to help with AI
asset generation, enhancements, searching through assets, and
automatically applying edits like dynamic text overlays. That
version is shown here: https://youtu.be/X4ki-QEwN40.

After talking to users, though, we realized that the chat UX has
limitations for video: (1) the longer the video, the more time it
takes to process, so users have to wait too long between chat
responses; (2) users have set workflows that they use across video
projects, and especially for people who have to produce a lot of
content, the chat interface is a bottleneck rather than an
accelerant.

That took us back to first principles to rethink what a "non-linear
editor" really means. The result: a node-based canvas which enables
you to create and run your own multimodal video editing agents:
https://screen.studio/share/SP7DItVD.
Each tile in the canvas represents a video editing operation and is
configurable, so you still have creative control. You can also
branch and run edits in parallel, creating multiple variants from
the same raw footage to A/B test different prompts, models, and
workflows. In the canvas, you can see inline how your content
evolves as the agent goes through each step. The idea is that the
canvas runs your video editing on autopilot and gets you 80-90% of
the way there. Then you can adjust and modify the result in an
inline timeline editor. We support exporting your timeline state out
to traditional editing tools like DaVinci Resolve, Adobe Premiere
Pro, and Final Cut Pro.
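To make the branching idea concrete, here is a toy sketch
(illustrative only -- this is not Mosaic's actual schema or API) of
what "one rough cut feeding two caption variants" looks like if you
think of the canvas as a small graph:

    # Illustrative only -- not Mosaic's actual schema or API.
    workflow = {
        "nodes": {
            "rough_cut":  {"op": "rough_cut",
                           "prompt": "keep the best Cybertruck drive-bys"},
            "captions_a": {"op": "captions", "prompt": "minimal lower-thirds"},
            "captions_b": {"op": "captions", "prompt": "bold cinematic captions"},
        },
        # (upstream, downstream) -- both caption tiles branch off the cut
        "edges": [("rough_cut", "captions_a"), ("rough_cut", "captions_b")],
    }

    def downstream_of(node):
        return [dst for src, dst in workflow["edges"] if src == node]

    # A breadth-first walk is enough to see the two parallel branches.
    frontier = ["rough_cut"]
    while frontier:
        node = frontier.pop(0)
        print(node, "->", workflow["nodes"][node]["prompt"])
        frontier.extend(downstream_of(node))

Each branch receives the same upstream output, so the two caption
prompts yield two variants of the same footage to compare.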
We've also used multimodal AI to build in visual understanding and
intelligence. This gives our system a deep understanding of video
concepts, emotions, actions, spoken word, light levels, and shot
types.
We're doing a ton of additional processing in our pipeline, such as
saliency analysis, audio analysis, and determining objects of
significance--all to help guide the best edit. These are things that
we as human editors internalize so deeply we may not think twice
about them, but reverse-engineering the process to build it into the
AI agent has been an interesting challenge.

Some of our analysis findings:

- Optimal Safe Rectangles: https://assets.frameapp.ai/mosaicresearchimage1.png
- Video Analysis: https://assets.frameapp.ai/mosaicresearchimage2.png
- Saliency Analysis: https://assets.frameapp.ai/mosaicresearchimage3.png
- Mean Movement Analysis: https://assets.frameapp.ai/mosaicresearchimage4.png
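To give a flavor of the kind of signal behind the mean-movement and
safe-rectangle images above, here is a rough sketch (illustrative
only, not our production pipeline) of scoring per-region motion with
OpenCV so a caption can land in the calmest part of the frame:

    # Illustrative sketch, not Mosaic's production code: average the
    # optical-flow magnitude per grid cell across a clip; the calmest
    # cell is a candidate "safe rectangle" for caption placement.
    import cv2
    import numpy as np

    def calmest_region(path, grid=(3, 3), step=5):
        cap = cv2.VideoCapture(path)
        ok, frame = cap.read()
        if not ok:
            raise ValueError("could not read video")
        prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        h, w = prev.shape
        scores = np.zeros(grid)
        while True:
            for _ in range(step):  # sample every `step` frames
                ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            mag = np.linalg.norm(flow, axis=2)
            for r in range(grid[0]):  # accumulate mean motion per cell
                for c in range(grid[1]):
                    cell = mag[r * h // grid[0]:(r + 1) * h // grid[0],
                               c * w // grid[1]:(c + 1) * w // grid[1]]
                    scores[r, c] += cell.mean()
            prev = gray
        cap.release()
        r, c = np.unravel_index(np.argmin(scores), scores.shape)
        return (c * w // grid[1], r * h // grid[0],  # x, y
                w // grid[1], h // grid[0])          # width, height

Motion is only one signal -- saliency, objects of significance, and
audio feed the same kind of decision -- but the pattern is the same:
measure first, then let the agent use the measurement.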
Use cases for editing include:

- Removing bad takes or creating script-based cuts from videos /
  talking-heads
- Repurposing longer-form videos into clips, shorts, and reels
  (e.g. podcasts, webinars, interviews)
- Creating sizzle reels or montages from one or many input videos
- Creating assembly edits and rough cuts from one or many input
  videos
- Optimizing content for various social media platforms (reframing,
  captions, etc.)
- Dubbing content with voice cloning and lip syncing.
We also support use cases for generating content, such as motion
graphic animations, cinematic captions, AI UGC content, adding
contextual AI-generated B-roll to existing content, or modifying
existing video footage (changing lighting, applying VFX).

Currently, our canvas can be used to build repeatable agentic
workflows, but we're working on a fully autonomous agent which will
be able to do things like: style transfer using existing video
content, defining its own editing sequence / workflow without
needing a canvas, doing research and pulling assets from web
references, and so on.

You can try it today at https://edit.mosaic.so. You can sign up for
free and get started playing with the interface by uploading videos,
making workflows on the canvas, and editing them in the timeline
editor. We do paywall node runs to help cover model costs. Our API
docs are at https://docs.mosaic.so. We'd love to hear your feedback!
Author : adishj
Score : 99 points
Date : 2025-11-19 15:28 UTC (7 hours ago)
(HTM) web link (mosaic.so)
(TXT) w3m dump (mosaic.so)
| tonyoconnell wrote:
| This is so cool. Good luck with your venture.
| adishj wrote:
| Thank you :)
| callamdelaney wrote:
| Hey, good luck with Mosaic.
|
| Some feedback initially on the landing page, looks great but I
| thought that there is, for me, too much motion going on on the
| homepage and the use cases page. May be an unpopular opinion!
| cjbarber wrote:
| Agreed, homepage was confusing for me also. I tried to scroll
| around and see a demo. For a product like this that is so
| visual, I expected to be able to find a 30s demo clip somewhere
| but couldn't see one on the homepage or product page (and the
| scrolling on the product page was annoying for me).
| adishj wrote:
| the sad part is we spent so long on the product page scrolling
| animation haha
|
| very valid point though -- I think a BEFORE vs AFTER demo clip
| right in the hero, or just below it, would be helpful
|
| thanks for the feedback
| adishj wrote:
| valid points, thanks for the feedback. i had gone for a certain
| aesthetic but you're right in that it may be a bit too
| overwhelming.
| cjbarber wrote:
| I think this is a great endeavor. I was thinking about a channel
| that I like watching on YouTube. They travel to exotic places by
| boat and film themselves, nature documentary style. To make good
| videos requires going to these places, a ton of filming, AND a
| ton of editing. They put out a video every 2 weeks or so on their
| trips. I imagine the editing is the hard part.
|
| This is a long winded way of saying that I think creators need
| what you're making! People who have hours of awesome footage but
| have to spend dozens of hours cutting it down need this. Then
| also people who have awesome footage but aren't good at editing
| or hiring an editor, same thing. I'd love to see someone solve
| this so that 90th percentile editing is available to all, and
| then it can be more about who has the interesting content, rather
| than who has the interesting content _and_ editing skills.
| adishj wrote:
| thanks! Mosaic can already do the rough cuts for you -- so you
| can upload all your footage from your travel, and prompt it to
| "make a 2 minute highlight reel of your trip to Japan", for
| instance.
|
| soon, we also plan to incorporate style transfer, so you could
| even give it a video from the channel you enjoy watching + your
| raw footage, and have the agent edit your footage in the same
| style as the reference video.
| mrbluecoat wrote:
| > you can upload all your footage from your travel, and
| prompt it to "make a 2 minute highlight reel of your trip to
| Japan"
|
| In relation to the demo requests below, I think this would be
| a good example of how an average person might use your
| platform.
| adishj wrote:
| for a demo, check out this one that I put together using 81
| clips from a skydiving trip we took in Monterey, CA:
|
| https://edit.mosaic.so/links/c51c0555-3114-45f4-ab8f-c25f17
| 2...
| penne_pastaa wrote:
| this is so cool, can we see some demos of edits you'd make with
| it?
| adishj wrote:
| thanks! check out the demo video here of the latest version of
| the interface: https://screen.studio/share/SP7DItVD
|
| i playback parts of the cinematic edit I made to the
| conversation between Dwarkesh Patel and Satya Nadella (e.g.
| added cinematic captions, motion graphics)
|
| i can post the full edit as well if you're interested
| jaccola wrote:
| Very cool. It definitely feels to me that the power of pro tools
| should be available to more people with AI.
|
| Would have been nice if there was a killer demo on your landing
| page of a video made with Mosaic.
| adishj wrote:
| that's our perspective as well.
|
| a lot of tooling is being built around generative AI in
| particular, but there's still a big gap for people that want to
| share their own stories / experiences / footage but aren't
| well-versed with pro tools.
|
| valid feedback on the landing page -- something we'll add in.
| bluelightning2k wrote:
| The problem is, any video demo of a tool like this is just an
| entirely unrelated video.
| adishj wrote:
| can you clarify what you mean here? check out this demo
| video: https://screen.studio/share/SP7DItVD
| BolexNOLA wrote:
| > We got frustrated trying to accomplish simple tasks in video
| editors like DaVinci Resolve and Adobe Premiere Pro. Features are
| hidden behind menus, buttons, and icons, and we often found
| ourselves Googling or asking ChatGPT how to do certain edits.
|
| Hidden behind a UI? Most of the major tools like blade, trim,
| etc. are right there on the toolbars.
|
| > We recorded hours of cars driving by, but got stuck on how to
| scrub through all this raw footage to edit it down to just the
| Cybertrucks.
|
| Scrubbing is the easiest part. Mouse over the clip, it starts
| scrubbing!
|
| I'm being a bit tongue in cheek and I totally agree there is a
| learning curve to NLEs, but those complaints were also a bit
| striking to me.
| adishj wrote:
| hey! You're right that most of the basic tools like splitting /
| trimming are available right in the timeline. but for things like
| adding a keyframe to animate a counter, for instance, I had no
| idea where to go or how to start.
|
| Scrubbing is easy enough when you have short footage, but
| imagine scrubbing through the footage we had of 5 hours of cars
| driving by, or maybe a bunch of assets. This quickly becomes
| very tedious.
| BolexNOLA wrote:
| I don't need to imagine, I do it haha but again I was being
| tongue in cheek. I personally would love an effective tool
| that can mark and favorite clips for me based on written
| prompts. Would save me an awful amount of time!
| adishj wrote:
| curious -- what kind of content do you edit?
| BolexNOLA wrote:
| Now? Mostly long form educational content. But
| historically? Everything more or less! Freelancer for
| about 15 years until my current in-house producer role.
| andrewmlevy wrote:
| obligatory https://news.ycombinator.com/item?id=9224
| BolexNOLA wrote:
| Like I said, the description of some of the issues was just
| kind of funny to me - I think this could be a potentially
| very useful tool.
|
| Do you think this is the next Dropbox?
| teddyh wrote:
| Not related to NCSA Mosaic (RIP).
| adishj wrote:
| if you take a snippet of Ben Horowitz's interview out of
| context, he has a lot of good things to say about our product
| :)
| shivvtrivedi wrote:
| Mosaic team dev here. Hanging in the comments all day and pushing
| updates as fast as we can -- really appreciate the feedback!
| mberlove wrote:
| Is there a way to keep up to date on updates and new
| announcements? TIA.
| adishj wrote:
| yes! please join our discord https://discord.gg/26SAZzBTaP or
| follow us on X https://x.com/mosaic_so to stay up to date
| lava123 wrote:
| YOOOO, this is super awesome. Love this for you all. Let's make
| life easier for more creators.
| adishj wrote:
| thanks!
| bluelightning2k wrote:
| Good luck. I've dabbled with this myself and ultimately decided
| that DaVinci Resolve would end up doing this natively. But then
| again they haven't yet so who knows!
|
| Good luck with it, sincerely.
| adishj wrote:
| thanks! curious what you started dabbling with and if you have
| any thoughts to share :)
| zkmon wrote:
| I just clicked the link and encountered a non-scrollable, dark,
| fixed content pane with loads of flickering images and scrolling
| text with random font sizes without much meaning. I felt
| imprisoned, subjected to unexpected suffering, can't scroll away,
| got scared and raced for the window close button, and then
| breathed easy.
| pelagicAustral wrote:
| They really managed to handcraft a unique user experience,
| that's for sure.
| adishj wrote:
| we did but the landing page seems to be detracting from it --
| head directly to https://edit.mosaic.so to try the actual
| canvas interface
| adishj wrote:
| seems like the landing page is detracting from the main
| product, this is good feedback so thanks! For now, avoid the
| scaries and head directly to https://edit.mosaic.so to try the
| actual canvas interface
| conductr wrote:
| Since video is your thing, I feel like you need to just make
| a very edited demo reel and put all your energy into trying
| to get people to watch that video. Meaning, remove almost all
| text and bloat from the site and just show us all the cool
| stuff the product does for/to video editing. Distill it to
| 60-120 seconds and put that on your landing, hell put it on
| auto play if you want to, so long as it's clear that is the
| one thing I'm supposed to be paying attention to
| adishj wrote:
| yeah, I think a BEFORE vs AFTER demo reel right in the hero, or
| just below it, would be helpful
| dang wrote:
| I've put the /edit and /docs links in the first sentence
| above to soften the blow as well :)
| deepspace wrote:
| I had the same reaction. About what you would expect from a
| team steeped in the Tesla mindset.
| adishj wrote:
| thanks for the feedback -- you can head directly to
| https://edit.mosaic.so to try the actual canvas interface
| dang wrote:
| Please don't cross into personal attack. We're trying for the
| opposite on this site.
|
| https://news.ycombinator.com/newsguidelines.html
| ack210 wrote:
| I just signed up for a Creator plan, but it looks like the
| automated "Thank you for being a Mosaic Creator" email going out
| is not configured correctly. Instead of having my company name,
| it referenced a different business name and description (that
| seems to exist/be accurate, so not a placeholder).
| adishj wrote:
| Hey! Thanks for calling this out -- looking into what happened
| here & fixing right now.
| adishj wrote:
| This has been fixed now.
| echelon wrote:
| Can you make this a desktop app?
|
| I'm really tired of editing videos in the cloud. I'm also
| tired of all these AI image and video tools that make you work
| over a browser. Your workflow seems so second class buried
| amongst all the other browser tabs.
|
| I understand that this is how to deploy quickly to customers, but
| it feels so gross working on "heavy" media in a browser.
| supportengineer wrote:
| There's plenty of great native desktop apps for video editing.
| And there have been for almost 30 years. I also don't
| understand why anyone would want to use a browser for this.
| adishj wrote:
| there is some friction even in downloading a new app
|
| if our goal is to bring more people into the fold, minimizing
| the steps for them to start editing is something we want to
| optimize for
|
| that being said, being on the browser presents its own set of
| challenges, many of which are rightfully mentioned in this
| thread
| kleiba wrote:
| Sorry, not buying the argument. I think it's more like:
| that's the current zeitgeist.
| adishj wrote:
| we've done a ton of work to optimize uploads / downloads /
| transcoding to handle beefy files using proxies, and we also let
| you export XML back to traditional editing tools that can relink
| to your "heavy" media. but I hear you -- anything running locally
| on device is just going to feel faster
|
| it does present its own set of challenges, but something we've
| thought about
| shambu2k wrote:
| Damn, you beat me to it. I was building something similar but got
| too caught up optimizing the context extraction. I actually ended
| up building a full spec for it--basically a PoC of "grep for
| videos."
|
| My end goal was to let an agent make semantic changes (e.g.,
| "remove the parts where the guy in the blue dress is seen") by
| simply grepping the context spec for the relevant timestamps and
| using ffmpeg to cut them out.
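|
| For flavor, the cutting step itself is the easy part -- here's a
| simplified sketch, assuming the grep over the spec already
| produced the segments to keep as (start, end) seconds:
|
|     # Sketch only: cut the keep-segments and concatenate them.
|     # With -c copy the cuts snap to keyframes; re-encode instead
|     # if you need frame accuracy.
|     import subprocess, tempfile
|
|     def cut_and_join(src, keep, out):
|         tmp = tempfile.gettempdir()
|         parts = []
|         for i, (start, end) in enumerate(keep):
|             part = f"{tmp}/part_{i}.mp4"
|             subprocess.run(["ffmpeg", "-y", "-i", src,
|                             "-ss", str(start), "-to", str(end),
|                             "-c", "copy", part], check=True)
|             parts.append(part)
|         lst = f"{tmp}/parts.txt"
|         with open(lst, "w") as f:
|             f.write("\n".join(f"file '{p}'" for p in parts))
|         subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
|                         "-i", lst, "-c", "copy", out], check=True)
|
|     cut_and_join("raw.mp4", [(12.0, 19.5), (41.0, 55.0)], "edited.mp4")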
|
| How are you extracting context from videos?
| adishj wrote:
| how would this be different from vector embeddings / semantic
| search?
| shambu2k wrote:
| Vector embeddings are fuzzy on finding boundaries. With my
| spec approach, my goal is to get precise start/end times for
| ffmpeg to do edits. The downside is that there is a lot of
| pre-processing of raw footage in my approach. Vectors win on
| zero-shot flexibility here.
| adishj wrote:
| if you have an example you could share i'd be very curious
| what you mean.
| sails wrote:
| I've had a lot of fun with Remotion and Claude Code for CLI video
| editing. I've been impressed with how much traditional video
| editing I can manage.
|
| I will be checking this out!
| adishj wrote:
| that's super interesting -- what kind of things have you done
| with remotion and Claude Code?
|
| they're very powerful -- when you put them together, it almost
| feels like Cursor for Video Editing
| danishSuri1994 wrote:
| Really interesting direction. The node-based canvas feels like a
| more scalable abstraction for video automation than the usual
| chat-only interface. I'm curious how you're handling long-form
| content where temporal context matters (e.g., emotional shifts,
| pacing, narrative cues).
|
| Multimodal models are good at frame-level recognition, but
| editing requires understanding relationships between scenes. Have
| you found any methods that work reliably there?
| adishj wrote:
| hey, thanks for the comment!
|
| we've actually found that multimodal models are surprisingly
| good at maintaining temporal context as well
|
| that being said, we also do a bunch of additional processing using
| more traditional CV / audio analysis to extract this information
| (both frame-level and temporal) as part of our video understanding
|
| for example, with the mean-motion analysis -- you can see how
| subjects move over a period of time, which helps determine where
| important things are happening in the video and ultimately leads
| to better placement of edits.
| anthonySs wrote:
| As a creator who films long form content, editing (specifically
| clipping for short form) is such a nightmare - this solves such a
| huge problem and the ui is insanely clean.
|
| Will be using this a ton in the future
| adishj wrote:
| great to hear -- I'd recommend using the clips tile to create
| clips, but you can also use the rough cut tile to help edit
| down the raw footage for the long-form
| moinism wrote:
| Hey, this is super cool. congrats on the product and the launch!
|
| I'm building something very similar and couldn't believe my
| eyes when I saw the HN post. What I'm building (chatoctopus.com)
| is more like a chat-first agent for video editing, only at a
| prototype stage. But what you guys have achieved is insane.
| Wishing you lots of success.
|
| to healthy competition!
| adishj wrote:
| thank you! chatoctopus looks pretty cool, I'm trying it out
| right now!
|
| how did you find the chat-first interface to work out for
| video? what we found is that the response times can be so long
| that the chat UX breaks down a bit. how are you thinking about
| this?
| adishj wrote:
| looks like I got a network error
| heyyfurqan wrote:
| Damn this is good.
| adishj wrote:
| Thank you! :)
| sashagoncharov wrote:
| best of luck guys!!
| adishj wrote:
| thank you! let us know if you have any feedback!
| rishabhaiover wrote:
| When I see a hn post with no critical comments I assume all
| comments are either seeded or biased (commenting on my own bias)
| adishj wrote:
| scroll down and you'll see all the critical comments about the
| landing page lol
| supportengineer wrote:
| Submarining - well-known issue on HN.
| soperj wrote:
| it's hilarious how many have less than 5 karma.
| Tetraslam wrote:
| this is going to save me so much time, hell yeah guys!
| adishj wrote:
| thank you! let us know if you have any feedback!
| HanClinto wrote:
| I absolutely love your approach of "expert tools". If I
| understand your approach, you aren't just feeding a video into a
| multimodal LLM and asking it "what is the bounding box of the
| optimal caption region?" -- you have built tools with discrete
| algorithms (using traditional CV techniques) that use things like
| object detection boxes + traditional motion analysis techniques
| to give "expert opinions" to the LLM in the form of tool calls --
| such as finding the regions of minimal saliency + minimal
| movement to be the best places for caption placement.
|
| If the LLM needs to place captions, it calls one of these expert
| discrete-algorithm tools to determine the best place to put the
| captions -- you aren't just asking the LLM to do it on its own.
|
| If I'm correct about that, then I absolutely applaud you -- it
| feels like THIS is a fantastic model for how agentic tools should
| be built, and this is absolutely the opposite of AI slop.
|
| Kudos!
| adishj wrote:
| thanks for the comment, that's exactly right
|
| we're using a mix of out-of-the-box multimodal AI capability +
| traditional audio / video analysis techniques as part of our
| video understanding pipeline, all of which become context for
| the agent to use during its editing process
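|
| to make the "expert tool" shape concrete -- this is a simplified
| sketch, not our exact tool definitions -- the caption-placement
| analysis ends up looking to the agent like any other function it
| can call:
|
|     # Sketch: a discrete-algorithm "expert" exposed as a tool the
|     # agent can call, instead of asking the LLM to eyeball a
|     # bounding box on its own.
|     TOOL_SPEC = {
|         "name": "find_caption_safe_region",
|         "description": "Return the lowest-saliency, lowest-motion "
|                        "rectangle in a clip for caption placement.",
|         "parameters": {
|             "type": "object",
|             "properties": {
|                 "clip_id": {"type": "string"},
|                 "start": {"type": "number"},
|                 "end": {"type": "number"},
|             },
|             "required": ["clip_id"],
|         },
|     }
|
|     def find_caption_safe_region(clip_id, start=None, end=None):
|         # run the traditional CV analysis (saliency + motion) and
|         # return plain data for the agent to reason over (stubbed here)
|         return {"x": 64, "y": 540, "width": 512, "height": 120}
|
| the agent decides *when* captions belong; the tool decides *where*.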
| supportengineer wrote:
| Can we stop with the overloaded names? "Mosaic" is a well-known
| web browser.
| adishj wrote:
| naming is hard
|
| our original name was Frame, only to realize that frame.io
| existed already.
|
| we brainstormed names for a while and had several notes full of
| possible names
|
| mosaic is one that stood out to us because it not only evokes
| artwork, but also the way the tiles (nodes) in the canvas come
| together to form your mosaic -- we thought that was a fitting
| name
| dang wrote:
| " _Please don 't complain about tangential annoyances--e.g.
| article or website formats, name collisions, or back-button
| breakage. They're too common to be interesting._"
|
| https://news.ycombinator.com/newsguidelines.html
| filkny wrote:
| This is one of those ideas that seems obvious after you hear
| about it, yet somehow didn't exist yet. So many potential
| applications. Met the founder back in SF and he's one of the
| coolest, down to earth dudes there is. Best of luck to the team!
| adishj wrote:
| thank you so much for the kind words!
| camcaine wrote:
| Agree this looks very promising.
| adishj wrote:
| thank you! if you get a chance to try it, let me know if you
| have any feedback
| dakshbhatia wrote:
| You can see the care in every little decision, workflow, and
| feature -- I've never had this much fun editing videos.
|
| I didn't expect great video editing to become democratized so
| quickly. Kudos to the team!!
|
| - a happy customer
| homeonthemtn wrote:
| These comments real sus.
| adishj wrote:
| i agree, things are a bit too kind. give me some more feedback.
| primitivesuave wrote:
| Last year, I made a YouTube documentary series showcasing the
| prolific corruption in a small city government. I downloaded all
| the city government meetings, used Whisper to transcribe them,
| and then set up a basic RAG so I could query across a decade of
| committee meetings (around 1 TB of video). Once I had the
| timestamps I was interested in, I then had to embark on a tedious
| manual process of locating the file, cutting out a few
| seconds/minutes from a multi-hour video, and then ordering all the
| clips into a cohesive narrative.
|
| These seem like problems that LLMs are especially well-suited
| for. I might have spent a fraction of the time if there was some
| system that could "index" my content library, and intelligently
| pull relevant clips into a cohesive storyline.
|
| I also spent an ungodly amount of time on animations - it felt
| like "1 hour of work for 1 minute of animation". I would gladly
| pay for a tool which reduces the time investment required to be a
| citizen documentarian.
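|
| (For anyone curious, the Whisper step is roughly this shape -- a
| simplified sketch with made-up filenames and query, not my exact
| pipeline:)
|
|     # Sketch: transcribe a meeting and keep timestamped segments so
|     # a later query can point back into the source video.
|     import whisper
|
|     model = whisper.load_model("base")
|     result = model.transcribe("meeting_2019_06_04.mp4")
|     index = [
|         {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
|         for seg in result["segments"]
|     ]
|     # even a naive keyword search already yields clip candidates
|     hits = [s for s in index if "change order" in s["text"].lower()]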
| adishj wrote:
| hey, thanks for sharing about your documentary series. would
| love to check it out if you don't mind linking it!
|
| we don't yet support that volume of footage (1TB), however if
| you'd like to try this at a smaller scale, you can already do
| this today with the Rough Cut tile -- simply prompt it for the
| moments that you're interested in (it can take visual cues,
| auditory cues, timestamp cues, script cues) and it will create
| an initial rough cut or assembly edit for you.
|
| I'd also recommend checking out the new Motion Graphics tile we
| added for animations. You can also single-point generate motion
| graphics using the utility on the bottom right of the timeline.
| Let me know if you have any questions on that.
| kul wrote:
| Can it work for this use-case? I have lots of short videos (15
| seconds to 1 min) of my kids and want to upload them all (let's
| say 10 videos) and have the agent make a single video with all
| the best bits of them?
| adishj wrote:
| yes! you can upload as many videos as you want (file limits
| currently are at 20GB and 90 minutes, per file). then I'd
| recommend using either the Rough Cut tile or the Montage tile
| to stitch them all together. In those tiles, you can prompt
| particular visual cues in terms of how you want the videos to
| be combined. Let me know if any questions.
| news4abhi wrote:
| Been following this team from the early days. Amazing founder
| story, even better product. Just what people need today
| nrhrjrjrjtntbt wrote:
| Loom for Loom?
| adishj wrote:
| loom is focused on screen recordings / demos
___________________________________________________________________
(page generated 2025-11-19 23:00 UTC)