[HN Gopher] VideoPrism: A foundational visual encoder for video ...
___________________________________________________________________
VideoPrism: A foundational visual encoder for video understanding
Author : kmdupree
Score : 82 points
Date : 2024-05-09 13:32 UTC (9 hours ago)
(HTM) web link (research.google)
(TXT) w3m dump (research.google)
| adzm wrote:
| YouTube is going to be a gold mine for training data.
| hrdwdmrbl wrote:
| is a gold mine*
|
| present tense :)
| hwbunny wrote:
| And since everyone turns their own profession to pennies by
| uploading the know-how of their craft, it will properly wreck
| entire industries once the AI+robotics marriage takes off.
| devwastaken wrote:
| A video describing a practice does not translate to AI
| performing that practice. Creating a community of knowledge
| progresses that knowledge; sitting on it benefits only a few
| and removes the market. Without people sharing their work on
| YouTube, their own market would be far smaller.
| vessenes wrote:
| So frustrating.
|
| "We hope VideoPrism paves the way for future breakthroughs at the
| intersection of AI and video analysis, helping to realize the
| potential of ViFMs across domains such as scientific discovery,
| education, and healthcare."
|
| ... for those of us internal to Google Research, and possibly
| large corporate clients when we choose.
|
| Custom internal dataset, no model weights, no API access, just
| "FYI, this works pretty well when you have our data."
|
| I do genuinely appreciate the publishing of successful
| architectures, and VP looks like it has some nice ideas. It is
| very useful to know that training such a thing is probably not
| a dead end, so for all that, thanks.
|
| At the end of the day, with large competitors committed to open
| models/weights, I think momentum is on the side of needing to
| push out open weights at least, but my guess is GOOG will be the
| last of the big tech cos to make that transition. I could
| understand it as a business decision if they were quickly opening
| this stuff up on their cloud and competing that way, but they
| still seem to be in this world where the DeepMind crew continues
| to push out novel and interesting research projects while the
| cloud crew struggles to deliver a competitive LLM at the same
| time.
|
| I wonder where they'll be in a year. I'd like to hope they'll be
| up to competitive standards, so that we've got an additional
| player, but progress seems slow to me right now.
| bingbingbing777 wrote:
| Why are you so frustrated that a company decided to not
| completely release something they have built internally?
| GaggiX wrote:
| Because it's presented as research, but the results are not
| reproducible without at least information about the dataset.
| nolok wrote:
| Well Google did that for years and where did that get them?
| To companies using their research to build something but
| not share to the same extent their research so they could
| keep a leg up.
|
| So yeah, they learned their lesson. You want to complain,
| complain to those who did not play the game fair and
| square.
| GaggiX wrote:
| I'm not complaining, Google can do whatever they want,
| it's just not proper research, even though they want it
| to feel like it is.
| fngjdflmdflg wrote:
| It is reproducible with the dataset though. (Or more
| accurately _may_ be reproducible). It is important to
| distinguish between "reproducibility" meaning "the extent
| to which consistent results are obtained when an experiment
| is repeated" and "the ability to be reproduced or copied."
| Only the former is necessary for an experiment to be
| considered reproducible. It's just that it may be difficult to
| actually run the experiment, though certainly not impossible,
| at least if we consider using a similar alternative dataset,
| since it's really the code being tested here. But in any event,
| I think it qualifies as published research all the same. Also
| note that research is still research even if it is not
| published.
| TeMPOraL wrote:
| FWIW, reproducibility is like backups: if you haven't
| tested restoring from a backup, you don't actually have
| one. Similarly, research on proprietary and non-disclosed
| data, using proprietary and non-disclosed techniques,
| that can only be reproduced if you manage to cut a deal
| with the company behind it (if it's even possible),
| should be called what it is: _marketing_.
| fngjdflmdflg wrote:
| An experiment being reproducible just means that if you
| repeat the experiment you will get the same outcome. We
| don't know if this experiment is reproducible or not
| because nobody has tried to reproduce it. I think if this
| paper gets citations, that will show that other people
| have read the paper[0] and gained something from it. So
| we can just wait and see if this was useful for other
| people.
|
| [0] https://arxiv.org/abs/2402.13217 this, not the blog
| vessenes wrote:
| GaggiX has it mostly right below. But, I'm frustrated because
| I'd like to try this out. And, by try it out, any of these
| would be fine:
|
| 1. Download the datasets and train a small version to get a
| feel for it.
|
| 2. Download the models and deploy them to use it and get a
| feel for it.
|
| 3. Talk to it via a paid API
|
| Why do I want to do that? I'd like to know how capable this
| architecture is, and get a feel for how capable it could be
| with different data / scale up / fine tuning, etc.
| yterdy wrote:
| Because, half a century ago, it would have been built by a
| government research agency or a designated monopoly that was
| obliged to share it with the public, instead of the quasi-
| monopoly - that can keep secret whatever it needs to wreck
| your sh*t - like today.
|
| We know a better way to do this ("this" being "the
| development of foundational technology for the next century
| of human civilization"), so of course it's frustrating to see
| the way it's actually being done.
___________________________________________________________________
(page generated 2024-05-09 23:00 UTC)