[HN Gopher] VideoPrism: A foundational visual encoder for video ...
       ___________________________________________________________________
        
       VideoPrism: A foundational visual encoder for video understanding
        
       Author : kmdupree
       Score  : 82 points
       Date   : 2024-05-09 13:32 UTC (9 hours ago)
        
 (HTM) web link (research.google)
 (TXT) w3m dump (research.google)
        
       | adzm wrote:
       | YouTube is going to be a gold mine for training data.
        
         | hrdwdmrbl wrote:
         | is a gold mine*
         | 
         | present tense :)
        
         | hwbunny wrote:
          | And since everyone turns their own profession into pennies by
          | uploading the know-how of their craft, it will properly wreck
          | entire industries once the AI+Robotics marriage takes off.
        
           | devwastaken wrote:
            | Video describing practices does not translate to AI
            | performing those practices. Creating a community of knowledge
            | advances that knowledge. Sitting on it only benefits a few
            | and shrinks the market. Without people sharing their work on
            | YouTube, their own market would be far smaller.
        
       | vessenes wrote:
       | So frustrating.
       | 
       | "We hope VideoPrism paves the way for future breakthroughs at the
       | intersection of AI and video analysis, helping to realize the
       | potential of ViFMs across domains such as scientific discovery,
       | education, and healthcare."
       | 
       | ... for those of us internal to Google Research, and possibly
       | large corporate clients when we choose.
       | 
       | Custom internal dataset, no model weights, no API access, just
       | "FYI, this works pretty well when you have our data."
       | 
        | I do genuinely appreciate the publishing of successful
        | architectures, and VP looks like it has some nice ideas, and it
        | is very useful to know that training such a thing is probably
        | not a dead end. So, for all that, thanks.
       | 
       | At the end of the day, with large competitors committed to open
       | models/weights, I think momentum is on the side of needing to
        | push out open weights at least, but my guess is GOOG will be the
        | last of the big tech cos to make that transition. I could
       | understand it as a business decision if they were quickly opening
       | this stuff up on their cloud, and competing that way, but they
       | still seem to be in this world where the DeepMind crew continues
       | to push out novel and interesting research projects, and the
       | cloud crew struggles to deliver a competitive LLM at the same
       | time.
       | 
        | I wonder where they'll be in a year. I'd like to hope they'll be
        | up to competitive standards so that we've got an additional
        | player, but progress seems slow to me right now.
        
         | bingbingbing777 wrote:
         | Why are you so frustrated that a company decided to not
         | completely release something they have built internally?
        
           | GaggiX wrote:
           | Because it's presented as research, but the results are not
           | reproducible without at least information about the dataset.
        
             | nolok wrote:
              | Well, Google did that for years, and where did that get
              | them? To companies using their research to build things
              | while not sharing their own research to the same extent,
              | so they could keep a leg up.
             | 
             | So yeah, they learned their lesson. You want to complain,
             | complain to those who did not play the game fair and
             | square.
        
               | GaggiX wrote:
               | I'm not complaining, Google can do whatever they want,
               | it's just not proper research, even though they want it
               | to feel like it is.
        
             | fngjdflmdflg wrote:
             | It is reproducible with the dataset though. (Or more
             | accurately _may_ be reproducible). It is important to
             | distinguish between  "reproducibility" meaning "the extent
             | to which consistent results are obtained when an experiment
             | is repeated" and "the ability to be reproduced or copied."
             | Only the former is necessary for an experiment to be
              | considered reproducible. It's just that it may be difficult
              | to actually run the experiment, although certainly not
              | impossible, at least if we consider using a similar
              | alternative dataset, as it's really the code being tested
              | here. But in any event I think it qualifies as publishing
              | research all the same. Also note that research is still
              | research even if it is not published.
        
               | TeMPOraL wrote:
               | FWIW, reproducibility is like backups: if you haven't
               | tested restoring from a backup, you don't actually have
               | one. Similarly, research on proprietary and non-disclosed
               | data, using proprietary and non-disclosed techniques,
               | that can only be reproduced if you manage to cut a deal
               | with the company behind it (if it's even possible),
               | should be called what it is: _marketing_.
        
               | fngjdflmdflg wrote:
               | An experiment being reproducible just means that if you
               | repeat the experiment you will get the same outcome. We
               | don't know if this experiment is reproducible or not
               | because nobody has tried to reproduce it. I think if this
               | paper gets citations, that will show that other people
               | have read the paper[0] and gained something from it. So
                | we can just wait and see if this was useful for other
                | people.
               | 
               | [0] https://arxiv.org/abs/2402.13217 this, not the blog
        
           | vessenes wrote:
           | GaggiX has it mostly right below. But, I'm frustrated because
           | I'd like to try this out. And, by try it out, any of these
           | would be fine:
           | 
           | 1. Download the datasets and train a small version to get a
           | feel for it.
           | 
           | 2. Download the models and deploy them to use it and get a
           | feel for it.
           | 
            | 3. Talk to it via a paid API.
           | 
           | Why do I want to do that? I'd like to know how capable this
           | architecture is, and get a feel for how capable it could be
           | with different data / scale up / fine tuning, etc.
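            | 
            | As a rough illustration of what "getting a feel for it" means
            | here: score pooled per-frame video embeddings against text
            | embeddings for zero-shot retrieval. The arrays below are
            | random placeholders standing in for whatever a released
            | video/text encoder pair would produce; nothing here is the
            | actual VideoPrism model or any real API.
            | 
            |     import numpy as np
            | 
            |     # Placeholder features; real ones would come from a
            |     # released video encoder and its paired text encoder.
            |     rng = np.random.default_rng(0)
            |     num_videos, num_frames, dim = 4, 16, 768
            |     frame_emb = rng.normal(size=(num_videos, num_frames, dim))
            |     text_emb = rng.normal(size=(num_videos, dim))
            | 
            |     # Mean-pool frames into one embedding per video, then
            |     # L2-normalize both sides so dot products are cosines.
            |     video_emb = frame_emb.mean(axis=1)
            |     video_emb /= np.linalg.norm(video_emb, axis=-1, keepdims=True)
            |     text_emb /= np.linalg.norm(text_emb, axis=-1, keepdims=True)
            | 
            |     # Video-to-text retrieval: best-matching caption index
            |     # for each video from the cosine-similarity matrix.
            |     sims = video_emb @ text_emb.T
            |     print(sims.argmax(axis=1))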
        
           | yterdy wrote:
            | Because, half a century ago, it would have been built by a
            | government research agency or a designated monopoly that was
            | obliged to share it with the public, instead of, as today, by
            | a quasi-monopoly that can keep secret whatever it needs to
            | wreck your sh*t.
           | 
           | We know a better way to do this ("this" being "the
           | development of foundational technology for the next century
           | of human civilization"), so of course it's frustrating to see
           | the way it's actually being done.
        
       ___________________________________________________________________
       (page generated 2024-05-09 23:00 UTC)