[HN Gopher] Mistral releases Pixtral 12B, its first multimodal model
       ___________________________________________________________________
        
       Mistral releases Pixtral 12B, its first multimodal model
        
       Author : jerbear4328
       Score  : 75 points
       Date   : 2024-09-11 19:47 UTC (3 hours ago)
        
 (HTM) web link (techcrunch.com)
 (TXT) w3m dump (techcrunch.com)
        
       | azinman2 wrote:
       | I'd love to know how much money Mistral is taking in versus
       | spending. I'm very happy for all these open weights models, but
       | they don't have Instagram to help pay for it. These models are
       | expensive to build.
        
         | candiddevmike wrote:
         | No license with this one yet, though you can probably assume
         | it's Apache like the others.
        
           | mdasen wrote:
            | The article says they confirmed it's Apache via email.
        
       | edude03 wrote:
        | 12B is pretty small, so I doubt it'll be anywhere close to
        | InternVL2. However, Mistral does great work, and this model
        | is likely still useful for on-device tasks.
        
         | Jackson__ wrote:
          | It appears to be slightly worse than Qwen2VL 7B, a model
          | almost half its size, if you look at Qwen's official
          | benchmarks instead of Mistral's.
         | 
         | https://xcancel.com/_philschmid/status/1833954941624615151
        
       | ChrisArchitect wrote:
       | Related earlier:
       | 
       |  _New Mistral AI Weights_
       | 
       | https://news.ycombinator.com/item?id=41508695
        
       | buran77 wrote:
       | The "Mistral Pixtral multimodal model" really rolls off the
       | tongue.
       | 
       | > It's unclear which image data Mistral might have used to
       | develop Pixtral 12B.
       | 
        | The days of free web scraping, especially for the richer
        | sources of material, are almost gone, with everything from
        | technical measures (API restrictions) to legal ones
        | (copyright) building deep moats. I also wonder what they
        | trained it on. They're not Meta or Google with endless
        | supplies of user content, or exclusive contracts with the
        | Reddits of the internet.
        
         | simonw wrote:
         | What do you mean by copyright measures? Has anything changed on
         | that front in the last two years?
         | 
         | My hunch is that most AI labs are already sitting on a pretty
         | sizable collection of scraped image data - and that data from
         | two years ago will be almost as effective as data scraped
         | today, at least as far as image training goes.
        
           | dartos wrote:
            | The issue with image models is that their style becomes
            | identifiable and stale quite quickly, so you'll need a
            | fresh intake of different, newer styles every so often,
            | and that's going to be harder and harder to get.
        
       | Flockster wrote:
        | Could this be used for a self-hosted handwritten text
        | recognition instance?
        | 
        | Like writing on an ePaper tablet, exporting the PDF, and
        | feeding it into this model to extract todos from notes, for
        | example.
        | 
        | Or what would be the SotA for this application?
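The pipeline described above (rasterize a PDF page to an image, send it to a vision-language model, parse the reply for todos) could be sketched roughly as below. This is a hypothetical illustration, not Mistral's documented API: it assumes an OpenAI-style multimodal chat schema that accepts base64 data-URL images (as served by, e.g., OpenAI-compatible inference servers), and the function name and prompt are made up for the example.

```python
import base64

def build_todo_extraction_request(image_bytes: bytes,
                                  model: str = "pixtral-12b") -> dict:
    """Build an OpenAI-style multimodal chat payload (hypothetical
    schema) asking a vision-language model to transcribe a handwritten
    note and list its todo items.

    image_bytes: raw PNG bytes of one rasterized PDF page, e.g. produced
    by a PDF-to-image tool.
    """
    # Encode the page image as a base64 data URL, the common way to
    # inline images in OpenAI-style chat requests.
    data_url = "data:image/png;base64," + \
        base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe this handwritten note and list "
                         "any todo items as a plain bullet list."},
                {"type": "image_url",
                 "image_url": {"url": data_url}},
            ],
        }],
    }

# Usage sketch: rasterize each PDF page to PNG bytes, POST this payload
# to a self-hosted inference server's chat endpoint, and collect the
# bullet lists from the replies.
```

Whether a general 12B vision-language model beats a dedicated handwritten-text-recognition model on this task is an open question the commenter is asking, not something this sketch settles.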
        
       ___________________________________________________________________
       (page generated 2024-09-11 23:00 UTC)