[HN Gopher] Mistral releases Pixtral 12B, its first multimodal m...
___________________________________________________________________
Mistral releases Pixtral 12B, its first multimodal model
Author : jerbear4328
Score : 75 points
Date : 2024-09-11 19:47 UTC (3 hours ago)
(HTM) web link (techcrunch.com)
(TXT) w3m dump (techcrunch.com)
| azinman2 wrote:
| I'd love to know how much money Mistral is taking in versus
| spending. I'm very happy for all these open weights models, but
| they don't have Instagram to help pay for it. These models are
| expensive to build.
| candiddevmike wrote:
| No license with this one yet, though you can probably assume
| it's Apache like the others.
| mdasen wrote:
| The article says they confirmed it's Apache via email
| edude03 wrote:
| 12B is pretty small, so I doubt it'll be anywhere close to
| InternVL2. However, Mistral does great work, and this model is
| likely still useful for on-device tasks.
| Jackson__ wrote:
| It appears to be slightly worse than Qwen2VL 7B, a model almost
| half its size, if you look at Qwen's official benchmarks
| instead of Mistral's.
|
| https://xcancel.com/_philschmid/status/1833954941624615151
| ChrisArchitect wrote:
| Related earlier:
|
| _New Mistral AI Weights_
|
| https://news.ycombinator.com/item?id=41508695
| buran77 wrote:
| The "Mistral Pixtral multimodal model" really rolls off the
| tongue.
|
| > It's unclear which image data Mistral might have used to
| develop Pixtral 12B.
|
| The days of free web scraping, especially for the richer sources
| of material, are almost gone, with everything from technical (API
| restrictions) to legal (copyright) measures building deep moats.
| I also wonder what they trained it on. They're not Meta or Google
| with endless supplies of user content, or exclusive contracts
| with the Reddits of the internet.
| simonw wrote:
| What do you mean by copyright measures? Has anything changed on
| that front in the last two years?
|
| My hunch is that most AI labs are already sitting on a pretty
| sizable collection of scraped image data - and that data from
| two years ago will be almost as effective as data scraped
| today, at least as far as image training goes.
| dartos wrote:
| The issue with image models is that their style becomes
| identifiable and stale quite quickly, so you'll need a fresh
| intake of different, newer styles every so often, and that's
| going to be harder and harder to get.
| Flockster wrote:
| Could this be used for a self-hosted handwritten text recognition
| instance?
|
| Like writing on an ePaper tablet, exporting the PDF, and feeding it
| into this model to extract todos from notes, for example.
|
| Or what would be the SotA for this application?
___________________________________________________________________
(page generated 2024-09-11 23:00 UTC)