[HN Gopher] Foundation for Human Vision Models
___________________________________________________________________
Foundation for Human Vision Models
Author : yoknapathawa
Score : 42 points
Date : 2024-08-24 16:04 UTC (6 hours ago)
(HTM) web link (github.com)
| yoknapathawa wrote:
| Vision transformer trained on 300M human images with state of the
| art results on a bunch of human tasks (keypoints, segmentation,
| depth, normals).
|
| Disclaimer: Co-author here.
| gimlids wrote:
| always curious what the license allows with these Meta research
| drops, seems all over the place... can this be used
| commercially? (specifically inference) it's creative commons
| and some parts apache?
| ElFitz wrote:
| The Creative Commons seems to be Non-Commercial [0], meaning
| it's very interesting and quite inspiring, but ultimately
| useless outside of research and side projects.
|
| The Apache parts seem to be dependencies.
|
| [0]: https://github.com/facebookresearch/sapiens/blob/main/LICENS...
| doctorpangloss wrote:
| > but ultimately useless outside of research and side
| projects.
|
| "Everything is useless unless it personally, financially
| benefits me."
| nickpsecurity wrote:
| I've seen papers that combined pre-trained vision and language
| models, trained them together on image/text pairs, and then
| used the new model for things like text extraction. Could your
| model be plugged into such a design?
|
| I've always wanted to scan whole books by just feeding pictures
| of their pages into an AI, preferably with minimal labeling
| requirements. I also see this as a way to generate
| more training data for language models from old cheap books. Do
| you think your model could help with that?
| ks2048 wrote:
| You might want to update the README where it says run
| "./conda.sh" - it should say there are hard-coded paths in this
| script that need to be changed (the first line is
| CONDA_BASE="/home/rawalk/anaconda3").
|
| I wonder if there is something here that requires conda and not
| a simple requirements.txt or something like that. Every time I
| try conda it seems to mess up my entire environment (usually I
| just use pyenv w/ virtualenv). But trying with conda now,
| keeping my fingers crossed...
|
| EDIT: yep, as usual, conda failed me. (fresh install of
| miniconda). "./conda.sh" finished with 0 exit code and said
| "Installation done!". Yet, now I have no new conda environment
| (I think I saw some warnings and errors deep in the logging
| output).
|
| I see now how this has various requirements.txt for the
| different sub-projects - looks like I'll try to create a pyenv-
| virtualenv and do things manually to try to get an example
| working...
| vessenes wrote:
| Um, this looks really, really good.
|
| Yo @yoknapathawa, can this be finetuned on an M3 chip? How much
| RAM is needed? What are the current low-hanging-fruit tasks
| you think the community could go at? What's latency like? I
| didn't see anything on the page / in the paper / github about
| speeds.
|
| I'm also curious about the classes you use for the segmentation
| task -- do you have a list of them somewhere?
|
| Finally, your generalization results are all on photorealistic
| images. Did you look at paintings / animation / other?
| I'm curious how broadly the generalization goes.
|
| As always, thank you for opening the weights.
| aithrowaway1987 wrote:
| The shadiness about Facebook's proprietary dataset of 300 million
| photos is concerning and should draw more attention. At the very
| least it is scientifically unacceptable - we should not high-five
| Big Tech researchers for intentionally unreproducible research.
| And if Meta is harvesting user photos for AI research and
| commercialization, they should tell their users about it directly
| (I am sure there is something buried in the TOS). Does the
| dataset include only public photos, or are Instagram DMs fair
| game? Does it include CSAM? Who cares!
|
| Serious question: who are the people in the illustrations they
| used in the paper?[1] Are they Facebook/Instagram users? Did the
| authors ask permission to use their photos for an arXiv
| publication? Including their kids? Meta researchers really should
| be answering questions like this before they are asked - but
| these authors didn't even include an impact statement!
|
| [1]: https://arxiv.org/abs/2408.12569
___________________________________________________________________
(page generated 2024-08-24 23:00 UTC)