[HN Gopher] How to build a working AI only using synthetic data ...
       ___________________________________________________________________
        
       How to build a working AI only using synthetic data in just 5
       minutes
        
       Author : jetbiscuits
       Score  : 69 points
       Date   : 2022-06-13 11:43 UTC (11 hours ago)
        
 (HTM) web link (www.danrose.ai)
 (TXT) w3m dump (www.danrose.ai)
        
       | m3kw9 wrote:
        | Training image models today is the equivalent of launching a
        | web server with one line of code.
        
       | UncleOxidant wrote:
        | I've worked in the AI space, but mostly on the backend tweaking
        | algorithms, so I guess I'm asking this as (mostly) a layman:
        | isn't using AI to generate training data to train another AI
        | fraught with peril?
        
       | mactournier wrote:
       | Don't waste your time. There are so many statistical atrocities
       | in this article that it makes me shiver.
        
         | kadoban wrote:
         | Where? I don't see any statistics in the article.
        
           | mdp2021 wrote:
           | Probably meant that using synthetic data is recycling within
           | a closed system.
        
             | kadoban wrote:
              | I guess that's possible, but it's not clear to me that it
              | matters. It'll still work, right? Why do we care?
        
       | Imnimo wrote:
       | >In a short while, it has gone from being an experimental
       | technology. To something, I would hesitate to use for production
       | AI solutions.
       | 
       | What?
        
       | Liveanimalcams wrote:
       | Have you seen Roboflow Universe? They have thousands of projects
       | from their community already labeled for download and use. I
        | always start there when I want to start a new model. Recently
        | for my trash-picking robot I found a good litter dataset to
       | start from. https://universe.roboflow.com/
        
       | YeGoblynQueenne wrote:
       | Title: "How to build a working AI (...)".
       | 
       | Text: How to build an apple-or-banana image classifier.
       | 
       | Ah. I see. It's an allegory for the state of modern machine
       | learning research.
        
       | drcode wrote:
       | Interesting post, but it seems fragile enough that I'm not sure
       | it could work if you try to classify a yellow apple.
        
       | mudrockbestgirl wrote:
       | I think a more appropriate title would be: How to click an upload
       | data button on a website.
        
       | [deleted]
        
       | axpy906 wrote:
        | And... I am still waiting with bated breath for something like
        | this that would work for tabular data.
        
       | master_yoda_1 wrote:
        | First define what "working AI" even means. And be careful: AI is
        | not an iPhone app; you will not find any funding for BS in this
        | environment.
        
       | towaway15463 wrote:
       | I'm curious why you would need to generate the images using
       | another AI. There are masses of free high quality 3d models out
       | there and even if you can't find the model you need you could
       | always use photogrammetry to create one from a real world
        | example. After you have the subject you can render it in a game
        | engine like Unreal Engine from as many perspectives and under as
        | many lighting conditions as you want. Since the engine is
        | programmable you could even automate this part to a large
        | degree, and adding other confounding objects or backgrounds
        | would be simple as well.
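
        The camera sweep described above is engine-agnostic. As a minimal
        sketch - assuming only that the renderer accepts (x, y, z) camera
        positions, and with the 2.0-unit radius chosen arbitrarily - here
        is one way to sample poses uniformly on a sphere around the
        subject:

```python
import math
import random

def sample_camera_poses(n, radius=2.0, seed=0):
    """Sample n camera positions uniformly on a sphere around the subject."""
    rng = random.Random(seed)
    poses = []
    for _ in range(n):
        # Uniform point on the unit sphere: z uniform in [-1, 1],
        # azimuth uniform in [0, 2*pi) (inverse-CDF method).
        z = rng.uniform(-1.0, 1.0)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        poses.append((radius * r * math.cos(theta),
                      radius * r * math.sin(theta),
                      radius * z))
    return poses

poses = sample_camera_poses(100)
```

        Each pose would then be handed to the engine's scripting API to
        place the camera and trigger a render.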
        
         | ollysb wrote:
         | What's really interesting with 3d models is that you could give
          | the classifier motor control within its environment, i.e. it
          | could "step to the left" and see how the banana changes in its
          | visual field. This would allow the classifier to integrate its
         | movement with the visual stimulus and build a far richer model.
         | It also gives the classifier access to an active learning
         | strategy: predict how the appearance of the banana changes when
         | it "steps to the left", try it, then evaluate the difference
         | and refine.
        
       | tehsauce wrote:
       | It should not take you 5 minutes to make an image classifier in
        | 2022; 30 seconds is a more reasonable amount of time. DALL-E is
        | trained using CLIP, which you can just use as a zero-shot
        | classifier directly - no need to waste time generating images or
        | training a model at all. Just type in the names or descriptions
        | of your classes and you're done! Way easier than this :)
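
        The zero-shot scoring CLIP does reduces to comparing one image
        embedding against one text embedding per class label. A minimal
        sketch of that math, using toy stand-in vectors (real embeddings
        come from CLIP's image and text encoders, and the temperature is
        an illustrative value, not CLIP's learned one):

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, temperature=100.0):
    """Score an image embedding against one text embedding per class."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)       # scaled cosine similarities
    probs = np.exp(logits - logits.max())    # numerically stable softmax
    return probs / probs.sum()

# Toy stand-in vectors; real ones come from CLIP's encoders.
image_emb = np.array([1.0, 0.2, 0.0])
text_embs = np.array([[0.9, 0.1, 0.0],   # "a photo of an apple"
                      [0.0, 0.1, 1.0]])  # "a photo of a banana"
probs = zero_shot_classify(image_emb, text_embs)
```

        The predicted class is simply `probs.argmax()`; no training step
        is involved, which is the commenter's point.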
        
         | minimaxir wrote:
         | Normally, this type of comment is Hacker News reductiveness,
         | but yes, image classification via CLIP is that easy, especially
         | with Hugging Face's API for it:
         | https://huggingface.co/docs/transformers/model_doc/clip
         | 
         | I created a Python package to generate image embeddings from
         | CLIP's vision model without requiring a ML framework
         | (https://github.com/minimaxir/imgbeddings ), and a simple
         | linear classifier on those embeddings does the trick, demo
         | here:
         | https://github.com/minimaxir/imgbeddings/blob/main/examples/...
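
          The "simple linear classifier on those embeddings" step looks
          roughly like the sketch below - with random, well-separated
          stand-in vectors in place of real imgbeddings output (cluster
          locations and the 512-d size are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins for image embeddings: two well-separated 512-d clusters.
apple_embs = rng.normal(loc=1.0, size=(20, 512))
banana_embs = rng.normal(loc=-1.0, size=(20, 512))
X = np.vstack([apple_embs, banana_embs])
y = np.array([0] * 20 + [1] * 20)   # 0 = apple, 1 = banana

# A plain logistic regression on the embedding vectors.
clf = LogisticRegression(max_iter=1000).fit(X, y)
```

          With real embeddings you would swap the random arrays for the
          vectors the embedding model produces; the classifier code is
          unchanged.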
        
         | version_five wrote:
         | How big of a model is CLIP? If you're building a phone app that
         | classifies dogs, you may not want to require that it runs some
         | multi-billion parameter monstrosity to perform its
         | comparatively simple task. There is lots of value in building a
         | compact model. "Just" typing in the names ignores the compute
         | you need to have behind the scenes.
        
           | tehsauce wrote:
            | There are various sizes of CLIP; many are not enormous. For
            | example, one of the base models is just a standard
            | ResNet-50, so it's very usable on a mobile device.
        
           | minimaxir wrote:
           | A compact model is a constraint that changes the problem
           | entirely and doesn't discredit the quick-but-effective
           | approach that works for nearly every other use case.
        
         | genewitch wrote:
          | Are you saying that DALL-E is impressive? On the fediverse
          | it's used for jokes and memes, because it's really, uh, ugly?
          | Simplistic? It uses obvious components in each image. To my
          | eye, it looks like trickery. Maybe the "full, paid,
          | commercial" model and outputs are better; I'm not sure.
         | 
          | I'm actually looking for a decent classifier / object
          | recognition platform to coarsely sort on the order of millions
          | of images - as it stands, all of the ones I've tried can't
          | determine whether an image is drawn/painted or a photograph,
          | for instance, which reduces my enthusiasm for the whole field.
          | 
          | On the other hand, audio AI/ML stuff - such as Spleeter -
          | impresses me, as I can't do that stuff by hand.
        
           | ShamelessC wrote:
           | The original DALL-E was never released. This is a smaller
           | model made by volunteers.
           | 
           | Did you consider using CLIP like parent comment said?
        
         | lumost wrote:
          | I've heard such claims for a long time; I can likewise create
          | a classifier out of a simple dice roll. That doesn't say
          | anything about how good it is.
          | 
          | Most software applications have a low tolerance for errors,
          | and the ones that do tolerate them have big money being spent
          | on ensuring their accuracy is better than everybody else's.
          | 
          | So while you can make a classifier out of anything in Y time,
          | that doesn't say anything about whether it's of any practical
          | use.
        
           | minimaxir wrote:
            | No, CLIP is indeed that good. The robustness of its
            | embeddings is the entire reason why VQGAN+CLIP works and can
            | stably generate images close to the text prompt.
        
             | lumost wrote:
              | I'm sure it's better than a dice roll, but does it beat a
              | modern classifier trained on domain-specific data?
              | 
              | EDIT: I raise this issue because over-promising is the
              | death knell for software. Overpromising capability leads
              | to disappointment.
        
               | ShamelessC wrote:
               | It tends to, yes. I suggest reading the paper as they
               | discuss this very thing in detail.
        
               | minimaxir wrote:
               | See the zero-shot performance of CLIP:
               | https://openai.com/blog/clip/
               | 
                | It's definitely better than what you'd get in 5 minutes
                | from more conventional approaches.
        
       ___________________________________________________________________
       (page generated 2022-06-13 23:01 UTC)