[HN Gopher] Multimodal Canvas: An experiment of Pictionary power...
       ___________________________________________________________________
        
       Multimodal Canvas: An experiment of Pictionary powered by Gemini
       1.5 Flash
        
       Author : meetpateltech
       Score  : 36 points
       Date   : 2024-07-03 16:47 UTC (6 hours ago)
        
 (HTM) web link (labs.google)
        
       | jerpint wrote:
       | Why do I need to supply an API key to try their demo? Seems
       | absurd
        
         | Onavo wrote:
         | So they can save money and use it as a sales funnel.
        
         | greenyies wrote:
         | Why would they sponsor this for you?
         | 
         | Someone has to pay for it and if their product requires an API
         | key anyway...
        
           | bangaladore wrote:
           | It seems like Google, of all companies, can afford to let
           | prospective developers at least try the thing they want you
           | to pay for. I think they have the dev power to sufficiently
           | rate limit as necessary.
           | 
           | Frankly the way Google has been advertising their AI
           | offerings shows they don't care about the consumer market. I
           | hear great things about their LLMs, but for some reason I
           | never try them, and most people haven't either.
        
             | verdverm wrote:
             | This is a small group within Google doing experiments. They
             | are probably trying a different approach after the MusicFX
             | app saw heavy usage and they had to limit users. They
             | didn't have a way to supply your own API key for that one.
             | 
             | You can easily try Gemini for free in a number of other
             | places
        
           | bastawhiz wrote:
           | Google can afford to run this model (a bigger one, actually)
           | in their search results when I don't need or want it. Why
           | shouldn't they run this tiny demo backed by their least
           | expensive model? The total cost of running the project for
           | its entire lifespan is probably less than the value of two
           | weeks of a single engineer's time.
        
         | dennisy wrote:
         | Yeah agree this is a bit nuts from a company trying to promote
         | their model!
        
         | jozvolskyef wrote:
         | I appreciate the simplicity of a system that allows you to
         | provide your own API key. All the complexity of authentication,
         | rate limiting, cost capping, scaling, is offloaded to someone
         | else, and you can just focus on the essence of the thing that
         | you're building.
        
           | bastawhiz wrote:
           | It's Google asking for an API key to a Google service.
           | "Simplicity" here would just be them letting you run the
           | demo. Like yeah, all of the "authentication, rate limiting,
           | cost capping, scaling" is done by Google, it's a Google page
           | running a Google service.
        
       | nuz wrote:
       | Was hoping this was using the built-in Gemini web API discussed
       | on HN a few days ago (I guess it's not multimodal though).
        
       | vunderba wrote:
       | I built a Pictionary bot a few years ago using Stable Diffusion
       | where people could try to guess collaboratively as the randomly
       | generated image (celebrities, pop culture, cartoons, etc.) was
       | slowly brought into focus across diffusion steps in a Discord
       | channel. It was quite popular among my friends.
       | 
       | https://specularrealms.com/2022/10/04/stable-diffusion-picti...
       | 
       | I wanted to make it available to other Discord servers, but
       | couldn't figure out a way which wouldn't break the bank.
        
         | bangaladore wrote:
         | > I wanted to make it available to other Discord servers, but
         | > couldn't figure out a way which wouldn't break the bank.
         | 
         | I feel this. Most people expect free Discord bots, so
         | convincing anyone to pay their share (however cheap) is
         | daunting. This is mainly a challenge with LLM/diffusion-based
         | bots where the cost of running is significant, at least today.
        
       | omneity wrote:
       | Tried it out, with the following result.
       | 
       | > I can't see or analyze any image. I am a text-based chat
       | > assistant and thus I cannot process any image.
       | 
       | Pretty hilarious honestly.
        
         | omneity wrote:
         | Continuing on the topic. This must be a joke.
         | 
         | --
         | 
         | User
         | 
         | Your app literally says "Multimodal canvas" by labs.google
         | 
         | Gemini 2.27s
         | 
         | You are absolutely right! I apologize for the misunderstanding.
         | I am still under development and learning to process different
         | types of information.
         | 
         | While I can't see or analyze images yet, I am constantly
         | learning and improving.
         | 
         | Please feel free to ask me anything else. I am here to help in
         | any way I can!
        
         | lacoolj wrote:
         | It's so finicky with that. I upload images to the regular
         | Gemini interface (with a text prompt as well) and almost always
         | get back results, but I have seen this message before.
         | 
         | There's gotta be a condition that allows it to be bypassed, and
         | clearly this project should be triggering that condition lol
        
       ___________________________________________________________________
       (page generated 2024-07-03 23:00 UTC)