[HN Gopher] Clojure Plays Mario
       ___________________________________________________________________
        
       Clojure Plays Mario
        
       Author : hotcrossbunny
       Score  : 64 points
       Date   : 2023-07-09 21:14 UTC (1 hour ago)
        
 (HTM) web link (blog.phronemophobic.com)
 (TXT) w3m dump (blog.phronemophobic.com)
        
       | hotcrossbunny wrote:
       | I can never look at Mario the same
        
       | CobrastanJorji wrote:
       | Y'know, the thing I least like about these AI video game players
       | is how unlike humans they look. I was wondering about the
       | difference, and I think it comes down to two parts. First and
       | foremost, human players generally prefer routes with a lot of
       | tolerance for input error. Second, humans frequently take "mental
       | planning breaks," stopping for a moment in safe spots before
       | challenging areas.
       | 
       | I think you could juggle the heuristics to demonstrate the
       | preference for input-error tolerance. For ML training, you could
       | just randomly vary input timing by up to 20ms or so to teach the
       | algorithm to favor safer moves. For path finding, it's trickier,
       | but there's probably a way to favor "wide" paths. I'm less sure
       | how to express the second concept, pausing briefly in "safe
       | areas," but I imagine it's maybe noticing a place where
       | significant amounts of entering no inputs does not affect the
       | results.
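
The timing-jitter idea above can be sketched on a toy problem. This is a minimal illustration, not anything from the blog post: the pit position, jump length, and one-tile-per-frame movement are all made-up assumptions, and instead of drawing random offsets it averages over every offset in the jitter window, which is the expected success rate under uniform random jitter.

```python
from fractions import Fraction

# All constants are hypothetical; they describe a toy level, not Mario.
PIT_START, PIT_END = 10, 13   # ground ends at x = 10, resumes at x = 13
JUMP_LENGTH = 5               # horizontal distance covered by one jump
MAX_JITTER = 1                # frames; about 17 ms at 60 fps


def clears_pit(takeoff_x):
    """A jump succeeds if it starts on solid ground and lands past the pit."""
    return takeoff_x <= PIT_START and takeoff_x + JUMP_LENGTH >= PIT_END


def robustness(planned_x, max_jitter=MAX_JITTER):
    """Fraction of jittered takeoff positions that still clear the pit.

    The player moves one tile per frame, so pressing jump d frames late
    shifts the takeoff position by d tiles.
    """
    offsets = range(-max_jitter, max_jitter + 1)
    hits = sum(clears_pit(planned_x + d) for d in offsets)
    return Fraction(hits, len(offsets))


def safest_takeoff(candidates):
    """Pick the candidate with the highest success rate under jitter."""
    return max(candidates, key=robustness)
```

With these numbers, takeoff positions 8, 9, and 10 all clear the pit when timing is perfect, but only position 9 survives every jittered press, so `safest_takeoff(range(6, 13))` picks the middle of the window rather than either edge.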
        
         | michaelteter wrote:
         | I don't think making humans comfortable is the goal with
         | respect to AI. The goal is to actually solve a problem.
         | Performance is second. Human comfort is a distant third or
         | beyond.
         | 
         | When AI can reliably solve a problem without significant
         | negative consequences from time to time, it's a win. How humans
         | feel about the method is effectively irrelevant.
        
           | pedrosorio wrote:
           | > I don't think making humans comfortable is the goal with
           | respect to AI
           | 
           | According to whom?
           | 
           | AI in games has historically been all about human
           | comfort/enjoyment. Extremely good AI that seems "unnatural"
           | to humans is usually not the goal.
        
             | tzot wrote:
             | You seem to talk about AI-controlled-NPCs in games, while
             | GP starts from the article context about AI-controlled-PCs
             | (player characters) and proceeds to generalize about using
             | AI to solve problems outside games.
        
               | rootw0rm wrote:
               | I think the point is that different people have different
               | goals for "AI"
        
           | oivey wrote:
           | The near miss behavior is very much like overfitting. Mario
           | is simple and deterministic enough that it doesn't matter,
           | but think about a scenario like a self driving car. A
           | calculated near miss turns into a crash if the other car's
           | driver is just a little slower or their tires a little
           | slicker than anticipated.
        
         | tnecniv wrote:
         | What you're basically describing is bounded rationality, which
         | has been widely studied in behavioral economics, psychology,
         | and engineering applications (Simon and Gigerenzer are two big
         | names to google). A common way to formalize it is as what boils
         | down to versions of rate-distortion problems from information
         | theory (very related to Bayesian statistics).
         | 
         | The reason it's of engineering interest is, as you observe, that
         | bounded rationality gives you solutions that are sub-optimal but
         | more robust and often simpler.
         | 
         | Moreover, finding wide path solutions emerges naturally from
         | sampling-based motion planners. These planners are
         | asymptotically optimal, but if you terminate them early, they
         | are more likely to give you a solution that goes through large
         | gaps, not smaller ones, because it's unlikely to sample a
         | trajectory that goes through a tight space without heavy
         | sampling. You could probably formulate that in the rate-
         | distortion framework but I haven't thought about how to do it
         | precisely.
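
The early-termination effect described above can be illustrated with a toy sampler. This is only a sketch under invented assumptions: it is a plain rejection sampler over one wall with two openings, not a real sampling-based planner such as RRT or PRM, and the gap sizes are hypothetical.

```python
import random

# Hypothetical wall with two openings along y in [0, 1].
WIDE_GAP = (0.10, 0.50)     # easy to hit by chance
NARROW_GAP = (0.90, 0.91)   # imagine this one gives the shorter path


def gap_hit(y):
    """Return which gap a crossing height y passes through, or None."""
    if WIDE_GAP[0] <= y <= WIDE_GAP[1]:
        return "wide"
    if NARROW_GAP[0] <= y <= NARROW_GAP[1]:
        return "narrow"
    return None


def first_feasible(rng):
    """Sample crossing heights until one is collision-free, then stop early."""
    while True:
        gap = gap_hit(rng.random())
        if gap is not None:
            return gap


def wide_fraction(trials=1000, seed=0):
    """Share of early-terminated runs whose first solution uses the wide gap."""
    rng = random.Random(seed)
    wins = sum(first_feasible(rng) == "wide" for _ in range(trials))
    return wins / trials
```

Because the wide gap covers 0.40 of the wall and the narrow one only 0.01, `wide_fraction()` should come out well above 0.9: a planner that stops at its first feasible sample almost always returns the wide-gap route, even if the narrow gap would be shorter.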
        
           | CobrastanJorji wrote:
           | Oh cool! It's always hugely useful to learn the word for the
           | thing you're thinking about. It can be really tricky to
           | figure out if a vague idea has a name unless you're already
           | pretty well read in a field. Now I have some reading to do,
           | thanks!
        
         | jbaber wrote:
         | I just couldn't bear all that walking and jumping into near
         | misses in the video.
        
       | dustingetz wrote:
       | emulator integration is
       | https://www.libretro.com/index.php/api/?amp=1 (warning, f'ed
       | website) and the clojure wrapper (by OP) is:
       | https://github.com/phronmophobic/clj-libretro/blob/67f186e87...
       | 
       | really cool!
        
       | anandvc wrote:
       | Can anyone help me understand this: How does the AI "see" what's
       | happening on the screen? Is there some sort of object/scene
       | recognition going on?
        
         | capableweb wrote:
         | Quick scan of the code (https://github.com/phronmophobic/clj-
         | libretro/blob/67f186e87...) seems to indicate that there is at
         | least some screenshotting going on, + serialization of game
         | state. What is used for detecting things, I couldn't find from
         | just skimming the code.
        
           | sbierwagen wrote:
           | I think the screenshots are just for assembling the time
           | lapse video.
           | 
           | Check out `(defn dist` or read the explanation of that
           | function in the blog post. It can't see enemies or pickups at
           | all. It can only tell how far away it is from the end of the
           | level, and if it's dead or not, then brute forces every
           | button combination for every frame. The "in progress" video
           | shows that it spends most of its time falling into pits, but
           | eventually makes it to the end.
           | 
           | A fun metric would be how many deaths it takes to complete a
           | given level. It's got to be in the tens of thousands.
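
The approach described above (only a distance and a dead/alive flag, every button combination tried per frame) can be sketched on a toy game. Everything here is a stand-in: the level, the two-button controller, and the jump physics are hypothetical, and the search is a simple greedy best-first, not the blog post's actual code.

```python
import heapq
from itertools import product

GOAL = 12                      # hypothetical x position of the flagpole
PITS = {3, 4, 8, 9}            # hypothetical pit tiles; landing here is death
BUTTONS = list(product([False, True], repeat=2))  # every (right, jump) combo


def step(state, buttons):
    """Advance the toy game one frame; returns (new_state, dead)."""
    x, air = state
    right, jump = buttons
    if air > 0:
        air -= 1               # airborne: count down toward landing
    elif jump:
        air = 2                # leave the ground for the next two frames
    if right:
        x += 1
    return (x, air), (air == 0 and x in PITS)


def dist(state):
    """The only signal the search sees besides death: distance to the goal."""
    return GOAL - state[0]


def solve(start=(0, 0)):
    """Best-first search that tries every button combo from each state."""
    frontier = [(dist(start), 0, start, [])]
    seen, deaths, tie = {start}, 0, 1
    while frontier:
        d, _, state, path = heapq.heappop(frontier)
        if d <= 0:
            return path, deaths
        for buttons in BUTTONS:
            nxt, dead = step(state, buttons)
            if dead:
                deaths += 1    # the metric suggested above
                continue
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (dist(nxt), tie, nxt, path + [buttons]))
                tie += 1
    return None, deaths
```

`solve()` returns the winning button sequence together with the number of deaths along the way. On a real level the state space is far larger, so the death count would be expected to grow accordingly.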
        
         | cfiggers wrote:
         | It looks like the `dist` function is directly accessing key
         | game values by reading bytes out of the emulator's memory in
         | order to "see" the current game state. This is explained a bit
         | under the "Distance" section.
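
Reading game state straight out of emulator memory might look roughly like the sketch below. The addresses and the dead-state value are illustrative assumptions, not taken from the blog post or verified against the game, and `ram` is just a byte buffer standing in for whatever the emulator exposes.

```python
# Illustrative addresses only; the real offsets depend on the game and
# are assumptions here, not values from the blog post.
X_PAGE_ADDR = 0x006D     # assumed: which 256-pixel "page" the player is on
X_OFFSET_ADDR = 0x0086   # assumed: x position within that page
STATE_ADDR = 0x000E      # assumed: player state byte
DEAD_VALUE = 0x0B        # assumed: state value meaning "dying"


def player_x(ram):
    """Combine the page byte and the in-page byte into one level-wide x."""
    return ram[X_PAGE_ADDR] * 256 + ram[X_OFFSET_ADDR]


def is_dead(ram):
    """Check the assumed player-state byte for the assumed 'dying' value."""
    return ram[STATE_ADDR] == DEAD_VALUE


def dist(ram, goal_x):
    """Remaining horizontal distance to a hypothetical goal position."""
    return goal_x - player_x(ram)
```

No image recognition is involved in this style of input: the search only ever sees a couple of integers derived from a handful of bytes.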
        
       | charlieyu1 wrote:
       | Pretty interesting. Human players are grinding Mario to squeeze
       | some milliseconds in speed runs, while the AIs still aren't close
       | to that, to the point that we are happy that at least AI is
       | beating a level
        
       | adamddev1 wrote:
       | This is really cool. I hadn't thought of a path-finding algorithm
       | working on something like a Mario run. Nice to see AI approaches
       | like this aren't getting forgotten for all the neural nets and
       | machine learning.
        
         | [deleted]
        
       | crispyambulance wrote:
       | One thing I've always wondered about these automated game-playing
       | applications is what they use as input.
       | 
       | Are they just getting images (frame by frame) as input,
       | perceiving them in some way, and producing an output stream of
       | controller button-pushes?
       | 
       | Or does it only detect "death" and start over, systematically
       | changing its robotically timed output stream in response with
       | the goal of getting further along in the game?
       | 
       | I know the retro games were very deterministic. Pacman had
       | "patterns" you could use. Was Mario like this? No random
       | adversaries at all?
        
         | cfiggers wrote:
         | I have seen systems that use frames/screen captures as input
         | (look up sentdex's GTA series on YouTube for an example) but
         | that isn't what's happening here. In this case, it looks like
         | the `dist` function is directly accessing key game values by
         | reading bytes out of the emulator's memory. So the AI is taking
         | multiple signals from the game's state as input, not just
         | "alive/dead."
         | 
         | The blog post explains that part a bit under the "Distance"
         | section.
        
       ___________________________________________________________________
       (page generated 2023-07-09 23:00 UTC)