[HN Gopher] Clojure Plays Mario
___________________________________________________________________
Clojure Plays Mario
Author : hotcrossbunny
Score : 64 points
Date : 2023-07-09 21:14 UTC (1 hour ago)
(HTM) web link (blog.phronemophobic.com)
(TXT) w3m dump (blog.phronemophobic.com)
| hotcrossbunny wrote:
| I can never look at Mario the same
| CobrastanJorji wrote:
| Y'know, the thing I least like about these AI video game players
| is how unlike humans they look. I was wondering about the
| difference, and I think it comes down to two parts. First and
| foremost, human players generally prefer routes with a lot of
| tolerance for input error. Second, humans frequently take "mental
| planning breaks," stopping for a moment in safe spots before
| challenging areas.
|
| I think you could juggle the heuristics to demonstrate the
| preference for input error. For ML training, you could just
| randomly vary input timing by up to 20 ms or so to teach the
| algorithm to favor safer moves. For path finding, it's trickier,
| but there's probably a way to favor "wide" paths. I'm less sure
| how to express the second concept, pausing briefly in "safe
| areas," but I imagine it's maybe noticing a place where entering
| no inputs for a significant amount of time does not affect the
| results.
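|
| A minimal sketch of the timing-jitter idea, under toy assumptions:
| the jump window, the 20 ms jitter, and the names `clears_pit` and
| `robust_score` are all hypothetical and not taken from the article.
| It scores a planned jump by how often it still succeeds when the
| press time is randomly perturbed, so an edge-of-window plan scores
| lower than a mid-window one.

```python
import random


def clears_pit(jump_time_ms, window=(100.0, 160.0)):
    """Toy rule (assumption): the jump succeeds only if pressed inside the window."""
    lo, hi = window
    return lo <= jump_time_ms <= hi


def robust_score(planned_time_ms, jitter_ms=20.0, trials=1000, seed=0):
    """Fraction of trials that still succeed when the press time is jittered."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Perturb the planned press time by up to +/- jitter_ms.
        actual = planned_time_ms + rng.uniform(-jitter_ms, jitter_ms)
        wins += clears_pit(actual)
    return wins / trials


tight = robust_score(159.0)  # near-frame-perfect plan at the edge of the window
safe = robust_score(130.0)   # plan in the middle of the window
```

| Ranking candidate moves by `robust_score` rather than by plain
| success would be one way to favor the safer of the two.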
| michaelteter wrote:
| I don't think making humans comfortable is the goal with
| respect to AI. The goal is to actually solve a problem.
| Performance is second. Human comfort is a distant third or
| beyond.
|
| When AI can reliably solve a problem without significant
| negative consequences from time to time, it's a win. How humans
| feel about the method is effectively irrelevant.
| pedrosorio wrote:
| > I don't think making humans comfortable is the goal with
| respect to AI
|
| According to whom?
|
| AI in games has historically been all about human
| comfort/enjoyment. Extremely good AI that seems "unnatural"
| to humans is usually not the goal.
| tzot wrote:
| You seem to talk about AI-controlled-NPCs in games, while
| GP starts from the article context about AI-controlled-PCs
| (player characters) and proceeds to generalize about using
| AI to solve problems outside games.
| rootw0rm wrote:
| I think the point is that different people have different
| goals for "AI"
| oivey wrote:
| The near miss behavior is very much like overfitting. Mario
| is simple and deterministic enough that it doesn't matter,
| but think about a scenario like a self driving car. A
| calculated near miss turns into a crash if the other car's
| driver is just a little slower or their tires a little
| slicker than anticipated.
| tnecniv wrote:
| What you're basically describing is bounded rationality, which
| has been widely studied in behavioral economics, psychology,
| and engineering applications (Simon and Gigerenzer are two big
| names to google). A common framework for formalizing it boils
| down to versions of rate-distortion problems from information
| theory (very related to Bayesian statistics).
|
| The reason it's of engineering interest is, like you observe,
| bounded-rationality gives you solutions that are sub-optimal
| but more robust and often simpler.
|
| Moreover, finding wide path solutions emerges naturally from
| sampling-based motion planners. These planners are
| asymptotically optimal, but if you terminate them early, they
| are more likely to give you a solution that goes through large
| gaps, not smaller ones, because it's unlikely to sample a
| trajectory that goes through a tight space without heavy
| sampling. You could probably formulate that in the rate-
| distortion framework but I haven't thought about how to do it
| precisely.
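|
| A rough illustration of the early-termination effect, not a real
| motion planner: a wall has one wide and one narrow opening (the
| `GAPS` intervals are made up), crossing points are sampled
| uniformly, and the search stops at the first sample that passes
| through any opening. Over many runs the wide gap should be found
| first far more often, roughly in proportion to its width.

```python
import random

# Hypothetical wall openings, as (low, high) intervals on a unit-length wall.
GAPS = {"wide": (0.10, 0.40), "narrow": (0.80, 0.82)}


def first_feasible_gap(rng, max_samples=1000):
    """Sample crossing points uniformly; return the first gap that gets hit."""
    for _ in range(max_samples):
        y = rng.random()
        for name, (lo, hi) in GAPS.items():
            if lo <= y <= hi:
                return name
    return None  # no feasible sample found within the budget


def gap_counts(runs=2000, seed=0):
    """Count which gap an early-terminated search finds first."""
    rng = random.Random(seed)
    counts = {"wide": 0, "narrow": 0, None: 0}
    for _ in range(runs):
        counts[first_feasible_gap(rng)] += 1
    return counts


counts = gap_counts()
```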
| CobrastanJorji wrote:
| Oh cool! It's always hugely useful to learn the word for the
| thing you're thinking about. It can be really tricky to
| figure out if a vague idea has a name unless you're already
| pretty well read in a field. Now I have some reading to do,
| thanks!
| jbaber wrote:
| I just couldn't bear all that walking and jumping into near
| misses in the video.
| dustingetz wrote:
| emulator integration is
| https://www.libretro.com/index.php/api/?amp=1 (warning, f'ed
| website) and the clojure wrapper (by OP) is:
| https://github.com/phronmophobic/clj-libretro/blob/67f186e87...
|
| really cool!
| anandvc wrote:
| Can anyone help me understand this: How does the AI "see" what's
| happening on the screen? Is there some sort of object/scene
| recognition going on?
| capableweb wrote:
| Quick scan of the code
| (https://github.com/phronmophobic/clj-libretro/blob/67f186e87...)
| seems to indicate that there is at least some screenshotting going
| on, plus serialization of game state. What is used for detecting
| things, I couldn't find from just skimming the code.
| sbierwagen wrote:
| I think the screenshots are just for assembling the time
| lapse video.
|
| Check out `(defn dist` or read the explanation of that
| function in the blog post. It can't see enemies or pickups at
| all. It can only tell how far away it is from the end of the
| level, and if it's dead or not, then brute forces every
| button combination for every frame. The "in progress" video
| shows that it spends most of its time falling into pits, but
| eventually makes it to the end.
|
| A fun metric would be how many deaths it takes to complete a
| given level. It's got to be in the tens of thousands.
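|
| A toy sketch of that kind of search, assuming a made-up 1-D level
| rather than the emulator: the only signals are distance to the end
| and death, every button combination is tried from each state, and
| states closer to the goal are expanded first. The level layout,
| the `step` rule, and the greedy best-first ordering are assumptions
| for illustration, not the blog's code.

```python
import heapq
from itertools import product

LEVEL_END = 12
PITS = {3, 7, 8}  # hypothetical pit tiles; landing on one is a death

# Every combination of two buttons: (right, jump).
BUTTONS = list(product([False, True], repeat=2))


def step(x, buttons):
    """Toy physics: walking moves 1 tile, a running jump moves 3."""
    right, jump = buttons
    if not right:
        return x
    return x + (3 if jump else 1)


def dist(x):
    """Distance remaining to the end of the level."""
    return LEVEL_END - x


def solve():
    """Greedy best-first search over button presses, counting deaths."""
    frontier = [(dist(0), 0, [])]
    seen = {0}
    deaths = 0
    while frontier:
        _, x, path = heapq.heappop(frontier)
        if x >= LEVEL_END:
            return path, deaths
        for buttons in BUTTONS:
            nx = step(x, buttons)
            if nx in PITS:
                deaths += 1  # dead end: never expanded further
                continue
            if nx in seen:
                continue
            seen.add(nx)
            heapq.heappush(frontier, (dist(nx), nx, path + [buttons]))
    return None, deaths


path, deaths = solve()
```

| Returning `deaths` alongside the path gives the "deaths per level"
| metric for the toy case.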
| cfiggers wrote:
| It looks like the `dist` function is directly accessing key
| game values by reading bytes out of the emulator's memory in
| order to "see" the current game state. This is explained a bit
| under the "Distance" section.
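|
| Roughly what "reading bytes out of memory" could look like, as a
| sketch: the three offsets below are hypothetical placeholders (the
| real addresses depend on the game and are not given in this
| thread), and `ram` is just a byte buffer standing in for a
| snapshot of emulator memory.

```python
# Hypothetical offsets into a RAM snapshot; not the game's real addresses.
X_PAGE_ADDR = 0x10    # high byte of the horizontal position (assumed)
X_OFFSET_ADDR = 0x11  # low byte of the horizontal position (assumed)
ALIVE_ADDR = 0x12     # nonzero while the player is alive (assumed)


def read_state(ram):
    """Decode position and alive flag from raw memory bytes."""
    x = ram[X_PAGE_ADDR] * 256 + ram[X_OFFSET_ADDR]
    return {"x": x, "alive": ram[ALIVE_ADDR] != 0}


# Build a fake snapshot and decode it.
ram = bytearray(64)
ram[X_PAGE_ADDR] = 2
ram[X_OFFSET_ADDR] = 40
ram[ALIVE_ADDR] = 1
state = read_state(ram)
```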
| charlieyu1 wrote:
| Pretty interesting. Human players are grinding Mario to squeeze
| some milliseconds in speed runs, while the AIs still aren't close
| to that, to the point that we are happy that at least AI is
| beating a level
| adamddev1 wrote:
| This is really cool. I hadn't thought of a path-finding algorithm
| working on something like a Mario run. Nice to see AI approaches
| like this aren't getting forgotten for all the neural nets and
| machine learning.
| [deleted]
| crispyambulance wrote:
| One thing I've always wondered about these automated game-playing
| applications is what they use as input.
|
| Are they just getting images (frame by frame) as input,
| perceiving them in some way, and producing an output stream of
| controller button-pushes?
|
| Or does it only detect "death" and start over, systematically
| changing its robotically timed output stream in response with
| the goal of getting further along in the game?
|
| I know the retro games were very deterministic. Pacman had
| "patterns" you could use. Was Mario like this? No random
| adversaries at all?
| cfiggers wrote:
| I have seen systems that use frames/screen captures as input
| (look up sentdex's GTA series on YouTube for an example) but
| that isn't what's happening here. In this case, it looks like
| the `dist` function is directly accessing key game values by
| reading bytes out of the emulator's memory. So the AI is taking
| multiple signals from the game's state as input, not just
| "alive/dead."
|
| The blog post explains that part a bit under the "Distance"
| section.
___________________________________________________________________
(page generated 2023-07-09 23:00 UTC)