[HN Gopher] Hand-Tracking with Three.js
       ___________________________________________________________________
        
       Hand-Tracking with Three.js
        
       Author : marban
       Score  : 106 points
       Date   : 2023-02-03 11:43 UTC (11 hours ago)
        
 (HTM) web link (rdtr01.xl.digital)
 (TXT) w3m dump (rdtr01.xl.digital)
        
       | xnx wrote:
       | Cool mashup! For anyone interested, I found this codepen from
       | Google where you can play with Mediapipe in your browser:
       | https://codepen.io/mediapipe/pen/RwGWYJw
        
       | philipphutterer wrote:
       | This is really cool to see. I don't have any problems or bugs
        | using it, but it seems like the 3D rendering of the virtual hands
       | is not quite as fast as the tracking itself. I wonder if and how
       | this could be used in a meaningful way. As a feature for things
       | like Google Quick Draw it would be fun.
        
       | bhouston wrote:
       | This is unfortunately a bit buggy. I can see in the 2D image that
       | it is tracking when my hand is turned backwards and relatively
        | flat to the camera. But the 3D view shows my hands as curled.
        
       | julienreszka wrote:
        | Fails when both hands touch or cross.
        
       | xrd wrote:
       | This is so awesome. Fantastic demo.
       | 
        | An "air-piano" seems well within the realm of possibility now.
        
       | butlersean wrote:
        | The hand tracking is spot on in the small image in the
        | bottom-left corner: steady and accurate positioning of two
        | hands. :-)
        | 
        | Unfortunately, the rendering of the hands in the large window
        | jumps all over the place (Firefox, Ubuntu, Razer laptop). :-(
        
         | icoder wrote:
          | Same problem on Chrome (Mac M1). Weird positions I can
          | understand, since the computer must still lift the 2D dots in
          | the small screen to 3D coordinates, but even when I just show
          | my hand with fingers spread, palm facing the camera, the
          | small image is stable while the 3D image glitches quite
          | profoundly.
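The jitter several commenters describe is often tamed with simple temporal smoothing before the landmarks ever reach the skeleton. A minimal sketch (my own, not from the demo) of an exponential moving average over MediaPipe-style `{x, y, z}` landmarks:

```javascript
// Exponential moving average over 3D landmarks: each raw frame is
// blended with the previous smoothed frame. alpha = 1 means no
// smoothing; alpha near 0 means heavy smoothing (more latency).
function smoothLandmarks(prev, curr, alpha = 0.4) {
  if (!prev) return curr.map((p) => ({ ...p })); // first frame: copy as-is
  return curr.map((p, i) => ({
    x: alpha * p.x + (1 - alpha) * prev[i].x,
    y: alpha * p.y + (1 - alpha) * prev[i].y,
    z: alpha * p.z + (1 - alpha) * prev[i].z,
  }));
}
```

In a render loop you would keep the previous smoothed frame around and feed the result to the rig instead of the raw landmarks; the trade-off is jitter vs. lag, which fancier filters (e.g. the One Euro filter) balance adaptively.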
        
         | ncr100 wrote:
          | On my Pixel 7 Pro phone, I'm able to get it to represent all
          | kinds of strange hand positions correctly on the screen via
          | my selfie camera.
         | 
         | There is some jumpiness but I am holding the phone in one hand
         | and making hand positions with my other so I'm less concerned
         | about the jiggles.
         | 
          | It naturally has a hard time when all of my fingers are
          | occluded by one another.
         | 
         | Way to go.
        
         | dTal wrote:
         | Same problem. The hand tracking is really impressive, basically
         | flawless. The rendered hand is all over the place, doesn't
         | match the debug window at all.
        
         | rikroots wrote:
         | It seems quite stable to me (on Chrome) ... until I weave my
         | fingers from each hand together.
         | 
         | That said, I still think MediaPipe is an excellent piece of ML
         | tech. And, from a dev point of view, quite easy to get working
         | for various things in the browser[1][2]
         | 
         | [1] - MediaPipe Selfie Segmentation for real-life background
         | replacement -
         | https://scrawl-v8.rikweb.org.uk/demo/mediapipe-001.html
         | 
         | [2] - MediaPipe Face Mesh for, well, drawing lines on your face
         | - https://scrawl-v8.rikweb.org.uk/demo/mediapipe-003.html
        
       | speps wrote:
        | It doesn't let me select which camera to use, unfortunately.
        | Chrome is set to use one, but the site probably uses the first
        | one it finds.
        
         | radiojasper wrote:
         | You need a better browser. Firefox lets you choose which camera
         | you want to use.
         | 
         | https://jasper.monster/sharex/firefox_8DYyGO7U9f.png
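A site can also offer camera selection itself via `navigator.mediaDevices.enumerateDevices()`. A sketch (the label fragment and helper name are mine, not from the demo) that picks a video input by label and would pass its `deviceId` to `getUserMedia`:

```javascript
// Pick a video-input device whose label contains `labelFragment`,
// falling back to the first video input. `devices` is the array that
// navigator.mediaDevices.enumerateDevices() resolves to.
function pickCamera(devices, labelFragment) {
  const cams = devices.filter((d) => d.kind === "videoinput");
  return (
    cams.find((d) =>
      d.label.toLowerCase().includes(labelFragment.toLowerCase())
    ) || cams[0] || null
  );
}

// Browser usage (sketch):
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   const cam = pickCamera(devices, "logitech");
//   const stream = await navigator.mediaDevices.getUserMedia({
//     video: { deviceId: { exact: cam.deviceId } },
//   });
```

One gotcha: device labels are only populated after the page has been granted camera permission, so the selection UI usually has to appear after a first `getUserMedia` call succeeds.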
        
       | jackbach wrote:
       | Hello! I am the creator of this experiment.
       | 
       | Glad to see the conversation about hand tracking in the browser
       | over here.
       | 
        | This demo is part of a series of creative experiments on using
        | real-time hand tracking in the browser for creative
        | interactions. I will be posting more experiments soon.
       | 
       | Tech background: I am using MediaPipe to control the hand rig in
       | threejs. MediaPipe provides landmarks that are used to control a
       | threejs Skeleton (hierarchy of bones with rotations).
       | 
       | Feel free to ask, I will answer any questions!
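The landmarks-to-skeleton idea the author describes can be illustrated with a small piece of pure math: for each finger joint, take the two adjacent bone segments and compute the bend angle between them. This is a hypothetical sketch of the general approach, not the demo's actual code; in three.js you would then feed such angles (or, more robustly, `Quaternion.setFromUnitVectors` on the segment directions) into the corresponding `Bone`'s local rotation.

```javascript
// Bend angle at joint b, given three consecutive landmarks a, b, c
// (e.g. the MCP, PIP and DIP landmarks of one finger). Landmarks are
// plain {x, y, z} objects as MediaPipe Hands provides them.
function jointAngle(a, b, c) {
  const v1 = { x: a.x - b.x, y: a.y - b.y, z: a.z - b.z }; // segment b->a
  const v2 = { x: c.x - b.x, y: c.y - b.y, z: c.z - b.z }; // segment b->c
  const dot = v1.x * v2.x + v1.y * v2.y + v1.z * v2.z;
  const len = (v) => Math.hypot(v.x, v.y, v.z);
  // Clamp before acos to guard against floating-point drift.
  const cos = Math.min(1, Math.max(-1, dot / (len(v1) * len(v2))));
  return Math.acos(cos); // radians: PI = straight finger, smaller = more curled
}
```

The glitches people report upthread suggest the hard part is not this per-joint math but keeping the angles stable when the depth (z) estimates are noisy.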
        
         | p0w3n3d wrote:
          | One use case I immediately thought of for hand-movement
          | tracking like this is helping my disabled brother - he is
          | tetraplegic - steer a computer efficiently. Using a mouse is
          | sometimes too hard for him, though only in some cases. If one
          | could use this as a macro launcher, or as a more accurate
          | joystick without attaching a real joystick, it could help a
          | lot.
        
         | jimmySixDOF wrote:
          | This is amazing. I had a browser plugin called Flutter years
          | back that was able to do webcam gesture recognition for
          | scroll and forward/back. This uses three.js, so I wonder how
          | much is CPU vs. GPU, and also how well this could, now or in
          | the future, run under the hood in the background of a web
          | game (or WebXR!) just as the input device, without too much
          | overhead. Great proof of concept!
        
           | jackbach wrote:
           | Thanks for the nice words! Your plugin sounds like fun. In
           | terms of using hand tracking for web games: my next
           | experiments will use this setup to interact with 3D scenes.
        
         | marban wrote:
          | Based on the sample at
          | https://google.github.io/mediapipe/solutions/hands -- it
          | doesn't even sound all that complex.
        
           | jackbach wrote:
           | I thought the same until I tried!
           | 
            | As a matter of fact, lots of people on Twitter have been
            | sharing their frustration after attempting the same thing.
           | 
           | This is the funniest one:
           | 
           | https://twitter.com/isjackwild/status/1617559339891396619
           | 
           | Here's some comments on my implementation:
           | 
           | https://twitter.com/SketchpunkLabs/status/161758661970323049.
           | ..
        
         | mncharity wrote:
         | > creative interactions
         | 
         | Fwiw, some things I've found fun: Clip-on fish-eye lens,
         | intended for phone but fitting on laptop, for expanding webcam
         | field of view. Additional cameras: on sticks above screen tips
         | for high-res stereo positioning over kbd; asymmetric high-off-
         | to-side to trade some resolution for some field of view (meh);
         | high-overhead for whole-workspace tracking. Binocular periscope
         | with webcam splitter and screen-tip mirrors (blech - low-res
         | awkward fiddly). Look-down mirror on webcam, partial or full,
         | to get kbd view (nice in VR). Look-down with curved mirror
         | along top of keyboard to get "out along kbd surface view" and
         | crufty touch detection for kbd-as-touch-surface (cute but
         | fiddly - only makes sense to save a camera or two; caveat I had
         | high-contrast white hands on black thinkpad kbd). Putting
         | tracking markers on fingers (flats, a-frames, or cubes on
         | velcro rings) makes for less jittery tracking, but is awkward
         | (meh). Markers taped around keyboard help with calibration.
         | 
         | Magic wand. I found I could more-or-less manage to type while
          | holding a chopstick. So I stuck a marker cube on one end, and
          | an arc-sliced-off small Xmas ball on the tip, so it slides
          | smoothly
         | across (thinkpad) keys. Barber-pole rotation marker. Anvil'ed
         | tip pressure sensor, a finger microswitch, and very thin and
         | soft ribbon cable to arduino. But didn't actually get the
          | pressure sensor working before I punted on all this. The chopstick
         | was narrow enough to avoid breaking hand tracking.
         | 
         | Some gotchas: 2K camera resolution was painful for tracking.
         | (Several years ago) mediapipe finger tracking was annoyingly
         | noisy for doing stereo. You only get one usb2 camera per usb
         | port, even if it's usb3 (maybe usb3 cameras allow working
         | around that limit nowadays?). If you do hand, arm, face and
         | marker tracking on several cameras, even with native gpu
         | mediapipe, you're burning a lot of gpu just on the human
         | interface device, before your likely-graphical-itself app even
         | starts. If I had it to do over now, I'd punt mirrors, use 4K
         | usb3 cameras, and at least with desktop, more cameras. Nicely
         | merging high-latency camera tracking with lower-latency
         | keyboard, touchpad, and graphics tablets, requires changes to
         | the input event pipeline, and adapting apps to deal with "oh
         | my! That space key pressed several keys ago - it was pressed
         | with a _pointer finger at position 3!_ , so that means we roll
         | back app state and then ...".
         | 
         | Here we are a half-century later, still banging on glorified
         | xerox altos. We're so broken.
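The latency-merging problem described above is essentially a reordering buffer: hold the fast events (keyboard, touchpad) until the slow stream (camera tracking) has caught up to their timestamps, then emit everything in time order. A hypothetical sketch of that idea (class and field names are mine, not from any real input pipeline):

```javascript
// Reordering buffer that merges a low-latency event stream (keyboard)
// with a high-latency one (camera tracking). Events are {t, src, data}
// with t a shared timestamp. Fast events are held until the slow
// stream's clock has passed their timestamp, so consumers see a single
// time-ordered stream and can attribute "which finger pressed the key".
class MergeBuffer {
  constructor() {
    this.pending = [];
    this.slowClock = -Infinity; // latest timestamp seen on the slow stream
  }
  push(event, isSlow) {
    this.pending.push(event);
    if (isSlow) this.slowClock = Math.max(this.slowClock, event.t);
  }
  // Emit every buffered event the slow stream has already passed.
  drain() {
    const ready = this.pending.filter((e) => e.t <= this.slowClock);
    this.pending = this.pending.filter((e) => e.t > this.slowClock);
    return ready.sort((a, b) => a.t - b.t);
  }
}
```

The cost is exactly the one the comment names: every fast event is delayed to the slow stream's latency, so a real system would instead emit fast events immediately and roll back application state when late tracking data reinterprets them.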
        
           | jackbach wrote:
           | Thanks for the pointers! Do you have any
           | video/media/recordings of your experiments?
        
             | mncharity wrote:
             | Np, tnx for the demo. Sigh, sorry, not really, nor easily
             | accessible.
             | 
             | I do that poorly, repeatedly. A mindset of "today's rev n
             | is bad, still unusable; tomorrow's incremental rev n+1 will
             | be slightly better; no point in recording bad, wait for
             | better; will demo at meetup for friends, but otherwise,
             | who'd care?"... left a sparse trail. Sort of: you might
             | take a picture of your nice finished cake, but of baking?
             | There have been HN posts of commitment-hacking as a
             | service, eg, iirc, a Japanese workspace with sign-in like
             | "I'm here to write one chapter, and I'd like person-
             | standing-behind-me level pressure". So perhaps, motivate
             | documenting this week's state as a service? As
             | finding/creating community that's interested in such seems
             | often difficult.
             | 
             | Hmm, here's a snapshot[1] of my late-rev laptop hardware
             | with flop-up kbd cam and (stowed fold-up) stereo cams
              | (wires not connected). Gaff tape, sticks, velcro and
             | cardboard esthetic allows fast and incremental iteration.
             | For wires, I like magnetic usb connectors[2]. Fwiw.
             | 
              | [1] https://twitter.com/mncharity/status/1232446953784369154/pho...
              | https://pbs.twimg.com/media/ERqCfdkX0AEWTN_?format=jpg&name=...
              | [2] https://twitter.com/mncharity/status/1255300177960808448
        
       | klaussilveira wrote:
        | Doesn't really work well while holding objects or during
        | faster movements, which I imagine would be restrictive for
        | gaming or
       | simulation purposes. Might be useful as a replacement for Leap
       | Motion, though. I can see this working for manipulating a desktop
       | environment.
        
         | calny wrote:
         | Congrats OP, very cool and runs great for me in Chrome and
         | Firefox. Mediapipe can indeed work for manipulating desktop
         | environments, I've been working off and on at that for a couple
         | years.[0] It's tricky to make the interactions effective and
         | minimize false positives while also avoiding a heavy cognitive
         | load on the user, but there's lots of potential.
         | 
         | [0] https://www.youtube.com/watch?v=bHjj46AIVxs
        
       | micheljansen wrote:
       | Wow this works much better than I had expected. Well done!
        
       | kinard wrote:
       | Jack, it's amazing. Very impressive stuff.
        
       | futhey wrote:
       | Cool progress, really nice work!
       | 
       | Not quite high enough fidelity to handle ASL though.
       | 
       | Some issues I ran into testing it, if it's something that
       | interests you:
       | 
        | - Cannot distinguish closed vs. open fingers (always adds gaps
        |   between fingers, even if they're touching) (B)
        | - Can't handle crossed fingers (R)
        | - Doesn't seem to like extended vs. curled fingers in some
        |   cases (H)
        | - Other failed letters: (Q), (E?), (F?), (G), (S), (U/V)
       | 
        | But when signing naturally, it seems to get enough of the
        | shapes and orientation correct to understand what I'm signing.
        | I'm sure there are things it would trip up on because of some
        | of the above weaknesses in detecting hand shapes, but it does
        | seem to get movement, orientation, and position "good enough".
        
       ___________________________________________________________________
       (page generated 2023-02-03 23:01 UTC)