hngopher.com

       [HN Gopher] OSWorld: Benchmarking Multimodal Agents for Open-End...
       ___________________________________________________________________
        
       OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in
       Real Computers
        
       Author : kristianpaul
       Score  : 39 points
       Date   : 2024-04-28 19:16 UTC (3 hours ago)
        
 (HTM) web link (os-world.github.io)
 (TXT) w3m dump (os-world.github.io)
        
       | stavros wrote:
       | I built a small Python script so I could let GPT-4 debug my
       | system issues:
       | 
       | https://github.com/skorokithakis/sysaidmin
       | 
       | It works surprisingly well!
        
       | TheRoque wrote:
       | Gotta love people working on replacing themselves. Jokes aside,
       | seeing an AI interacting with a computer is kind of scary. It's
       | not just outputting text anymore, it's doing the full work of a
       | human working on a computer, meaning... a ton of people
        
         | mindcrime wrote:
         | _Gotta love people working on replacing themselves._
         | 
         | So here's the thing: I _want_ the computer to replace me...
         | with regards to things that I don't want to do, or that aren't
         | a good use of my time (boring, repetitive, trite, tedious,
         | etc.). BUT the caveat is, I want it to be in a context where
         | I'm in control of the computer and own the output, etc.
         | 
         | Now companies and other actors also want the computer to
         | replace me, but _they_ want to be in control and to own the
         | output. But here's the rub: they're not going to stop trying
         | to come up with ways to replace me and minimize my value just
         | because I don't like it. No man can stop the tide from coming
         | in, or something like that (insert pithy saying here?). But if
         | you know the tide is coming, you can retreat further inland OR
         | build a boat, or climb a pole, etc.
         | 
         | So where I'm at with this is, the more I work on AI, the more I
         | know about AI, the more I can use it to enhance my own value
         | outside of the context of corporate controlled IT systems, the
         | more I can exercise a (possibly insignificant, but maybe not)
         | degree of control over my fate. I can try to build my own boat,
         | or at least outrun the tide for a little while.
         | 
         | I don't know how things are going to turn out, and there's a
         | reason that I am not in fact joking when I crack the occasional
         | remark about "after this conversation I'm going to run to Bass
         | Pro Shops and pick up a few more boxes of ammo" or whatever.
         | Maybe a cyberpunk dystopia is what awaits us all. Maybe not.
         | I'm just going to try to make my best guess at what the
         | possible / likely future scenarios are and take whatever
         | actions I can now to situate myself as favorably as I can.
        
       ___________________________________________________________________
       (page generated 2024-04-28 23:00 UTC)