[HN Gopher] OSWorld: Benchmarking Multimodal Agents for Open-End...
___________________________________________________________________
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in
Real Computers
Author : kristianpaul
Score : 39 points
Date : 2024-04-28 19:16 UTC (3 hours ago)
(HTM) web link (os-world.github.io)
(TXT) w3m dump (os-world.github.io)
| stavros wrote:
| I built a small Python script so I could let GPT-4 debug my
| system issues:
|
| https://github.com/skorokithakis/sysaidmin
|
| It works surprisingly well!
| TheRoque wrote:
| Gotta love people working on replacing themselves. Jokes aside,
| seeing an AI interacting with a computer is kind of scary. It's
| not just outputting text anymore, it's doing the full work of a
| human working on a computer, meaning... a ton of people
| mindcrime wrote:
| _Gotta love people working on replacing themselves._
|
| So here's the thing: I _want_ the computer to replace me...
| with regards to things that I don't want to do, or that aren't
| a good use of my time (boring, repetitive, trite, tedious,
| etc.). BUT the caveat is, I want it to be in a context where
| I'm in control of the computer and own the output, etc.
|
| Now companies and other actors also want the computer to
| replace me, but _they_ want to be in control and to own the
| output. But here's the rub: they're not going to stop trying
| to come up with ways to replace me and minimize my value just
| because I don't like it. No man can stop the tide from coming
| in, or something like that (insert pithy saying here?). But if
| you know the tide is coming, you can retreat further inland OR
| build a boat, or climb a pole, etc.
|
| So where I'm at with this is, the more I work on AI, the more I
| know about AI, the more I can use it to enhance my own value
| outside of the context of corporate controlled IT systems, the
| more I can exercise a (possibly insignificant, but maybe not)
| degree of control over my fate. I can try to build my own boat,
| or at least outrun the tide for a little while.
|
| I don't know how things are going to turn out, and there's a
| reason that I am not in fact joking when I crack the occasional
| remark about "after this conversation I'm going to run to Bass
| Pro Shops and pick up a few more boxes of ammo" or whatever.
| Maybe a cyberpunk dystopia is what awaits us all. Maybe not.
| I'm just going to try to make my best guess at what the
| possible / likely future scenarios are and take whatever
| actions I can now to situate myself as favorably as I can.
___________________________________________________________________
(page generated 2024-04-28 23:00 UTC)