[HN Gopher] WebRL: Training LLM Web Agents via Self-Evolving Onl...
___________________________________________________________________
WebRL: Training LLM Web Agents via Self-Evolving Online Reinforcement Learning
Author : theredsix
Score : 11 points
Date : 2024-11-05 16:03 UTC (6 hours ago)
(HTM) web link (arxiv.org)
___________________________________________________________________
(page generated 2024-11-05 23:01 UTC)