What framework integrates LLMs and browser control in Python?

Browser Use is a Python framework that acts as a bridge between Large Language Models (LLMs) and web browsers[1]. The LLM reasons and decides what to do next, while the framework provides the tools to interact with websites, including clicking, typing, and scrolling[1]. Browser Use relies on Playwright for low-level browser control[1]. The core framework is free to use and can be run locally[1].
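The division of labor described above (a model decides, a browser layer acts) can be illustrated with a minimal sketch. This is not the Browser Use API: `Action`, `FakeBrowser`, `scripted_llm`, and `run_agent` are hypothetical names invented for illustration, the "model" simply replays a fixed plan, and the "browser" only records calls where a real setup would drive Playwright.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One decision from the model: a tool name plus its arguments."""
    name: str
    args: dict = field(default_factory=dict)


class FakeBrowser:
    """Hypothetical stand-in for a real browser driver; it only records calls."""

    def __init__(self):
        self.log = []

    def click(self, selector):
        self.log.append(f"click {selector}")

    def type(self, selector, text):
        self.log.append(f"type {selector} {text!r}")

    def scroll(self, pixels):
        self.log.append(f"scroll {pixels}")

    def observe(self):
        # A real agent would return page text, a DOM summary, or a screenshot.
        return f"{len(self.log)} actions taken so far"


def scripted_llm(task, observation, step):
    """Placeholder for a model call: replays a fixed plan instead of reasoning."""
    plan = [
        Action("click", {"selector": "#search"}),
        Action("type", {"selector": "#search", "text": task}),
        Action("scroll", {"pixels": 600}),
    ]
    return plan[step] if step < len(plan) else Action("done")


def run_agent(task, browser, decide, max_steps=10):
    """Observe, ask the decider for an action, execute it, and repeat."""
    tools = {
        "click": browser.click,
        "type": browser.type,
        "scroll": browser.scroll,
    }
    for step in range(max_steps):
        action = decide(task, browser.observe(), step)
        if action.name == "done":
            break
        tools[action.name](**action.args)
    return browser.log


if __name__ == "__main__":
    for entry in run_agent("python docs", FakeBrowser(), scripted_llm):
        print(entry)
```

Swapping `scripted_llm` for a function that calls an actual model, and `FakeBrowser` for a wrapper around a real browser driver, would keep the loop itself unchanged. The `max_steps` cap is an assumption added here so that the loop always terminates.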

UI-TARS-1.5, an open-source multimodal agent, is built upon a vision-language model and can effectively perform diverse tasks within virtual worlds[2].