skyvern

Browser automator

Automates browser-based workflows using LLMs and computer vision to replace brittle automation solutions.

Automate browser-based workflows with LLMs and Computer Vision

GitHub

10k stars
64 watching
708 forks
Language: Python
Last commit: 6 days ago
Topics: api, automation, browser, browser-automation, computer, gpt, llm, playwright, python, rpa, vision, workflow

Related projects:

Repository | Description | Stars
parisneo/lollms-webui | An all-encompassing tool providing a web interface to access and utilize various AI models for tasks such as text generation, image analysis, music generation, and more. | 4,344
skypilot-org/skypilot | A framework for running AI and batch workloads on any infrastructure, offering unified execution, cost savings, and high GPU availability. | 6,801
llmware-ai/llmware | A framework for building enterprise LLM-based applications using small, specialized models. | 6,651
seleniumhq/selenium | A platform for automating web browser interactions. | 30,751
scisharp/llamasharp | A C#/.NET library to efficiently run Large Language Models (LLMs) on local devices. | 2,673
superagent-ai/superagent | An open-source AI framework and API for building intelligent applications. | 5,309
jopemachine/alfred-chromium-workflow | A Chromium-based browser workflow for Alfred 5 that provides features such as search, bookmark management, and autofill data retrieval. | 125
langroid/langroid | A Python framework to build LLM-powered applications by setting up agents with optional components and having them collaboratively solve problems through message exchange. | 2,654
microsoft/playwright | A framework for automating web browsers across multiple platforms and versions with a single API. | 66,974
cvat-ai/cvat | An interactive video and image annotation tool for computer vision. | 12,622
automaapp/automa | An extension for automating browser tasks by creating a workflow of connected blocks. | 11,759
cloudwu/skynet | A framework for building multi-user online games using the actor model. | 13,340
teamcapybara/capybara | A tool for testing web applications by simulating user interactions. | 10,028
oegedijk/explainerdashboard | A Python library for building interactive dashboards to explain machine learning models. | 2,311
salesforce/lavis | A library that provides pre-trained models and frameworks for multimodal vision-language intelligence tasks such as image captioning and visual question answering. | 9,926