skyvern

Browser automator

Automates browser-based workflows using LLMs and computer vision to replace brittle automation solutions.

Automate browser-based workflows with LLMs and Computer Vision

GitHub

11k stars
70 watching
768 forks
Language: Python
Last commit: about 1 month ago
Topics: api, automation, browser, browser-automation, computer, gpt, llm, playwright, python, rpa, vision, workflow

Related projects:

Repository | Description | Stars
parisneo/lollms-webui | An all-encompassing tool providing a web interface to access and utilize various AI models for tasks such as text generation, image analysis, music generation, and more. | 4,394
skypilot-org/skypilot | A framework for running AI and batch workloads on any infrastructure, offering unified execution, cost savings, and high GPU availability. | 6,905
llmware-ai/llmware | A framework for building enterprise LLM-based applications using small, specialized models. | 8,303
seleniumhq/selenium | A platform for automating web browser interactions. | 30,979
scisharp/llamasharp | An efficient C#/.NET library for running Large Language Models (LLMs) on local devices. | 2,750
superagent-ai/superagent | An open-source AI framework and API for building intelligent applications. | 5,391
jopemachine/alfred-chromium-workflow | A Chromium-based browser workflow for Alfred 5 that provides features such as search, bookmark management, and autofill data retrieval. | 127
langroid/langroid | A Python framework for building LLM-powered applications by setting up agents with optional components that collaboratively solve problems through message exchange. | 2,795
microsoft/playwright | A framework for automating web browsers with a single API to enable cross-browser testing and automation. | 67,755
cvat-ai/cvat | An interactive video and image annotation tool for computer vision. | 12,821
automaapp/automa | An extension for automating browser tasks by creating a workflow of connected blocks. | 12,581
cloudwu/skynet | A framework for building multi-user online games using the actor model. | 13,387
teamcapybara/capybara | A tool for testing web applications by simulating user interactions. | 10,029
oegedijk/explainerdashboard | A Python library for building interactive dashboards to explain machine learning models. | 2,321
salesforce/lavis | A library that provides pre-trained models and frameworks for multimodal vision-language intelligence tasks such as image captioning and visual question answering. | 10,058