ComfyUI_VLM_nodes
Model integration tools
Custom ComfyUI nodes that integrate vision and language models to generate text, music, or images, or to suggest creative prompts.
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
430 stars
7 watching
39 forks
Language: Python
last commit: about 1 month ago
Tags: comfyui, custom-nodes, image-captioning, img2sfx, img2text, joytag, llava, llm, mllm, nodes, phi15, siglip, vlm
Related projects:
Repository | Description | Stars |
---|---|---|
comfysage/chaivim | An easily configurable Neovim system with solid defaults and a cozy editor experience. | 64 |
mbzuai-oryx/groundinglmm | An end-to-end trained model capable of generating natural language responses integrated with object segmentation masks for interactive visual conversations | 797 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
vinhnx/inkchatgpt | An application that enables users to upload documents and converse with an AI-powered language model. | 9 |
comfysage/evergarden | A Lua-based Neovim colorscheme for creating a cozy coding environment. | 166 |
noelyahan/mergi | A Go library and command-line tool for manipulating images | 236 |
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model. | 1,336 |
chadmv/cmt | A collection of Maya plugins developed for personal projects. | 255 |
aaronik/gptmodels.nvim | An AI plugin for Neovim that facilitates LLM-based code analysis and suggestions | 57 |
sy-xuan/pink | Enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 79 |
irhonin/golem-covid | Visualizes COVID-19 data on a world map and generates an animated GIF | 1 |
canokaue/gvm-vim | Compiles VIM on a Golem Network node and runs it on a native machine. | 0 |
bekaboo/dropbar.nvim | A Neovim plugin that displays IDE-like breadcrumbs. | 1,076 |
microsoft/som | Enables visual grounding in large language models by overlaying spatial and speakable marks on images | 1,218 |
vciancio/golem-node-server | Exposes information about a Golem node to the network | 2 |