ComfyUI_VLM_nodes

Model integration tools

Custom ComfyUI nodes for integrating vision and language models to generate text, music, and images or to suggest creative prompts.

Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation

GitHub

430 stars
7 watching
39 forks
Language: Python
last commit: about 1 month ago
Topics: comfyui, custom-nodes, image-captioning, img2sfx, img2text, joytag, llava, llm, mllm, nodes, phi15, siglip, vlm

Related projects:

Repository | Description | Stars
comfysage/chaivim | An easily configurable Neovim system with solid defaults and a cozy editor experience. | 64
mbzuai-oryx/groundinglmm | An end-to-end trained model capable of generating natural language responses integrated with object segmentation masks for interactive visual conversations | 797
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517
vinhnx/inkchatgpt | An application that enables users to upload documents and converse with an AI-powered language model. | 9
comfysage/evergarden | A Lua-based Neovim colorscheme for creating a cozy coding environment. | 166
noelyahan/mergi | A Go library and command-line tool for manipulating images | 236
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model. | 1,336
chadmv/cmt | A collection of Maya plugins developed for personal projects. | 255
aaronik/gptmodels.nvim | An AI plugin for Neovim that facilitates LLM-based code analysis and suggestions | 57
sy-xuan/pink | This project enables multi-modal language models to understand and generate text about visual content using referential comprehension. | 79
irhonin/golem-covid | Visualizes COVID-19 data on a world map and generates an animated GIF | 1
canokaue/gvm-vim | Compiles VIM on a Golem Network node and runs it on a native machine. | 0
bekaboo/dropbar.nvim | An IDE-like breadcrumbs bar for Neovim. | 1,076
microsoft/som | Enables visual grounding in large language models by overlaying spatial and speakable marks on images | 1,218
vciancio/golem-node-server | Exposes information about a Golem node to the network | 2