
sa2va/4b/image

falai

Sa2VA is a multimodal large language model (MLLM) capable of question answering, visual prompt understanding, and dense object segmentation at both the image and video levels.

Model Type

vision

Cost

20 tokens / 1 megapixel
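As a rough illustration of the cost line above, the following sketch estimates the charge for a single image. It assumes the rate means 20 tokens per megapixel (one million pixels) and that the charge scales linearly with pixel count; the page does not say how fractional megapixels are rounded, so no rounding is applied. The function name and parameters are hypothetical.

```python
def estimate_cost_tokens(
    width_px: int,
    height_px: int,
    tokens_per_megapixel: float = 20.0,  # assumed reading of the listed rate
) -> float:
    """Estimate the token cost of one image from its pixel dimensions."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("image dimensions must be positive")
    # One megapixel is taken to be 1,000,000 pixels (an assumption).
    megapixels = (width_px * height_px) / 1_000_000
    # Linear scaling with no rounding; actual billing may round differently.
    return megapixels * tokens_per_megapixel


if __name__ == "__main__":
    print(estimate_cost_tokens(1000, 1000))
    print(estimate_cost_tokens(1920, 1080))
```

Under these assumptions, a 1000 x 1000 image is exactly one megapixel and a 1920 x 1080 image is about 2.07 megapixels.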

Features

Model Settings

prompt

e.g. Could you please give me a brief description of the image? Please respond with a detailed image prompt for re-generation in plain text.

image url

Upload an image or enter a URL here
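The two settings above could be assembled into a request body along the following lines. This is only a sketch: the key names `prompt` and `image_url` are guesses derived from the setting labels, not a documented schema, and the helper `build_request` is hypothetical. No endpoint is shown because the page does not give one.

```python
import json
from urllib.parse import urlparse


def build_request(prompt: str, image_url: str) -> dict:
    """Validate the two settings and return them as a JSON-ready dict."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    # Accept only absolute http(s) URLs; uploaded files are out of scope here.
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("image_url must be an absolute http(s) URL")
    # Key names are assumptions based on the labels "prompt" and "image url".
    return {"prompt": prompt, "image_url": image_url}


if __name__ == "__main__":
    payload = build_request(
        "Could you please give me a brief description of the image?",
        "https://example.com/cat.png",  # placeholder URL
    )
    print(json.dumps(payload, indent=2))
```

Validating the inputs before sending keeps obviously malformed requests from being billed or rejected remotely.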
