Can Dify vectorize an input image, have the Knowledge Retrieval node match it against vectorized image child chunks in a knowledge base, and return the parent chunk content?

mvp666 · March 27, 2026, 11:56am

A question for the community experts: can Dify do the following? I want to pass in an image, have the Knowledge Retrieval node vectorize it with a configured Qwen3-VL-Embedding model, and then retrieve matching image content from the knowledge base. (The images in the knowledge base would be vectorized with the same Qwen3-VL-Embedding model.) As far as I recall, Dify v1.11.0 supports adding images to a knowledge base and vectorizing them with a multimodal embedding model.
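
To make the flow I have in mind concrete, here is a minimal sketch in plain Python. It does not call any Dify API. The 3-dimensional vectors are hand-made stand-ins for Qwen3-VL-Embedding output, and every name in it (`PARENTS`, `CHILDREN`, `retrieve_parents`) is hypothetical. The idea is that each image sub-segment (child chunk) points back to its parent block, the query image vector is compared against the sub-segments, and the parent content is what gets returned.

```python
import math

# Hypothetical toy data, not Dify's storage format.
# Parent blocks hold the content that should be returned to the workflow.
PARENTS = {
    "p1": "Parent block 1: installation guide with a wiring diagram.",
    "p2": "Parent block 2: troubleshooting section with error screenshots.",
}

# Each child is one vectorized image sub-segment that references its parent.
# The vectors are made up; a real setup would use multimodal embeddings.
CHILDREN = [
    {"vector": [1.0, 0.0, 0.0], "parent_id": "p1"},
    {"vector": [0.0, 1.0, 0.0], "parent_id": "p2"},
    {"vector": [0.0, 0.9, 0.1], "parent_id": "p2"},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_parents(query_vector, top_k=2):
    """Rank children by similarity, then return their parents without duplicates."""
    ranked = sorted(
        CHILDREN,
        key=lambda child: cosine(query_vector, child["vector"]),
        reverse=True,
    )
    seen, results = set(), []
    for child in ranked[:top_k]:
        parent_id = child["parent_id"]
        if parent_id not in seen:  # several children may share one parent
            seen.add(parent_id)
            results.append(PARENTS[parent_id])
    return results


# Stand-in for the vector of the input image.
query = [0.1, 0.95, 0.0]
print(retrieve_parents(query))
```

In this toy example the two closest sub-segments belong to the same parent, so only that one parent block is returned. What I would like to know is whether the Knowledge Retrieval node can do the equivalent when the query is an image.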