How do LLM nodes support multimodal input?

To use doubao-seed-1-6 or qwen3-omni-flash, you need to pass image_url or video_url. How can this be implemented using an LLM node?
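For illustration, at the API level "passing image_url" usually means sending an OpenAI-compatible multimodal message with an image_url content block. A minimal sketch, assuming an OpenAI-compatible endpoint (the base_url, API key, and exact model name below are placeholders, not confirmed values):

```python
# Minimal sketch of an OpenAI-compatible multimodal request.
# base_url, api_key, and model are placeholders; substitute your provider's values.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider/api/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="doubao-seed-1-6",  # or "qwen3-omni-flash"; the exact name depends on the provider
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```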


For handling image_url with an LLM, my implementation approach is as follows (further sharing and discussion welcome):

  1. Add an “HTTP Request” node
    Send a GET request to the image_url; the output is a list of files.

  2. Add a “List Operation” node
    Feed in the file list from the previous step and take first_record as the output file.

  3. Add an “LLM” node
    At this point, you can reference the file from the previous node.

Note: the HTTP request in step 1 may return an image that is actually a GIF (even when the file extension is jpg/png/jpeg, etc.), which the LLM may fail to parse. Add an IF condition beforehand to filter these out and avoid LLM errors.
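As a reference for that check, here is a minimal Python sketch (outside the workflow) that detects the actual image format from the file's magic bytes instead of trusting the extension. The URL is a placeholder, and skipping GIFs is just one possible policy:

```python
# Sketch: detect the real image format by magic bytes, independent of the file extension.
import requests

def detect_image_format(data: bytes) -> str:
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    return "unknown"

image_url = "https://example.com/picture.jpg"  # placeholder URL
data = requests.get(image_url, timeout=30).content
fmt = detect_image_format(data)

if fmt == "gif":
    # Mirror the IF condition from the note: skip (or convert) GIFs the model cannot parse.
    print("Skipping GIF input to avoid an LLM parsing error.")
else:
    print(f"Format {fmt} is acceptable; pass the file on to the LLM node.")
```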


It’s a bit cumbersome; I’m currently using an HTTP node instead of the LLM node. A reply further down mentions that the LLM node now supports document input.

Which version are you using? I’m using 1.10.1 and there’s no document input option.

The vision handling logic has been reworked, and corresponding configuration has been added for document-capable models.

There are multiple ways to implement this. My approach is mainly to fetch and parse the file first via an HTTP request, then use an LLM-VL model with prompts to extract the content of the image. Others in this thread can also try extracting directly with a model that supports image input. Both methods are worth trying; you can evaluate them based on model cost.
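To make the "fetch first, then send to a VL model" path concrete, here is a hedged sketch: the file is downloaded via HTTP and passed inline as a base64 data URL rather than a remote link. The endpoint, API key, and model name are placeholders:

```python
# Sketch: download the image first, then send it to a vision model as a base64 data URL.
import base64
import requests
from openai import OpenAI

image_url = "https://example.com/invoice.png"   # placeholder URL
raw = requests.get(image_url, timeout=30).content
data_url = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")

client = OpenAI(base_url="https://example-provider/api/v1", api_key="YOUR_API_KEY")  # placeholders
response = client.chat.completions.create(
    model="your-vl-model",  # placeholder for whichever VL model you use
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the text content from this image."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```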