Does the Dify platform's document extractor not support scanned PDFs?

zhouciming · January 16, 2026, 8:13am

Does the Dify platform’s document extractor not support scanned PDFs?

I uploaded a scanned PDF, but after the document extractor processed it, the output text was an empty string (“”), so when that was passed to the large model, the result was “No file content detected!”

PinkBanana · January 26, 2026, 2:40am

Nope, this node can only extract text that is already embedded in the PDF file. If you want to extract text from images, such as scanned pages, please use an OCR tool like MinerU or PaddleOCR instead.
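
If you want a workflow to handle both kinds of PDF, one option is to check whether the extractor output is empty and only then send the file to OCR. Below is a rough sketch of that check in Python, for example as the logic behind a code step. The names `needs_ocr`, `text_or_ocr`, `min_chars` and `run_ocr` are made up for illustration and are not part of Dify, MinerU or PaddleOCR; the OCR call itself is left as a placeholder.

```python
from typing import Callable


def needs_ocr(extracted_text: str, min_chars: int = 1) -> bool:
    """Return True when the extractor output contains no usable text.

    min_chars is a hypothetical threshold, not a Dify setting.
    """
    return len(extracted_text.strip()) < min_chars


def text_or_ocr(extracted_text: str, run_ocr: Callable[[], str]) -> str:
    """Return the extracted text if present, otherwise the OCR result.

    run_ocr is a placeholder for whatever OCR step you wire in
    (for example a MinerU or PaddleOCR call); it is not implemented here.
    """
    if needs_ocr(extracted_text):
        return run_ocr()
    return extracted_text
```

With this, a scanned PDF that comes back as “” or only whitespace goes to the OCR fallback, while a normal text PDF passes through unchanged.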
