This is a well-known issue. Based on similar discussions in the community (notably the user’s own feedback in another post, /t/topic/683), the main cause is that the standard Document Extractor (Doc Extractor) node does not perform OCR on scanned (image-only) PDFs by default.
Here is an analysis of the problem and a proposed solution:
Core Reason Analysis
- Scanned PDFs have no text layer: The user uploaded a scanned document, which is essentially a PDF packaged with images.
- Extractor Limitations: Dify’s built-in “Document Extractor” node typically uses tools like `pypdfium2` when processing PDFs. These tools can only extract selectable text from files and cannot perform OCR on images.
- Result: The `text` variable output by the extractor is an empty string.
- LLM’s Response: Since the variable content passed to the Prompt is empty, the LLM truly “saw” no content, so its answer (“No uploaded documents received”) is completely honest and correct.
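To confirm this diagnosis outside of Dify, you can check locally whether the PDF has any text layer at all. The following is a minimal sketch, assuming pypdfium2 version 4 or later is installed. The function names `extract_page_texts` and `looks_scanned` and the 20-character threshold are illustrative choices, not part of Dify or pypdfium2.

```python
from typing import Iterable, List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the embedded text layer of each page, one string per page."""
    # Imported lazily so looks_scanned() can be used without pypdfium2 installed.
    import pypdfium2 as pdfium

    pdf = pdfium.PdfDocument(pdf_path)
    texts = []
    for index in range(len(pdf)):
        textpage = pdf[index].get_textpage()
        # Image-only pages have no text layer, so this yields an empty string.
        texts.append(textpage.get_text_range())
    return texts


def looks_scanned(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Heuristic: treat the PDF as scanned if its pages hold almost no text.

    min_chars is an arbitrary threshold (an assumption, tune as needed).
    """
    total = sum(len(text.strip()) for text in page_texts)
    return total < min_chars
```

Calling `looks_scanned(extract_page_texts("your_file.pdf"))` on the uploaded file should return `True` if the document really is image-only, which would match the empty `text` variable seen in the workflow.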
Suggested Solutions
Please advise the user not to use the built-in “Document Extractor” node, but instead adopt one of the following alternative solutions:
Solution One: Use the Unstructured Plugin (Recommended)
The Unstructured plugin, available in the Dify Plugin Marketplace, supports more advanced parsing strategies.
- Delete the original “Document Extractor” in the workflow.
- Search for and add the Unstructured plugin in the “Plugins” or “Tools” section on the right.
- When configuring the plugin, ensure that an OCR-supported strategy is enabled or selected (e.g., `hi_res`, or configure OCR-related parameters).
- Note: This usually requires the user to deploy the Unstructured service themselves or use its API Key, and correctly configure credentials in Dify’s settings.
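For reference, this is roughly what the strategy choice means in the open-source `unstructured` Python library. It is only a sketch: it assumes the library and its PDF extras are installed locally, the helper names `choose_strategy` and `pdf_to_text` are hypothetical, and the Dify plugin may expose the setting differently in its own configuration form.

```python
def choose_strategy(has_text_layer: bool) -> str:
    """Pick a partitioning strategy name for a PDF.

    "fast" reads the embedded text layer; "hi_res" runs layout detection
    plus OCR, which is what an image-only (scanned) PDF needs.
    """
    return "fast" if has_text_layer else "hi_res"


def pdf_to_text(pdf_path: str, has_text_layer: bool) -> str:
    """Partition a PDF with the unstructured library and join the element text."""
    # Imported lazily; requires the unstructured package with PDF support.
    from unstructured.partition.pdf import partition_pdf

    elements = partition_pdf(
        filename=pdf_path,
        strategy=choose_strategy(has_text_layer),
    )
    return "\n\n".join(el.text for el in elements if el.text)
```

For a scanned file you would call `pdf_to_text("your_file.pdf", has_text_layer=False)`, which selects `hi_res`.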
Solution Two: Use a Dedicated OCR Plugin
If you are not using Unstructured, you can search the plugin marketplace for OCR-related tools (e.g., General OCR, Aliyun OCR). Pass the file to the OCR tool first to obtain the text, then pass the OCR output text to the LLM.
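If you want to guard against the LLM being called with empty input again, one option is a small Code node between the OCR tool and the LLM. This is a sketch under assumptions: it follows the Code node convention of a `main` function that returns a dict of output variables, and the input name `ocr_text` and the output names `text` and `char_count` are placeholders you would map to your own workflow variables.

```python
def main(ocr_text: str) -> dict:
    """Clean the OCR output and report how much text was recognized."""
    # Treat a missing value the same as an empty string.
    text = (ocr_text or "").strip()
    return {
        "text": text,
        # A downstream IF/ELSE node can branch on char_count > 0
        # so the LLM is only called when there is something to read.
        "char_count": len(text),
    }
```

When `char_count` is 0, the workflow can return a clear message such as "no text could be recognized in the uploaded file" instead of sending an empty prompt to the LLM.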
Example Reply to the User
You can reply to the user as follows:
The cause of this problem has been identified: Dify’s built-in “Document Extractor” node does not support OCR text recognition for scanned PDFs (image-only) by default. Therefore, the extracted `text` is actually empty, and the LLM indeed did not receive any text content.
Solution:
Please go to the Plugin Marketplace to find an OCR-enabled plugin to replace the original Document Extractor node.
- We recommend trying the Unstructured plugin (requires configuring the corresponding service or API).
- Or search for other OCR-type plugins.
You actually touched upon the core reason in another post (Topic 683), which is that locally deployed ETL configurations are primarily used for knowledge bases and do not directly act on built-in nodes in the workflow. Processing scanned documents in a workflow must explicitly use an OCR-enabled tool node.
Related Documentation and Discussions:
- Unstructured Plugin - Dify Marketplace
- Related community discussion: /t/dify/683 (another related question from the same user)