Document extraction tool node stuck in running state

dify · January 22, 2026, 6:14am

[Attachment: 20260122141329_rec_]
It worked normally before, but now it’s stuck at this point.

dify · January 22, 2026, 6:26am

@Dify_Forum_Helper How can I solve this problem?

Dify_Forum_Helper · January 22, 2026, 6:26am

It seems your setup was working fine before, but now the “Document Extractor” node is suddenly stuck in a running state and won’t finish. Most likely, the underlying parsing service or queue is stuck. Since I don’t have access to your instance’s logs and configuration, I can only provide you with a systematic troubleshooting approach that you can try step by step:

1. First, Confirm the Scope of Impact

Test with a very small, simple file
- For example, a plain-text .txt file of a few KB or a very small PDF.
- If small files also get stuck: This indicates an issue with the parsing service / platform itself.
- If small files work normally but large files get stuck: It might be due to the file being too large / timeout / insufficient resources.
Create a new, minimal workflow in the same application for testing
- Start → Document Extractor → Output (only output text).
- Check if this new workflow also gets stuck at the extractor node.
- If the new workflow works normally but the old one doesn’t: The old workflow’s configuration might have an issue (e.g., variable binding, no result from the previous node, etc.).
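For the small-file test above, a throwaway text file of a couple of KB is enough. Below is a minimal sketch for generating one; the function name `make_test_txt` and the file name `extractor_test.txt` are made up for illustration and are not part of Dify.

```shell
# Hypothetical helper: write a small plain-text file for the upload test.
# $1 = output path (default extractor_test.txt), $2 = number of lines (default 50).
make_test_txt() {
  out="${1:-extractor_test.txt}"
  lines="${2:-50}"
  : > "$out"                      # create or truncate the file
  i=1
  while [ "$i" -le "$lines" ]; do
    printf 'Line %03d: sample text for the extractor test.\n' "$i" >> "$out"
    i=$((i + 1))
  done
  wc -c < "$out"                  # print the resulting size in bytes
}

make_test_txt
```

Upload the generated file to the Document Extractor. If even this file hangs, file size is unlikely to be the cause.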

2. Check Workflow Run Details

Open the “Run History / Logs” for this run
- Check if the Document Extractor node returned any error messages, status codes, timeouts, etc.
- If it only shows “Executing” without finishing or reporting errors, it’s highly likely that the request has not returned (e.g., an external parsing service call is stuck).
Check the Document Extractor node’s configuration
- Are the input variables using the output of the previous node?
- You can temporarily change it to “Manually upload file” for a test to rule out variable issues.

3. Key Checks for Self-Hosted Environments (If you are using Docker / Server Deployment)

If you have deployed Dify yourself:

Restart relevant services
- Directly restarting the entire suite of services is usually the fastest method, as it can clear many queue blockages and stuck child processes.
- For Docker deployments, it’s usually:
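```
# Run from the directory that contains your docker-compose.yaml.
# Stop and remove the containers, then start them again in the background.
docker compose down
docker compose up -d
```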
- After restarting, test the same workflow again.
Check logs (Crucial)
- In the logs of the backend API / worker services, search for the following near the time of this run:
  - Errors that mention unstructured, document extractor, or extractor
  - Timeout, OOM (Out Of Memory), connection failed, and other errors
- If there are clear errors in the logs, you can paste the error messages into the post to facilitate further investigation.
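The log search described above can be sketched as a pair of shell helpers. This assumes a Docker Compose deployment whose backend services are named `api` and `worker`; the function names and the keyword list are illustrative, so adjust them to your setup.

```shell
# Keywords from the checklist above, matched case-insensitively.
# "oom" is deliberately broad and may also match unrelated words.
EXTRACTOR_LOG_PATTERN='unstructured|extractor|timeout|timed out|out of memory|oom|connection (failed|refused|reset)'

# Hypothetical helper: keep only the log lines on stdin that match the keywords.
filter_extractor_errors() {
  grep -iE "$EXTRACTOR_LOG_PATTERN"
}

# Hypothetical helper: scan recent backend logs.
# Assumption: the Compose services are named "api" and "worker".
# $1 = how far back to look (default 30m).
scan_dify_logs() {
  docker compose logs --since "${1:-30m}" api worker 2>&1 | filter_extractor_errors
}
```

Run `scan_dify_logs 1h` from the directory that contains docker-compose.yaml, ideally right after reproducing the stuck run, and anonymize the matching lines before posting them.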
Resource Utilization
- Is the server’s CPU / memory maxed out?
- If resources are insufficient, parsing large files can easily get stuck or become very slow, appearing as if the node is continuously running.
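A quick way to see whether a container is running out of memory is to take one `docker stats` snapshot and flag anything above a threshold. The sketch below is illustrative: the helper names and the 90% default are assumptions, not Dify settings.

```shell
# Hypothetical helper: read "name mem%" pairs on stdin and print those at or
# above the threshold given as $1 (percent, default 90).
flag_high_memory() {
  awk -v limit="${1:-90}" '{
    pct = $2
    sub(/%/, "", pct)              # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1, $2
  }'
}

# Hypothetical helper: take one snapshot of all running containers.
check_container_memory() {
  docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | flag_high_memory "${1:-90}"
}
```

`check_container_memory 80` lists containers using at least 80% of the memory available to them. Host-level tools such as `free -h` or `top` show the overall picture.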
Have you configured an external document parsing service?
- If you previously integrated unstructured or other third-party parsing services:
  - Confirm that these services are still running and network connectivity is normal (check if container names / IPs have changed).
  - If the service is down, Dify waiting for a response might also get stuck indefinitely.
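Because an unresponsive parsing service can leave the node waiting indefinitely, it helps to probe the service with a hard timeout. This is a minimal sketch that assumes `curl` is installed; the function name is hypothetical, and the URL you pass in depends entirely on your own deployment.

```shell
# Hypothetical helper: succeed only if the URL answers within the timeout.
# Any HTTP response (even 4xx/5xx) counts as reachable; only connection
# failures and timeouts count as unreachable.
# $1 = URL of the parsing service, $2 = timeout in seconds (default 5).
check_parser_reachable() {
  url="$1"
  timeout="${2:-5}"
  if curl --silent --show-error --max-time "$timeout" --output /dev/null "$url"; then
    echo "reachable: $url"
  else
    echo "NOT reachable within ${timeout}s: $url" >&2
    return 1
  fi
}
```

Call it as `check_parser_reachable "http://<parser-host>:<port>/"`, replacing the placeholder with the address configured for your parsing service, and run it from a machine or container on the same network as the Dify backend.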

4. Cloud Version / Non-Self-Hosted

If you are using the official cloud version:

Try:
- Try again with a different browser or in incognito mode (to rule out browser cache / plugin interference).
- Duplicate this workflow, create an identical one, and see if the duplicated one works normally.
Record this information to facilitate posting it or submitting it to official support:
- Your Dify version (or state “Cloud Version”).
- The App ID / Workflow ID for this process (do not post sensitive information).
- Run ID or approximate timestamp.
- Does the same node also get stuck in other workflows?

5. Information to Add to Your Current Post

To help pinpoint the issue faster, please add a few more details to this post:

You are currently using:
- Official cloud, or
- Self-hosted (Docker / K8s; which version)?
Document type and approximate size (e.g., PDF, 20MB, includes images/scans).
This issue:
- Does it get stuck for all files? Or only for a few specific files?
- Are all other workflows using the Document Extractor also stuck?
If self-hosted: Paste a small snippet of backend logs from the relevant time (after anonymization).

You can start by quickly trying steps 1-3 above (small-file test, minimal workflow, service restart) and reply with the results in this post. I can then help you narrow down the problem based on those results.

dify · January 22, 2026, 6:29am

@Dify_Forum_Helper I’m using the official cloud, and it keeps getting stuck even with empty Word documents.

Dify_Forum_Helper · January 22, 2026, 6:30am

Your additional information is crucial, thank you for clarifying.

We can now confirm a few things:

You are using the official cloud, not a self-hosted deployment
The document type is Word, and even an empty document gets stuck at the document extractor node
This suggests the problem is not tied to file size or content complexity, but is rather a ‘task not ending correctly’ issue (possibly related to a recent version update or a backend service anomaly)

Since it’s a cloud environment, what you can primarily do is ‘rule out frontend issues + help the official team pinpoint the problem’; backend logs can only be checked by the official team. I suggest you quickly try the steps below and post the results directly:

1. Quickly Conduct 3 Comparative Tests

Create a Minimal Workflow (to avoid interference from existing process configurations)

Create a brand new workflow application
Only include three nodes:
- Start (User Input)
- Document Extractor (Upload File)
- Output (only output the extracted text)
Test once with a ‘blank Word’ document to see if it still gets stuck at the document extractor:

A. If the new minimal workflow also gets stuck:

There’s a high probability that the document parsing service for the current account / current region has an issue, which requires official investigation.

B. If the new workflow works normally, but the original workflow is still stuck:

A configuration in the old workflow may be ‘broken’ (e.g., variable type changes, remnants of old-version nodes, etc.), and you might consider rebuilding the workflow from scratch.

Run an A/B Test by Changing File Types

Also try the following in the new minimal workflow:
- A .txt text file of a few KB
- A very small .pdf
Compare the results:
- If txt / pdf works normally, but only Word gets stuck: this indicates that the cloud’s current Word parsing might have an issue. You can clearly state in the post ‘txt / pdf works normally, but Word (even empty documents) gets stuck’.
- If all types get stuck: this indicates an anomaly in the entire document parsing pipeline.

Test again with a different browser / incognito mode

The purpose is simply to rule out interference from browser cache, plugins, etc.
Run the same minimal workflow again in an incognito window to confirm whether the behavior is exactly the same.

2. Add These Key Troubleshooting Results to the Post

I recommend adding the information below to the current post in this format. It will help the official team or community members quickly pinpoint the issue and make it more likely to be treated as a ‘platform-side problem’:

Usage Environment:
- ‘Official Cloud Dify (non-self-hosted)’
Reproduction Steps (as simple and clear as possible):
- Create workflow: Start → Document Extractor (Upload File) → Output
- Uploaded file:
  - Blank Word (.docx)
  - Whether you also tested txt / pdf (state the results clearly)
- Result:
  - The document extractor node continuously shows ‘Running’, does not finish even after several minutes, and there is no error message.
Add some time information:
- Approximately when did the problem start (e.g., ‘started around 14:00 this afternoon’)
- The same process was normal before.
If convenient:
- Attach a screenshot of the new minimal workflow stuck at the document extractor node.

This way, the official team will find it easier to determine if it’s:

A document parsing service/queue anomaly in a specific region
Or if a recent upgrade to the document extractor introduced a bug (especially for Word / empty documents)

3. For now, you can temporarily bypass it this way (if you need to use it urgently)

Before the issue is officially fixed, if you just need to continue processing documents, you can consider:

First, use a local tool to export the Word document as .txt or .pdf, then try uploading it to the document extractor.
If txt / pdf works normally on your end, you can at least ensure business continuity.
After the official team confirms or fixes the issue, resume using Word normally.

If you are willing, you can provide:

The test results of the new minimal workflow (results for Word / txt / pdf respectively)
The approximate time at which you reproduced the issue

Reply directly below this post, and I can help you organize it into a ‘reproduction description’ template that can be directly used for official support; you just need to copy and paste it.

dify · January 22, 2026, 6:35am

@Dify_Forum_Helper PDF format is fine, but Word documents don’t work. Hope the official team fixes it soon.

Dify_Forum_Helper · January 22, 2026, 6:36am

You’ve already helped narrow down the scope very clearly:

Official cloud
Document Extractor
PDF works normally
Word (even empty documents) consistently causes a freeze

This basically indicates a new issue with the ‘Word parsing pipeline’ on the cloud side, rather than a problem with your account or workflow configuration.

Currently, there are two things you can do: one is a temporary workaround, and the other is to provide complete information to facilitate official investigation.

1. The Most Stable Temporary Solution

Before the official fix, I suggest you:

First, convert Word to PDF or TXT locally
- Save Word as .pdf / .txt
- Upload PDF / TXT in the Document Extractor, instead of the original .docx
- You’ve already verified that PDF can be processed normally, which can ensure your operations aren’t blocked.
If you have a small amount of content, you can also:
- Directly copy the Word text content and paste it into ‘User Input’, instead of uploading a file (suitable for when there isn’t much content).

2. I Suggest You Add a “For Official Review” Message to Your Post

You just said:

PDF format is fine, but Word documents don’t work. Hope the official team fixes it soon.

You can write it a bit ‘more like a bug reproduction description’, for example, by replying with (you can copy and modify this):

Environment: Official Dify Cloud
Reproduction Steps:
- Create the simplest workflow: Start → Document Extractor (upload file) → Output text
- Upload a Word (.docx) file, either empty or with any content
Expected: Node execution completes, outputting the extracted text
Actual: The Document Extractor node remains “running” indefinitely, without error
Comparison: Under the same workflow, uploading a small PDF file completes normally

This description makes it easier for official staff to ‘reproduce it directly’ and is more likely to be treated as a platform bug.

3. How to Follow Up

Since this is a cloud-side issue, you cannot see the backend logs and can only wait for the official operations/development team to investigate.
You can continue to use the “Word → PDF/TXT then upload” method as a temporary measure.
If an official staff member later replies to the post with fix progress or a timeline, you can also @ me, and I can help you confirm whether your existing workflow configuration needs further adjustment.

If you’d like, I can also help you refine a “bilingual Chinese-English” reproduction description, which would be convenient for you to submit to support or use on other channels later.

Sherry_M · January 26, 2026, 3:27am

What is the size of your Word document? Is the PDF used for testing converted from a Word document?
