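Short answer: your current approach (having the LLM output Markdown with `##` headings and expecting the Markdown converter to split Sheets on `##` automatically) is not guaranteed to work in Dify. The Markdown converter currently has no officially defined rule saying that a `##` heading becomes an Excel worksheet name.
In other words:
- It can indeed convert Markdown tables to Excel;
- But the multi-Sheet and Sheet-naming logic is currently something of a black box, not a publicly documented, configurable capability. So seeing Markdown that looks fine but is not split into Sheets by `##` is consistent with the current implementation.
The sections below cover how you can handle this.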
1. Understanding the Current Behavior
Based on the structure you provided:
    {
      "text": "## XXX\n\n| Col 1 | Col 2 |\n| --- | --- |\n| ... | ... |"
    }
Your expectation is:
- `## XXX` → an Excel Sheet named `XXX` appears;
- `## YYY` → a new Sheet is created.
However, the Markdown converter currently behaves more like this:
- It treats the entire Markdown section as a single document;
- The focus is on the ‘table itself’ rather than ‘heading level → Sheet’.
Therefore, common phenomena are:
- It might only generate one Sheet;
- Or it uses a default Sheet name (e.g., `Sheet1`) instead of the text after `##`.
This is not because your Markdown is incorrect, but rather because the current product design does not treat ## as a formal protocol for ‘Sheet boundary + Sheet name’.
2. Solutions for Achieving ‘Split Sheets by Heading’
If you have a strong requirement to ‘split Sheets by ##’, you can consider bypassing the Markdown converter’s default rules by using a ‘code node + Excel library’ approach to explicitly write your desired structure into multiple Sheets.
Approach A: LLM Outputs Structured JSON, Then Code Generates Excel
- In the LLM node, do not have it output Markdown directly; instead, have it output structured JSON, for example:

      {
        "sheets": [
          {
            "name": "SheetA",
            "table": [
              ["Col 1", "Col 2"],
              ["a1", "a2"],
              ["b1", "b2"]
            ]
          },
          {
            "name": "SheetB",
            "table": [
              ["Col 1", "Col 2"],
              ["x1", "x2"]
            ]
          }
        ]
      }

- Use a code node (Python recommended) to parse this JSON, and use a library such as `openpyxl` or `pandas` to create the Excel file yourself, with full control over multiple tables and Sheets. Pseudocode example:

      import io
      import json

      from openpyxl import Workbook

      # `inputs` and `outputs` are placeholders for however your code node
      # receives upstream variables and returns results; adapt them to your setup.
      data = json.loads(inputs["llm"]["text"])  # assumes the LLM output is the JSON above

      wb = Workbook()
      wb.remove(wb.active)  # drop the default sheet so only the named Sheets remain

      for sheet in data["sheets"]:
          # Excel sheet names are limited to 31 characters
          ws = wb.create_sheet(title=sheet["name"][:31])
          for row in sheet["table"]:
              ws.append(row)

      # Save to memory and hand the bytes to subsequent nodes as a file
      buffer = io.BytesIO()
      wb.save(buffer)
      outputs["excel_file"] = {
          "type": "document",
          "filename": "result.xlsx",
          "content": buffer.getvalue(),
      }

- Subsequently, you can provide this `excel_file` as a regular file for users to download, or pass it to other nodes.
Advantages:
- Completely independent of the Markdown converter’s internal rules;
- Sheet names, number of Sheets, and content of each Sheet are all under your control;
- The LLM’s task is also clearer: it’s only responsible for ‘structured planning,’ not Excel details.
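Since Sheet names now come straight from the LLM, it may be worth cleaning them before writing. Excel limits names to 31 characters, disallows the characters `\ / ? * [ ] :`, and requires names to be unique within a workbook. Below is a minimal sketch of such a cleanup; `safe_sheet_names` is a hypothetical helper name introduced here for illustration, not part of Dify or `openpyxl`.

```python
import re

# Characters Excel does not allow in worksheet names.
_INVALID_CHARS = re.compile(r'[\\/*?:\[\]]')
MAX_LEN = 31  # Excel's limit on worksheet name length


def safe_sheet_names(names):
    """Return Excel-safe, unique sheet names in the same order as `names`.

    Hypothetical helper: replaces forbidden characters, truncates to
    31 characters, and appends _2, _3, ... to resolve duplicates.
    """
    result = []
    used = set()
    for index, raw in enumerate(names, start=1):
        # Replace forbidden characters, trim whitespace, truncate, and drop
        # leading/trailing apostrophes (also rejected by Excel).
        base = _INVALID_CHARS.sub("_", str(raw)).strip()[:MAX_LEN].strip("'")
        if not base:
            base = f"Sheet{index}"  # fallback for empty names
        candidate = base
        suffix = 2
        # Excel compares sheet names case-insensitively.
        while candidate.lower() in used:
            tail = f"_{suffix}"
            candidate = base[:MAX_LEN - len(tail)] + tail
            suffix += 1
        used.add(candidate.lower())
        result.append(candidate)
    return result
```

You would call it once on the list of names from the JSON and use the returned names when creating the Sheets, instead of slicing each name to 31 characters individually.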
Approach B: Continue Having LLM Output Markdown, But Use Code to Split and Write Excel
If your current LLM Prompt is already fixed to output Markdown, and you prefer Markdown’s readability, you can:
- Still require the LLM to use a similar structure:

      ## SheetA

      | Col 1 | Col 2 |
      | --- | --- |
      | a1 | a2 |
      | b1 | b2 |

      ## SheetB

      | Col 1 | Col 2 |
      | --- | --- |
      | x1 | x2 |

- Add a code node downstream to do two things:
  - Use regular expressions or a Markdown parsing library to split the text into blocks by `##` headings;
  - Extract the first table in each block as a 2D array, then write multiple Excel Sheets in a similar way to the above.

  For example, a simple regex approach (pseudocode):

      import re

      md = inputs["llm"]["text"]  # `inputs` is a placeholder for the upstream LLM output

      # Split into sheet blocks at each line that starts with "## "
      blocks = re.split(r'^##[ \t]+', md, flags=re.MULTILINE)
      # blocks[0] is whatever precedes the first ## heading and can be ignored
      sheet_blocks = blocks[1:]

      sheets = []
      for block in sheet_blocks:
          lines = block.splitlines()
          if not lines:
              continue  # a heading marker with nothing after it
          # The first line of the block is the sheet name
          sheet_name = lines[0].strip()
          sheet_body = "\n".join(lines[1:])
          # Parse the first Markdown table in sheet_body into a 2D array
          # (parse_markdown_table is not defined here: write your own parser
          # or use an existing library)
          table = parse_markdown_table(sheet_body)
          sheets.append({"name": sheet_name, "table": table})

      # Then reuse the openpyxl writing logic from above

- Ultimately, this code node generates the Excel file, rather than relying on the Markdown converter to automatically infer Sheets.
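The table-parsing step is the part most people get stuck on, so here is a minimal sketch of what a `parse_markdown_table` function could look like. It is an illustration under simplifying assumptions, not a full Markdown parser: every table row must start with `|`, escaped pipes (`\|`) inside cells are not handled, and only the first table in the text is returned.

```python
def parse_markdown_table(text):
    """Return the first pipe table in `text` as a list of rows (lists of str).

    Simplified sketch: rows must start with "|"; escaped pipes are not handled.
    """
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            if rows:
                break  # the first table has ended
            continue  # still looking for the start of a table
        # Remove exactly one leading pipe and, if present, one trailing pipe.
        inner = stripped[1:]
        if inner.endswith("|"):
            inner = inner[:-1]
        cells = [cell.strip() for cell in inner.split("|")]
        # The second row of a pipe table is the delimiter row (| --- | :---: |).
        is_delimiter = len(rows) == 1 and all(
            cell and set(cell) <= set("-:") for cell in cells
        )
        if is_delimiter:
            continue
        rows.append(cells)
    return rows
```

The first returned row is the header row, so the result can be appended to a worksheet row by row. If your LLM output contains more complex tables, a real Markdown parsing library is likely the safer choice.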
3. What If You Absolutely Must Continue to Rely on the Markdown Converter?
Based on currently available information, there is no ‘guaranteed version’ of documentation describing:
- The mapping rules between headings like `##`, `###`, etc., and Sheets;
- Or a switch that can enable ‘splitting Sheets by heading’.
Therefore:
- Even if you currently discover a way that ‘seems to split Sheets by heading,’ it might be an implementation detail and not necessarily stable in future versions;
- For production scenarios, it’s still recommended to use the aforementioned ‘LLM + code-generated Excel’ approach to avoid relying on undocumented behavior.
4. Practical Advice (Minimal Change Version)
On your existing workflow, if you don’t want to make major changes, you can fine-tune it like this:
- Upstream LLM:
  - Continue to output ‘Markdown with `##`’;
  - But additionally, in the Prompt, require the model to output a JSON structure as well, for example:

    First, provide the Markdown for readability;
    Then, provide a JSON code block with the following structure:

    ```json
    {
      "sheets": [
        { "name": "...", "table": [["Header 1", "Header 2"], ["...", "..."]] }
      ]
    }
    ```

- Downstream, add a new code node:
  - Read only the JSON section, ignoring the Markdown section;
  - Generate the file using the JSON → multi-Sheet Excel logic;
  - No longer use the Markdown converter to convert to Excel.
This way:
- Humans can still view elegant Markdown in chat records;
- When the machine generates Excel, it follows a ‘structured + controllable’ path.
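For the "read only the JSON section" step, one possible sketch is shown below. It assumes the model wraps the JSON in a fenced block tagged `json`, as the Prompt above requests; `extract_sheets` is a hypothetical helper name. LLM output is not always well-formed, so the function validates the structure and raises `ValueError` instead of guessing.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the snippet embeds cleanly


def extract_sheets(reply):
    """Return the "sheets" list from the last fenced json block in `reply`."""
    pattern = re.compile(FENCE + r"json\s*\n(.*?)" + FENCE, re.DOTALL)
    matches = pattern.findall(reply)
    if not matches:
        raise ValueError("no fenced json block found in the model reply")

    # json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    # surfaces as the same exception type as the checks below.
    data = json.loads(matches[-1])
    if not isinstance(data, dict) or not isinstance(data.get("sheets"), list):
        raise ValueError('expected a JSON object with a "sheets" list')

    for sheet in data["sheets"]:
        if not isinstance(sheet, dict) or not isinstance(sheet.get("name"), str):
            raise ValueError('each sheet needs a string "name"')
        table = sheet.get("table")
        if not isinstance(table, list) or not all(isinstance(r, list) for r in table):
            raise ValueError('each sheet needs a "table" that is a list of rows')
    return data["sheets"]
```

Taking the last matching block is a deliberate choice: the Prompt asks for the Markdown first and the JSON afterwards, so the final `json` block is the most likely candidate if the model happens to emit more than one.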
Summary
- The core issue you’re facing is: the Markdown converter does not have a reliable, publicly documented rule for ‘`##` → Excel sheet’, so it won’t create multiple Sheets in the way you expect.
- To stably control Sheet names and structure, it’s recommended to use:
  - ‘LLM outputs structured JSON → code node generates Excel’; or
  - ‘LLM outputs Markdown with `##` → code node splits by `##` → writes Excel itself’.
- This way, you don’t rely on undocumented internal implementation details, and future version upgrades will be safer.
If it’s convenient for you, you can paste a small, complete example of your current LLM output (after anonymization), and I can directly help you write a corresponding code node example to convert this output into a multi-Sheet Excel file.