How can I make custom chunked output from a code node pass the knowledge base node's parent_child_structure (multimodal) validation?

Problem Background

I am building a custom code-parsing tool on top of a Dify workflow, targeting the Zig language. To get more precise RAG results, I have disabled the system’s automatic chunking and implemented the “parent-child chunking” logic manually in a code node (Node.js/Python):

  • Parent Chunk: A complete function implementation or type definition.
  • Child Chunk: Fine-grained semantic units derived from the parent chunk (e.g., comments, function signatures).
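
To make the split concrete, here is a minimal Python sketch of how one parent chunk might be broken into child chunks. It is only an illustration: split_parent is a hypothetical helper (not part of Dify), it assumes the signature is everything before the first "{", and it treats every line starting with "//" as a comment.

```python
# Hypothetical helper, not a Dify API: splits one Zig function (the parent
# chunk) into child chunks made of its comments and its signature.
def split_parent(parent: str) -> dict:
    lines = parent.splitlines()
    # Zig doc comments start with "///" and ordinary comments with "//".
    comments = [ln.strip() for ln in lines if ln.strip().startswith("//")]
    # Assumption: the signature is everything before the first "{".
    head = parent.split("{", 1)[0]
    signature = " ".join(
        ln.strip() for ln in head.splitlines()
        if ln.strip() and not ln.strip().startswith("//")
    )
    children = []
    if comments:
        children.append("\n".join(comments))
    if signature:
        children.append(signature)
    return {"parent_content": parent, "child_contents": children}


zig_source = """/// Entry point.
pub fn main() void {
    // body elided
}"""
chunk = split_parent(zig_source)
```

A real parser would need to handle braces inside comments or string literals; the sketch ignores those cases.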

Core Pain Point

The biggest obstacle is that the JSON object output by the code node is either not recognized by the knowledge base node, or the node reports the error “Output parent_child_structure is missing”. I have tried mimicking the output format of tool nodes, but the lack of official documentation defining the schema of the (multimodal) parent_child_structure type has led to frequent failures in variable mapping.

Actions Taken

  1. Data Structure Restructuring: I’ve tried returning a plain array, as well as an Object containing parent_mode and parent_child_chunks.
  2. Output Variable Definition: In the code node’s “Output Variables,” I manually declared result as type Object, but the variable selector in the downstream knowledge base node still fails to correctly parse its internal sub-properties.
  3. Environment Check: Confirmed that the Embedding model is functioning normally, and child_contents are all non-empty string arrays.
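
For reference, the second attempt (an Object containing parent_mode and parent_child_chunks, returned under the declared output variable result) can be sketched as a Python code node like the one below. This assumes the usual code node convention that main() returns a dict whose keys match the declared output variables. The input parameter chunks is hypothetical, and whether the knowledge base node accepts this shape is exactly the open question.

```python
# Sketch of a Dify code node (Python). The node calls main() and exposes the
# keys of the returned dict as output variables.
def main(chunks: list) -> dict:
    # `chunks` is a hypothetical input: a list of
    # {"parent_content": str, "child_contents": [str, ...]} dicts.
    cleaned = []
    for chunk in chunks:
        parent = str(chunk.get("parent_content", "")).strip()
        children = [
            c.strip()
            for c in chunk.get("child_contents", [])
            if isinstance(c, str) and c.strip()
        ]
        # Skip chunks that would end up with an empty parent or no children.
        if parent and children:
            cleaned.append({"parent_content": parent, "child_contents": children})
    # The top-level key matches the declared output variable name ("result").
    return {
        "result": {
            "parent_mode": "paragraph",
            "parent_child_chunks": cleaned,
        }
    }
```

Dropping empty strings before returning keeps the output consistent with the environment check above, where every child_contents entry is expected to be non-empty.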

Questions for Guidance

  1. Official Schema Definition: What is the complete JSON Schema for the strongly typed variable parent_child_structure? Besides parent_mode and parent_child_chunks, are there hidden metadata fields or specific $schema identifier requirements?
  2. Variable Recognition Logic: Why is the Object output by the code node often filtered out (not displayed) in the knowledge base node’s variable selector? Is there a specific variable naming convention or “Output Variable” declaration method that must be followed?
  3. Best Practices for Manual Chunking: If I want to bypass Dify’s default cleaning logic and directly store preprocessed parent-child chunks into the knowledge base, aside from the “code node → knowledge base node” path, is there a more mature API or plugin approach?

Attachment: Current Output Format Reference

{
  "parent_child_structure": {
    "parent_mode": "paragraph",
    "parent_child_chunks": [
      {
        "parent_content": "pub fn main() void { ... }",
        "child_contents": ["pub fn main()", "void { ... }"]
      }
    ]
  }
}
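
A local sanity check for the shape above might look like the following sketch. It encodes only the fields visible in this attachment and is not the official schema, which is what I am asking about. check_structure is a hypothetical helper name.

```python
def check_structure(payload: dict) -> list:
    """Return a list of problems; an empty list means the shape matches."""
    body = payload.get("parent_child_structure")
    if not isinstance(body, dict):
        return ["missing top-level 'parent_child_structure' object"]
    problems = []
    if not isinstance(body.get("parent_mode"), str):
        problems.append("parent_mode should be a string")
    chunks = body.get("parent_child_chunks")
    if not isinstance(chunks, list):
        return problems + ["parent_child_chunks should be a list"]
    for i, chunk in enumerate(chunks):
        if not isinstance(chunk, dict):
            problems.append(f"chunk {i}: should be an object")
            continue
        parent = chunk.get("parent_content")
        if not (isinstance(parent, str) and parent.strip()):
            problems.append(f"chunk {i}: parent_content is empty or not a string")
        children = chunk.get("child_contents")
        if not (
            isinstance(children, list)
            and children
            and all(isinstance(c, str) and c.strip() for c in children)
        ):
            problems.append(
                f"chunk {i}: child_contents must be a non-empty list of non-empty strings"
            )
    return problems


example = {
    "parent_child_structure": {
        "parent_mode": "paragraph",
        "parent_child_chunks": [
            {
                "parent_content": "pub fn main() void { ... }",
                "child_contents": ["pub fn main()", "void { ... }"],
            }
        ],
    }
}
print(check_structure(example))  # prints []
```

Passing this check only rules out obvious shape errors on my side; it says nothing about hidden fields the knowledge base node may require.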

I would greatly appreciate pointers to official documentation or advice from experienced users. Thank you very much!