[BUG?] Metadata Filter in Knowledge Search Node Not Working Since v1.13.0
In a workflow that was working correctly until about a week ago (around the v1.13.0 release), the metadata filter suddenly stopped working.
Environment
Dify Cloud
Current Status
Metadata product_name = "ProductA" is set in the knowledge base.
product_name = [Variable: user_input] is set in the metadata filter of the Knowledge Search Node.
Even when executed with user_input = "ProductA", { "result": [] } is returned.
The filter value does not appear to be passed correctly to the knowledge retrieval, so the search may not be applying the metadata filter at all.
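The expected behaviour can be sketched in Python as below. This is only an illustration of what the filter should do, not Dify's actual code: `resolve_value`, `filter_by_metadata`, and the `{"variable": ...}` reference shape are hypothetical names invented for this example. A variable reference and a constant should end up matching the same documents.

```python
from typing import Any


def resolve_value(value: Any, variables: dict[str, Any]) -> Any:
    """Replace a {"variable": name} reference with its runtime value.

    Constants are returned unchanged. The reference shape is hypothetical.
    """
    if isinstance(value, dict) and "variable" in value:
        return variables[value["variable"]]
    return value


def filter_by_metadata(
    documents: list[dict[str, Any]],
    condition: dict[str, Any],
    variables: dict[str, Any],
) -> list[dict[str, Any]]:
    """Keep only documents whose metadata field equals the resolved value."""
    expected = resolve_value(condition["value"], variables)
    field = condition["field"]
    return [d for d in documents if d.get("metadata", {}).get(field) == expected]


# Toy knowledge base mirroring the report above.
documents = [
    {"id": 1, "metadata": {"product_name": "ProductA"}},
    {"id": 2, "metadata": {"product_name": "ProductB"}},
]
runtime_variables = {"user_input": "ProductA"}

variable_condition = {"field": "product_name", "value": {"variable": "user_input"}}
constant_condition = {"field": "product_name", "value": "ProductA"}

via_variable = filter_by_metadata(documents, variable_condition, runtime_variables)
via_constant = filter_by_metadata(documents, constant_condition, runtime_variables)
```

If the variable were left unresolved, the comparison would be against the reference itself and nothing would match, which would produce the same empty result seen here.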
Steps Taken
Confirmed field name match.
Even when changing the Variable to a Constant (fixed value "ProductA"), [] is returned.
Rolling the workflow back to last month’s version does not help either.
v1.11.3 included the fix "fix DatasetRetrieval._process_metadata_filter_func miss in operator" (#30199), but a regression may have been introduced in a subsequent update.
When the value shown in the attached photo is a variable, the search returns nothing; when it is a constant, the search works without issues. Until recently I was able to search using variables, so if there is a way to filter on metadata using variables, please let me know.
This appears to be a bug and has already been reported on GitHub.
I plan to fix it myself, but not before tomorrow at the earliest, so the maintainers may well get to it first.
Is there any alternative solution? The problem is as follows:
The Dify system has encountered an issue. A file I originally uploaded, kb2-test (text), was used in the KB2 stage with the input b9101, which is the file name identified by the previous node. The KB2 node was able to read that file and answer accordingly (showing that the file had been invoked). However, files uploaded later cannot be invoked at this node after being bound.
The system clearly invokes the original kb2-test file based on the file name input at KB2 (even after I deleted b9101 from within the file, it can still retrieve the content). I do not know why, but the KB2 files uploaded later cannot be invoked.
I am certain that kb2-test can be invoked whether or not a metadata filter is applied (that is, the KB2 node correctly invokes the file based on input b9101). The metadata was added after upload, since there is no option to set metadata filter parameters during upload. Files uploaded later, however, cannot be invoked whether or not metadata is set on them after upload.
I even changed the file name (to b1234), and the system can still invoke it. So the system is clearly retrieving kb2-test by content, not by file name and not by the file name field within the document.
My questions are:
Can the Dify system use a metadata filter (which I believe is a basic function)? That includes setting metadata during upload and then filtering on it once the file has metadata set. I did not see any option to add metadata when uploading knowledge files; there is only “sync from Notion”, “access to API” on the embedding page, and “create knowledge from pipeline”. What are these functions?
If metadata cannot be used, why can the system not directly invoke files by file name? If Dify can only perform retrieval based on the semantic content within documents, is that not absurd?
Is there a better method to achieve either:
(1) invoking files via metadata filter, or
(2) invoking files by file name?
If the system can only perform retrieval based on the semantic meaning of all document contents, it is like a reader going into a library to find a book and the librarian having to pull out every book and search through them all. Is that not absurd?
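The behaviour being asked for, narrowing by file name or metadata first and only then ranking by content, can be sketched as follows. This is a minimal illustration under stated assumptions, not how Dify works internally: `overlap_score` is a crude word-overlap stand-in for real semantic similarity, and `retrieve`, the `name` field, and the sample texts are all hypothetical.

```python
from typing import Optional


def overlap_score(query: str, text: str) -> int:
    """Count shared words; a stand-in for a real semantic similarity score."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(
    query: str,
    documents: list[dict[str, str]],
    name: Optional[str] = None,
    top_k: int = 3,
) -> list[dict[str, str]]:
    """Filter by exact document name first (if given), then rank by content."""
    candidates = [d for d in documents if name is None or d["name"] == name]
    ranked = sorted(
        candidates,
        key=lambda d: overlap_score(query, d["text"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical library with one older and one newer file.
library = [
    {"name": "kb2-test", "text": "installation notes for unit b9101"},
    {"name": "kb2-new", "text": "installation notes for unit b1234"},
]

# With a name filter, only the requested file is considered at all.
filtered_hits = retrieve("installation notes", library, name="kb2-new")

# Without a filter, every document competes on content alone.
unfiltered_hits = retrieve("installation notes", library)
```

In the library analogy, the name filter is the catalogue lookup that picks the shelf, and content ranking only happens among the books found there.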