How to handle queries without obvious keywords?

August 20, 2025

—

I’m working on a legal QA app and I’ve hit a bit of a roadblock. I generated embeddings using LegalBERT and set up retrieval, but I’m running into issues when testing.

Here’s the situation:
When I test retrieval quality, I run a question and check the top-5 retrieved results. If the query includes clear keywords, the system works well. But if the query is less explicit, the results are far off.
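To make the check described above concrete, here is a minimal sketch of a top-k test, assuming cosine similarity over precomputed embeddings. The three-dimensional vectors, the document IDs, and the helper names (`cosine`, `top_k`, `hit_at_k`) are all made up for illustration; real LegalBERT embeddings would have hundreds of dimensions and the actual setup may rank differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=5):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def hit_at_k(query_vec, doc_vecs, expected_id, k=5):
    """True if the expected document appears among the top-k results."""
    return expected_id in [doc_id for doc_id, _ in top_k(query_vec, doc_vecs, k)]

# Toy stand-ins for real embeddings (hypothetical values).
toy_docs = {
    "amendment_1": [0.9, 0.1, 0.0],
    "amendment_2": [0.1, 0.9, 0.1],
    "amendment_4": [0.0, 0.2, 0.9],
}
toy_query = [0.2, 0.8, 0.0]

results = top_k(toy_query, toy_docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Running `hit_at_k` over a small set of (question, expected document) pairs, split into keyword-heavy and keyword-free phrasings, would put a number on the gap instead of relying on spot checks.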

For example, suppose I ask:

The correct retrieval should be the Second Amendment, but unless I explicitly include the word “firearm” or “weapon”, my model doesn’t find it. Adding keywords makes it work (which makes sense), but this limits usability.

How can I handle cases where the user query doesn’t share an obvious keyword overlap with the underlying text? Are there effective techniques for this type of embedding problem?

submitted by /u/Interesting_Good8181 to r/learnmachinelearning