Bug - "." and "," in output vectors make search unreliable

#2
by IngLP - opened

Hi,
I am trying to use the model for a production project. Unfortunately, all resulting sparse vectors contain "." and "," with high weight:

image

image

This makes totally unrelated texts match.
Is this expected? Should those tokens be manually removed?
Thanks!
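
On the question of removing those tokens manually: a possible stopgap is to drop punctuation-only tokens from each sparse vector after encoding. The following is a minimal sketch, assuming the sparse vector is available as a plain token-to-weight dict. The helper name `drop_punctuation_tokens` and the example weights are hypothetical and not part of the model's API.

```python
import string

# Hypothetical helper: not part of the model or any library API.
def drop_punctuation_tokens(sparse_vec):
    """Return a copy of a token -> weight dict without punctuation-only tokens."""
    punct = set(string.punctuation)
    return {
        tok: weight
        for tok, weight in sparse_vec.items()
        # Keep the token unless it is non-empty and made up solely of ASCII punctuation.
        if not (tok and all(ch in punct for ch in tok))
    }

# Made-up weights, for illustration only.
example = {".": 2.1, ",": 1.8, "search": 1.2, "vector": 0.9}
filtered = drop_punctuation_tokens(example)
```

The same filter would have to be applied to both document and query vectors, otherwise the two sides are no longer scored over the same vocabulary.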

Hi,
thanks for the issue. I will update the model as soon as possible.

@IngLP, it should be fixed with the new loss function.
I am publishing these models for study purposes. While I am flattered by their use in production, I do not feel comfortable recommending it.

nickprock changed discussion status to closed
