ValueError loading Helsinki-NLP tokenizers

Thank you. Yes, I did get the sentencepiece error when I initially switched to MarianTokenizer.from_pretrained(), but everything worked fine once I installed sentencepiece.
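For anyone else hitting this, here is a minimal sketch of how I would check for the dependency before loading the tokenizer. The helper names `missing_packages` and `load_marian_tokenizer` are my own (hypothetical), not part of transformers; the only library call assumed is `MarianTokenizer.from_pretrained()`.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def load_marian_tokenizer(model_name):
    """Load a Marian tokenizer, failing early with a readable message.

    Hypothetical helper: it only wraps MarianTokenizer.from_pretrained()
    with a dependency check for sentencepiece.
    """
    missing = missing_packages(["transformers", "sentencepiece"])
    if missing:
        raise ImportError(
            "Install the missing packages first: pip install " + " ".join(missing)
        )
    # Imported lazily so the check above runs before transformers is touched.
    from transformers import MarianTokenizer

    return MarianTokenizer.from_pretrained(model_name)
```

Usage would be something like `load_marian_tokenizer("Helsinki-NLP/opus-mt-en-de")`, where the model name is just an example.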

Now my issue is that these models don't work with DataParallel anymore, but that is a separate issue (174223): it throws a StopTermination when attempting to access self.model.device deep in its bowels. I may have to hand-jam my own threads for parallel inference. I had to do that for the ModernBERT models a while back, so code reuse is king!
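Roughly, the hand-rolled threading approach looks like the sketch below. This is not my actual code, just the general shape: it assumes each "worker" is a callable that takes a list of inputs and returns a list of outputs (for example, a function wrapping one model copy pinned to one GPU). The names `split_into_shards` and `parallel_infer` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(items, num_shards):
    """Split items into num_shards contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        end = start + size + (1 if i < extra else 0)
        shards.append(items[start:end])
        start = end
    return shards


def parallel_infer(workers, items):
    """Run one shard per worker in its own thread and keep input order.

    workers: list of callables, each mapping a list of inputs to a list
    of outputs (assumption: one callable per model replica / device).
    """
    shards = split_into_shards(items, len(workers))
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, s) for w, s in zip(workers, shards)]
        results = []
        # Collect in submission order so outputs line up with inputs.
        for future in futures:
            results.extend(future.result())
    return results
```

Since the shards are contiguous and the futures are read in submission order, the outputs come back in the same order as the inputs, which saves re-sorting afterwards.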
