---
language:
- en
- de
- es
- it
- nl
- pt
- pl
- ro
- sv
- da
- fi
- hu
- el
- fr
- ru
- uk
- tr
- ar
- hi
- jp
- ko
- zh
- vi
- la
- ha
- sw
- yo
- wo
library: xvasynth
tags:
- emotion
- audio
- text-to-speech
- speech-to-speech
- voice conversion
- tts
pipeline_tag: text-to-speech
---

GitHub project, inference Windows/Electron app: https://github.com/DanRuta/xVA-Synth

Fine-tuning app: https://github.com/DanRuta/xva-trainer

This is the base model for training other [🤗 xVASynth's](https://huggingface.co/spaces/Pendrokar/xVASynth-TTS) "xVAPitch" type models (v3). The model itself is used by the xVATrainer TTS model training app, not for inference. All created by Dan ["@dr00392"](https://huggingface.co/dr00392) Ruta.

`The v3 model now uses a slightly custom tweaked VITS/YourTTS model. Tweaks including larger capacity, bigger lang embedding, custom symbol set (a custom spec of ARPAbet with some more phonemes to cover other languages), and I guess a different training script.` - Dan Ruta

When used in the xVASynth editor, it is an American adult male voice. The default pacing is too fast and has to be adjusted.

xVAPitch_5820651 model sample:

There are hundreds of fine-tuned models on the web, but most of them use non-permissive datasets.

## xVASynth Editor v3 walkthrough video ▶:
[![Video](https://img.youtube.com/vi/5u4xpI-cAd8/hqdefault.jpg)](https://www.youtube.com/watch?v=5u4xpI-cAd8)

## xVATrainer v1 walkthrough video ▶:
[![Video](https://img.youtube.com/vi/PXv_SeTWk2M/hqdefault.jpg)](https://www.youtube.com/watch?v=PXv_SeTWk2M)

Papers:
- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech - https://arxiv.org/abs/2106.06103
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for Everyone - https://arxiv.org/abs/2112.02418

Referenced papers within code:
- Multi-head attention with Relative Positional embedding - https://arxiv.org/pdf/1809.04281.pdf
- Transformer with Relative Positional Encoding - https://arxiv.org/abs/1803.02155
- SDP - https://arxiv.org/pdf/2106.06103.pdf
- Spline Flow - https://arxiv.org/abs/1906.04032

Used datasets: Unknown/Non-permissible data