[Guide] How to clone any voice locally in Python

Nautilus · (This post was last modified: 26 Feb 2025, 05:27 PM by Nautilus.)

Detailed guide for cloning any voice from an mp3/webm file locally on your PC.
For educational purposes only.

Requirements:

1. 6GB VRAM if you want to run on GPU (you can also run it on CPU, but the voice cloning will take longer to complete)
2. Audio file for the voice you want to clone
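
If you're not sure whether your machine meets the first requirement, a small check like the one below can tell you. This is only a sketch: pick_device is a helper name made up for this guide (it is not part of zonos), and the 6GB threshold is simply the figure from the list above, counted in decimal gigabytes. Whether your installed zonos version accepts "cpu" everywhere is something you'd need to try yourself.

```python
MIN_VRAM_GB = 6.0  # figure taken from the requirements list above


def pick_device(cuda_available: bool, vram_bytes: int, min_vram_gb: float = MIN_VRAM_GB) -> str:
    """Return "cuda" if a GPU with enough memory is present, otherwise "cpu".

    pick_device is a hypothetical helper for this guide, not a zonos function.
    Memory is compared in decimal gigabytes (1 GB = 10**9 bytes).
    """
    if cuda_available and vram_bytes >= min_vram_gb * 10**9:
        return "cuda"
    return "cpu"


try:
    import torch

    available = torch.cuda.is_available()
    # total_memory is reported in bytes for the first GPU (index 0)
    vram = torch.cuda.get_device_properties(0).total_memory if available else 0
    print("Suggested device:", pick_device(available, vram))
except ImportError:
    print("PyTorch is not installed yet; see step 1.")
```

If it suggests "cpu", expect the cloning to take noticeably longer, as mentioned above.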

Demo output: https://imgur.com/a/HdN5tG4

Steps:

1. Install python requirements

Install PyTorch if not already installed: https://pytorch.org/get-started/locally/
Install zonos using pip:

Code:
pip install zonos

2. Copy paste the following code:

Code:
import torch
import torchaudio
from zonos.model import Zonos
from zonos.conditioning import make_cond_dict

# Load the pretrained model onto the GPU (the hybrid variant is left commented out as an alternative)
# model = Zonos.from_pretrained("Zyphra/Zonos-v0.1-hybrid", device="cuda")
model = Zonos.from_pretrained("Zyphra/Zonos-v0.1-transformer", device="cuda")

# Load the reference audio and build a speaker embedding from it
wav, sampling_rate = torchaudio.load("assets/exampleaudio.mp3")
speaker = model.make_speaker_embedding(wav, sampling_rate)

# Describe what should be said, in which voice and in which language
cond_dict = make_cond_dict(text="Hello, world!", speaker=speaker, language="en-us")
conditioning = model.prepare_conditioning(cond_dict)

# Generate audio codes, decode them into a waveform and save the result
codes = model.generate(conditioning)
wavs = model.autoencoder.decode(codes).cpu()
torchaudio.save("sample.wav", wavs[0], model.autoencoder.sampling_rate)

Create a new file sample.py, paste the above code into it and replace "assets/exampleaudio.mp3" with the path of the audio file you want to clone.
I'll be using this audio file: Click here to view
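
If torchaudio refuses to open your mp3/webm file, converting it to WAV first may help. The sketch below assumes ffmpeg is installed and on your PATH. build_ffmpeg_cmd and convert_to_wav are helper names made up for this guide, and "myvoice.webm" is just a placeholder file name.

```python
import shutil
import subprocess


def build_ffmpeg_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that extracts mono audio from src into dst."""
    # -y overwrites dst if it exists, -i names the input file,
    # -vn drops any video stream, -ac 1 downmixes to a single channel
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", dst]


def convert_to_wav(src: str, dst: str) -> None:
    """Run ffmpeg to convert src to dst; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)


# Example call (placeholder file names):
# convert_to_wav("myvoice.webm", "myvoice.wav")
```

After converting, point torchaudio.load at the new .wav file instead of the original.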

3. Run the python file

Code:
python sample.py

NOTE: You might get the error "ModuleNotFoundError: No module named 'zonos'" even after installing it with pip. In this case, do the following:

- create a new folder named 'VoiceClone'
- open a terminal in this folder
- run the following command

Code:
git clone https://github.com/Zyphra/Zonos

- go inside the Zonos folder, so that Python can find the zonos package in the cloned source
- open the sample.py file there (if there isn't one, create it and paste the python code from step 2) and change "assets/exampleaudio.mp3" to the path of your audio file
- run sample.py from inside the Zonos folder
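
To see whether Python can actually find the package before running the full script, a quick check like this can help. It's only a sketch: is_importable is a helper name made up for this guide. One common cause of this kind of error is pip installing into a different Python than the one running your script, which is why the interpreter path is printed too.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Compare this path with the Python that pip installed into
print("Python executable:", sys.executable)
print("zonos importable:", is_importable("zonos"))
```

Run it from the same folder and with the same python command you use for sample.py, since the result depends on both.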

4. The output will be saved as sample.wav