TTSGen - Free Text to Speech Online

Free Text To Speech Converter

TTSGen Tool Options - Customize Your Speech Output

TTSGen lets you control how your text is converted to speech through four settings: language, voice, audio format, and speech speed. The guide below explains each one.


🌐 Supported Languages & Voices

Select from nine languages and dozens of voices. Most languages offer several voices, male and female, with different tones, accents, and personalities; a few, such as French, currently have only one.

  • 🇺🇸 American English: 19 voices – including Heart, Alloy, Jessica, Nova, Echo, Liam, and Santa.
  • 🇬🇧 British English: 8 voices – including Alice, Emma, Daniel, and Fable.
  • 🇯🇵 Japanese: 5 voices – including Gongitsune, Nezumi, and Kumo.
  • 🇨🇳 Mandarin Chinese: 8 voices – including Xiaobei, Xiaoni, Yunxi, and Yunyang.
  • 🇪🇸 Spanish: 3 voices – including Dora and Alex.
  • 🇫🇷 French: 1 voice – Siwis (female).
  • 🇮🇳 Hindi: 4 voices – Alpha, Beta, Omega, and Psi.
  • 🇮🇹 Italian: 2 voices – Sara and Nicola.
  • 🇧🇷 Brazilian Portuguese: 3 voices – Dora, Alex, and Santa.

🗣️ Voice Selection

After selecting a language, you'll be able to choose from the available voices for that language. Voices differ in tone, accent, and gender. For example, you might prefer a softer voice for storytelling or a neutral tone for professional narration.
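
The language-then-voice flow described above can be sketched as a simple lookup table. This is only an illustration: the `VOICES` dictionary and the `voices_for` and `is_complete` helpers are hypothetical names, the voice names and totals are copied from the list above, and nothing here reflects how TTSGen is actually implemented.

```python
# Hypothetical lookup table; names and totals are taken from the list above.
# Only voices named in this guide are included, so some entries are partial.
VOICES = {
    "American English": {"total": 19, "named": ["Heart", "Alloy", "Jessica", "Nova", "Echo", "Liam", "Santa"]},
    "British English": {"total": 8, "named": ["Alice", "Emma", "Daniel", "Fable"]},
    "Japanese": {"total": 5, "named": ["Gongitsune", "Nezumi", "Kumo"]},
    "Mandarin Chinese": {"total": 8, "named": ["Xiaobei", "Xiaoni", "Yunxi", "Yunyang"]},
    "Spanish": {"total": 3, "named": ["Dora", "Alex"]},
    "French": {"total": 1, "named": ["Siwis"]},
    "Hindi": {"total": 4, "named": ["Alpha", "Beta", "Omega", "Psi"]},
    "Italian": {"total": 2, "named": ["Sara", "Nicola"]},
    "Brazilian Portuguese": {"total": 3, "named": ["Dora", "Alex", "Santa"]},
}


def voices_for(language: str) -> list[str]:
    """Return the voices named in this guide for a language."""
    try:
        return list(VOICES[language]["named"])
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None


def is_complete(language: str) -> bool:
    """True when this guide names every voice available for the language."""
    entry = VOICES[language]
    return len(entry["named"]) == entry["total"]


print(voices_for("Italian"))  # ['Sara', 'Nicola']
```

Keeping the total separate from the named voices makes it clear which entries are partial, since the guide lists only some voices for the larger languages.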


🎧 Audio Format

Choose the output format for your audio file. Each format serves different use cases based on quality, size, and compatibility.

  • WAV (Waveform Audio File Format):
    Uncompressed, high-fidelity audio. Best for editing, mastering, or professional applications where quality is critical. Larger file size.
  • MP3 (MPEG Audio Layer III):
    Compressed audio with good quality and small file size. Suitable for web use, sharing, streaming, and mobile apps. Universally supported.
  • AAC (Advanced Audio Coding):
    A more recent codec that generally delivers better sound quality than MP3 at the same bitrate, or similar quality in a smaller file. Used by platforms such as YouTube, iOS, and Android.

Tip: Choose WAV for quality, MP3 for compatibility, and AAC for efficient mobile use.
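
To make the size trade-off concrete, here is a rough back-of-the-envelope estimate. It assumes 24 kHz, 16-bit mono PCM for WAV and a constant 64 kbps bitrate for the compressed formats. These figures are illustrative assumptions, not TTSGen's documented output settings, and the function names are hypothetical.

```python
WAV_HEADER_BYTES = 44  # size of a canonical PCM WAV header


def wav_size_bytes(seconds: float, sample_rate: int = 24_000,
                   bit_depth: int = 16, channels: int = 1) -> int:
    """Estimate an uncompressed PCM WAV file size (assumed parameters)."""
    frames = round(seconds * sample_rate)
    return WAV_HEADER_BYTES + frames * channels * (bit_depth // 8)


def compressed_size_bytes(seconds: float, bitrate_kbps: int = 64) -> int:
    """Estimate a constant-bitrate MP3 or AAC size, ignoring container overhead."""
    return round(seconds * bitrate_kbps * 1000 / 8)


one_minute = 60
print(wav_size_bytes(one_minute))         # 2880044
print(compressed_size_bytes(one_minute))  # 480000
```

Under these assumptions, one minute of speech takes about 2.9 MB as WAV and about 0.5 MB compressed, roughly a sixfold difference.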


⏩ Speech Speed

Adjust the playback speed of the generated voice:

  • 0.5x – 0.9x: Slower speech, suitable for language learners or clarity-focused narration.
  • 1.0x (Default): Natural, balanced pace ideal for most uses.
  • 1.1x – 4.0x: Faster delivery for rapid consumption or shorter audio length.

Use the speed slider to test what pace works best for your audience or content type.
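
The speed ranges above could be handled in code along these lines. The 0.5x and 4.0x limits and the 1.0x default come from the list above; the `clamp_speed`, `describe_speed`, and `estimated_duration` helpers are hypothetical, and the duration estimate assumes audio length scales inversely with speed, which is only an approximation.

```python
MIN_SPEED, MAX_SPEED, DEFAULT_SPEED = 0.5, 4.0, 1.0


def clamp_speed(value: float) -> float:
    """Keep a requested speed inside the supported 0.5x to 4.0x range."""
    return max(MIN_SPEED, min(MAX_SPEED, value))


def describe_speed(value: float) -> str:
    """Classify a speed as slower than, equal to, or faster than the default."""
    speed = clamp_speed(value)
    if speed < DEFAULT_SPEED:
        return "slower"
    if speed > DEFAULT_SPEED:
        return "faster"
    return "default"


def estimated_duration(base_seconds: float, speed: float) -> float:
    """Approximate audio length, assuming duration is inversely proportional to speed."""
    return base_seconds / clamp_speed(speed)


print(describe_speed(0.8))            # slower
print(estimated_duration(60.0, 2.0))  # 30.0
```

By this estimate, a clip that runs one minute at 1.0x would take about 30 seconds at 2.0x and about two minutes at 0.5x.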


🎙️ With TTSGen, you're in control. Whether you're generating audio for videos, e-learning, podcasts, or accessibility, these settings help you create the perfect voice experience.

🔊 Credits

This tool is powered by Kokoro-82M, a high-quality, open-source, multilingual text-to-speech model.

Kokoro-82M delivers natural, expressive speech synthesis and supports a wide range of languages and voice styles. It is designed to be efficient, flexible, and adaptable for diverse applications, from accessibility to creative projects.

Special thanks to the developers and contributors of the Kokoro-82M project on Hugging Face for making high-quality TTS freely available to everyone.

TTSGen is built around Kokoro with the goal of making speech synthesis accessible, fast, and intuitive for all users.