Volume 3, Issue 1, January 2023

FORUM PAPER


Exploring TorToise TTS Tool – A Human Voice Resemblance Through Speech Synthesis

T. Shivani, P. Pranav, I. Saikalyan, A. Veswanth

T. Shivani, Department of Computer Science and Engineering, National Institute of Technology Warangal, Telangana, India, E-mail: stcs21109@student.nitw.ac.in
P. Pranav, Department of Computer Science and Engineering, National Institute of Technology Warangal, Telangana, India, E-mail: pp22csm1s02@student.nitw.ac.in
I. Saikalyan, Department of Computer Science and Engineering, National Institute of Technology Warangal, Telangana, India, E-mail: induri_961932@student.nitw.ac.in
A. Veswanth, Department of Computer Science and Engineering, GMR Institute of Technology, Andhra Pradesh, India, E-mail: 19341a05i5@gmrit.edu.in

Received in final form on January 01, 2023

Abstract
digitalization. In this position paper, we analyze the TorToise TTS tool using several performance metrics and test it exhaustively on audio datasets of different sizes. TorToise TTS converts given text into speech that resembles the input speech corpus on which it is fine-tuned. The tool combines the advantages of autoregressive decoders and Denoising Diffusion Probabilistic Models (DDPMs). Our study shows that the tool performs well in real-world scenarios. The conditional input and reranking features of the tool further enhance the quality of the speech generated from the text, making it more similar to the input training dataset. We compare the results with SV2TTS and resemble.ai. We observe that male voices show higher similarity than female voices. TorToise provides well-pretrained models covering a large speech corpus, leading to high resemblance in downstream fine-tuning tasks.


Keywords
Text-to-Speech, Autoregressive Decoders, Denoising Diffusion Models.


Cite This Article
T. Shivani, P. Pranav, I. Saikalyan and A. Veswanth, Exploring TorToise TTS Tool – A Human Voice Resemblance Through Speech Synthesis, J. Innovation Sciences and Sustainable Technologies, 3(1)(2023), 41-50. https://doie.org/10.0421/JISST.2023733023

