This is a special corpus of Indian languages covering 13 major languages of India. It comprises more than 10,000 spoken sentences/utterances each of native-language (Mono) and English text, recorded by both male and female native speakers. Speech waveform files are available in .wav format along with the corresponding text. We hope that these recordings will be useful for researchers and speech technologists working on synthesis and recognition. Zip archives of the entire database are available on request. The statistics of datasets available are given here.

For security reasons, we no longer provide direct download links through this website. To access the data, please write to us at smtiitm@gmail.com with the subject line: Indic Database Download Request. In the body of the email, please include:
(1) Language
(2) Gender
(3) Type (Mono/English)
Example 1: Language: 'Hindi' and Type: 'English' means that the archive contains English sentences spoken by a person whose native language is Hindi.
Example 2: Language: 'Hindi' and Type: 'Mono' means that the archive contains Hindi sentences spoken by a person whose native language is Hindi.
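
As an illustration of the request format above, the following is a minimal Python sketch that assembles such an email without sending it. The function name build_request and the exact layout of the body are assumptions made for this example; only the address, the subject line, and the three fields come from the instructions above.

```python
from email.message import EmailMessage

# Values taken from the instructions above.
REQUEST_ADDRESS = "smtiitm@gmail.com"
REQUEST_SUBJECT = "Indic Database Download Request"


def build_request(language: str, gender: str, data_type: str) -> EmailMessage:
    """Assemble (but do not send) a download request email.

    build_request is a hypothetical helper; the body layout is an assumption.
    """
    if data_type not in ("Mono", "English"):
        raise ValueError("Type must be 'Mono' or 'English'")
    msg = EmailMessage()
    msg["To"] = REQUEST_ADDRESS
    msg["Subject"] = REQUEST_SUBJECT
    msg.set_content(
        f"(1) Language: {language}\n"
        f"(2) Gender: {gender}\n"
        f"(3) Type: {data_type}\n"
    )
    return msg


# Hindi sentences spoken by a female native Hindi speaker (Example 2 above).
request = build_request("Hindi", "Female", "Mono")
```

The resulting message can be pasted into any mail client; sending it programmatically is left out on purpose.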

Please note that requests are processed only twice a week, on Tuesday and Friday at 15:00 IST. Requests sent after 15:00 IST on these days will be processed in the next batch. By requesting the data, you confirm that you have read and agree to be bound by the License For Use of Indic TTS.
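
To estimate when a request will be picked up, the schedule above can be sketched in Python as follows. The function name next_batch is hypothetical, and treating a request sent at exactly 15:00 IST as part of that day's batch is an assumption; the page only states that requests sent after 15:00 wait for the next batch.

```python
from datetime import datetime, time, timedelta, timezone

# IST is UTC+05:30 and has no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30), "IST")
BATCH_WEEKDAYS = {1, 4}   # date.weekday(): Monday is 0, so Tuesday=1, Friday=4
CUTOFF = time(15, 0)      # 15:00 IST


def next_batch(sent_at: datetime) -> datetime:
    """Return the first Tuesday/Friday 15:00 IST cutoff at or after sent_at.

    sent_at must be timezone-aware; it is converted to IST first.
    """
    local = sent_at.astimezone(IST)
    for offset in range(8):  # a batch day always occurs within a week
        day = (local + timedelta(days=offset)).date()
        cutoff = datetime.combine(day, CUTOFF, tzinfo=IST)
        if day.weekday() in BATCH_WEEKDAYS and cutoff >= local:
            return cutoff
    raise AssertionError("unreachable: Tuesday and Friday occur every week")


# A request sent on Tuesday 2 January 2024 at 16:00 IST misses that day's
# cutoff and falls into the Friday batch.
late_tuesday = datetime(2024, 1, 2, 16, 0, tzinfo=IST)
print(next_batch(late_tuesday).isoformat())
```

This gives only the batch cutoff time; how long processing takes after the cutoff is not stated on this page.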
