School of Art, Media, and Technology

CSAW Voice Biometrics & Speech Synthesis Competition

The deadline for the voice biometrics and speech synthesis competition is approaching. Get your entries in!

About The Contest:
The human voice is ubiquitous; it is perhaps the most unobtrusive of biometric modalities. It is also readily accessible: an attacker can obtain voice samples of a targeted person without much difficulty. This challenge calls on contestants to assume the role of that attacker. Given a comprehensive sample of a person’s voice, the task is to extract phones from the sample and use them to synthesize a completely new sentence (one not included in the provided sample). The synthesized statement is then compared with an authentic recording. The comparison is twofold: the synthesized and authentic statements are first compared using a state-of-the-art voice biometrics algorithm, and the entries are then ranked for innovation by a panel of judges (see the Judging Criteria section below for details).

The sections below cover each phase of the contest: the registration and submission process, the guidelines for submission, and the judging criteria.

Cash prizes for winners:
1st place: $1,000
2nd place: $750
3rd place: $500

All prize recipients will be awarded airfare and hotel accommodation to attend the 2012 CSAW competition in New York City.

Judging Criteria:
The primary criterion for judging is your ability to breach the provided Speaker Verification system using a completely synthesized voice. Access to this system, along with a detailed description of the voice biometrics algorithm it uses, is provided in the resources section.

Similarity Criteria:
In addition to making an ACCEPT/REJECT decision for each input, the Speaker Verification system also generates a distance measure. This measure assesses the similarity between the input phrase, generated by the contest participant, and the enrollment phrase, generated by a human speaker. The similarity criterion will be used as a tie-breaker if more than three participants breach the Speaker Verification system. If no contestant breaches the system, the three contestants with the best similarity scores, as generated by the system, will be chosen. The similarity criterion thus determines the three prize-winning contestants, who are then ranked using the criteria described below.
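As an illustration only, the selection rule above can be sketched in Python. This is not the organizers' scoring code. The `Entry` record and the `pick_finalists` function are hypothetical names, and the sketch assumes that a smaller distance means a closer match to the enrollment phrase. The rules do not say what happens when only one or two contestants breach the system; filling the remaining places by distance is a further assumption.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    contestant: str
    accepted: bool   # True if the verifier returned ACCEPT for this entry
    distance: float  # assumption: smaller means closer to the enrollment phrase


def pick_finalists(entries, count=3):
    """Return up to `count` entries, breaching entries first, then by distance."""
    # `not e.accepted` is False (sorts first) for entries that breached the system.
    # Within each group, the entry with the smaller distance ranks higher.
    ranked = sorted(entries, key=lambda e: (not e.accepted, e.distance))
    return ranked[:count]
```

When more than three entries are accepted, the distance acts as the tie-breaker among them; when none are accepted, the three entries with the smallest distances are returned.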

Technological Innovation and Synthesized Voice Quality:
Along with the test phrases, participants will describe the approach they used in generating the imposter phrase. The three prize-winning contestants are ranked on how innovative their approach to this hacking task was and on the quality of their synthesized voice (relative to the human speaker), as determined by human judges.

To enter this contest, you must first register at the above link. Registration will be open until the 29th of February 2012 (11:59pm EST).

Once registered, you may download the voice recording, available in the resources section. The recording is of a paragraph of English read aloud by a native English speaker.

You must also obtain the paragraph of text from the resources section. The essential task of this contest is to synthesize the paragraph of text into speech using the provided recording; that is, you must attempt to replicate the voice of the subject in the provided recording.

Please retain your account information, as you will need it to submit your entry to this contest. You are also welcome to use the forums by clicking on the link above; the account created during registration can also be used to access the forums.

Each contestant has one submission, made at the submission link in the resources section. Submission requires the username and password obtained during the registration process. The deadline for submission is the 29th of February 2012 (11:59pm EST). You may resubmit as many times as you like; however, only the latest submission will be considered.

Only currently registered students (high school, undergraduate, or graduate) are eligible.

Your synthesized voice must be in 8,000 Hz, 16-bit PCM, mono WAV format.
Your WAV file must be named “-.wav”
You must also include a Microsoft Word 2010 compatible file detailing your methodology.
Both files must be archived in a zip-compatible format; the archive must be submitted at the submission link.
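Before archiving, it may help to confirm that the WAV header matches the required format. The following is a minimal sketch using Python's standard `wave` module; `check_wav_format` is a hypothetical helper, not a contest tool, and it inspects only the header fields, not the audio content.

```python
import wave


def check_wav_format(path):
    """Return a list of format problems; an empty list means the header matches."""
    problems = []
    try:
        with wave.open(path, "rb") as wav:
            if wav.getframerate() != 8000:
                problems.append(f"sample rate is {wav.getframerate()} Hz, expected 8000")
            if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bit
                problems.append(f"sample width is {8 * wav.getsampwidth()} bit, expected 16")
            if wav.getnchannels() != 1:
                problems.append(f"{wav.getnchannels()} channels found, expected mono")
    except wave.Error as exc:
        # The wave module rejects files it cannot parse as PCM WAV.
        problems.append(f"not a readable PCM WAV file: {exc}")
    return problems
```

Calling `check_wav_format` on your own file and getting an empty list suggests the sample rate, bit depth, and channel count are as required.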

Your submission must be no more than 20 megabytes in size. Only your latest submission at the time of the contest deadline, the 29th of February 2012 (11:59pm EST), will be taken into consideration.
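One way to package the two files and check the size limit is sketched below with Python's standard `zipfile` module. The function name `build_submission` and the file paths passed to it are hypothetical. The sketch also assumes the limit means 20,000,000 bytes, the stricter reading; the rules do not say whether decimal or binary megabytes are intended.

```python
import os
import zipfile

# Assumption: "20 megabytes" is read as decimal megabytes, the stricter interpretation.
MAX_BYTES = 20 * 1000 * 1000


def build_submission(wav_path, doc_path, zip_path):
    """Zip the WAV file and the methodology document; return the archive size in bytes."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in (wav_path, doc_path):
            # Store each file under its bare name, without directory components.
            archive.write(path, arcname=os.path.basename(path))
    size = os.path.getsize(zip_path)
    if size > MAX_BYTES:
        raise ValueError(f"archive is {size} bytes, over the {MAX_BYTES}-byte limit")
    return size
```

The function raises `ValueError` when the archive exceeds the assumed limit, so an oversized entry is caught before upload.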

Only submissions that strictly comply with the above instructions will be considered. Please do not make submissions via email; emailed submissions will not be considered.

All Rights Reserved © 2024. Parsons School of Design.