
Text-to-Speech Synthesis

Text-to-Speech Synthesis provides a complete, end-to-end account of the process of generating speech by computer. Giving an in-depth explanation of all aspects of current speech synthesis technology, it assumes no specialized prior knowledge. Introductory chapters on linguistics, phonetics, signal processing and speech signals lay the foundation, with subsequent material explaining how this knowledge is put to use in building practical systems that generate speech. The book covers the very latest techniques, such as unit selection, hidden Markov model synthesis, and statistical text analysis, and also explains more traditional techniques such as formant synthesis and synthesis by rule. Weaving together the various strands of this multidisciplinary field, the book is designed for graduate students in electrical engineering, computer science, and linguistics. It is also an ideal reference for practitioners in the fields of human communication interaction and telephony.

626 pages, Hardcover

First published February 19, 2009

11 people are currently reading
31 people want to read

About the author

Paul Taylor

316 books, 12 followers
Librarian Note: There is more than one author with this name in the Goodreads data base.

Ratings & Reviews


Community Reviews

5 stars: 5 (31%)
4 stars: 9 (56%)
3 stars: 2 (12%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 - 3 of 3 reviews
bartosz
158 reviews, 13 followers
March 6, 2018
There are many Natural Language Processing books that cover text to speech synthesis. Similarly, there are many books that cover particular facets of speech synthesis (prosody in speech synthesis, synthesizers for a specific language). Yet, Paul Taylor's Text to Speech Synthesis seemed to be one of the few books dedicated solely to this topic.

The book is pretty comprehensive, going through text segmentation, part-of-speech analysis, signal processing, acoustic models of speech and finally speech and prosody synthesis. The author focuses on "first generation" speech synthesis (formant synthesis) and on concatenative and HMM synthesis. Articulatory synthesis is only mentioned in passing, and the book was published before the deep learning boom (so neural network synthesis is also not covered).

I was impressed by how focused the book is. The author goes into detail about the many engineering trade-offs one can make when the domain is restricted to speech synthesis. For example, full part-of-speech tagging is often unnecessary: homographs that are also homophones need no disambiguation, and the rest can often be disambiguated case by case with simpler algorithms (e.g., bass/fish versus bass/guitar may be told apart by a simple classifier trained on the neighboring words).
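To make the bass/fish versus bass/guitar idea concrete, here is a minimal sketch of a classifier trained on neighboring words. It is my own illustration, not code or pseudocode from the book: the training sentences are made up, the function names (`context_words`, `train`, `classify`) are hypothetical, and the choice of a naive Bayes model with add-one smoothing is an assumption about what "a simple classifier" could look like.

```python
import math
from collections import Counter, defaultdict

# Toy, invented training data: (sentence, sense of "bass").
TRAINING = [
    ("he caught a huge bass in the lake", "fish"),
    ("the bass swam under the boat", "fish"),
    ("we grilled the bass for dinner", "fish"),
    ("she plays bass in a jazz band", "guitar"),
    ("turn up the bass on the amp", "guitar"),
    ("the bass player tuned his strings", "guitar"),
]

def context_words(sentence, target="bass", window=3):
    """Return the words within `window` positions of the target word."""
    tokens = sentence.lower().split()
    if target not in tokens:
        return []
    i = tokens.index(target)
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

def train(examples):
    """Count context words per sense, plus how often each sense occurs."""
    counts = defaultdict(Counter)
    priors = Counter()
    for sentence, sense in examples:
        priors[sense] += 1
        counts[sense].update(context_words(sentence))
    return counts, priors

def classify(sentence, counts, priors):
    """Pick the sense with the highest naive Bayes log-score."""
    vocab = {w for c in counts.values() for w in c}
    total = sum(priors.values())
    best_sense, best_score = None, -math.inf
    for sense in sorted(priors):
        score = math.log(priors[sense] / total)
        # Add-one smoothing; the extra +1 reserves mass for unseen words.
        denom = sum(counts[sense].values()) + len(vocab) + 1
        for w in context_words(sentence):
            score += math.log((counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

counts, priors = train(TRAINING)
print(classify("he hooked a bass in the lake", counts, priors))
print(classify("she plays bass in the band", counts, priors))
```

In a real synthesizer the labels would presumably be the two pronunciations rather than "fish" and "guitar", and the training data would be far larger, but the structure stays this small, which is the trade-off the paragraph above describes.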

The author claims to present a practical engineering book, but Text to Speech Synthesis is in fact very light on the practicalities. Being spoiled by programming books, I expected source code, or at least large sections of pseudocode. Instead the book is scant on details, and I doubt I could recreate even half of the described methods without further investigation.

The book is broad in scope but not very deep. By nature, the author couldn't indulge in signal processing or go deep into NLP techniques, so some ideas couldn't be explained in detail. The author does a good job of explaining simple concepts, but I felt quite lost at times, for example in the chapters on acoustic models of speech.

One other pretty big annoyance was the persistent errors in the bibliography: wrong entries, missing entries, or telltale signs of misusing LaTeX, such as "[?]" where a bibliographical entry should be.

All in all, I was somewhat disappointed by Text to Speech Synthesis, but it is a pretty approachable introductory book on the topic of speech synthesis.
26 reviews, 10 followers
September 6, 2016
Excellent introduction to Text-to-Speech synthesis, covering many aspects of linguistic science and speech technology. Detailed review of both historical and new methods for TTS, including formant synthesis, HMM and unit selection. Great read for people new to Text-to-Speech. It is written by Paul Taylor, who spent 18 years developing TTS systems, and I could feel that experience in every chapter of his book.