Brain-to-Text: Decoding Spoken Phrases from Phone Representations in the Brain
Abstract
It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
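The abstract reports results as word and phone error rates. As a rough illustration of how such a figure is commonly computed in ASR, the sketch below divides the Levenshtein (edit) distance between a reference and a decoded token sequence by the reference length. This is the conventional definition and is assumed here; the abstract does not state the exact scoring procedure used in the report. The function names and the example sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, insertions, and deletions
    needed to turn the token sequence `ref` into `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into nothing costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by reference length. Works for any token
    type, so it covers both word and phone sequences."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def word_error_rate(ref_text, hyp_text):
    """Word error rate for whitespace-separated text (assumed tokenization)."""
    return error_rate(ref_text.split(), hyp_text.split())


# Hypothetical example: one of four words is decoded incorrectly.
wer = word_error_rate("the quick brown fox", "the quick brown dog")
```

Under this definition, one wrong word out of four gives a word error rate of 25%. The same `error_rate` function applies to phone sequences when each phone label is passed as one token.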
Document Details
- Document Type: Technical Report
- Publication Date: May 18, 2015
- Accession Number: AD1069856
Entities
People
- Adriana De Pesters
- Christian Herff
- Dominic Heger
- Dominic Telaar
- Gerwin Schalk
- Peter Brunner
- Tanja Schultz
Organizations
- Health Research, Incorporated