About the authors
Björn Schuller, Technische Universität München, Germany
Anton Batliner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany

Blurb
This book presents the methods, tools and techniques that are currently being used to recognise (automatically) the affect, emotion, personality and everything else beyond linguistics ('paralinguistics') expressed by or embedded in human speech and language.

It is the first book to provide such a systematic survey of paralinguistics in speech and language processing. The technology described has evolved mainly from automatic speech and speaker recognition and processing, but also takes into account recent developments within speech signal processing, machine intelligence and data mining.

Moreover, the book offers a hands-on approach by integrating actual data sets, software, and open-source utilities, which will make it invaluable as a teaching tool and similarly useful for professionals already in the field.

Key features:
* Provides an integrated presentation of basic research (in phonetics/linguistics and humanities) with state-of-the-art engineering approaches for speech signal processing and machine intelligence.
* Explains the history and state of the art of all of the sub-fields which contribute to the topic of computational paralinguistics.
* Covers the signal processing and machine learning aspects of the actual computational modelling of emotion and personality and explains the detection process from corpus collection to feature extraction and from model testing to system integration.
* Details aspects of real-world system integration including distribution, weakly supervised learning and confidence measures.
* Outlines machine learning approaches including static, dynamic and context-sensitive algorithms for classification and regression.
* Includes a tutorial on freely available toolkits, such as the open-source 'openEAR' toolkit for emotion and affect recognition co-developed by one of the authors, and a listing of standard databases and feature sets used in the field to allow for immediate experimentation, enabling the reader to build an emotion detection model on an existing corpus.
Table of contents
Preface
Acknowledgements
List of Abbreviations
Part I Foundations
1 Introduction
1.1 What is Computational Paralinguistics? A First Approximation
1.2 History and Subject Area
1.3 Form versus Function
1.4 Further Aspects
1.4.1 The Synthesis of Emotion and Personality
1.4.2 Multimodality: Analysis and Generation
1.4.3 Applications, Usability and Ethics
1.5 Summary and Structure of the Book
References
2 Taxonomies
2.1 Traits versus States
2.2 Acted versus Spontaneous
2.3 Complex versus Simple
2.4 Measured versus Assessed
2.5 Categorical versus Continuous
2.6 Felt versus Perceived
2.7 Intentional versus Instinctual
2.8 Consistent versus Discrepant
2.9 Private versus Social
2.10 Prototypical versus Peripheral
2.11 Universal versus Culture-Specific
2.12 Unimodal versus Multimodal
2.13 All These Taxonomies - So What?
2.13.1 Emotion Data: The FAU AEC
2.13.2 Non-native Data: The C-AuDiT corpus
References
3 Aspects of Modelling
3.1 Theories and Models of Personality
3.2 Theories and Models of Emotion and Affect
3.3 Type and Segmentation of Units
3.4 Typical versus Atypical Speech
3.5 Context
3.6 Lab versus Life, or Through the Looking Glass
3.7 Sheep and Goats, or Single Instance Decision versus Cumulative Evidence and Overall Performance
3.8 The Few and the Many, or How to Analyse a Hamburger
3.9 Reifications, and What You are Looking for is What You Get
3.10 Magical Numbers versus Sound Reasoning
References
4 Formal Aspects
4.1 The Linguistic Code and Beyond
4.2 The Non-Distinctive Use of Phonetic Elements
4.2.1 Segmental Level: The Case of /r/ Variants
4.2.2 Supra-segmental Level: The Case of Pitch and Fundamental Frequency - and of Other Prosodic Parameters
4.2.3 In Between: The Case of Other Voice Qualities, Especially Laryngealisation
4.3 The Non-Distinctive Use of Linguistic Elements
4.3.1 Words and Word Classes
4.3.2 Phrase Level: The Case of Filler Phrases and Hedges
4.4 Disfluencies
4.5 Non-Verbal, Vocal Events
4.6 Common Traits of Formal Aspects
References
5 Functional Aspects
5.1 Biological Trait Primitives
5.1.1 Speaker Characteristics
5.2 Cultural Trait Primitives
5.2.1 Speech Characteristics
5.3 Personality
5.4 Emotion and Affect
5.5 Subjectivity and Sentiment Analysis
5.6 Deviant Speech
5.6.1 Pathological Speech
5.6.2 Temporarily Deviant Speech
5.6.3 Non-native Speech
5.7 Social Signals
5.8 Discrepant Communication
5.8.1 Indirect Speech, Irony, and Sarcasm
5.8.2 Deceptive Speech
5.8.3 Off-Talk
5.9 Common Traits of Functional Aspects
References
6 Corpus Engineering
6.1 Annotation
6.1.1 Assessment of Annotations
6.1.2 New Trends
6.2 Corpora and Benchmarks: Some Examples
6.2.1 FAU Aibo Emotion Corpus
6.2.2 aGender Corpus