Human-Centric Interfaces for Ambient Intelligence (eBook)
542 pages
Elsevier Science (publisher)
978-0-08-087850-8 (ISBN)
To create truly effective human-centric ambient intelligence systems, both engineering and computing methods are needed. This is the first book to bridge data processing and intelligent reasoning methods for the creation of human-centered ambient intelligence systems. Interdisciplinary in nature, the book covers topics such as multi-modal interfaces, human-computer interaction, smart environments and pervasive computing, addressing principles, paradigms, methods and applications. This book will be an ideal reference for university researchers, R&D engineers, computer engineers, and graduate students working in signal, speech and video processing, multi-modal interfaces, human-computer interaction and applications of ambient intelligence.
Hamid Aghajan is a Professor of Electrical Engineering (consulting) at Stanford University, USA. His research is on user-centric vision applications in smart homes, assisted living/well-being, smart meetings, and avatar-based social interactions. He is Editor-in-Chief of the Journal of Ambient Intelligence and Smart Environments, has chaired ACM/IEEE ICDSC 2008, and has organized workshops, sessions and tutorials at ECCV, ACM MM, FG, ECAI, ICASSP and CVPR.
Juan Carlos Augusto is a Lecturer at the University of Ulster, UK. He is conducting research on Smart Homes and Classrooms. He has given tutorials at IJCAI'07 and AAAI'08. He is Editor-in-Chief of the Book Series on Ambient Intelligence and Smart Environments and the Journal of Ambient Intelligence and Smart Environments. He has co-Chaired ICOST'06, AITAmI'06/07/08, and is Workshops Chair for IE'09.
Ramón López-Cózar Delgado is a Professor at the Faculty of Computer Science and Telecommunications of the University of Granada, Spain. His research interests include speech recognition and understanding, dialogue management and Ambient Intelligence. He is a member of ISCA (International Speech Communication Association), SEPLN (Spanish Society on Natural Language Processing) and AIPO (Spanish Society on HCI).
- Integrates engineering and computing methods that are essential for designing and implementing highly effective ambient intelligence systems
- Contains contributions from the world's leading experts in academia and industry
- Gives a complete overview of the principles, paradigms and applications of human-centric ambient intelligence systems
Front Cover 1
Human-Centric Interfaces for Ambient Intelligence 4
Copyright Page 5
Contents 6
Foreword 18
Preface 20
Ambient Intelligence 20
Human-Centric Design 21
Vision and Visual Interfaces 23
Speech Processing and Dialogue Management 25
Multimodal Interfaces 26
Smart Environment Applications 27
Conclusions 28
Acknowledgments 28
Part 1: Vision and Visual Interfaces 30
Chapter 1: Face-to-Face Collaborative Interfaces 32
1.1 Introduction 33
1.2 Background 36
1.3 Surface User Interface 39
1.4 Multitouch 41
1.4.1 Camera-Based Systems 42
1.4.2 Capacitance-Based Systems 47
1.5 Gestural Interaction 49
1.6 Gestural Infrastructures 53
1.6.1 Gestural Software Support 54
1.7 Touch versus Mouse 55
1.8 Design Guidelines for SUIs for Collaboration 56
1.8.1 Designing the Collaborative Environment 57
1.9 Conclusions 58
References 58
Chapter 2: Computer Vision Interfaces for Interactive Art 62
2.1 Introduction 63
2.1.1 A Brief History of (Vision in) Art 63
2.2 A Taxonomy of Vision-Based Art 64
2.3 Paradigms for Vision-Based Interactive Art 66
2.3.1 Mirror Interfaces 67
2.3.2 Performance 71
2.4 Software Tools 73
2.4.1 Max/MSP, Jitter, and Puredata 73
2.4.2 EyesWeb 74
2.4.3 Processing 74
2.4.4 OpenCV 74
2.5 Frontiers of Computer Vision 74
2.6 Sources of Information 75
2.7 Summary 76
Acknowledgments 76
References 77
Chapter 3: Ubiquitous Gaze: Using Gaze at the Interface 78
3.1 Introduction 79
3.2 The Role of Gaze in Interaction 79
3.3 Gaze as an Input Device 82
3.3.1 Eyes on the Desktop 84
3.3.2 Conversation-Style Interaction 86
3.3.3 Beyond the Desktop 87
Ambient Displays 88
Human–Human Interaction in Ambient Environments 89
Activity detection 89
Interest level 90
Hot spot detection 90
Participation status 90
Dialogue acts 90
Interaction structure 90
Dominance and influence 90
3.4 Mediated Communication 93
3.5 Conclusion 94
References 95
Chapter 4: Exploiting Natural Language Generation in Scene Interpretation 100
4.1 Introduction 101
4.2 Related Work 101
4.3 Ontology-Based User Interfaces 103
4.4 Vision and Conceptual Levels 104
4.5 The NLG Module 107
4.5.1 Representation of the Discourse 109
4.5.2 Lexicalization 111
4.5.3 Surface Realization 111
4.6 Experimental Results 112
4.7 Evaluation 114
4.7.1 Qualitative Results 116
4.7.2 Quantitative Results 116
4.8 Conclusions 118
Acknowledgments 119
Appendix Listing of Detected Facts Sorted by Frequency of Use 119
References 121
Chapter 5: The Language of Action: A New Tool for Human-Centric Interfaces 124
5.1 Introduction 125
5.2 Human Action 126
5.3 Learning the Languages of Human Action 128
5.3.1 Related Work 129
5.4 Grammars of Visual Human Movement 132
5.5 Grammars of Motoric Human Movement 137
5.5.1 Human Activity Language: A Symbolic Approach 141
5.5.2 A Spectral Approach: Synergies 150
5.6 Applications to Health 155
5.7 Applications to Artificial Intelligence and Cognitive Systems 156
5.8 Conclusions 157
Acknowledgments 158
References 158
Part 2: Speech Processing and Dialogue Management 162
Chapter 6: Robust Speech Recognition Under Noisy Ambient Conditions 164
6.1 Introduction 165
6.2 Speech Recognition Overview 167
6.3 Variability in the Speech Signal 170
6.4 Robust Speech Recognition Techniques 171
6.4.1 Speech Enhancement Techniques 172
6.4.2 Robust Feature Selection and Extraction Methods 174
6.4.3 Feature Normalization Techniques 176
6.4.4 Stereo Data-Based Feature Enhancement 176
6.4.5 The Stochastic Matching Framework 177
Model-Based Model Adaptation 178
Model-Based Feature Enhancement 180
Adaptation-Based Compensation 180
Uncertainty in Feature Enhancement 182
6.4.6 Special Transducer Arrangement to Solve the Cocktail Party Problem 184
6.5 Summary 184
References 185
Chapter 7: Speaker Recognition in Smart Environments 192
7.1 Principles and Applications of Speaker Recognition 193
7.1.1 Features Used for Speaker Recognition 194
7.1.2 Speaker Identification and Verification 195
7.1.3 Text-Dependent, Text-Independent, and Text-Prompted Methods 196
7.2 Text-Dependent Speaker Recognition Methods 197
7.2.1 DTW-Based Methods 197
7.2.2 HMM-Based Methods 197
7.3 Text-Independent Speaker Recognition Methods 198
7.3.1 Methods Based on Long-Term Statistics 198
7.3.2 VQ-Based Methods 198
7.3.3 Methods Based on Ergodic HMM 198
7.3.4 Methods Based on Speech Recognition 199
7.4 Text-Prompted Speaker Recognition 200
7.5 High-Level Speaker Recognition 201
7.6 Normalization and Adaptation Techniques 201
7.6.1 Parameter Domain Normalization 202
7.6.2 Likelihood Normalization 202
7.6.3 HMM Adaptation for Noisy Conditions 203
7.6.4 Updating Models and A Priori Thresholds for Speaker Verification 204
7.7 ROC and DET Curves 204
7.7.1 ROC Curves 204
7.7.2 DET Curves 205
7.8 Speaker Diarization 206
7.9 Multimodal Speaker Recognition 208
7.9.1 Combining Spectral Envelope and Fundamental Frequency Features 208
7.9.2 Combining Audio and Visual Features 209
7.10 Outstanding Issues 209
References 210
Chapter 8: Machine Learning Approaches to Spoken Language Understanding for Ambient Intelligence 214
8.1 Introduction 215
8.2 Statistical Spoken Language Understanding 217
8.2.1 Spoken Language Understanding for Slot-Filling Dialogue System 217
8.2.2 Sequential Supervised Learning 219
8.3 Conditional Random Fields 221
8.3.1 Linear-Chain CRFs 221
8.3.2 Parameter Estimation 222
8.3.3 Inference 223
8.4 Efficient Algorithms for Inference and Learning 225
8.4.1 Fast Inference for Saving Computation Time 225
8.4.2 Feature Selection for Saving Computation Memory 228
8.5 Transfer Learning for Spoken Language Understanding 230
8.5.1 Transfer Learning 230
8.5.2 Triangular-Chain Conditional Random Fields 231
Model 1 232
Model 2 233
8.5.3 Parameter Estimation and Inference 234
8.6 Joint Prediction of Dialogue Acts and Named Entities 235
8.6.1 Data Sets and Experiment Setup 235
8.6.2 Comparison Results for Text and Spoken Inputs 236
8.6.3 Comparison of Space and Time Complexity 239
8.7 Multi-Domain Spoken Language Understanding 240
8.7.1 Domain Adaptation 241
8.7.2 Data and Setup 242
8.7.3 Comparison Results 245
8.8 Conclusion and Future Direction 250
Acknowledgments 251
References 251
Chapter 9: The Role of Spoken Dialogue in User-Environment Interaction 254
9.1 Introduction 255
9.2 Types of Interactive Speech Systems 257
9.3 The Components of an Interactive Speech System 261
9.3.1 Input Interpretation 261
9.3.2 Output Generation 264
9.3.3 Dialogue Management 264
9.4 Examples of Spoken Dialogue Systems for Ambient Intelligence Environments 269
9.4.1 Chat 269
9.4.2 SmartKom and SmartWeb 270
9.4.3 Talk 274
9.4.4 Companions 275
9.5 Challenges for Spoken Dialogue Technology in Ambient Intelligence Environments 277
9.5.1 Infrastructural Challenges 277
9.5.2 Challenges for Spoken Dialogue Technology 278
9.6 Conclusions 279
References 279
Chapter 10: Speech Synthesis Systems in Ambient Intelligence Environments 284
10.1 Introduction 285
10.2 Speech Synthesis Interfaces for Ambient Intelligence 287
10.3 Speech Synthesis 290
10.3.1 Text Processing 290
10.3.2 Speech Signal Synthesis 292
Articulatory Synthesis 293
Formant Synthesis 293
Concatenative Synthesis 296
10.3.3 Prosody Generation 298
10.3.4 Evaluation of Synthetic Speech 299
10.4 Emotional Speech Synthesis 299
10.5 Discussion 301
10.5.1 Ambient Intelligence and Users 302
10.5.2 Future Directions and Challenges 302
10.6 Conclusions 303
Acknowledgments 303
References 304
Part 3: Multimodal Interfaces 308
Chapter 11: Tangible Interfaces for Ambient Augmented Reality Applications 310
11.1 Introduction 311
11.1.1 Rationale for Ambient AR Interfaces 311
11.1.2 Augmented Reality 313
11.2 Related Work 314
11.2.1 From Tangibility... 314
11.2.2 ...To the AR Tangible User Interface 315
11.3 Design Approach for Tangible AR Interfaces 316
11.3.1 The Tangible AR Interface Concept 316
11.4 Design Guidelines 317
11.5 Case Studies 318
11.5.1 AR Lens 318
11.5.2 AR Tennis 320
11.5.3 MagicBook 322
11.6 Tools for Ambient AR Interfaces 324
11.6.1 Software Authoring Tools 324
11.6.2 Hardware Authoring Tools 325
11.7 Conclusions 327
References 328
Chapter 12: Physical Browsing and Selection: Easy Interaction with Ambient Services 332
12.1 Introduction to Physical Browsing 333
12.2 Why Ambient Services Need Physical Browsing Solutions 334
12.3 Physical Selection 335
12.3.1 Concepts and Vocabulary 335
12.3.2 Touching 335
12.3.3 Pointing 336
12.3.4 Scanning 337
12.3.5 Visualizing Physical Hyperlinks 338
12.4 Selection as an Interaction Task 338
12.4.1 Selection in Desktop Computer Systems 339
12.4.2 About the Choice of Selection Technique 339
12.4.3 Selection in Immersive Virtual Environments 340
12.4.4 Selection with Laser Pointers 341
12.4.5 The Mobile Terminal as an Input Device 343
12.5 Implementing Physical Selection 344
12.5.1 Implementing Pointing 344
12.5.2 Implementing Touching 346
RFID as an Implementation Technology 346
User Interaction Considerations 347
12.5.3 Other Technologies for Connecting Physical and Digital Entities 348
Visual Technologies for Mobile Terminals 348
Body Communication 349
12.6 Indicating and Negotiating Actions After the Selection Event 350
12.6.1 Activation by Selection 350
12.6.2 Action Selection by a Different Modality 351
12.6.3 Actions by Combining Selection Events 351
12.6.4 Physical Selection in Establishing Communication 352
12.7 Conclusions 352
References 353
Chapter 13: Nonsymbolic Gestural Interaction for Ambient Intelligence 356
13.1 Introduction 357
13.2 Classifying Gestural Behavior for Human-Centric Ambient Intelligence 357
13.3 Emotions 361
13.4 Personality 365
13.5 Culture 366
13.6 Recognizing Gestural Behavior for Human-Centric Ambient Intelligence 369
13.6.1 Acceleration-Based Gesture Recognition 369
13.6.2 Gesture Recognition Based on Physiological Input 371
13.7 Conclusions 372
References 372
Chapter 14: Evaluation of Multimodal Interfaces for Ambient Intelligence 376
14.1 Introduction 377
14.2 Performance and Quality Taxonomy 379
14.3 Quality Factors 380
14.4 Interaction Performance Aspects 381
14.5 Quality Aspects 382
14.6 Application Examples 384
14.6.1 INSPIRE and MediaScout 384
14.6.2 Evaluation Constructs 385
14.6.3 Evaluation of Output Metaphors 387
Rationale 387
Experimental Design 387
Insights 389
14.6.4 Evaluation of the Quality of an Embodied Conversational Agent 390
Rationale 390
Experimental Design 391
Insights 392
14.6.5 Comparison of Questionnaires 392
Rationale 392
Experimental Design 393
Insights 394
Comparison of questionnaire results 394
Comparison of quality and performance metrics 395
14.7 Conclusions and Future Work 396
Acknowledgment 397
References 397
Part 4: Smart Environment Applications 400
Chapter 15: New Frontiers in Machine Learning for Predictive User Modeling 402
15.1 Introduction 403
15.1.1 Multimodal Affect Recognition 404
15.1.2 Modeling Interruptability 406
15.1.3 Classifying Voice Mails 406
15.1.4 Brain–Computer Interfaces for Visual Recognition 406
15.2 A Quick Primer: Gaussian Process Classification 407
15.3 Sensor Fusion 408
15.3.1 Multimodal Sensor Fusion for Affect Recognition 409
15.3.2 Combining Brain–Computer Interface with Computer Vision 410
15.4 Semisupervised Learning 412
15.4.1 Semisupervised Affect Recognition 413
15.5 Active Learning 415
15.5.1 Modeling Interruptability 415
15.5.2 Classifying Voice Mails 416
15.6 Conclusions 419
Acknowledgments 419
References 419
Chapter 16: Games and Entertainment in Ambient Intelligence Environments 422
16.1 Introduction 423
16.2 Ambient Entertainment Applications 423
16.2.1 Ubiquitous Devices 424
16.2.2 Exergames 424
16.2.3 Urban Gaming 425
16.2.4 Dancing in the Streets 426
16.3 Dimensions in Ambient Entertainment 426
16.3.1 Sensors and Control 426
16.3.2 Location 429
16.3.3 Social Aspects of Gaming 430
16.4 Designing for Ambient Entertainment and Experience 432
16.4.1 Emergent Games 432
16.4.2 Rhythm and Temporal Interaction 433
16.4.3 Performance in Play 435
16.4.4 Immersion and Flow 437
16.5 Conclusions 438
Acknowledgments 439
References 439
Chapter 17: Natural and Implicit Information-Seeking Cues in Responsive Technology 444
17.1 Introduction 445
17.2 Information Seeking and Indicative Cues 446
17.2.1 Analysis of the Hypothetical Shopping Scenario 446
17.2.2 A Framework for Information Seeking 447
17.2.3 Indicative Cues by Phase 449
17.3 Designing Systems for Natural and Implicit Interaction 451
17.3.1 Natural Interaction 451
17.3.2 Implicit Interaction 453
17.4 Clothes Shopping Support Technologies 454
17.4.1 Fitting Room Technologies 454
17.4.2 Virtual Fittings 455
17.4.3 Reactive Displays 455
17.5 Case Study: Responsive Mirror 455
17.5.1 Concept 455
17.5.2 Privacy Concerns 457
Disclosure 458
Identity 458
Temporal 458
17.5.3 Social Factors: Reflecting Images of Self and Others 458
17.5.4 Responsive Mirror Prototype 460
17.5.5 Vision System Description 461
Shopper Detection 461
Orientation Estimation 464
Clothes Recognition 465
Subjectivity of Clothing Similarity 466
Clothing Similarity Algorithm 468
Feature Extraction—Shirt Parts Segmentation 469
Feature Extraction—Sleeve Length Detection 469
Feature Extraction—Collar Detection 469
Feature Extraction—Button Detection 471
Feature Extraction—Pattern Detection 471
Feature Extraction—Emblem Detection 472
17.5.6 Design Evaluation 473
Method 473
Task and Procedure 474
Results 474
Fitting Room Behavior 475
User Suggestions for Enhancement 475
Use of Images of Other People 476
Results from Privacy-Related Questions 477
17.6 Lessons for Ambient Intelligence Designs of Natural and Implicit Interaction 478
Acknowledgments 479
References 480
Chapter 18: Spoken Dialogue Systems for Intelligent Environments 482
18.1 Introduction 483
18.2 Intelligent Environments 484
18.2.1 System Architecture 484
18.2.2 The Role of Spoken Dialogue 485
Network Speech Recognition 488
Distributed Speech Recognition 488
ETSI DSR front-end standards 490
A Java ME implementation of the DSR front-end 490
18.2.3 Proactiveness 492
18.3 Information Access in Intelligent Environments 493
18.3.1 Pedestrian Navigation System 493
System Description 494
Evaluation 495
18.3.2 Journey-Planning System 498
18.3.3 Independent Dialogue Partner 500
Proactive Dialogue Modeling 503
Usability Evaluation 504
18.4 Conclusions 504
Acknowledgments 505
References 505
Chapter 19: Deploying Context-Aware Health Technology at Home: Human-Centric Challenges 508
19.1 Introduction 509
19.2 The Opportunity: Context-Aware Home Health Applications 510
19.2.1 Medical Monitoring 510
19.2.2 Compensation 511
19.2.3 Prevention 511
19.2.4 Embedded Assessment 512
19.3 Case Study: Context-Aware Medication Adherence 512
19.3.1 Prototype System 513
19.3.2 Evaluation 513
19.3.3 Human-Centric Design Oversights 514
19.4 Detecting Context: Twelve Questions to Guide Research 516
19.4.1 Sensor Installation (“Install It”) 516
Question 1: What Type of Sensors Will Be Used? 516
Question 2: Are the Sensors Professionally Installed or Self-Installed in the Home? 517
Question 3: What Is the Cost of (End-User) Installation? 518
Question 4: Where Do Sensors Need to Go? 518
Question 5: How Are Sensors Selected, Positioned, and Labeled? 519
Selection 519
Positioning 520
Labeling 520
19.4.2 Activity Model Training (“Customize It”) 521
Question 6: What Type of Training Data Do the Activity Models Require? 521
Question 7: How Many Examples Are Needed? 523
19.4.3 Activity Model Maintenance (“Fix It”) 524
Question 8: Who Will Maintain the System as Activities Change, the Environment Changes, and Sensors Break? 524
Question 9: How Does the User Know What Is Broken? 526
Question 10: Can the User Make Instantaneous, Nonoscillating Fixes? 526
Question 11: What Will Keep the User’s Mental Model in Line with the Algorithmic Model? 527
Question 12: How Does a User Add a New Activity to Recognize? 527
19.5 Conclusions 527
Acknowledgments 528
References 528
Epilogue: Challenges and Outlook 534
Index 540
Publication date (per publisher) | 25.9.2009 |
---|---|
Language | English |
Subject area | Computer Science ► Software Development ► User Interfaces (HCI) |
Computer Science ► Theory / Study ► Artificial Intelligence / Robotics | |
Engineering ► Electrical Engineering / Energy Technology | |
Engineering ► Communications Engineering | |
ISBN-10 | 0-08-087850-4 / 0080878504 |
ISBN-13 | 978-0-08-087850-8 / 9780080878508 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
Details on Adobe DRM
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables and figures. A PDF can be displayed on almost any device, but it is only of limited use on small screens (smartphones, eReaders).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need a
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a
Device list and additional notes
Buying eBooks from abroad
For tax reasons we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.