Fr. 186.00

Dependable Computing - Design and Assessment

English · Hardback

Shipping usually within 3 to 5 weeks

Description


Dependable Computing
 
Covering dependability from software and hardware perspectives
 
Dependable Computing: Design and Assessment looks at both the software and hardware aspects of dependability.
 
This book:
* Provides an in-depth examination of dependability/fault tolerance topics
* Describes dependability taxonomy, and briefly contrasts classical techniques with their modern counterparts or extensions
* Walks up the system stack from the hardware logic via operating systems up to software applications with respect to how they are hardened for dependability
* Describes the use of measurement-based analysis of computing systems
* Illustrates technology through real-life applications
* Discusses security attacks and unique dependability requirements for emerging applications, e.g., smart electric power grids and cloud computing
* Illustrates, through critical societal applications such as autonomous vehicles, large-scale clouds, and engineering solutions for healthcare, the emerging challenges in making artificial intelligence (AI) and its applications dependable and trustworthy
 
This book is suitable for students of computer engineering and computer science. Professionals working within the new reality to ensure dependable computing will find helpful information to support their efforts. Supported by practical case studies and use cases from both academia and real-world deployments, the book traces developments that include the impact of artificial intelligence and machine learning on this ever-growing field. It offers a single compendium spanning the many areas in which dependability has been applied, providing theoretical concepts and applied knowledge with content that will excite a beginner and rigor that will satisfy an expert. Accompanying the book is an online repository of problem sets and solutions, as well as slides for instructors, that span the chapters of the book.

List of contents

About the Authors

Preface

Acknowledgments

About the Companion Website

1 Dependability Concepts and Taxonomy

1.1 Introduction

1.2 Placing Classical Dependability Techniques in Perspective

1.3 Taxonomy of Dependable Computing

1.3.1 Faults, Errors, and Failures

1.4 Fault Classes

1.5 The Fault Cycle and Dependability Measures

1.6 Fault and Error Classification

1.7 Mean Time Between Failures

1.8 User-perceived System Dependability

1.9 Technology Trends and Failure Behavior

1.10 Issues at the Hardware Level

1.11 Issues at the Platform Level

1.12 What is Unique About this Book?

1.13 Overview of the Book

References

2 Classical Dependability Techniques and Modern Computing Systems: Where and How Do They Meet?

2.1 Illustrative Case Studies of Design for Dependability

2.2 Cloud Computing: A Rapidly Expanding Computing Paradigm

2.3 New Application Domains

2.4 Insights

References

3 Hardware Error Detection and Recovery Through Hardware-Implemented Techniques

3.1 Introduction

3.2 Redundancy Techniques

3.3 Watchdog Timers

3.4 Information Redundancy

3.5 Capability and Consistency Checking

3.6 Insights

References

4 Processor Level Error Detection and Recovery

4.1 Introduction

4.2 Logic-level Techniques

4.3 Error Protection in the Processors

4.4 Academic Research on Hardware-level Error Protection

4.5 Insights

References

5 Hardware Error Detection Through Software-Implemented Techniques

5.1 Introduction

5.2 Duplication-based Software Detection Techniques

5.3 Control-Flow Checking

5.4 Heartbeats

5.5 Assertions

5.6 Insights

References

6 Software Error Detection and Recovery Through Software Analysis

6.1 Introduction

6.2 Diverse Programming

6.3 Static Analysis Techniques

6.4 Error Detection Based on Dynamic Program Analysis

6.5 Processor-Level Selective Replication

6.6 Runtime Checking for Residual Software Bugs

6.7 Data Audit

6.8 Application of Data Audit Techniques

6.9 Insights

References

7 Measurement-based Analysis of System Software: Operating System Failure Behavior

7.1 Introduction

7.2 MVS (Multiple Virtual Storage)

7.3 Experimental Analysis of OS Dependability

7.4 Behavior of the Linux Operating System in the Presence of Errors

7.5 Evaluation of Process Pairs in Tandem GUARDIAN

7.6 Benchmarking Multiple Operating Systems: A Case Study Using Linux on Pentium, Solaris on SPARC, and AIX on POWER

7.7 Dependability Overview of the Cisco Nexus Operating System

7.8 Evaluating Operating Systems: Related Studies

7.9 Insights

References

8 Reliable Networked and Distributed Systems

8.1 Introduction

8.2 System Model

8.3 Failure Models

8.4 Agreement Protocols

8.5 Reliable Broadcast

8.6 Reliable Group Communication

8.7 Replication
 
8.8 Replication of Multithrea

About the author

Ravishankar K. Iyer is George and Ann Fisher Distinguished Professor of Engineering at the University of Illinois Urbana-Champaign, USA. He holds joint appointments in the Departments of Electrical & Computer Engineering and Computer Science as well as the Coordinated Science Laboratory (CSL), the National Center for Supercomputing Applications (NCSA), and the Carl R. Woese Institute for Genomic Biology. The winner of numerous awards and honors, he was the founding chief scientist of the Information Trust Institute at UIUC, a campus-wide research center addressing security, reliability, and safety issues in critical infrastructures.

Zbigniew T. Kalbarczyk is a Research Professor in the Department of Electrical & Computer Engineering and the Coordinated Science Laboratory of the University of Illinois Urbana-Champaign, USA. He is a member of the IEEE, the IEEE Computer Society, and IFIP Working Group 10.4 on Dependable Computing and Fault Tolerance. Dr. Kalbarczyk's research interests are in the design and validation of reliable and secure computing systems. His current work explores emerging computing technologies, machine learning-based methods for early detection of security attacks, analysis of data on failures and security attacks in large computing systems, and more.

Nithin M. Nakka received his B. Tech (hons.) degree from the Indian Institute of Technology, Kharagpur, India, and his M.S. and Ph.D. degrees from the University of Illinois Urbana-Champaign, USA. He is a Technical Leader at Cisco Systems and has worked on most layers of the networking stack, from network data-plane hardware, including layer-2 and layer-3 (control plane), network controllers, and network fabric monitoring. His areas of research interest include systems reliability, network telemetry, and hardware-implemented fault tolerance.

Summary

The only recent book on dependability and fault tolerance that covers both software and hardware aspects, Dependable Computing: Design and Assessment addresses the new reality of dependability.
