About the author
MALCOLM ATKINSON, PhD, is Professor of e-Science in the School of Informatics at the University of Edinburgh in Scotland. He is also Data-Intensive Research Group leader, Director of the e-Science Institute, IT architect for the ADMIRE and VERCE EU projects, and UK e-Science Envoy. Professor Atkinson has led research projects for several decades and has served on many advisory bodies.
Blurb
Complete guidance for mastering the tools and techniques of the digital revolution
With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections.
Emphasizing data-intensive thinking and interdisciplinary collaboration, The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation.
Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book:
* Outlines the concepts and rationale for implementing data-intensive computing in organizations
* Covers, from the ground up, problem-solving strategies for data analysis in a data-rich world
* Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language (DISPEL)
* Features in-depth case studies in customer relations, environmental hazards, seismology, and more
* Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering
* Includes sample program snippets throughout the text as well as additional materials on a companion website
The Data Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing.
Summary
This book presents the most up-to-date opportunities and challenges emerging in knowledge discovery, helping readers develop the technical skills to design and develop data-intensive methods and processes.
List of contents
CONTRIBUTORS
FOREWORD
PREFACE
THE EDITORS
PART I STRATEGIES FOR SUCCESS IN THE DIGITAL-DATA REVOLUTION
1. The Digital-Data Challenge
Malcolm Atkinson and Mark Parsons
1.1 The Digital Revolution
1.2 Changing How We Think and Behave
1.3 Moving Adroitly in this Fast-Changing Field
1.4 Digital-Data Challenges Exist Everywhere
1.5 Changing How We Work
1.6 Divide and Conquer Offers the Solution
1.7 Engineering Data-to-Knowledge Highways
2. The Digital-Data Revolution
Malcolm Atkinson
2.1 Data, Information, and Knowledge
2.2 Increasing Volumes and Diversity of Data
2.3 Changing the Ways We Work with Data
3. The Data-Intensive Survival Guide
Malcolm Atkinson
3.1 Introduction: Challenges and Strategy
3.2 Three Categories of Expert
3.3 The Data-Intensive Architecture
3.4 An Operational Data-Intensive System
3.5 Introducing DISPEL
3.6 A Simple DISPEL Example
3.7 Supporting Data-Intensive Experts
3.8 DISPEL in the Context of Contemporary Systems
3.9 Datascopes
3.10 Ramps for Incremental Engagement
3.11 Readers' Guide to the Rest of This Book
4. Data-Intensive Thinking with DISPEL
Malcolm Atkinson
4.1 Processing Elements
4.2 Connections
4.3 Data Streams and Structure
4.4 Functions
4.5 The Three-Level Type System
4.6 Registry, Libraries, and Descriptions
4.7 Achieving Data-Intensive Performance
4.8 Reliability and Control
4.9 The Data-to-Knowledge Highway
PART II DATA-INTENSIVE KNOWLEDGE DISCOVERY
5. Data-Intensive Analysis
Oscar Corcho and Jano van Hemert
5.1 Knowledge Discovery in Telco Inc.
5.2 Understanding Customers to Prevent Churn
5.3 Preventing Churn Across Multiple Companies
5.4 Understanding Customers by Combining Heterogeneous Public and Private Data
5.5 Conclusions
6. Problem Solving in Data-Intensive Knowledge Discovery
Oscar Corcho and Jano van Hemert
6.1 The Conventional Life Cycle of Knowledge Discovery
6.2 Knowledge Discovery Over Heterogeneous Data Sources
6.3 Knowledge Discovery from Private and Public, Structured and Nonstructured Data
6.4 Conclusions
7. Data-Intensive Components and Usage Patterns
Oscar Corcho
7.1 Data Source Access and Transformation Components
7.2 Data Integration Components
7.3 Data Preparation and Processing Components
7.4 Data-Mining Components
7.5 Visualization and Knowledge Delivery Components
8. Sharing and Reuse in Knowledge Discovery
Oscar Corcho
8.1 Strategies for Sharing and Reuse
8.2 Data Analysis Ontologies for Data Analysis Experts
8.3 Generic Ontologies for Metadata Generation
8.4 Domain Ontologies for Domain Experts
8.5 Conclusions
PART III DATA-INTENSIVE ENGINEERING
9. Platforms for Data-Intensive Analysis
David Snelling
9.1 The Hourglass Reprise
9.2 The Motivation for a Platform
9.3 Realization
10. Definition of the DISPEL Language
Paul Martin and Gagarine Yaikhom
10.1 A Simple Example
10.2 Processing Elements