Fr. 189.00

Peter C. Bruce, Nitin R. Patel, Galit Shmueli...

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro

English · Hardback

Shipping usually within 1 to 3 weeks (not available at short notice)

Description

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® presents an applied and interactive approach to data mining.

Featuring hands-on applications with JMP Pro®, a statistical package from the SAS Institute, the book uses engaging, real-world examples to build a theoretical and practical understanding of key data mining methods, especially predictive models for classification and prediction. Topics include data visualization, dimension reduction techniques, clustering, linear and logistic regression, classification and regression trees, discriminant analysis, naive Bayes, neural networks, uplift modeling, ensemble models, and time series forecasting.

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® also includes:
* Detailed summaries that supply an outline of key topics at the beginning of each chapter
* End-of-chapter examples and exercises that allow readers to expand their comprehension of the presented material
* Data-rich case studies to illustrate various applications of data mining techniques
* A companion website with over two dozen data sets, exercises and case study solutions, and slides for instructors

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® is an excellent textbook for advanced undergraduate and graduate-level courses on data mining, predictive analytics, and business analytics. The book is also a one-of-a-kind resource for data scientists, analysts, researchers, and practitioners working with analytics in the fields of management, finance, marketing, information technology, healthcare, education, and any other data-rich field.

Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University's Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 journal articles, books, textbooks, and book chapters, including Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.

Peter C. Bruce is President and Founder of the Institute for Statistics Education at www.statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective and co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, both published by Wiley.

Mia Stephens is Academic Ambassador at JMP®, a division of SAS Institute. Prior to joining SAS, she was an adjunct professor of statistics at the University of New Hampshire and a founding member of the North Haven Group LLC, a statistical training and consulting company. She is the co-author of three other books, including Visual Six Sigma: Making Data Analysis Lean, Second Edition, also published by Wiley.

Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years. He is co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications in XLMiner®, Third Edition, also published by Wiley.

List of contents

Dedication
Foreword
Preface
Acknowledgments

PART I PRELIMINARIES

CHAPTER 1 Introduction
1.1 What is Business Analytics?
1.2 What is Data Mining?
1.3 Data Mining and Related Terms
1.4 Big Data
1.5 Data Science
1.6 Why Are There So Many Different Methods?
1.7 Terminology and Notation
1.8 Road Maps to This Book
  Order of Topics

CHAPTER 2 Overview of the Data Mining Process
2.1 Introduction
2.2 Core Ideas in Data Mining
2.3 The Steps in Data Mining
2.4 Preliminary Steps
2.5 Predictive Power and Overfitting
2.6 Building a Predictive Model with JMP Pro
2.7 Using JMP Pro for Data Mining
2.8 Automating Data Mining Solutions
  Data Mining Software Tools (Herb Edelstein)
Problems

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization
3.1 Uses of Data Visualization
3.2 Data Examples
  Example 1: Boston Housing Data
  Example 2: Ridership on Amtrak Trains
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots
  Distribution Plots
  Heatmaps: Visualizing Correlations and Missing Values
3.4 Multi-Dimensional Visualization
  Adding Variables: Color, Hue, Size, Shape, Multiple Panels, Animation
  Manipulations: Re-scaling, Aggregation and Hierarchies, Zooming and Panning, Filtering
  Reference: Trend Line and Labels
  Scaling Up: Large Datasets
  Multivariate Plot: Parallel Coordinates Plot
  Interactive Visualization
3.5 Specialized Visualizations
  Visualizing Networked Data
  Visualizing Hierarchical Data: Treemaps
  Visualizing Geographical Data: Maps
3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal
  Prediction
  Classification
  Time Series Forecasting
  Unsupervised Learning
Problems

CHAPTER 4 Dimension Reduction
4.1 Introduction
4.2 Curse of Dimensionality
4.3 Practical Considerations
  Example 1: House Prices in Boston
4.4 Data Summaries
4.5 Correlation Analysis
4.6 Reducing the Number of Categories in Categorical Variables
4.7 Converting a Categorical Variable to a Continuous Variable
4.8 Principal Components Analysis
  Example 2: Breakfast Cereals
  Principal Components
  Normalizing the Data
  Using Principal Components for Classification and Prediction
4.9 Dimension Reduction Using Regression Models
4.10 Dimension Reduction Using Classification and Regression Trees
Problems

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance
5.1 Introduction
5.2 Evaluating Predictive Performance
  Benchmark: The Average
  Prediction Accuracy Measures
5.3 Judging Classifier Performance
  Benchmark: The Naive Rule
  Class Separation
  The Classification Matrix
  Using the Validation Data
  Accuracy Measures
  Cutoff for Classification
  Performance in Unequal Importance of Classes
  Asymmetric Misclassification Costs
5.4 Judging Ranking Performance
5.5 Oversampling
Problems

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression
6.1 Introduction
6.2 Explanatory vs. Predictive Modeling
6.3 Estimating the Regression Equation and Prediction
  Example: Predicting the Price of Used Toyota Corolla Automobiles
6.4 Variable Selection in Linear Regression
  Reducing the Number of Predictors
  How to Reduce the Number of Predictors
  Manual Variable Selection
  Automated Variable Selection
Problems

CHAPTER 7 k-Nearest Neighbors (kNN)
7.1 The k-NN Classifier (Categorical Outcome)
  Determining Neighbors
  Classification Rule
  Example: Riding Mowers
  Choosing k
  Setting the Cutoff Value
7.2 k-NN for a Numerical Response
7.3 Advantages and Shortcomings of k-NN Algorithms
Problems

CHAPTER 8 The Naive Bayes Classifier
8.1 Introduction
  Example 1: Predicting Fraudulent Financial Reporting
8.2 Applying the Full (Exact) Bayesian Classifier
8.3 Advantages and Shortcomings of the Naive Bayes Classifier
Problems

CHAPTER 9 Classification and Regression Trees
9.1 Introduction
9.2 Classification Trees
  Example 1: Riding Mowers
9.3 Growing a Tree
  Growing a Tree Example
  Growing a Tree with CART
9.4 Evaluating the Performance of a Classification Tree
  Example 2: Acceptance of Personal Loan
9.5 Avoiding Overfitting
  Stopping Tree Growth: CHAID
  Pruning the Tree
9.6 Classification Rules from Trees
9.7 Classification Trees for More Than Two Classes
9.8 Regression Trees
  Prediction
  Evaluating Performance
9.9 Advantages and Weaknesses of a Tree
9.10 Improving Prediction: Multiple Trees
9.11 CART, and Measures of Impurity
  Measuring Impurity
Problems

CHAPTER 10 Logistic Regression
10.1 Introduction
10.2 The Logistic Regression Model
  Example: Acceptance of Personal Loan
  Model with a Single Predictor
  Estimating the Logistic Model from Data: Computing Parameter Estimates
10.3 Evaluating Classification Performance
  Variable Selection
10.4 Example of Complete Analysis: Predicting Delayed Flights
  Data Preprocessing
  Model Fitting, Estimation and Interpretation - A Simple Model
  Model Fitting, Estimation and Interpretation - The Full Model
  Model Performance
  Variable Selection
10.5 Appendix: Logistic Regression for Profiling
  Appendix A: Why Linear Regression Is Inappropriate for a Categorical Response
  Appendix B: Evaluating Explanatory Power
  Appendix C: Logistic Regression for More Than Two Classes
Problems

CHAPTER 11 Neural Nets
11.1 Introduction
11.2 Concept and Structure of a Neural Network
11.3 Fitting a Network to Data
  Example 1: Tiny Dataset
  Computing Output of Nodes
  Preprocessing the Data
  Training the Model
  Using the Output for Prediction and Classification
  Example 2: Classifying Accident Severity
  Avoiding Overfitting
11.4 User Input in JMP Pro
11.5 Exploring the Relationship Between Predictors and Response
11.6 Advantages and Weaknesses of Neural Networks
Problems

CHAPTER 12 Discriminant Analysis
12.1 Introduction
  Example 1: Riding Mowers
  Example 2: Personal Loan Acceptance
12.2 Distance of an Observation from a Class
12.3 From Distances to Propensities and Classifications
12.4 Classification Performance of Discriminant Analysis
12.5 Prior Probabilities
12.6 Classifying More Than Two Classes
  Example 3: Medical Dispatch to Accident Scenes
12.7 Advantages and Weaknesses
Problems

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling
13.1 Ensembles
  Why Ensembles Can Improve Predictive Power
  Simple Averaging
  Bagging
  Boosting
  Advantages and Weaknesses of Ensembles
13.2 Uplift (Persuasion) Modeling
  A-B Testing
  Uplift
  Gathering the Data
  A Simple Model
  Modeling Individual Uplift
  Using the Results of an Uplift Model
  Creating Uplift Models in JMP Pro
13.3 Summary
Problems

PART V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Cluster Analysis
14.1 Introduction
  Example: Public Utilities
14.2 Measuring Distance Between Two Observations
  Euclidean Distance
  Normalizing Numerical Measurements
  Other Distance Measures for Numerical Data
  Distance Measures for Categorical Data
  Distance Measures for Mixed Data
14.3 Measuring Distance Between Two Clusters
14.4 Hierarchical (Agglomerative) Clustering
  Single Linkage
  Complete Linkage
  Average Linkage
  Centroid Linkage
  Dendrograms: Displaying Clustering Process and Results
  Validating Clusters
  Limitations of Hierarchical Clustering
14.5 Nonhierarchical Clustering: The k-Means Algorithm
  Initial Partition into k Clusters
Problems

PART VI FORECASTING TIME SERIES

CHAPTER 15 Handling Time Series
15.1 Introduction
15.2 Descriptive vs. Predictive Modeling
15.3 Popular Forecasting Methods in Business
  Combining Methods
15.4 Time Series Components
  Example: Ridership on Amtrak Trains
15.5 Data Partitioning and Performance Evaluation
  Benchmark Performance: Naive Forecasts
  Generating Future Forecasts
Problems

CHAPTER 16 Regression-Based Forecasting
16.1 A Model with Trend
  Linear Trend
  Exponential Trend
  Polynomial Trend
16.2 A Model with Seasonality
16.3 A Model with Trend and Seasonality
16.4 Autocorrelation and ARIMA Models
  Computing Autocorrelation
  Improving Forecasts by Integrating Autocorrelation Information
  Fitting AR Models to Residuals
  Evaluating Predictability
Problems

CHAPTER 17 Smoothing Methods
17.1 Introduction
17.2 Moving Average
  Centered Moving Average for Visualization
  Trailing Moving Average for Forecasting
  Choosing Window Width (w)
17.3 Simple Exponential Smoothing
  Choosing Smoothing Parameter
  Relation Between Moving Average and Simple Exponential Smoothing
17.4 Advanced Exponential Smoothing
  Series with a Trend
  Series with a Trend and Seasonality
Problems

PART VII CASES

CHAPTER 18 Cases
18.1 Charles Book Club
18.2 German Credit
  Background
  Data
18.3 Tayko Software Cataloger
18.4 Political Persuasion
  Background
  Predictive Analytics Arrives in US Politics
  Political Targeting
  Uplift
  Data
  Assignment
18.5 Taxi Cancellations
  Business Situation
  Assignment
18.6 Segmenting Consumers of Bath Soap
  Appendix
18.7 Direct-Mail Fundraising
18.8 Predicting Bankruptcy
18.9 Time Series Case: Forecasting Public Transportation Demand

References
Data Files Used in the Book
Index

About the author

GALIT SHMUELI, PhD, is Distinguished Professor at the Institute of Service Science, National Tsing Hua University, Taiwan. She is co-author of the best-selling textbook Data Mining for Business Analytics, among other books and numerous publications in top journals. She has designed and instructed courses on forecasting, data mining, statistics, and other data analytics topics at the University of Maryland's Smith School of Business, the Indian School of Business, National Tsing Hua University, and online at statistics.com.
For more information, see galitshmueli.com.

Product details

Authors	Peter C. Bruce, Nitin R. Patel, Galit Shmueli, Mia L. Stephens, Inbal Yahav
Publisher	Wiley, John and Sons Ltd

Languages	English
Product format	Hardback
Released	30.06.2016

EAN	9781118877432
ISBN	978-1-118-87743-2
No. of pages	464
Subjects	Natural sciences, medicine, IT, technology > Mathematics > Probability theory, stochastic theory, mathematical statistics; Social sciences, law, business > Business > Economics; Statistics, Computer science, Data Mining, Business Intelligence, Business analytics, Business & management, Database & Data Warehousing Technologies, Decision-making theory, Decision Sciences, JMP (Software)
