Read more
Informationen zum Autor MICHAEL BOWLES teaches machine learning at UC Berkeley, University of New Haven and Hacker Dojo in Silicon Valley, consults on machine learning projects, and is involved in a number of startups in such areas as semi conductor inspection, drug design and optimization and trading in the financial markets. Following an assistant professorship at MIT, Michael went on to found and run two Silicon Valley startups, both of which went public. His courses are always popular and receive great feedback from participants. Klappentext Machine Learning with Spark and Python Essential Techniques for Predictive Analytics, Second Edition simplifies ML for practical uses by focusing on two key algorithms. This new second edition improves with the addition of Spark--a ML framework from the Apache foundation. By implementing Spark, machine learning students can easily process much large data sets and call the spark algorithms using ordinary Python code.Machine Learning with Spark and Python focuses on two algorithm families (linear methods and ensemble methods) that effectively predict outcomes. This type of problem covers many use cases such as what ad to place on a web page, predicting prices in securities markets, or detecting credit card fraud. The focus on two families gives enough room for full descriptions of the mechanisms at work in the algorithms. Then the code examples serve to illustrate the workings of the machinery with specific hackable code. Zusammenfassung Machine Learning with Spark and Python Essential Techniques for Predictive Analytics! Second Edition simplifies ML for practical uses by focusing on two key algorithms. This new second edition improves with the addition of Spark--a ML framework from the Apache foundation. By implementing Spark! machine learning students can easily process much large data sets and call the spark algorithms using ordinary Python code.Machine Learning with Spark and Python focuses on two algorithm families (linear methods and ensemble methods) that effectively predict outcomes. This type of problem covers many use cases such as what ad to place on a web page! predicting prices in securities markets! or detecting credit card fraud. The focus on two families gives enough room for full descriptions of the mechanisms at work in the algorithms. Then the code examples serve to illustrate the workings of the machinery with specific hackable code. Inhaltsverzeichnis Introduction xxi Chapter 1 The Two Essential Algorithms for Making Predictions 1 Why are These Two Algorithms So Useful? 2 What are Penalized Regression Methods? 7 What are Ensemble Methods? 9 How to Decide Which Algorithm to Use 11 The Process Steps for Building a Predictive Model 13 Framing a Machine Learning Problem 15 Feature Extraction and Feature Engineering 17 Determining Performance of a Trained Model 18 Chapter Contents and Dependencies 18 Summary 20 Chapter 2 Understand the Problem by Understanding the Data 23 The Anatomy of a New Problem 24 Different Types of Attributes and Labels Drive Modeling Choices 26 Things to Notice about Your New Data Set 27 Classification Problems: Detecting Unexploded Mines Using Sonar 28 Physical Characteristics of the Rocks Versus Mines Data Set 29 Statistical Summaries of the Rocks Versus Mines Data Set 32 Visualization of Outliers Using a Quantile-Quantile Plot 34 Statistical Characterization of Categorical Attributes 35 How to Use Python Pandas to Summarize the Rocks Versus Mines Data Set 36 Visualizing Properties of the Rocks Versus Mines Data Set 39 Visualizing with Parallel Coordinates Plots 39 Visualizing Interrelationships between Attributes and Labels 41 Visualizing Attribute and Label Correlations Using a Heat Map 48