How do you use R to import, manage, visualize, and analyze real-world data? With this short, hands-on tutorial, you learn how to collect online data, massage it into a reasonable form, and work with it using R facilities to interact with web servers, parse HTML and XML, and more. Rather than use canned sample data, you'll plot and analyze current home foreclosure auctions in Philadelphia.
This practical mashup exercise shows you how to access spatial data in several formats locally and over the Web to produce a map of home foreclosures. It's an excellent way to explore how the R environment works with R packages and performs statistical analysis.
Parse messy data from public foreclosure auction postings
Plot the data using R's PBSmapping package
Import US Census data to add context to foreclosure data
Use R's lattice and latticeExtra packages for data visualization
Create multidimensional correlation graphs with the pairs() scatterplot matrix function
List of contents
Introduction
Chapter 1: Mapping Foreclosures
Chapter 2: Statistics of Foreclosure
Getting Started
About the authors
Jeremy Leipzig is a bioinformatics software developer at DuPont Crop Genetics. He has conducted academic research in viral integration, metagenomics, schizophrenia, and alternative splicing. While a graduate student, he developed one of the first faculty-review websites and wrote "Work Issues in Software Engineering", a survey-based study of "death march" projects.
Xiao-Yi Li is a biostatistician with an M.Sc. from the University of Michigan. In fact, her entire educational experience has revolved around statistics, percentile or otherwise. Currently, she works in the bioinformatics group at DuPont as a statistical consultant. Her work consists mostly of design of experiments and analysis for phenotypic screens, quality control in microarrays, and association mapping.
Summary
Data analysis is more than means and standard deviations. This ebook is a case study of how you can push R into new territory to analyze online real-world data.