Data Preparation for Analytics Using SAS, by Gerhard Svolba, 407 pp., US$67.95.
This is another useful SAS (statistical analysis system) book, designed specifically for business but still usable in other disciplines. The steps and key points of data acquisition and preparation were described by an experienced analyst, which will help a beginner follow the logic.
A number of chapters were devoted to outlining general and business-specific matters (dos and don'ts), while other chapters were devoted to SAS statements, with explanations for program submission.
Five-Parter
This book consists of five parts:
- Data preparation in the business field
- Data structure and modeling
- Data mart coding and content
- Sampling, scoring, and automation
- Case study
Each part was further divided into chapters in which SAS statements and their output were introduced and explained. A case study was also presented in which five business-related data sets were introduced, with SAS statements showing how to manipulate and analyze them.
Before any data analysis, exploratory data analysis is an important step in learning and understanding the nature of the data. Finalizing the data for analysis is time-consuming, but it helps the analyst gain the knowledge needed to decide, for example, whether to remove or retain an outlier.
Deciphering the Data
The author drew on his experience with data and analyses, summarizing it as detailed steps to follow and indicating what kinds of analyses need to be done.
Issues such as accessing data from SAS (SQL, Oracle, etc.), Microsoft Office, text files, hierarchical text files, and other media were described. Data manipulation tasks, such as grouping and transposing, to name just a few, were explained through examples with the SAS statements needed to accomplish them. Macros were used frequently, in addition to regular SAS statements.
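For readers unfamiliar with the two manipulations named above, here is a minimal sketch of grouping and transposing. It is written in Python rather than SAS and is not taken from the book. The records, field names (cust_id, month, amount), and helper functions are all hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per customer per transaction.
rows = [
    {"cust_id": 1, "month": "jan", "amount": 10.0},
    {"cust_id": 1, "month": "feb", "amount": 15.0},
    {"cust_id": 2, "month": "jan", "amount": 7.5},
    {"cust_id": 1, "month": "jan", "amount": 2.5},
]

def group_sum(records, key, value):
    """Grouping: total `value` for each distinct `key`."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r[value]
    return dict(totals)

def transpose_long_to_wide(records, id_key, col_key, value):
    """Transposing: one output row per id, one column per col_key value.

    Repeated (id, column) pairs are summed.
    """
    wide = defaultdict(dict)
    for r in records:
        row = wide[r[id_key]]
        row[r[col_key]] = row.get(r[col_key], 0.0) + r[value]
    return dict(wide)

totals = group_sum(rows, "cust_id", "amount")
wide = transpose_long_to_wide(rows, "cust_id", "month", "amount")
```

The wide result has one row per customer, which is the shape an analysis table with one observation per subject typically needs.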
To control generalization error, a validation method using training, validation, and test data sets was presented. Sampling from raw data by simple random, stratified, and clustered sampling was introduced, with the corresponding SAS code and related restrictions.
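The partitioning and stratified sampling ideas can be sketched as follows. This is an illustrative Python version, not the book's SAS code. The 60/20/20 split, the fixed seed, the function names, and the customer records are assumptions made for the example.

```python
import random
from collections import defaultdict

def partition(records, train=0.6, valid=0.2, seed=42):
    """Shuffle a copy and split it into training, validation, and test sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def stratified_sample(records, stratum_key, fraction, seed=42):
    """Draw the same fraction, without replacement, from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for r in records:
        strata[r[stratum_key]].append(r)
    sample = []
    for members in strata.values():
        # Take at least one record so that no stratum disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 100 customers, 20 of them in segment "B".
customers = [{"id": i, "segment": "B" if i % 5 == 0 else "A"} for i in range(100)]
train_set, valid_set, test_set = partition(customers)
sample = stratified_sample(customers, "segment", 0.1)
```

A simple random sample would call rng.sample on the whole list instead. The stratified version keeps the segment proportions of the raw data, which matters when one segment is rare.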
These sampling methods can also be used to prepare samples for data exploration in SAS Enterprise Miner. Prediction of an unmeasured response (target) variable was presented through scoring statements using standard and logistic regression. The book closes with a case study involving five marketing data sets.
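Scoring, as mentioned above, means applying an already fitted model to records whose target value is unknown. The following is a minimal Python sketch for the logistic regression case, not the book's SAS scoring code. The coefficients, variable names, and customer records are hypothetical and are not estimated from any real data.

```python
import math

def score_logistic(record, intercept, coefs):
    """Return the predicted probability of the target event for one record."""
    linear = intercept + sum(coefs[name] * record[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients, as if estimated earlier on training data.
intercept = -1.0
coefs = {"age": 0.02, "n_purchases": 0.3}

# New records with no measured target value.
new_customers = [
    {"id": 101, "age": 50, "n_purchases": 0},
    {"id": 102, "age": 30, "n_purchases": 4},
]
scores = {c["id"]: score_logistic(c, intercept, coefs) for c in new_customers}
```

Each score is a probability between 0 and 1, so the scored records can be ranked or compared against a cutoff chosen by the analyst.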
Analysts and students in business and other disciplines may benefit from this book, as they can find many tips on how to carry out analyses and which kinds of analyses need to be done.
© 2008 Technometrics. All rights reserved.
© 2008 ECT News Network. All rights reserved.