Detailed Course Outline
- Data Science Overview
- What Is Data Science?
- The Growing Need for Data Science
- The Role of a Data Scientist
- Use Cases
- Finance
- Retail
- Advertising
- Defense and Intelligence
- Telecommunications and Utilities
- Healthcare and Pharmaceuticals
- Project Lifecycle
- Steps in the Project Lifecycle
- Lab Scenario Explanation
- Data Acquisition
- Where to Source Data
- Acquisition Techniques
- Evaluating Input Data
- Data Formats
- Data Quantity
- Data Quality
- Data Transformation
- Anonymization
- File Format Conversion
- Joining Datasets
- Data Analysis and Statistical Methods
- Relationship Between Statistics and Probability
- Descriptive Statistics
- Inferential Statistics
- Fundamentals of Machine Learning
- Overview
- The Three Cs of Machine Learning
- Spotlight: Naïve Bayes Classifiers
- Importance of Data and Algorithms
- Recommender Overview
- What Is a Recommender System?
- Types of Collaborative Filtering
- Limitations of Recommender Systems
- Fundamental Concepts
- Introduction to Apache Mahout
- What Apache Mahout Is (and Is Not)
- A Brief History of Mahout
- Availability and Installation
- Demonstration: Using Mahout’s Item-Based Recommender
- Implementing Recommenders with Apache Mahout
- Overview
- Similarity Metrics for Binary Preferences
- Similarity Metrics for Numeric Preferences
- Scoring
- Experimentation and Evaluation
- Measuring Recommender Effectiveness
- Designing Effective Experiments
- Conducting an Effective Experiment
- User Interfaces for Recommenders
- Production Deployment and Beyond
- Deploying to Production
- Tips and Techniques for Working at Scale
- Summarizing and Visualizing Results
- Considerations for Improvement
- Next Steps for Recommenders
- Appendix A: Hadoop Overview
- Appendix B: Mathematical Formulas
- Appendix C: Language and Tool Reference