Statistics Tutorials: Beginner to Advanced

This page is a complete repository of statistics tutorials for learning basic, intermediate and advanced statistics and machine learning algorithms with SAS, R and Python. It covers some of the most important modeling and prediction techniques, along with relevant applications. Topics include hypothesis testing, linear regression, logistic regression, classification, market basket analysis, random forest, ensemble techniques, clustering, and many more.

These days, knowledge of statistics and machine learning is among the most sought-after skills. People with hands-on experience in these techniques are paid well in the job market. In a world of increasing automation, it's important to gain experience with machine learning algorithms to stay competitive.

Top Statistics Tutorials

Statistics / Analytics Tutorials

The following tutorials are suited to both beginners and advanced analytics professionals. Together they form a step-by-step guide to learning statistics with popular statistical tools such as SAS, R and Python. They give you an idea of how these algorithms work in the background and how to perform these techniques with statistical packages, and they include both theoretical and technical explanations.

  1. Basic Statistics: Types of Variables
  2. Descriptive Statistics
  3. When to Use Mean vs. Median
  4. When and why to standardize variables
  5. Significance Testing: Independent T-Test
  6. Partial and Semipartial Correlation
  7. Linear Regression with R
  8. Linear Regression in Python
  9. Learn Python for Data Science from Scratch
  10. Logistic Regression with R
  11. Logistic Regression with SAS
  12. 15 Types of Regression
  13. Cluster Analysis using SAS
  14. Cluster Analysis with R
  15. Validate Cluster Analysis
  16. Market Basket Analysis with R
  17. Principal Component Analysis with SAS
  18. Variable Selection / Reduction with R
  19. Variable Selection with Boruta Package
  20. Selecting the Best Linear Regression Model
  21. Ridge Regression with SAS
  22. Mixed Regression Simplified
  23. Time Series Forecasting : ARIMA
  24. Support Vector Machine Simplified
  25. Variable Clustering (PROC VARCLUS)
  26. Detecting Multicollinearity in Categorical Variables
  27. Detecting Non-Linear and Non-Monotonic Relationship
  28. Model Performance in Logistic Regression
  29. Model Validation in Logistic Regression
  30. Model Monitoring in Logistic Regression
  31. Bootstrapping Logistic Regression
  32. Effect of Oversampling for Rare Events
  33. Weight of Evidence (WOE) and Information Value (IV)
  34. Difference between linear regression and logistic regression
  35. Checking Assumptions of Multiple Linear Regression
  36. Homoscedasticity Explained
  37. Detecting and correcting multicollinearity problem
  38. Detecting and solving outlier problem
  39. Difference between R-Squared and Adjusted R-Squared
  40. Standardized vs Unstandardized Coefficients
  41. Difference between CHAID and CART
  42. Relative Importance Analysis with SPSS
  43. Detecting Interaction in Regression Model
  44. Variable Selection - Wald Chi Square Analysis
  45. Learn Area under Curve (AUC)
  46. Gini Coefficient, Cumulative Accuracy Profile, AUC
  47. Chi-Square : Variable Reduction Technique
  48. Modeling Myth: General linear model and generalized linear model mean the same thing

Data Mining and Machine Learning Tutorials

The following tutorials explain popular predictive modeling and machine learning algorithms. They cover data preparation, variable selection and dimensionality reduction, model development, model performance and model validation. They also show how to check the assumptions of statistical techniques in practice and what to do when those assumptions are violated. You will also learn how to improve the accuracy of a predictive model.

  1. Observation and Performance Window
  2. Bias-Variance Tradeoff
  3. Variable Selection / Reduction
  4. Decision Tree with R
  5. Random Forest in R
  6. K-Nearest Neighbor with R
  7. Support Vector Machine Simplified
  8. Ensemble Learning : Boosting and Bagging
  9. Random Forest on Imbalance Data
  10. Calculating Variable Importance with Random Forest
  11. Shortcomings in Random Forest Variable Importance
  12. Gradient Boosting Model (GBM)
  13. Market Basket Analysis with R
  14. Ways to Correct Class Imbalances / Rare Events
  15. Weighting in Conditional Tree and SVM
  16. Ensemble Learning - Stacking (Blending)
  17. Missing Imputation Techniques
  18. Cost Sensitive Learning For Churn Model
  19. Impute Missing Values with Decision Tree
  20. Treatment of Insignificant Levels of a Categorical Variable
  21. Calculating AUC of Validation Data with SAS

Text Mining with R

This section covers the fundamentals of text mining with practical case studies, including how to visualize text mining results. The articles below also describe popular text mining techniques. These tutorials will help you get started with text analytics and social media mining with R.

  1. Text Mining Basics
  2. Creating WordCloud with R
  3. Creating WordCloud by Demographic
  4. Twitter Analytics with R
  5. Named Entity Recognition with Python
  6. Sentiment Analysis with Python

Graphs

  1. How to read a box plot
  2. Understand Gain and Lift Charts

Other Resources

The links below will help you excel in the analytics field. They range from 'How to enter into analytics' to 'What are the career prospects in analytics'. If you are a novice in the analytics field, they answer many common questions, such as the scope of SAS and R. These resources will also prepare you to work on a real-world data science project.

  1. How to get into Analytics Field
  2. Free Data Sources for Predictive Modeling and Text Mining
  3. Free Ebooks on R, Python and Data Science
  4. Companies using R
  5. Analytics Companies Using SAS in India
  6. Analytics Companies Using SPSS in India
  7. Data Analysis Tools : Excel, SPSS and SAS
  8. List of free statistical softwares
  9. List of free econometrics softwares
  10. Statistics Jokes
  11. AI Jokes
  12. Data Science Jokes