Prerequisite: Attendees should have some coding experience and a basic knowledge of statistics, and must bring a laptop computer with RStudio installed before the session. When you register for the class, you will receive detailed instructions for downloading and installing RStudio.
With the advent of big data, there is an increased focus on data mining and the value that can be derived from large data sets. Data mining is the process of selecting, exploring, and modeling large amounts of data to uncover previously unknown information for business benefit.
R is an open-source software environment for statistical computing and graphics, and it is very popular with data scientists. R is used for data analysis, extracting and transforming data, fitting models, drawing inferences, making predictions, plotting, and reporting results. In this class, you will learn R basics, how to work with and reshape data frames, basic statistics, graphing, linear models, non-linear models, clustering, and model diagnostics.
You Will Learn
- How to configure the RStudio environment and load R packages
- How to use R basics such as basic math, data types, vectors, and calling functions
- How to use advanced data structures such as data frames, lists, and matrices
- How to use R base graphics
- How to compute basic statistics, correlation, and covariance in R
- How to use linear models such as simple linear regression and logistic regression
- How to use non-linear models such as decision trees and Random Forests
- How to apply clustering using K-means
- How to perform model diagnostics
This class is intended for anyone who wants to use data mining techniques to find insights in data and who has at least some statistical and programming experience.