Xavier Data Science Homework 4
- Identify all questions that you attempted in this template
Q1 Chapter 04 Classification Examples
Part 1 Review logistic regression in Chapter 4: Classification
https://github.com/JWarmenhoven/ISLR-python
Use the examples to review 4.3 logistic regression for the ISLR text:
- Plot Figure 4.1
- Plot Figure 4.2
- Table 4.1, 4.2, 4.3
- Plot Figure 4.3
Hint: use https://nbviewer.jupyter.org/github/JWarmenhoven/ISL-python/blob/master/Notebooks/Chapter%204.ipynb#4.3-Logistic-Regression
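As a rough sketch of the Section 4.3 workflow, the block below fits a logistic regression of a binary default indicator on account balance and then predicts the probability of default at two balances. The data are synthetic stand-ins generated in the code, not the Default data set, and the names `balance` and `default` as well as the generating coefficients are assumptions made only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the Default data (assumption: NOT the real data set).
rng = np.random.default_rng(0)
n = 2000
balance = rng.uniform(0, 2500, size=n)        # hypothetical balances
true_logit = -10.0 + 0.0055 * balance         # hypothetical coefficients
default = rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))

# Express balance in thousands so the solver is well conditioned.
X = (balance / 1000.0).reshape(-1, 1)

# A very large C makes the default L2 penalty negligible, which
# approximates a plain maximum-likelihood fit.
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(X, default)

# Predicted probability of default at balances of 1000 and 2000.
p_low, p_high = model.predict_proba(np.array([[1.0], [2.0]]))[:, 1]
print(f"intercept={model.intercept_[0]:.2f}, slope per 1000={model.coef_[0, 0]:.2f}")
print(f"P(default | 1000)={p_low:.4f}, P(default | 2000)={p_high:.4f}")
```

To work on the real data, the synthetic arrays would be replaced by columns read from Default.xlsx; the fitting and prediction calls stay the same.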
Part 2 Application to Caravan Insurance Data
Use Caravan.csv to apply KNN and logistic regression to the Caravan data.
Hint: use https://nbviewer.jupyter.org/github/JWarmenhoven/ISL-python/blob/master/Notebooks/Chapter%204.ipynb#4.6.5-K-Nearest-Neighbors
Q2. Classification Textbook Examples
Using the Boston data set, fit classification models in order to predict whether a given suburb has a crime rate above or below the median. Explore logistic regression and KNN models using various subsets of the predictors. Describe your findings.
Hint: use https://botlnec.github.io/islp/sols/chapter4/exercise13/
Q3 Iris Data Set and Classification (iris.csv)
The Iris dataset was used in R.A. Fisher's classic 1936 paper. It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other. The columns in this dataset are:
- Id
- Sepal Length Cm
- Sepal Width Cm
- Petal Length Cm
- Petal Width Cm
- Species
- Plot the iris dataset: i) "Sepal Length vs Sepal Width" ii) "Petal Length vs Petal Width"
Split into training and test sets and
- Apply Naïve Bayes Classifier to classify species with the decision boundaries
- Apply logistic regression to classify species with the decision boundaries
- Apply KNN algorithm to classify species with the decision boundaries
- Compare the "Truth matrix" (confusion matrix) and accuracy of the three algorithms
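The Q3 steps above can be sketched as follows. This is one possible outline, not a reference solution: it assumes scikit-learn's bundled copy of the iris data in place of iris.csv, a 70/30 stratified split, Gaussian Naive Bayes, and k = 5 for KNN. Because there are three species, TP, TN, FP and FN are computed per class in a one-vs-rest sense from the confusion matrix. Decision-boundary plots are left out to keep the sketch short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled iris data stands in for iris.csv.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}


def per_class_counts(cm):
    """One-vs-rest TP, TN, FP, FN for each class of a confusion matrix."""
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as the class, actually another
    fn = cm.sum(axis=1) - tp  # actually the class, predicted as another
    tn = cm.sum() - tp - fp - fn
    return tp, tn, fp, fn


results = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    cm = confusion_matrix(y_test, y_pred)  # rows = true, columns = predicted
    counts = per_class_counts(cm)
    acc = accuracy_score(y_test, y_pred)
    results[name] = (counts, acc)
    tp, tn, fp, fn = counts
    print(f"{name}: TP={tp} TN={tn} FP={fp} FN={fn} accuracy={acc:.3f}")
```

Each printed array has one entry per species, in the order of the class labels, so the comparison table can be filled in per class or summed across classes.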
| | TP | TN | FP | FN | Accuracy |
|---|---|---|---|---|---|
| Naïve Bayes | | | | | |
| Logistic Regression | | | | | |
| KNN | | | | | |

Hint
Naïve Bayes: https://xavierbourretsicotte.github.io/Naive_Bayes_Classifier.html
Logistic Regression: https://scikit-learn.org/stable/auto_examples/linear_model/plot_iris_logistic.html and https://www.datacamp.com/community/tutorials/understanding-logistic-regression-python
KNN Algorithm: https://www.ritchieng.com/machine-learning-k-nearest-neighbors-knn/
Q4 Titanic Data Set and Classification (titanic.zip, already separated as test, train)
- Perform Exploratory Data Analysis
- Do Feature Engineering
- Apply logistic regression
- Apply KNN algorithm
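A minimal sketch of the Q4 steps, under stated assumptions: the rows are randomly generated to mimic the Kaggle Titanic column names (they are not the real passengers), the survival rule is made up, and `engineer`, `IsFemale`, and `FamilySize` are hypothetical names chosen for illustration. Features are standardized before KNN because KNN is distance based.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical rows mimicking the Kaggle Titanic columns (NOT the real data).
rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "Pclass": rng.integers(1, 4, size=n),
    "Sex": rng.choice(["male", "female"], size=n),
    "Age": rng.uniform(1, 70, size=n),
    "SibSp": rng.integers(0, 4, size=n),
    "Parch": rng.integers(0, 3, size=n),
    "Fare": rng.uniform(5, 100, size=n),
    "Embarked": rng.choice(["S", "C", "Q"], size=n),
})
df.loc[rng.random(n) < 0.2, "Age"] = np.nan  # simulate missing ages
# Made-up survival rule: women and upper classes survive more often.
p = 0.15 + 0.55 * (df["Sex"] == "female") + 0.1 * (3 - df["Pclass"])
df["Survived"] = (rng.random(n) < p).astype(int)


def engineer(frame, age_median):
    """Build a numeric feature table from the raw columns."""
    out = pd.DataFrame(index=frame.index)
    out["Pclass"] = frame["Pclass"]
    out["IsFemale"] = (frame["Sex"] == "female").astype(int)
    out["Age"] = frame["Age"].fillna(age_median)
    out["FamilySize"] = frame["SibSp"] + frame["Parch"] + 1
    out["Fare"] = frame["Fare"]
    for port in ["C", "Q", "S"]:  # explicit one-hot encoding
        out[f"Embarked_{port}"] = (frame["Embarked"] == port).astype(int)
    return out


train, test = train_test_split(
    df, test_size=0.25, random_state=0, stratify=df["Survived"]
)
age_median = train["Age"].median()  # learned from training rows only
X_train, X_test = engineer(train, age_median), engineer(test, age_median)
y_train, y_test = train["Survived"], test["Survived"]

models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
}
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
print(scores)
```

With the real files, `train` and `test` would come from the two CSVs in titanic.zip rather than from a random split.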
Hint: https://www.kaggle.com/angps95/basic-classification-methods-for-titanic
Q5. How do k-fold cross-validation and grid search work on the Social Ads Network data?
Use the references to explain how the two work together to evaluate a model.
https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
https://sebastianraschka.com/faq/docs/evaluate-a-model.html
titanic.zip
Q5 k_fold and grid search.zip
ISLR4_1to4_3.py
Iris.csv
Default.xlsx
Caravan.csv
HW04.docx
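For Q5, the sketch below shows one way grid search and k-fold cross-validation fit together: every candidate hyperparameter value is scored by its mean accuracy over the k folds of the training set, the best candidate is refit on the full training set, and a held-out test set gives the final estimate. The data come from `make_classification` as a stand-in for the Social Ads Network data, and the KNN grid and 5 folds are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic two-feature stand-in (assumption: NOT the assignment's data set).
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling lives inside the pipeline so it is refit on each training fold,
# which keeps information from the validation fold out of the scaler.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
param_grid = {"knn__n_neighbors": [1, 3, 5, 7, 9, 11]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Grid search: each candidate k is scored by mean accuracy over the 5 folds.
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

print("best k:", search.best_params_["knn__n_neighbors"])
print("mean CV accuracy of best k:", round(search.best_score_, 3))
# The test set was never seen during the search, so this is the final check.
print("held-out test accuracy:", round(search.score(X_test, y_test), 3))
```

The cross-validated score is used only to choose among candidates; the held-out score is the one to report as the model's expected performance.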