Write a separate response to each of the questions below.
1. Discuss the importance of preprocessing datasets to ensure better data quality for data mining techniques. Give an example from your own personal experience.
2. Discuss the advantages and disadvantages of using sampling to reduce the number of data objects that need to be displayed. Would simple random sampling (without replacement) be a good approach to sampling? Why or why not?
3. Discuss the major issues in classification model overfitting. Give some examples to illustrate your points.
4. Compare different ensemble methods with appropriate examples.
5. Discuss the strengths and weaknesses of using the K-Means clustering algorithm to cluster multiclass datasets. How does it compare with a hierarchical clustering technique?
6. Compare and contrast the different techniques for anomaly detection. Discuss approaches for combining multiple anomaly detection techniques to improve the identification of anomalous objects.
Note: No plagiarism.