Machine Learning MCQ Questions And Answers

11. How do you handle missing or corrupted data in a dataset?

Drop missing rows or columns
Replace missing values with mean/median/mode
Assign a unique category to missing values
All of the above

Answer : D
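Explanation: All of the above are valid ways of handling missing or corrupted data. Dropping rows or columns removes the affected entries, while replacing values with the mean, median, or mode and assigning a dedicated category are forms of imputation.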
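The three strategies listed in the options can be sketched in plain Python. The column values below are made up for illustration, and `None` stands in for a missing entry; this is one possible way to express each strategy, not a prescribed recipe.

```python
from statistics import mean, median, mode

# Hypothetical toy columns; None marks a missing value.
ages = [25, None, 31, 40, None, 31]
cities = ["Pune", None, "Delhi", "Pune", "Delhi", None]

# Strategy 1: drop every row in which any field is missing.
rows = list(zip(ages, cities))
complete_rows = [row for row in rows if None not in row]

# Strategy 2: replace missing numeric values with a summary statistic
# computed from the observed values only.
observed = [a for a in ages if a is not None]
ages_mean = [a if a is not None else mean(observed) for a in ages]
ages_median = [a if a is not None else median(observed) for a in ages]
ages_mode = [a if a is not None else mode(observed) for a in ages]

# Strategy 3: give missing categorical values their own category.
cities_filled = [c if c is not None else "Unknown" for c in cities]
```

Which strategy fits best depends on how much data is missing and why; dropping rows is simplest but discards information.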

12. The most widely used metrics and tools to assess a classification model are:

Confusion matrix
Cost-sensitive accuracy
Area under the ROC curve
All of the above

Answer : D
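Explanation: The confusion matrix, cost-sensitive accuracy, and the area under the ROC curve are all commonly used to evaluate classification models.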
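As a rough illustration of two of these tools, the sketch below computes a binary confusion matrix and the area under the ROC curve by hand. The labels, scores, and the 0.5 threshold are invented for the example, and the AUC is computed through its rank interpretation (the chance that a random positive outscores a random negative) rather than by tracing the curve.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and classifier scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, scores)
```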

13. Which of the following categories is not part of a model of language?

Language units
Structural units
Role structure of units
System constraints

Answer : B
Explanation: The categories of a model of language do not include structural units.

14. Suppose we would like to perform clustering on spatial data such as the geometrical locations of houses. We wish to produce clusters of many different sizes and shapes. Which of the following methods is the most appropriate?

Decision Trees
Model-based clustering
K-means clustering
Density-based clustering

Answer : D
Explanation: Density-based clustering methods identify clusters as regions in which the data points are densely packed. They connect regions of sufficiently high density into clusters, so they can recover clusters of arbitrary shape and size.
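One well-known density-based method is DBSCAN. The sketch below is a minimal, unoptimized version written for illustration; the coordinates, `eps`, and `min_pts` values are assumptions chosen so that an elongated row of houses and a compact block end up as separate clusters.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical house locations: a row, a compact block, and one isolated house.
houses = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0)]
labels = dbscan(houses, eps=1.5, min_pts=3)
```

With these settings the row and the block form two clusters of different shapes, and the isolated house is labeled as noise instead of being forced into a cluster.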

15. Which of the following is a disadvantage of decision trees?

Factor analysis
Decision trees are robust to outliers
Decision trees are prone to overfitting
None of the above

Answer : C
Explanation: If a decision tree is allowed to keep splitting to a very fine granularity, it can learn every training point, up to perfect classification of the training set. That is overfitting.

16. Which of the following is true about Naive Bayes?

Assumes that all the features in a dataset are equally important
Assumes that all the features in a dataset are independent
Both A and B
None of the above options

Answer : C
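Explanation: Naive Bayes assumes that the features are independent of one another given the class and that each feature contributes equally to the outcome.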
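The independence assumption shows up directly in how a prediction is scored: the class prior is multiplied by one likelihood per feature, with no feature weighted more than another. The sketch below is a bare-bones categorical Naive Bayes without smoothing; the toy weather data and the function names `train` and `predict` are made up for illustration.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count classes and, per (feature index, class), the feature values."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Return the class with the highest prior * product of likelihoods."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(c)
        for i, value in enumerate(row):
            # Independence assumption: multiply per-feature P(x_i | c).
            score *= feature_counts[(i, label)][value] / count
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool")]
labels = ["no", "no", "yes", "yes", "yes"]

model = train(rows, labels)
rainy_mild = predict(("rainy", "mild"), *model)
sunny_hot = predict(("sunny", "hot"), *model)
```

Because no smoothing is applied, a feature value never seen with a class drives that class's score to zero; practical implementations usually add Laplace smoothing to avoid this.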

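17. Which of the following is not a Horn clause?

p → ¬q
p
p → q
¬p ∨ q

Answer : A
Explanation: p → ¬q is equivalent to ¬p ∨ ¬q, which has no positive literal, so it is not a definite Horn clause (exactly one positive literal). The other three options each have exactly one positive literal. Note that under the broader definition, which allows at most one positive literal, ¬p ∨ ¬q would count as a goal clause.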
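The check can be made mechanical by writing each option as a disjunction of literals and counting the positive ones. In the sketch below, a leading `~` marks negation and each implication has been rewritten by hand (a → b becomes ~a ∨ b); the encoding and function names are assumptions made for this illustration. The question appears to use the stricter "exactly one positive literal" reading, which is why both counts are shown.

```python
def positive_literals(clause):
    """Return the literals in the clause that are not negated."""
    return [lit for lit in clause if not lit.startswith("~")]


def is_definite_clause(clause):
    """Definite clause: exactly one positive literal."""
    return len(positive_literals(clause)) == 1


def is_horn_clause(clause):
    """Horn clause in the broad sense: at most one positive literal."""
    return len(positive_literals(clause)) <= 1


# The four options, rewritten as disjunctions of literals.
options = {
    "A": ["~p", "~q"],  # p -> not q
    "B": ["p"],         # p
    "C": ["~p", "q"],   # p -> q
    "D": ["~p", "q"],   # not p or q
}

definite = {name: is_definite_clause(c) for name, c in options.items()}
horn = {name: is_horn_clause(c) for name, c in options.items()}
```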

18. Which of the following techniques cannot be used for normalization in text mining?

Stop Word Removal
Stemming
Lemmatization
None of the above

Answer : A
Explanation: Stemming and lemmatization are keyword normalization techniques; stop word removal is not, because it filters words out instead of reducing them to a common form.

19. Which of the following is a reasonable way to select the number of principal components “k”?

Choose k to be the smallest value so that at least 99% of the variance is retained
Use the elbow method
Choose k to be 99% of m (k = 0.99*m, rounded to the nearest integer)
Choose k to be the largest value so that 99% of the variance is retained

Answer : A
Explanation: Choose k to be the smallest value so that at least 99% of the variance is retained. This preserves the structure of the data while reducing its dimension.
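Given the variance explained by each principal component (the eigenvalues of the covariance matrix), the rule amounts to accumulating the largest values until the retained fraction reaches the threshold. The eigenvalues below are invented for illustration, and `smallest_k` is a hypothetical helper name.

```python
def smallest_k(eigenvalues, retained=0.99):
    """Smallest k whose top-k eigenvalues hold at least `retained` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retained:
            return k
    return len(ordered)


# Hypothetical eigenvalues of a 6-dimensional covariance matrix.
eigenvalues = [6.0, 2.5, 1.0, 0.45, 0.04, 0.01]

k_99 = smallest_k(eigenvalues)        # threshold of 99%
k_90 = smallest_k(eigenvalues, 0.90)  # a looser threshold of 90%
```

Here the first three components retain 95% of the variance and the first four retain 99.5%, so the 99% rule selects four components.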

20. In which of the following cases will K-means clustering fail to give good results?

1. Data points with outliers
2. Data points with different densities
3. Data points with nonconvex shapes

1 & 2
1, 2, & 3
2 & 3
1 & 3

Answer : B
Explanation: The K-means clustering algorithm fails to give good results when the data contains outliers, when the density of the data points varies across the data space, and when the data points form nonconvex shapes.
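The outlier case is easy to reproduce with a small one-dimensional example. The sketch below is a minimal K-means with fixed starting centroids so the result is deterministic; the data values, the starting centroids, and the name `kmeans_1d` are assumptions made for this illustration.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Plain K-means on 1-D data; returns the final centroids and groups."""
    groups = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids, groups


clean = [1, 2, 3, 11, 12, 13]
with_outlier = clean + [100]

clean_centroids, clean_groups = kmeans_1d(clean, [0.0, 10.0])
noisy_centroids, noisy_groups = kmeans_1d(with_outlier, [0.0, 10.0])
```

On the clean data the two centroids settle on the two natural groups. Once the single outlier is added, it captures a centroid for itself and the two real groups are merged into one cluster.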
