Reference no: EM132747218
1. Naïve Bayesian Classifiers are among the most successful known algorithms for learning to classify text documents, and they are also used to detect fraud.
True
False
2. Which statement most accurately describes what "Leaf Nodes" are in a decision tree?
Decisions that are often used as components in ensemble techniques, in which multiple predictive models all vote.
Depending on the nature of the variable, you may need to include an "equal to" component on one branch.
Decisions at the end of the last branch of the tree. These represent the outcome of all the prior decisions. They are the class labels, or the segment in which all observations that follow the path to the leaf would be placed.
None of the above.
3. What is a confusion matrix, and what are the values, rates, and metrics associated with it?
4. Decision Trees are a flexible method very commonly deployed in data mining applications. There are two types of decision trees. What are the two types, and how is each described?
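For orientation, the following is a minimal Python sketch of how a two-class confusion matrix can be tallied and a few common rates derived from it. The function names, the sample labels, and the choice of rates shown are illustrative assumptions, not a prescribed answer format.

```python
def confusion_counts(actual, predicted, positive=1):
    """Tally the four cells of a two-class confusion matrix."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted positive, actually positive
        elif p == positive:
            fp += 1          # predicted positive, actually negative
        elif a == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Derive a few common rates from the four cells."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical labels, used only to exercise the functions.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
print(rates(*confusion_counts(actual, predicted)))
```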
5. Decision Trees take only categorical variables. They cannot handle many distinct values such as the zip code in the data and are limited to only one attribute.
True
False
6. Time Series Analysis is the analysis of sequential data across equally spaced units of time. Time Series is a basic research methodology in which data for one or more variables are collected for many observations at different time periods. What are the two main objectives in Time Series Analysis?
7. In the Box-Jenkins model, Autoregressive (AR) models can be coupled with moving average (MA) models to form:
The input for the model, which is a trend- and seasonality-adjusted time series, and the output, which provides an expected future value of the time series.
The next stage to determine the p and q in the ARIMA (p, d, q) model.
The assurance level of obtaining the highest forecasting accuracy possible in terms of the variables on which the forecast is based.
A general and useful class of time series models called Autoregressive Moving Average (ARMA) models.
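As an illustration of how an autoregressive term and a moving average term can appear in one model, here is a small Python sketch of an ARMA(1,1) recursion, x_t = phi * x_(t-1) + e_t + theta * e_(t-1). The function name, the parameter values, and the zero starting values are assumptions made for the example.

```python
import random


def simulate_arma11(phi, theta, noise):
    """Generate x_t = phi * x_(t-1) + e_t + theta * e_(t-1) for a given noise sequence.

    The series and the noise are assumed to be zero before the first step.
    """
    series = []
    prev_x = 0.0
    prev_e = 0.0
    for e in noise:
        x = phi * prev_x + e + theta * prev_e  # AR part + current shock + MA part
        series.append(x)
        prev_x, prev_e = x, e
    return series


# Hypothetical parameters and Gaussian shocks, seeded for repeatability.
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(10)]
sample = simulate_arma11(phi=0.6, theta=0.3, noise=shocks)
print(len(sample))
```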
8. The Information Gain is defined as the difference between the base entropy and the conditional entropy of the attribute.
True
False
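The quantities named in question 8 can be computed as in the following Python sketch, which takes the base entropy of the class labels and subtracts the entropy that remains after grouping by an attribute. The function names and the toy data are assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(attribute, labels):
    """Entropy of the labels within each attribute value, weighted by group size."""
    n = len(labels)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(attribute, labels):
    """Base entropy minus conditional entropy for the given attribute."""
    return entropy(labels) - conditional_entropy(attribute, labels)


# Hypothetical data: one attribute separates the classes, the other does not.
labels = ["yes", "yes", "no", "no"]
separating = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
print(information_gain(separating, labels), information_gain(uninformative, labels))
```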
9. "Pure enough" usually means that other information can be gained by splitting on other attributes.
True
False
10. The key application of Time Series Analysis is forecasting. In which 6 fields in society do Time Series data provide useful information about the physical, biological, social, or economic systems generating the time series?