Which technique do we use to reduce the number of variables

Reference no: EM132240296

Question 1 : According to the instruction in class, which of the following is true:

o The correlation coefficient between two variables DOES vary greatly depending on their underlying UNITS of measurement (ex: using a gram vs a kilogram scale changes the correlation between weight and height, etc.)
o When determining the correlation coefficient r as part of a regression analysis it DOES matter which variable is considered to be the dependent variable and which the independent
o both of the above are true
o None of the above are true
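
The two claims in Question 1 can be checked numerically. The following is a minimal sketch using made-up weight and height figures (the data and variable names are assumptions, not from the class): it computes Pearson's r in kilograms, in grams, and with the two variables swapped.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: weights in kilograms, heights in centimetres.
weight_kg = [55.0, 62.0, 70.0, 81.0, 90.0]
height_cm = [160.0, 165.0, 172.0, 180.0, 178.0]
weight_g = [w * 1000 for w in weight_kg]  # same weights, different unit

r_kg = pearson_r(weight_kg, height_cm)       # weight measured in kg
r_g = pearson_r(weight_g, height_cm)         # weight measured in g
r_swapped = pearson_r(height_cm, weight_kg)  # variables in the other order
print(r_kg, r_g, r_swapped)
```

Rescaling a variable multiplies the numerator and the denominator by the same factor, and the formula treats both variables alike, so the three printed values agree up to rounding.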

Question 2 : The r (correlation) values range from -----.
o -1 to 1
o 0 to 1
o 0 to 100
o 0 to infinity

Question 3 : The r-square values range from -----.
o -1 to 1
o 0 to 1
o 0 to 100
o 0 to infinity

Question 4 : Assume you ran a bivariate regression model (remember bivariate means there is only one independent variable) in which # of cans of tuna fish consumed per week was the independent variable (IV) and level of hair silkiness was the dependent variable (DV); the hypothesis being eating more tuna is associated with more silky hair (because, logically, sea lions eat more fish and they're more silky feeling so why not humans, too).

The model resulted in a correlation r =.6, beta coefficient = 2.14, and a P-value =.004.

If you switched the independent and dependent variables and ran a new bivariate regression analysis (so cans of tuna = DV and hair silkiness = IV), which of the following would substantially change?
o The correlation "r" value
o The regression coefficient
o The P-value
o All of the above would substantially change
o None of the above would substantially change
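
What happens when the two variables trade places can be explored with a small calculation. This is a sketch under stated assumptions: the tuna and silkiness numbers are invented, and `fit_stats` is a hypothetical helper that applies the standard least-squares formulas for a bivariate regression.

```python
import math

def fit_stats(xs, ys):
    """Return (slope, r, t) for the least-squares line predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    # t statistic for the slope in a bivariate regression. It depends only
    # on r and n, so the p-value derived from it does too.
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return slope, r, t

# Invented data for illustration only.
tuna_cans = [0, 1, 2, 3, 4, 5, 6, 7]
silkiness = [2.0, 3.5, 3.0, 5.5, 4.0, 7.0, 6.5, 6.0]

slope_a, r_a, t_a = fit_stats(tuna_cans, silkiness)  # silkiness is the DV
slope_b, r_b, t_b = fit_stats(silkiness, tuna_cans)  # tuna cans is the DV
print(slope_a, slope_b)  # the two slopes differ
print(r_a, r_b)          # r is the same in both directions
print(t_a, t_b)          # so is the t statistic
```

The two slopes are different numbers whose product equals r squared, while r and the t statistic are identical in both directions.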

Question 5 : A correlation of r = 0 indicates that
o X and Y don't have a linear relationship
o X and Y are unrelated
o X and Y have a linear relationship
o None of the above
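
A quick way to think about r = 0 is to build data where Y is completely determined by X but not along a straight line. The sketch below uses an assumed example, y = x squared on values of x that are symmetric around zero, and computes the numerator of Pearson's r.

```python
def cross_product_sum(xs, ys):
    """Sum of cross-products of deviations (the numerator of Pearson's r)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # y is fully determined by x, but not linearly

numerator = cross_product_sum(xs, ys)
print(numerator)  # a zero numerator means r is zero
```

Here the numerator is zero, so r is zero, even though knowing X tells you Y exactly.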

Question 6 : According to the professor, if we find the P-value associated with an unstandardized regression coefficient = .05, then the associated 95% confidence interval:
o is the range of the effect across 95% of the population
o indicates we are 95% confident that the coefficient is accurate
o both of the above are true
o None of the above are true

Question 7 : According to the instruction on statistical significance in class, the confidence level is the probability that a confidence interval will include the population parameter.

o True
o False

Question 8 : Which of the following is true?
o If a difference in measurement is statistically significant, then it is also practically significant
o If a difference in measurement is practically significant, then it is also statistically significant
o both of the above are true
o None of the above are true

Question 9 : According to the discussion on sample size and statistical significance (hint: think of the Excel sample size calculator), when calculating sample size
o You need to provide the estimated margin of error
o You need to provide the Z score (or the corresponding P-value)
o both of the above are true
o None of the above are true

Question 10 : According to the discussion on sample size and statistical significance (hint: think of the Excel sample size calculator), when the level of variance increases, the sample size needed to maintain a particular level of statistical significance
o Increase
o Decrease
o Is not affected (can remain the same)
o Can't be determined
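
For Questions 9 and 10, one common textbook formula for the sample size needed to estimate a mean is n = (z * sigma / E)^2, where z is the Z score, sigma the standard deviation, and E the margin of error. Whether the Excel calculator used in class implements exactly this formula is an assumption; the function name and the numbers below are illustrative only.

```python
import math

def sample_size_for_mean(z, sigma, margin_of_error):
    """Smallest n satisfying n >= (z * sigma / E)^2, rounded up to a whole unit."""
    return math.ceil((z * sigma / margin_of_error) ** 2)

# z = 1.96 corresponds to a 95% confidence level.
n_low_variance = sample_size_for_mean(z=1.96, sigma=10.0, margin_of_error=2.0)
n_high_variance = sample_size_for_mean(z=1.96, sigma=20.0, margin_of_error=2.0)
print(n_low_variance, n_high_variance)  # 97 385
```

Under this formula the inputs are the Z score, the spread of the data, and the margin of error, and the required n grows with the square of the standard deviation.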

Question 11 : As described in class, when it comes to parallel computation of regression analysis on very large datasets (in which the data is spread over many machines, analysis is run on each machine, and then averages are taken across all of the analyses), the resulting averages are good approximations for which of the following computations (assuming no extra data manipulation):
o regression coefficients
o 95% confidence intervals
o both of the above
o none of the above
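
The setup described in Question 11 can be imitated on a toy scale. The sketch below is one possible reading of it, with assumptions stated plainly: the data are synthetic (a line y = 2x + 1 plus a fixed noise pattern), the "machines" are three interleaved slices of the data, and only the slope is compared. It does not attempt to reproduce confidence intervals or P-values.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic data: a true slope of 2 plus a fixed, made-up noise pattern.
xs = list(range(12))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, 0.1, -0.3, 0.4, -0.1, 0.2, -0.3]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]

# Pretend each interleaved slice lives on a different machine.
shards = [(xs[i::3], ys[i::3]) for i in range(3)]
shard_slopes = [slope(sx, sy) for sx, sy in shards]

averaged_slope = sum(shard_slopes) / len(shard_slopes)
pooled_slope = slope(xs, ys)  # the "single supercomputer" answer
print(averaged_slope, pooled_slope)
```

On this toy data the averaged slope lands very close to the pooled one. Each shard sees only a third of the observations, so any per-shard measure of uncertainty is computed from a smaller sample than the pooled fit uses.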

Question 12 : As described in class, when it comes to parallel computation of regression analysis on very large datasets (in which the data is spread over many machines, analysis is run on each machine, and then averages are taken across all of the analyses), the resulting P-value scores (assuming no extra data manipulation) are --------- what they would be if the analysis of the dataset was performed on a single supercomputer
o good approximations of
o larger than
o smaller than

Question 13 : One of the advantages of very large datasets (big data) is that we no longer have to worry about multicollinearity among independent variables in regression analysis
o True
o False

Question 14 : Multicollinearity is problematic in a typical sized dataset (ex: couple hundred observations and several variables) if high correlation (above .85 or .90) exists between:

o Two or more independent variables used in the multivariate regression model
o Any of the independent variables and the dependent variable used in the multivariate regression model
o both of the above
o none of the above

Question 15 : Factor analysis can be used in which of the following?
o To identify underlying dimensions, or factors, that explain the correlations among a set of variables
o To identify a new smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis.
o To identify a smaller set of salient variables from a larger set for use in subsequent analysis.
o All of the above are correct circumstances.

Question 16 : According to the professor, including all of the criterion variables related to the managerial decision under question (aka including every variable we are interested in that is in the database) in cluster formulation will IMPROVE the insight of cluster analysis.
o True
o False

Question 17 : Which of the following is NOT one of the 3 requirements of market segmentation?
o Identifiable
o Reachable
o Sizeable
o Loyal

Question 18 : The three benefits of market segmentation include
o Identifies opportunities for new product development
o Improves the strategic allocation of marketing resources
o both of the above
o none of the above

Question 19 : Consumer Market Major Segment Bases include all of the following EXCEPT:
o Demographic
o Psychographic
o Teleographic
o Behavioral

Question 20 : In the usage based approach to market segmentation, every company has at least _______ segments:
o One
o Two
o Three
o Four
o Five
o Eight

Question 21 : Cluster analysis does not classify variables as dependent or independent
o True
o False

Question 22 : When it comes to linking methods in cluster analysis, which of these statements are true?
o each of the linkage methods can yield different results when used on the same dataset
o each linking method has its specific properties
o both of the above are true
o none of the above are true
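
To make the linkage idea in Question 22 concrete, here is a small sketch of two standard linkage rules: single linkage measures the distance between the closest pair of points in two clusters, and complete linkage the distance between the farthest pair. The three clusters and their coordinates are invented for illustration.

```python
import math
from itertools import product

def single_linkage(cluster_a, cluster_b):
    """Distance between the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p, q in product(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Distance between the farthest pair of points across two clusters."""
    return max(math.dist(p, q) for p, q in product(cluster_a, cluster_b))

# Hypothetical clusters laid out along a line.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(2.5, 0.0), (9.0, 0.0)]
c = [(-3.0, 0.0), (-2.0, 0.0)]

# Which cluster is "nearest" to a depends on the linkage rule.
print(single_linkage(a, b), single_linkage(a, c))      # 1.5 2.0
print(complete_linkage(a, b), complete_linkage(a, c))  # 9.0 4.0
```

With single linkage, cluster a is closer to b; with complete linkage, it is closer to c. The same data therefore lead to a different merge depending on the rule.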

Question 23 : Which statement is NOT true about cluster analysis?
o Cluster analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the independent variables are interval in nature.
o Cluster analysis is also called classification analysis or numerical taxonomy.
o Groups or clusters are suggested by the data, not defined a priori.
o Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.

Question 24 : When it comes to K-Means clustering, which of the following is true?
o we do not need to tell it how many clusters there are to begin with
o we can assess the best configuration by examining the dissimilarity coefficient in the K-means generated agglomeration schedule
o both of the above are true
o none of the above are true

Question 25 : When it comes to K-Means clustering, which of the following is true?
o we can save the cluster membership
o using different combinations of variables can result in different cluster assignments for each observation
o both of the above are true
o none of the above are true
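
The K-Means questions above can be explored with a bare-bones version of the algorithm. This is a minimal sketch, not the SPSS procedure: `k_means` is a hypothetical helper implementing the textbook Lloyd iteration, it seeds the centroids with the first k points (an arbitrary choice made here for reproducibility), and the customer data are invented.

```python
import math

def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # seed with the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids

# Hypothetical customers: (dollars spent, store visits).
customers = [(1.0, 1.0), (9.0, 9.0), (1.0, 8.0), (9.0, 2.0)]

# Cluster on one variable at a time; k has to be passed in explicitly.
labels_by_spend, _ = k_means([(c[0],) for c in customers], k=2)
labels_by_visits, _ = k_means([(c[1],) for c in customers], k=2)
print(labels_by_spend)   # [0, 1, 0, 1]
print(labels_by_visits)  # [0, 1, 1, 0]
```

The returned label list is the cluster membership, which can be stored alongside each observation. In this toy case the third and fourth customers change clusters depending on which variable is used.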

Question 26 : When it comes to very large datasets (i.e., a form of big data) and cluster analysis, it was recommended in class to:
o just use hierarchical cluster analysis
o just use k-means cluster analysis
o use both hierarchical and k-means cluster analysis together
o never use cluster analysis

Question 27 : Clustering can be performed using:
o Observable (directly measured) variables such as dollars spent on different products, etc.
o Unobservable (inferred) variables measured on surveys such as attitudes and moods, etc. (ex: happiness, sadness, etc. on Likert scales)
o both of the above
o none of the above

Question 28 : If you switch the dependent and independent variables in a regression analysis, which of the following will usually change:
o Regression Coefficient Value
o P-value
o both of the above change
o none of the above change

Question 29 : If you switch the dependent and independent variables in a regression analysis, which of the following will usually change:
o Regression Coefficient standard error
o Correlation (r) score
o both of the above change
o none of the above change

Question 30 : The 3 types of linkage in cluster analysis include:
o single linkage
o complete linkage
o both of the above are true
o none of the above are true

Question 31 : If we have several hundred independent variables that might be related, which technique do we use to reduce the number of variables?
o Factor Analysis
o Cluster Analysis
o Compare Means Analysis
o Numerical Taxonomy
o Classification Analysis

Question 32 : When running Factor Analysis, which of the following is recommended by the professor:
o We do not rotate the data
o We select a Promax rotation
o We select a Varimax rotation
o We select a Quadmax rotation
o We select an Expert rotation so SPSS will decide which rotation is best

Question 33 : When running Factor Analysis, we select to suppress absolute values smaller than
o .01
o .05
o .1
o .4
o .95

For questions 34 to 36, assume that you ran a regression to predict which students are most likely to agree to dress up as a circus clown for $5 at a charity baseball game at the local elementary school. The results are:

Dependent Variable: Likelihood of Dressing Up as a Clown (Very Unlikely 1, 2, 3, 4, 5, 6, 7 Very Likely)

[Regression results table: only the column header "Standard Error of Coefficient" and the row labels below survive; the numeric values are missing.]

Blue hair color (vs all others)
Age (in years)
Income (in $1000s)
Rides a Moped to School
Likes Watching Dr. Who


Question 34 : Is someone with red hair more or less likely to dress up as a clown than someone with blue hair?
o More likely
o Less likely
o Equal likelihood
o It is not possible to determine from the provided data

Question 35 : Referring back to the regression coefficients from question 34....

Is the coefficient for "Age" statistically significant at a 95% confidence interval?
o Yes
o No
o It can't be determined from the provided information

Question 36 : Referring back to the regression coefficients from question 34....

Is the coefficient for 'likes watching Dr. Who' statistically significant at a 95% confidence interval?
o Yes
o No
o It can't be determined from the provided information

Question 37 : Referring back to the regression coefficients from question 34....

Who of the following is most likely to dress up as a clown at the charity?
o rides a moped, has red hair, 20 years old
o rides a bicycle, has blue hair, 20 years old
o likes watching Dr. Who, has blue hair, 20 years old

Question 38 : When it comes to the exponential smoothing model in forecasting, which of the following are true:
o The model has received widespread acceptance among American business firms that employ sales forecasts for managerial planning and control
o It uses special weighted moving averages and a seasonal factor that is multiplied by the weighted moving average to calculate the forecast.
o both of the above are true
o none of the above are true
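
As background for the exponential smoothing questions, the simplest form of the technique keeps a single smoothed value that is updated as s_t = alpha * x_t + (1 - alpha) * s_(t-1). The sketch below shows only that basic recursion on invented sales figures; the version discussed in class, which involves a seasonal factor, tracks more than this, and the function name and alpha value here are assumptions.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]  # seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Invented sales history for illustration.
sales = [10.0, 20.0, 30.0]
smoothed_sales = exponential_smoothing(sales, alpha=0.5)
print(smoothed_sales)  # [10.0, 15.0, 22.5]
```

Each smoothed value is a weighted average in which recent observations count more than older ones, and the weights fall off geometrically with age.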

Question 39 : Generally, the exponential smoothing model uses _________ smoothed statistics that are weighted.
o Zero
o One
o Two
o Three
o Four
o Five

Question 40 : One of the most widely used techniques for short-term forecasting is the autoregressive integrated moving average (ARIMA) model associated with G. E. P. Box and G. M. Jenkins
o True
o False

Question 41 : Which of the following is true regarding the ARIMA forecasting technique?
o It is somewhat mathematically tedious and complex
o It relies on using past sales data exclusively
o Both of the above are true
o None of the above are true

Question 42 : In Excel, the forecast sheet button produces a graph and a set of forecast estimates. According to the professor, this technique uses which forecasting approach?
o simple moving average
o exponential smoothing
o ARIMA modeling
o ARMA modeling

Question 43 : In Excel, if you have five columns each with a different year of data (think 2015, 2016, 2017, 2018, 2019), the spreadsheet has to be sorted so the oldest/smallest columns (ex: 2015) are always to the left and the newest/largest values (ex: 2019) are always to the right, or the forecast can't ever calculate the next predicted value correctly.
o True
o False
