Define outlier and the curse of dimensionality

Reference no: EM133058695

1. What's an attribute? What's a data instance?

2. What's noise? How can noise be reduced in a dataset?

3. Define outlier. Describe 2 different approaches to detect outliers in a dataset.

4. Describe 3 different techniques to deal with missing values in a dataset. Explain when each of these techniques would be most appropriate.

5. Given a sample dataset with missing values, apply an appropriate technique to deal with them.

6. Give 2 examples in which aggregation is useful.

7. Given a sample dataset, apply aggregation of data values.

8. What's sampling?

9. What's simple random sampling? Is it possible to sample data instances using a distribution different from the uniform distribution? If so, give an example of a probability distribution of the data instances that is different from uniform (i.e., equal probability).

10. What's stratified sampling?

11. What's "the curse of dimensionality"?

12. Provide a brief description of what Principal Components Analysis (PCA) does. [Hint: See Appendix A and your lecture notes.] State what the input and the output of PCA are.

13. What's the difference between dimensionality reduction and feature selection?

14. Describe in detail 2 different techniques for feature selection.

15. Given a sample dataset (represented by a set of attributes, a correlation matrix, a covariance matrix, ...), apply feature selection techniques to select the best attributes to keep (or equivalently, the best attributes to remove).

16. What's the difference between feature selection and feature extraction?

17. Give two examples of data in which feature extraction would be useful.

18. Given a sample dataset, apply feature extraction.

19. What's data discretization and when is it needed?

20. What's the difference between supervised and unsupervised discretization?

21. Given a sample dataset, apply unsupervised (e.g., equal width, equal frequency) discretization, or supervised discretization (e.g., using entropy).

22. Describe 2 approaches to handle nominal attributes with too many values.

23. Given a dataset, apply variable transformation: Either a simple given function, normalization, or standardization.

24. Define correlation and covariance, and explain how they are used in data pre-processing.
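
As a starting point for the "apply" questions on discretization (21), variable transformation (23), and correlation and covariance (24), the following is a minimal Python sketch using only the standard library. The function names and the sample values are made up for illustration and are not part of the assignment. The sketch assumes numeric attributes with no missing values, uses the population standard deviation for standardization, and uses the sample (n - 1) form for covariance; a course may use different conventions.

```python
import statistics

def equal_width_bins(values, k):
    """Unsupervised discretization: split [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)  # all values identical: a single bin
    # The maximum value would fall just past the last interval, so clamp it to bin k - 1.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Unsupervised discretization: k bins holding roughly the same number of values.
    Assumption: ties are split by position, so equal values may land in different bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins

def min_max_normalize(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization (z-scores): subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [(v - mean) / sd for v in values]

def covariance(xs, ys):
    """Sample covariance of two attributes of equal length."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical sample attribute values, chosen only for illustration.
sample = [4, 8, 15, 16, 23, 42]
print(equal_width_bins(sample, 3))      # [0, 0, 0, 0, 1, 2]
print(equal_frequency_bins(sample, 3))  # [0, 0, 1, 1, 2, 2]
print(min_max_normalize(sample))
print(standardize(sample))
print(correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Note how the single large value 42 stretches the equal-width intervals so that most values fall into the first bin, while equal-frequency binning keeps the bins balanced. For feature selection (question 15), a pair of attributes whose correlation is close to 1 or -1 carries largely redundant information, so one of the two is a candidate for removal.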






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd