Explore the data by searching for anticipated relationships

Reference no: EM131906185

SVM Model Development Using SAS Enterprise Miner

1. Select from the data sets available (or ones designated by your instructor or other available sources). Provide a thorough description of the data set to include the number of cases, description of the inputs, target variable, description of the variables that could be used to develop predictive models, etc. (NOTE: predictive models are developed better with larger data sets that have many cases and possible inputs from which to select. Part of your grade for this assignment will be based on the robustness of the data set used.)

2. Explore the data by searching for anticipated relationships, unanticipated trends and anomalies - to gain deeper understanding and ideas. Use the SEMMA explore option to examine the data set you have created and look for interesting anomalies or relationships.

3. Cleanse and modify the data by removing errors, imputing missing values (as appropriate), transforming the variable distributions as necessary, and creating and selecting appropriate variables. Use the appropriate SEMMA options to cleanse the dataset as necessary. Investigate and discuss any "feature engineering" done for the data set.

4. Develop predictive models using the appropriate predictive modeling technique. Develop complete prediction models. There should be at least two models developed, compared and explained. The imbalanced target variable must be addressed and accounted for using one or more of the methods outlined in earlier lessons.

5. Using appropriate accuracy measures, assess the resultant models. Provide a complete assessment of the different models created using the SAS Enterprise Miner assessment options. Explain clearly any insights or conclusions from the accuracy measures.

6. Conclusions and takeaways. Provide clear and concise conclusions about the project to include lessons learned and any suggested improvements for future development. Suggest future enhancements for the analysis.
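
Because the target variable is required to be heavily skewed, overall accuracy alone can be misleading when assessing the models in step 5. The following is a minimal sketch in plain Python (not SAS Enterprise Miner output) of how common accuracy measures are derived from a confusion matrix. The function names, the 80/20 class split, and the "always predict the majority class" baseline are illustrative assumptions, not part of the assignment.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def assess(actual, predicted, positive=1):
    """Return accuracy, precision, recall, specificity, and F1 as a dict."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical data: 80% of cases belong to the majority class (0).
actual = [0] * 80 + [1] * 20

# A trivial "model" that always predicts the majority class.
always_majority = [0] * 100
baseline = assess(actual, always_majority)

for name, value in baseline.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical case the baseline scores high on accuracy while its recall for the rare class is zero, which is why measures such as recall, precision, and F1 are worth reporting alongside accuracy when comparing models on an imbalanced target.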

Note: Submitted report must be either in MS Word or PDF format and titled: "Assignment2_LastName". Only one document will be allowed to be submitted.

Content (note that the document must have clearly marked sections for at least the items listed below)

1) Title page (1 page limit): course number and term, assignment number and project title, student name and contact information, instructor's name. Format it so it looks pleasant and presentable. Follow formatting guidelines above.

2) Introduction. Provide a brief outline of the dataset(s) you are using for this assignment. You may use the same classification data set you used for assignment 1, if it meets the following criteria: 1) sufficient number of cases (2000 or more); 2) reasonable number of possible input features (12-15 or more); 3) binary target variable; 4) heavily skewed target variable (at least 75% one outcome). NOTE: Dealing with a variety of different data sets provides you with more experience in cleaning and preparing data sets for model development. SVM models are best used with data sets with binary targets.

Briefly explain the content of the data to include a description of the variables in the data sets, the number of cases, etc. Include a screenshot of the data (not all cases need be shown, but be sure all relevant variables are visible). Provide a clear description of the purpose of the model being developed.

3) Data cleansing and/or preparation. Explain what was done and why it was necessary.

4) Predictive models developed. Clearly present, compare, and explain the models.

5) Results. Include appropriate results for the models. Interpret the results for meaning.

6) Conclusions and takeaways. Provide clear and concise conclusions about the project to include lessons learned and any suggested improvements for future development.

7) References (1 page limit): List, in APA format, all references used in preparing this report. It is strongly recommended to use outside knowledge in setting up the analysis or discussing the results where possible.

8) Appendix (6 page limit): Include any appropriate workbooks, screenshots (figures, tables, diagrams) used in this assignment. Make sure all tables or figures or diagrams are easily readable and visually presentable.
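
The data set criteria listed under the Introduction item (2000 or more cases, 12-15 or more input features, a binary target, and at least 75% of cases in one outcome) can be checked before any modeling begins. The sketch below is one possible way to do that in plain Python, assuming the data has been loaded as a list of dictionaries with one dictionary per case. The function name `check_dataset`, the column names, and the generated sample rows are hypothetical, and the minimum of 12 inputs is taken from the low end of the stated range.

```python
from collections import Counter


def check_dataset(rows, target, min_cases=2000, min_inputs=12,
                  min_majority_share=0.75):
    """Check a data set against the assignment's stated criteria.

    rows   -- list of dicts, one per case (assumed layout)
    target -- name of the target column
    """
    n_cases = len(rows)
    # Every column except the target is counted as a possible input.
    n_inputs = len(rows[0]) - 1 if rows else 0
    counts = Counter(row[target] for row in rows)
    majority_share = max(counts.values()) / n_cases if n_cases else 0.0
    return {
        "enough_cases": n_cases >= min_cases,
        "enough_inputs": n_inputs >= min_inputs,
        "binary_target": len(counts) == 2,
        "skewed_target": majority_share >= min_majority_share,
    }


# Hypothetical data: 2000 cases, 12 inputs, roughly 80% of targets equal to 0.
rows = [
    {**{f"x{j}": i * j for j in range(12)}, "target": 0 if i % 5 else 1}
    for i in range(2000)
]

report = check_dataset(rows, "target")
for criterion, passed in report.items():
    print(f"{criterion}: {'ok' if passed else 'NOT met'}")
```

A data set that fails any of these checks would need to be replaced or supplemented before it is used for this assignment.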



Reviews

len1906185

3/19/2018 2:46:59 AM

Submission - Each student will submit a single document conforming to the guidelines and standards outlined above. Document format: limited to 7 pages (excluding title page, references, and appendix); double-spaced, 12-point Times New Roman font, 1" margins, bottom-right page numbering. Include only important figures and tables in the main paper; supporting material can be included in the appendix. Part of the assignment is to become adept at writing a report succinctly while still including all of the key elements of the analysis.

Assignments that 1) adequately address all required tasks, 2) are submitted on time, and 3) are properly formatted (APA format for references, no typos or misspelled words, no grammar errors, cover page, etc.) will receive a grade of B (80-85, depending on content).

To increase (but not guarantee) your chances of receiving a higher grade, you need to show clear evidence of critical thinking. Critical thinking can take many forms, depending on the type of assignment. In some instances, showing greater depth (e.g., creating more models, examining more than one insightful fact or relationship, and comparing them on key criteria) is one way to provide that evidence. In other cases, it might mean explaining the pros and cons of the approach used, or the arguments for and against the proposal, along with criteria for choosing among the alternatives. Still another example would be providing significant insights into how the assignment outcome would benefit (or meet resistance in) your organization and what steps might be taken to facilitate acceptance. This is not a complete list, but it gives some examples of what critical thinking can look like.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd