Develop predictive models that determine forest type from cartographic variables

Reference no: EM132249272

Assignment

Overview and Assignment Goals: The objectives of this assignment are the following:

  • Use/implement feature selection/reduction technique(s).
  • Experiment with various classification models: Decision Tree, Naïve Bayes and Neural Network are the minimum requirements.
  • Consider how to deal with imbalanced data.
  • Use the F1 score as the evaluation metric.
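As one illustration of the feature reduction objective, the sketch below filters out low-variance columns, which can be useful when many attributes are rarely set binary indicators. The assignment does not prescribe a technique; the function names and the threshold values are hypothetical choices for illustration only.

```python
from statistics import pvariance


def low_variance_columns(rows, threshold=0.01):
    """Return indices of columns whose population variance is <= threshold."""
    columns = list(zip(*rows))  # transpose: rows -> columns
    return [i for i, col in enumerate(columns) if pvariance(col) <= threshold]


def drop_columns(rows, to_drop):
    """Return a copy of rows without the columns listed in to_drop."""
    drop = set(to_drop)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


# Toy data (not from the assignment): column 1 is constant,
# column 2 is a rarely set binary indicator.
rows = [
    [2596.0, 0, 0],
    [2804.0, 0, 0],
    [2785.0, 0, 1],
    [2579.0, 0, 0],
]
dropped = low_variance_columns(rows, threshold=0.0)
reduced = drop_columns(rows, dropped)
print(dropped)     # [1]
print(reduced[0])  # [2596.0, 0]
```

A variance filter ignores the class labels, so it may discard a rare indicator that is highly predictive of a minority class; comparing validation F1 with and without the filter is one way to check.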

Detailed Description: The goal of this competition is to develop predictive models that determine the correct forest type from 54 cartographic variables. There are 7 forest types (1-7) in the dataset. Each observation (record) represents a 30x30 meter region. Your task is to develop the best classification model you can for predicting the forest type of an observation.

Since the dataset is imbalanced, the scoring function will be the F1-score instead of Accuracy.
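To see why accuracy can mislead on imbalanced data, the following sketch computes accuracy and a macro-averaged F1 score for a classifier that always predicts the majority class. The assignment does not state which F1 averaging the scoring site uses, so macro averaging is an assumption here, and the labels are toy values.

```python
def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (an assumed averaging)."""
    per_class = f1_per_class(y_true, y_pred, labels)
    return sum(per_class.values()) / len(labels)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy example: 8 records of class 1, 2 of class 2; the model always says 1.
y_true = [1] * 8 + [2] * 2
y_pred = [1] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, labels=[1, 2])
print(acc)           # 0.8
print(round(f1, 3))  # 0.444
```

The model never identifies class 2, yet its accuracy is 0.8; the macro F1 of about 0.444 exposes the failure because each class contributes equally to the average.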

Caveats:

  • The dataset has an imbalanced class distribution. No information is provided for the test set regarding the distribution.
  • Use the data mining knowledge you have acquired so far wisely to optimize your results.
  • Try at least the following three classification methods: Decision Tree, Naïve Bayes, Neural Network.
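One common way to handle the imbalanced class distribution mentioned above is to weight each class inversely to its frequency during training. The sketch below computes such weights with the formula n / (k * count), where n is the number of records and k the number of classes; the function name and the toy label counts are hypothetical, and resampling is an equally valid alternative.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n / (k * count) so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in sorted(counts)}


# Toy label distribution (not the real training set): class 7 is rare.
labels = [1] * 6 + [2] * 3 + [7] * 1
weights = balanced_class_weights(labels)
print({c: round(w, 3) for c, w in weights.items()})
# {1: 0.556, 2: 1.111, 7: 3.333}
```

With these weights every class contributes the same total weight to a weighted loss, which keeps the majority class from dominating training.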

Data Description: The dataset is split into training and test sets; both files are in CSV format. The training dataset consists of 14,528 records and the test dataset consists of 116,205 records. We provide the class labels in the training set; the test labels are held out. The training set has 55 attributes; the test set has 54 because the class labels are withheld. Attributes 1-54 are cartographic variables, some of which are binary variables indicating the absence or presence of something, such as a particular soil type. Specifically, attributes #1, 8, 9, 20, 22, 31, 42, 47, 50, 54 are numeric, and the rest are binary. In the training set, the last column contains the class labels.

  • train.csv: Training set with 55 attributes. The last attribute is the class label (1-7).
  • test.csv: Testing set with 54 attributes since the class labels are withheld.
  • format.dat: A sample submission with 116,205 entries of randomly chosen numbers between 1 and 7.
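The file layout described above can be handled with a small loader and submission writer, sketched below. It assumes the CSV files have no header row and are comma-delimited, and that the submission file holds one predicted label per line; none of this is confirmed by the assignment text, so check it against the actual train.csv and format.dat. The function names are hypothetical, and the demo uses in-memory toy rows with 3 attributes instead of 54.

```python
import csv
import io

# 1-based indices of the numeric attributes, as listed in the data description.
NUMERIC_ATTRS = {1, 8, 9, 20, 22, 31, 42, 47, 50, 54}
BINARY_ATTRS = set(range(1, 55)) - NUMERIC_ATTRS


def load_rows(handle, has_label):
    """Parse CSV rows into (features, labels); labels is None for test data."""
    features, labels = [], []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        values = [float(v) for v in record]
        if has_label:
            labels.append(int(values.pop()))  # last column is the class label
        features.append(values)
    return features, (labels if has_label else None)


def write_submission(handle, predictions):
    """Write one integer label per line (assumed submission format)."""
    for p in predictions:
        handle.write(f"{int(p)}\n")


# Toy stand-in for train.csv: 3 attributes plus a label column.
train_text = "2596,0,1,5\n2804,1,0,2\n"
X, y = load_rows(io.StringIO(train_text), has_label=True)

out = io.StringIO()
write_submission(out, [5, 2])
print(X[0], y)  # [2596.0, 0.0, 1.0] [5, 2]
```

For the real files, replace the `io.StringIO` objects with `open("train.csv", newline="")` and an output file opened for writing.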

Rules:

Feel free to use the programming language of your choice for this assignment.

While you can use libraries and templates for this problem, remember that implementation is 50% of the grade. Some programming is still expected even if you choose to use existing packages. You should be able to explain the methods you use, and why you chose them, in sufficient detail.

Implementation will be graded based on the quality of your code, the amount of effort put into classifier/model selection, scalability, etc. You are required to try at least the following three classifiers: (1) Decision Tree, (2) Naïve Bayes, and (3) Neural Network. You can try more classifiers if you want to, but if you use something we have not covered in class, make sure you provide an explanation of the method(s) to demonstrate your understanding. Justify your choice of method via experiments and report the results using tables. Submit your best predictions. Summarize your findings in the report.
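A small harness can make the model comparison systematic by scoring each candidate on the same validation data, timing it, and printing a table for the report. The sketch below is only a skeleton: the two "models" are trivial hypothetical stand-ins (a majority-class baseline and a one-feature threshold rule), not the required Decision Tree, Naïve Bayes, or Neural Network, and macro-averaged F1 is an assumed scoring choice.

```python
import time
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    total = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        total += 2 * tp / denom if denom else 0.0
    return total / len(labels)


def majority_baseline(X_train, y_train, X_valid):
    """Stand-in model: always predict the most frequent training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * len(X_valid)


def threshold_rule(X_train, y_train, X_valid):
    """Stand-in model: a fixed rule on the first feature (ignores training)."""
    return [1 if x[0] > 0.5 else 2 for x in X_valid]


def run_experiments(models, X_train, y_train, X_valid, y_valid):
    """Return (name, macro F1, seconds) per model, best score first."""
    results = []
    for name, fit_predict in models.items():
        start = time.perf_counter()
        y_pred = fit_predict(X_train, y_train, X_valid)
        elapsed = time.perf_counter() - start
        results.append((name, macro_f1(y_valid, y_pred), elapsed))
    return sorted(results, key=lambda r: r[1], reverse=True)


def format_table(results):
    lines = [f"{'Model':<20}{'Macro F1':>10}{'Time (s)':>10}"]
    for name, f1, secs in results:
        lines.append(f"{name:<20}{f1:>10.3f}{secs:>10.3f}")
    return "\n".join(lines)


# Toy data, purely illustrative.
X_train, y_train = [[0.9], [0.8], [0.7], [0.2]], [1, 1, 1, 2]
X_valid, y_valid = [[0.6], [0.1], [0.95]], [1, 2, 1]
models = {"majority baseline": majority_baseline, "threshold rule": threshold_rule}
results = run_experiments(models, X_train, y_train, X_valid, y_valid)
table = format_table(results)
print(table)
```

Each real classifier can be plugged in as another `fit_predict(X_train, y_train, X_valid)` callable, so all candidates are measured under identical conditions.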

Your results should be reproducible. If we cannot reproduce your results, or if the description in your report does not match what your code does, you will receive a penalty on the assignment, and this may result in an honor code violation.
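Fixing random seeds is one practical step toward reproducible results. The sketch below performs a stratified train/validation split with a local seeded generator, so the same seed always yields the same split while preserving class proportions. The function name, the 20% validation fraction, and the seed value are hypothetical choices.

```python
import random
from collections import defaultdict


def stratified_split(labels, valid_fraction=0.2, seed=42):
    """Return (train_idx, valid_idx), sampling each class separately."""
    rng = random.Random(seed)  # local generator; global random state untouched
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, valid_idx = [], []
    for label in sorted(by_class):  # fixed class order keeps runs identical
        indices = by_class[label]
        rng.shuffle(indices)
        # At least one validation sample per class; a class with a single
        # record therefore ends up only in the validation set.
        n_valid = max(1, round(len(indices) * valid_fraction))
        valid_idx.extend(indices[:n_valid])
        train_idx.extend(indices[n_valid:])
    return sorted(train_idx), sorted(valid_idx)


# Toy labels: 10 records of class 1, 5 of class 2.
labels = [1] * 10 + [2] * 5
train_idx, valid_idx = stratified_split(labels, valid_fraction=0.2, seed=42)
print(len(train_idx), len(valid_idx))  # 12 3
```

Any library that uses randomness (for example, neural network weight initialization) has its own seed setting that would also need to be fixed and documented in the report.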

You are allowed 5 submissions per 24-hour cycle.

Attachment: Assignment Files.rar



This is an individual assignment. Discussion of broad-level strategies is allowed, but any copying of prediction files or source code will result in an honor code violation.


Deliverables: Blackboard submission of source code and report.

  • Create a folder called HW2_LastName1_LastName2.
  • Create a subfolder called src and put all the source code there.
  • Create a subfolder called Report and place in it a 2-3 page, single-spaced report describing the steps you followed for feature selection and classifier model development. Also report your experimental results from different classifiers/models, including the running time.
  • Be sure to include the following in the report: the name registered on the miner website; the rank and F1 score for your submission (at the time of writing the report); and your approach, that is, your methodology for choosing the approach and its associated parameters.
  • Archive your parent folder (.zip or .tar.gz) and submit it via Blackboard for HW2.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd