Find out optimal number of clusters

Reference no: EM132358271

Assignment -

New DG Food Agro is a multinational company that has exported various grains from India for nearly 130 years. Since the early 1980s, its main export product has been wheat. It exports wheat to countries such as America, Afghanistan, and Australia.

They have started seeing export sales vary from year to year across countries. They theorize that much of this variation has natural or demographic causes such as floods, country growth, and population explosion. They now need to decide which countries fall in the same range of exports and which do not. They also need to know which countries have low exports that can be improved and which countries have performed very well across the years.

The data currently provided spans 18 years. They need a repeatable solution that is not affected by how much data is added over time, and they should be able to explain the data across years with a smaller number of variables.

Objective: Our objective is to cluster the countries based on the sales data provided to us across years. We have to apply an unsupervised learning technique such as K-means or hierarchical clustering to get the final solution. Before that, we have to bring the exports (in tons) of all countries to the same scale across years. In addition, because the solution needs to be repeatable, we will apply PCA to obtain the principal components that explain the most variance.

Implementation:

1) Read the data file and check for any missing values.

2) Change the headers to country and year accordingly.

3) Cleanse the data if required and remove null or blank values.

4) After the EDA part is done, decide which algorithm should be applied here.

5) Because the solution has to hold across years, we need to apply PCA first.

6) Apply PCA on the dataset and find the number of principal components which explain nearly all the variance.

7) Plot an elbow chart or scree plot to find the optimal number of clusters.

8) Then apply K-means and hierarchical clustering and showcase the results.

9) You can either choose to group the countries based on years of data or using the principal components.

10) Then see which countries are consistent and which are the largest importers of the product, based on the scale and position of each cluster.
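
The steps above can be sketched in Python with NumPy and scikit-learn. This is only a minimal sketch under stated assumptions, not the assignment's solution: the real data file is not reproduced here, so the export table below is synthetic, and the country names, the three export tiers, the 95% variance threshold, and the choice of k = 3 are all hypothetical values picked for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the real file.
# 12 made-up countries x 18 years of wheat exports (tons), in three made-up tiers.
rng = np.random.default_rng(0)
countries = [f"Country_{i}" for i in range(12)]
tier_level = np.repeat([100.0, 500.0, 900.0], 4)
exports = tier_level[:, None] + rng.normal(0.0, 20.0, size=(12, 18))

# Steps 1 and 3: check for missing values and drop incomplete rows.
complete = ~np.isnan(exports).any(axis=1)
exports = exports[complete]
countries = [name for name, keep in zip(countries, complete) if keep]

# Bring every year (column) to the same scale: zero mean, unit variance.
scaled = StandardScaler().fit_transform(exports)

# Steps 5-6: keep the fewest components explaining at least 95% of the variance
# (the 95% threshold is an illustrative choice).
pca = PCA(n_components=0.95)
components = pca.fit_transform(scaled)
explained = pca.explained_variance_ratio_.cumsum()

# Step 7: elbow data, i.e. the within-cluster sum of squares for each k.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(components).inertia_
    for k in range(1, 7)
}

# Step 8: cluster with both methods. k = 3 is used only because the
# synthetic data was built with three tiers; read it off the elbow in practice.
k = 3
kmeans_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(components)
hier_labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(components)

# Step 10: rank clusters by mean export volume in the original units (tons).
cluster_means = {
    int(c): float(exports[kmeans_labels == c].mean()) for c in np.unique(kmeans_labels)
}

print("components kept:", components.shape[1])
print("cumulative explained variance:", explained.round(3))
print("inertia by k:", {k: round(v, 2) for k, v in inertias.items()})
for name, km, hc in zip(countries, kmeans_labels, hier_labels):
    print(name, "kmeans:", km, "hierarchical:", hc)
print("mean exports per K-means cluster:", cluster_means)
```

Plotting `inertias` against k gives the elbow chart from step 7, and plotting `pca.explained_variance_ratio_` gives the scree plot for the components. Note that the cluster numbers produced by K-means and by hierarchical clustering are arbitrary labels, so the two label columns should be compared by group membership, not by number.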

Attachment:- Assignment Files.rar



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd