ICT110 Introduction to Data Science Assignment

Length: 1200 words

ICT110 Introduction to Data Science - University of the Sunshine Coast

Assignment Task

You work at Nintendo as a data scientist. The marketing team have approached you because they want to develop a new Pokémon that will be the ultimate Pokémon king, directly below Arceus (the creator of the Pokémon world). The marketing team have no preconceived ideas about the sorts of attributes this new Pokémon should have. They would like to create a Pokémon that could be perceived by other Pokémon as being superior. Nintendo head office have provided you with a dataset and have asked you to provide a report with recommendations about what attributes this new Pokémon should have.

First, the marketing team would like to get a better understanding of what sorts of attributes the current Pokémon have. They have asked you to describe the data and find interesting phenomena.

Second, the marketing team have asked you to explore the data in more detail. They would like you to use your expertise in data science to dig out anything you feel is interesting or significant. They are looking for attributes of strength that could be put together to create the profile of a Pokémon that could be the Pokémon King. Further, they would like you to be able to predict whether or not this Pokémon would win a battle against Dialga (one of Arceus' protectors).

You are required to prepare a report about your findings and to make suggestions about which attributes you would recommend be included in the ultimate Pokémon's profile. You are also required to provide the script of the code you have used to prepare and explore your data. A notepad template is provided for you to complete.

The dataset contains information about the following attributes:

• name: The English name of the Pokemon
• japanese_name: The Original Japanese name of the Pokemon
• pokedex_number: The entry number of the Pokemon in the National Pokedex
• percentage_male: The percentage of the species that are male. Blank if the Pokemon is genderless.
• type1: The Primary Type of the Pokemon
• type2: The Secondary Type of the Pokemon
• classification: The Classification of the Pokemon as described by the Sun and Moon Pokedex
• height_m: Height of the Pokemon in metres
• weight_kg: The Weight of the Pokemon in kilograms
• capture_rate: Capture Rate of the Pokemon
• baseeggsteps: The number of steps required to hatch an egg of the Pokemon
• abilities: A stringified list of abilities that the Pokemon is capable of having
• experience_growth: The Experience Growth of the Pokemon
• base_happiness: Base Happiness of the Pokemon
• against_?: Eighteen features that denote the amount of damage taken against an attack of a particular type
• hp: The Base HP of the Pokemon
• attack: The Base Attack of the Pokemon
• defense: The Base Defense of the Pokemon
• sp_attack: The Base Special Attack of the Pokemon
• sp_defense: The Base Special Defense of the Pokemon
• speed: The Base Speed of the Pokemon
• generation: The numbered generation which the Pokemon was first introduced
• is_legendary: Denotes if the Pokemon is legendary.

To learn more about Pokémon, consult the official Pokédex, where you can search for Pokémon to find pictures and learn more about them. If you aren't familiar with Pokémon, it is worth taking a look.

The potential audiences include other staff within Nintendo, such as executives or sales staff. These staff may have limited ICT or mathematical knowledge.

To prepare the report, please include the following sections:

1. Introduction
Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where does the data come from, what are the dimensions and structure of the data.

2. Data Setup
Describe how to load the data, and how the pre-processing is performed.

The original dataset is not ready for analysis, and it differs from the forms of data we have worked with in previous practicals. This means we need to do some pre-processing, either for the whole dataset or for the subset of the dataset required for each sub-task described later.

Once you have some ideas for exploratory or advanced analysis, you need to adjust the form of the dataset. This can be achieved either by manipulating records in R (by transposition or subsetting), or with other tools (e.g. Notepad or Excel) before reading them into R. Please clearly explain how you have cleaned the data in this section. If you use Excel, please still explain the steps in the Notepad document and the Report.
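As an illustration of the kind of pre-processing described above, here is a minimal sketch. The assignment itself asks for R; Python is used here only to show the steps. The rows are toy values invented for illustration and are not taken from the real dataset, and the 0/1 encoding of is_legendary and the blank type2 field are assumptions about how the file is stored.

```python
import ast
import csv
import io

# Toy rows invented for illustration; they are NOT taken from the real dataset.
RAW = '''name,type1,type2,percentage_male,abilities,height_m,is_legendary
Toy A,fire,,88.1,"['Blaze', 'Solar Power']",0.6,0
Toy B,steel,dragon,,"['Pressure']",5.4,1
Toy C,water,,50,"['Torrent']",,0
'''

def to_float(text):
    """Convert a CSV field to float, mapping blank fields to None."""
    text = text.strip()
    return float(text) if text else None

def load_rows(handle):
    """Read the CSV and convert each column to a usable Python type."""
    rows = []
    for rec in csv.DictReader(handle):
        rows.append({
            "name": rec["name"],
            "type1": rec["type1"],
            "type2": rec["type2"] or None,            # blank means no secondary type
            "percentage_male": to_float(rec["percentage_male"]),  # blank if genderless
            "abilities": ast.literal_eval(rec["abilities"]),      # stringified list -> list
            "height_m": to_float(rec["height_m"]),
            "is_legendary": rec["is_legendary"] == "1",           # assumed 0/1 encoding
        })
    return rows

rows = load_rows(io.StringIO(RAW))
# Subset: keep only the rows that have a known height.
complete = [r for r in rows if r["height_m"] is not None]
print(len(rows), "rows loaded;", len(complete), "with a known height")
```

Whatever tool is used, the point is to record each cleaning decision (how blanks are treated, which rows are dropped) so that it can be explained in the report.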

3. Exploratory Data Analysis
3.1. One-variable analysis
One-variable analysis studies one variable (one row or one column) at a time. For example, the attribute "classification" could be selected to get a bar graph of the frequency of each Pokémon type. Or "height" could be selected to show a histogram of the height ranges of Pokémon. You can choose which attribute to use. Add your code to the Notepad template.

Perform two one-variable analyses. Plot one graph for each variable. Explain the finding for each graph.
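The two kinds of summary mentioned above (a frequency count for a categorical attribute and a histogram for a numeric one) can be sketched as follows. This is only an illustration in Python with invented toy values; the assignment expects the equivalent plots to be produced in R.

```python
from collections import Counter

# Toy values invented for illustration; not taken from the real dataset.
type1 = ["water", "fire", "water", "grass", "water", "fire"]
height_m = [0.3, 0.6, 1.1, 1.7, 0.4, 2.2]

# Categorical variable: frequency of each category (the data behind a bar graph).
type_counts = Counter(type1)

def histogram(values, bin_width):
    """Count values per bin; each bin is identified by its lower edge."""
    bins = Counter()
    for v in values:
        bins[int(v // bin_width) * bin_width] += 1
    return dict(sorted(bins.items()))

# Numeric variable: counts per 1 m height range (the data behind a histogram).
height_bins = histogram(height_m, 1)

# A text rendering of the bar graph, most frequent category first.
for label, n in type_counts.most_common():
    print(f"{label:<6} {'#' * n}")
```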

3.2. Two-variable analysis
A two-variable analysis studies the relation between two variables. For example, we might be interested in how attack strength or speed varies across Pokémon, grouped by the attribute "type1" or "classification". Which type is the strongest overall? Which is the weakest? It is up to you to decide which attributes/variables you use for this analysis. Just be sure to explain what you have done in sentences as well. Add your code to the Notepad template.

Perform two two-variable analyses. Plot one graph for each variable. Explain the finding for each graph.
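One way to relate a categorical variable to a numeric one, as in the "which type is the strongest" question above, is to compare group means. The sketch below is a Python illustration with invented toy values; mean_by_group is a hypothetical helper name, and the assignment expects the analysis itself to be done in R.

```python
from collections import defaultdict
from statistics import mean

# Toy (type1, attack) pairs invented for illustration; not from the real dataset.
records = [("fire", 52), ("fire", 84), ("water", 48), ("water", 63), ("grass", 49)]

def mean_by_group(pairs):
    """Average the numeric value within each category."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: mean(vals) for key, vals in groups.items()}

attack_by_type = mean_by_group(records)
strongest = max(attack_by_type, key=attack_by_type.get)
weakest = min(attack_by_type, key=attack_by_type.get)
print("strongest:", strongest, "weakest:", weakest)
```

A group mean hides the spread within each type, so a box plot per type is often a more honest picture than a single bar per type.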

4. Advanced Analysis
4.1. Clustering
Briefly explain the concept of clustering and k-means (with references).
Perform 1 clustering analysis. You can choose the attributes you want to evaluate but an idea is:
• "Are there any clusters when capture rate and base happiness are examined?"
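To make the idea of k-means concrete, here is a minimal sketch of the standard assign-then-update loop (Lloyd's algorithm) on two attributes. It is written in Python for illustration only; the points are invented toy (capture_rate, base_happiness) pairs, and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Toy (capture_rate, base_happiness) pairs invented for illustration.
points = [(45, 70), (3, 0), (60, 70), (5, 10), (50, 65), (3, 5)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

With real data, attributes measured on very different scales are usually standardised first, because k-means relies on Euclidean distance.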
4.2. Linear Regression
Briefly explain the concept of linear regression (with references).
Perform two linear regression analyses. Plot the learned models. You can choose the attributes you want to evaluate but an idea is:
• "Which type is the most likely to be a legendary Pokemon?"
• "How likely is [a Pokemon type] to be a legendary Pokemon?"
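As a reminder of what a simple linear regression computes, the sketch below fits a straight line by ordinary least squares using the closed-form slope and intercept. It is a Python illustration only; the (attack, defense) values are invented toy numbers, fit_line is a hypothetical helper name, and in R the same fit would normally come from the built-in lm function.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Toy values invented for illustration; not taken from the real dataset.
attack = [40, 60, 80, 100]
defense = [35, 45, 55, 65]

slope, intercept = fit_line(attack, defense)
predicted = slope * 120 + intercept  # predicted defense for an attack of 120
print(slope, intercept, predicted)
```

Plotting the fitted line over a scatter plot of the two attributes is what "plot the learned models" refers to.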
4.3. Classification Tree
Briefly explain the concept of a classification tree (with references). You can choose the attributes you want to evaluate but an idea is:
• "Is it possible to build a classification tree to identify legendary Pokemon?"
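A classification tree is built by repeatedly choosing the split that leaves the purest groups. The sketch below shows a single such split (one node of a tree) chosen by Gini impurity. It is a Python illustration with invented toy values; gini and best_split are hypothetical helper names, and a full tree would apply the same search recursively to each resulting group.

```python
def gini(labels):
    """Gini impurity of a list of True/False labels (0.0 means pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    best = None
    for t in sorted(set(values))[:-1]:  # the largest value would leave one side empty
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Toy values invented for illustration; not taken from the real dataset.
attack = [49, 52, 64, 84, 120, 150]
is_legendary = [False, False, False, False, True, True]

threshold, impurity = best_split(attack, is_legendary)
print(f"split at attack <= {threshold}, weighted Gini = {impurity}")
```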
5. Conclusion
Sum up your findings and provide some insight into the findings.

6. Reflections
In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time.

7. Illustration
Drawing a funny picture of your Pokémon is encouraged but entirely optional. There are no marks for this.

Attachment: Introduction to Data Science.zip
