Reference no: EM133692924
Machine Learning Applications
Your Task
Design a stroke prediction system that determines whether a patient is likely to have a stroke, given information about that patient.
Assessment Description
Stroke is one of the leading causes of death globally. Your organisation is supporting the World Health Organisation (WHO) in understanding the root causes of stroke so that at-risk patients can be identified well in advance.
Your team lead has given you the task of determining, from the patient information provided to you, whether or not a patient is prone to a stroke.
Data
A Stroke Prediction dataset is available on Kaggle.
The original dataset has been pre-processed and is provided as a file on MyKBS. The file contains the following columns:
gender: "Male", "Female" or "Other".
age: age of the patient.
hypertension: 0 if the patient doesn't have hypertension, 1 if the patient has hypertension.
heart_disease: 0 if the patient doesn't have any heart diseases, 1 if the patient has a heart disease.
ever_married: "No" or "Yes".
work_type: "children", "Govt_job", "Never_worked", "Private" or "Self-employed".
Residence_type: "Rural" or "Urban".
avg_glucose_level: average glucose level in blood.
bmi: body mass index.
smoking_status: "formerly smoked", "never smoked", "smokes" or "Unknown".
*Note: "Unknown" in smoking_status means that the information is unavailable for this patient.
You are required to train/test a stroke prediction system using the data provided to you.
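The text columns listed above have to be turned into numbers before a neural network can use them. The following is a minimal sketch of one common option, one-hot encoding with pandas. The rows are invented purely to mimic the column layout, the helper name `encode_features` is hypothetical, and the real file may need further handling (for example of missing values) that the brief does not describe.

```python
import pandas as pd

# Hypothetical rows that only mimic the column layout described above;
# the values are invented for illustration and do not come from the real file.
sample = pd.DataFrame({
    "gender": ["Male", "Female", "Other"],
    "age": [67.0, 45.0, 8.0],
    "hypertension": [0, 1, 0],
    "heart_disease": [1, 0, 0],
    "ever_married": ["Yes", "Yes", "No"],
    "work_type": ["Private", "Self-employed", "children"],
    "Residence_type": ["Urban", "Rural", "Urban"],
    "avg_glucose_level": [228.7, 95.1, 80.4],
    "bmi": [36.6, 28.1, 17.9],
    "smoking_status": ["formerly smoked", "Unknown", "never smoked"],
})

# Columns holding text labels rather than numbers.
CATEGORICAL = ["gender", "ever_married", "work_type",
               "Residence_type", "smoking_status"]


def encode_features(df):
    """One-hot encode the text columns; numeric columns pass through unchanged."""
    return pd.get_dummies(df, columns=CATEGORICAL, dtype=int)


encoded = encode_features(sample)
print(encoded.shape)
# "Unknown" becomes its own indicator column instead of being dropped.
print(sorted(c for c in encoded.columns if c.startswith("smoking_status_")))
```

Keeping "Unknown" as a separate category is only one possible design choice; treating it as missing data and imputing it is another, and either decision is worth justifying in the report.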
Problem Statement
As an individual, you are required to download the dataset from MyKBS. You must build a stroke prediction system to identify patients at risk of stroke. Approach the problem systematically by addressing the tasks below:
Load the dataset, summarise it, and pre-process it to fit your requirements. Find and report the most correlated features in the data. Perform an 80/20 train-test split.
Design a stroke prediction system using a feed-forward neural network (NN).
Write an analytical report for a non-technical reader that explains your approach and the performance of the NN using relevant metric(s). Your report should contain an abstract, an introduction, a methodology section and a conclusion. Referencing must follow the Kaplan Harvard Referencing style.
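The three tasks above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not a model answer: the data is synthetic, the target column name `stroke` is assumed (the brief does not list it), scikit-learn's `MLPClassifier` is used as one possible feed-forward network, and the layer sizes are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the provided file: invented values, assumed `stroke` target.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.uniform(1, 90, n),
    "hypertension": rng.integers(0, 2, n),
    "avg_glucose_level": rng.uniform(55, 270, n),
    "bmi": rng.uniform(15, 45, n),
})
# Invented rule so that the label depends on the features (illustration only).
risk = 0.04 * df["age"] + 0.8 * df["hypertension"] + rng.normal(0, 1, n)
df["stroke"] = (risk > risk.quantile(0.8)).astype(int)

# 1. Rank features by absolute Pearson correlation with the target.
correlations = (
    df.corr()["stroke"].drop("stroke").abs().sort_values(ascending=False)
)
print(correlations)

# 2. 80/20 split, stratified so both parts keep the same share of positive cases.
X = df.drop(columns="stroke")
y = df["stroke"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 3. Feed-forward network with two hidden layers; inputs are standardised first.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=1000, random_state=0),
)
model.fit(X_train, y_train)

# 4. Evaluate on the held-out 20% with a confusion matrix and recall.
y_pred = model.predict(X_test)
matrix = confusion_matrix(y_test, y_pred)
print(matrix)
print("recall:", recall_score(y_test, y_pred))
```

For a non-technical reader, recall can be described as the share of actual stroke patients that the system flags. If stroke cases turn out to be a small minority in the provided file, plain accuracy can look high even for a model that never predicts a stroke, which is one reason to report a confusion matrix alongside it.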
Learning Objective 1: Explain programming functions for the sourcing, storage, and preparation of data for machine learning applications.
Learning Objective 2: Design basic algorithmic models for the application of machine learning in information technology.
Learning Objective 3: Create insights of organisational value with the aid of machine learning.
Assessment Guidelines
You are required to follow the guidelines below:
You should write your Stroke Prediction System code in the Python 3 programming language.
You can use any Python third-party package in this assessment.
You should ONLY use the provided file for training/testing your system.
The purpose of this assessment is for you to demonstrate your grasp of the concepts. Showing and explaining your way of thinking is valued more than the performance of the model itself.