Discuss key technology building blocks of visual analytics

Reference no: EM133544383, Length: 3000 words

Business Intelligence

Learning objective 1: Analyse and apply strategies, processes and underlying technologies for effective management of data to make evidence-based decisions;

Learning objective 2: Critically analyse organisational and societal problems using descriptive and predictive analysis and internal and external data sources to generate insight, create value and support evidence-based decision making;

Learning objective 3: Examine legal, ethical and privacy dilemmas that arise from the use of business intelligence, analytics and evidence-based decision making to comply with legal and regulatory requirements;

Learning objective 4: communicate effectively in a clear and concise manner in written report style for both senior and middle management with correct and appropriate acknowledgment of the main ideas presented and discussed.

Task 1 Predictive Analytics Case Study
The goal of the Predictive Analytics Case Study is to predict whether a patient is likely to have a stroke or not (see Table 1 Data dictionary for stroke-data.csv below). In completing Task 1 you will apply the business understanding, data understanding, data preparation, modelling and evaluation phases of the CRISP-DM data mining process. It is important that you understand this data set in order to complete Task 1 and its four sub-tasks.

Table 1 Data dictionary for stroke-data.csv

Variable Name	Description	Data Type
id	unique identifier	Numeric
gender	gender of the patient	Categorical: "Male", "Female" or "Other"
age	age of the patient	Numeric
hypertension	patient has hypertension	Binary: 0 = No (the patient does not have hypertension), 1 = Yes (the patient has hypertension)
heart_disease	patient has heart disease	Binary: 0 = No (the patient does not have heart disease), 1 = Yes (the patient has heart disease)

Task 1.1 Exploratory data analysis and data preparation

Conduct an exploratory data analysis and data preparation of the stroke-data.csv data set using RapidMiner to understand the characteristics of each variable and its relationship to the other variables. Summarise your findings in a table named Table 1.1 Results of Exploratory Data Analysis and Data Preparation. For each variable in stroke-data.csv, describe its key characteristics (such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values), its relationships with other variables, any transformation of existing variables, and any new variables created.

Hint: the Statistics tab and Chart tab in RapidMiner provide much of the descriptive statistical information and the charts (such as bar charts and scatter plots) required for Task 1.1. You might also run correlations and/or chi-square tests, depending on whether a variable is numeric or categorical. Indicate in Table 1.1 which variables contribute most to predicting whether a patient is likely to have a stroke or not. You could also consider transforming some variables, creating new variables, and converting the target/label variable into a binominal variable to facilitate analysis in Tasks 1.2, 1.3 and 1.4.
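As an illustration of the chi-square test mentioned in the hint, the following Python sketch computes the statistic for a 2x2 table of two categorical variables (for example, hypertension against a stroke label). It is not part of the RapidMiner workflow the task requires. The counts, the variable pairing and the function name `chi_square_2x2` are made up for illustration and do not come from stroke-data.csv, and the sketch assumes every row and column total is greater than zero.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied,
    and all row and column totals are assumed to be positive.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = hypertension (0, 1), columns = stroke (0, 1).
counts = [[900, 30], [80, 20]]
stat, p = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, p-value = {p:.4g}")
```

A small p-value suggests the two variables are not independent, which is one piece of evidence that the variable may be useful for prediction.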

Briefly discuss the key findings of your exploratory data analysis and data preparation, and justify which variables are most likely to predict whether a patient is likely to have a stroke or not (500 words).

Task 1.2 Decision Tree Model

Build a Decision Tree model for predicting whether a patient is likely to have a stroke or not, on the stroke-data.csv data set, using RapidMiner and a set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) the Final Decision Tree Model process, (2) the Final Decision Tree diagram and (3) the Decision Tree rules. Briefly explain your Final Decision Tree Model process, and discuss the results of the Final Decision Tree Model, drawing on the key outputs (Decision Tree diagram, Decision Tree rules) for predicting whether a patient is likely to have a stroke or not based on key contributing variables and relevant supporting literature on the interpretation of decision trees (150 words).

Task 1.3 Logistic Regression Model
Build a Logistic Regression model for predicting whether a patient is likely to have a stroke or not, on the stroke-data.csv data set, using RapidMiner and an appropriate set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) the Final Logistic Regression Model process and (2) the key outputs from the Logistic Regression Model.
Hint: for the Task 1.3 Logistic Regression Model, you may need to change the data types of some variables. Briefly explain your Final Logistic Regression Model process, and discuss the results of the Final Logistic Regression Model, drawing on the key outputs (coefficients, standardised coefficients, odds ratios, p-values etc.) for predicting whether a patient is likely to have a stroke or not based on key contributing variables and relevant supporting literature on the interpretation of logistic regression models (150 words).
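When interpreting the coefficient and odds ratio outputs, it helps to recall that a logistic regression coefficient is a change in log-odds, so the odds ratio is the exponential of the coefficient. The following is a minimal Python sketch of that relationship, not a RapidMiner output. The intercept, coefficient values and function names are hypothetical and are not results from stroke-data.csv.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


def predicted_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    log_odds = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical fitted values, for illustration only.
intercept = -7.0
coefficients = {"age": 0.07, "hypertension": 0.4}

for name, beta in coefficients.items():
    print(f"{name}: coefficient = {beta}, odds ratio = {odds_ratio(beta):.3f}")

# Hypothetical patient record.
patient = {"age": 70, "hypertension": 1}
prob = predicted_probability(intercept, coefficients, patient)
print(f"predicted probability = {prob:.3f}")
```

An odds ratio above 1 means the odds of the outcome rise as the predictor increases with the other predictors held fixed, and an odds ratio below 1 means they fall.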

Task 1.4 Model Validation and Performance
You will need to validate your Final Decision Tree Model and Final Logistic Regression Model using the Cross-Validation operator, Apply Model operator and Performance operator in your data mining processes. Discuss and compare the performance of the Final Decision Tree Model with that of the Final Logistic Regression Model for predicting whether a patient is likely to have a stroke or not, based on the key results of the confusion matrix presented in Table 1.4 Model Performance Metrics (Decision Tree vs Logistic Regression). Table 1.4 will compare the two models using the following model performance metrics: (1) accuracy, (2) sensitivity, (3) specificity and (4) F1 score (200 words).
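The four metrics in Table 1.4 all follow from the four cells of a binary confusion matrix. The Python sketch below shows the standard formulas, treating "stroke" as the positive class. It is an illustration only and does not replace the RapidMiner outputs the task asks for. The function name `classification_metrics` and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute metrics from confusion matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    A ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for one model, for illustration only.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When the positive class is rare, accuracy can look high even if few positive cases are detected, which is why sensitivity and F1 score are compared alongside it.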

Note 1: the important outputs from the data mining analyses conducted in RapidMiner for Task 1 must be included in your Report 3 to support the conclusions you reach for each analysis conducted in Tasks 1.1, 1.2, 1.3 and 1.4. You can export important outputs from RapidMiner as jpg image files and include these screenshots in the relevant Task 1 parts of your Assessment 3 Report.

Note 2: you will find the North Textbook and RapidMiner Tutorials useful references for the data mining process activities conducted in Task 1 in relation to the exploratory data analysis and data preparation, decision tree analysis, logistic regression analysis and evaluation of the performance of the Final Decision Tree model and the Final Logistic Regression model. These concepts are covered in Module RapidMiner Practicals and Chapters 3, 4, 9, 10 and 13 of North Textbook and RapidMiner Tutorials contained within RapidMiner.

Research and critically review the study materials and other relevant literature to provide a suitable written response to each of the following tasks 2, 3 and 4 supported with an appropriate level of in-text referencing:

Task 2 Customer Relationship Management Analytics (500 words)
Explain why customer relationship management (CRM) analytics is such an important activity for business (250 words).
Choose and describe a widely used application of customer relationship management (CRM) analytics and explain how the impact of CRM analytics can be measured in this application area (250 words).

Task 3 Visual Analytics Technologies (500 words)
Explain why visual analytics is such an important concept in business intelligence, and illustrate your answer with a real-world application of visual analytics (250 words).

Discuss the key technology building blocks of visual analytics in the context of the same real-world application described in Task 3.1 (250 words).

Task 4 Automated Driving of Road Vehicles - Transforming Work and Ethical Considerations of AI Technologies (1000 words)
Identify and discuss the AI technologies used in automated driving of road vehicles (500 words).
Identify and discuss the ethical implications for transportation companies regarding (1) privacy, (2) transparency, (3) bias and discrimination, and (4) governance and accountability when using AI technologies in automated driving of road vehicles to replace human drivers for goods delivery (500 words).

Report Quality: structure, presentation, writing and referencing

Structure and presentation: Cover page, table of contents, page numbers, headings, subheadings, tables and diagrams, use of formatting, spacing, paragraphs.
Writing quality: Use of English; report written in a clear and concise manner for the intended management audience (correct use of language and grammar, with evidence of spell-checking and proofreading).
Quality of research evident by correct and appropriate use of referencing: Appropriate level of in-text referencing, reference list provided, Harvard Referencing Style used correctly.
Report 3 must be structured as follows:
Report 3 Cover page
Table of Contents
Task 1 Heading - Sub headings for Tasks 1.1, 1.2, 1.3 and 1.4
Task 2 Heading - Sub headings for Task 2.1 and 2.2
List of References
List of Appendices

Attachment:- stroke-data.rar
