Reference no: EM133151347, Length: 3200 words
Business Intelligence - Practical Report
Assignment
The key frameworks and concepts covered in modules 1-10 are relevant for this assignment. Assignment 3 relates to the specific course learning objectives 2, 3, 4 and 5:
2. Analyse and apply strategies, processes and underlying technologies for effective management of data to make evidence-based decisions;
3. Critically analyse organisational and societal problems using descriptive and predictive analysis and internal and external data sources to generate insight, create value and support evidence-based decision making;
4. Examine legal, ethical and privacy dilemmas that arise from the use of business intelligence, analytics and evidence-based decision making to comply with legal and regulatory requirements;
5. Communicate effectively in a clear and concise manner in written report style for both senior and middle management with correct and appropriate acknowledgment of the main ideas presented and discussed.
Assignment Task for - Business Intelligence
Text Book:
Business Intelligence, Analytics, and Data Science : A Managerial Perspective.
ASSIGNMENT DESCRIPTION AND TASK LIST
Task overview:
Task 1: Data Mining and Text Mining Concepts
Task 1 Predictive Analytics Case Study
The goal of the Predictive Analytics Case Study is to predict whether it is likely to rain tomorrow or not based on previous weather conditions recorded at 49 weather station locations in the weatherAUS.csv data set provided (see Table 1 Data Dictionary for weatherAUS.csv data set). You should review the data dictionary for the weatherAUS.csv data set. The Australian Weather dataset contains over 190,000 daily observations from January 2008 through to July 2021 from 49 Australian weather stations. The daily observations are available.
In completing Task 1 you will apply the business understanding, data understanding, data preparation, modelling and evaluation phases of the CRISP-DM data mining process. It is important that you understand this data set in order to complete Task 1 and its four sub-tasks.
Exploratory data analysis and data preparation
Conduct an exploratory data analysis and data preparation of the weatherAUS.csv data set using RapidMiner to understand the characteristics of each variable and the relationship of each variable to the other variables. Summarise the findings in a table named Task 1.1 Results of Exploratory Data Analysis and Data Preparation. For each variable in the weatherAUS.csv data set, the table should describe key characteristics such as maximum and minimum values, average, standard deviation, most frequent value (mode), missing values and invalid values, together with relationships with other variables, any transformation of existing variables and any creation of new variables.
Briefly discuss the key findings of your exploratory data analysis and data preparation and justify the variables most likely to predict whether it is likely to rain tomorrow or not (10 marks, 600 words).
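The per-variable statistics requested for the Task 1.1 table can be illustrated outside RapidMiner with a minimal Python sketch. The variable name and the sample values below are hypothetical and are not taken from weatherAUS.csv; the sketch only shows how each statistic is derived once missing values are set aside.

```python
import statistics
from collections import Counter

# Hypothetical sample of one numeric variable; real values come from weatherAUS.csv.
# None marks a missing observation.
rainfall = [0.0, 1.2, None, 0.0, 15.4, 0.0, None, 3.6]


def summarise(values):
    """Return descriptive statistics for one numeric variable."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "std_dev": statistics.stdev(present),  # sample standard deviation
        "mode": Counter(present).most_common(1)[0][0],
    }


summary = summarise(rainfall)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reporting the missing count alongside the other statistics makes it easier to justify any imputation or filtering carried out during data preparation.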
Decision Tree Model
Build a Decision Tree model for predicting whether it is likely to rain tomorrow or not based on the weatherAUS.csv data set, using RapidMiner and a set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Decision Tree Model process, (2) Final Decision Tree diagram and (3) Decision Tree rules. Briefly explain your Final Decision Tree Model process, and discuss the results of the Final Decision Tree Model, drawing on the key outputs (Decision Tree diagram, Decision Tree rules) for predicting whether it is likely to rain tomorrow or not based on key contributing variables, and on relevant supporting literature on the interpretation of decision trees (10 marks, 200 words).
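When interpreting which variables appear near the top of a decision tree, it can help to see how a split is scored. The sketch below computes information gain, one common splitting criterion, for a single candidate attribute. The attribute names and the eight rows are hypothetical, and the criterion actually used depends on how the Decision Tree operator is configured.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction achieved by splitting rows on one attribute."""
    parent = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


# Hypothetical observations; column names are assumed for illustration.
rows = [
    {"RainToday": "Yes", "RainTomorrow": "Yes"},
    {"RainToday": "Yes", "RainTomorrow": "Yes"},
    {"RainToday": "Yes", "RainTomorrow": "No"},
    {"RainToday": "No", "RainTomorrow": "No"},
    {"RainToday": "No", "RainTomorrow": "No"},
    {"RainToday": "No", "RainTomorrow": "No"},
    {"RainToday": "No", "RainTomorrow": "Yes"},
    {"RainToday": "No", "RainTomorrow": "No"},
]

gain = information_gain(rows, "RainToday", "RainTomorrow")
print(f"Information gain for RainToday: {gain:.4f}")
```

An attribute with a higher gain separates the two classes more cleanly, which is why such attributes tend to be chosen for the earliest splits.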
Logistic Regression Model
Build a Logistic Regression model for predicting whether it is likely to rain tomorrow or not using RapidMiner, the weatherAUS.csv data set and an appropriate set of data mining operators determined in part by your exploratory data analysis in Task 1.1. Provide these outputs from RapidMiner: (1) Final Logistic Regression Model process and (2) key outputs from the Logistic Regression Model. Hint for Task 1.3 Logistic Regression Model: you may need to change the data types of some variables. Briefly explain your Final Logistic Regression Model process and discuss the results of the Final Logistic Regression Model, drawing on the key outputs (Coefficients, Standardised Coefficients, Odds Ratios, P Values etc) for predicting whether it is likely to rain tomorrow or not based on key contributing variables, and on relevant supporting literature on the interpretation of logistic regression models (10 marks, 200 words).
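The relationship between coefficients, odds ratios and predicted probabilities can be sketched in a few lines of Python. The intercept, coefficient values and variable names below are hypothetical and are not results from weatherAUS.csv; they only show how the outputs relate to one another.

```python
import math

# Hypothetical model parameters for illustration only.
intercept = -2.0
coefficients = {"Humidity3pm": 0.04, "RainToday": 0.9}


def odds_ratios(coefs):
    """Odds ratio for a one-unit increase in each variable: exp(coefficient)."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


def predict_probability(intercept, coefs, observation):
    """Apply the logistic function to the linear combination of inputs."""
    log_odds = intercept + sum(coefs[name] * observation[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-log_odds))


ratios = odds_ratios(coefficients)
probability = predict_probability(
    intercept, coefficients, {"Humidity3pm": 70, "RainToday": 1}
)

for name, ratio in ratios.items():
    print(f"{name}: odds ratio {ratio:.3f}")
print(f"Predicted probability of rain tomorrow: {probability:.3f}")
```

An odds ratio above 1 means the odds of rain tomorrow increase as the variable increases, while a ratio below 1 means they decrease; the P value then indicates whether that effect is statistically distinguishable from no effect.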
Model Validation and Performance:
You will need to validate your Final Decision Tree Model and Final Logistic Regression Model using the Cross-Validation Operator, Apply Model Operator and Performance Operator in your data mining processes. Discuss and compare the performance of the Final Decision Tree Model with the Final Logistic Regression Model for predicting whether it is likely to rain tomorrow or not based on key results of the confusion matrix presented in Table 1.4 Model Performance Metrics (Decision Tree vs Logistic Regression). Table 1.4 will compare the Final Decision Tree Model with the Final Logistic Regression Model using the following model performance metrics: (1) accuracy, (2) sensitivity, (3) specificity and (4) F1 score (10 marks, 250 words).
Note: the important outputs from the data mining analyses conducted in RapidMiner for Task 1 must be included in your Assignment 3 report to support the conclusions reached for each analysis conducted for 1.1, 1.2, 1.3 and 1.4. You can export important outputs from RapidMiner as JPG image files and include these screenshots in the relevant Task 1 parts of your Assignment 3 report.
Note: you will find the North textbook and the RapidMiner Tutorials useful references for the data mining process activities conducted in Task 1, namely the exploratory data analysis and data preparation, decision tree analysis, logistic regression analysis and evaluation of the performance of the Final Decision Tree Model and the Final Logistic Regression Model. These concepts are covered in Module RapidMiner Practicals, in Chapters 3, 4, 9, 10 and 13 of the North textbook and in the RapidMiner Tutorials contained within RapidMiner.
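The four metrics required for Table 1.4 all derive from the cells of the confusion matrix. The sketch below shows the standard formulas, treating "rain tomorrow" as the positive class; the counts are hypothetical and are not results from either model.

```python
def performance_metrics(tp, fp, tn, fn):
    """Derive the Table 1.4 metrics from confusion matrix counts.

    tp / fn: rainy days predicted as rain / as no rain
    tn / fp: dry days predicted as no rain / as rain
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # also called recall or true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical confusion matrix counts for illustration only.
metrics = performance_metrics(tp=60, fp=20, tn=100, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because dry days are usually more frequent than rainy days, accuracy alone can flatter a model that rarely predicts rain, which is why sensitivity, specificity and F1 score are compared as well.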
Research and critically review the study materials and other relevant literature to provide a suitable written response to each of the following tasks 2, 3 and 4 supported with an appropriate level of in-text referencing:
Task 2 Sentiment Analysis (600 words)
Define the concept Sentiment Analysis and explain how Sentiment Analysis relates to text mining (300 words)
Identify and describe a widely used application area of sentiment analysis and explain why sentiment analysis is used in this application. What business problem does sentiment analysis address, and how does it add value for an organisation and its customers? Illustrate your answer with a real-world example of the application of sentiment analysis by an organisation (300 words)
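As a minimal illustration of how sentiment analysis builds on text mining, the sketch below tokenises free text and scores it against small word lists. The lexicons and review sentences are hypothetical, and this lexicon-based approach is only one simple technique; production systems typically use much larger lexicons or trained classifiers.

```python
import re

# Tiny hypothetical sentiment lexicons for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}


def sentiment_score(text):
    """Count positive minus negative words after simple tokenisation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def classify(text):
    """Map the numeric score to a sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical customer comments.
reviews = [
    "Great service and helpful staff",
    "Delivery was slow and the packaging was poor",
    "The parcel arrived on Tuesday",
]
labels = [classify(review) for review in reviews]
for review, label in zip(reviews, labels):
    print(f"{label}: {review}")
```

The tokenisation step is ordinary text mining; sentiment analysis adds the layer that maps the extracted terms to an opinion polarity.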
Task 3 Big Data Technologies (600 words)
Identify and describe each of the three prominent big data technologies using diagrams where appropriate (300 words).
Explain the key role(s) that these three prominent big data technologies play in managing big data in an organisation, including how these three big data technologies are interrelated and integrated to achieve effective big data management (300 words).
Task 4 Artificial Intelligence: automation and augmentation in workplace and ethical considerations (1100 words)
First, discuss how configurations of humans and artificial intelligence will evolve in the workplace as organisations drive automation and augmentation through the adoption of artificial intelligence (600 words).
Second, identify and discuss the ethical implications for organisations in relation to (1) privacy, (2) transparency, (3) bias and discrimination and (4) governance and accountability of using artificial intelligence to drive automation and augmentation in the workplace (600 words).
Report Quality: structure, presentation, writing and referencing
Structure and presentation: cover page, table of contents, page numbers, headings, sub-headings, tables and diagrams, use of formatting, spacing, paragraphs
Writing quality: use of English; report written in a clear and concise manner for the intended management audience (correct use of language and grammar; evidence of spell-checking and proofreading)
Attachment: Business Intelligence.rar