Reference no: EM133688336
Business Intelligence
Assessment
Part A
Data-driven decision making (DDDM) is the process of working towards key business goals by leveraging verified, analysed data to build BI applications. Answer the following questions in Part A.
Question A-1
Business intelligence (BI) tools are evolving beyond their conventional reliance on OLAP (Online Analytical Processing) data sources rooted in relational database systems (RDBMS). They're now expanding their connectivity to incorporate a diverse array of data sources, including social networks. This shift enables modern BI applications to utilize data mashups, enhancing flexibility in integrating various data types. Consequently, NoSQL technologies are gaining momentum and are increasingly viewed as a significant advancement in the BI domain, offering new capabilities for handling complex data integrations.
Research and describe two scenarios that implement NoSQL in business intelligence applications, and
discuss the benefits and drawbacks of BI implementations based on NoSQL technologies.
Question A-2
Briefly discuss the main differences between the data mining tasks of classification and clustering.
For the following example, calculate the accuracy, precision, and recall:
Note: "Spam" is the positive class (y = 1), and "Not Spam" is the negative class (y = 0) for a binary classifier.
Spam | 1
Not Spam | 0
Not Spam | 0
Spam | 1
Spam | 0
Spam | 1
Not Spam | 0
Spam | 1
Spam | 0
Not Spam | 1
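As a way to check a hand calculation, the three measures can be sketched in a few lines of Python. With TP, FP, TN and FN denoting true positives, false positives, true negatives and false negatives, accuracy = (TP + TN) / (TP + FP + TN + FN), precision = TP / (TP + FP) and recall = TP / (TP + FN). The table has no column headers, so this sketch assumes that the first column is the actual class and the second column is the classifier's prediction; if the roles are the other way round, precision and recall swap. The function names are illustrative only.

```python
# Rows copied from the table above as (label, value) pairs.
# ASSUMPTION: the label is the actual class and the value is the prediction
# (1 = Spam, 0 = Not Spam). The source table has no headers, so swap the
# roles if your copy of the table says otherwise.
rows = [
    ("Spam", 1), ("Not Spam", 0), ("Not Spam", 0), ("Spam", 1), ("Spam", 0),
    ("Spam", 1), ("Not Spam", 0), ("Spam", 1), ("Spam", 0), ("Not Spam", 1),
]


def confusion_counts(pairs):
    """Count TP, FP, TN and FN, treating Spam as the positive class."""
    tp = fp = tn = fn = 0
    for label, predicted in pairs:
        actual = 1 if label == "Spam" else 0
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall); empty denominators give 0.0."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


tp, fp, tn, fn = confusion_counts(rows)
accuracy, precision, recall = metrics(tp, fp, tn, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```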
Question A-3
Consider a database of transactions in which each transaction is the collection of items purchased by a customer in a single visit. Generate the frequent itemsets with a minimum support requirement of minsup = 30%.
Find all frequent itemsets using the Apriori algorithm (show all intermediate results).
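The level-wise procedure can be sketched as follows: count the support of the candidate k-itemsets, keep those that meet minsup, join the survivors into (k+1)-item candidates, and prune any candidate that has an infrequent k-item subset. This is a minimal sketch, not a reference implementation. The transactions below are hypothetical and stand in for the assessment's own transaction table, and the function name `apriori` is chosen for illustration.

```python
from itertools import combinations

# HYPOTHETICAL transactions for illustration only; they are not the
# transaction database of the assessment.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "eggs"},
    {"milk", "butter", "cola"},
    {"bread", "milk", "butter"},
    {"bread", "milk", "cola"},
]


def apriori(transactions, minsup):
    """Return {k: {itemset: support}} for each level k that has frequent itemsets."""
    n = len(transactions)
    # Level 1 candidates: every individual item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    levels = {}
    k = 1
    while candidates:
        # Support = fraction of transactions that contain the candidate.
        support = {c: sum(c <= t for t in transactions) / n for c in candidates}
        frequent = {c: s for c, s in support.items() if s >= minsup}
        if not frequent:
            break
        levels[k] = frequent
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return levels


levels = apriori(transactions, minsup=0.30)
for k, itemsets in levels.items():
    for itemset, s in sorted(itemsets.items(), key=lambda kv: sorted(kv[0])):
        print(k, sorted(itemset), f"{s:.0%}")
```

Printing each level in turn gives the intermediate results (the frequent 1-itemsets, then the frequent 2-itemsets, and so on) in the order a manual Apriori trace would list them.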
Part B
In this part, you are required to demonstrate your understanding of BI modelling of classification with a decision tree using the provided tool (WEKA) or using a programming language (Python or R).
Scenario 1:
Read the resource material about WEKA available on the course website (in Week 5 and Week 6). Download the WEKA software and install it on your computer (ensure that the bundled Java Runtime Environment, i.e. the JRE, is installed). After successfully installing the program, classify the bank accounts in the data file bank-data.arff using the J48 classifier (keep the other parameters at their defaults). The bank-data.arff file can be found on the unit website. Write an analysis report based on the classification results, and include the results in your report. (1000 words)
Hints:
To classify data in WEKA, use "Open file", then select "Classify" » "Classifier (Choose)" » "trees" » "J48".
You need to develop a decision tree and explain the output and the important variables required for generating the output.
The report should include:
Compare and contrast the provided tool (WEKA) with programming languages (Python and/or R) for performing data analysis and classification to generate decision trees.
Analysis of classification results
Summary or conclusion
Scenario 2:
The goal of this scenario is to gain practical experience in applying classification to real data for building BI applications. There are two tasks in this scenario: data preparation and classification. You can generate the decision tree using Python or R.
Task 1: Data Preparation
1.1 Extract the data into an R data frame.
1.2 Assign the following names to the five columns in your dataset:
sepal length in cm
sepal width in cm
petal length in cm
petal width in cm
class: Iris Setosa, Iris Versicolour and Iris Virginica
1.3 Remove all rows with missing values.
1.4 Save the dataframe into a file with filename Iris_processed.Rda
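For students who take the Python route, the four preparation steps can be sketched with the standard library alone. This is a sketch under stated assumptions: the raw rows below are hypothetical placeholders for the real data file, the snake_case column names and the set of missing-value markers are choices made for this example, and the cleaned rows are written as CSV text because the .Rda format named in step 1.4 is specific to R.

```python
import csv
import io

# Column names follow the five attributes listed in step 1.2; the exact
# snake_case spellings are an assumption of this sketch.
COLUMNS = ["sepal_length_cm", "sepal_width_cm", "petal_length_cm",
           "petal_width_cm", "class"]

# HYPOTHETICAL raw rows standing in for the real data file. Two of them
# have a missing value (an empty field and a "?" marker).
RAW = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,?,1.9,Iris-virginica
"""

# Markers treated as "missing" in this sketch (an assumption).
MISSING = {"", "NA", "?"}


def load_rows(handle):
    """Steps 1.1 and 1.2: read the records and attach the column names."""
    rows = []
    for record in csv.reader(handle):
        if not record:  # skip blank lines
            continue
        rows.append(dict(zip(COLUMNS, (field.strip() for field in record))))
    return rows


def drop_missing(rows):
    """Step 1.3: keep only rows with all five fields present and non-missing."""
    return [
        row for row in rows
        if len(row) == len(COLUMNS)
        and not any(value in MISSING for value in row.values())
    ]


def save_rows(rows, handle):
    """Step 1.4 (Python stand-in): write the cleaned rows with a header line."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = load_rows(io.StringIO(RAW))
clean = drop_missing(rows)
out = io.StringIO()  # with a real file, use open(path, "w", newline="")
save_rows(clean, out)
print(f"kept {len(clean)} of {len(rows)} rows")
```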
Task 2: Decision Trees
2.1 Load the pre-processed data from task 1 into the data frame.
2.2 Set seed 2021.
2.3 Divide the dataset into training and test subsets randomly (70% and 30% respectively).
2.4 Generate a classification tree and visualise.
2.5 Provide a summary of the classification result.
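Steps 2.2 to 2.5 can be sketched in plain Python as a seeded 70/30 split followed by a small classification tree that picks, at each node, the threshold with the lowest weighted Gini impurity. This is an illustrative sketch rather than the method of any particular library: the toy rows are hypothetical and deliberately well separated (real iris measurements overlap between Versicolour and Virginica), all function names are made up for this example, and a seed of 2021 in Python will not reproduce the partition that `set.seed(2021)` gives in R.

```python
import random
from collections import Counter

FEATURES = ["petal_length_cm", "petal_width_cm"]


def make_toy_rows():
    """HYPOTHETICAL, cleanly separated rows: (petal length, petal width, class)."""
    rows = []
    for i in range(10):
        rows.append((1.0 + 0.05 * i, 0.2, "setosa"))
        rows.append((4.0 + 0.05 * i, 1.3, "versicolour"))
        rows.append((5.5 + 0.05 * i, 2.0, "virginica"))
    return rows


def split_train_test(rows, train_fraction=0.7, seed=2021):
    """Shuffle with a local seeded generator, then cut at the train fraction."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def gini(rows):
    """Gini impurity of the class labels (last element of each row)."""
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in Counter(r[-1] for r in rows).values())


def best_split(rows):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(rows)
    for f in range(len(FEATURES)):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between neighbouring values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, depth=0, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    f, t = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (f, t, build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def render(tree, indent=""):
    """Text rendering of the tree, one branch condition per line."""
    if not isinstance(tree, tuple):
        return f"{indent}predict {tree}\n"
    f, t, left, right = tree
    return (f"{indent}{FEATURES[f]} <= {t:.2f}\n" + render(left, indent + "  ")
            + f"{indent}{FEATURES[f]} > {t:.2f}\n" + render(right, indent + "  "))


train, test = split_train_test(make_toy_rows())
tree = build_tree(train)
accuracy = sum(predict(tree, r) == r[-1] for r in test) / len(test)
print(render(tree), end="")
print(f"train={len(train)} test={len(test)} test accuracy={accuracy:.2f}")
```

In practice a library such as scikit-learn (Python) or rpart (R) would normally replace the hand-written tree; the sketch only shows what the split and the tree-growing step do.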
Part C
Question C-5
Find a live business intelligence dashboard on the Internet that is similar to (but not one of) the following examples:
Review the available options or views to complete the following tasks:
What are the characteristics of a Business Intelligence Dashboard in an organisation or a company?
How should the Business Intelligence Dashboard be designed to facilitate decision making?
To address the above questions critically, you should use five additional references, apart from the prescribed textbook.
Note: You should include the BI dashboard with your answers. BI dashboard demos are acceptable. (1000 words)