PowerPoint Project Presentation
Assignment Content
About
Your team must propose a data analytics methodology that addresses the client's project objective as described in the provided Project and Data Brief document. The proposed methodology must be fully implemented in Assessment 3 (Group Project Report), although preliminary results may be presented in this Presentation.
Tasks
Your team must deliver a recorded group presentation.
The presentation must propose a data analytics methodology to address the project objectives you identified earlier when drafting the Group Project Plan. Methodology refers to a systematic, stepwise approach that utilises analytic techniques and algorithms taught in this unit.
Additional algorithms not taught in this unit are not essential for passing this assignment. However, they may be incorporated into the proposed methodology only after including the taught algorithms and may help achieve higher grades. To be further considered for a higher grade (D or HD), you must also produce useful preliminary analytics results (using any method learned in Weeks 1 - 6).
As in most real data science projects, your examiner may expect a presentation of diverse approaches to address the project objective. No two teams are likely to produce identical proposals and/or results.
Linear regression and linear algebra operations
Key Python packages
• scikit-learn
• pandas
• matplotlib
Key Python functions
• sklearn.linear_model.LinearRegression().fit() : to fit a linear regression model
• matplotlib.pyplot.plot() : to visualise a linear regression line
• matplotlib.pyplot.scatter() : to plot data points of a dataset
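The functions listed above can be combined as in the following minimal sketch. The toy data (y = 2x + 1) and the column names `x` and `y` are assumptions made for illustration; they do not come from the Project and Data Brief.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Hypothetical toy data lying exactly on the line y = 2x + 1
df = pd.DataFrame({"x": [0.0, 1.0, 2.0, 3.0, 4.0]})
df["y"] = 2.0 * df["x"] + 1.0

# scikit-learn expects a 2-D feature table, hence df[["x"]]
model = LinearRegression()
model.fit(df[["x"]], df["y"])

slope = model.coef_[0]
intercept = model.intercept_

# Scatter the data points and overlay the fitted regression line
fig, ax = plt.subplots()
ax.scatter(df["x"], df["y"], label="data")
ax.plot(df["x"], model.predict(df[["x"]]), color="red", label="fitted line")
ax.legend()
# In an interactive session, call plt.show() to display the figure.
```

On real project data the points will not lie exactly on a line, so the fitted slope and intercept should be read as estimates.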
Week 3 - Linear algebra applications
Key Python packages
• numpy
• numpy.linalg
• matplotlib.pyplot
Key Python functions
• numpy.linalg.inv() : to obtain the inverse of a matrix
• numpy.asarray() : to create a matrix from a list
• numpy.linspace() : to create evenly spaced numbers over a specified interval
• matplotlib.pyplot.plot() : to visualise a linear regression line
• matplotlib.pyplot.scatter() : to plot data points of a dataset
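One common linear algebra application of these functions is solving a linear regression with the normal equation, beta = (X^T X)^(-1) X^T y. The sketch below assumes that this is the intended use and works on hypothetical toy data; the unit material may frame it differently.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical toy data lying exactly on the line y = 2x + 1
x = np.asarray([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.asarray([1.0, 3.0, 5.0, 7.0, 9.0])

# Design matrix: a column of ones (intercept) next to the x values
X = np.column_stack([np.ones_like(x), x])

# Normal equation: beta = (X^T X)^(-1) X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y
intercept, slope = beta

# Evenly spaced x values for drawing the fitted line
x_line = np.linspace(x.min(), x.max(), 50)
y_line = intercept + slope * x_line

fig, ax = plt.subplots()
ax.scatter(x, y, label="data")
ax.plot(x_line, y_line, color="red", label="fitted line")
ax.legend()
```

Explicit inversion mirrors the formula and is fine for small, well-conditioned problems; for larger or ill-conditioned matrices, `numpy.linalg.lstsq` is usually the more numerically stable choice.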
Week 4 - Dimensionality reduction
Key Python packages
• numpy
• pandas
• scikit-learn
• matplotlib.pyplot
Key Python functions
• sklearn.datasets.load_breast_cancer() : to access the built-in breast cancer dataset
• sklearn.preprocessing.StandardScaler() : to apply standardisation
• sklearn.decomposition.PCA() : to apply Principal Component Analysis
• matplotlib.pyplot.scatter() : to plot data points of a dataset
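A minimal sketch of how these functions fit together: standardise the built-in breast cancer dataset, project it onto two principal components, and plot the result. The choice of two components is an assumption made so that the projection can be drawn in 2-D.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Built-in dataset: 569 samples, 30 numeric features, binary target
data = load_breast_cancer()
X, y = data.data, data.target

# PCA is sensitive to feature scale, so standardise first
X_scaled = StandardScaler().fit_transform(X)

# Keep the first two principal components (an illustrative choice)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Share of total variance captured by each retained component
explained = pca.explained_variance_ratio_

# Scatter the projected samples, coloured by class label
fig, ax = plt.subplots()
ax.scatter(X_pca[:, 0], X_pca[:, 1], c=y, s=10)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
```

Inspecting `explained` helps decide how many components are worth keeping for a given dataset.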
Week 5 - Feature scaling
Key Python packages
• pandas
• scikit-learn
• matplotlib.pyplot
Key Python functions
• sklearn.preprocessing.StandardScaler().fit_transform() : to apply standardisation
• sklearn.decomposition.PCA().fit_transform() : to apply Principal Component Analysis
• sklearn.model_selection.train_test_split() : to split data into training and test sets
• sklearn.naive_bayes.GaussianNB().fit() : to train a Naive Bayes classifier
• sklearn.metrics.accuracy_score() : to compute a classifier's accuracy
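The functions above can be chained into a small classification workflow, sketched below. The breast cancer dataset, the 70/30 split, the fixed `random_state`, and the two principal components are all assumptions chosen for illustration, not requirements of the assignment.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

# Hold out 30% of the samples for testing (illustrative split)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit the scaler on the training set only, then reuse it on the test set
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Same rule for PCA: fit on training data, transform both sets
pca = PCA(n_components=2)
X_train_p = pca.fit_transform(X_train_s)
X_test_p = pca.transform(X_test_s)

# Train a Gaussian Naive Bayes classifier and score it on held-out data
clf = GaussianNB()
clf.fit(X_train_p, y_train)
acc = accuracy_score(y_test, clf.predict(X_test_p))
print(f"Test accuracy: {acc:.3f}")
```

Fitting the scaler and PCA on the training set alone avoids leaking information from the test set into the model.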
Week 6 - Clustering
Key Python packages
• numpy
• scikit-learn
• matplotlib.pyplot
Key Python functions
• sklearn.cluster.KMeans() : to build and apply the k-means clustering algorithm
• sklearn.preprocessing.StandardScaler() : to apply standardisation
• matplotlib.pyplot.scatter() : to plot data points of a dataset
• numpy.random.uniform() : to generate random numbers from a uniform distribution
• numpy.random.normal() : to generate random numbers from a normal (Gaussian) distribution
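As a rough illustration of the functions above, the sketch below generates two synthetic groups of points, standardises them, and clusters them with k-means. The blob centres, spreads, jitter range, seed, and `n_clusters=2` are all hypothetical values chosen so that the clusters are clearly separated.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

np.random.seed(0)  # fixed seed so the synthetic data is reproducible

# Two well-separated Gaussian blobs of 50 points each (hypothetical data)
blob_a = np.random.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = np.random.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Small uniform jitter, only to show numpy.random.uniform in use
X = X + np.random.uniform(low=-0.1, high=0.1, size=X.shape)

# Standardise so both features contribute equally to distances
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means with two clusters and get a cluster label per point
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Plot the points coloured by cluster, with centroids marked
fig, ax = plt.subplots()
ax.scatter(X_scaled[:, 0], X_scaled[:, 1], c=labels, s=15)
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
           color="red", marker="x", s=80)
```

On real data the number of clusters is rarely known in advance, so `n_clusters` would normally be chosen by comparing several values.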