BUS5CA Customer Analytics and Social Media Assignment

Reference no: EM132970517

BUS5CA Customer Analytics and Social Media - La Trobe University

Social Media Analysis for Understanding Customer Preferences and Sentiments

Learning Objective:
The learning objective of Assignment 1 is to further develop your understanding of, and skills in, social media analytics by analysing two case studies:

1. Case Study A: you will work as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study is to educate the marketing teams of the company's clients (in diverse industries) on how to market their products and/or services on social media to maximise customer involvement (positive interest and sharing). The company wants to find out how keywords, comments and sentiments relate to one another, and whether those relationships differ across topic categories, such as entertainment, technology and sports, that are of interest to clients in various industries.

2. Case Study B: you will be a data scientist working for the multinational technology company Apple Inc. to develop a sentiment analytics engine for Twitter, which is used to predict the sentiment of consumers' reviews. The aim is to develop both dictionary-based and machine learning-based sentiment analytics scripts using a number of R libraries and SAS Sentiment Analysis Studio (covered in the workshop activities in Week 4 and Week 5). You are required to use the developed engine to predict Apple reviewers' sentiments and to benchmark various algorithms and analytics tools.

Case Study A

Leveraging the power of content and social media marketing can dramatically grow a company's audience and customer base. However, using social media for marketing without any previous experience or insight can be challenging, so it is vital for a marketing team to understand social media marketing fundamentals. If a company publishes exciting, high-quality content and builds an online audience of quality followers, those followers can comment on the content and share it with their own audiences on Twitter, Facebook, LinkedIn, Google+, their own blogs and many other social media platforms. This commenting and sharing opens up new entry points through which search engines like Google can find the content in a keyword search. Those entry points can grow into hundreds or thousands of potential ways for people to find a company, product or service online. Finding and understanding the online influencers in the market who have quality audiences and are likely to be interested in the product, service or business can make a huge positive impact.

The consulting company collected information on articles that were shared by people on social media. The dataset contains approximately 12,273 articles, and a total of 17 features were extracted from the HTML code of each article, including the headline and the abstract. (The description of the dataset is provided as an appendix.) Some of the features depend on characteristics of the service used and can be analysed using the metadata provided with each article, such as keywords, article domain type and the total number of comments. The article domain categories are 'Lifestyle', 'Scitech', 'Entertainment', 'Sports' and 'World'. In addition, several natural language processing features were extracted.

Task Requirements

As a data analytics team member for the consultancy firm, you are required to carry out a number of data analytics tasks for the consulting company using the data collected. You are given access to a sample of the data where some of the variables have been removed as they are not considered important for the analysis of this assignment.

For each article domain, the company wants you to:
• Investigate the impact of the article properties on the number of comments;
• Use SAS Enterprise Miner for text analysis to identify key features in the articles and analyse their contribution towards low and high numbers of comments.

To achieve the above, you need to carry out the following data analytics tasks:

1. Task 1: Explore the impact of article properties

Explore the data and investigate which properties of an article correlate with a high number of comments on social media.
• Open the dataset 'news.xlsx' using Microsoft Excel.

• Explore the dataset to understand the five data channels (lifestyle, entertainment, scitech, sports and world) and their associated data. In each article domain column, a value of 1 indicates that the row belongs to the corresponding article domain.
• Copy the data for each article domain to a separate Excel sheet (sort and filter by data channel in Microsoft Excel, or apply suitable R code in R Studio).
• In each data channel, identify the articles with a high number of comments, using the top 20% of the dataset as the threshold.
• Investigate the following properties and explain how they could have affected the high number of comments. You should provide explanations to support your argument.
o Number of words in headline
o Number of words in abstract
o Number of words in content
o Number of keywords in the meta data
o Whether the article was published on a weekend
(Hint: To do this, you can create plots in R with proper measures between the corresponding columns and the number of comments. You may want to include a fitted line to your plots to investigate the correlation for continuous variables.)
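The Task 1 steps can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame is synthetic, and every column name (`sports`, `world`, `headline_words`, `is_weekend`, `comments`) is hypothetical, since the real names come from the appendix describing 'news.xlsx'.

```r
# Synthetic stand-in for news.xlsx; every column name here is hypothetical.
set.seed(42)
n <- 300
channel <- sample(c("sports", "world"), n, replace = TRUE)
articles <- data.frame(
  sports = as.integer(channel == "sports"),  # 1 marks a sports article
  world = as.integer(channel == "world"),    # 1 marks a world article
  headline_words = sample(5:15, n, replace = TRUE),
  is_weekend = sample(c(0L, 1L), n, replace = TRUE)
)
articles$comments <- rpois(n, lambda = 5 + 2 * articles$headline_words)

# Keep one channel, mirroring the per-sheet split described above.
sports <- articles[articles$sports == 1, ]

# Articles at or above the 80th percentile count as "high comments".
threshold <- quantile(sports$comments, probs = 0.80)
sports$high_comments <- sports$comments >= threshold

# Continuous property: correlation and a fitted straight line.
r <- cor(sports$headline_words, sports$comments)
fit <- lm(comments ~ headline_words, data = sports)

# Binary property: share of high-comment articles on weekdays vs weekends.
weekend_share <- tapply(sports$high_comments, sports$is_weekend, mean)

# Scatter plot with the fitted line (drawn only in an interactive session).
if (interactive()) {
  plot(sports$headline_words, sports$comments,
       xlab = "Words in headline", ylab = "Number of comments")
  abline(fit, col = "red")
}
```

The same pattern would be repeated for each channel and each continuous property; a comparison of proportions, as in `weekend_share`, is one possible measure for the weekend flag.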

2. Task 2: Use SAS Enterprise Miner for keyword analysis
• Use SAS Enterprise Miner to extract the keywords from the abstract in each article domain. (Hint: To do this, you can refer to the workshop activities in Week 3 and Week 4, setting the 'Abstract' column as the only 'Text' role in the variable settings. The keywords can be identified from the Terms table in the SAS results.)
• What are the most frequently used (top 5) topics in each category? Use the SAS Result window to explain your answers.
(Hint: The 'Abstract' column will need to be set as the only 'Text' role.)
• Are there common topics that span data channels and relate to a high number of comments or a low number of comments? Use the whole dataset in SAS Enterprise Miner to identify the relationship. You should provide explanations to support your argument.
(Hint: Use the whole dataset to identify the articles with a high number of comments and those with a low number of comments, using the top 20% and the bottom 20% of the dataset as thresholds. Separate the dataset in Excel on this basis before the analysis, then use the two resulting datasets to analyse the common topics in each. For this question, use the 'Abstract' column as the only 'Text' role for topic modelling.)
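The hint asks for the split to be done in Excel. As an alternative illustration, the same top 20% and bottom 20% separation could be sketched in base R. The data and the column names `abstract` and `comments` are hypothetical stand-ins for the real dataset.

```r
# Synthetic stand-in for the whole dataset; column names are hypothetical.
set.seed(7)
news <- data.frame(
  abstract = paste("abstract", seq_len(100)),
  comments = sample(0:500, 100, replace = TRUE),
  stringsAsFactors = FALSE
)

# 20th and 80th percentiles of the comment counts.
cut_points <- quantile(news$comments, probs = c(0.20, 0.80))

# Bottom 20% (low comments) and top 20% (high comments) subsets.
low  <- news[news$comments <= cut_points[["20%"]], ]
high <- news[news$comments >= cut_points[["80%"]], ]

# Each subset could then be saved and imported into SAS Enterprise Miner,
# for example with write.csv(high, "high_comments.csv", row.names = FALSE).
```

Ties at a cut point fall into the corresponding subset, so each group can be slightly larger than exactly 20% of the rows.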

You are required to:

a) Prepare a report for Case Study A with all the analytics results for the above two key tasks. (You can use an appendix for any additional screenshots, figures and tables that you feel are important for the report.) The report should be named as:
<student_id>Assignment1A_Report.doc
b) Save the R script after Task 1 above as: <student_id>Assignment1A.r
c) Save the SAS project for Task 2 above as <student_id>Assignment1_Task1.spk. You may zip the SPK files if you have more than one. The detailed procedure for exporting a model package SPK file can be found in the Assignment 1 Additional Technical Support file. The SAS project file should be named as:

<student_id>Assignment1_SAS1.zip

Case Study B

Sentiment analysis is a technique that aims to gauge the attitudes of customers towards topics, products and services of interest. It is a pivotal technology for providing insights that enhance the business bottom line in campaign tracking, customer-centric marketing strategy and brand awareness. Sentiment analytics approaches are used to produce sentiment categories such as 'positive', 'negative' and 'neutral'. More specific human emotions are also a topic of interest. There are two major streams of methods for developing a sentiment analytics engine: dictionary-based and machine learning-based approaches. In this assignment, you are required to perform sentiment analytics using both approaches.

Task Requirements

As a data scientist, you are required to perform a number of data analytics tasks. You are tasked with developing both dictionary-based and machine learning-based sentiment analytics engines using the R programming language and applying them to predict the sentiments of Apple product review tweets from a sample of data. You are also required to use SAS Sentiment Analysis Studio to compare the results.

To achieve the above, you need to carry out the following data analytics tasks:

1. Develop a dictionary-based sentiment analytics engine based on the R libraries 'syuzhet' and 'tidytext' to analyse the different emotions in Apple review tweets.
• Analyse and aggregate the eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) from the Apple review tweets file 'apple_review.csv' using the function 'get_nrc_sentiment'. (You should combine the negative and positive tweets into one set before conducting the analysis. Additionally, you are required to plot a chart to visualise these emotions using the R library 'ggplot2'.)
• Find the top 5 most frequent words in all the Apple product reviews for each of the eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust). You are required to analyse the results.
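The idea behind dictionary-based emotion scoring can be sketched in base R. This is an illustration only: the tiny `lexicon` and the three `tweets` below are invented, standing in for the NRC lexicon and 'apple_review.csv'. In the assignment itself, 'get_nrc_sentiment' from 'syuzhet' would return the per-tweet emotion counts as a data frame with one column per emotion.

```r
# Toy word-to-emotion dictionary (hypothetical; the real NRC lexicon is far larger).
lexicon <- data.frame(
  word = c("love", "great", "hate", "broken", "afraid", "wow"),
  emotion = c("joy", "trust", "anger", "sadness", "fear", "surprise"),
  stringsAsFactors = FALSE
)

# Invented example tweets standing in for the combined review file.
tweets <- c("I love my new phone, great screen",
            "I hate this broken charger",
            "wow, love the camera")

emotions <- c("anger", "anticipation", "disgust", "fear",
              "joy", "sadness", "surprise", "trust")

# Lower-case the text and split it into words.
tokenize <- function(x) {
  words <- unlist(strsplit(tolower(x), "[^a-z']+"))
  words[nzchar(words)]
}
tokens <- tokenize(tweets)

# Keep only tokens found in the dictionary, tagged with their emotion.
matched <- merge(data.frame(word = tokens, stringsAsFactors = FALSE),
                 lexicon, by = "word")

# Aggregate: total matched words per emotion (zero when nothing matched).
emotion_totals <- vapply(emotions,
                         function(e) sum(matched$emotion == e),
                         integer(1))

# Top 5 most frequent words within each emotion that occurs.
word_counts <- aggregate(n ~ word + emotion,
                         data = transform(matched, n = 1L), FUN = sum)
top_words <- lapply(split(word_counts, word_counts$emotion),
                    function(d) head(d[order(-d$n), c("word", "n")], 5))
```

The named vector `emotion_totals` could be converted to a data frame and passed to 'ggplot2' (for example with `geom_col`) to produce the required chart.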

2. Develop a machine learning-based model using the R libraries 'tm' and 'e1071', and evaluate the predictive accuracy of the SVM classifier.
• Develop R scripts and import the dataset 'apple_review.csv' for training and testing.
• Use the first 120 negative tweets and the first 120 positive tweets as the training dataset, and use the remaining 23 negative tweets and 23 positive tweets as the testing dataset.
(Hint: You may need to use the as.character() function to convert a dataframe column from factors to characters.)
• Develop a machine learning-based sentiment analytics engine and predict sentiment categories (only 'positive' and 'negative') using 'tm' and 'e1071' with the SVM classifier.
• Evaluate the testing accuracy and report the predicted results.
• Evaluate the testing accuracies and report the predicted results.
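The train/test split and the accuracy calculation described above can be sketched in base R. The data frame is a synthetic stand-in for 'apple_review.csv', the column names `text` and `sentiment` are hypothetical, and the keyword rule that produces `predicted` is only a placeholder for the predictions an 'e1071' SVM would return.

```r
# Synthetic stand-in: 143 negative tweets followed by 143 positive tweets.
reviews <- data.frame(
  text = c(paste("bad battery issue", seq_len(143)),
           paste("love great screen", seq_len(143))),
  sentiment = factor(rep(c("negative", "positive"), each = 143))
)
# Convert the text column to character, as the hint above suggests.
reviews$text <- as.character(reviews$text)

neg <- reviews[reviews$sentiment == "negative", ]
pos <- reviews[reviews$sentiment == "positive", ]

# First 120 of each class for training, the remaining 23 of each for testing.
train <- rbind(head(neg, 120), head(pos, 120))
test  <- rbind(tail(neg, -120), tail(pos, -120))

# Placeholder predictions; a trained SVM would supply these instead.
predicted <- ifelse(grepl("love|great", test$text), "positive", "negative")

# Confusion matrix and overall testing accuracy.
confusion <- table(actual = test$sentiment, predicted = predicted)
accuracy <- mean(as.character(test$sentiment) == predicted)
```

Reporting the confusion matrix alongside the accuracy shows which class the classifier gets wrong, which a single accuracy figure hides.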

3. Develop a statistical model using SAS Sentiment Analysis Studio and evaluate the accuracies (5%).
• Use the data folder 'apple_review', which contains 'negative' and 'positive' tweets for training and testing.
• Build a statistical model using SAS Sentiment Analysis Studio (either simple or advanced). You may need to change configurations in the advanced model to obtain the best training accuracy; keep a record of how you managed to improve the accuracy.
(Hint: Refer to the SAS Sentiment Analysis Studio tutorial.)
• Evaluate and compare the testing accuracies for different models and report the results.
• Compare this result with the previous predictive results using R and discuss (Note: the Apple review tweets used in this task are the same tweets as in Task 2).

You are required to:

a) Prepare a report for Case Study B with all the analytics results for the above three key tasks. (You can use an appendix for any additional screenshots that you feel are important for the report.) The report should be named as:
<student_id>Assignment1B_Report.doc
b) Save the R script after Task 2 above as: <student_id>Assignment1B.r
c) Save the SAS Sentiment Studio project (detailed saving procedures can be found in Assignment 1 Additional Technical Support file) as:
<student_id>Assignment1_SAS2.zip

Attachment:- Customer Analytics and Social Media.rar
