Social Media Analysis for Understanding Customer Preferences

Reference no: EM132420634

BUS5CA - Customer Analytics and Social Media - La Trobe University

Assignment - Social Media Analysis for Understanding Customer Preferences and Sentiments

Learning Objective:

The learning objective of Assignment 1 is to further develop your understanding of, and skills in, social media analytics by analysing two case studies:

1. Case Study A: you will work as a social marketing analyst in a consulting company to uncover the impacts of online advertising and communication with customers. The aim of the study is to educate the marketing teams of the company's clients (in diverse industries) on how to market their products and/or services on social media to maximise customers' involvement (positive interest and sharing). The company is interested in finding out the relationship between keywords, shares and sentiments, and whether that relationship holds in different topic categories (such as entertainment, technology and business) that are of interest to clients in various industries.

2. Case Study B: you will be a data scientist working for a hotel review firm to develop a sentiment analytics engine for Twitter, which is used to predict consumers' review sentiments. The aim is to develop both dictionary-based and machine learning-based sentiment analytics scripts using a number of R libraries and SAS Sentiment Analysis Studio (covered in the workshop activities in Week 4 and Week 5). You are required to use the developed engine to predict hotel reviewers' sentiments and to benchmark various algorithms and analytics tools.

Case Study A

Leveraging the power of content and social media marketing can help grow the audience and customer base dramatically. However, using social media for marketing without any previous experience or insight can be challenging. It is vital for a marketing team to understand social media marketing fundamentals. If a company publishes exciting, high-quality content and builds an online audience of quality followers, those followers can share it with their own audiences on Twitter, Facebook, LinkedIn, Google+, their own blogs and many other social media platforms. This sharing and discussing of content opens up new entry points for search engines like Google to find it in a keyword search. Those entry points could grow to hundreds or thousands or more potential ways for people to find a company, product or service online. Finding and understanding the online influencers in the market who have quality audiences and are likely to be interested in the product, service or business could make a huge positive impact.

The consulting company collected information on articles that were shared by people on social media. The dataset contains approximately 39,000 articles, and a total of 31 features were extracted from the HTML code of each article, including its title and content. (The description of the dataset is provided as an appendix.) Some of the features depend on characteristics of the service used and can be analysed from the metadata provided with each article, such as keywords, data channel type and the total number of shares (on Facebook, Twitter, Google+, LinkedIn, Pinterest). The data channel categories are 'Lifestyle', 'Business', 'Entertainment', 'Social Media', 'Technology' and 'World'. In addition, several natural language processing features were extracted.

Task Requirements

As a data analytics team member for the consultancy firm, you are required to carry out a number of data analytics tasks for the consulting company using the data collected. You are given access to a sample of the data where some of the variables have been removed as they are not considered important for the analysis of this assignment.

For each data channel, the company wants you to:
- Investigate the impact of the article properties on sharing;
- Use the SAS Text Miner for text analysis to identify key features in the articles and analyse their contribution towards low and high sharing.

To achieve the above, you need to carry out the following data analytics tasks:

a) Task 1: Explore the impact of article properties

Explore the data and investigate what properties of the article correlate with the high number of shares of the article on social media.

- Open the dataset ‘online_news_popularity.xlsx' using Microsoft Excel.

- Explore the dataset to understand and manage the six types of data channels (lifestyle, entertainment, bus, socmed, tech, world) and their associated data. In each data channel column, a value of 1 indicates that the row belongs to the corresponding data channel.
- Copy the data for each channel to a separate Excel sheet (sort and filter by each data channel column to separate them).
- In each data channel, identify the articles with a high number of shares (using the top 10% in the dataset as the threshold).
- Investigate the following properties and explain how they could have affected the high number of shares. Support your argument with an explanation.
o Number of tokens in the title
o Number of tokens in the content
o Whether the article was published on the weekend
o Number of links
o Number of images
o Number of videos
(Hint: To do this, you can create plots in R between the corresponding columns and the number of shares. You may want to include a fitted line to your plots to investigate the correlation for continuous variables.)
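
The steps in Task 1 can be sketched in code. The following is only an illustration of the logic, written in Python rather than the R and Excel workflow the assignment asks for. The rows are made up, and the column names `channel`, `n_tokens_title` and `shares` are assumptions rather than the actual headers in the spreadsheet. It filters one data channel, finds the share count that marks the top 10%, and computes the Pearson correlation and least-squares line that a fitted line on a plot would show.

```python
import math
from statistics import mean

# Made-up rows for illustration only; the column names are assumptions.
rows = [
    {"channel": "tech", "n_tokens_title": t, "shares": s}
    for t, s in [(6, 500), (7, 700), (8, 900), (9, 1100), (10, 1300),
                 (11, 1500), (12, 1700), (13, 1900), (14, 2100), (15, 9000)]
] + [{"channel": "world", "n_tokens_title": 10, "shares": 400}]


def top_decile_threshold(values):
    """Smallest value that still falls inside the top 10% of `values`."""
    ordered = sorted(values, reverse=True)
    k = max(1, (len(ordered) * 10 + 99) // 100)  # integer ceiling of 10%
    return ordered[k - 1]


def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for the least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a column is constant")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return sxy / math.sqrt(sxx * syy), slope, my - slope * mx


# Keep one data channel, then flag its highly shared articles.
tech = [r for r in rows if r["channel"] == "tech"]
threshold = top_decile_threshold([r["shares"] for r in tech])
high = [r for r in tech if r["shares"] >= threshold]

# Relate one article property to the number of shares.
r_value, slope, intercept = pearson_and_line(
    [r["n_tokens_title"] for r in tech],
    [r["shares"] for r in tech],
)
print(f"threshold={threshold}, high-share articles={len(high)}")
print(f"r={r_value:.2f}, shares = {slope:.1f} * tokens + {intercept:.1f}")
```

A yes/no property such as weekend publication is better compared by group (for example, the share of high-share articles published on weekends) than by a fitted line.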

b) Task 2: Use SAS Text Miner for keyword analysis

- Use SAS Text Miner to extract the keywords from the title in each data channel. (Hint: Refer to the workshop activities in Week 3 and Week 4, and set the 'Title' column as the only 'Text' role in the variable settings.)

- What are the ten most used topics in each category? Use the SAS Result window to explain your answers.
(Hint: The 'Topic' column will need to be set as the only 'Text' role.)

- Are there common topics which span across data channels and relate to a high number of shares and a low number of shares? Use the whole dataset in the SAS Text Miner to identify the relationship. You should provide the explanation to support your argument.

(Hint: Use the whole dataset to identify the articles with the high number of shares and the low number of shares - by using appropriate thresholds with the top 10% and the bottom 10% in the dataset. Separate the dataset using Excel based on this before the analysis and use these two datasets to analyse the common topics in each of them. In this question, please use ‘Title' column as the only ‘Text' role for topic modelling.)
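
The separation described in this hint can be sketched as follows. This is an illustrative Python sketch with made-up titles and share counts. It uses plain word counting as a rough stand-in for topic extraction and is not what SAS Text Miner does internally. The function names and the small stopword list are assumptions for the example.

```python
from collections import Counter

# Made-up (title, shares) pairs for illustration only.
articles = [
    ("Apple phone launch", 9000),
    ("Markets rally again", 4000),
    ("Film awards night", 3500),
    ("World cup final", 3000),
    ("Healthy breakfast ideas", 2500),
    ("Startup funding news", 2000),
    ("Celebrity interview", 1500),
    ("Election results", 1000),
    ("Weekend travel guide", 500),
    ("Phone battery tips", 10),
]

STOPWORDS = frozenset({"the", "a", "an", "of", "to", "in", "for", "and"})


def split_by_shares(items, percent=10):
    """Return (high, low): the top and bottom `percent` of items ranked by shares."""
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    k = max(1, (len(ranked) * percent + 99) // 100)  # integer ceiling
    return ranked[:k], ranked[-k:]


def title_terms(items):
    """Count the non-stopword terms that appear in the titles of `items`."""
    counts = Counter()
    for title, _shares in items:
        counts.update(w for w in title.lower().split() if w not in STOPWORDS)
    return counts


high, low = split_by_shares(articles)
high_terms = title_terms(high)
low_terms = title_terms(low)

# Terms that occur in both the high-share and the low-share titles.
common = set(high_terms) & set(low_terms)
print("common terms:", sorted(common))
```

A term that appears in both groups, like `phone` in this toy data, does not by itself separate high-share from low-share articles, which is the kind of observation the question asks you to explain.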

You are required to:

a) Prepare a report for Case Study A with all the analytics results for the above two key tasks. (You can use an appendix for any additional screenshots, figures and tables which you feel are important for the report.) The report should be named as:
b) Save the R script after Task 1 above as: <student_id>Assignment1A.r
c) Save the SAS project for Task 2 above as <student_id>Assignment1_Task1.spk. You may zip the SPK files if you have more than one. The SAS project file should be named as:

Case Study B

Sentiment analysis is a technique that aims to gauge the attitudes of customers towards topics, products and services of interest. It is a pivotal technology for providing insights to enhance the business bottom line in campaign tracking, customer-centric marketing strategy and brand awareness. Sentiment analytics approaches are used to produce sentiment categories such as 'positive', 'negative' and 'neutral'. More specific human emotions are also a topic of interest. There are two major streams of methods for developing a sentiment analytics engine: the dictionary-based and the machine learning-based approaches. In this assignment, you are required to perform sentiment analytics using both approaches.

Task Requirements

As a data scientist, you are required to perform a number of data analytics tasks. You are tasked with developing both dictionary-based and machine learning-based sentiment analytics engines using the R programming language and applying them to predict the sentiments of hotel review tweets from a sample of data. You are also required to use SAS Sentiment Analysis Studio to compare the results.

To achieve the above, you need to carry out the following data analytics tasks:

Task 1. Develop a dictionary-based sentiment analytics engine based on the R library 'syuzhet' to analyse the different emotions from hotel review tweets.

• Analyse and aggregate the eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) from the hotel review tweets file 'hotel_tweets.csv' using the function 'get_nrc_sentiment'.

• You are required to plot a chart to visualise these emotions using the R library 'ggplot2'.

• You should combine both negative and positive tweets into one before conducting the analysis.
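
The dictionary-based idea behind Task 1 can be sketched as follows. This is an illustrative Python sketch, not the R 'syuzhet' workflow the task requires. `TOY_LEXICON` is a tiny hand-made word list and is not the NRC lexicon, and the two tweets are made up. It combines the negative and positive tweets, looks each word up in the lexicon, and aggregates the counts for the eight emotions, with a text bar chart standing in for a ggplot2 chart.

```python
from collections import Counter

EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

# Tiny hand-made lexicon for illustration only; it is NOT the NRC lexicon.
TOY_LEXICON = {
    "dirty": ["disgust"],
    "rude": ["anger"],
    "lovely": ["joy"],
    "friendly": ["joy", "trust"],
    "unexpected": ["surprise"],
}


def emotion_counts(text):
    """Count emotion hits for one tweet by looking each word up in the lexicon."""
    counts = Counter()
    for word in text.lower().split():
        word = word.strip(".,!?;:")
        for emotion in TOY_LEXICON.get(word, []):
            counts[emotion] += 1
    return counts


# Made-up tweets; the real ones come from the tweets file.
negative = ["Dirty room and rude staff."]
positive = ["Lovely view, friendly staff!"]
all_tweets = negative + positive  # combine both groups before analysing

# Aggregate the emotion counts over every tweet.
totals = Counter()
for tweet in all_tweets:
    totals.update(emotion_counts(tweet))

# Text bar chart standing in for a plotted chart.
for emotion in EMOTIONS:
    print(f"{emotion:<13}{'#' * totals[emotion]}")
```

Because a single word can map to several emotions (here `friendly` counts towards both joy and trust), the totals across emotions can exceed the number of matched words.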

Task 2. Develop a machine learning-based model using the R libraries 'tm' and 'e1071' and evaluate the predictive accuracy of the SVM classifier.

• Develop R scripts and import the data set 'hotel_tweets.csv' for training and testing.
• Use the first 200 negative tweets and the first 200 positive tweets as the training dataset, and use the remaining 63 negative tweets and 63 positive tweets as the testing dataset.
(Hint: You may need to use the as.character() function to convert a dataframe column from factors to characters.)
• Develop a machine learning-based sentiment analytics engine and predict sentiment categories (only 'positive' and 'negative') using 'tm' and 'e1071' with the SVM classifier.
• Evaluate the testing accuracies and report the predicted results.
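
The split and the accuracy calculation in Task 2 can be sketched as follows. This is an illustrative Python sketch that does not train an SVM. The function `always_positive` is a hypothetical baseline standing in for a trained classifier, and the tweets are placeholders. The class size of 263 tweets is inferred from the 200 training and 63 testing tweets stated in the task.

```python
def split_first_n(tweets, n_train):
    """Use the first n_train tweets for training and the rest for testing."""
    return tweets[:n_train], tweets[n_train:]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def always_positive(text):
    """Hypothetical baseline, NOT an SVM: label every tweet 'positive'."""
    return "positive"


# Placeholder tweets: 263 per class (200 training + 63 testing).
negative = [f"negative tweet {i}" for i in range(263)]
positive = [f"positive tweet {i}" for i in range(263)]

neg_train, neg_test = split_first_n(negative, 200)
pos_train, pos_test = split_first_n(positive, 200)

# Labelled (text, label) pairs for each partition.
train = [(t, "negative") for t in neg_train] + [(t, "positive") for t in pos_train]
test = [(t, "negative") for t in neg_test] + [(t, "positive") for t in pos_test]

# Score the stand-in predictor on the held-out tweets only.
predicted = [always_positive(text) for text, _label in test]
actual = [label for _text, label in test]
baseline_accuracy = accuracy(predicted, actual)
print(f"train={len(train)}, test={len(test)}, baseline accuracy={baseline_accuracy:.2f}")
```

With balanced test classes, a constant prediction gets exactly half of the test tweets right, so a trained classifier should be judged against that baseline rather than against zero.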

Task 3. Develop a statistical model using SAS Sentiment Analysis Studio and evaluate the accuracies (5%).
• Use the data folder 'hotel_tweets', which contains 'negative' and 'positive' tweets for training and testing.
• Build a statistical model using SAS Sentiment Analysis (either simple or advanced). You may change configurations in the advanced model to obtain the best training accuracy.
(Hint: Refer to the SAS Sentiment Analysis Studio tutorial.)
• Evaluate and compare the testing accuracies for different models and report the results.
• Compare this result with the previous predictive results using R and discuss.

You are required to:

a) Prepare a report for Case Study B with all the analytics results to the above three key tasks. (You can use an appendix for any additional screenshots which you feel are important for the report). The report should be named as:
<student_id>Assignment1B_Report.doc
b) Save the R script after Task 2 above as: <student_id>Assignment1B.r
c) Save the SAS Sentiment Studio project as: <student_id>Assignment1_SAS2.zip

Attachment:- Customer Analytics and Social Media.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd