Social Media Analysis for Understanding Customer Preferences

Reference no: EM132420634

BUS5CA - Customer Analytics and Social Media - La Trobe University

Assignment - Social Media Analysis for Understanding Customer Preferences and Sentiments

Learning Objective:

The learning objective of Assignment 1 is to further develop your understanding of, and skills in, social media analytics by analysing two case studies:

1. Case Study A: you will work as a social marketing analyst in a consulting company to uncover the impact of online advertising and communication with customers. The aim of the study is to educate the marketing teams of the company's clients (in diverse industries) on how to market their products and/or services on social media to maximise customer involvement (positive interest and sharing). The company is interested in the relationship between keywords, shares and sentiments, and in whether such a relationship exists within different topic categories (entertainment, technology, business, etc.) that are of interest to clients in various industries.

2. Case Study B: you will be a data scientist working for a hotel review firm to develop a sentiment analytics engine for Twitter that predicts the sentiment of consumers' reviews. The aim is to develop both dictionary-based and machine learning-based sentiment analytics scripts using a number of R libraries and SAS Sentiment Analysis Studio (covered in the workshop activities in Week 4 and Week 5). You are required to use the developed engine to predict hotel reviewers' sentiments and to benchmark various algorithms and analytics tools.

Case Study A

Leveraging the power of content and social media marketing can dramatically grow a company's audience and customer base. However, using social media for marketing without any previous experience or insight can be challenging, so it is vital for a marketing team to understand social media marketing fundamentals. If a company publishes exciting, high-quality content and builds an online audience of quality followers, those followers can share it with their own audiences on Twitter, Facebook, LinkedIn, Google+, their own blogs and many other social media platforms. This sharing and discussion of content opens up new entry points for search engines like Google to find it in a keyword search. Those entry points could grow to hundreds, thousands or more potential ways for people to find a company, product or service online. Finding and understanding the online influencers in the market who have quality audiences and are likely to be interested in the product, service or business could make a huge positive impact.

The consulting company collected information on articles that were shared by people on social media. The dataset contains approximately 39,000 articles, and a total of 31 features were extracted from the HTML code of each article, including its title and content. (The description of the dataset is provided as an appendix.) Some of the features depend on characteristics of the service used and can be analysed using the metadata provided with each article, such as keywords, data channel type and the total number of shares (on Facebook, Twitter, Google+, LinkedIn, Pinterest). The data channel categories are: 'Lifestyle', 'Business', 'Entertainment', 'Social Media', 'Technology', and 'World'. In addition, several natural language processing features were extracted.

Task Requirements

As a data analytics team member for the consultancy firm, you are required to carry out a number of data analytics tasks for the consulting company using the data collected. You are given access to a sample of the data where some of the variables have been removed as they are not considered important for the analysis of this assignment.

For each data channel, the company wants you to:
- Investigate the impact of the article properties on sharing;
- Use SAS Text Miner for text analysis to identify key features in the articles and analyse their contribution towards low and high sharing.

To achieve the above, you need to carry out the following data analytics tasks:

1. Task 1: Explore the impact of article properties

Explore the data and investigate which properties of an article correlate with a high number of shares on social media.

- Open the dataset 'online_news_popularity.xlsx' using Microsoft Excel.

- Explore the dataset to understand and manage the six types of data channels (lifestyle, entertainment, bus, socmed, tech, world) and the associated data. In each data channel column, a value of 1 indicates that the row belongs to the corresponding data channel.
- Copy the data for each channel to a separate Excel sheet (sort and filter by each data channel to separate them).
- In each data channel, identify the articles with a high number of shares (using the top 10% in the dataset as the threshold).
- Investigate the following properties and explain how they could have affected the high number of shares. Provide an explanation to support your argument.
o Number of tokens in the title
o Number of tokens in the content
o Was the article published on the weekend
o Number of links
o Number of images
o Number of videos
(Hint: To do this, you can create plots in R between the corresponding columns and the number of shares. You may want to include a fitted line to your plots to investigate the correlation for continuous variables.)
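
The hint above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame is synthetic toy data, the column names (n_tokens_title, num_imgs, is_weekend, shares) are placeholders to be replaced with the names given in the dataset appendix, and the link between images and shares is built into the toy data, so it says nothing about the real articles.

```r
# Toy data standing in for one data channel sheet.
# Column names are assumptions; use the names from the dataset appendix.
set.seed(42)
articles <- data.frame(
  n_tokens_title = sample(5:15, 200, replace = TRUE),
  num_imgs       = sample(0:20, 200, replace = TRUE),
  is_weekend     = sample(0:1, 200, replace = TRUE)
)
# Synthetic share counts: the dependence on num_imgs is invented for the demo.
articles$shares <- round(500 + 40 * articles$num_imgs + rexp(200, rate = 1 / 800))

# Articles at or above the 90th percentile of shares form the "high shares" group.
threshold <- quantile(articles$shares, probs = 0.9)
articles$high_shares <- articles$shares >= threshold

# Continuous property: scatter plot with a fitted least-squares line.
plot(articles$num_imgs, articles$shares,
     xlab = "Number of images", ylab = "Number of shares")
fit <- lm(shares ~ num_imgs, data = articles)
abline(fit, col = "red")

# Binary property: proportion of high-sharing articles by weekend flag.
weekend_rate <- tapply(articles$high_shares, articles$is_weekend, mean)
print(weekend_rate)
```

The same pattern could be repeated for each property in the list, with the scatter plot and fitted line for the count-type properties and the grouped proportion for the weekend flag.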

2. Task 2: Use SAS Text Miner for keyword analysis

- Use SAS Text Miner to extract the keywords from the title in each data channel. (Hint: To do this, you can refer to the workshop activities in Week 3 and Week 4, setting the 'Title' column as the only 'Text' role in the variable settings.)

- What are the most frequently used (top 10) topics in each category? Use the SAS Result window to explain your answers.
(Hint: The 'Topic' column will need to be set as the only 'Text' role.)

- Are there common topics which span data channels and relate to a high number of shares or a low number of shares? Use the whole dataset in SAS Text Miner to identify the relationship. Provide an explanation to support your argument.

(Hint: Use the whole dataset to identify the articles with a high number of shares and those with a low number of shares, using the top 10% and the bottom 10% of the dataset as thresholds. Separate the dataset in Excel on this basis before the analysis, and use the two resulting datasets to analyse the common topics in each of them. In this question, please use the 'Title' column as the only 'Text' role for topic modelling.)
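
The brief asks for this split to be done in Excel; for anyone who prefers to script it, an equivalent split might look like the base R sketch below. The data frame is synthetic, the column names 'title' and 'shares' are assumptions, and the output file names are hypothetical. It also assumes your SAS setup can import CSV files.

```r
# Toy stand-in for the full dataset; 'title' and 'shares' are assumed column names.
set.seed(1)
news <- data.frame(
  title  = paste("Article", 1:100),
  shares = sample(1:5000, 100)   # distinct values, so no ties at the cut-offs
)

# 10th and 90th percentiles of shares act as the low and high thresholds.
cuts <- quantile(news$shares, probs = c(0.1, 0.9))
low_shares  <- news[news$shares <= cuts[1], ]
high_shares <- news[news$shares >= cuts[2], ]

# One file per group (hypothetical file names), written to a temporary folder.
write.csv(low_shares,  file.path(tempdir(), "low_shares.csv"),  row.names = FALSE)
write.csv(high_shares, file.path(tempdir(), "high_shares.csv"), row.names = FALSE)
```

With real data, ties at the thresholds can make the two groups slightly larger than 10% each, which is worth noting in the report if it happens.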

You are required to:

a) Prepare a report for Case Study A with all the analytics results for the above two key tasks. (You can use an appendix for any additional screenshots, figures and tables which you feel are important for the report.) The report should be named as:
b) Save the R script after Task 1 above as: <student_id>Assignment1A.r
c) Save the SAS project for Task 2 above as <student_id>Assignment1_Task1.spk. You may zip the SPK files if you have more than one. The SAS project file should be named as:

Case Study B

Sentiment analysis is a technique that aims to gauge the attitudes of customers towards topics, products and services of interest. It is a pivotal technology for providing insights that enhance the business bottom line in campaign tracking, customer-centric marketing strategy and brand awareness. Sentiment analytics approaches are used to produce sentiment categories such as 'positive', 'negative' and 'neutral'. More specific human emotions are also a topic of interest. There are two major streams of methods for developing a sentiment analytics engine: dictionary-based and machine learning-based approaches. In this assignment, you are required to perform sentiment analytics using both approaches.

Task Requirements

As a data scientist, you are required to perform a number of data analytics tasks. You are tasked with developing both dictionary-based and machine learning-based sentiment analytics engines using the R programming language and applying them to predict the sentiments of hotel review tweets from a sample of data. You are also required to use SAS Sentiment Analysis Studio to compare the results.

To achieve the above, you need to carry out the following data analytics tasks:

Task 1. Develop a dictionary-based sentiment analytics engine based on the R library 'syuzhet' to analyse the different emotions from hotel review tweets.

• Analyse and aggregate the eight emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) from the hotel review tweets file 'hotel_tweets.csv' using the function 'get_nrc_sentiment'.

• You are required to plot a chart to visualise these emotions using the R library 'ggplot2'.

• You should combine both negative and positive tweets into one before conducting the analysis.
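
A minimal sketch of these three steps is shown below, assuming the 'syuzhet' and 'ggplot2' packages are installed. The tweets are made-up examples rather than rows from 'hotel_tweets.csv', since the brief does not name the column holding the tweet text; the variable names are illustrative only.

```r
library(syuzhet)
library(ggplot2)

# Made-up example tweets; in the assignment these would be read from
# 'hotel_tweets.csv' (the text column name is not given in the brief).
negative_tweets <- c("The room was dirty and the staff were rude.",
                     "Terrible noise all night, I am angry and disappointed.")
positive_tweets <- c("Lovely hotel, friendly staff and a wonderful breakfast.",
                     "Great location, we really enjoyed our stay.")

# Combine both groups into one character vector before scoring.
tweets <- as.character(c(negative_tweets, positive_tweets))

# One row per tweet; columns are the eight NRC emotions plus negative/positive.
nrc <- get_nrc_sentiment(tweets)

# Aggregate only the eight emotion columns across all tweets.
emotions <- c("anger", "anticipation", "disgust", "fear",
              "joy", "sadness", "surprise", "trust")
totals <- data.frame(emotion = emotions,
                     count   = colSums(nrc[, emotions]))

# Bar chart of the aggregated emotion counts.
p <- ggplot(totals, aes(x = emotion, y = count, fill = emotion)) +
  geom_col(show.legend = FALSE) +
  labs(x = "Emotion", y = "Total count",
       title = "NRC emotions in hotel review tweets")
print(p)
```

Selecting the eight emotion columns by name keeps the 'negative' and 'positive' columns that get_nrc_sentiment also returns out of the chart.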

Task 2. Develop a machine learning-based model using the R libraries 'tm' and 'e1071', and evaluate the predictive accuracy of the SVM classifier.

• Develop R scripts and import the data set 'hotel_tweets.csv' for training and testing.
• Use the first 200 negative tweets and the first 200 positive tweets as the training dataset, and use the remaining 63 negative tweets and 63 positive tweets as the testing dataset.
(Hint: You may need to use the as.character() function to convert a dataframe column from factors to characters.)
• Develop a machine learning-based sentiment analytics engine and predict sentiment categories (only 'positive' and 'negative') using 'tm' and 'e1071' with the SVM classifier.
• Evaluate the testing accuracies and report the predicted results.
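
One possible shape for such a script is sketched below, assuming 'tm' and 'e1071' are installed. The tweets are invented stand-ins for 'hotel_tweets.csv', n_train is set to 3 only so the toy example runs (it would be 200 for the assignment data), and the linear kernel and the helper name to_corpus are choices made for this sketch, not requirements of the brief.

```r
library(tm)
library(e1071)

# Made-up tweets standing in for 'hotel_tweets.csv' (real column names are not given).
neg <- c("dirty room and rude staff", "terrible noisy hotel",
         "awful service, rude staff", "bad smell and dirty bathroom")
pos <- c("lovely room and friendly staff", "great clean hotel",
         "wonderful service, friendly staff", "excellent breakfast and clean bathroom")

# First n_train tweets of each class for training, the remainder for testing
# (n_train would be 200 for the assignment data, leaving 63 per class).
n_train <- 3
train_text   <- c(neg[1:n_train], pos[1:n_train])
train_labels <- factor(rep(c("negative", "positive"), each = n_train))
test_text    <- c(neg[-(1:n_train)], pos[-(1:n_train)])
test_labels  <- factor(rep(c("negative", "positive"), each = length(neg) - n_train),
                       levels = levels(train_labels))

# Hypothetical helper: lower-case the text and strip punctuation.
to_corpus <- function(x) {
  corpus <- VCorpus(VectorSource(x))
  corpus <- tm_map(corpus, content_transformer(tolower))
  tm_map(corpus, removePunctuation)
}

# Build the vocabulary on the training tweets only, then reuse it for the
# test tweets so both matrices have identical columns.
train_dtm <- DocumentTermMatrix(to_corpus(train_text))
test_dtm  <- DocumentTermMatrix(to_corpus(test_text),
                                control = list(dictionary = Terms(train_dtm)))

# Train a linear SVM on term counts and predict the held-out tweets.
model <- svm(x = as.matrix(train_dtm), y = train_labels,
             kernel = "linear", scale = FALSE)
predicted <- predict(model, as.matrix(test_dtm))

# Confusion matrix and overall testing accuracy.
confusion <- table(actual = test_labels, predicted = predicted)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

Restricting the test matrix to the training dictionary matters because predict() expects the same columns the model was trained on; words that appear only in the test tweets are simply ignored.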

Task 3. Develop a statistical model using SAS Sentiment Analysis Studio and evaluate the accuracies (5%).
• Use the data folder 'hotel_tweets', which contains 'negative' and 'positive' tweets for training and testing.
• Build a statistical model using SAS Sentiment Analysis (either simple or advanced). You may change configurations in the advanced model to obtain the best training accuracy.
(Hint: Refer to the SAS Sentiment Analysis Studio tutorial.)
• Evaluate and compare the testing accuracies for different models and report the results.
• Compare this result with the previous predictive results using R and discuss.

You are required to:

a) Prepare a report for Case Study B with all the analytics results for the above three key tasks. (You can use an appendix for any additional screenshots which you feel are important for the report.) The report should be named as:
<student_id>Assignment1B_Report.doc
b) Save the R script after Task 2 above as: <student_id>Assignment1B.r
c) Save the SAS Sentiment Studio project as: <student_id>Assignment1_SAS2.zip

Attachment:- Customer Analytics and Social Media.rar

