Reference no: EM133102197
DLE602 Deep Learning - Torrens University Australia
Assessment - Programming Problems
Learning Outcome 1: Build, train and apply deep learning models to real-world tasks.
Learning Outcome 2: Compare and select ways to pre-process signals, images, and texts for natural language, speech recognition, and computer vision applications.
Task Summary
For this assessment, you will undertake a Twitter sentiment analysis using an N-Gram model, as described in the article entitled ‘Deep Convolution Neural Networks for Twitter Sentiment Analysis’ by Zhao, Gui and Zhang (2018). You can access this article at: Use any two of the five datasets used in this paper and implement Twitter sentiment analysis using the Python programming language. Identify and report on the similarities or dissimilarities of the outcomes for the two different sources.
Context
Twitter Sentiment Analysis is an automated process whereby text data from Twitter is analysed and segmented into different sentiments (e.g., positive, negative or neutral sentiments). Performing a sentiment analysis on data from Twitter using deep learning models can help organisations understand how people are talking about their brand.
In the above-mentioned paper, Zhao, Gui and Zhang (2018) introduced the concept of using Deep Convolution Neural Networks for Twitter Sentiment Analysis. The authors also briefly described how the N-Gram model applies to the process. They conclude that Deep Convolution Neural Networks, which use pre-trained word vectors, can perform the task of Twitter sentiment analysis well. They used five different datasets to prove their point.
You will focus on the development of a basic Twitter Sentiment Analysis system using the N-Gram probabilistic language model. You will demonstrate your understanding of language processing models and your ability to develop systems using those models. You will also demonstrate your communication skills by drafting a short report.
Task Instructions
To complete this assessment task, you will need to read the article entitled ‘Deep Convolution Neural Networks for Twitter Sentiment Analysis’ (Zhao, Gui & Zhang, 2018) closely.
You are NOT expected to reproduce all the experiments completed in this paper. This paper is provided as a reference to enable you to better understand the context of this assessment and provide you with an idea of the quality of research papers that you need to read as part of Assessments 2 and 3.
The only task you are required to complete in this assessment is to develop a Twitter sentiment analysis technique that uses an N-Gram probabilistic language model.
Your aim is to be able to analyse any Twitter text and classify it into one of several sentiments, such as positive, negative or neutral.
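The idea of an N-Gram probabilistic language model can be illustrated with a minimal bigram sketch. It is not the method of the reference paper and not a required design; the function names, the `<s>` and `</s>` padding markers, the add-one (Laplace) smoothing and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def extract_ngrams(tokens, n=2):
    """Return the list of n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_bigram_model(sentences):
    """Count unigrams and bigrams over a list of tokenised sentences."""
    unigram_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        # Assumed padding markers so that the first and last words have context.
        padded = ["<s>"] + tokens + ["</s>"]
        unigram_counts.update(padded)
        bigram_counts.update(extract_ngrams(padded, 2))
    return unigram_counts, bigram_counts


def bigram_probability(prev_word, word, unigram_counts, bigram_counts):
    """Estimate P(word | prev_word) with add-one (Laplace) smoothing."""
    # Simplification: the padding markers are counted as vocabulary entries.
    vocabulary_size = len(unigram_counts)
    numerator = bigram_counts[(prev_word, word)] + 1
    denominator = unigram_counts[prev_word] + vocabulary_size
    return numerator / denominator


# Toy corpus, invented for this example only.
toy_corpus = [["i", "love", "this"], ["i", "love", "it"], ["i", "hate", "this"]]
unigrams, bigrams = train_bigram_model(toy_corpus)
p_love = bigram_probability("i", "love", unigrams, bigrams)
p_hate = bigram_probability("i", "hate", unigrams, bigrams)
print(f"P(love | i) = {p_love:.2f}, P(hate | i) = {p_hate:.2f}")
```

A trigram version would call `extract_ngrams(padded, 3)` and condition on the two preceding words. Smoothing is one possible way to handle word pairs that never occur in the training data.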
If your N-Gram model (Bigram/Trigram) identifies one fourth of the words in a Twitter text as positive, classify that Twitter text as positive. If your N-Gram model (Bigram/Trigram) identifies one fourth of the words in a Twitter text as negative, classify that Twitter text as negative. In any other case, classify the Twitter text as neutral. If you are using Bigram for positive Twitter texts, use the same for negative Twitter texts. Similarly, if you are using Trigram, use it for both positive and negative Twitter texts.
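The one-fourth rule above could be sketched as follows. Several points are assumptions, because the brief does not settle them: "one fourth" is read as "at least one fourth", a word is labelled by comparing raw bigram counts from positive and negative training tweets, and a tweet that meets both thresholds goes to the class with the larger share. All names and the toy training data are hypothetical.

```python
from collections import Counter

THRESHOLD = 0.25  # "one fourth of the words", read here as "at least one fourth"


def bigrams_of(tokens):
    """Return the bigrams of a token list, padded with an assumed start marker."""
    padded = ["<s>"] + tokens
    return list(zip(padded, padded[1:]))


def count_bigrams(tweets):
    """Count bigrams across a list of tokenised tweets."""
    counts = Counter()
    for tokens in tweets:
        counts.update(bigrams_of(tokens))
    return counts


def label_words(tokens, positive_counts, negative_counts):
    """Label each word by the class whose training tweets saw its bigram more often."""
    labels = []
    for bigram in bigrams_of(tokens):
        pos, neg = positive_counts[bigram], negative_counts[bigram]
        if pos > neg:
            labels.append("positive")
        elif neg > pos:
            labels.append("negative")
        else:
            labels.append(None)  # unseen, or equally frequent in both classes
    return labels


def classify_tweet(tokens, positive_counts, negative_counts, threshold=THRESHOLD):
    """Apply the one-fourth rule to a tokenised tweet."""
    if not tokens:
        return "neutral"
    labels = label_words(tokens, positive_counts, negative_counts)
    positive_share = labels.count("positive") / len(tokens)
    negative_share = labels.count("negative") / len(tokens)
    is_positive = positive_share >= threshold
    is_negative = negative_share >= threshold
    if is_positive and is_negative:
        # Assumption: the brief does not cover this case, so take the larger share.
        if positive_share == negative_share:
            return "neutral"
        return "positive" if positive_share > negative_share else "negative"
    if is_positive:
        return "positive"
    if is_negative:
        return "negative"
    return "neutral"


# Toy training data, invented for this example only.
positive_counts = count_bigrams([["great", "phone"], ["love", "this", "phone"]])
negative_counts = count_bigrams([["terrible", "battery"], ["hate", "this", "battery"]])

result_a = classify_tweet(["i", "love", "this", "phone"], positive_counts, negative_counts)
result_b = classify_tweet(["hate", "this", "battery", "so", "much"], positive_counts, negative_counts)
result_c = classify_tweet(["just", "landed", "in", "sydney"], positive_counts, negative_counts)
print(result_a, result_b, result_c)
```

Raw counts favour whichever class has more training tweets, so comparing smoothed probabilities instead may be a better choice when the classes are unbalanced.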
The authors used five different datasets to prove their points. You need to use two of the five datasets to implement your solution. You do NOT have to use all five datasets.
Use Python as the programming language for this natural language processing assessment. The code must be well formatted and conform to Python naming conventions. You also need to provide sufficient comments in the code.
You are also required to prepare a 500-word report highlighting the similarities or dissimilarities of the outcomes from the two different sources. You can choose to divide the word limit into multiple paragraphs. Include a short introduction with any critical points that will help your readers to understand the outcomes of your program. Then, briefly describe whether you see similar or different trends, in terms of positive, negative and neutral Twitter sentiments, in both of your datasets. Discuss whether your program behaved in the same way for the different datasets.
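One possible way to gather the figures for such a comparison is to compute the share of each sentiment label per dataset and the difference between the two. The sketch below assumes the predictions are already available as lists of label strings; the function names and the sample predictions are invented for illustration.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentiment_distribution(predicted_labels):
    """Return the share of each sentiment label in a list of predictions."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


def compare_distributions(first, second):
    """Return the per-label difference (first minus second) in percentage points."""
    return {label: round((first[label] - second[label]) * 100, 1) for label in LABELS}


# Invented predictions standing in for the output on two datasets.
dataset_a_predictions = ["positive", "positive", "negative", "neutral"]
dataset_b_predictions = ["positive", "negative", "negative", "neutral"]

dist_a = sentiment_distribution(dataset_a_predictions)
dist_b = sentiment_distribution(dataset_b_predictions)
difference = compare_distributions(dist_a, dist_b)

for label in LABELS:
    print(f"{label:<9} A={dist_a[label]:.0%}  B={dist_b[label]:.0%}  "
          f"diff={difference[label]:+.1f} pts")
```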
You will be assessed based on the completeness of your model, the efficiency of the implementation, the coding convention, the quality of code and your articulation of the outcomes.
Finally, you will submit the source code. You must provide a link to the dataset used. Ensure that you include instructions on how to run your code at the top of your main source code file inside a comment block.
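A run-instruction comment block at the top of the main source file might look like the sketch below. The file name `sentiment_ngram.py`, the command-line arguments and the angle-bracket placeholders are hypothetical; replace them with your own details and the actual dataset links.

```python
# ---------------------------------------------------------------------------
# Twitter sentiment analysis with an N-Gram model (illustrative header only)
#
# How to run (hypothetical file and argument names):
#     python sentiment_ngram.py --dataset path/to/dataset.csv --ngram 2
#
# Requirements:
#     Python 3; this sketch uses the standard library only.
#
# Datasets used:
#     <link to first dataset>
#     <link to second dataset>
# ---------------------------------------------------------------------------
import argparse


def build_argument_parser():
    """Create the command-line interface described in the header comment."""
    parser = argparse.ArgumentParser(description="N-Gram Twitter sentiment analysis")
    parser.add_argument("--dataset", required=True, help="Path to a dataset file")
    parser.add_argument("--ngram", type=int, choices=(2, 3), default=2,
                        help="Use 2 for bigrams or 3 for trigrams")
    return parser


def main(argv=None):
    """Parse the arguments; the analysis itself is omitted from this sketch."""
    return build_argument_parser().parse_args(argv)


# Example call with an explicit argument list instead of the real command line.
args = main(["--dataset", "tweets.csv", "--ngram", "3"])
print(args.dataset, args.ngram)
```

In a real submission, the example call at the bottom would normally be replaced by an `if __name__ == "__main__":` guard that calls `main()`.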
Reference paper - "Deep Convolution Neural Networks for Twitter Sentiment Analysis"
Referencing
It is essential that you use APA style to cite and reference your research.