Create a word cloud to visualize frequency

Reference no: EM133422673

Some of the most important data that we have comes as large quantities of text. Text data includes books, articles, blog posts, social media posts, emails, reports, journals and diaries, shopping lists, and more. This data is unstructured and can be massive. Software tools are available to help us make sense of it; when we use software to analyze large quantities of text, we call this text mining. A practical example is chatbots, which use algorithms to interpret what the user types so that they can select an appropriate response. In this task we will explore the text mining strategies of looking at word frequency and occurrences. Other methods, such as part-of-speech tagging and machine learning, are beyond the scope of this class.

Tasks: 

In this task you will need files or web links (URLs) with text data. Please watch the How-To video before you start the assignment. You will use two free browser-based text mining tools to analyze your text.

Task 1. Visualizing Word Frequency. Looking at how often each word occurs can give you an overall sense of a document. You will use your first file to create a word cloud to visualize frequency and understand how stop words affect word frequency analysis. We will use WordClouds for this.
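To make the effect of stop words concrete, here is a minimal Python sketch of the counting that sits behind a word cloud. It is not how WordClouds itself is implemented; the function name `word_frequencies`, the tiny `STOP_WORDS` set, and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (an assumption for this
# example); real tools ship much longer lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "on"}


def word_frequencies(text, remove_stop_words=False):
    """Count lowercase word occurrences, optionally skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    if remove_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return Counter(words)


# Made-up sample text, used only to show the difference.
sample = "The cat sat on the mat and the dog sat on the log."
with_stops = word_frequencies(sample)
without_stops = word_frequencies(sample, remove_stop_words=True)

# The most frequent words are the ones a word cloud would draw largest.
print(with_stops.most_common(3))
print(without_stops.most_common(3))
```

With stop words included, "the" dominates the counts even though it says nothing about the topic; with them excluded, the content words rise to the top. This is the same contrast you should see between your two word clouds.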

Task 2. Taking a deeper look at word frequency and occurrence. For the second text document we will use Voyant Tools to explore other visualization methods as well as look at word correlations. 

Download the attached Word document template, add your screenshot for Task 2, and answer the questions for each task. This website can show you how to take a screenshot on whatever device you are using.

Steps for Task 1:

1. Open the following website: WordClouds

2. Watch this How-To video. To create a Word Cloud, you can upload a file, paste text, or submit a URL.

3. Personalize your Word Cloud by using a different Theme, Shape, Gap, Font, or other options. Note that stop words can be included or excluded. They are excluded by default, so you have to create two Word Clouds from the same text: one with and one without the stop words. The website already has the common English stop words.

4. Save your Word Clouds as two images, called "WordCloud with Stop Words" and "WordCloud without Stop Words". Use the PNG file format. Submit these images to the Assignment 3 drop box.

5. Answer the questions in the Word document for Task 1.

Steps for Task 2: 

1. Open the following website: https://voyant-tools.org/

2. Paste in some URLs, or paste the text from your Task 1 text file or from a different text file.

3. You will see data in five different windows and each window has a tab. 

     a. In the upper left window click 'Links'

     b. In the upper middle window click 'TermsBerry'

     c. In the upper right window, leave it on 'Trends' (or click it if it's not there)

     d. In the bottom left window, leave it on 'Summary' (or click it if not there)

     e. In the bottom right window, click 'Correlations'
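
As a rough illustration of what a word correlation measures, the Python sketch below splits a text into segments, counts two terms in each segment, and computes the Pearson correlation of those counts. This is one plausible way to quantify whether two words rise and fall together; it is an assumption for illustration, not a description of Voyant Tools' internals. The helper names `segment_counts` and `pearson` and the sample text are hypothetical.

```python
import math
import re


def segment_counts(text, term, segments=4):
    """Split the words into roughly `segments` chunks and count `term` in each."""
    words = re.findall(r"[a-z']+", text.lower())
    size = max(1, math.ceil(len(words) / segments))
    return [words[i:i + size].count(term) for i in range(0, len(words), size)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series; report 0
    return cov / math.sqrt(var_x * var_y)


# Made-up sample text: "rain" and "wind" appear together, "sun" elsewhere.
text = ("rain wind rain wind sun sun sun sun "
        "rain wind rain wind sun sun sun sun")
rain = segment_counts(text, "rain")
wind = segment_counts(text, "wind")
sun = segment_counts(text, "sun")

print(pearson(rain, wind))  # terms that appear in the same segments
print(pearson(rain, sun))   # terms that appear in different segments
```

A value near 1 means the two terms tend to be frequent in the same parts of the text, and a value near -1 means one tends to appear where the other does not.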





All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd