Write a program that determines the linguistic complexity

Reference no: EM133217784

Measuring Complexity

Overview
This is intended as a "Warm-up" exercise. The main purpose of this assignment is to demonstrate your problem-solving ability and basic coding skills, and to practice conducting experiments and communicating results.

Background
One way of characterizing data is to measure and compare its relative complexity. This requires a definition of complexity and a method of comparison.

Consider linguistic complexity. We could define the complexity of written text as the ratio of the number of unique words to the total number of words in a sample. Think of a book for young readers ("See Spot. See Spot run."). The intuition is that repeated words and a small vocabulary make for simpler comprehension.

This complexity metric is very easy to compute. The text sample "Now is the time for all good men to come to the aid of their country." is a sentence consisting of 16 words, of which 14 are unique. Hence its complexity is 0.875 (14/16) as defined.
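
The arithmetic above can be checked with a few lines of C++. This is a minimal sketch, not a reference solution: the function name `complexity` is hypothetical, words are assumed to be separated by whitespace, and only ASCII letters are treated as alphabetic.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_set>

// Ratio of unique words to total words; returns 0.0 for empty input.
// Assumption: words are whitespace-separated, non-letters are dropped,
// and letters are lowercased before counting.
double complexity(const std::string& text) {
    std::istringstream in(text);
    std::unordered_set<std::string> unique;
    std::size_t total = 0;
    std::string raw;
    while (in >> raw) {
        std::string word;
        for (unsigned char c : raw) {
            if (std::isalpha(c)) word += static_cast<char>(std::tolower(c));
        }
        if (word.empty()) continue;  // token had no letters, e.g. "--"
        ++total;
        unique.insert(word);
    }
    return total == 0 ? 0.0 : static_cast<double>(unique.size()) / total;
}
```

Called on the sample sentence, this counts 16 words and 14 unique words ("to" and "the" each occur twice) and returns 0.875.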

Different types of documents naturally have different complexities. Speeches and sermons are often less complex, both because they are delivered orally and because they aim to establish a pleasing or memorable rhythm in the ear of the listener. Novels often have low total complexity simply because they are lengthy: the total word count (the denominator) keeps growing, while new unique words appear less and less often.

Interestingly, it's not uncommon for complexity to change within a given document. For example, the introductory chapter of a textbook might use fewer discipline-specific words and hence be less complex than later chapters.
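
One possible way to observe such changes is to compute the metric over consecutive fixed-size windows of words instead of over the whole document. The sketch below assumes the text has already been cleaned and tokenized into a vector of words; the name `windowed_complexity` and the window size are illustrative choices, not part of the assignment.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Complexity of each consecutive, non-overlapping window of `window` words.
// The final window may be shorter than the others.
std::vector<double> windowed_complexity(const std::vector<std::string>& words,
                                        std::size_t window) {
    std::vector<double> result;
    if (window == 0) return result;  // nothing sensible to compute
    for (std::size_t start = 0; start < words.size(); start += window) {
        const std::size_t end = std::min(start + window, words.size());
        // Build the set of distinct words in [start, end).
        std::unordered_set<std::string> unique(words.begin() + start,
                                               words.begin() + end);
        result.push_back(static_cast<double>(unique.size()) / (end - start));
    }
    return result;
}
```

Because the ratio depends on sample length, using windows of equal size keeps the values comparable from one part of the document to the next. Only the last, possibly shorter, window needs to be read with care.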

Specifications
Your assignment is to write a program that determines the linguistic complexity, as defined above, of the posted documents.

Your program should:
• Read in the text sample
• Clean it
- Convert all text to lowercase
- Eliminate all non-alphabetic characters
• Tokenize the remaining text (break it into words)
• Determine the total linguistic complexity of the document

Notes:
• This assignment is a good opportunity to learn/refresh your C/C++ skills.
• Be sure to demonstrate good programming style and practices.
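
The steps listed in the specification could be organized as one small function per step. This is only a sketch under stated assumptions: all function names are hypothetical, whitespace is kept as the word separator while other non-alphabetic characters are deleted (so "it's" becomes "its"), and only ASCII letters count as alphabetic.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Step 1: read the whole text sample into memory.
std::string read_all(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Step 2: lowercase letters, keep whitespace as a separator, drop the rest.
std::string clean(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (unsigned char c : text) {
        if (std::isalpha(c)) out += static_cast<char>(std::tolower(c));
        else if (std::isspace(c)) out += ' ';
    }
    return out;
}

// Step 3: split the cleaned text on whitespace.
std::vector<std::string> tokenize(const std::string& cleaned) {
    std::istringstream in(cleaned);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}

// Step 4: unique words divided by total words (0.0 for an empty document).
double total_complexity(const std::vector<std::string>& words) {
    if (words.empty()) return 0.0;
    std::unordered_set<std::string> unique(words.begin(), words.end());
    return static_cast<double>(unique.size()) / words.size();
}
```

A caller would open the document with `std::ifstream`, pass the stream to `read_all`, and chain the remaining steps: `total_complexity(tokenize(clean(read_all(file))))`. A different cleaning rule, such as replacing punctuation with a space so that hyphenated words split in two, is equally defensible and will change the counts slightly, so the report should state which rule was used.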

Experiments
The Project Gutenberg website contains thousands of documents in text format. Use them to investigate questions such as:
• Is there a relationship between complexity and document length?
• Has linguistic complexity changed over time (for example, are today's books easier to read than those written in the 1700s)?
• How does a novel compare to a textbook? How does a Trump speech compare to an Obama speech?
• Other ideas of your choosing...
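
For the first question, one approach (among several) is to record the complexity of the first n words of a document at regular intervals, which yields the points for a complexity-versus-length graph. The sketch below assumes an already tokenized word list; `complexity_by_length` and `step` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Complexity of the first n words, sampled every `step` words and once more
// at the end of the document. Each sample is (words seen, complexity).
std::vector<std::pair<std::size_t, double>> complexity_by_length(
        const std::vector<std::string>& words, std::size_t step) {
    std::vector<std::pair<std::size_t, double>> samples;
    if (step == 0) return samples;  // avoid a modulo by zero
    std::unordered_set<std::string> unique;
    for (std::size_t i = 0; i < words.size(); ++i) {
        unique.insert(words[i]);  // the set grows only when a new word appears
        const std::size_t seen = i + 1;
        if (seen % step == 0 || seen == words.size()) {
            samples.emplace_back(seen,
                                 static_cast<double>(unique.size()) / seen);
        }
    }
    return samples;
}
```

The set of unique words is built once and reused for every sample, so the whole curve costs a single pass over the document. Writing the pairs out as comma-separated lines makes them easy to plot for the report.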

Deliverables
• Submit a single PDF, formatted as a report. The report should describe your approach (data structures, algorithms), problems encountered, the experiments you conducted, and your results, analysis, and conclusions. Include source code, sample output, and complexity graphs.

Course: CS 677 High-performance Computing, Grand Valley State University