Reference no: EM133736014
Assignment: Program that Prints Some Statistical Results Worksheet
Write a program that reads a text file, then prints some statistical results about its contents to an output file. The results will be primarily an analysis of the characters in the file, which will include the frequency of occurrence of certain categories of characters, as well as the frequency of occurrences of each of the letters of the alphabet.
Note that in most standard English text, certain letters typically appear with a higher frequency than others. This kind of information has been used in setting up games like Scrabble (letter frequencies and point values, etc.), as well as more serious applications like decoding encrypted messages and cyphers.
Task
1) Start by asking the user to input the name of the input file, and then to input the name of an output file. Whenever the user enters a filename that cannot be opened (for either the input or the output file), print an error message and ask the user to re-enter it. (We've seen several code examples in the course notes that illustrate this.) The input file can be ANY file containing plain text.
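The re-prompt loop for the input file might look something like the sketch below. The function name `openInputFile`, its parameters, and the message wording are illustrative assumptions, not taken from the course notes. The prompt and message streams are passed in (normally `std::cin` and `std::cout`), which also avoids global variables.

```cpp
#include <fstream>
#include <iostream>

// Hypothetical helper: keeps asking for a file name until one can be opened.
// Returns false only if no further name can be read from `prompts`
// (end of input, or a line longer than the buffer).
bool openInputFile(std::istream& prompts, std::ostream& msgs,
                   std::ifstream& in, char name[], int size)
{
    while (true)
    {
        msgs << "Enter the input file name: ";
        if (!prompts.getline(name, size))
            return false;              // nothing more to read

        in.clear();                    // drop the failure flag from a previous attempt
        in.open(name);
        if (in)
            return true;               // file opened successfully

        msgs << "Could not open \"" << name << "\". Please try again.\n";
    }
}
```

A typical call would be `openInputFile(std::cin, std::cout, inFile, name, 256)` with `char name[256]`. The output file can be handled the same way with a `std::ofstream`.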
2) Read the contents of the input file, and then print to the OUTPUT file the following information about the contents of the input file:
a) A header stating a general heading and the name of the input file
b) The total number of characters contained in the file
c) A chart (with headings) where each row lists a category, the number of occurrences of that category of character, and the percentage of the total file this makes up. These are the categories:
i) Letters
ii) White space
iii) Digits
iv) Other
Note that the percentages in this chart should add up to 100%.
d) A heading "LETTER STATISTICS"
e) Another chart (again, with headings), listing a category, the number of occurrences of that letter type, and the percentage of all LETTERS this comprises. This time, the categories are uppercase letters, lowercase letters, and then each of the individual letters of the alphabet (i.e. 26 of them). So this chart will have 28 rows.
i) Note that the percentage of uppercase + lowercase should add up to 100%
ii) Also, the percentages of the 26 alphabet letters should add up to 100%
iii) Specifically, note that these are percentages of the total number of letters, NOT the total number of characters in the file
f) A heading "NUMBER ANALYSIS", followed by the count, sum, and average (to 2 decimal places) of all integer numbers appearing in the file, where a number is defined as any consecutive sequence of digits bounded by non-digits. Example:
I am 14 years old and it is now 11:15 AM and my IP address is 123.45.0.204
In the above passage, there are 7 "numbers" (14, 11, 15, 123, 45, 0, and 204).
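One possible way to find the digit runs is to build each number up character by character and close it off at the first non-digit (or at end of file). This is a sketch only: the names `NumberStats`, `analyzeNumbers`, and `average` are hypothetical, and it assumes no digit run is long enough to overflow a `long long`.

```cpp
#include <cctype>
#include <fstream>

// Hypothetical container for the NUMBER ANALYSIS results.
struct NumberStats
{
    long count = 0;        // how many digit runs were found
    long long sum = 0;     // sum of their values
};

// Reads the stream one character at a time and treats every maximal
// run of digits as one number.
NumberStats analyzeNumbers(std::ifstream& in)
{
    NumberStats stats;
    long long current = 0;   // value of the run being read
    bool inNumber = false;   // true while inside a run of digits
    char ch;

    while (in.get(ch))
    {
        if (std::isdigit(static_cast<unsigned char>(ch)))
        {
            current = current * 10 + (ch - '0');
            inNumber = true;
        }
        else if (inNumber)
        {
            // A non-digit ends the current run.
            ++stats.count;
            stats.sum += current;
            current = 0;
            inNumber = false;
        }
    }
    if (inNumber)            // the file ended in the middle of a run
    {
        ++stats.count;
        stats.sum += current;
    }
    return stats;
}

// Average of the numbers found; 0 when there are none, to avoid dividing by zero.
double average(const NumberStats& stats)
{
    return stats.count == 0 ? 0.0 : static_cast<double>(stats.sum) / stats.count;
}
```

For the example passage this yields a count of 7 and a sum of 412, so the average prints as 58.86. In a full program the same loop could also update the character counters, so the file is read only once.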
3) All percentages should be printed to two decimal places, along with a space and a % sign afterwards. Example: 12.45 %
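A small sketch of the percentage formatting, using only `iostream` and `iomanip`. The helper names `percentOf` and `writePercent` are made up for illustration. The guard for a zero total is an assumption about how to treat an empty file or a file with no letters, which the requirements above do not address.

```cpp
#include <iomanip>
#include <iostream>

// Percentage that `part` makes up of `whole`.
// Returns 0 when `whole` is 0 so that an empty total never divides by zero.
double percentOf(long part, long whole)
{
    if (whole == 0)
        return 0.0;
    return 100.0 * part / whole;
}

// Writes the percentage with two decimal places, a space, and a % sign,
// for example "12.45 %".
void writePercent(std::ostream& out, long part, long whole)
{
    out << std::fixed << std::setprecision(2) << percentOf(part, whole) << " %";
}
```

Note that `std::fixed` and `std::setprecision` stay in effect on the stream after the call, so later floating-point output on the same stream also uses two decimal places.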
4) Your headings and category labels should match mine (see example output files below). You may use whatever field widths you like, as long as your charts line up in neat columns (i.e. your spacing doesn't have to match mine exactly)
5) Category labels in your charts should be left-aligned (lined up on the left side of the words). All numbers in your charts should be right-aligned (lined up on the right side).
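The alignment rules can be sketched with `std::left`, `std::right`, and `std::setw` from `iomanip`. The function name `writeRow` and the three column widths are arbitrary choices for illustration; any widths work as long as every row uses the same ones.

```cpp
#include <fstream>    // std::ofstream, the intended destination for the charts
#include <iomanip>
#include <iostream>

// Hypothetical column widths; the assignment leaves these open.
const int LABEL_WIDTH = 14;
const int COUNT_WIDTH = 10;
const int PERCENT_WIDTH = 10;

// Writes one chart row: label on the left, count and percentage on the right.
void writeRow(std::ostream& out, const char* label, long count, double percent)
{
    out << std::left << std::setw(LABEL_WIDTH) << label
        << std::right << std::setw(COUNT_WIDTH) << count
        << std::fixed << std::setprecision(2)
        << std::setw(PERCENT_WIDTH) << percent << " %\n";
}
```

Because `std::setw` applies only to the next item written, it has to be repeated for every column, while `std::left` and `std::right` stay in effect until changed. A `std::ofstream` can be passed directly as `out`.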
6) Hint: Instead of declaring 26 different variables to count each letter, consider using an array of counters
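As a rough illustration of the hint, the sketch below keeps the 26 letter counters in one array indexed by `letter - 'a'`, alongside the category counters. The names `CharCounts` and `tally` are hypothetical. It assumes the default "C" locale, a character set where 'a' to 'z' are contiguous (true for ASCII), and that upper and lower case forms of a letter share one counter.

```cpp
#include <cctype>

const int ALPHABET_SIZE = 26;

// Hypothetical bundle of every counter the two charts need.
struct CharCounts
{
    long total = 0;
    long letters = 0;
    long whiteSpace = 0;
    long digits = 0;
    long other = 0;
    long upper = 0;
    long lower = 0;
    long perLetter[ALPHABET_SIZE] = {};   // index 0 is 'a'/'A', index 25 is 'z'/'Z'
};

// Updates the counters for a single character read from the file.
void tally(char ch, CharCounts& c)
{
    // The cctype functions expect a value representable as unsigned char.
    const unsigned char u = static_cast<unsigned char>(ch);
    const int index = std::tolower(u) - 'a';

    ++c.total;
    if (std::isalpha(u) && index >= 0 && index < ALPHABET_SIZE)
    {
        ++c.letters;
        if (std::isupper(u))
            ++c.upper;
        else
            ++c.lower;
        ++c.perLetter[index];             // one counter per letter, case ignored
    }
    else if (std::isspace(u))
        ++c.whiteSpace;
    else if (std::isdigit(u))
        ++c.digits;
    else
        ++c.other;
}
```

Calling `tally(ch, counts)` for every character obtained with `in.get(ch)` fills all the counters in one pass. Since each character lands in exactly one category, the four category counts always sum to `total`, and `upper + lower` always equals `letters`.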
General Requirements
1) No global variables
2) You may use any of these libraries (no others):
a) iostream
b) iomanip
c) fstream
d) cctype
3) Write your source code so that it is readable and well-documented
4) Part of assessing readability and style will be how well you break your program into appropriate functions. Note: Breaking it up into functions makes for smaller, easier-to-code segments. You'll make your work easier if you do so.