In this project, you will build a program used by an admissions office to help process applicants for a graduate degree in engineering. Your program will read several database files containing applicant information and other information relevant to the applications process. Based on an admissions "formula," you will compute a score for each applicant, and output a file with all the applicants and their scores. The output of your program will aid the admissions office in their admissions decision.
1 Database
The database your program will process consists of 3 files: applicants.txt, ranking.txt, and journals.txt. (Samples of these files are available on the web site: https://www.enee.umd.edu/class/enee114/projects/pr3/.) All database files are text files, containing sequences of ASCII characters. Below, we provide detailed information about the contents and format of each database file. This information will help you write code to correctly read the information from each file into your program.
1.1 Applicants File
The applicants file, named applicants.txt, lists all the student applicants along with their personal and academic information. There is one line of text for each applicant in this file. In the sample applicants.txt file that we provide, there are 810 applicants. In general, you can assume there are never more than 1000 applicants. Each line of text contains at least 7 fields. These 7 mandatory fields provide the following information, in the order listed below:
Last Name
First Name
GPA
GRE Verbal
GRE Quantitative
Publications
Undergraduate Institution
Each field is a sequence of characters, i.e. a string. The first two fields are strings containing the applicant's last and first name, respectively. The next 4 fields are strings that specify numbers. Your program should convert these strings into the corresponding numbers for the purposes of scoring the applicant. The GPA field should be converted into a floating point number between 0.0 and 4.0. The GRE Verbal and GRE Quantitative fields report the applicant's scores on 2 parts of the GRE standardized test (the graduate school equivalent of SATs). Your program should convert each of these two strings into integers between 0 and 800. The Publications field is a count of the number of technical papers the student has published in journals. Your program should convert this string into an integer between 0 and 2. Finally, the last field is a string containing the name of the student's undergraduate institution.
In addition to these 7 mandatory fields, there are up to 2 additional optional fields appearing after the Undergraduate Institution field. These optional fields are strings that specify the journals in which the student has published technical papers. The number of optional fields is determined by the Publications field: if the student has published 0 papers, there are no optional fields; if the student has published 1 paper, there is 1 optional field; and if the student has published 2 papers, there are 2 optional fields.
Fields are delimited by a single comma character, with 0 or more white space characters before and after the comma. For example, an applicant with zero publications would have an entry with the following format:
<f1><W>,<W><f2><W>,<W><f3><W>,<W><f4><W>,<W><f5><W>,<W><f6><W>,<W><f7>
where "<fi>" denotes the string from the ith field, and "<W>" denotes 0 or more white space characters. The white space and comma characters allow you to parse the sequence of fields in the following manner. The first field always starts with the first character in the line, and ends the first time you encounter either a white space or comma character. The second field starts when you encounter the first non-white space character after the first comma, and ends the first time you encounter either a white space or comma character. The third through sixth fields are delimited in exactly the same way as the second field. (Note, the first 6 mandatory fields are guaranteed to never contain a white space or comma character). The seventh field starts when you encounter the first non-white space character after the sixth comma, and ends when you encounter either a comma character or a '\n' character. The two optional fields (if present) are delimited in exactly the same way as the seventh field. (Note, the seventh field and two optional fields may contain white space characters, but are guaranteed to never contain a comma character).
In general, you can assume that a line in the applicants.txt file will never exceed 2000 characters, and any single field will never exceed 256 characters.
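The parsing rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the handout does not name an implementation language, so Python is used here; the function name parse_applicant and the dictionary keys are hypothetical; and the sketch trims any white space that precedes a comma in the seventh and optional fields, which the rules above do not spell out explicitly.

```python
def parse_applicant(line):
    """Split one applicants.txt line into a dict (hypothetical layout)."""
    # Fields are separated by commas; strip() removes the optional white
    # space around each comma and the trailing newline.
    fields = [f.strip() for f in line.split(",")]
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields")

    publications = int(fields[5])
    if not 0 <= publications <= 2:
        raise ValueError("Publications must be 0, 1, or 2")

    # The Publications count says how many optional journal fields follow.
    journals = fields[7:7 + publications]
    if len(journals) != publications:
        raise ValueError("journal fields do not match Publications count")

    return {
        "last": fields[0],
        "first": fields[1],
        "gpa": float(fields[2]),         # expected range 0.0 to 4.0
        "gre_verbal": int(fields[3]),    # expected range 0 to 800
        "gre_quant": int(fields[4]),     # expected range 0 to 800
        "publications": publications,
        "school": fields[6],             # may contain white space
        "journals": journals,            # 0, 1, or 2 journal names
    }


# Invented sample line, not taken from the real applicants.txt.
sample = "Doe , Jane,3.5 ,700, 780,1, Some State University , Some Journal\n"
applicant = parse_applicant(sample)
```

Splitting on every comma is safe here only because the handout guarantees that no field contains a comma character.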
1.2 School Rankings File
The school rankings file, ranking.txt, contains a list of the top 25 engineering programs from the 2006 U.S. News rankings. Each line of this file lists one ranked school. The format of each line is:
<rank>. <school>
where "<rank>" is a string representing an integer between 1 and 25, and "<school>" is a string representing the name of a school. Between the ranking and school name, there is exactly 1 period character followed by 1 white space character. Many of the applicants in the applicants.txt file received their undergraduate degrees from schools in this list, but not all of them. You can assume that this file always contains 25 lines. You can also assume that each school name will never exceed 256 characters.
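As a rough sketch of reading this format (Python is an assumption, since the handout names no language, and the names parse_ranking_line and load_rankings are hypothetical), each line can be split at the first period-plus-space pair, and the results stored in a table keyed by school name for later lookup:

```python
def parse_ranking_line(line):
    """Return (school, rank) for one '<rank>. <school>' line."""
    # Split only at the FIRST ". " so that school names which themselves
    # contain a period followed by a space stay intact.
    rank_text, school = line.rstrip("\r\n").split(". ", 1)
    return school.strip(), int(rank_text)


def load_rankings(lines):
    """Build a school-name -> rank lookup table from the file's lines."""
    return dict(parse_ranking_line(line) for line in lines)


# Invented sample lines, not taken from the real ranking.txt.
rankings = load_rankings(["1. Example Institute\n", "2. Sample Univ. of Tech\n"])
```

An applicant whose school is missing from the table can then be detected with rankings.get(school), which returns None for unranked schools.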
1.3 Journal Impact Factors file
The journal impact factors file, journals.txt, contains a list of computer science and engineering journals and conference proceedings, with one journal or conference proceeding listed per line. For each publishing venue, the file specifies an "impact factor," which is a numeric score that reflects the quality of the journal or conference proceeding. The larger the impact factor, the higher the quality of the journal or conference proceeding. The format of each line in the journals.txt file is:
<journal>, <impact factor>
where "<journal>" is a string representing the name of a journal or conference proceeding, and "<impact factor>" is a string representing a floating point number between 1.12 and 3.31. Between the journal name and impact factor, there is exactly 1 comma character followed by 1 white space character. Note, journal and conference proceeding names may contain white space characters. All the publications listed in the applicants.txt file are covered by this list of journals and conference proceedings. You can assume that each journal or conference proceeding name will never exceed 256 characters.
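One possible way to read this format is sketched below. Python and the function names parse_journal_line and load_journals are assumptions made for illustration. The sketch splits at the last comma-plus-space pair on the line, a defensive choice so that the impact factor is always the final piece.

```python
def parse_journal_line(line):
    """Return (journal, impact_factor) for one '<journal>, <impact factor>' line."""
    # rsplit with maxsplit=1 cuts at the LAST ", " on the line, so the
    # right-hand piece is always the impact factor text.
    name, factor_text = line.rstrip("\r\n").rsplit(", ", 1)
    return name.strip(), float(factor_text)


def load_journals(lines):
    """Build a journal-name -> impact-factor lookup table."""
    return dict(parse_journal_line(line) for line in lines)


# Invented sample line, not taken from the real journals.txt.
journals = load_journals(["Journal of Examples, 2.50\n"])
```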
2 Applicant Score
Your program should read the contents of all three database files described in Section 1. You will need to create the appropriate arrays of strings, integers, and floating point values to store the database contents, and follow the format rules described in Section 1 to correctly extract all the data.
Once you have loaded the database into your program, you will compute a score for each student that reflects the quality of his/her academic record. To compute the score, begin by averaging each student's normalized GPA and GRE scores. This is known as the "baseline score," and is computed using the following formula:
baselinescore = (((GPA / 4.0) + (GREVerbal / 800.0) + (GREQuantitative / 800.0)) / 3.0) * 100.0
In addition to this baseline score, add 5 points if the student graduated from a school ranked between 11th and 20th, and add 10 points if the student graduated from a school in the top 10. Finally, if the student has publications, find the corresponding impact factor for each published paper, multiply the impact factor by 10.0, and add the result to the student's baseline score. The baseline score, with the school ranking and impact factor adjustments, represents the student's final applicant score.
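The scoring rule can be written out as a short sketch. Python is an assumption, and the names school_bonus and applicant_score are hypothetical. The sketch also assumes that an unranked school is passed as None, and that schools ranked 21st through 25th receive no bonus, since the rule above lists bonuses only for ranks 1 through 20.

```python
def school_bonus(rank):
    """Points added for the undergraduate school's rank (None if unranked)."""
    if rank is None:
        return 0.0
    if 1 <= rank <= 10:
        return 10.0      # top 10
    if 11 <= rank <= 20:
        return 5.0       # ranked 11th through 20th
    return 0.0           # ranked 21st through 25th


def applicant_score(gpa, gre_verbal, gre_quant, school_rank, impact_factors):
    """Baseline score plus school ranking and publication adjustments."""
    baseline = ((gpa / 4.0 + gre_verbal / 800.0 + gre_quant / 800.0) / 3.0) * 100.0
    publication_points = sum(10.0 * factor for factor in impact_factors)
    return baseline + school_bonus(school_rank) + publication_points


# GPA 3.0 and GRE scores of 600 each give a baseline of 75; a school
# ranked 15th adds 5 and one paper with impact factor 2.0 adds 20.
score = applicant_score(3.0, 600, 600, 15, [2.0])
```

For this invented applicant the three normalized terms are each 0.75, so the baseline is 75 and the final score is 100.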
3 Output
Your program should create an output file, called scores.txt. This file should contain 1 line for each of the applicants in the applicants.txt file. For each applicant, you should print the last and first name of the applicant, the applicant's computed score from Section 2, and the ranking of the undergraduate institution that the applicant attended. For the last name and first name fields, you should pad the field with blank space characters so that the entire field is exactly 15 characters wide. (You can assume that all first and last names are less than 15 characters wide). Between the applicant's score and the undergraduate institution ranking, you should print a single tab character, '\t'. Finally, the order of the applicants in the scores.txt should be identical to the corresponding order from the applicants.txt file.
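A minimal sketch of formatting one output line follows. Python and the function name format_score_line are assumptions. The handout does not state how many decimal places the score should have or what to print for a school that is not in ranking.txt, so the sketch assumes 2 decimal places and takes the ranking as text that the caller chooses.

```python
def format_score_line(last, first, score, rank_text):
    """Build one scores.txt line (score precision is an assumption)."""
    # '<15' left-justifies the name and pads it with blanks to 15 characters.
    # A single tab separates the score from the school ranking.
    return f"{last:<15}{first:<15}{score:.2f}\t{rank_text}\n"


# Invented applicant; the caller supplies the ranking text.
line = format_score_line("Doe", "Jane", 87.5, "3")
```

Writing the lines in the same order in which the applicants were read keeps scores.txt aligned with applicants.txt.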