Describe an efficient algorithm for computing

Assignment Help Other Subject
Reference no: EM132184351

Reading:

For this last assignment you can work alone or in pairs. If you work with someone else put both names in comments on the top of any files that you submit. Each of the two parts counts the same as a regular assignment. In addition, the solution to part 2 with the best running time will receive a bonus equal to the maximum score on a regular assignment, if the author(s) can clearly explain their solution to the class.

For many years people have been interested in the question of how connected social networks are. A famous experiment was carried out by Stanley Milgram in 1967. The experiment went something like this: Randomly chosen people in Omaha were given packets that included a letter addressed to a randomly chosen person in Boston. In the unlikely event that the resident of Omaha new the recipient, he was to send it directly. Otherwise, he was asked to select someone he knew who was more likely to know the recipient and send it on to them. The question: On average, how many intermediaries before the letter reached its final destination?

We can perform this type of experiment more easily nowadays using publicly available data, for example using the IMDB actor data. Let's say that if two actors worked on the same film, they are distance 1 apart. If actor 1 worked on a film with actor 2, and actor 2 worked on a film with actor 3, who worked on a film with actor 4, then we would say that the distance between actor 1 and actor 4 is 3 if there is no shorter path connecting them.

In this exercise we will work with the files actresses.list and actors.list, which contain, for each actor in the database, a list of movies the actor appeared in. Here is a typical entry:

Cronin, Doug (I) Chasing Dreams (1982) [Sheriff] <28>

Scared Straight! (1978)

The movie information comprises several items besides movie name (date, role, tv or film, order in credits, etc).
How would you store the data from actors.list in a Java program? You might start by writing a class to represent the data for a particular actor; lets call the class ActorRecord. This class should store the actor's name, and then a list of the movies that the actor has appeared in. We might also use a class to store the list of actors that appeared in a movie.

The actresses and actors files are quite large, and so we will work with compressed versions, actresses.list.gz and actors.list.gz, which are on Moodle and also in the common/GRAPHS directory on afs. Also in that subdirectory are several classes you can use, including RetrieveActors. The constructor takes a string argument that specifies the file from which to extract the actor data; you should give it the path name to actresses.list.gz and/or actors.list.gz. This class reads the compressed actor data without decompressing the entire file. It contains a method getNext() that returns a string for the next actor, or null if there are no more actor records in the file. The string comprises a list of strings separated by "@@@". The first string is the actor's name, and subsequent strings are the movies that the actor appeared in. Each movie string has two initial characters; a "TV" indicates that the title is a made-for-TV movie, a "TS" indicates a TV series or mini-series, a "VO" indicates a video, and "FM" indicates a film.

For example, the entry displayed above would be returned as

Cronin, Doug (I)@@@FMChasing Dreams (1982)@@@FMScared Straight! (1978)@@@

You do not need to be familiar with the details of how RetrieveActors works, or of the format of the IMDB files.

The program BusyActress.java reads in the data from actresses.list.gz and deter- mines which actress appeared in the most movies. If you go to your cs114 directory on afs, the command.

will copy all java files to your directory. You can then compile and run BusyActress without copying the actresses.list.gz file.

Write a program that creates a graph using the data in actresses.list.gz and actors.list.gz. You should have a vertex for each actor and an edge between two actors if they worked on a film together. (Consider only films; not tv shows or videos.) Choose your favorite actor, and compute the "distance" to each other actor. For each possible distance, give the number of actors that distance away (take the distance to be infinity if there is no path.) Use the graph classes on Moodle. That directory also contains a file shortestActors.list.gz, which is of the same format as the other actor files but with only 8 actors and 5 movies. You can check the distances you obtain with the solution in ActorsSmallWorld.pdf.

Call your program SmallWorld. For part 1 your program should work for short- estActors.list.gz. For part 2, tune your program so that it works with the combined data from actresses.list.gz and actors.list.gz. You may need to reevaluate your choices of data structures and algorithms to deal with the large data set. You may need to increase the memory size of the JVM; for example "java -Xmx16g Small- World". Use only algorithms and data structures (and your modifications) that we have covered in the course.

Here is the table for actress Hedy Lamarr based on all actresses and actors:

dist.

count

1

2,224

2

196,781

3

1,555,043

4

974,718

5

121,062

6

12,232

7

1,358

8

321

9

46

10

7

1,287,288

Notes:

(1) In a January 1994 Premiere magazine interview about the film "The River Wild", Kevin Bacon commented that he had worked with everybody in Hollywood or someone who's worked with them. This claim attracted a lot of attention, even resulting in a parlor game called "Six Degrees of Kevin Bacon";.

(2) If you type in "bacon number name" in google, it will give you the "bacon number" of the name you enter, and also give you the path that results in that number. You can try checking your program with the results from Google, but you might get some different answers. If you are interested in the reason.

(3) In addition to appearing in many films, Hedy Lamarr (born Hedwig Eva Maria Kiesler in 1914, died 2000) patented (with George Antheil) a radio frequency hopping communications system that introduced some of the principles upon which modern wireless communications systems are based.

Exercises: (Not to turn in.)

(1) Use induction to prove that a (simple) graph tt = (V, E) has at most
V ( V 1) /2 edges.

(2) Suppose that you need to be able to quickly determine if two given vertices of a graph are connected by an edge. Which graph representation would you use, adjacency list or adjacency matrix? What are the time and space requirements?

(3) Consider a communications network that is a free tree; that is, a graph with no cycles. The time delay in a signal sent across a path is a small constant times the number of links in the path. Suppose that you want to determine the maximum possible delay in the network. The diameter of a tree T is the length of the longest path between two vertices of T. Describe an efficient algorithm for computing the diameter of T .

Attachment:- Problem set.rar

Reference no: EM132184351

Questions Cloud

Employees provide health insurance : Which of the following explains why nearly all firms with more than 200 employees offer health insurance
Civic and social responsibility related to inequality : In your own civic and social responsibility related to inequality. state whether you feel that inequality is a problem or is not a problem and defend your posit
Role third-party payers play in each plan : Discuss the four generic plans for health care reform in terms of the role third-party payers play in each plan.
Discuss the difference between multicast and unicast routing : Discuss the difference between multicast and unicast routing. Choosing either one, describe where it can be used in your professional or personal life.
Describe an efficient algorithm for computing : Describe an efficient algorithm for computing the diameter of T - What are the time and space requirements - Which graph representation would you use, adjacency
Applying for state or federal programs : Are some other options available to an individual, besides applying for state or federal programs?
Insurance coverage to all americans so costly : Why is the PPACA's attempt to extend insurance coverage to all Americans so costly?
Evaluate which method of the secure sdlc will best serve : Evaluate which method of the secure software development life cycle will best serve your team and explain how you plan on implementing.
Factors that contribute to rising health costs : Briefly discuss the demand and supply factors that contribute to rising health costs. Specify how (a) asymmetric information

Reviews

Write a Review

Other Subject Questions & Answers

  Is the sample size sufficient for the research

Does the study pose any risks to the participants? If so, what are they? Have all reasonable steps been taken to protect the rights of the participants? Do the benefits outweigh any risks to participants?

  Do any require changes in law or allocation of tax revenues

Do any require changes in the law or allocation of tax revenues? How would a socially responsible corporation respond to this report?

  Determine the member of the nuclear family

Theoretical perspective might lead you to determine the member of the nuclear family who spends the most time doing housework?

  Discuss a local policy example where the principle applies

As demand for services increase, tax rates and service fees often increase. This results in a slower rate of new development and revenue growth.

  What is the just-wage doctrine

A typical structure within colleges is instructor, assistant professor, associate professor, full professor. Is this egalitarian or hierarchical?

  Capitalism and socialism represent two ends of a continuum

The author of the textbook indicates that capitalism and socialism represent two ends of a continuum in which all real-world economies can be located. Also, most experts agree that capitalism is more productive and socialism generates more social ..

  Draft a mission and philosophy statement

Begin to draft a mission and philosophy statement for your chosen program scenario. Provide rationales for the elements you included in the statement

  Explain why couples live together without getting married

The question is belongs to Sociology and the question describe the cohabitation and live-in relationships. Due to the changes in gender role over the years, most working women do not feel the need to be dependent and hence, cohabitation has risen.

  In what ways are you more attuned to children today

Think about how you looked at children in your everyday world before this course. In what ways are you more attuned to children today?

  Define descripiction on hierarchy of needs

Please write a 300 word descripiction on "Hierarchy of needs" and "expectancy theory".

  Examine legal and professional ethical standards

Explain how the board refers legal issues that they cannot address to the appropriate agencies. If this information is not available, please describe the efforts you made to access it.

  Review the information provided on hofstedes six dimensions

Review the information provided on Hofstede's Six Dimensions of Culture in this week's lecture. The CSU-Global Library is a good place to find these references.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd