Determine the relation between storage size and genomes, Computer Networking


Genome4U is a scientific research project at a large university in the United States. Genome4U has recently started a large-scale project to sequence the genomes of 250,000 volunteers with a goal of creating a set of publicly accessible databases with human genomic, trait, and medical data.

The project's founder, a brilliant man with many talents and interests, tells you that the public databases will provide information to the world's scientific community in general, not just those interested in medical research. Genome4U is trying not to prejudge how the data will be used because there may be opportunities for interconnections and correlations that computers can find that people might have missed.

The founder envisions clusters of servers that will be accessible by researchers all over the world. The databases will be used by end users to study their own genetic heritage, with the help of their doctors and genetic counselors. In addition, the data will be used by computer scientists, mathematicians, physicists, social scientists, and other researchers.

Genome4U has developed new techniques to sequence a person's genome quickly, accurately, and, most importantly, at low cost. The research group is a contestant for the $10,000,000 X-Prize offered by Archon-Genomics (see https://genomics.xprize.org for details). With their current funding, they expect to complete the pilot project, with the capability to store the research data for 1,000 individuals, by December 2012, and to sequence 5,000 individuals every month thereafter.

In addition to genetic information, the project will ask volunteers to provide detailed information about their traits so that researchers can find correlations between traits and genes. Volunteers will also provide their medical records. Storage will be required for these data sets and for the raw nucleotide data. The detailed medical information is expected to require no more than 100 megabytes (MB) of storage for each individual.
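The storage arithmetic implied above can be sketched as follows. This is only an illustration under stated assumptions: the brief does not give the size of the raw nucleotide data for one genome, so `ASSUMED_GENOME_BYTES` is a hypothetical placeholder (roughly one byte per base for about 3 billion bases), and the function name is made up for this example. RAID overhead is not included here.

```python
MB = 10**6  # decimal megabyte; the brief does not say whether MB means 10^6 or 2^20 bytes

# ASSUMPTION: raw nucleotide data per genome. This figure is NOT in the brief;
# 3,000 MB is a placeholder (about one byte per base for ~3 billion bases).
ASSUMED_GENOME_BYTES = 3_000 * MB

# From the brief: medical information needs at most 100 MB per individual.
MEDICAL_BYTES = 100 * MB


def logical_storage_bytes(individuals, genome_bytes=ASSUMED_GENOME_BYTES):
    """Usable bytes needed for `individuals` volunteers, before any RAID overhead."""
    return individuals * (genome_bytes + MEDICAL_BYTES)


pilot = logical_storage_bytes(1_000)    # pilot capacity by December 2012
monthly = logical_storage_bytes(5_000)  # growth in each later month

print(f"pilot:   {pilot / 10**12:.1f} TB")
print(f"monthly: {monthly / 10**12:.1f} TB")
```

Replacing the placeholder genome size with the project's real per-genome figure changes the totals proportionally; the 100 MB medical component is an upper bound taken from the brief.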

Since the data is to be publicly shared, an initial community of 25,000 active users is expected, and this community is expected to double every 18 months. Active users are expected to access 10% of the entire database daily, which is expected to create huge demand on the networking infrastructure. HTTP will be used for user navigation, search, and management, and FTP will be used for genome data transfer.
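The user growth described above is a simple exponential. A minimal sketch, assuming smooth doubling between the 18-month marks (the brief only states the doubling period, not the shape of the curve in between); the function name is hypothetical:

```python
def active_users(months_elapsed, initial_users=25_000, doubling_months=18):
    """Projected active users after `months_elapsed` months.

    Starts from the brief's 25,000 users and doubles every 18 months.
    ASSUMPTION: growth is continuous between doubling points.
    """
    return initial_users * 2 ** (months_elapsed / doubling_months)


for months in (0, 18, 36, 54):
    print(f"after {months:2d} months: {active_users(months):,.0f} users")
```

The same function can feed a load estimate once it is decided whether "10% of the entire database daily" applies to each user or to the community as a whole, which the brief leaves open.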

Also, the data center that houses the NAS and the research center that houses the genome sequencing equipment are in different buildings on the university campus. Storing the information for one genome in the NAS generates 25% traffic overhead.
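The effect of the 25% overhead on the link between the two buildings can be sketched as below. The schedule defaults (20 working days per month, 10 hours per day) are taken from question 4 of this brief; the genome size passed in the example is a hypothetical placeholder, since the brief does not state it, and the function name is invented for illustration. The sketch counts genome data only and treats 1 kbps as 1,000 bits per second.

```python
def link_kbps(genomes_per_month, genome_bytes, overhead=0.25,
              work_days=20, hours_per_day=10):
    """Average rate in kbps needed to move sequenced genomes to the NAS.

    overhead=0.25 models the 25% traffic overhead stated in the brief.
    The transfer is assumed to be spread evenly over the working hours.
    """
    seconds = work_days * hours_per_day * 3600
    bits_on_wire = genomes_per_month * genome_bytes * 8 * (1 + overhead)
    return bits_on_wire / seconds / 1000


# ASSUMPTION: 3 * 10**9 bytes per genome is a placeholder, not a figure from the brief.
rate = link_kbps(genomes_per_month=5_000, genome_bytes=3 * 10**9)
print(f"{rate:,.0f} kbps")
```

This gives an average rate only. Bursty transfers would need headroom above this figure, which is one of the things the traffic characterization in question 6 is meant to explore.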

You have been brought in as a network design consultant to help the Genome4U project, and the management team has asked you to help them organize their requirements.

They would appreciate your analysis to answer the following questions:

1. List the major user communities.

2. List project technical goals. Specify expected tradeoffs.

3. Calculate data storage requirements in the table below (10 marks). (Hint: do not forget the disk space that RAID 6 consumes.)

Parameter              | By December 2012 | Each month thereafter
Storage size           |                  |
Number of NAS servers  |                  |

4. Estimate the additional bandwidth requirements between the data center and research center buildings in kbps. Assume that a month has 20 working days and that the equipment operates 10 hours per day. (Show all your calculations.)

5. Can you determine the relationship between the storage size, number of genomes, number of users, and network capacity requirements? If possible, express this as an equation.

6. Characterize the network traffic in terms of flow, load, behavior, and QoS requirements. You will not be able to characterize the traffic precisely, but provide some theories about it and document the types of tests you would conduct to prove your theories right or wrong.
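For the RAID 6 hint in question 3, the following sketch shows how parity reduces usable capacity and how that feeds a server count. RAID 6 reserves the equivalent of two disks per array for parity and needs at least four disks. Everything about the NAS hardware here is an assumption: the brief does not specify disks per server or disk size, so the example values are hypothetical, each server is assumed to hold a single RAID 6 array, and the function names are invented for illustration.

```python
import math


def raid6_usable_bytes(disks, disk_bytes):
    """Usable capacity of one RAID 6 array: two disks' worth goes to parity."""
    if disks < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    return (disks - 2) * disk_bytes


def nas_servers_needed(required_bytes, disks_per_server, disk_bytes):
    """Whole servers needed, ASSUMING one RAID 6 array per server."""
    usable = raid6_usable_bytes(disks_per_server, disk_bytes)
    return math.ceil(required_bytes / usable)


# ASSUMPTION: hypothetical hardware, 12 disks of 2 TB per NAS server.
TB = 10**12
required = 41 * TB  # placeholder requirement, not a figure from the brief
print(nas_servers_needed(required, disks_per_server=12, disk_bytes=2 * TB))
```

With the real storage requirement and the actual NAS specification substituted in, the same two functions fill both columns of the table in question 3.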

