
COLLECTIVE INTELLIGENT BRICKS

  • "COLLECTIVE INTELLIGENT BRICKSTo access IceCube storage. Blue Gene/L [17] or other application servers must install a GPFScluster (separate from the IceCube GPFS cluster), in which a GPFS node runs on each BlueGene/L node that requires access to IceCube storage. The Blue Gene/L GPFS cluster remotelymounts an IceCube GPFS file system, allowing it to make file I/O requests to the IceCube filesystems. When such requests are made, the file requests are mapped by the Blue Gene/L nodes toNSD nodes in the IceCube GPFS cluster, and the file requests are sent directly to the appropriatebricks from the Blue Gene/L nodes. The management processor is not involved in this process. GPFS includes sophisticated controls for disk management, including performing data andmetadata striping, mirroring, rebalancing, and migration. The Kybos management softwareleverages those controls as described in the next section. Kybos The purpose of the Kybos software is to reduce the administrative effort required tomaintain a storage system. Its management effort scales with the number of applications usingthe storage system rather t\\han with the number of disks in the system, as in many storagesystems today. It is based on a policy-oriented management interface. An administrator classifiessets of data for each application according to a small number of clear and simple service goals,along with rules for the identification of newly created data (by directory, by user ID, by filename, by file type, etc.). Kybos implements a self-management control system that performs three functions.First, it monitors the state of the system as reported by the hardware sensors described earlier andthe state of the file system as reported through the GPFS administrative interface. Second, itdetects important events by analyzing the current state and trends of the system against the dataservice goals set by administrators, along with other system invariants (such as routing statisticsand acceptable voltage and temperature ranges). Third, it schedules activities (such as datamigration, restoring of desired data redundancy, and replication or backup of sets of data) torealign the system state with the data service goals.In the Kybos model, the system administrator expresses the goals for a given set of databy creating a Kybos resource pool, described in terms of capacity (lower reserve and upper limitDEPT OF IT, PDCE 2009-2010 Page 37 COLLECTIVE INTELLIGENT BRICKSon the number of gigabytes of storage), performance (lower reserve and upper limit on thethroughput and/or response time), and reliability (e.g., for the active copy, the number of datalosses per exabyte-year of stored data that are acceptable, and for GPFS secondary copies, arecovery- time objective). Kybos remembers the goals for resource pools, and when a new file iscreated, identifies the resource pool to which the new file is to be assigned according to theattributes of the file. It then determines how the file is to be placed on physical disks. Kybos self-management relieves administrators of other device- oriented management tasks aswell. Today, for example, it requires many distinct steps to install additional physical storage andmake it visible in a typical storage area network (SAN)-based storage server. The steps includegathering requirements, installing the disks, controllers and cables, configuring the RAID arraysand virtual disks, host mapping, zoning, initiating the rediscovery of SAN devices in theapplication servers, and extending the file system. 
Kybos self-management relieves administrators of other device-oriented management tasks as well. Today, for example, many distinct steps are required to install additional physical storage and make it visible in a typical storage area network (SAN)-based storage server. The steps include gathering requirements; installing the disks, controllers, and cables; configuring the RAID arrays and virtual disks; host mapping; zoning; initiating the rediscovery of SAN devices in the application servers; and extending the file system. In one case study, 25 person-days were required to perform these tasks. Using the Kybos management model, the time was estimated to be reduced to seven person-days, most of which were required for high-level planning. This is because the administrator is expected simply to place the bricks physically in the cube and define the rather obvious resource pool parameters, leaving Kybos to discover, power, and use the bricks.

8.4. System status

The prototype cube is connected via Gigabit Ethernet links to a two-rack Blue Gene/L system and to other application servers at the IBM Almaden Research Center. The mesh routing control protocol has been implemented and demonstrated to work in scenarios with multiple failures. File storage and access from Blue Gene/L using GPFS, as well as hardware monitoring and visualization, are complete and have been demonstrated.

The first integrated version of the Kybos self-management software is currently being implemented. It provides resource pools with the ability to set coarse-grain capacity goals and limited reliability goals (mirroring/no mirroring of the primary data copy). The first version will also implement all hardware monitoring and safety controls for the prototype and sufficient GPFS monitoring and control to detect failed components and invoke recovery actions in GPFS. A more extensive set of self-management algorithms is also being developed on the Kybos simulation platform.

(Figure: Calculated bandwidths as a function of IceCube dimensions.)

Other file services

Thus far, the sole storage access method used has been GPFS remote mounting of the IceCube GPFS file systems, although other access methods can be deployed. For example, selected bricks could also run Network File System (NFS) servers, or all bricks could run distributed NFS servers. Other IBM projects at Almaden Research are building such capabilities. Common Internet File System (CIFS) is another option that could be implemented on the system. Note that the cube can, in principle, run any software developed for a Linux cluster, because that is what it is.

Note that the prototype IceCube system is built with integrated circuits that were first introduced several years ago. Thus, the prototype system is about an order of magnitude slower, for all interesting parameters, than what one could build today with the same number of bricks. Nevertheless, even the IceCube prototype is quite a capable system. For example, it could store 5,000 movies in DVD format and simultaneously stream 900 MPEG-2 streams to users. These numbers are based on measured sustained data transfer rates to application servers.
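A quick sanity check of these figures is possible using the 26-TB prototype capacity cited in the conclusion. The 4.7-GB single-layer DVD image and the 6-Mb/s MPEG-2 stream rate below are typical values assumed for illustration, not numbers taken from the measurements:

```python
# Back-of-envelope check of the prototype's media-serving figures.
PROTOTYPE_CAPACITY_TB = 26     # 27-brick prototype (see Conclusion)
DVD_SIZE_GB = 4.7              # assumed single-layer DVD image
MPEG2_RATE_MBPS = 6            # assumed average MPEG-2 stream bit rate

movies = PROTOTYPE_CAPACITY_TB * 1000 / DVD_SIZE_GB
aggregate_stream_gbps = 900 * MPEG2_RATE_MBPS / 1000

print(f"~{movies:,.0f} DVD-sized movies fit in {PROTOTYPE_CAPACITY_TB} TB")
print(f"900 MPEG-2 streams need ~{aggregate_stream_gbps:.1f} Gb/s sustained")
```

Under these assumptions, roughly 5,500 DVD-sized movies fit in 26 TB, and 900 streams correspond to about 5.4 Gb/s of sustained delivery, consistent with the figures quoted above.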
Very large systems

The following discusses the scalability of the IceCube architecture to very large systems. As discussed earlier, thermal and power constraints are not an important issue for this architecture. A critical concern is floor loading. Typical limits today are 500-1,200 kg/m² (100-250 lb/ft²). This limits the number of bricks that can be vertically stacked and forces large systems to spread out horizontally. Eventually, this changes the average distance messages must travel between nodes from scaling with the cube root of the number of bricks (for cubes) to the square root (for a flat mesh), which is a less favorable scaling. Whether or not this becomes an issue depends on the application. A storage server will be more latency-tolerant (in terms of hop count) than a supercomputer doing more closely coupled computations. Systems with 1,000 to 2,000 bricks (e.g., five or six bricks high and 15 to 20 bricks wide and deep) appear to be a practical upper limit. This corresponds to a dynamic scaling range of two orders of magnitude if one assumes that the smallest practical system contains ten bricks.

However, it would be impractical to provide a single base of such a size. Rather, one could partition the system into smaller subcubes and assemble the system in situ from these subcubes. It is likely that one could build self-aligning capacitive couplers that could tolerate centimeter-scale misalignments between subcubes. Such couplers would remove the need to connect subcubes via cables or fibers; rather, the same 3D mesh structure could be extended across the entire system. This would greatly simplify the assembly of very large systems.

Commodity serial ATA 3.5-in. disk drives should approach one-terabyte capacity in 2007. A 1,000-brick system, each brick containing eight such drives, provides eight petabytes (PB) of raw storage capacity. The use of a strong dRAID algorithm reduces the usable storage capacity to 4-6 PB [16], with a data-loss probability from simultaneous failures of only a few events per exabyte-year. This is an interesting system for many emerging applications that require storing large amounts of data, such as images, digital media, and regulatory compliance data.
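The two estimates above, 8 PB of raw capacity and the shift from cube-root to square-root hop scaling, can be illustrated with a back-of-envelope calculation. The uniform-traffic hop model, the 6-brick stacking height, and the 10,000-brick data point are assumptions introduced here only to show the trend; they are not taken from the text:

```python
# Rough illustration of the capacity and hop-count scaling arguments above.
BRICKS = 1000
DRIVES_PER_BRICK = 8
DRIVE_TB = 1.0                                  # ~1-TB SATA drives

raw_pb = BRICKS * DRIVES_PER_BRICK * DRIVE_TB / 1000
print(f"raw capacity: {raw_pb:.0f} PB (4-6 PB usable after dRAID redundancy)")

# Mean separation of two uniformly random points along an axis of length L
# is about L/3, so the mean hop count is roughly the sum of L/3 over the axes.
def mean_hops_cube(n: float) -> float:
    side = n ** (1 / 3)
    return 3 * side / 3                          # cube: scales as n**(1/3)

def mean_hops_slab(n: float, height: float = 6) -> float:
    side = (n / height) ** 0.5
    return 2 * side / 3 + height / 3             # flat mesh: scales as n**(1/2)

for n in (1000, 2000, 10000):                    # 10,000 only to show the trend
    print(f"{n:6d} bricks: cube ~{mean_hops_cube(n):5.1f} hops,"
          f" 6-high slab ~{mean_hops_slab(n):5.1f} hops")
```

At 1,000 to 2,000 bricks the flattened layout costs only a few extra hops on average; the penalty grows markedly only well beyond the practical upper limit stated above, which is why the effect matters more for tightly coupled computation than for storage serving.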
Small systems

The brick architecture as described is designed for medium-sized to very large systems, where the failure of a few bricks makes little difference. Nevertheless, the ease of management and scalability promised by the architecture has prompted inquiries about its applicability to systems for the small and medium-sized business market, which require only a small number of bricks. For systems with fewer than approximately eight bricks, in which the failure of a single brick becomes noticeable, the brick design should be modified to include brick-internal RAID and possibly a provision to hot-swap bricks.

Petaflops compute servers

This project was conceived as an architecture for petaflops supercomputing, not as a storage server. Seymour Cray [20] once said: "It's the heat and the thickness of the (wiring) mat which matters." That is still true today, and the Intelligent Bricks architecture directly addresses these two issues. Very powerful processors can be used because they can be cooled. The architecture also allows scaling to large numbers of nodes and provides high, low-cost bandwidth between nodes. Communication latency, because of the multihop architecture, is larger than for centralized switch architectures, but 15 years of experience with message-passing supercomputers has shown that latency is dominated by software latency at the endpoint nodes, not by hardware communication latency [21].

There is one important issue with the fail-in-place assumption for compute applications: unlike storage servers, supercomputers run a wide variety of software that is provided by users, not by the system vendor. If an application assumes that allocated resources will never go away, it cannot deal with failing nodes without aborting and remapping the problem. Fortunately, a growing part of the information technology industry is dedicated to grid computing. The grid computing community must solve exactly the same problem, and any solution found will be directly applicable to brick-based supercomputers.

8.5. Related work

The approach of distributing data across independent nodes to build scalable storage systems has been explored by both academic and commercial projects. These include DataMesh [22], FAB [23], Self-* [24], Petal [25], and OceanStore [26], which are primarily research projects.

To the authors' knowledge, no company has realized the 3D brick packaging described here. However, numerous companies are working on various forms of distributed enterprise storage. Among these companies are Panasas [27] (scalable storage for Linux clusters), Isilon (distributed file system on standard hardware), Pivot3, Pillar Data Systems (storage management), Lefthand Networks (iSCSI IP-based SAN software), Equallogic (iSCSI-based systems), Ibrix (storage software suite for the enterprise), Cluster File Systems (Lustre open-source object store software), Google (GFS), and Archivas (software for reference storage).

9. CONCLUSION

The Intelligent Bricks project is demonstrating the feasibility of brick architectures. The exact meaning of this term is defined in this paper. The salient feature is not the physical shape of the packaging units, but rather the biologically inspired model that any brick in a system, like a cell in an organism, may fail without noticeably affecting the operation of the system, as long as most bricks are functioning normally. The required repair actions are done by system software, without human intervention.

This model has numerous positive consequences. It simplifies system management, eliminates common system failures caused by inappropriate human intervention, and allows highly efficient packaging with a concomitant improvement in important parameters such as scalability, communications performance, system density, and thermal management.

A working prototype of such a system has been built at the IBM Almaden Research Center and performs as expected. It implements a 27-brick, 26-TB storage server. The system software is based on Linux (the operating system for each processor), GPFS (the distributed file system), and Kybos (the management software developed specifically for the project). While it is purely a research project, extensive external exposure to IBM customers has yielded strong confirmation that the objectives of the project are well aligned with market needs.

Acknowledgments

This work has been funded by the IBM Research Division over a period of several years. The authors especially thank Drs. Jai Menon, Dilip Kandlur, Robin Williams, Eric Kronstadt, and Robert Morris for their long-term support of this project. We extend our profound thanks to Jim Speidell of the IBM Thomas J. Watson Research Center, Dave Altknecht of the IBM Almaden Research Center, Karl-Heinz Lehnert, IBM E&TS, Mainz, and Bob Steinbugler for their strong management support, and to Dr. Gaby Persch-Schuy, IBM E&TS, Mainz. We thank Mike Rogers, Vincent Arena, Andy Ferez, and Ron Ridgeway for their work related to the design and manufacturing of the brick electronics, and Aaron Cox, IBM Tucson, for the industrial designs. Dr. Roger Schmidt of IBM Poughkeepsie was the source of
