to the CPUs with a crossbar switch, as shown in Fig. 1-6(a). Each CPU and each memory has a connection coming out of it, as shown. At every intersection is a tiny electronic crosspoint switch that can be opened and closed in hardware. When a CPU wants to access a particular memory, the crosspoint switch connecting them is closed momentarily, to allow the access to take place. The virtue of the crossbar switch is that many CPUs can be accessing memory at the same time, although if two CPUs try to access the same memory simultaneously, one of them will have to wait.



Fig. 1-6. (a) A crossbar switch. (b) An omega switching network.

The downside of the crossbar switch is that with n CPUs and n memories, n² crosspoint switches are needed. For large n, this number can be prohibitive.

As a result, people have looked for, and found, alternative switching networks that require fewer switches. The omega network of Fig. 1-6(b) is one example. This network contains four 2 x 2 switches, each having two inputs and two outputs. Each switch can route either input to either output. A careful look at the figure will show that with proper settings of the switches, every CPU can access every memory. These switches can be set in nanoseconds or less.
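To see how a request finds its way through such a network, consider the destination-tag routing scheme commonly used with omega networks: between stages the wiring performs a perfect shuffle of the position bits, and each 2 x 2 switch then routes up or down according to the next bit of the destination address. Below is a minimal sketch in Python (ours, not the book's; the function name is invented for illustration):

    # Destination-tag routing in an omega network with 2**k CPUs and memories.
    # Illustrative sketch; omega_route is a hypothetical helper, not from the text.
    def omega_route(src: int, dst: int, k: int) -> list:
        """Return the sequence of positions a request visits from src to dst."""
        mask = (1 << k) - 1
        pos, path = src, [src]
        for stage in range(k):
            pos = ((pos << 1) | (pos >> (k - 1))) & mask  # perfect shuffle (rotate left)
            bit = (dst >> (k - 1 - stage)) & 1            # next destination bit, MSB first
            pos = (pos & ~1) | bit                        # switch routes up (0) or down (1)
            path.append(pos)
        return path

    # With k = 2 (the four-switch network of Fig. 1-6(b)), every CPU
    # reaches every memory, as the text claims:
    assert all(omega_route(s, d, 2)[-1] == d
               for s in range(4) for d in range(4))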









In the general case, with n CPUs and n memories, the omega network requires log₂ n switching stages, each containing n/2 switches, for a total of (n log₂ n)/2 switches. Although for large n this is much better than n², it is still substantial.
Furthermore, there is another problem: delay. For example, for n = 1024, there are 10 switching stages from the CPU to the memory, and another 10 for the word requested to come back. Suppose that the CPU is a modern RISC chip running at 100 MIPS; that is, the instruction execution time is 10 nsec. If a memory request is to traverse a total of 20 switching stages (10 outbound and 10 back) in 10 nsec, the switching time must be 500 picosec (0.5 nsec). The complete multiprocessor will need 5120 500-picosec switches. This is not going to be cheap.
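As a quick check, the arithmetic above can be reproduced directly; the following sketch just restates the numbers assumed in the text:

    # Switch counts and the per-switch time budget for n = 1024
    # (all values taken from the text above).
    import math

    n = 1024
    stages = int(math.log2(n))                  # 10 switching stages each way
    omega_switches = stages * (n // 2)          # 10 * 512 = 5120 switches
    crossbar_switches = n * n                   # 1,048,576 crosspoints, for comparison

    round_trip = 2 * stages                     # 10 outbound + 10 back = 20 stages
    budget_nsec = 10.0                          # one instruction time at 100 MIPS
    per_switch_nsec = budget_nsec / round_trip  # 0.5 nsec = 500 picosec per switch

    print(omega_switches, crossbar_switches, per_switch_nsec)  # 5120 1048576 0.5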
People have attempted to reduce the cost by going to hierarchical systems. Some memory is associated with each CPU. Each CPU can access its own local memory quickly, but accessing anybody else's memory is slower. This design gives rise to what is known as a NUMA (NonUniform Memory Access) machine. Although NUMA machines have better average access times than machines based on omega networks, they have the new complication that the placement of programs and data becomes critical in order to make most accesses go to the local memory.
To summarize, bus-based multiprocessors, even with snoopy caches, are limited by the amount of bus capacity to about 64 CPUs at most. To go beyond that requires a switching network, such as a crossbar switch, an omega switching network, or something similar. Large crossbar switches are very expensive, and large omega networks are both expensive and slow. NUMA machines require complex algorithms for good software placement. The conclusion is clear: building a large, tightly-coupled, shared memory multiprocessor is possible, but is difficult and expensive.

1.3.3. Bus-Based Multicomputers
On the other hand, building a multicomputer (i.e., no shared memory) is easy. Each CPU has a direct connection to its own local memory. The only problem left is how the CPUs communicate with each other. Clearly, some interconnection scheme is needed here, too, but since it is only for CPU-to-CPU communication, the volume of traffic will be several orders of magnitude lower than when the interconnection network is also used for CPU-to-memory traffic.
In Fig. 1-7 we see a bus-based multicomputer. It looks topologically similar to the bus-based multiprocessor, but since there will be much less traffic over it, it need not be a high-speed backplane bus. In fact, it can be a much lower speed LAN (typically, 10-100 Mbps, compared to 300 Mbps and up for a backplane bus). Thus Fig. 1-7 is more often a collection of workstations on a LAN than a collection of CPU cards inserted into a fast bus (although the latter configuration is definitely a possible design).



Fig. 1-7. A multicomputer consisting of workstations on a LAN.


1.3.4. Switched Multicomputers

Our last category consists of switched multicomputers. Various interconnection networks have been proposed and built, but all have the property that each CPU has direct and exclusive access to its own, private memory. Figure 1-8 shows two popular topologies, a grid and a hypercube. Grids are easy to understand and lay out on printed circuit boards. They are best suited to problems that have an inherent two-dimensional nature, such as graph theory or vision (e.g., robot eyes or analyzing photographs).

Fig. 1-8. (a) Grid. (b) Hypercube.
A hypercube is an n-dimensional cube. The hypercube of Fig. 1-8(b) is four-dimensional. It can be thought of as two ordinary cubes, each with 8 vertices and 12 edges. Each vertex is a CPU. Each edge is a connection between two CPUs. The corresponding vertices in each of the two cubes are connected. To expand the hypercube to five dimensions, we would add another set of two interconnected cubes to the figure, connect the corresponding vertices in the two halves, and so on. For an n-dimensional hypercube, each CPU has n connections to other CPUs. Thus the complexity of the wiring increases only logarithmically with the size. Since only nearest neighbors are connected, many messages have to make several hops to reach their destination. However, the longest possible path also grows logarithmically with the size, in contrast to the grid, where it grows as the square root of the number of CPUs. Hypercubes with 1024 CPUs have been commercially available for several years, and hypercubes with as many as 16,384 CPUs are starting to become available.
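The contrast between the two topologies is easy to quantify. Here is a minimal sketch (ours, not the book's; it assumes a square grid with no wraparound links) comparing the longest possible path in a hypercube and in a grid with the same number of CPUs:

    # Diameter (longest shortest path) of a hypercube versus a square grid,
    # assuming the grid has no wraparound connections.
    import math

    for cpus in (64, 1024, 16384):
        dim = int(math.log2(cpus))       # hypercube dimension = links per CPU
        hypercube_diameter = dim         # worst case: flip every address bit
        side = math.isqrt(cpus)          # the grid is side x side
        grid_diameter = 2 * (side - 1)   # corner to opposite corner
        print(cpus, hypercube_diameter, grid_diameter)
    # 64 CPUs: 6 vs. 14 hops; 1024 CPUs: 10 vs. 62; 16,384 CPUs: 14 vs. 254

Even at 16,384 CPUs a message never needs more than 14 hops in the hypercube, whereas the corresponding grid requires up to 254.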
