to the CPUs with a crossbar switch, as shown in Fig. 1-6(a). Each CPU and each memory has a connection coming out of it, as shown. At every intersection is a tiny electronic crosspoint switch that can be opened and closed in hardware. When a CPU wants to access a particular memory, the crosspoint switch connecting them is closed momentarily, to allow the access to take place. The virtue of the crossbar switch is that many CPUs can be accessing memory at the same time, although if two CPUs try to access the same memory simultaneously, one of them will have to wait.



Fig. 1-6. (a) A crossbar switch. (b) An omega switching network.

The downside of the crossbar switch is that with n CPUs and n memories, n² crosspoint switches are needed. For large n, this number can be prohibitive.

As a result, people have looked for, and found, alternative switching networks that require fewer switches. The omega network of Fig. 1-6(b) is one example. This network contains four 2 x 2 switches, each having two inputs and two outputs. Each switch can route either input to either output. A careful look at the figure will show that with proper settings of the switches, every CPU can access every memory. These switches can be set in nanoseconds or less.
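
The text does not say how the switch settings are chosen, but the standard scheme for omega networks is destination-tag routing: the wiring between stages is a perfect shuffle, and at each stage a 2 x 2 switch routes the request up or down according to the corresponding bit (most significant bit first) of the destination memory's number. The sketch below is only an illustration of that idea under those assumptions, not code from the book or any real machine; the function names are invented for the example.

    # Illustrative sketch of destination-tag routing through an omega network.
    # Assumes n is a power of two; omega_route and perfect_shuffle are made-up names.

    def perfect_shuffle(pos: int, bits: int) -> int:
        """Rotate a 'bits'-bit line number one position to the left."""
        msb = (pos >> (bits - 1)) & 1
        return ((pos << 1) | msb) & ((1 << bits) - 1)

    def omega_route(src: int, dst: int, n: int) -> list:
        """Return the line positions a request visits going from CPU src to memory dst."""
        stages = n.bit_length() - 1                    # log2(n) switching stages
        pos, path = src, [src]
        for stage in range(stages):
            pos = perfect_shuffle(pos, stages)         # shuffle wiring between stages
            want = (dst >> (stages - 1 - stage)) & 1   # destination bit for this stage
            pos = (pos & ~1) | want                    # switch output: 0 = upper, 1 = lower
            path.append(pos)
        return path

    # With 8 CPUs and 8 memories, CPU 3 reaches memory 6 in log2(8) = 3 stages.
    print(omega_route(3, 6, 8))                        # last entry is 6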









In the general case, with n CPUs and n memories, the omega network requires log₂n switching stages, each containing n/2 switches, for a total of (n log₂n)/2 switches. Although for large n this is much better than n², it is still substantial.
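
As a sanity check on these two formulas, the short, purely illustrative calculation below tabulates the n² crosspoint switches of a crossbar against the (n log₂n)/2 switches of an omega network for a few machine sizes (it is not code from the book).

    # Crossbar vs. omega network switch counts (illustrative calculation only).
    from math import log2

    for n in (16, 64, 256, 1024):
        crossbar = n * n                      # n^2 crosspoint switches
        omega = (n * int(log2(n))) // 2       # log2(n) stages of n/2 switches each
        print(f"n={n:5d}  crossbar={crossbar:8d}  omega={omega:6d}")
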
Furthermore, there is another problem: delay. For example, for n = 1024, there are 10 switching stages from the CPU to the memory, and another 10 for the word requested to come back. Suppose that the CPU is a modern RISC chip running at 100 MIPS; that is, the instruction execution time is 10 nsec. If a memory request is to traverse a total of 20 switching stages (10 outbound and 10 back) in 10 nsec, the switching time must be 500 picosec (0.5 nsec). The complete multiprocessor will need 5120 500-picosec switches. This is not going to be cheap.
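
The same back-of-the-envelope arithmetic can be written down directly; the calculation below merely restates the figures from this paragraph for n = 1024.

    # Restating the delay arithmetic from the text for a 1024-CPU omega network.
    n = 1024
    stages_each_way = 10                  # log2(1024)
    total_stages = 2 * stages_each_way    # 10 outbound plus 10 back
    budget_nsec = 10.0                    # one instruction time at 100 MIPS

    per_switch_nsec = budget_nsec / total_stages
    switches_needed = stages_each_way * (n // 2)

    print(per_switch_nsec * 1000)         # 500 picosec per switch
    print(switches_needed)                # 5120 switches in all
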
People have attempted to reduce the cost by going to hierarchical systems. Some memory is associated with each CPU. Each CPU can access its own local memory quickly, but accessing anybody else's memory is slower. This design gives rise to what is known as a NUMA (NonUniform Memory Access) machine. Although NUMA machines have better average access times than machines based on omega networks, they have the new complication that the placement of the programs and data becomes critical in order to make most accesses go to the local memory.
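
Why placement is so critical can be seen from a small, purely hypothetical model: if a fraction f of a program's references hit its own local memory and the rest go remote, the average access time degrades quickly as f falls. The latencies below are invented for illustration, not taken from any particular NUMA machine.

    # Hypothetical NUMA average-access-time model; the latencies are made-up examples.
    LOCAL_NSEC, REMOTE_NSEC = 10.0, 100.0      # assumed local and remote access times

    for f in (1.0, 0.9, 0.5, 0.1):             # fraction of accesses that are local
        avg = f * LOCAL_NSEC + (1.0 - f) * REMOTE_NSEC
        print(f"local fraction {f:.1f}: average access {avg:5.1f} nsec")
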
To summarize, bus-based multiprocessors, even with snoopy caches, are limited by the amount of bus capacity to about 64 CPUs at most. To go beyond that requires a switching network, such as a crossbar switch, an omega switching network, or something similar. Large crossbar switches are very expensive, and large omega networks are both expensive and slow. NUMA machines require complex algorithms for good software placement. The conclusion is clear: building a large, tightly-coupled, shared memory multiprocessor is possible, but is difficult and expensive.

1.3.3. Bus-Based Multicomputers

On the other hand, building a multicomputer (i.e., no shared memory) is easy. Each CPU has a direct connection to its own local memory. The only problem left is how the CPUs communicate with each other. Clearly, some interconnection scheme is needed here, too, but since it is only for CPU-to-CPU communication, the volume of traffic will be several orders of magnitude lower than when the interconnection network is also used for CPU-to-memory traffic.
In Fig. 1-7 we see a bus-based multicomputer. It looks topologically similar to the bus-based multiprocessor, but since there will be much less traffic over it, it need not be a high-speed backplane bus. In fact, it can be a much lower speed LAN (typically, 10-100 Mbps, compared to 300 Mbps and up for a backplane bus). Thus Fig. 1-7 is more often a collection of workstations on a LAN than a collection of CPU cards inserted into a fast bus (although the latter configuration is definitely a possible design).



Fig. 1-7. A multicomputer consisting of workstations on a LAN.


1.3.4. Switched Multicomputers

Our last category consists of switched multicomputers. Various interconnection networks have been proposed and built, but all have the property that each CPU has direct and exclusive access to its own, private memory. Figure 1-8 shows two popular topologies, a grid and a hypercube. Grids are easy to understand and lay out on printed circuit boards. They are best suited to problems that have an inherent two-dimensional nature, such as graph theory or vision (e.g., robot eyes or analyzing photographs).

















Fig. 1-8. (a) Grid. (b) Hypercube.

A hypercube is an n-dimensional cube. The hypercube of Fig. 1-8(b) is four-dimensional. It can be thought of as two ordinary cubes, each with 8 vertices and 12 edges. Each vertex is a CPU. Each edge is a connection between two CPUs. The corresponding vertices in each of the two cubes are connected.
To expand the hypercube to five dimensions, we would add another set of two interconnected cubes to the figure, connect the corresponding edges in the two halves, and so on.







For an n-dimensional hypercube, each CPU has n connections to other CPUs. Thus the complexity of the wiring increases only logarithmically with the size. Since only nearest neighbors are connected, many messages have to make several hops to reach their destination. However, the longest possible path also grows logarithmically with the size, in contrast to the grid, where it grows as the square root of the number of CPUs. Hypercubes with 1024 CPUs have been commercially available for several years, and hypercubes with as many as 16,384 CPUs are starting to become available.
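
Both scaling claims are easy to verify: if the CPUs of a hypercube are numbered in binary, neighbors differ in exactly one address bit, so the longest path (the diameter) equals the dimension, log₂ of the number of CPUs, whereas a square grid's diameter grows roughly as twice the square root of the number of CPUs. The sketch below is an illustrative check under those assumptions; the helper names are invented for the example.

    # Hypercube vs. grid diameters (illustrative check of the scaling argument).
    from math import isqrt, log2

    def hypercube_neighbors(cpu, dim):
        """In a dim-dimensional hypercube, neighbors differ in exactly one address bit."""
        return [cpu ^ (1 << b) for b in range(dim)]

    def hypercube_diameter(n_cpus):
        return int(log2(n_cpus))          # longest path = number of dimensions

    def grid_diameter(n_cpus):
        side = isqrt(n_cpus)              # assume a square mesh of side x side CPUs
        return 2 * (side - 1)             # corner-to-corner hop count

    for n in (64, 1024, 16384):
        print(n, hypercube_diameter(n), grid_diameter(n))

    # CPU 5 (0101) in the four-dimensional hypercube of Fig. 1-8(b) has neighbors 4, 7, 1, 13.
    print(hypercube_neighbors(5, 4))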
