Intel Itanium 2
Configuration
1300 MHz (Madison, 130 nm) (HP rx2600, RAM: PC2100 ECC Registered DDR266A).
Cache
- L1 Data cache = 16 KB, 4-way, 64-byte line, write-through,
2 independent load ports and 2 store ports. Caches only integer loads.
- L1 Instruction cache = 16 KB, 4-way, 64-byte line.
- L2 cache size = 256 KB, 8-way, 128-byte line, 256-bit path to L1 Data cache, 4-ported for loads,
write-back with write-allocate policy.
- L3 cache size = 3 MB, 128-byte line, single-ported.
- DTLB1: 32 entries, fully associative (on a miss, the L2 data cache is used instead of the L1 data cache). 2 read ports and 1 write port,
supports 4 KB pages (or 4 KB subsections of larger pages).
- DTLB2: 128 entries, fully associative. 4 ports, 4 KB / 8 KB / 16 KB / 64 KB / 256 KB / 1 MB / 4 MB / 16 MB / 64 MB / 256 MB / 1 GB / 4 GB pages.
- ITLB1: 32 entries, fully associative (used only for the L1 ICache). 2 ports, supports 4 KB pages only (or 4 KB subsections of larger pages).
- ITLB2: 128 entries, fully associative. 4 ports, 4 KB to 4 GB pages.
- Virtual hash page table (VHPT) walker accesses the L2 cache.
- Advanced Load Address Table (ALAT): 32 entries. Used for data-speculative loads.
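TLB reach (entries x page size) follows directly from the figures in the list above. A minimal sketch, assuming every TLB entry maps a distinct page; the helper name tlb_reach is illustrative only. The two results appear to line up with the 128 K and 2M rows of the 16 KB pages table.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space covered when every entry maps a distinct page."""
    return entries * page_size


# Entry counts and page sizes are taken from the cache/TLB list above.
# DTLB1 maps 4 KB pages (or 4 KB subsections of larger pages).
l1_dtlb_reach = tlb_reach(32, 4 * KB)
# DTLB2 with the 16 KB page size used in the latency measurements.
l2_dtlb_reach_16k = tlb_reach(128, 16 * KB)

print(l1_dtlb_reach // KB, "KB")      # 128 KB
print(l2_dtlb_reach_16k // MB, "MB")  # 2 MB
```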
16 KB pages mode
Size | Latency (cycles) | Description |
16 K | 2 | L1 TLB + L1 |
128 K | 6 | + 4 (L1 miss -> L2 hit) |
256 K | 10 | + 4 (L1-TLB miss -> L2-TLB hit) |
2 M | 20 | + 10 (L2 miss -> L3 hit) |
3 M | 35 | + 15 (L2-TLB miss -> VHPT walker to L2-Cache) |
... | 35 + 160 ns | + RAM (L3-Cache Miss) |
Note: L1 latency is 1 cycle if the loaded data is consumed by an ALU instruction.
- RAM Read B/W (4 Bytes stride) = 840 MB/s
- RAM Write B/W (4 Bytes stride) = 800 MB/s
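The latency table can be turned into a rough lookup that converts cycles to nanoseconds at the 1300 MHz clock from the Configuration section. This is only a sketch: it assumes the Size column is the upper bound of the working set for each row, and the names LATENCY_STEPS, cycles_to_ns and expected_latency_ns are illustrative, not part of any tool.

```python
KB = 1024
MB = 1024 * KB

CLOCK_MHZ = 1300    # core clock from the Configuration section
RAM_EXTRA_NS = 160  # extra time on an L3 miss, from the last table row

# (assumed upper bound of working set in bytes, latency in cycles)
LATENCY_STEPS = [
    (16 * KB, 2),    # L1 TLB + L1
    (128 * KB, 6),   # L1 miss -> L2 hit
    (256 * KB, 10),  # L1-TLB miss -> L2-TLB hit
    (2 * MB, 20),    # L2 miss -> L3 hit
    (3 * MB, 35),    # L2-TLB miss -> VHPT walker
]


def cycles_to_ns(cycles: float) -> float:
    """Convert core cycles to nanoseconds at CLOCK_MHZ."""
    return cycles * 1000.0 / CLOCK_MHZ


def expected_latency_ns(working_set: int) -> float:
    """Look up the table row covering working_set and return latency in ns."""
    for limit, cycles in LATENCY_STEPS:
        if working_set <= limit:
            return cycles_to_ns(cycles)
    # Beyond L3: 35 cycles plus the RAM access time.
    return cycles_to_ns(35) + RAM_EXTRA_NS


print(round(expected_latency_ns(64 * MB), 1))  # 186.9
```

With these numbers a full miss to RAM costs about 27 ns of on-chip latency plus 160 ns, roughly 187 ns in total.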
Pipeline
# | Name | Description |
1 | IPG | Instruction pointer generation. L1 ITLB and L1I access. |
2 | ROT | Instruction rotation. |
3 | EXP | Instruction template decode, expand, and disperse. |
4 | REN | Rename (for register stack and rotating registers) and decode. |
5 | REG | Register file read. |
6 | EXE | ALU execution. |
7 | DET | Last stage for exception detection. |
8 | WRB | Write back. |
Links
Itanium 2 at Wikipedia
Itanium at Intel