Cache

Table of contents

  1. Common designs
  2. Cache operations
  3. Cache write policies
  4. Virtual or physical addr
  5. Cache coherency

1. Common designs ↑top

2. Cache operations ↑top

Fig.1 - A cache organization.

As a summary:

3. Cache write policies ↑top

If the data is already in the cache (a write hit), the policy choice is write-through vs. write-back.

If the data is not in the cache (a write miss), the policy choice is write-allocate vs. no-write-allocate.

write-through vs. write-back

A cache with a write-through policy (and write-allocate) reads an entire block (cacheline) from memory on a cache miss, and on a store writes only the updated word through to memory. Evictions do not need to write to memory.

A cache with a write-back policy (and write-allocate) also reads an entire block (cacheline) from memory on a cache miss, but may first have to write back a dirty victim cacheline. Any write to memory must cover the entire cacheline, since a single dirty bit cannot tell which word was modified. Evicting a dirty cacheline therefore causes a write to memory.

Write-through is slower but simpler (memory is always consistent); write-back is faster but becomes complicated when multiple cores share memory, which requires a cache coherency protocol.
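
To make the two policies concrete, here is a minimal C++ sketch of the store path for a direct-mapped, write-allocate cache. The geometry (16 KiB, 64 B lines) and all names (CacheLine, store_write_through, store_write_back) are illustrative assumptions, not taken from any particular hardware.

    #include <cstdint>
    #include <cstring>

    // Illustrative geometry: direct-mapped, 16 KiB, 64 B lines, write-allocate.
    constexpr int LINE_BYTES = 64;
    constexpr int NUM_SETS   = 256;
    uint8_t main_memory[1 << 20];              // 1 MiB of backing storage

    struct CacheLine {
        bool     valid = false;
        bool     dirty = false;                // only meaningful for write-back
        uint64_t tag   = 0;
        uint8_t  data[LINE_BYTES];
    };
    CacheLine cache[NUM_SETS];

    static uint64_t set_of(uint64_t a)  { return (a / LINE_BYTES) % NUM_SETS; }
    static uint64_t tag_of(uint64_t a)  { return (a / LINE_BYTES) / NUM_SETS; }
    static uint64_t base_of(uint64_t a) { return a & ~(uint64_t)(LINE_BYTES - 1); }

    // Write-allocate miss handling, shared by both policies: evict, then fill.
    static CacheLine& fill(uint64_t addr, bool write_back_policy) {
        CacheLine& line = cache[set_of(addr)];
        if (write_back_policy && line.valid && line.dirty) {
            // Write back the whole victim line: with one dirty bit per line we
            // cannot tell which word changed, so the entire line goes to memory.
            uint64_t victim = (line.tag * NUM_SETS + set_of(addr)) * LINE_BYTES;
            std::memcpy(&main_memory[victim], line.data, LINE_BYTES);
        }
        line.valid = true;
        line.dirty = false;
        line.tag   = tag_of(addr);
        std::memcpy(line.data, &main_memory[base_of(addr)], LINE_BYTES);  // read whole block
        return line;
    }

    // Store one aligned 32-bit word under each policy.
    void store_write_through(uint64_t addr, uint32_t value) {
        CacheLine& line = cache[set_of(addr)];
        if (!line.valid || line.tag != tag_of(addr)) fill(addr, false);
        std::memcpy(&line.data[addr % LINE_BYTES], &value, 4);
        std::memcpy(&main_memory[addr], &value, 4);   // only the updated word goes to memory
    }

    void store_write_back(uint64_t addr, uint32_t value) {
        CacheLine& line = cache[set_of(addr)];
        if (!line.valid || line.tag != tag_of(addr)) fill(addr, true);
        std::memcpy(&line.data[addr % LINE_BYTES], &value, 4);
        line.dirty = true;                            // memory is updated only on eviction
    }

Note how the dirty bit is the only extra state write-back needs, and how the memory traffic moves from every store (write-through) to eviction time (write-back).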

4. Virtual or physical addr ↑top

TLBs are small (maybe 64 entries), fully-associative caches for page table entries.
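
As a rough illustration, a fully-associative TLB lookup might look like the sketch below. The parameters (4 KiB pages, 64 entries) and the TlbEntry/translate names are assumptions for illustration only.

    #include <cstdint>
    #include <optional>

    constexpr int      TLB_ENTRIES = 64;   // illustrative: small and fully associative
    constexpr uint64_t PAGE_BITS   = 12;   // 4 KiB pages

    struct TlbEntry {
        bool     valid = false;
        uint64_t vpn   = 0;   // virtual page number
        uint64_t ppn   = 0;   // physical page number (from the page table entry)
    };
    TlbEntry tlb[TLB_ENTRIES];

    // Fully associative: every entry is compared against the VPN.
    std::optional<uint64_t> translate(uint64_t vaddr) {
        uint64_t vpn    = vaddr >> PAGE_BITS;
        uint64_t offset = vaddr & ((1ull << PAGE_BITS) - 1);
        for (const TlbEntry& e : tlb)
            if (e.valid && e.vpn == vpn)
                return (e.ppn << PAGE_BITS) | offset;   // TLB hit
        return std::nullopt;                            // TLB miss: walk the page table
    }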

physical cache vs. virtual cache

If we translate before we go to the cache, we have a "physical cache", which works on physical addresses.
Critical path = TLB access time + cache access time

Alternatively, we could translate after the cache (only on cache misses), which gives a "virtual cache". A virtual cache is dangerous: we must flush the cache on a context switch to avoid "aliasing".
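
A back-of-the-envelope comparison of the critical paths, with purely assumed cycle counts (t_tlb, t_index, t_compare are illustrative numbers, not measurements):

    #include <algorithm>
    #include <cstdio>

    int main() {
        // Assumed latencies in cycles; purely illustrative numbers.
        const int t_tlb = 1, t_index = 2, t_compare = 1;

        // Physical cache: translate first, then index and compare -- fully serialized.
        int physical_cache = t_tlb + t_index + t_compare;
        // Virtual cache: index and compare with the virtual address; TLB only on a miss.
        int virtual_cache  = t_index + t_compare;
        // VIPT (next subsection): indexing overlaps the TLB lookup, compare comes after both.
        int vipt_cache     = std::max(t_tlb, t_index) + t_compare;

        std::printf("physical: %d, virtual: %d, VIPT: %d cycles\n",
                    physical_cache, virtual_cache, vipt_cache);
    }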

virtually indexed physically tagged

Page offset bits are not translated and thus can be presented to the cache immediately. Accordingly, cache and TLB accesses can begin simultaneously, and the tag comparison is made after both accesses are completed.

Fig.2 - Virtual index, physical tag.

5. Cache coherency ↑top

In a shared memory multiprocessor system, an operand can have multiple copies in main memory and in caches. Cache coherence ensures that changes to the values of shared operands are propagated throughout the system in a timely fashion.

Coherence rules:

Fig.3 - Cache of CMP.

The most basic protocol is MSI. Common extensions:

MSI → MESI: adds an "Exclusive" state to reduce the traffic caused by writes to blocks that reside in only one cache (such a write is silent in MESI).

MSI → MOSI: adds an "Owned" state to reduce the traffic caused by write-backs of blocks that are read by other caches.

MSI → MOESI: adds both the "Exclusive" and "Owned" states.
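
A small sketch of why the Exclusive state saves bus traffic: a core reads a block that no other cache holds and then writes it. The bus-transaction counting below is deliberately simplified, and the names (BusRd, RfO/BusUpgr) follow the usual textbook convention rather than this article.

    #include <cstdio>

    enum class State { M, O, E, S, I };   // MOESI superset; MSI uses only M, S, I

    // A core reads a block no other cache holds, then writes it.
    int msi_private_read_then_write() {
        int bus = 0;
        State s = State::I;
        bus++; s = State::S;              // read miss -> BusRd, line loaded Shared
        bus++; s = State::M;              // write to a Shared line -> RfO/BusUpgr broadcast
        (void)s;
        return bus;                       // 2 bus transactions
    }

    int mesi_private_read_then_write() {
        int bus = 0;
        State s = State::I;
        bus++; s = State::E;              // read miss, no other sharer -> loaded Exclusive
        s = State::M;                     // silent E -> M upgrade, no bus traffic
        (void)s;
        return bus;                       // 1 bus transaction
    }

    int main() {
        std::printf("MSI: %d bus ops, MESI: %d bus ops\n",
                    msi_private_read_then_write(), mesi_private_read_then_write());
    }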

MESI protocol

Fig.4 - State transitions in MESI protocol.

Every cache line is marked with one of the four following states (coded in two bits):

Modified (M): the line is present only in this cache and is dirty, i.e. it differs from main memory; it must be written back before others may use the memory copy.
Exclusive (E): the line is present only in this cache and is clean (it matches main memory).
Shared (S): the line is clean and may also be stored in other caches.
Invalid (I): the line does not hold valid data.

Transitions (assume local is on core0 and remote is on core1):

A cache may satisfy a read from any state except Invalid. An invalid line must be fetched (to the Shared or Exclusive states) to satisfy a read.

A write may only be performed if the cache line is in the Modified or Exclusive state. If it is in the Shared state, all other cached copies must be invalidated first. This is typically done by a broadcast operation known as Request for Ownership (RfO).

A cache may discard a non-Modified line (i.e., Shared or Exclusive) at any time, changing to the Invalid state. A Modified line must be written back first.

A cache that holds a line in the Modified state must snoop (intercept) all attempted reads (from all of the other caches in the system) of the corresponding main memory location and insert the data it holds. This is typically done by forcing the read to back off (i.e., retry later), then writing the data to main memory and changing the cache line to the Shared state.

A cache that holds a line in the Shared state must listen for invalidate or RfO broadcasts from other caches, and discard the line (by moving it into Invalid) on a match.

A cache that holds a line in Exclusive state must also snoop all read transactions from all other caches, and move the line into Shared state on a match.
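
The local side of these rules can be written down as a small transition function. This is a sketch, not a full protocol implementation: the enum values and the others_have_copy flag (standing for the snoop result on a fill) are illustrative assumptions.

    #include <stdexcept>

    enum class Mesi { Modified, Exclusive, Shared, Invalid };
    enum class Bus  { None, BusRd, RfO, WriteBack };   // simplified bus transaction types

    struct Action { Mesi next; Bus bus; };

    // Local processor read: any valid state hits; Invalid must fetch the line.
    Action local_read(Mesi s, bool others_have_copy) {
        if (s != Mesi::Invalid) return {s, Bus::None};                  // hit, state unchanged
        return {others_have_copy ? Mesi::Shared : Mesi::Exclusive, Bus::BusRd};
    }

    // Local processor write: only M or E may be written directly;
    // Shared must broadcast an RfO first, Invalid must fetch with RfO.
    Action local_write(Mesi s) {
        switch (s) {
            case Mesi::Modified:  return {Mesi::Modified, Bus::None};
            case Mesi::Exclusive: return {Mesi::Modified, Bus::None};   // silent upgrade
            case Mesi::Shared:    return {Mesi::Modified, Bus::RfO};    // invalidate other copies
            case Mesi::Invalid:   return {Mesi::Modified, Bus::RfO};    // read-for-ownership fill
        }
        throw std::logic_error("unreachable");
    }

    // Eviction: a Modified line must be written back; S/E/I can be dropped silently.
    Action evict(Mesi s) {
        return {Mesi::Invalid, s == Mesi::Modified ? Bus::WriteBack : Bus::None};
    }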

Snooping cache

Snooping is widely used in bus-based multiprocessors. Each cache controller constantly watches the bus and reacts to transactions that involve lines it holds.
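
The snooping side of MESI, i.e. how a controller reacts to transactions it observes on the bus, can be sketched the same way. The types are repeated so the snippet stands alone, and the Snooped/SnoopResult names are illustrative.

    enum class Mesi    { Modified, Exclusive, Shared, Invalid };
    enum class Snooped { BusRd, RfO };     // transactions observed on the bus

    struct SnoopResult {
        Mesi next;
        bool supply_data;                  // true if this cache must flush/provide the line
    };

    // Reaction of one cache controller to a bus transaction that hits one of its lines.
    SnoopResult snoop(Mesi s, Snooped t) {
        switch (t) {
            case Snooped::BusRd:
                if (s == Mesi::Modified)  return {Mesi::Shared, true};   // flush dirty data, then share
                if (s == Mesi::Exclusive) return {Mesi::Shared, false};  // another reader appeared
                return {s, false};                                       // Shared/Invalid: nothing to do
            case Snooped::RfO:
                if (s == Mesi::Modified)  return {Mesi::Invalid, true};  // flush, then invalidate
                return {Mesi::Invalid, false};                           // S/E/I copies are discarded
        }
        return {s, false};
    }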
