EE482A
Spring Quarter 1999/2000
Reading List Version 1.1
Mattan Erez and William Dally
The Setting: Technology,
Market Trends, Critical Issues (4.3)
Required Reading
-
K. Diefendorff, "PC
Processor Microarchitecture", Microprocessor Report,
Vol. 13 No. 9, July 12, 1999.
-
S. Hamilton, "Taking
Moore's Law into the Next Century",
Computer, Vol.
32 Issue 1, January 1999.
-
C. Kozyrakis and D. Patterson, "A
New Direction for Computer Architecture Research", Computer,
Vol. 31 Issue 11, November 1998.
Branch Prediction
(4.5)
Required Reading
-
J. Lee and A.J. Smith, "Branch
Prediction Strategies and Branch Target Buffer Design",
Computer,
Vol. 17 Issue 1, January 1984.
-
S. McFarling, "Combining
Branch Predictors", Technical Note TN-36, DEC WRL,
June 1993.
-
E. Federovsky, M. Feder, and S. Weiss, "Branch
prediction based on universal data compression algorithms",
in Proceedings of the 25th International Symposium on Computer
Architecture, June 1998.
Highly Recommended Reading
-
M. Evers, S. Patel, R. Chappell and Y. Patt, "An
analysis of correlation and predictability: what makes two-level branch
predictors work", in Proceedings of the 31st
International Symposium on Microarchitecture, December 1998.
Recommended Reading
-
A. Eden and T. Mudge, "The
YAGS branch prediction scheme", in Proceedings of the
31st
International Symposium on Microarchitecture, December 1998.
-
T. Yeh and Y. Patt, "Alternative
Implementations of Two-Level Adaptive Branch Prediction",
in Proceedings of the 19th International Symposium on Computer
Architecture, 1992.
-
R. Nair, "Dynamic
Path-Based Branch Correlation", in Proceedings of the
28th
International Symposium on Microarchitecture, December 1995.
-
J. Kalamatianos and D. Kaeli, "Predicting
indirect branches via data compression", in Proceedings
of the 31st International Symposium on Microarchitecture,
December 1998.
-
K. Driesen and U. Holzle, "The
Cascaded Predictor: Economical and Adaptive Branch Target Prediction",
in Proceedings of the 31st International Symposium on Microarchitecture,
December 1998.
-
D. Grunwald, A. Klauser, S. Manne, and A. Pleszkun, "Confidence
Estimation for Speculation Control", in Proceedings of
the 25th International Symposium on Computer Architecture,
June 1998.
-
J. E. Smith, T. Heil, S. Sastry, and T. Bezenek, "Improving
Branch Predictors by Correlating on Data Values," in
Proceedings of the 32nd Annual International Symposium on
Microarchitecture, November 1999.
Fetch Issues (4.10)
Required Reading
-
G. Reinman, B. Calder, and T. Austin, "A
Scalable Front-End Architecture for Fast Instruction Delivery",
in Proceedings of the 26th International Symposium on Computer
Architecture, May 1999.
-
C.K. Luk and T. Mowry, "Cooperative
Prefetching: Compiler and Hardware Support for Effective Instruction Prefetching
in Modern Processors", in Proceedings of the 31st
International Symposium on Microarchitecture, December 1998.
-
T. Conte, K. Menezes, P. Mills, and B. Patel, "Optimization
of Instruction Fetch Mechanisms for High Issue Rates,"
in Proceedings of the 22nd International Symposium on Computer
Architecture, June 1995.
Recommended Reading
-
M. Slater, "Rise
Joins x86 Fray With mP6", Microprocessor Report Vol.
12 Issue 15, November 16, 1998.
-
C. Lefurgy, E. Piccininni, and T. Mudge, "Evaluation
of a high performance code compression method", in Proceedings
of the 32nd International Symposium on Microarchitecture,
November 1999.
-
S. McFarling, "Program
Optimization For Instruction Caches", in Proceedings
of the Third International Conference on Architectural Support for Programming
Languages and Operating Systems, April 1989.
-
J. Turley, "PowerPC
Adopts Code Compression", Microprocessor Report Vol.
12 Issue 14, October 26, 1998.
-
J. Bondi, A. Nanda, and S. Dutta, "Integrating
a Misprediction Recovery Cache (MRC) into a Superscalar Pipeline",
in Proceedings of the 29th International Symposium on Microarchitecture,
December 1996.
Trace Cache (4.12)
Required Reading
-
A. Peleg and U. Weiser, "Dynamic
Flow Instruction Cache Memory Organized around Trace Segments Independent
of Virtual Address Line," U.S. Patent Number 5,381,533,
Intel Corporation, 1994.
-
D. Friendly, S. Patel, and Y. Patt, "Alternative
Fetch and Issue Policies for the Trace Cache Fetch Mechanism", in Proceedings
of the 30th International Symposium on Microarchitecture,
November 1997.
-
Q. Jacobson, E. Rotenberg, and J. Smith, "Path-Based
Next Trace Prediction," in Proceedings of the 30th
International Symposium on Microarchitecture, November 1997.
Recommended Reading
-
R. Nair and M. Hopkins, "Exploiting
Instruction Level Parallelism in Processors by Caching Scheduled Groups",
in Proceedings of the 24th International Symposium on Computer
Architecture, June 1997.
-
S. Dutta and M. Franklin, "Control
Flow Prediction with Tree-Like Subgraphs for Superscalar Processors,"
in Proceedings of the 28th International Symposium on Microarchitecture,
November 1995.
-
Stephan Jourdan, Lihu Rappoport, Yoav Almog, Mattan Erez,
Adi Yoaz, and Ronny Ronen, "eXtended
Block Cache", in Proceedings of the 6th
International Symposium on High-performance Computer Architecture,
January 2000.
-
S. Patel, M. Evers, and Y. Patt, "Improving
Trace Cache Effectiveness with Branch Promotion and Trace Packing,"
in Proceedings of the 25th International Symposium on Computer
Architecture, June 1998.
Predication and Eager
Execution (4.17)
Required Reading
-
S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A.
Bringmann, "Effective
Compiler Support for Predicated Execution Using the Hyperblock",
in Proceedings of the 25th International Symposium on Microarchitecture,
December 1992.
-
A. Uht and V. Sindagi, "Disjoint
Eager Execution: An Optimal Form of Speculative Execution",
in Proceedings of the 28th International Symposium on Microarchitecture,
November 1995.
-
A. Klauser, A. Paithankar, and D. Grunwald, "Selective
Eager Execution on the PolyPath Architecture", in Proceedings
of the 25th International Symposium on Computer Architecture,
June 1998.
Recommended Reading
-
G.S. Tyson, "The
Effects Of Predicated Execution On Branch Prediction",
in Proceedings of the 27th International Symposium on Microarchitecture,
November 1994.
-
D.I. August, W.W. Hwu, and S.A. Mahlke, "A
Framework for Balancing Control Flow and Predication",
in Proceedings of the 30th International Symposium on Microarchitecture,
November 1997.
-
A. Klauser, T. Austin, D. Grunwald, and B. Calder, "Dynamic
Hammock Predication for Non-Predicated Instruction Set Architectures",
in Proceedings of the International Conference on Parallel Architectures
and Compilation Techniques, 1998.
-
D. Pnevmatikatos and G. Sohi, "Guarded
Execution and Branch Prediction in Dynamic ILP Processors",
in Proceedings of the 21st International Symposium on Computer
Architecture, June 1994.
Memory Systems and
Memory Latency (4.19 (papers 1-3), 4.24 (papers 4-6))
Required Reading
-
A. Saulsbury, F. Pong, A. Nowatzyk, "Missing
the Memory Wall: The Case for Processor/Memory Integration",
in Proceedings of the 23rd International Symposium on Computer
Architecture, May 1996.
-
N. Jouppi, "Improving
Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative
Cache and Prefetch Buffers", in Proceedings of the 17th
International Symposium on Computer Architecture, June 1990.
-
S. Rixner, W. Dally, U. Kapasi, P. Mattson, and J. Owens,
"Memory
Access Scheduling", in Proceedings of the 27th
International Symposium on Computer Architecture, June 2000.
-
T.F. Chen and J.L. Baer, "Effective
Hardware Based Prefetching for High-Performance Processors",
IEEE
Transactions on Computers, Vol. 44 No. 5, May 1995.
-
T. Mowry, M.S. Lam, and A. Gupta, "Design
and Evaluation of a Compiler Algorithm for Prefetching",
in Proceedings of the Fifth International Conference on Architectural
Support for Programming Languages and Operating Systems, October 1992.
-
D. Joseph and D. Grunwald, "Prefetching
Using Markov Predictors", in Proceedings of the 24th
International Symposium on Computer Architecture, June 1997.
Recommended Reading
-
C.K. Luk and T. Mowry, "Compiler-Based
Prefetching for Recursive Data Structures", in Proceedings
of the Seventh International Conference on Architectural Support for
Programming Languages and Operating Systems, October 1996.
-
A. Roth, G. Sohi, "Effective
Jump-Pointer Prefetching for Linked Data Structures",
in Proceedings of the 26th International Symposium on Computer
Architecture, June 1999.
-
T. Alexander and G. Kedem, "Distributed
Prefetch-buffer/Cache Design for High Performance Memory System",
in Proceedings of the 2nd International Symposium on High-performance
Computer Architecture, February 1996.
-
M. Bekerman, S. Jourdan, R. Ronen, G. Kirshenboim, L. Rappoport,
A. Yoaz, and U. Weiser, "Correlated
Load-Address Predictors", in Proceedings of the 26th
International Symposium on Computer Architecture, June 1999.
Memory Disambiguation
and Speculation (4.26)
Required Reading
-
A. Nicolau, "Run-Time
Disambiguation: Coping with Statically Unpredictable Dependencies",
IEEE
Transactions on Computers, Vol. 38 No. 5, May 1989.
-
G. Reinman and B. Calder, "Predictive
Techniques for Aggressive Load Speculation", in Proceedings
of the 31st International Symposium on Microarchitecture,
December 1998.
-
A. Moshovos and G. Sohi, "Streamlining
Inter-operation Memory Communication via Data Dependence Prediction",
in Proceedings of the 30th International Symposium on Microarchitecture,
December 1997.
Recommended Reading
-
T. Austin and G. Sohi, "Zero-Cycle
Loads: Microarchitecture Support for Reducing Load Latency",
in Proceedings of the 28th International Symposium on Microarchitecture,
December 1995.
-
A. Yoaz, M. Erez, R. Ronen, and S. Jourdan, "Speculation
Techniques for Improving Load Related Instruction Scheduling",
in Proceedings of the 26th International Symposium on Computer
Architecture, June 1999.
-
M. Franklin and G. Sohi, "ARB:
A Hardware Mechanism for Dynamic Memory Disambiguation",
IEEE
Transactions on Computers, Vol. 45 No. 5, May 1996.
-
G. Chrysos and J. Emer, "Memory
Dependence Prediction Using Store Sets", in Proceedings
of the 25th International Symposium on Computer Architecture,
July 1998.
-
J. Hesson, J. LeBlanc, and S. Ciavaglia, "Apparatus
to Dynamically Control the Out-Of-Order Execution of Load-Store Instructions",
US Patent 5,615,350, Filed December 1995, Issued March 1997.
Value Prediction
(5.1)
Required Reading
-
S. P. Harbison, "An
Architectural Alternative to Optimizing Compilers," in
Proceedings of the First Symposium on Architectural Support for
Programming Languages and Operating Systems, 1982.
-
M. H. Lipasti and J. P. Shen, "Exceeding
the Dataflow Limit via Value Prediction," in Proceedings
of the 29th Annual International Symposium on Microarchitecture,
December 1996.
-
A. Sodani and G. S. Sohi, "An
Empirical Analysis of Instruction Repetition," in Proceedings
of the 8th International Conference on Architectural Support
for Programming Languages and Operating Systems, October 1998.
Highly Recommended Reading
-
S. Jourdan, R. Ronen, M. Bekerman, B. Shomar, and A. Yoaz,
"A Novel Renaming
Scheme to Exploit Value Temporal Locality through Physical Register Reuse
and Unification," in Proceedings of the 31st
Annual International Symposium on Microarchitecture, November 1998.
Recommended Reading
-
F. Gabbay and A. Mendelson, "Using
Value Prediction to Increase the Power of Speculative Execution Hardware,"
in ACM Transactions on Computer Systems, Vol. 16 No. 3, August
1998.
-
Y. Sazeides and J. E. Smith, "The
Predictability of Data Values," in Proceedings of the
30th
Annual International Symposium on Microarchitecture, December 1997.
-
A. Sodani and G. S. Sohi, "Dynamic
Instruction Reuse," in Proceedings of the 24th
Annual International Symposium on Computer Architecture, June 1997.
-
B. Calder, G. Reinman, and D. Tullsen, "Selective
Value Prediction," in Proceedings of the 26th
International Symposium on Computer Architecture, May 1999.
-
K. Wang and M. Franklin, "Highly
Accurate Data Value Prediction using Hybrid Predictors",
in Proceedings of the 30th Annual International Symposium on Microarchitecture,
December 1997.
-
B. Rychlik, J. Faistl, B. Krug and J. Shen, "Efficacy
and Performance Impact of Value Prediction", in Proceedings
of the 1998 International Conference on Parallel Architectures and Compilation
Techniques, October 1998.
-
Y. Sazeides and J. E. Smith, "Modeling
Program Predictability", in Proceedings of the 25th
Annual International Symposium on Computer Architecture, July 1998.
ILP Execution (5.3)
Required Reading
-
R. M. Tomasulo, "An
Efficient Algorithm for Exploiting Multiple Arithmetic Units",
IBM
J. Research and Development 11:1, January 1967.
-
S. Palacharla, N. Jouppi, and J. Smith, "Complexity-Effective
Superscalar Processors", in Proceedings of the 24th
International Symposium on Computer Architecture, June 1997.
-
J. Smith and A. Pleszkun, "Implementation
of Precise Interrupts in Pipelined Processors", in Proceedings
of the
12th International Symposium on Computer Architecture,
June 1985.
-
L. Gwennap, "VLIW:
The Wave of the Future?", Microprocessor Report, Vol.
8 No. 2, February 14, 1994.
Recommended Reading
-
S. Jourdan, R. Ronen, M. Bekerman, B. Shomar, and A. Yoaz,
"A Novel Renaming
Scheme to Exploit Value Temporal Locality through Physical Register Reuse
and Unification," in Proceedings of the 31st
Annual International Symposium on Microarchitecture, November 1998.
-
T. Monreal, A. Gonzalez, M. Valero, J. Gonzalez, and V. Vinals,
"Delaying
Physical Register Allocation through Virtual-Physical Registers",
in Proceedings of the 32nd Annual International Symposium
on Microarchitecture, November 1999.
-
J. Fisher, "Very Long Instruction Word Architectures
and the ELI-512", in Proceedings of the 10th
International Symposium on Computer Architecture, June 1983.
-
R. Colwell, R. Nix, J. O'Donnell, D. Papworth, and P. Rodman,
"A
VLIW Architecture for a Trace Scheduling Compiler",
IEEE Transactions on Computers, Vol. 37 No. 8, August 1988.
Beyond ILP (5.8 (papers
1-4), 5.10 (papers 5-7))
Required Reading
-
W.D. Weber and A. Gupta, "Exploring
The Benefits Of Multiple Hardware Contexts In A Multiprocessor Architecture:
Preliminary Results", in Proceedings of the 16th
International Symposium on Computer Architecture, May 1989.
-
S. Keckler and W. Dally, "Processor
Coupling: Integrating Compile Time and Runtime Scheduling for Parallelism",
in Proceedings of the 19th International Symposium on Computer
Architecture, June 1992.
-
L. Hammond, M. Willey, and K. Olukotun, "Data
Speculation Support for a Chip Multiprocessor", in Proceedings
of the 8th International Conference on Architectural Support
for Programming Languages and Operating Systems, October 1998.
-
G. Sohi, S. Breach, and T.N. Vijaykumar, "Multiscalar
Processors", in Proceedings of the 22nd
International Symposium on Computer Architecture, June 1995.
-
D. Tullsen, S. Eggers, and H. Levy, "Simultaneous
Multithreading: Maximizing On-Chip Parallelism", in Proceedings
of the 22nd International Symposium on Computer Architecture,
June 1995.
-
H. Akkary and M. Driscoll, "A
Dynamic Multithreaded Processor", in Proceedings of the
31st
Annual International Symposium on Microarchitecture, November 1998.
-
E. Rotenberg, Q. Jacobson, Y. Sazeides, and J. Smith, "Trace
Processors", in Proceedings of the 30th
Annual International Symposium on Microarchitecture, November 1997.
Recommended Reading
-
P. Marcuello and A. Gonzalez, "Clustered
Speculative Multithreaded Processors", in Proceedings
of the 1999 International Conference on Supercomputing, April 1999.
-
S. Keckler, W. Dally, D. Maskit, N. Carter, A. Chang, and
W.S. Lee, "Exploiting
Fine-Grain Thread Level Parallelism on the MIT Multi-ALU Processor",
in Proceedings of the 25th International Symposium on Computer
Architecture, June 1998.
Vectors and
Streams (5.15)
Required Reading
-
S. Rixner, W. Dally, U. Kapasi, B. Khailany, A. Lopez-Lagunas,
P. Mattson, and J. Owens, "A
Bandwidth-Efficient Architecture for Media Processing",
in Proceedings of the 31st Annual International Symposium
on Microarchitecture, November 1998.
-
C. Kozyrakis, S. Perissakis, D. Patterson, T. Anderson, K.
Asanovic, M. Cardwell, R. Fromm, J. Golbus, B. Gribstad, K. Keeton, R.
Thomas, N. Treuhaft, and K. Yelick, "Scalable
Processors in the Billion-Transistor Era: IRAM", Computer,
Vol. 30 Issue 9, September 1997.
-
K. Diefendorff, "Sony's
Emotionally Charged Chip", Microprocessor Report,
Vol. 13 No. 5, April 19, 1999.
Recommended Reading
-
P. Glaskowsky, "Media
Processors Redefined", Microprocessor Report,
January 24, 2000.
-
P. Glaskowsky, "MAP1000
Unfolds at Equator", Microprocessor Report, Vol. 12
No. 16, December 7, 1998.
-
P. Glaskowsky, "Philips
Advances TriMedia Architecture", Microprocessor Report,
Vol. 12 No. 14, October 26, 1998.
Low Power Design (5.17)
Required Reading
-
J. Montanaro et al., "A
160MHz, 32b, 0.5W CMOS RISC Microprocessor", IEEE
Journal of Solid-State Circuits, volume 31, number 11, November 1996,
pp. 1703-1714.
-
A. Sinha and A. Chandrakasan, "Energy
Aware Software", in Proceedings of the 13th
International Conference on VLSI Design, January 2000.
-
S. Manne, A. Klauser, and D. Grunwald, "Pipeline
Gating: Speculation Control for Energy Reduction", in
Proceedings of the 25th International Symposium on Computer
Architecture, June 1998.
Recommended Reading
-
T. Halfhill, "Transmeta
Breaks x86 Low-Power Barrier", Microprocessor Report,
February
14, 2000.
-
R. Gonzalez and M. Horowitz, "Energy
dissipation in general purpose microprocessors,"
IEEE Journal of Solid-State Circuits, September 1996, pages 1277-1284.
-
T. Burd and R. Brodersen, "Energy
Efficient CMOS Microprocessor Design," Proceedings
of the 28th Annual HICSS Conference, Jan. 1995; Vol. I, pp.
288-297.
Reliability, Availability, and
Serviceability (5.22)
Required Reading
-
"Ultra
Enterprise 10000 Server: SunTrust Reliability, Availability, and Serviceability",
Sun Microsystems Technical White Paper, 1997.
-
T. Slegel, R. Averill III, M. Check, B. Krumm, C. Krygowski,
W. Li, J. Liptay, J. MacDougall, T. McPherson, J. Navarro, E. Schwarz,
K. Shum, and C. Webb, "IBM's
S/390 G5 Microprocessor Design",
IEEE Micro, Vol.
19 No. 2, March-April 1999.
-
T. Austin, "DIVA:
A Reliable Substrate for Deep Submicron Microarchitecture Design",
in Proceedings of the 32nd Annual International Symposium
on Microarchitecture, November 1999.