* For now, `/rel` is the original release and `/dev` has updates:
  * compiles with Verilator, Icarus Verilog (iverilog), and Yosys
  * runs a simple version of the kernel/bios/random test with cocotb (A2L2 interface partially implemented in Python) and Verilog core wrappers (A2L2<->mem/wb interfaces)
  * wrapper converts the A2L2 interface to mem and Wishbone interfaces
* experiment with parameters to create smaller version(s) for OpenLane
# Original Release
## The Project
This is the release of the A2O POWER processor core RTL and the associated FPGA implementation (using the ADM-PCIE-9V3 FPGA).
See [Project Info](rel/readme.md) for details.
## The Core
The [A2O core](rel/doc/A2O_UM.pdf) was created to optimize single-thread performance, and targeted 3+ GHz in 45nm technology.
It is a 27 FO4 implementation, with an out-of-order pipeline supporting 1 or 2 threads. It fully supports Power ISA 2.07 using Book III-E.
The core was also designed to support pluggable implementations of the MMU and AXU logic macros.
This includes eliminating the MMU and using ERAT-only mode for translation and protection.
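As a rough illustration of what the 27 FO4 and 3+ GHz figures above imply together, the following Python sketch computes the cycle time and the resulting delay budget per FO4. The function names are hypothetical, and 3.0 GHz is used only as the lower bound of the stated target; none of this is taken from the project's own tooling.

```python
# Back-of-the-envelope timing budget (illustrative only).
# Assumption: the "27 FO4" figure is the logic depth of one clock cycle.

def cycle_time_ps(freq_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def fo4_delay_ps(freq_ghz: float, fo4_per_cycle: int) -> float:
    """Delay available per FO4 stage if the cycle holds fo4_per_cycle FO4s."""
    return cycle_time_ps(freq_ghz) / fo4_per_cycle


if __name__ == "__main__":
    period = cycle_time_ps(3.0)
    per_fo4 = fo4_delay_ps(3.0, 27)
    print(f"cycle time: {period:.1f} ps, budget per FO4: {per_fo4:.2f} ps")
```

At exactly 3 GHz this gives a period of about 333 ps and roughly 12.3 ps per FO4; any frequency above 3 GHz tightens that budget further.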
## The History
The A2O design was a follow-on to A2I and is written in Verilog. It supports a lower thread count than A2I but delivers higher performance per thread, using out-of-order execution
(register renaming, reservation stations, completion buffer) and a store queue.
The A2L2 external interface is largely the same for the two cores.
## FPGA Implementation Notes
1. Many knobs are available for tweaking generation parameters, but very little experimentation was done to test whether they work or how they affect area, etc.
2. Only single-thread generation has been done so far. The FPGA in use has very high utilization with one thread.
3. A2I used clk_1x and clk_2x (for some of the special arrays), but A2O also uses clk_4x. This, possibly along with area congestion, led to setting clk_1x to 50 MHz to lessen timing pressure
(both setup and hold misses).
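To make the clocking note above concrete, here is a minimal Python sketch of the frequencies and periods that follow from a 50 MHz clk_1x. It assumes, based only on the clock names, that clk_2x and clk_4x run at exactly 2x and 4x the clk_1x frequency; the helper names are hypothetical.

```python
# Derived clock frequencies and periods (illustrative only).
# Assumption: clk_2x and clk_4x are exact integer multiples of clk_1x,
# as their names suggest.

def derived_clocks(clk_1x_mhz: float) -> dict:
    """Return the frequency in MHz of clk_1x, clk_2x and clk_4x."""
    return {f"clk_{m}x": clk_1x_mhz * m for m in (1, 2, 4)}


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


if __name__ == "__main__":
    for name, mhz in derived_clocks(50.0).items():
        print(f"{name}: {mhz:.0f} MHz, period {period_ns(mhz):.1f} ns")
```

Under that assumption, the fastest clock domain (clk_4x) runs at 200 MHz with a 5 ns period, which is the tightest window the special arrays would have to meet.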
### Technology Scaling
A comparison of the design in original technology and scaled to 7nm (SMT2, fixed-point, no MMU):