microwatt

Commit Graph

Author	SHA1	Message	Date
Paul Mackerras	6701e7346b	core: Use a busy signal rather than a stall This changes the instruction dependency tracking so that we can generate a "busy" signal from execute1 and loadstore1 which comes along one cycle later than the current "stall" signal. This will enable us to signal busy cycles only when we need to from loadstore1. The "busy" signal from execute1/loadstore1 indicates "I didn't take the thing you gave me on this cycle", as distinct from the previous stall signal which meant "I took that but don't give me anything next cycle". That means that decode2 proactively gives execute1 a new instruction as soon as it has taken the previous one (assuming there is a valid instruction available from decode1), and that then sits in decode2's output until execute1 can take it. So instructions are issued by decode2 somewhat earlier than they used to be. Decode2 now only signals a stall upstream when its output buffer is full, meaning that we can fill up bubbles in the upstream pipe while a long instruction is executing. This gives a small boost in performance. This also adds dependency tracking for rA updates by update-form load/store instructions. The GPR and CR hazard detection machinery now has one extra stage, which may not be strictly necessary. Some of the code now really only applies to PIPELINE_DEPTH=1. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 years ago
Paul Mackerras	b14d982011	execute: Implement bypass from output of execute1 to input This enables back-to-back execution of integer instructions where the first instruction writes a GPR and the second reads the same GPR. This is done with a set of multiplexers at the start of execute1 which enable any of the three input operands to be taken from the output of execute1 (i.e. r.e.write_data) rather than the input from decode2 (i.e. e_in.read_data[123]). This also requires changes to the hazard detection and handling. Decode2 generates a signal indicating that the GPR being written is available for bypass, which is true for instructions that are executed in execute1 (rather than loadstore1/dcache). The gpr_hazard module stores this "bypassable" bit, and if the same GPR needs to be read by a subsequent instruction, it outputs a "use_bypass" signal rather than generating a stall. The use_bypass signal is then latched at the output of decode2 and passed down to execute1 to control the input multiplexer. At the moment there is no bypass on the inputs to loadstore1, but that is OK because all load and store instructions are marked as single-issue. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Benjamin Herrenschmidt	e4f475e17f	sprs: Store common SPRs in register file This stores the most common SPRs in the register file. This includes CTR and LR and a not yet final list of others. The register file is set to 64 entries for now. Specific types are defined that can represent a GPR index (gpr_index_t) or a GPR/SPR index (gspr_index_t) along with conversion functions between the two. On order to deal with some forms of branch updating both LR and CTR, we introduced a delayed update of LR after a branch link. Note: We currently stall the pipeline on such a delayed branch, but we could avoid stalling fetch in that specific case as we know we have a branch delay. We could also limit that to the specific case where we need to update both CTR and LR. This allows us to make bcreg, mtspr and mfspr pipelined. decode1 will automatically force the single issue flag on mfspr/mtspr to a "slow" SPR. [paulus@ozlabs.org - fix direction of decode2.stall_in] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Anton Blanchard	bdc26b7527	Add GPR hazard detection Check GPRs against any writers in the pipeline. All instructions are still marked single in pipeline at this stage. Signed-off-by: Anton Blanchard <anton@linux.ibm.com>	5 years ago

4 Commits (bf36ea365b9998f5cf8e1cb082fc1d8c78cb6721)