microwatt

Commit Graph

Author	SHA1	Message	Date
Paul Mackerras	6fe4b549f5	FPU: Improve accuracy in multiply-add almost-cancellation cases There are two paths for multiply-add instructions; one where the product is larger or nearly the same as the addend, which does the addition/subtraction in the multiplier with 128-bit accuracy; the other is used when the addend is clearly larger, which shifts the product right before doing the addition/subtraction in 64-bit arithmetic. The threshold for the second path is that B_exp has to be greater than A_exp + C_exp + 1, the +1 being because the product mantissa can be greater than 2. This increases the +1 to +2 to make sure that the 128-bit path is used when there is any chance of cancellation of the high-order bits of the sum. With the +1 threshold we could still get close to cancellation when the mantissas of A and C were nearly 2 and the mantissa of B was 1. This improves accuracy and avoids the need to do a 120-bit subtraction in the second path. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	80c81b58ef	FPU: Generate correct result sign when B is denormal If a subtraction A - B is done where A is in normalized form with an exponent of -1022, and B is denormal, an inconsistency arises between the comparison of the raw exponents in the first cycle, which sees A.exp (0x001) > B.exp (0x000), and the comparison in DO_FADD state, which sees r.a.exponent (-1022) = r.b.exponent (-1022). Consequently we get r.add_bsmall = 0 and the subtraction is done the wrong way around, yielding the wrong sign for the result. Fix this by setting r.add_bsmall according to the comparison of raw exponents in the first cycle and then using it in DO_FADD state. Also add a test case for this. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	f631dcd700	FPU: Set FPRF correctly on multiply result that underflows rcls_op being set to RCLS_TZERO was not detecting a zero result after rounding for a multiply result that underflows, because S still had low bits of the product. To fix this, remove the 's_nz = 0' from the RCLS_TZERO test. We can't then use this test in the FMADD_6 state, but we really shouldn't be testing for zero there, before rounding, so remove that. Also simplify FMADD_6 state by not setting rs_norm and going always to FINISH state rather than going to NORMALIZE state. Add a test for this case (actually a fmadd with B=0). While here, remove a pointless assignment to f_to_multiply.valid in MULT_1 state, since r.first is never set here. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	b122577a4e	FPU: Be more careful about preserving low-order bits in multiply-add instrs Add code to check whether bits of S which don't get shifted into R are non-zero, and set X if they are, so that rounding in multiply-add instructions works correctly. This needs to be done after normalization in the case of very small results, where potentially all the non-zero bits in S do get shifted into R. Also fix an incorrect test case, and add another multiply-add test case. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	59992eab90	FPU: Avoid doing overflow processing twice in OE=1 case Split the ROUND_OFLOW state into two, one which handles the OE=0 case (disabled overflow exception) and one which handles the OE=1 case (enabled overflow exception). This avoids a loop in the state diagram and prevents us from adding the exponent bias twice. Also correct a bug in ROUNDING_3 state where for single-precision operations which yield a result which is denormal in double-precision format, r.shift was set wrongly. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	b8f7cbd894	FPU: Record bits shifted out of addend in fmadd-family instructions If the addend is smaller than the product and thus needs to be shifted right, record if any bits are lost from the right end in r.x, so that the result gets rounded correctly. Also add a test that checks one such case. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	32919435a3	FPU: Allow mtfsb* to set FPSCR[FX] implicitly If mtfsb1 causes an individual exception bit to go from 0 to 1, that should set FX as well. Arrange for this by setting update_fx to 1. Also make sure mcrfs doesn't copy the reserved FPSCR bit. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	de71a6119c	FPU: Make FPSCR bit 11 always read as 0 Bit 11 (52 in BE numbering) is a reserved bit. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	2f29daab2d	FPU: Fix setting of r.x for single-precision operations The fp_rounding function expects r.x to have been set based on the lower 31 bits of r.r, not 29 as presently done, so change 28 to SP_RBIT-1 (SP_RBIT is 31). Also add a test to check. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	577bbb8f5d	tests/fpu: Add test case for denorm input in frsp test Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	7b1febcbd3	tests/fpu: Check setting of FR and FI in FPSCR by frsp instruction Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 weeks ago
Paul Mackerras	9326fc7f18	tests/modes: Test that mfspr/mtspr to unimplemented SPR in user mode causes HEAI Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 months ago
Paul Mackerras	0255283159	tests/spr_read: Test that mfspr/mtspr to SPRs 0,4,5,6 generate HEAI Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 months ago
Paul Mackerras	f2166d326c	tests/fpu: Add a test for result writing being suppressed When an arithmetic instruction generates an invalid operation exception or a divide by zero exception, and that exception is enabled in the FPSCR, the writing of the result to the destination register should be suppressed, leaving whatever value was last written in the destination. Add a check that this occurs correctly, for the cases of square root of a negative number (invalid operation exception) and division by zero (zero divide exception). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 months ago
Paul Mackerras	9f9f9046ee	tests/spr_read: Add a check for no-op behaviour of mtspr and mfspr Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 months ago
Paul Mackerras	9ac71cfbf2	tests/fpu: Add more floating multiply-add tests Add more tests to check that the result sign computations are correct. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	10 months ago
Paul Mackerras	8f537c13bc	tests: Add a test for the hash instructions hash{st,cmp}[p] Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	12 months ago
Paul Mackerras	80bc9d5098	tests/trace: Add a few tests of DAWR (data watchpoint) functionality Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	09de0738de	tests/trace: Add checks for SIAR and SDAR being set correctly Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	23b183fb16	tests/reservation: Check that SRR0 is set correctly on alignment interrupt The tests that intentionally generate alignment interrupts now also check that SRR0 is pointing to a larx or stcx instruction. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	f64ab6569d	tests/trace: Add a couple of tests of CIABR function Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	140b930ad3	tests: Add tests for lq/stq, plq/pstq and lqarx/stqcx. Lq and stq are tested in both BE and LE modes (though only 64-bit mode) by the 'modes' test. Lqarx and stqcx. are tested by the 'reservation' test in LE mode (64-bit). Plq and pstq are tested in 64-bit LE mode by the 'prefix' test. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	d2777dd1dd	Generate Hypervisor Emulation Assistance Interrupt for illegal instructions This implements the HEIR register (Hypervisor Emulation Instruction Register) and arranges for an illegal instruction to cause a Hypervisor Emulation Assistance Interrupt (HEAI) at vector 0xE40, and set HEIR to the illegal instruction. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	12a3d76217	Implement hrfid and make MSR[HV] always 1 Implementations without hypervisor/LPAR support are permitted by the architecture, but should have MSR[HV] forced to be 1 at all times, not 0, and should implement various instructions and registers that are only accessible in hypervisor mode. This commit implements MSR[HV] as a constant 1 bit and adds the hrfid instruction, which behaves exactly the same as rfid except that it reads HSRR0/1 instead of SRR0/1. We already have HSRR0/1 and HSPRG0/1 implemented. When HV=1, Linux expects external interrupts to arrive as hypervisor interrupts, so this adds support for hypervisor interrupts (i.e., those that set HSRR0/1) and makes the external interrupt be a hypervisor interrupt. (If we had an LPCR register, the LPES bit would control this, but we don't.) The xics test is updated to read HSRR0/1 after an external interrupt. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	1 year ago
Paul Mackerras	7f781b835d	tests/fpu: Add tests for ftdiv and ftsqrt Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	7b86bf8863	tests/fpu: Add tests for fdiv and fre with denormalized operands Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	59a7996f1c	tests/fpu: Add checks for correct setting of FPRF Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	7c5a2bcaf4	tests: Add a test for prefixed instructions Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Michael Neuling	116f6281a9	tests: Update metavalues test count With Paulus's changes in PR #396 merged in `5c6d57de30`, we can now reduce the metavalue test counts. Signed-off-by: Michael Neuling <mikey@neuling.org>	3 years ago
Michael Neuling	0073d23e73	Merge pull request #392 from paulusmack/fix-branch-alias fetch1: Fix bug where BTC entries don't match on MSR[IR]	3 years ago
Anton Blanchard	25f93fc17e	Add branch alias test Signed-off-by: Anton Blanchard <anton@linux.ibm.com>	3 years ago
Anton Blanchard	3c27abcc40	tests/trace: Test trace vs system call interrupt Signed-off-by: Anton Blanchard <anton@linux.ibm.com>	3 years ago
Michael Neuling	eeac86c9d8	test: Add test for metavalues Make sure they don't increase in future Signed-off-by: Michael Neuling <mikey@neuling.org>	4 years ago
Michael Neuling	72fcca8e52	tests: Update FPU test output The following commit added two tests but didn't update the tests outputs: commit `73cc5167ec` Author: Paul Mackerras <paulus@ozlabs.org> Date: Mon May 9 19:18:42 2022 +1000 Use FPU for division instructions if we have an FPU This patch updates these using tests/update_console_tests Signed-off-by: Michael Neuling <mikey@neuling.org>	4 years ago
Michael Neuling	281a125f1f	Merge pull request #379 from paulusmack/master Lots of improvements	4 years ago
Michael Neuling	a060ad5085	tests/pmu: Cleanup whitespace in pmc.c Fixup tabs vs space and trailing whitespace. Signed-off-by: Michael Neuling <mikey@neuling.org>	4 years ago
Paul Mackerras	73cc5167ec	Use FPU for division instructions if we have an FPU - Arrange for XER to be written for OE=1 forms - Arrange for condition codes to be set for RC=1 forms (including correct handling for 32-bit mode) - Don't instantiate the divider if we have an FPU. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 years ago
Paul Mackerras	c9e838b656	Remove support for lq, stq, lqarx and stqcx. They are optional in SFFS (scalar fixed-point and floating-point subset), are not needed for running Linux, and add complexity, so remove them. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 years ago
Iago Caran Aquino	de1bf10114	tests/pmu: Add load/store completed, instruction count and cycle count tests Signed-off-by: Iago Caran Aquino <iago.caran@gmail.com>	4 years ago
Anton Blanchard	2d142a6c01	tests/misc: Add a store/dcbz test We have a bug where a store near a dcbz can cause the dcbz to only zero 8 bytes. Add a test case for this. Signed-off-by: Anton Blanchard <anton@linux.ibm.com>	4 years ago
Anton Blanchard	00259458c7	tests/misc: Add an icbi test We have a bug where an icbi can cause an instruction to execute twice. Add a test case for this. Signed-off-by: Anton Blanchard <anton@linux.ibm.com>	4 years ago
Paul Mackerras	ba34914465	tests/misc: Add a test for a load that hits two preceding stores This checks that the store forwarding machinery in the dcache correctly combines forwarded stores when they are partial stores (i.e. only writing part of the doubleword, as for a byte store). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 years ago
Paul Mackerras	f40842d9b2	tests/fpu: Test FPU unavailable interrupt following a load This adds a load before a floating-point load which should generate a floating-point unavailable interrupt, to test for the bug where unavailability interrupts can get dropped while loadstore1 is executing instructions. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	4 years ago
Paul Mackerras	18120f153d	MMU: Implement a vestigial partition table This implements a 1-entry partition table, so that instead of getting the process table base address from the PRTBL SPR, the MMU now reads the doubleword pointed to by the PTCR register plus 8 to get the process table base address. The partition table entry is cached. Having the PTCR and the vestigial partition table reduces the amount of software change required in Linux for Microwatt support. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Anton Blanchard	d26a157cd7	Add a test to read from all SPRs Make sure the SPRs are initialized and we can't read X state. (Mikey: rebased and added console/bin file for testing) Signed-off-by: Anton Blanchard <anton@linux.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org>	5 years ago
Paul Mackerras	ec5730a75a	tests: Add tests for lq/stq and lqarx/stqcx. Lq and stq are tested in both BE and LE modes (though only 64-bit mode) by the 'modes' test. Lqarx and stqcx. are tested by the 'reservation' test in LE mode (64-bit). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Paul Mackerras	29fabeb12e	tests/misc: Add a test for correct CTR and LR updating by branches This adds a test with a bdnzl followed immediately by a bdnz, to check that CTR and LR both get evaluated and written back correctly in this situation. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Paul Mackerras	144433218f	tests/trace: Test trace interrupt vs. FP unavailable interrupt Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Paul Mackerras	dc1544db69	FPU: Implement floating multiply-add instructions This implements fmadd, fmsub, fnmadd, fnmsub and their single-precision counterparts. The single-precision versions operate the same as the double-precision versions until the final rounding and overflow/underflow steps. This adds an S register to store the low bits of the product. S shifts into R on left shifts, and can be negated, but doesn't do any other arithmetic. This adds a test for the double-precision versions of these instructions. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
Paul Mackerras	c350bc1f25	FPU: Implement fsqrt[s] and add a test for fsqrt This implements the floating square-root calculation using a table lookup of the inverse square root approximation, followed by three iterations of Goldschmidt's algorithm, which gives estimates of both sqrt(FRB) and 1/sqrt(FRB). Then the residual is calculated as FRB - R * R and that is multiplied by the 1/sqrt(FRB) estimate to get an adjustment to R. The residual and the adjustment can be negative, and since we have an unsigned multiplier, the upper bits can be wrong. In practice the adjustment fits into an 8-bit signed value, and the bottom 8 bits of the adjustment product are correct, so we sign-extend them, divide by 4 (because R is in 10.54 format) and add them to R. Finally the residual is calculated again and compared to 2*R+1 to see if a final increment is needed. Then the result is rounded and written back. This implements fsqrts as fsqrt, but with rounding to single precision and underflow/overflow calculation using the single-precision exponent range. This could be optimized later. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	5 years ago
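
The square-root scheme described in commit c350bc1f25 (table lookup, three Goldschmidt iterations, then a residual-based adjustment) can be illustrated with a small floating-point model. This is only a sketch under stated assumptions, not the Microwatt implementation: the hardware works in fixed point (R in 10.54 format) with an unsigned multiplier, while this model uses Python floats. The 16-entry table, the names `RSQRT_TABLE` and `goldschmidt_sqrt`, and the restriction of the input to [1, 4) are all hypothetical choices made for illustration.

```python
import math

# Hypothetical lookup table: 1/sqrt(x) at the midpoint of 16 equal
# sub-intervals of [1, 4). The real table's size and contents are not
# given in the commit message.
TABLE_SIZE = 16
RSQRT_TABLE = [
    1.0 / math.sqrt(1.0 + (i + 0.5) * 3.0 / TABLE_SIZE)
    for i in range(TABLE_SIZE)
]


def goldschmidt_sqrt(b, iterations=3):
    """Approximate sqrt(b) for 1 <= b < 4 (floating-point model only)."""
    if not 1.0 <= b < 4.0:
        raise ValueError("b must be in [1, 4)")
    index = min(int((b - 1.0) * TABLE_SIZE / 3.0), TABLE_SIZE - 1)
    y = RSQRT_TABLE[index]   # initial estimate of 1/sqrt(b)
    g = b * y                # converges towards sqrt(b)
    h = y / 2.0              # converges towards 1 / (2 * sqrt(b))
    for _ in range(iterations):
        r = 0.5 - g * h      # error term, shrinks quadratically
        g += g * r
        h += h * r
    # Adjustment step: residual times the reciprocal-square-root estimate.
    # Using h (half the reciprocal) makes this one Newton correction; the
    # hardware's fixed-point scaling is different and is not modelled here.
    residual = b - g * g
    g += residual * h
    return g
```

For example, `goldschmidt_sqrt(2.0)` agrees with `math.sqrt(2.0)` to within floating-point rounding. The final comparison against 2*R+1 and the rounding step mentioned in the commit message are omitted from this model.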

105 Commits (6fe4b549f5bd08461f5062bcd4572b254f407884)
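
The threshold change in the head commit 6fe4b549f5 can be checked with exact rational arithmetic. The sketch below is an illustration, not code from the repository: the function names and the `margin` parameter are hypothetical, operands are modelled as positive magnitudes of the form mantissa * 2**exponent with mantissas in [1, 2), and only the effective-subtraction case |B - A*C| is considered.

```python
from fractions import Fraction


def addend_clearly_larger(a_exp, b_exp, c_exp, margin=2):
    """True when B_exp > A_exp + C_exp + margin, i.e. the second path applies.

    margin=1 models the old threshold, margin=2 the one described
    in the commit message.
    """
    return b_exp > a_exp + c_exp + margin


def cancellation_ratio(a_mant, a_exp, b_mant, b_exp, c_mant, c_exp):
    """Exact |B - A*C| / |B| for positive operands mantissa * 2**exp."""
    a = Fraction(a_mant) * Fraction(2) ** a_exp
    b = Fraction(b_mant) * Fraction(2) ** b_exp
    c = Fraction(c_mant) * Fraction(2) ** c_exp
    return abs(b - a * c) / b


# Worst case from the commit message: mantissas of A and C nearly 2,
# mantissa of B equal to 1.
nearly_two = 2 - 2 ** -52

# B_exp = A_exp + C_exp + 2: the old threshold picks the second path even
# though the difference almost cancels; the new threshold does not.
old_choice = addend_clearly_larger(0, 2, 0, margin=1)
new_choice = addend_clearly_larger(0, 2, 0, margin=2)
near_cancel = cancellation_ratio(nearly_two, 0, 1.0, 2, nearly_two, 0)

# B_exp = A_exp + C_exp + 3: the second path is still allowed, and more
# than half of B survives the subtraction.
safe_choice = addend_clearly_larger(0, 3, 0, margin=2)
safe_ratio = cancellation_ratio(nearly_two, 0, 1.0, 3, nearly_two, 0)
```

With a margin of 2, any case that still takes the second path has B at least 8 * 2**(A_exp + C_exp) while the product stays below 4 * 2**(A_exp + C_exp), so at most one high-order bit of the sum can cancel.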