Compare commits

...

20 Commits

Author SHA1 Message Date
Paul Mackerras 35e0dbed34
Merge pull request #353 from tianrui-wei/master
fix: fix icache_tb not finishing correctly
2 years ago
Michael Neuling cd52390bf1
Merge pull request #373 from antonblanchard/icache-insn-u-state
icache: Don't output X on i_out.insn
2 years ago
Michael Neuling b983d5080e
Merge pull request #376 from antonblanchard/loadstore-init
loadstore1: reduce U state being output
2 years ago
Michael Neuling d4db331467
Merge pull request #374 from antonblanchard/icache-unused-sig
core: Remove unused icache_inv signal
2 years ago
Michael Neuling ee5e3778ed
Merge pull request #364 from shenki/readme-updates
Readme updates
2 years ago
Michael Neuling c43692f4c7
Merge pull request #372 from antonblanchard/dcache-unused-sig
dcache: remove unused do_write signal
2 years ago
Michael Neuling 956df2c863
Merge pull request #371 from antonblanchard/unused-sig
execute1: sub_mux_sel and result_mux_sel are unused
2 years ago
Michael Neuling 3627f102db
Merge pull request #370 from antonblanchard/divider-init
divider: Fix d_out.overflow U state issue
2 years ago
Paul Mackerras 6e1e763c02
Merge pull request #368 from antonblanchard/icache-pmu-events
icache: Hook up PMU events
2 years ago
Anton Blanchard 1047239a37
Merge pull request #377 from antonblanchard/fpu-init
fpu: Reduce uninitialised signals
2 years ago
Anton Blanchard b47b71821e loadstore1: reduce U state being output
While these signals should only be read when valid is true, they
are only a small number of bits and we want to reduce the amount of
U/X state bouncing around the chip.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard a527d9b959 core: Remove unused icache_inv signal
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard e7f0a7c7ac icache: Don't output X on i_out.insn
decode1 has a lot of logic that uses i_out.insn without first looking at
i_iout.valid. Play it safe and never output X state.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard 39220be311 dcache: remove unused do_write signal
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard 843361f2be execute1: sub_mux_sel and result_mux_sel are unused
Remove them.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard d3a7517318 divider: Fix d_out.overflow U state issue
While we should only look at this when d_out.valid = 1, we may as remove
some U state across interfaces.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Anton Blanchard f06abb67ad icache: Hook up PMU events
We weren't connecting the icache PMU events up.

Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
2 years ago
Joel Stanley 9ec22af256 README: Add Linux on Microwatt instructions
These instructions are similar to those at

 https://ozlabs.org/~joel/microwatt/README

except they describe how to build the artifacts from scratch instead of
downloading them.

Signed-off-by: Joel Stanley <joel@jms.id.au>
3 years ago
Joel Stanley a31725d989 README: Add uart to fusesoc instructions
The SoC defaults to using the uart16550 so provide instructions on how
to fetch that library when seetting up fusesoc.

Also remove the text about a working directory; fusesoc doesn't need
one.

Signed-off-by: Joel Stanley <joel@jms.id.au>
3 years ago
Tianrui Wei 844ca0e6b5
fix: fix icache_tb not finishing correctly
Setting icache to be privileged and accessing physical memory directly.
And set big_endian to 0 to correspond to the testbench result.

Signed-off-by: Tianrui Wei <tianrui@tianruiwei.com>
3 years ago

@ -103,14 +103,8 @@ sudo dnf install fusesoc


``` ```
fusesoc init fusesoc init
``` fusesoc fetch uart16550

fusesoc library add microwatt /path/to/microwatt
- Create a working directory and point FuseSoC at microwatt:

```
mkdir microwatt-fusesoc
cd microwatt-fusesoc
fusesoc library add microwatt /path/to/microwatt/
``` ```


- Build using FuseSoC. For hello world (Replace nexys_video with your FPGA board such as --target=arty_a7-100): - Build using FuseSoC. For hello world (Replace nexys_video with your FPGA board such as --target=arty_a7-100):
@ -128,6 +122,68 @@ You should then be able to see output via the serial port of the board (/dev/tty
fusesoc run --target=nexys_video microwatt fusesoc run --target=nexys_video microwatt
``` ```


## Linux on Microwatt

Mainline Linux supports Microwatt as of v5.14. The Arty A7 is the best tested
platform, but it's also been tested on the OrangeCrab and ButterStick.

1. Use buildroot to create a userspace

A small change is required to glibc in order to support the VMX/AltiVec-less
Microwatt, as float128 support is mandiatory and for this in GCC requires
VSX/AltiVec. This change is included in Joel's buildroot fork, along with a
defconfig:
```
git clone -b microwatt https://github.com/shenki/buildroot
cd buildroot
make ppc64le_microwatt_defconfig
make
```

The output is `output/images/rootfs.cpio`.

2. Build the Linux kernel
```
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
make ARCH=powerpc microwatt_defconfig
make ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- \
CONFIG_INITRAMFS_SOURCE=/buildroot/output/images/rootfs.cpio -j`nproc`
```

The output is `arch/powerpc/boot/dtbImage.microwatt.elf`.

3. Build gateware using FuseSoC

First configure FuseSoC as above.
```
fusesoc run --build --target=arty_a7-100 microwatt --no_bram --memory_size=0
```

The output is `build/microwatt_0/arty_a7-100-vivado/microwatt_0.bit`.

4. Program the flash

This operation will overwrite the contents of your flash.

For the Arty A7 A100, set `FLASH_ADDRESS` to `0x400000` and pass `-f a100`.

For the Arty A7 A35, set `FLASH_ADDRESS` to `0x300000` and pass `-f a35`.
```
microwatt/openocd/flash-arty -f a100 build/microwatt_0/arty_a7-100-vivado/microwatt_0.bit
microwatt/openocd/flash-arty -f a100 dtbImage.microwatt.elf -t bin -a $FLASH_ADDRESS
```

5. Connect to the second USB TTY device exposed by the FPGA

```
minicom -D /dev/ttyUSB1
```

The gateware has firmware that will look at `FLASH_ADDRESS` and attempt to
parse an ELF there, loading it to the address specified in the ELF header
and jumping to it.

## Testing ## Testing


- A simple test suite containing random execution test cases and a couple of - A simple test suite containing random execution test cases and a couple of

@ -117,7 +117,6 @@ architecture behave of core is
signal complete: instr_tag_t; signal complete: instr_tag_t;
signal terminate: std_ulogic; signal terminate: std_ulogic;
signal core_rst: std_ulogic; signal core_rst: std_ulogic;
signal icache_inv: std_ulogic;
signal do_interrupt: std_ulogic; signal do_interrupt: std_ulogic;


-- Delayed/Latched resets and alt_reset -- Delayed/Latched resets and alt_reset

@ -1121,7 +1121,6 @@ begin
rams: for i in 0 to NUM_WAYS-1 generate rams: for i in 0 to NUM_WAYS-1 generate
signal do_read : std_ulogic; signal do_read : std_ulogic;
signal rd_addr : std_ulogic_vector(ROW_BITS-1 downto 0); signal rd_addr : std_ulogic_vector(ROW_BITS-1 downto 0);
signal do_write : std_ulogic;
signal wr_addr : std_ulogic_vector(ROW_BITS-1 downto 0); signal wr_addr : std_ulogic_vector(ROW_BITS-1 downto 0);
signal wr_data : std_ulogic_vector(wishbone_data_bits-1 downto 0); signal wr_data : std_ulogic_vector(wishbone_data_bits-1 downto 0);
signal wr_sel : std_ulogic_vector(ROW_SIZE-1 downto 0); signal wr_sel : std_ulogic_vector(ROW_SIZE-1 downto 0);

@ -42,6 +42,8 @@ begin
quot <= (others => '0'); quot <= (others => '0');
running <= '0'; running <= '0';
count <= "0000000"; count <= "0000000";
is_32bit <= '0';
overflow <= '0';
elsif d_in.valid = '1' then elsif d_in.valid = '1' then
if d_in.is_extended = '1' then if d_in.is_extended = '1' then
dend <= '0' & d_in.dividend & x"0000000000000000"; dend <= '0' & d_in.dividend & x"0000000000000000";

@ -113,8 +113,6 @@ architecture behaviour of execute1 is
signal misc_result: std_ulogic_vector(63 downto 0); signal misc_result: std_ulogic_vector(63 downto 0);
signal muldiv_result: std_ulogic_vector(63 downto 0); signal muldiv_result: std_ulogic_vector(63 downto 0);
signal spr_result: std_ulogic_vector(63 downto 0); signal spr_result: std_ulogic_vector(63 downto 0);
signal result_mux_sel: std_ulogic_vector(2 downto 0);
signal sub_mux_sel: std_ulogic_vector(2 downto 0);
signal next_nia : std_ulogic_vector(63 downto 0); signal next_nia : std_ulogic_vector(63 downto 0);
signal current: Decode2ToExecute1Type; signal current: Decode2ToExecute1Type;



@ -555,7 +555,11 @@ begin
-- I prefer not to do just yet as it would force fetch2 to know about -- I prefer not to do just yet as it would force fetch2 to know about
-- some of the cache geometry information. -- some of the cache geometry information.
-- --
i_out.insn <= read_insn_word(r.hit_nia, cache_out(r.hit_way)); if r.hit_valid = '1' then
i_out.insn <= read_insn_word(r.hit_nia, cache_out(r.hit_way));
else
i_out.insn <= (others => '0');
end if;
i_out.valid <= r.hit_valid; i_out.valid <= r.hit_valid;
i_out.nia <= r.hit_nia; i_out.nia <= r.hit_nia;
i_out.stop_mark <= r.hit_smark; i_out.stop_mark <= r.hit_smark;
@ -820,4 +824,7 @@ begin
end process; end process;
log_out <= log_data; log_out <= log_data;
end generate; end generate;

events <= ev;

end; end;

@ -74,6 +74,9 @@ begin
i_out.req <= '0'; i_out.req <= '0';
i_out.nia <= (others => '0'); i_out.nia <= (others => '0');
i_out.stop_mark <= '0'; i_out.stop_mark <= '0';
i_out.priv_mode <= '1';
i_out.virt_mode <= '0';
i_out.big_endian <= '0';


m_out.tlbld <= '0'; m_out.tlbld <= '0';
m_out.tlbie <= '0'; m_out.tlbie <= '0';

@ -275,10 +275,27 @@ begin
if rising_edge(clk) then if rising_edge(clk) then
if rst = '1' then if rst = '1' then
r1.req.valid <= '0'; r1.req.valid <= '0';
r1.req.tlbie <= '0';
r1.req.is_slbia <= '0';
r1.req.instr_fault <= '0';
r1.req.load <= '0';
r1.req.priv_mode <= '0';
r1.req.sprn <= (others => '0');
r1.req.xerc <= xerc_init;

r2.req.valid <= '0'; r2.req.valid <= '0';
r2.req.tlbie <= '0';
r2.req.is_slbia <= '0';
r2.req.instr_fault <= '0';
r2.req.load <= '0';
r2.req.priv_mode <= '0';
r2.req.sprn <= (others => '0');
r2.req.xerc <= xerc_init;

r2.wait_dc <= '0'; r2.wait_dc <= '0';
r2.wait_mmu <= '0'; r2.wait_mmu <= '0';
r2.one_cycle <= '0'; r2.one_cycle <= '0';

r3.dar <= (others => '0'); r3.dar <= (others => '0');
r3.dsisr <= (others => '0'); r3.dsisr <= (others => '0');
r3.state <= IDLE; r3.state <= IDLE;

Loading…
Cancel
Save