Chisel OpenNPU
An open-source Neural Processing Unit implementation in Chisel 6. Targets low-power, edge-oriented SoC integration.
Source code: GitHub
Notation
The following three symbols appear throughout all documentation, source code, and tests. Confusing them causes hard-to-debug hardware elaboration errors.
Parameter definitions
| Symbol | Meaning | Test default | Top (K=64) |
|---|---|---|---|
| N (N(bits)) | Base lane width in bits. Matches MMALU nbits. Always spelled N(bits) in prose. | 8 | 8 |
| L | Number of base VX registers. Must be divisible by 4. | 32 | 32 |
| K | SIMD lane count per register. Equals MMALU array-side n at the backend boundary. | 8 | 64 |
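The constraints in the table can be captured as a small parameter holder that fails fast at construction time, before elaboration starts. This is a minimal sketch in plain Scala: the name `NpuParams`, its field names, and the two preset values are illustrative assumptions, not identifiers taken from the OpenNPU source.

```scala
// Hypothetical parameter holder; names are illustrative, not from the OpenNPU source.
final case class NpuParams(nBits: Int, l: Int, k: Int) {
  // N(bits): base lane width in bits.
  require(nBits > 0, s"N(bits) must be positive, got $nBits")
  // L: number of base VX registers; the table requires divisibility by 4
  // so that the VE (pairs) and VR (quads) views cover the file exactly.
  require(l > 0 && l % 4 == 0, s"L must be a positive multiple of 4, got $l")
  // K: SIMD lane count per register.
  require(k > 0, s"K must be positive, got $k")
}

object NpuParams {
  // Values copied from the "Test default" and "Top (K=64)" columns.
  val TestDefault: NpuParams = NpuParams(nBits = 8, l = 32, k = 8)
  val Top: NpuParams = NpuParams(nBits = 8, l = 32, k = 64)
}
```

Constructing `NpuParams(8, 30, 8)` throws an `IllegalArgumentException` because 30 is not divisible by 4, which surfaces the mistake earlier than a downstream elaboration error would.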
Register classes share the same physical bytes (L × K × N/8 total):
| Class | Count (L = 32) | Lane width | Aliases |
|---|---|---|---|
| VX[0..L-1] | 32 | N bits | native |
| VE[0..L/2-1] | 16 | 2N bits | VE[i] = VX[2i] ∥ VX[2i+1] |
| VR[0..L/4-1] | 8 | 4N bits | VR[i] = VX[4i..4i+3] |
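The aliasing rules and the L × K × N/8 storage total reduce to simple index arithmetic. The sketch below restates them in plain Scala; the object and method names are hypothetical, and it models only which VX registers back a wider register, not how lanes are laid out inside them.

```scala
// Hypothetical helper; names are illustrative, not from the OpenNPU source.
object RegAlias {
  /** Total register-file storage in bytes: L × K × N/8. */
  def totalBytes(nBits: Int, l: Int, k: Int): Int = l * k * nBits / 8

  /** Base VX indices backing VE[i], from VE[i] = VX[2i] ∥ VX[2i+1]. */
  def veToVx(i: Int, l: Int): Seq[Int] = {
    require(i >= 0 && i < l / 2, s"VE index $i out of range 0..${l / 2 - 1}")
    Seq(2 * i, 2 * i + 1)
  }

  /** Base VX indices backing VR[i], from VR[i] = VX[4i..4i+3]. */
  def vrToVx(i: Int, l: Int): Seq[Int] = {
    require(i >= 0 && i < l / 4, s"VR index $i out of range 0..${l / 4 - 1}")
    Seq.tabulate(4)(j => 4 * i + j)
  }
}
```

With the test defaults (N = 8, L = 32, K = 8) the file holds 32 × 8 × 8/8 = 256 bytes; with the top configuration (K = 64) it holds 2048 bytes. Because VE[5] maps to VX[10] and VX[11], and VR[5] maps to VX[20] through VX[23], a write through one class is visible through the others.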
ISA Designs
- Instructions (ISA) — 32-bit RISC-V-style encoding, 13 opcode families, funct7 attribute map, timing reference
- Memory
- Buses
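For readers unfamiliar with RISC-V-style encodings, the sketch below splits a 32-bit word into the standard RISC-V R-type fields. Whether OpenNPU places its fields at exactly these bit positions is an assumption; the Instructions page is the authoritative reference, and `RTypeFields` is a hypothetical name.

```scala
// Field positions follow the standard RISC-V R-type layout.
// ASSUMPTION: OpenNPU may place or interpret fields differently.
final case class RTypeFields(opcode: Int, rd: Int, funct3: Int, rs1: Int, rs2: Int, funct7: Int)

object RTypeFields {
  /** Extract bits hi..lo (inclusive) of a 32-bit word as an unsigned value. */
  private def bits(word: Int, hi: Int, lo: Int): Int =
    (word >>> lo) & ((1 << (hi - lo + 1)) - 1)

  def decode(word: Int): RTypeFields = RTypeFields(
    opcode = bits(word, 6, 0),
    rd     = bits(word, 11, 7),
    funct3 = bits(word, 14, 12),
    rs1    = bits(word, 19, 15),
    rs2    = bits(word, 24, 20),
    funct7 = bits(word, 31, 25)
  )
}
```

As a sanity check against base RISC-V, `RTypeFields.decode(0x002081B3)` (the encoding of `add x3, x1, x2`) yields opcode 0x33, rd 3, rs1 1, rs2 2, and funct7 0.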
Implementation Details
- Neural Core (NCore): NCoreBackend (InstrDecoder + MultiWidthRF + MMALU + VALU pipeline)
  - Processing Element (PE)
  - Systolic Array (SA)
  - Vector ALU (VALU): K-lane, FP32/BF16/BF8, multi-width arithmetic
  - Register Files: MultiWidthRegisterBlock, VX/VE/VR aliasing
- Quantization Pipeline: worked example of MMA → vcvt → vfma → vcvt INT8 requantization
- FPGA Verification Platform (xc7k480t): PCIe Gen2×8 + dual DDR3 + K=32 MMALU on Kintex-7; timing closure history and 200 MHz fabric architecture
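The MMA → vcvt → vfma → vcvt chain named in the Quantization Pipeline entry can be approximated in software as follows. This is a rough sketch under stated assumptions: INT32 accumulation, FP32 as the intermediate format, and round-to-nearest with saturation on the final conversion are guesses about the instruction semantics, and every function name here is hypothetical.

```scala
// Software approximation of an INT8 requantization chain.
// ASSUMPTION: instruction semantics below are guesses, not the OpenNPU ISA definition.
object RequantSketch {
  /** Assumed MMA step: INT8 dot product accumulated in INT32. */
  def mma(a: Seq[Byte], b: Seq[Byte]): Int =
    a.zip(b).map { case (x, y) => x.toInt * y.toInt }.sum

  /** Assumed first vcvt: INT32 accumulator to FP32. */
  def cvtToFloat(acc: Int): Float = acc.toFloat

  /** Assumed vfma: x * scale + offset (computed here without fusing). */
  def fma(x: Float, scale: Float, offset: Float): Float = x * scale + offset

  /** Assumed second vcvt: FP32 to INT8, nearest integer, saturating to [-128, 127]. */
  def cvtToInt8(x: Float): Byte =
    math.max(-128, math.min(127, math.round(x))).toByte

  def requantize(a: Seq[Byte], b: Seq[Byte], scale: Float, offset: Float): Byte =
    cvtToInt8(fma(cvtToFloat(mma(a, b)), scale, offset))
}
```

For example, the vectors (10, 20, 30) and (1, 2, 3) accumulate to 140; with a scale of 0.5 and an offset of 0 the result is 70. Large accumulators saturate at 127 rather than wrapping.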
Tutorials
- GEMM + Softmax Quantization: post-accumulation quantization pipeline for transformer attention activations; demonstrates reduction ops, programmable LUT activation (vlut/vsetlut), numerical stability, and a full end-to-end quantization chain with Scala reference verification
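The numerical-stability point in the tutorial summary can be illustrated with a small software reference. The sketch below subtracts the row maximum before exponentiation, so every exponent argument is at most 0, and evaluates the exponential through a lookup table. The table geometry (256 entries over [-8, 0]) and all names are assumptions made for illustration; they do not describe the actual vlut/vsetlut behaviour.

```scala
// Max-subtracted softmax with a table-based exp.
// ASSUMPTION: LUT size and range are illustrative, not the vlut/vsetlut definition.
object SoftmaxSketch {
  val LutSize = 256
  val LutMin = -8.0f // inputs below this are clamped

  // Table of exp(x) sampled uniformly over [LutMin, 0].
  val expLut: Array[Float] = Array.tabulate(LutSize) { i =>
    val x = LutMin + i * (-LutMin) / (LutSize - 1)
    math.exp(x).toFloat
  }

  /** Nearest-entry table lookup of exp(x) for x clamped to [LutMin, 0]. */
  def lutExp(x: Float): Float = {
    val clamped = math.max(LutMin, math.min(0.0f, x))
    val idx = math.round((clamped - LutMin) / (-LutMin) * (LutSize - 1))
    expLut(idx)
  }

  def softmax(xs: Seq[Float]): Seq[Float] = {
    require(xs.nonEmpty, "softmax needs at least one element")
    val m = xs.reduce((a, b) => math.max(a, b)) // max reduction
    val es = xs.map(x => lutExp(x - m))          // arguments are all <= 0
    val s = es.sum                               // sum reduction
    es.map(_ / s)
  }
}
```

Without the max subtraction, an input such as 1000 would overflow FP32 in the exponential. With it, the largest element always maps to exp(0) = 1, so the denominator is at least 1 and the division is well defined.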
Quick Start
```bash
# Build the dev image
make image

# Enter the dev container
make container

# Run all tests (inside container or via Docker)
make test

# Elaborate top-level design (writes top.sv)
make build
```
See README.md for full setup instructions.