forked from aws/aws-fpga
Showing 8 changed files with 150 additions and 35 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
|
||
# AWS EC2 FPGA HDK+SDK Errata | ||
|
||
Any items in this release marked as WIP (Work-in-progress) or NA (Not available yet) are not currently supported by the 1.2.0 release. | ||
|
||
## Integrated DMA in Beta Release. AWS Shell now includes DMA capabilities on behalf of the CL | ||
* The DMA bus toward the CL is multiplexed over sh_cl_dma_pcis AXI4 interface so the same address space can be accessed via DMA or directly via PCIe AppPF BAR4 | ||
* DMA usage is covered in the new [CL_DRAM_DMA example](./hdk/cl/examples/cl_dram_dma), which includes RTL verification/simulation and software | ||
* A corresponding AWS Elastic DMA ([EDMA](./sdk/linux_kernel_drivers/edma)) driver is provided. | ||
* [EDMA Installation Readme](./sdk/linux_kernel_drivers/edma/edma_install.md) provides installation and usage guidelines | ||
* The initial release supports a single queue in each direction | ||
* DMA support is in Beta stage with a known issue for DMA READ transactions that cross 4K address boundaries. See [Kernel_Drivers_README](./sdk/linux_kernel_drivers/edma/README.md) for more information on restrictions for this release | ||
|
||
## Implementation Restrictions | ||
|
||
* PCIE AXI4 interfaces between Custom Logic (CL) and Shell (SH) have the following restrictions: | ||
* All PCIe transactions must adhere to the PCIe Express base spec | ||
* Transactions must not cross a 4 Kbyte address boundary (PCIe restriction) | ||
* Multiple outstanding outbound PCIe Read transactions with same ID not supported | ||
* PCIE extended tag not supported, so read-request is limited to 32 outstanding | ||
* Address must match DoubleWord(DW) address of the transaction | ||
* WSTRB(write strobe) must reflect appropriate valid bytes for AXI write beats | ||
* Only Increment burst type is supported | ||
* AXI lock, memory type, protection type, Quality of service and Region identifier are not supported | ||
* PCIE AXI4 interfaces between Custom Logic(CL) and Shell(SH) must follow the AMBA AXI4 protocol specification. | ||
* Prior to running on F1 instance, it is highly recommended that developers run logic simulations with the ARM or Xilinx AXI4 protocol checker | ||
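Two of the restrictions above (no crossing of a 4 Kbyte boundary, increment bursts only) are easy to check in a software model before simulation. The following is a minimal sketch, not part of the HDK; the `AxiBurst` class and the `violations` function are hypothetical names introduced here for illustration, and it does not replace an AXI4 protocol checker.

```python
from dataclasses import dataclass

BOUNDARY = 4096  # 4 Kbyte boundary named in the restriction list


@dataclass
class AxiBurst:
    """Hypothetical description of one AXI burst (illustration only)."""
    addr: int                 # byte address of the first beat
    length_bytes: int         # total bytes transferred by the burst
    burst_type: str = "INCR"  # "INCR", "FIXED" or "WRAP"


def violations(burst):
    """Return the listed restrictions that this burst breaks."""
    problems = []
    if burst.length_bytes <= 0:
        problems.append("burst has no data")
        return problems
    last_byte = burst.addr + burst.length_bytes - 1
    # First and last byte must fall in the same 4 Kbyte window.
    if burst.addr // BOUNDARY != last_byte // BOUNDARY:
        problems.append("crosses 4 Kbyte boundary")
    if burst.burst_type != "INCR":
        problems.append("burst type is not INCR")
    return problems


if __name__ == "__main__":
    print(violations(AxiBurst(addr=0x0FC0, length_bytes=128)))
    print(violations(AxiBurst(addr=0x1000, length_bytes=4096)))
```

A check like this only covers address arithmetic and burst type; strobes, IDs and outstanding-read limits still need the protocol checker recommended above.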
|
||
|
||
## Unsupported Features (Planned for future releases) | ||
|
||
* PCI-M AXI interface is not supported in this release. | ||
* FPGA to FPGA communication over PCIe for F1.16xl | ||
* FPGA to FPGA over the 400Gbps Ring for F1.16xl | ||
* Aurora and Reliable Aurora modules for the FPGA-to-FPGA | ||
* Preserving the DRAM content between different AFI loads (by the same running instance) | ||
* Cadence RTL simulations tools | ||
* All AXI-4 interfaces (PCIM, DDR4) do not support AxSIZE other than 0b110 (64B) | ||
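For reference, the AXI4 specification encodes AxSIZE as the base-2 logarithm of the bytes per beat, which is why 0b110 corresponds to 64 bytes. The short sketch below shows that arithmetic; the function names are hypothetical and introduced only for illustration.

```python
SUPPORTED_AXSIZE = 0b110  # the only value supported per the list above


def axsize_to_bytes(axsize):
    """Bytes per beat for a 3-bit AxSIZE value (AXI4: 2 ** AxSIZE)."""
    if not 0 <= axsize <= 0b111:
        raise ValueError("AxSIZE is a 3-bit field")
    return 1 << axsize


def is_supported_axsize(axsize):
    """True only for the 64-byte beat size this release accepts."""
    return axsize == SUPPORTED_AXSIZE


if __name__ == "__main__":
    for value in range(8):
        print(f"AxSIZE=0b{value:03b}: {axsize_to_bytes(value)} bytes, "
              f"supported={is_supported_axsize(value)}")
```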
|
||
## Known Bugs/Issues | ||
|
||
* The PCI-M AXI interface is not supported in this release. | ||
* The interface is included in cl_ports.vh and required in a CL design, but not enabled for functional use | ||
|
||
* The integrated DMA function is in Beta stage. Known issues: | ||
* DMA READ transfers that cross 4K page boundaries can fail. The failure is triggered by READ transfers that start at an address that is not 4K aligned AND cross a 4K page boundary. READ transfers that do not cross a 4K boundary, and READ transfers that start at the beginning of a 4K page (even if larger than 4K), are not susceptible to the error. WRITE transfers are not affected by this issue. To avoid the issue, developers should start any READ transfer that can cross a 4K boundary at a 4K-aligned address. | ||
* Transfer sizes of 8KB or less are supported with the integrated DMA engine for this revision of the Shell. Integrated DMA with large transfer sizes (16KB or greater) can cause timeouts between the Shell and CL if the Shell can’t respond with all data before the timeout. Please see documentation on how to [detect a timeout has occurred](./hdk/docs/HOWTO_detect_shell_timeout.md) | ||
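The two DMA conditions above can be expressed as simple address arithmetic. The sketch below is an illustration under assumptions, not the EDMA driver's behavior: `read_is_susceptible` and `split_read` are hypothetical helper names, and the 8KB cap is taken from the supported transfer size stated above.

```python
PAGE = 4096              # 4K page size from the known issue
MAX_TRANSFER = 8 * 1024  # largest transfer size listed as supported


def read_is_susceptible(addr, size):
    """True if a READ starts unaligned AND crosses a 4K page boundary."""
    if size <= 0:
        return False
    unaligned = addr % PAGE != 0
    crosses = addr // PAGE != (addr + size - 1) // PAGE
    return unaligned and crosses


def split_read(addr, size):
    """Split a READ into (addr, size) chunks that avoid the issue.

    An unaligned chunk stops at the next 4K boundary; an aligned chunk
    is capped at MAX_TRANSFER.
    """
    chunks = []
    end = addr + size
    while addr < end:
        offset = addr % PAGE
        limit = PAGE - offset if offset else MAX_TRANSFER
        n = min(limit, end - addr)
        chunks.append((addr, n))
        addr += n
    return chunks


if __name__ == "__main__":
    print(read_is_susceptible(0x1010, 4096))
    for chunk_addr, chunk_size in split_read(0x1F00, 0x300):
        print(hex(chunk_addr), hex(chunk_size))
```

Splitting in software trades one large transfer for several smaller ones; whether that cost is acceptable depends on the application.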
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
|
||
# AXI Timeouts | ||
|
||
* The Shell provides a timeout mechanism that terminates any outstanding AXI transaction after 2.5 µs. Each interface has its own timeout. On the first timeout, the metrics registers are updated with the offending address and a counter is incremented. On further timeouts, the counter is incremented again. These metrics registers can be read with the fpga-describe-local-image command, described in the [Amazon FPGA Image Management Tools README](../../sdk//userspace/fpga_mgmt_tools/README.md) | ||
|
||
* Timeouts can occur for three reasons: | ||
1. The CL doesn’t respond to the address (reserved address space) | ||
2. The CL has a protocol violation on AXI which hangs the bus | ||
3. The address targets the F1 card’s DDR memory and the CL design’s latency exceeds the timeout value. | ||
|
||
* Best practice is to ensure addresses to reserved address space are fully decoded in your CL design. | ||
* DMA accesses to DDR can accumulate, which sometimes leads to timeouts. | ||
* CL designs which have multiple masters to DDR will also incur arbitration delays. | ||
* If you suspect a timeout, debug by reading the metrics registers. The saved offending address should help narrow down whether the access was to DDR or to registers/RAMs inside the FPGA. If it was inside the FPGA, the developer should investigate protocol violations. | ||
|
||
# How to detect a shell timeout has occurred | ||
|
||
* Shell-CL interface timeouts can be detected by checking for non-zero timeout counters. These metrics can be read using this command: | ||
``` | ||
$ sudo fpga-describe-local-image -S 0 --metrics | ||
AFI 0 agfi-0f0e045f919413242 loaded 0 ok 0 0x04151701 | ||
AFIDEVICE 0 0x1d0f 0xf000 0000:00:1d.0 | ||
sdacl-slave-timeout=0 | ||
virtual-jtag-slave-timeout=0 | ||
ocl-slave-timeout=0 | ||
bar1-slave-timeout=0 | ||
dma-pcis-timeout=0 | ||
pcim-range-error=0 | ||
pcim-axi-protocol-error=0 | ||
pcim-axi-protocol-4K-cross-error=0 | ||
pcim-axi-protocol-bus-master-enable-error=0 | ||
pcim-axi-protocol-request-size-error=0 | ||
pcim-axi-protocol-write-incomplete-error=0 | ||
pcim-axi-protocol-first-byte-enable-error=0 | ||
pcim-axi-protocol-last-byte-enable-error=0 | ||
pcim-axi-protocol-bready-error=0 | ||
pcim-axi-protocol-rready-error=0 | ||
pcim-axi-protocol-wchannel-error=0 | ||
sdacl-slave-timeout-addr=0x0 | ||
sdacl-slave-timeout-count=0 | ||
virtual-jtag-slave-timeout-addr=0x0 | ||
virtual-jtag-slave-timeout-count=0 | ||
ocl-slave-timeout-addr=0x8001 | ||
ocl-slave-timeout-count=0 | ||
bar1-slave-timeout-addr=0x2001 | ||
bar1-slave-timeout-count=0 | ||
dma-pcis-timeout-addr=0x0 | ||
dma-pcis-timeout-count=0 | ||
pcim-range-error-addr=0x0 | ||
pcim-range-error-count=0 | ||
pcim-axi-protocol-error-addr=0x0 | ||
pcim-axi-protocol-error-count=0 | ||
pcim-write-count=0 | ||
pcim-read-count=0 | ||
DDR0 | ||
write-count=0 | ||
read-count=0 | ||
DDR1 | ||
write-count=0 | ||
read-count=0 | ||
DDR2 | ||
write-count=29797854199 | ||
read-count=4 | ||
DDR3 | ||
write-count=0 | ||
read-count=0 | ||
``` | ||
* For detailed information on metrics, see [Amazon FPGA Image Management Tools README](../../sdk//userspace/fpga_mgmt_tools/README.md) |
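Checking the counters by eye is error-prone when the output is long. The following is a minimal sketch that scans captured metrics text for non-zero timeout counters. It assumes the `name=value` line format shown in the sample output above; `nonzero_timeouts` is a hypothetical helper, not part of the FPGA management tools.

```python
import re

# Matches lines such as "dma-pcis-timeout=0" or "ocl-slave-timeout-count=3".
# Address lines ("...-timeout-addr=0x8001") are intentionally not matched.
_TIMEOUT_LINE = re.compile(r"\s*([\w-]+-timeout(?:-count)?)=(\d+)\s*$")


def nonzero_timeouts(metrics_text):
    """Return {metric name: value} for every non-zero timeout metric."""
    found = {}
    for line in metrics_text.splitlines():
        match = _TIMEOUT_LINE.match(line)
        if match and int(match.group(2)) > 0:
            found[match.group(1)] = int(match.group(2))
    return found


if __name__ == "__main__":
    sample = """\
ocl-slave-timeout=0
dma-pcis-timeout=1
ocl-slave-timeout-addr=0x8001
dma-pcis-timeout-count=3
"""
    for name, value in nonzero_timeouts(sample).items():
        print(f"{name} = {value}")
```

The text passed to `nonzero_timeouts` could be captured from the command shown above, for example with Python's `subprocess.run(..., capture_output=True, text=True)`. An empty result means no timeout metric was non-zero.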
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
-HDK_VERSION=1.2.2 | ||
+HDK_VERSION=1.2.3 |