1      PCI IP Design

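Although PCI is gradually being phased out, quite a few applications still need to communicate over this interface.

The goal of this design is to provide a source-code PCI IP, so that the hardware is not tied to one particular FPGA model and migration to an ASIC is easier. Because the PCI bus here uses standard 3.3 V I/O levels and, unlike PCIe, needs no high-speed transceivers or 8b/10b encoding, a source-code PCI IP is entirely feasible, and the IP described here has been verified.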

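1.1     Functional Requirements

  • Accepts parameter settings from other FPGA modules; external parameters decide when PCI data transfer starts and stops
  • Compliant with PCI 2.2, 33 MHz clock, 32-bit address/data
  • Operates in Target mode; accepts data reads/writes and register reads/writes
  • Monitors bus status and reports errors, choosing the follow-up action (for example, retry or stop the current transfer) according to the error type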

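1.2     IP Features

  • Standard 32-bit, 33 MHz PCI Target interface;
  • Wishbone master interface with block/burst read and write support;
  • All data is little-endian, which makes it easy to store on Windows and matches common practice;
  • Standard PCI configuration register space, with modifiable parameters;
  • BAR0 register, occupying 32 MB in the PCI memory map;
  • The PCI commands that must be supported are: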

0110                   Memory Read

0111                   Memory Write

1010                   Configuration Read

1011                   Configuration Write

1100                   Memory Read Multiple

1110                   Memory Read Line

1111                   Memory Write and Invalidate

  • PCI reads and writes can be retried; the user logic requests a retry through the Wishbone master interface;
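
As a rough illustration of the command list above, the following Python sketch models how a target could classify the 4-bit C/BE# value sampled during the address phase. The names `SUPPORTED_COMMANDS`, `decode_command`, and `is_write` are hypothetical and are not taken from the IP source; only the encodings come from the list.

```python
# Hypothetical model (not the IP's RTL): classify the C/BE[3:0]# value
# sampled in the address phase against the supported command list.
SUPPORTED_COMMANDS = {
    0b0110: "Memory Read",
    0b0111: "Memory Write",
    0b1010: "Configuration Read",
    0b1011: "Configuration Write",
    0b1100: "Memory Read Multiple",
    0b1110: "Memory Read Line",
    0b1111: "Memory Write and Invalidate",
}


def decode_command(cbe):
    """Return the command name, or None when the command is not in the list."""
    return SUPPORTED_COMMANDS.get(cbe & 0xF)


def is_write(cbe):
    """For the encodings above, the least significant bit is 1 for writes."""
    return decode_command(cbe) is not None and (cbe & 1) == 1
```

In this sketch a command outside the list (for example 0b0010, I/O Read) decodes to `None`, which stands for the target not claiming the transaction.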

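1.3     Interface Definition / Pin Description

Figure 5-1 PCI Core interfaces and parameter list (as shown in Vivado)

The interfaces fall into two groups:

  • PCI target interface;
  • Wishbone master interface;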

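Parameters:

  • Bars: "1BARMEM" / "1BARIO"; memory mode and I/O mode are supported, and the default memory mode is recommended;
  • Wb endian: whether the Wishbone bus is big-endian or little-endian; the default little-endian mode is recommended;
  • Wb size: Wishbone bus width; the default of 32 is recommended;
  • Class code ID: PCI class code;
  • Device ID: PCI device ID of this device; can be chosen freely;
  • Revision ID: PCI revision ID, the version of the current firmware; can be chosen freely;
  • Subsystem ID: subsystem ID; usually the same as the Device ID, but it may differ;
  • Subsystem Vendor ID: subsystem vendor ID, specified by the user; usually the same as the Vendor ID;
  • Vendor ID: PCI vendor ID, which identifies the device manufacturer. An official Vendor ID must be requested from the PCI SIG, but in a closed system the user may choose one, as long as it does not conflict with existing devices, because the CPU relies on the Vendor ID and Device ID to distinguish device types.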

Signal type declarations

Pin directions:

in          Standard input only.

out         Standard output only.

t/s         Tri-State is a bi-directional, tri-state input/output pin.

s/t/s       Sustained Tri-State is an active low tri-state signal owned and driven by one and only one agent at a time. The agent that drives an s/t/s pin low must drive it high for at least one clock before letting it float. A new agent cannot start driving a s/t/s signal any sooner than one clock after the previous owner tri-states it. A pullup is required to sustain the inactive state until another agent drives it and must be provided by the central resource. Pay extra attention to the timing of this class of signals to avoid timing errors.

o/d         Open Drain allows multiple devices to share as a wire-OR. A pull-up is required to sustain the inactive state until another agent drives it and must be provided by the central resource.

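1.3.1       Internal User Interface

The internal interface is the user interface, a Wishbone master interface. This subsection covers its signal definitions and key timing.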

| Name | Direction | Group | Definition |
| --- | --- | --- | --- |
| wb_adr_o | out | Internal pins | Wishbone address. |
| wb_dat_i | in | Internal pins | Wishbone data in. |
| wb_dat_o | out | Internal pins | Wishbone data out. |
| wb_sel_o | out | Internal pins | Wishbone data byte selection. The select output array [SEL_O()] indicates where valid data is expected on the [DAT_I()] signal array during READ cycles, and where it is placed on the [DAT_O()] signal array during WRITE cycles. Each bit marks the corresponding byte as valid; during data cycles it is normally 0xF, meaning all 32 bits are valid. |
| wb_we_o | out | Internal pins | Wishbone write enable. The write enable output [WE_O] indicates whether the current local bus cycle is a READ or WRITE cycle. The signal is negated during READ cycles, and is asserted during WRITE cycles. It tells whether the current operation is a read or a write. |
| wb_stb_o | out | Internal pins | Wishbone data strobe. The strobe output [STB_O] indicates a valid data transfer cycle. It is used to qualify various other signals on the interface such as [SEL_O()]. The SLAVE asserts either the [ACK_I], [ERR_I] or [RTY_I] signals in response to every assertion of the [STB_O] signal. Data-valid flag. |
| wb_cyc_o | out | Internal pins | Wishbone cycle. The cycle output [CYC_O], when asserted, indicates that a valid bus cycle is in progress. The signal is asserted for the duration of all bus cycles. For example, during a BLOCK transfer cycle there can be multiple data transfers. The [CYC_O] signal is asserted during the first data transfer, and remains asserted until the last data transfer. It stays asserted for the whole transfer. |
| wb_ack_i | in | Internal pins | Wishbone acknowledge. The acknowledge input [ACK_I], when asserted, indicates the normal termination of a bus cycle. During a block transfer it may stay asserted continuously. |
| wb_rty_i | in | Internal pins | Wishbone retry. The retry input [RTY_I] indicates that the interface is not ready to accept or send data, and that the cycle should be retried. The slave asks the master to retry. |
| wb_err_i | in | Internal pins | Wishbone error. The error input [ERR_I] indicates an abnormal cycle termination. |
| wb_int_i | in | Internal pins | Wishbone interrupt. Interrupt pin; not actually used. |

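1.3.2       Wishbone Read/Write Timing

Figure 5-2 Single read by PCI through the Wishbone interface; this can be used to read user registers. When wb_stb_o & wb_cyc_o == 1 the read request is valid; the slave responds on the next clock cycle and returns the data at the requested address.

Figure 5-3 Single write by PCI through the Wishbone interface; this can be used to write user registers. When wb_stb_o & wb_cyc_o == 1 the write request is valid; the slave responds on the next clock cycle, indicating that the requested write has completed.

Figure 5-4 PCI block read; this can be used for DMA data reads. The PCI read is converted into Wishbone reads. When wb_stb_o & wb_cyc_o == 1 the read request is valid and valid data is returned on the next clock cycle; the read request in the final cycle needs no response.

Figure 5-5 PCI block write; this can be used for DMA data writes. The PCI write is converted into Wishbone writes. When wb_stb_o & wb_cyc_o == 1 the write request is valid and the write-complete response is returned on the next clock cycle; the write request in the final cycle needs no response.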
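
The single read and write handshake described in these captions can be sketched as a cycle-level Python model of a register-file slave. This is only an illustration of the "request valid when wb_stb_o & wb_cyc_o == 1, response on the next clock" rule; the class name `WishboneRegSlave`, its memory size, and the word addressing are assumptions, and block transfers (where the acknowledge may stay asserted) are not modeled.

```python
class WishboneRegSlave:
    """Hypothetical slave model: acknowledges one clock after a valid request."""

    def __init__(self, size_words=16):
        self.mem = [0] * size_words
        self.ack = 0  # wb_ack_i as seen by the master after the last edge
        self.dat = 0  # wb_dat_i as seen by the master after the last edge

    def clock(self, cyc, stb, we=0, adr=0, dat=0, sel=0xF):
        """Advance one rising edge and return (ack, dat) visible afterwards."""
        if cyc and stb and not self.ack:
            index = (adr >> 2) % len(self.mem)  # 32-bit words, byte address
            if we:
                # Merge only the byte lanes selected by wb_sel_o.
                mask = sum(0xFF << (8 * lane)
                           for lane in range(4) if (sel >> lane) & 1)
                self.mem[index] = (self.mem[index] & ~mask) | (dat & mask)
            # The returned data is only meaningful for reads.
            self.ack, self.dat = 1, self.mem[index]
        else:
            self.ack, self.dat = 0, 0
        return self.ack, self.dat
```

A request presented at one edge is answered with ack = 1 after that edge, that is, during the following clock cycle; the master is expected to drop wb_stb_o once it sees the acknowledge.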

1.3.3       PCI Physical Interface

Reference: PCI Specification V2.2

Table 5-3 PCI interface

| Name | Direction | Group | Definition |
| --- | --- | --- | --- |
| clk33 | in | System pins | Clock provides timing for all transactions on PCI and is an input to every PCI device. All other PCI signals, except RST#, INTA#, INTB#, INTC#, and INTD#, are sampled on the rising edge of CLK and all other timing parameters are defined with respect to this edge. This signal should come from the on-board oscillator. |
| rst | in | System pins | Reset is used to bring PCI-specific registers, sequencers, and signals to a consistent state. Anytime RST# is asserted, all PCI output signals must be driven to their benign state. In general, this means they must be asynchronously tri-stated. REQ# and GNT# must both be tri-stated (they cannot be driven low or high during reset). To prevent AD, C/BE#, and PAR signals from floating during reset, the central resource may drive these lines during reset (bus parking) but only to a logic low level; they may not be driven high. This signal comes from the on-board reset. |
| ad [31:0] | t/s | Address and Data Pins | Address and Data are multiplexed on the same PCI pins. A bus transaction consists of an address phase followed by one or more data phases. PCI supports both read and write bursts. |
| cbe [3:0]# | t/s | Address and Data Pins | Bus Command and Byte Enables are multiplexed on the same PCI pins. During the address phase of a transaction, C/BE[3::0]# define the bus command (refer to Section 3.1 for bus command definitions). During the data phase, C/BE[3::0]# are used as Byte Enables. |
| par | t/s | Address and Data Pins | Parity is even parity across AD[31::00] and C/BE[3::0]#. Parity generation is required by all PCI agents. Parity generation is not optional, it must be done by all PCI-compliant devices. Even parity is a mandatory feature. |
| frame# | s/t/s | Interface Control Pins | Cycle Frame is driven by the current master to indicate the beginning and duration of an access. FRAME# is asserted to indicate a bus transaction is beginning. While FRAME# is asserted, data transfers continue. When FRAME# is deasserted, the transaction is in the final data phase or has completed. The read/write timing diagrams show how a transfer ends. |
| irdy# | s/t/s | Interface Control Pins | Initiator Ready indicates the initiating agent's (bus master's) ability to complete the current data phase of the transaction. IRDY# is used in conjunction with TRDY#. A data phase is completed on any clock both IRDY# and TRDY# are asserted. During a write, IRDY# indicates that valid data is present on AD[31::00]. During a read, it indicates the master is prepared to accept data. Wait cycles are inserted until both IRDY# and TRDY# are asserted together. |
| trdy# | s/t/s | Interface Control Pins | Target Ready indicates the target agent's (selected device's) ability to complete the current data phase of the transaction. TRDY# is used in conjunction with IRDY#. A data phase is completed on any clock both TRDY# and IRDY# are asserted. During a read, TRDY# indicates that valid data is present on AD[31::00]. During a write, it indicates the target is prepared to accept data. Wait cycles are inserted until both IRDY# and TRDY# are asserted together. |
| stop | s/t/s | Interface Control Pins | Stop indicates the current target is requesting the master to stop the current transaction. |
| devsel | s/t/s | Interface Control Pins | Device Select, when actively driven, indicates the driving device has decoded its address as the target of the current access. As an input, DEVSEL# indicates whether any device on the bus has been selected. |
| idsel | in | Interface Control Pins | Initialization Device Select is used as a chip select during configuration read and write transactions. |
| perr# | s/t/s | Error Reporting Pins | Parity Error is only for the reporting of data parity errors during all PCI transactions except a Special Cycle. The PERR# pin is sustained tri-state and must be driven active by the agent receiving data (when enabled) two clocks following the data when a data parity error is detected. The minimum duration of PERR# is one clock for each data phase that a data parity error is detected. |
| serr# | o/d | Error Reporting Pins | System Error is for reporting address parity errors, data parity errors on the Special Cycle command, or any other system error where the result will be catastrophic. |
| intb# | o/d | Interrupt Pins | Interrupt C is used to request an interrupt and only has meaning on a multi-function device. This project does not use this pin. |
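
The even-parity rule quoted for the par pin can be checked with a few lines of Python. This is a minimal sketch of the rule only (the number of 1s across AD[31:0], C/BE[3:0]# and PAR must be even); the function names are hypothetical, and the one-clock delay between the AD/C/BE# values and the PAR value that covers them is not modeled.

```python
def pci_par(ad, cbe):
    """PAR value that makes the count of 1s on AD, C/BE# and PAR even."""
    bits = ((cbe & 0xF) << 32) | (ad & 0xFFFFFFFF)
    return bin(bits).count("1") & 1


def parity_ok(ad, cbe, par):
    """True when a sampled PAR matches the value computed from AD and C/BE#."""
    return pci_par(ad, cbe) == (par & 1)
```

For example, an address phase with AD = 0x3000_0000 and C/BE# = 0110 carries four 1 bits, so PAR is 0. A receiver would compare the PAR it samples against `pci_par` and report a mismatch on perr# (data phases) or serr# (address phase).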

Below are the signal names provided by the customer. REQ and GNT are used only by bus masters, so this design does not need them.

  • STOP
  • SERR
  • PERR
  • RST
  • FRAME
  • TRDY
  • DEVSEL
  • PAR
  • CLK
  • INTC
  • REQ[3:0]          Bus masters only, so not needed
  • GNT[3:0]           Bus masters only, so not needed
  • CBE[3:0]
  • AD[31:0]

1.4     State Machine

This part is adapted from the reference state machine in PCI Specification V2.2.

The spec describes the state machines as follows:

“Caution needs to be taken when an agent is both a master and a target. Each must have its own state machine that can operate independently of the other to avoid deadlocks. This means that the target state machine cannot be affected by the master state machine. Although they have similar states, they cannot be built into a single machine.”

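Figure 5-6 PCI Core state machine diagram; the actual implementation has two additional states

To convert PCI operations into Wishbone reads and writes, the implementation adds two states, S_DATA2 and TURN_ARE, whose main purpose is to satisfy signal timing.

State machine states: PCIIDLE, B_BUSY, S_DATA1, S_DATA2, BACKOFF, TURN_ARL, TURN_ARE. The detailed transitions are best understood from the code; this document does not describe them in detail for now.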

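1.5     Key PCI Timing

This section uses timing diagrams to show, for each PCI operation, the timing required by the PCI spec alongside the actual simulated timing of the PCI Core provided by this project. In the simulation waveforms the IRDY signal does not strictly meet the requirement (that signal comes from the PCI master; this project does not need to deliver a PCI master and only provides a simulation model), but this does not affect the results.

Figure 5-7 Reading a configuration register

Figure 5-8 Reading Device ID 0x0001 and Vendor ID 0x4150; each occupies 16 bits, with the Vendor ID in the low 16 bits. In this waveform IRDY is asserted for one clock cycle more than in theory; this is only an artifact of the simulation PCI master model and does not affect data correctness, so it can be ignored.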
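
The layout described in this caption (Vendor ID in the low 16 bits, Device ID in the high 16 bits of configuration offset 0) explains the value 'h14150 checked by the testbench. A small Python sketch, with hypothetical helper names:

```python
def id_dword(vendor_id, device_id):
    """Pack configuration offset 0x00: Device ID [31:16], Vendor ID [15:0]."""
    return ((device_id & 0xFFFF) << 16) | (vendor_id & 0xFFFF)


def split_id_dword(value):
    """Return (vendor_id, device_id) from the dword read at offset 0x00."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF
```

With the values from the simulation, `id_dword(0x4150, 0x0001)` gives 0x00014150.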

Figure 5-9 Writing a PCI configuration register: base address register at offset 0x10, data written 0x3000_0000
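
To relate the write of 0x3000_0000 to the 32 MB BAR0 window, the following Python sketch models a memory BAR using the standard PCI sizing convention: only the address bits above the window size are writable. Whether this core behaves exactly this way is an assumption, as are the class name `Bar0`, the all-zero low flag bits (32-bit, non-prefetchable memory), and the omission of the memory-enable bit in the command register.

```python
BAR_SIZE = 32 * 1024 * 1024               # 32 MB window from the feature list
BAR_MASK = ~(BAR_SIZE - 1) & 0xFFFFFFFF   # writable address bits: 0xFE000000


class Bar0:
    """Hypothetical model of a 32 MB memory BAR (not the IP's RTL)."""

    def __init__(self):
        self.value = 0

    def write(self, data):
        # Bits below the window size are hard-wired to 0 (assumed flags = 0).
        self.value = data & BAR_MASK

    def read(self):
        return self.value

    def hit(self, address):
        """True when a memory address falls inside the programmed window."""
        return (address & BAR_MASK) == self.value
```

Under these assumptions, writing 0xFFFF_FFFF reads back 0xFE00_0000, from which software derives the 32 MB size, and after writing 0x3000_0000 the window covers 0x3000_0000 through 0x31FF_FFFF, which contains the addresses used by the testbench.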

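Figure 5-10 PCI configuration register list

Figure 5-11 Reading a user register from the user address space

Figure 5-12 Writing a register in the user address space

Figure 5-13 Reading data from the user address space in block mode

Figure 5-14 Writing data to the user address space in block mode

The simulation code includes tests of register reads and writes: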

    // Testbench stimulus; the pci_* tasks are assumed to be defined elsewhere in the testbench.
    start_read = 0;
    addr       = 0;
    length     = 1;

    // Apply reset
    rst = 1;
    #200;
    rst = 0;

    // Vendor ID / Device ID
    pci_configuration_read(0, data_value);
    assert(data_value == 'h14150);

    // Command / Status
    pci_configuration_read(4, data_value);
    assert(data_value == 'h02000002);

    // Program BAR0, then check that the IDs are unchanged and the BAR reads back
    pci_configuration_write('h10, 'h3000_0000);
    pci_configuration_read(0, data_value);
    assert(data_value == 'h14150);
    pci_configuration_read('h10, data_value);
    assert(data_value == 'h3000_0000);

    // Single reads; the address advances by WBSIZE/8 = 4 bytes each time
    pci_memory_read('h3000_0000, data_value);
    assert(data_value == 0);
    pci_memory_read('h3000_0004, data_value);
    assert(data_value == 1);
    pci_memory_read('h3000_0008, data_value);
    assert(data_value == 2);

    // Single write followed by read-back
    pci_memory_write('h3000_0000, 100);
    pci_memory_read('h3000_0000, data_value);
    assert(data_value == 100);
    pci_memory_write('h3000_0000, 0);                  // write 0 back

    // Block (line) reads and writes
    pci_memory_read_line('h3000_0000, 4, data_value);  // read line
    pci_memory_write_line('h3000_0000, 4, 111);        // write line
    pci_memory_read_line('h3000_0000, 4, data_value);  // read line
    pci_memory_read_line('h3000_0000, 5, data_value);  // read line
    pci_memory_read_line('h3000_0000, 6, data_value);  // read line
    pci_memory_read_line('h3000_0010, 10, data_value); // read line

1.6     Supported PCI Commands

  • Supported PCI commands

0110                   Memory Read

0111                   Memory Write

1010                   Configuration Read

1011                   Configuration Write

1100                   Memory Read Multiple

1110                   Memory Read Line

1111                   Memory Write and Invalidate

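The following are notes on how the spec describes the relevant commands. Users do not need to pay particular attention to them; they are only a record kept by the developers.

As the spec passage referred to above indicates, a plain Memory Read is used here to read only 4 bytes at a time, which is very inefficient.

For Memory Read Line / Memory Read Multiple, the spec strongly recommends implementing the Cacheline Size register. The following passage shows the role of that register and the constraints it imposes.

It is highly recommended that the Cacheline Size register be implemented to ensure correct use of the read commands.

This applies only to master devices; a target device merely receives requests.

A bridge may read more data than the master asked for in order to achieve high-performance transfers.

The bridge is then responsible for the prefetched data it holds. The simplest approach is to treat that data as invalid and discard it, but note that without careful handling this can lead to data loss.

The description of the Cacheline Size register in chapter 6 shows that a target-type slave device need not implement it unless it has to support cacheline wrap addressing mode; the transfer size is determined by the master.

Figure 5-15 Block read

Termination of a data transfer is controlled by the master's FRAME# signal: when FRAME# goes high (deasserted), the transfer is about to end and the next data phase is the last one.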
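
The termination rule can be illustrated with a short Python sketch that walks over per-clock samples of FRAME#, IRDY# and TRDY# (all active low) and counts completed data phases. The function name and the sample format are assumptions made for this illustration; it follows the rules quoted in the pin table (a data phase completes when IRDY# and TRDY# are both asserted, and FRAME# deasserted marks the final data phase).

```python
def count_data_phases(samples):
    """Count data phases in one transaction.

    samples: iterable of (frame_n, irdy_n, trdy_n) per clock, active low.
    """
    transfers = 0
    for frame_n, irdy_n, trdy_n in samples:
        if irdy_n == 0 and trdy_n == 0:
            transfers += 1          # data moves on this clock
            if frame_n == 1:        # FRAME# already released: last data phase
                break
    return transfers
```

For a burst where the master releases FRAME# during the fourth data phase, the function returns 4 regardless of how many wait cycles the target inserted.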
