3. Variation of the latency in a SerDes device

A SerDes may show latency variations related to its serial and/or parallel sub-components. In the serializer, the transmission clock that strobes the parallel data into the device is multiplied to provide the high-speed serial clock for the PISO. In the deserializer, on the other hand, the high-speed serial clock is recovered from the stream and divided down to obtain the clock for the parallel data. These two clocks drive, respectively, the serial and the parallel side of the SIPO. However, the clock division leaves the phase of the recovered parallel clock uncertain: it may vary by integer multiples of the unit interval (UI), which in turn varies the delay of the data strobed by that clock.

Let us imagine multiplying the frequency of a signal clk by a factor M, and let us call the result clkM (Figure 2). We can label each clkM edge within one clk period with an integer from 0 to M − 1. If Tclk is the period of clk, the ith edge of clkM is delayed by (i/M)·Tclk with respect to the rising edge of clk. Let us now suppose that a counter divides the frequency of clkM back by a factor M. There are M possible phases for the result, represented by the clki signals (with i = 0 to M − 1); which one is obtained depends on which edge of clkM marks the 0 in the counter. Data crossing the clock domain from clk to one of the clki do so with a latency related to the relative phase of the two clocks. After a reset or a power cycle of the system, the resulting clki signal might change, and the data latency with it. The system designer has to provide dedicated logic that removes this variation, so that the same clki signal, and consequently the same data delay, is always generated.

Figure 2. Clock multiplication and subsequent division (case M=4).
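The phase ambiguity of the divided clock can be reproduced with a few lines of Python. The following is only a minimal sketch: the function name `first_rising_edge` and the parameter `counter_init` (the arbitrary value held by the divide-by-M counter when clkM edge 0 arrives) are assumptions introduced for illustration, and the model ignores all analog effects.

```python
from fractions import Fraction


def first_rising_edge(m, t_clk, counter_init):
    """Delay of the divided clock's rising edge with respect to clk.

    Hypothetical model: the counter is incremented on every clkM edge and
    the divided clock rises on the clkM edge at which the counter reads 0.
    """
    t_edge = Fraction(t_clk) / m      # period of clkM
    counter = counter_init % m
    for edge in range(m):             # clkM edges 0 .. M-1 within one clk period
        if counter == 0:
            return edge * t_edge      # equals (i / M) * Tclk for edge i
        counter = (counter + 1) % m


if __name__ == "__main__":
    M, T_CLK = 4, 8                   # case M = 4, Tclk = 8 (arbitrary time units)
    for c in range(M):
        print(f"counter_init={c}: delay={first_rising_edge(M, T_CLK, c)}")
```

Sweeping `counter_init` over 0 to M − 1 yields the M distinct delays (i/M)·Tclk, one per clki signal. In this simplified view, a phase-alignment logic amounts to forcing the counter to the same initial value after every reset.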

modes), data are multiplexed 2:1 into 8- or 10-bit words and retimed on the TXUSRCLK clock, which in this case runs at double the rate of TXUSRCLK2. When the FPGA interface is configured for single-width operation, data are passed through without any processing and TXUSRCLK and TXUSRCLK2 coincide. A dedicated encoder can be activated for 8b10b-based protocols, while an elastic buffer (i.e. a first-in first-out memory) is included to cross the clock domain boundary from TXUSRCLK to XCLK reliably. In some applications, XCLK and TXUSRCLK are derived from the same clock; they therefore toggle at the same average frequency, with a constant phase difference. In this case, the elastic buffer can be bypassed and dedicated circuitry is used to ensure a safe transfer of data from the TXUSRCLK clock domain to XCLK. The PISO block serializes data and outputs them synchronously with TX\_HSCLK. It is worth mentioning that the PLL also produces another clock (TXOUTCLK), which can be routed to a clock buffer in the fabric and used as TXUSRCLK. Unfortunately, due to architectural constraints of the GTP, this signal cannot be used when the elastic buffer is not in use.

At the receiver side, the CDR extracts the receiver high-speed clock (RX\_HSCLK) from the stream and recovers the serial data. A dedicated prescaler divides RX\_HSCLK down to generate RXRECCLK, the recovered clock used to clock data out of the parallel output block and to operate the PCS. Since it is synchronous with the parallel data in the PCS, this clock can also be forwarded to the fabric and used to synchronize the logic processing the deserialized data. An interesting and very useful block is the "Comma Detector and Aligner", which can search for special symbols in the serial stream and automatically align the symbol boundary to them, sparing the designer from performing this operation in the fabric. The remaining blocks in the receiver's PCS mirror those of the transmitter: they perform elastic buffering toward the RXUSRCLK clock domain and data demultiplexing when needed (FPGA interface). The RXUSRCLK2 signal synchronizes the data from the FPGA interface into the fabric. For single-width operation modes, RXUSRCLK and RXUSRCLK2 are the same signal, while for double-width modes they are edge-aligned but RXUSRCLK2 toggles at half the frequency with respect to RXUSRCLK.


In the parallel sub-components of the SerDes, elastic buffers might induce variations of latency. Even if a buffer is written to and read from at the same frequency, after each reset or power cycle its latency depends on the difference between the internal write and read pointers. This difference in turn depends on the behaviour of the logic accessing the buffer, and it induces latency variations that are integer multiples of the period of the clock used for reading and writing. Dedicated logic has to be added so that, at each reset or power-up, the same number of words has been written before reading starts.
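The dependence of the buffer latency on the pointer offset can be illustrated with a toy FIFO model. This is a sketch under stated assumptions, not a description of any specific device: `simulate_elastic_buffer` and `read_start_cycle` (the cycle at which reading begins, i.e. the number of words already written at that moment) are hypothetical names, and writing and reading share one ideal clock of period `t_par`.

```python
from collections import deque


def simulate_elastic_buffer(n_words, read_start_cycle, t_par):
    """Return the latency of each word through a toy elastic buffer.

    One word is written per cycle; from read_start_cycle onwards one word
    is also read per cycle. Each stored word carries its own write cycle,
    so its latency is (read cycle - write cycle) * t_par.
    """
    fifo = deque()
    latencies = []
    for cycle in range(n_words + read_start_cycle):
        if cycle < n_words:
            fifo.append(cycle)                    # write side
        if cycle >= read_start_cycle and fifo:
            written_at = fifo.popleft()           # read side
            latencies.append((cycle - written_at) * t_par)
    return latencies


if __name__ == "__main__":
    # Two hypothetical resets that start reading at different fill levels.
    print(simulate_elastic_buffer(5, 2, 4))
    print(simulate_elastic_buffer(5, 3, 4))
```

Within one run every word sees the same latency, but two runs that start reading at different fill levels differ by an integer number of clock periods. Fixing `read_start_cycle` plays the role of the dedicated logic mentioned above.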

Therefore, the overall latency variation ΔL between two resets or power cycles of the device is as follows:

$$
\Delta L = n T_{\text{ser}} + m T_{\text{par}} \tag{1}
$$

where Tser is the high-speed serial clock period, Tpar is the parallel clock period, and n and m are integers whose ranges depend on the SerDes device and on how it is configured and operated.
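To get a feeling for Equation (1), the possible values of ΔL can be enumerated for given ranges of n and m. The sketch below is purely illustrative: the function name `latency_variations` is hypothetical, and the ranges used in the example (n from 0 to 9, m from 0 to 1, with Tpar = 10·Tser) are assumptions, not figures taken from any device.

```python
def latency_variations(t_ser, t_par, n_range, m_range):
    """Enumerate the distinct values of dL = n*Tser + m*Tpar (Equation 1)."""
    return sorted({n * t_ser + m * t_par for n in n_range for m in m_range})


if __name__ == "__main__":
    T_SER = 1             # serial clock period, arbitrary time units
    T_PAR = 10 * T_SER    # assumed 10-bit parallel word
    values = latency_variations(T_SER, T_PAR, range(10), range(2))
    print("distinct latencies:", len(values))
    print("peak-to-peak spread:", max(values) - min(values))
```

With these assumed ranges the latency can take 20 distinct values spanning 19 serial clock periods, which shows why both the serial and the parallel contribution have to be removed to obtain a fixed-latency link.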
