Parallel Scrambler Generator
Scramblers are used in many communication protocols, such as PCI Express, SAS/SATA, USB, and Bluetooth, to randomize the transmitted data. To keep this post short and focused, I won’t discuss the theory behind scramblers. For more information about scramblers see [1], or do some googling. The topic of this post is the parallel implementation of a scrambler generator. Protocol specifications define the scrambling algorithm using either hex or polynomial notation, which is not always suitable for an efficient hardware or software implementation. Please read my post on the Parallel CRC Generator for more on that.
The Parallel Scrambler Generator method that I’m going to describe has a lot in common with the Parallel CRC Generator. The difference is that a CRC generator outputs a CRC value, whereas a scrambler generator produces scrambled data. But the internal workings of both are based on the same principle.
Here is an example of a scrambler with the polynomial G(x) = x^16+x^5+x^4+x^3+1
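To relate the polynomial notation to the hex notation mentioned earlier, here is a minimal Python sketch. It assumes one common convention, in which the highest-order term gives the register width and the remaining terms form a bit mask; specifications differ on this, so treat the convention and the helper name `poly_to_mask` as assumptions made for illustration.

```python
def poly_to_mask(exponents):
    """Return (width, mask) for a polynomial given as a list of exponents.

    Assumed convention: the highest exponent is the register width M and is
    left out of the mask; every other term x^e sets bit e of the mask.
    """
    width = max(exponents)
    mask = 0
    for e in exponents:
        if e != width:
            mask |= 1 << e
    return width, mask


# G(x) = x^16 + x^5 + x^4 + x^3 + 1
width, mask = poly_to_mask([16, 5, 4, 3, 0])
print(width, hex(mask))  # 16 0x39
```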
Following is the description of the Parallel Scrambler Generator algorithm:
(1) Let’s denote N=data width, M=generator polynomial width.
(2) Implement a serial scrambler generator using the given polynomial or hex notation. It’s easy to do in any programming language or script: C, Java, Perl, Verilog, etc.
(3) The parallel scrambler implementation is a function of the N-bit data input as well as the M-bit current state of the polynomial, as shown in the above figure. We’re going to build three matrices:
- Mout (next state polynomial) as a function of Min (current state polynomial) when Nin=0,
- Nout as a function of Nin when Min=0, and
- Nout as a function of Min when Nin=0.
Note that the next state of the polynomial doesn’t depend on the data (Mout as a function of Nin is always zero), therefore we need only three matrices.
(4) Using the serial routine from (2), calculate the Mout values given Min, when Nin=0. Each Min value is one-hot encoded, that is, only one bit is set.
(5) Build an MxM matrix. Each row contains the results from (4) in increasing order. For example, the 1st row contains the result of input=0x1, the 2nd row is input=0x2, etc. The output is M bits wide, which is the polynomial width.
(6) Calculate the Nout values given Nin, when Min=0. Each Nin value is one-hot encoded, that is there is only one bit set.
(7) Build an NxN matrix. Each row contains the results from (6) in increasing order. The output is N bits wide, which is the data width.
(8) Calculate the Nout values given Min, when Nin=0. Each Min value is one-hot encoded, that is there is only one bit set.
(9) Build an MxN matrix. Each row contains the results from (8) in increasing order. The output is N bits wide, which is the data width.
(10) Now, build an equation for each Nout[i] bit: every Nin[j] and Min[k] whose bit is set in column [i] of the NxN and MxN matrices participates in the equation. The participating inputs are XORed together. The Mout[i] equations are built the same way from column [i] of the MxM matrix.
Nout is the parallel scrambled data.
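Steps (2) through (10) can be sketched in Python as follows. This is an illustration under stated assumptions, not the code behind the online tool: it assumes a Galois-style LFSR in which data bit 0 is processed first and XORed with the top LFSR bit, and all function names are hypothetical. It uses the example polynomial from this post with an 8-bit data width, and checks that the matrix-based result agrees with the serial routine.

```python
import random


def serial_step(state, taps, m):
    """One clock of the serial LFSR: returns (keystream bit, next state)."""
    fb = (state >> (m - 1)) & 1            # bit shifted out of the top
    state = (state << 1) & ((1 << m) - 1)
    if fb:
        state ^= taps                      # taps holds every term below x^m
    return fb, state


def scramble_word(n_in, m_in, taps, m, n):
    """Step (2): run the serial routine n times; returns (Nout, Mout)."""
    state, n_out = m_in, 0
    for i in range(n):
        key, state = serial_step(state, taps, m)
        n_out |= (((n_in >> i) & 1) ^ key) << i
    return n_out, state


def build_matrices(taps, m, n):
    """Steps (4)-(9): probe the serial routine with one-hot inputs."""
    mm = [scramble_word(0, 1 << k, taps, m, n)[1] for k in range(m)]  # MxM
    nn = [scramble_word(1 << j, 0, taps, m, n)[0] for j in range(n)]  # NxN
    mn = [scramble_word(0, 1 << k, taps, m, n)[0] for k in range(m)]  # MxN
    return mm, nn, mn


def parallel_word(n_in, m_in, mats):
    """Step (10): XOR the rows selected by the set input bits.

    XORing whole rows is the same as evaluating the per-column equations
    for every output bit at once.
    """
    mm, nn, mn = mats
    n_out = m_out = 0
    for j, row in enumerate(nn):
        if (n_in >> j) & 1:
            n_out ^= row
    for k in range(len(mm)):
        if (m_in >> k) & 1:
            n_out ^= mn[k]
            m_out ^= mm[k]
    return n_out, m_out


M, N, TAPS = 16, 8, 0x39   # G(x) = x^16 + x^5 + x^4 + x^3 + 1
mats = build_matrices(TAPS, M, N)
rng = random.Random(0)
for _ in range(1000):
    d, s = rng.getrandbits(N), rng.getrandbits(M)
    assert parallel_word(d, s, mats) == scramble_word(d, s, TAPS, M, N)
```

Under these assumptions the NxN matrix comes out as the identity, because each data bit only passes through a single XOR with the keystream.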
Keep me posted if the Parallel Scrambler Generation tool works for you, or you need more clarifications on the algorithm.
References:
Hi Ravikumar,
I cannot reproduce the problem. Are you saying that the code cannot be generated?
Thanks,
Evgeni
Evgeni,
Is there an easy way to reverse the order the scrambler generates data? I’m messing with Rapid IO and its scrambled IDLE2 sequence and while the generator polynomial is correct, the output order is reversed from my reference model. For example, if I expect 1, 2, 3, 4 the parallel scrambler generated here outputs 4, 3, 2, 1. So I get the right sequence, but it’s backward.
Also thanks for the site! These tools have been very helpful.
Hi Mike,
Perhaps it’s something to do with the ordering of input data. Something like bytes or words are swapped.
Thanks,
Evgeni
Hi Evgeni,
Thanks for this wonderful work.
I am using your website to understand the “Parallel Scrambler Generator”, but I am not able to find any information on how the “data_c” assignments are decided.
I am using the scrambler module for data[7:0], lfsr[22:0]=1+x^2+x^5+x^8+x^16+x^21+x^23;
But I am not able to understand this part:
data_c[0] = data_in[0] ^ lfsr_q[22];
data_c[1] = data_in[1] ^ lfsr_q[21];
data_c[2] = data_in[2] ^ lfsr_q[20] ^ lfsr_q[22];
data_c[3] = data_in[3] ^ lfsr_q[19] ^ lfsr_q[21];
data_c[4] = data_in[4] ^ lfsr_q[18] ^ lfsr_q[20] ^ lfsr_q[22];
data_c[5] = data_in[5] ^ lfsr_q[17] ^ lfsr_q[19] ^ lfsr_q[21];
data_c[6] = data_in[6] ^ lfsr_q[16] ^ lfsr_q[18] ^ lfsr_q[20] ^ lfsr_q[22];
data_c[7] = data_in[7] ^ lfsr_q[15] ^ lfsr_q[17] ^ lfsr_q[19] ^ lfsr_q[21] ^ lfsr_q[22];
Can you please help me understand this part?
What is the logic behind it, and how does that logic work?
Thanks
Anurag
Hi Anurag,
I wrote an article on how the algorithm works to generate a CRC. A very similar approach is used for parallel scrambler generation as well.
Here is the link: http://outputlogic.com/my-stuff/circuit-cellar-january-2010-crc.pdf
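As an illustration of where such data_c equations can come from, here is a minimal Python sketch for data[7:0] and lfsr[22:0]=1+x^2+x^5+x^8+x^16+x^21+x^23. It assumes a Galois-style LFSR where data bit 0 is processed first and XORed with lfsr_q[22]; that convention is inferred from the equations quoted above rather than taken from the tool itself, and the function names are hypothetical. Each lfsr_q[k] is probed with a one-hot state, and the keystream bits it influences are collected into the equation for the matching data_c[i].

```python
M, N = 23, 8
# Terms below x^23; the x^23 term is implied by the register width.
TAPS = sum(1 << e for e in (0, 2, 5, 8, 16, 21))


def keystream_bits(state):
    """Return the N keystream bits produced from an M-bit start state."""
    bits = 0
    for i in range(N):
        fb = (state >> (M - 1)) & 1
        state = (state << 1) & ((1 << M) - 1)
        if fb:
            state ^= TAPS
        bits |= fb << i
    return bits


def data_c_equations():
    """Build one XOR equation per output bit from one-hot probes."""
    rows = [keystream_bits(1 << k) for k in range(M)]  # row k: effect of lfsr_q[k]
    eqs = []
    for i in range(N):
        terms = [f"data_in[{i}]"]
        terms += [f"lfsr_q[{k}]" for k in range(M) if (rows[k] >> i) & 1]
        eqs.append(f"data_c[{i}] = " + " ^ ".join(terms) + ";")
    return eqs


for line in data_c_equations():
    print(line)
```

Under the stated assumptions the printed lines match the eight data_c assignments quoted in the question.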
Thanks,
Evgeni
Hi,
I am writing code for a self-synchronous scrambler with the polynomial 1+x^39+x^58.
The data width is 64 and the polynomial width is 58. When I give all 0’s as the scrambler input, the scrambler output is also all 64-bit 0’s.
My doubt is this: per the spec, the scrambler is used to maintain DC balance (long sequences of 1’s or 0’s will be scrambled). When I give all 1’s, it produces scrambled data with some 0’s added to the output,
but when I give all 0’s, it does not add any 1’s to the output.
So can you explain why the output is not affected for an input of all 0’s?
Thanks,
Shravan Kumar
Hi Shravan,
Scrambling is essentially an XOR operation on the data input and current scrambler state. If your input data is all-zero, and scrambler is initialized with zero, then the output is also zero. You need to initialize the scrambler, usually with all-Fs.
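A small Python sketch of this effect, assuming a Galois-style serial scrambler with the x^16+x^5+x^4+x^3+1 polynomial from the post and hypothetical function names: with a zero seed the state never changes, so zero input produces zero output, while a non-zero seed produces a non-zero keystream.

```python
def scramble_byte(data, state, taps=0x39, m=16):
    """Scramble one byte, bit 0 first; returns (output byte, next state)."""
    out = 0
    for i in range(8):
        fb = (state >> (m - 1)) & 1
        state = (state << 1) & ((1 << m) - 1)
        if fb:
            state ^= taps
        out |= (((data >> i) & 1) ^ fb) << i
    return out, state


def scramble_stream(data, seed):
    """Scramble a list of bytes starting from the given seed."""
    state, result = seed, []
    for byte in data:
        out, state = scramble_byte(byte, state)
        result.append(out)
    return result


zeros = [0x00] * 4
print(scramble_stream(zeros, 0x0000))  # [0, 0, 0, 0]: a zero state never changes
print(scramble_stream(zeros, 0xFFFF))  # non-zero bytes: the keystream itself
```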
Thanks,
Evgeni
Hi Evgeni,
As per the IEEE 802.3 standard, the PCS uses a self-synchronizing scrambler, and “There is no requirement on the initial value for the scrambler. The scrambler is run continuously on all payload bits”. Under this condition, if I assume the initial state of the scrambler is zero, then an input of all 0’s generates an all-0’s scrambler output, and that violates the scrambler functionality. Please guide me to understand the scrambler thoroughly.
Regards,
Deepak
Hi,
I am implementing a SATA scrambler and am confused about the seed value. The spec says it is FFFF, but the example given at the end, which uses the parallel scrambling method, says f0f6.
Does the parallel scrambler require a different seed value? Another thing I noticed is that the serial method’s output doesn’t match the parallel one. I’m not sure which one to use. Please give some pointers.
@Deepak Ameta
This tool doesn’t work for self-synchronizing scramblers.
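For contrast, here is a minimal serial sketch of a self-synchronizing scrambler and descrambler for 1+x^39+x^58, written in Python with hypothetical function names. It is not output of the tool; it only illustrates the structural difference: the state is made of previously transmitted bits, so the next state depends on the data, unlike the scrambler described in the post.

```python
import random

MASK58 = (1 << 58) - 1


def ss_scramble(bits, state=0):
    """Self-synchronizing scrambler: the state holds the last 58 output bits."""
    out = []
    for b in bits:
        s = b ^ ((state >> 38) & 1) ^ ((state >> 57) & 1)  # delays 39 and 58
        state = ((state << 1) | s) & MASK58
        out.append(s)
    return out


def ss_descramble(bits, state=0):
    """Descrambler: the state holds the last 58 received bits."""
    out = []
    for s in bits:
        out.append(s ^ ((state >> 38) & 1) ^ ((state >> 57) & 1))
        state = ((state << 1) | s) & MASK58
    return out


rng = random.Random(1)
payload = [rng.getrandbits(1) for _ in range(200)]
tx = ss_scramble(payload, state=rng.getrandbits(58))
rx = ss_descramble(tx, state=0)      # receiver starts from a different state
assert rx[58:] == payload[58:]       # in sync once 58 bits have been received
```

Because the receiver state is rebuilt from the received bits, no agreed initial value is needed. In this sketch, an all-zero state combined with all-zero input still produces all-zero output.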
Hi Evgeni,
Could you please add an example to clarify the above algorithm?
I wanted to get some more clarification on steps 4 – 9.
I tried generating the lfsr using the following info
// data[15:0], lfsr[15:0]=1+x^3+x^4+x^5+x^16;
I am able to match the lfsr next state equations to the ones generated
using your online generator. However, the data_c[15:0] equations are not
matching for me.
Thanks
Hi Evgeni,
Thanks for putting up a very useful website. Can you please provide a descrambler for the PCIe 8 GT/s spec? It uses a polynomial that is different from the one used at the 2.5 and 5 GT/s rates. I have tried to generate the descrambler by defining the polynomial with your website’s “user-defined” method, but in simulation I found that the descrambler is not working correctly.
Hi Evgeni,
The scrambler works fine. I hadn’t given reset and enable signals correctly to it. When I fixed that it works fine. Thanks a lot for this very useful website.
Has anyone tried to design a pipelined scrambler?
Fantastic! Saved me a bunch of time from writing Perl or doing it manually. Thank you!!
You’re welcome!
Hi Evgeni,
I have looked at the code for your stand alone crc-gen program. Do you have a stand alone for the scrambler? I am trying to understand how to generate the MxN scrambler matrix. Would it be possible to share your algorithm or take a peek at that code?
Thanks
Hi Ron,
The algorithm for scrambler is similar to the one for CRC.
No, I don’t have scrambler code that is publicly available.
Thanks,
Evgeni
Will the scrambler code work for descrambling data as well?
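For an additive scrambler of the kind described in the post, where the output is the data XORed with an LFSR keystream, applying the same operation twice with the same seed returns the original data. The Python sketch below illustrates this, assuming the x^16+x^5+x^4+x^3+1 polynomial, an all-Fs seed, and a hypothetical function name. It does not apply to self-synchronizing scramblers, and it assumes both sides reset and advance their LFSRs in step.

```python
def scramble(data, seed=0xFFFF, taps=0x39, m=16):
    """XOR each byte with 8 keystream bits from a Galois-style LFSR."""
    state, out = seed, []
    for byte in data:
        key = 0
        for i in range(8):
            fb = (state >> (m - 1)) & 1
            state = (state << 1) & ((1 << m) - 1)
            if fb:
                state ^= taps
            key |= fb << i
        out.append(byte ^ key)
    return out


payload = [0x00, 0x11, 0xA5, 0xFF]
scrambled = scramble(payload)
assert scrambled != payload
assert scramble(scrambled) == payload   # same routine, same seed, data restored
```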
I’m trying to generate a parallel scrambler from your site. No matter what I put in for the polynomial in step 2, hitting the generate verilog or generate vhdl button produces nothing. Is this page broken or not yet implemented?
Thanks…
Hi Tony,
Just tried it – the tool works fine on Chrome and IE.
Please send me a screenshot of the parameters you put in. Another possibility is that browser blocks the results of “generate” button because the site doesn’t have SSL certificate. You might want to check developer’s window in the browser for security-related errors.
Thanks,
Evgeni
Do you have a self-synchronizing scrambler and descrambler for the polynomial 1+x^39+x^58
and a width of 26?
Do you have a scrambler and descrambler for lfsr[7:0]=1+x^43?
Do you have a scrambler and descrambler for G(x)=1+x^43? The input is 8 bits wide.
Can an 8-bit input self-synchronizing scrambler and descrambler be generated? The scrambling polynomial is g(x)=x^43+1.
Hi,
I am working on a PCIe Gen 5.0 scrambler and descrambler.
Can you please send me the link for the descrambler? I want Verilog code
for data lengths 8, 16, 32, and 64.
Your parallel scrambler is based on a Galois LFSR, right? What do you do if you have a Fibonacci LFSR?
Hello Evgeni,
I am trying to learn how someone has coded a scrambler polynomial for the Ethernet protocol. The I/O data is 9 bits wide, G(x)=1+x^4+x^15;
always_comb
begin
    // Here State14 is sent out first.
    S_w[14] = S_r[5];
    S_w[13] = S_r[4];
    S_w[12] = S_r[3];
    S_w[11] = S_r[2];
    S_w[10] = S_r[1];
    S_w[9] = S_r[0];
    S_w[8] = (use_master_poly_w == 1'b1) ? (S_r[14] ^ S_r[3]) : (S_r[14] ^ S_r[10]);
    S_w[7] = (use_master_poly_w == 1'b1) ? (S_r[13] ^ S_r[2]) : (S_r[13] ^ S_r[9]);
    S_w[6] = (use_master_poly_w == 1'b1) ? (S_r[12] ^ S_r[1]) : (S_r[12] ^ S_r[8]);
    S_w[5] = (use_master_poly_w == 1'b1) ? (S_r[11] ^ S_r[0]) : (S_r[11] ^ S_r[7]);
    S_w[4] = (use_master_poly_w == 1'b1) ? (S_r[10] ^ S_r[14] ^ S_r[3]) : (S_r[10] ^ S_r[6]);
    S_w[3] = (use_master_poly_w == 1'b1) ? (S_r[9 ] ^ S_r[13] ^ S_r[2]) : (S_r[9 ] ^ S_r[5]);
    S_w[2] = (use_master_poly_w == 1'b1) ? (S_r[8 ] ^ S_r[12] ^ S_r[1]) : (S_r[8 ] ^ S_r[4]);
    S_w[1] = (use_master_poly_w == 1'b1) ? (S_r[7 ] ^ S_r[11] ^ S_r[0]) : (S_r[7 ] ^ S_r[3]);
    S_w[0] = (use_master_poly_w == 1'b1) ? (S_r[6 ] ^ S_r[10] ^ S_r[14] ^ S_r[3]) : (S_r[6 ] ^ S_r[2]);
end
But when I generate the Verilog code using your tool, the equations are different. Can you please tell me what I am missing here?