
Parallel Scrambler Generator


Scramblers are used in many communication protocols, such as PCI Express, SAS/SATA, USB, and Bluetooth, to randomize the transmitted data. To keep this post short and focused, I’ll not discuss the theory behind scramblers. For more information about scramblers see [1], or do some googling. The topic of this post is the parallel implementation of a scrambler generator. Protocol specifications define the scrambling algorithm using either hex or polynomial notation, which is not always suitable for efficient hardware or software implementation. Please read my post on the Parallel CRC Generator about that.

The Parallel Scrambler Generator method that I’m going to describe has a lot in common with the Parallel CRC Generator. The difference is that the CRC generator outputs a CRC value, whereas the scrambler generator produces scrambled data. But the internal workings of both are based on the same principle.

Here is an example of a scrambler with the polynomial G(x) = x^16 + x^5 + x^4 + x^3 + 1:

[Figure: scrambler1]
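
The serial form of this scrambler can be sketched in a few lines of Python. This is a minimal sketch rather than code from any protocol specification: it assumes a Galois-style LFSR in which the most significant state bit is both the feedback and the bit XORed with the data, and the names `POLY_WIDTH`, `TAPS`, `serial_scramble_bit` and `scramble_bits` are made up for illustration.

```python
POLY_WIDTH = 16   # M, the width of the LFSR state
TAPS = 0x0039     # x^5 + x^4 + x^3 + 1; the x^16 term is the feedback itself


def serial_scramble_bit(state, data_bit):
    """Advance the LFSR by one clock and scramble one data bit.

    Assumption: Galois form, feedback taken from the top state bit.
    """
    feedback = (state >> (POLY_WIDTH - 1)) & 1
    out_bit = data_bit ^ feedback
    state = (state << 1) & ((1 << POLY_WIDTH) - 1)
    if feedback:
        state ^= TAPS
    return state, out_bit


def scramble_bits(state, bits):
    """Scramble a list of bits serially, returning (final_state, scrambled_bits)."""
    out = []
    for bit in bits:
        state, out_bit = serial_scramble_bit(state, bit)
        out.append(out_bit)
    return state, out
```

Because the state never depends on the data in this model, running the scrambled bits through `scramble_bits` again with the same starting state returns the original bits.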

 

Following is the description of the Parallel Scrambler Generator algorithm:

(1) Let’s denote N = data width, M = generator polynomial width.

[Figure: scrambler21]

(2) Implement a serial scrambler generator using the given polynomial or hex notation. It’s easy to do in any programming language or script: C, Java, Perl, Verilog, etc.

(3) The parallel scrambler implementation is a function of the N-bit data input as well as the M-bit current state of the polynomial, as shown in the above figure. We’re going to build three matrices:

  • Mout (next state of the polynomial) as a function of Min (current state of the polynomial) when Nin=0,
  • Nout as a function of Nin when Min=0, and
  • Nout as a function of Min when Nin=0.

      Note that the next state of the polynomial doesn’t depend on the input data (there is no Mout as a function of Nin), therefore we need only three matrices.

 

(4) Using the routine from (2), calculate the Mout values given Min, when Nin=0. Each Min value is one-hot encoded, that is, only one bit is set.

(5) Build an MxM matrix. Each row contains the results from (4) in increasing order. For example, the 1st row contains the result for input=0x1, the 2nd row for input=0x2, etc. The output is M bits wide, which is the polynomial width.

(6) Calculate the Nout values given Nin, when Min=0. Each Nin value is one-hot encoded, that is, only one bit is set.

(7) Build an NxN matrix. Each row contains the results from (6) in increasing order. The output is N bits wide, which is the data width.

(8) Calculate the Nout values given Min, when Nin=0. Each Min value is one-hot encoded, that is, only one bit is set.

(9) Build an MxN matrix. Each row contains the results from (8) in increasing order. The output is N bits wide, which is the data width.

(10) Now, build an equation for each Nout[i] bit: all Nin[j] and Min[k] bits that are set in column [i] of the NxN and MxN matrices participate in the equation. The participating inputs are XORed together. The Mout[i] equations are built the same way from column [i] of the MxM matrix.

 

Nout is the parallel scrambled data.
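
The steps above can be sketched in Python as follows. This is a hedged illustration, not the code behind the online tool: it assumes a Galois-style serial scrambler with feedback from the top state bit, assumes that bit 0 of the data word is scrambled first, and uses made-up names (`serial_step`, `build_matrices`, `equations`, `parallel_word`). Each matrix is stored as a list of integers, one integer per row.

```python
def serial_step(state, bit, taps, m):
    """Step (2): one clock of the serial scrambler (assumed Galois form)."""
    feedback = (state >> (m - 1)) & 1
    state = (state << 1) & ((1 << m) - 1)
    if feedback:
        state ^= taps
    return state, bit ^ feedback


def serial_word(state, data, taps, m, n):
    """Scramble an n-bit word serially, bit 0 first (an assumed bit order)."""
    out = 0
    for i in range(n):
        state, out_bit = serial_step(state, (data >> i) & 1, taps, m)
        out |= out_bit << i
    return state, out


def build_matrices(taps, m, n):
    """Steps (4)-(9): each row is the response to a single one-hot input."""
    mout_from_min = [serial_word(1 << k, 0, taps, m, n)[0] for k in range(m)]  # MxM
    nout_from_nin = [serial_word(0, 1 << j, taps, m, n)[1] for j in range(n)]  # NxN
    nout_from_min = [serial_word(1 << k, 0, taps, m, n)[1] for k in range(m)]  # MxN
    return mout_from_min, nout_from_nin, nout_from_min


def column_terms(rows, name, i):
    """Inputs whose row has bit i set, i.e. the set bits in column i."""
    return [f"{name}[{r}]" for r, row in enumerate(rows) if (row >> i) & 1]


def equations(taps, m, n):
    """Step (10): XOR equations for every Nout bit, then every Mout bit."""
    mout_from_min, nout_from_nin, nout_from_min = build_matrices(taps, m, n)
    eqs = []
    for i in range(n):
        terms = column_terms(nout_from_nin, "Nin", i) + column_terms(nout_from_min, "Min", i)
        eqs.append(f"Nout[{i}] = " + (" ^ ".join(terms) or "0"))
    for i in range(m):
        terms = column_terms(mout_from_min, "Min", i)
        eqs.append(f"Mout[{i}] = " + (" ^ ".join(terms) or "0"))
    return eqs


def parallel_word(state, data, matrices, m, n):
    """Evaluate the parallel scrambler: XOR the rows selected by the set input bits."""
    mout_from_min, nout_from_nin, nout_from_min = matrices
    next_state, out = 0, 0
    for k in range(m):
        if (state >> k) & 1:
            next_state ^= mout_from_min[k]
            out ^= nout_from_min[k]
    for j in range(n):
        if (data >> j) & 1:
            out ^= nout_from_nin[j]
    return next_state, out
```

For example, `equations(0x0039, 16, 8)` lists the equations for the 16-bit polynomial above with an 8-bit data path, and `parallel_word` can be compared against `serial_word` to check that one parallel step matches N serial steps. Storing rows as integers keeps the column lookup a simple bit test.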

 

Keep me posted if the Parallel Scrambler Generation tool works for you, or you need more clarifications on the algorithm.


References:

  1. Scramblers on Wiki




  1. PT
    May 20th, 2009 at 10:37 | #1

    What a cool site, thank you for the good job! Because English is not my native language, I have to spend more time understanding the Parallel Scrambler Generator algorithm in this article. I think if you add a simple demo [e.g. N = 4, M = 4 and a simple generator polynomial] of these steps, people could understand this algorithm better. Thank you.

  2. May 21st, 2009 at 00:35 | #2

    @PT

    Thanks for the positive feedback. I’ll add a simple example.

  3. Chico
    May 25th, 2009 at 09:29 | #3

    Very nice site and useful information and generators.

    For me, it is missing only one thing to be excellent: the option of generating VHDL instead of Verilog.

    Thanks for the tools

    Chico

  4. May 26th, 2009 at 14:35 | #4

    @Chico
    Hi Chico,
    Thanks for the positive feedback. I’ve added VHDL support.

  5. Tayyab Ahmed
    July 27th, 2009 at 04:02 | #5

    I have produced the scrambler for polynomial mentioned in 802.3-2005_section4(10Gbps PCS layer) for 64-bit datapath i.e. datawidth=64 and polynomialwidth=58 using the polynomial 1+x^39+x^58. The polynomial has been picked from the IEEE standard mentioned above.

    The results of the scrambler are not as expected, also the equations in Verilog looked strange and there is no shifting of the scrambled register.

  6. July 27th, 2009 at 12:57 | #6

    Dear Ahmed,

    I did the following experiment. I generated two scramblers:

    (1) data width=64, poly width=58, G(x) = 1 + x^39 + x^58 (1 and x^39 boxes checked)

    (2) data width=1, poly width=58, G(x) = 1 + x^39 + x^58 (1 and x^39 boxes checked).
    This is a serial scrambler that should produce the same output after 64 clocks as (1), given the same input.

    I simulated both with different inputs and haven’t found any problems – the results always matched.

    Let me know if you still experience problems,

    OutputLogic

  7. Tayyab Ahmed
    July 28th, 2009 at 09:34 | #7

    Evgeni,

    Yes, that’s true, but the XOR equations do not conform to the algorithm described in the paper below, which is followed in most implementations.

    Parallel Scrambler for High-Speed Applications
    Chih-Hsien Lin, Chih-Ning Chen, You-Jiun Wang, Ju-Yuan Hsiao, and Shyh-Jye Jou

    Could you provide a description of the equations in the Verilog HDL?

    Regards,
    Ahmed

  8. Evgeni
    July 28th, 2009 at 11:51 | #8

    Ahmed,

    Thanks for the pointer.
    I don’t claim that my method of generating a parallel scrambler is the most efficient one in terms of speed or logic utilization. But I’ll read the paper and do the comparison.
    The parallel scrambler generator uses the same approach as the parallel CRC generator. It’s described here: http://outputlogic.com/my-stuff/parallel_crc_generator_whitepaper.pdf

    I’d need to answer another question: since a given scrambler is described by a system of linear equations, how come there are multiple solutions to it (that is, multiple ways to generate the same parallel scrambler)?

  9. totolapp
    July 29th, 2009 at 03:21 | #9

    hello,
    Excellent site, useful information and scrambler generator.

    What about a descrambler?
    Did you make a descrambler generator?

    regards
    jean

  10. July 29th, 2009 at 12:07 | #10

    Jean, thanks for the comment.

    Typically, the scrambler and descrambler implementations are the same. That is, if you feed scrambled data to the input of the scrambler/descrambler, you get the original data on its output.
    Why is that so? Put simply, two XOR operations with the same sequence (scrambler, then descrambler) cancel each other: I^S^S = I.
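
The cancellation can be checked with a short Python sketch. It assumes an additive scrambler whose key stream S depends only on the LFSR seed; the 16-bit polynomial, the all-ones seed and the function names `keystream` and `scramble` are illustrative choices, not part of the generated code.

```python
def keystream(seed, taps, m, count):
    """Bits produced by a free-running Galois LFSR (illustrative model)."""
    state, bits = seed, []
    for _ in range(count):
        feedback = (state >> (m - 1)) & 1
        bits.append(feedback)
        state = (state << 1) & ((1 << m) - 1)
        if feedback:
            state ^= taps
    return bits


def scramble(data_bits, seed=0xFFFF, taps=0x0039, m=16):
    """XOR the data with the key stream S; the same call also descrambles."""
    stream = keystream(seed, taps, m, len(data_bits))
    return [d ^ s for d, s in zip(data_bits, stream)]


original = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0]
scrambled = scramble(original)           # I ^ S
assert scramble(scrambled) == original   # I ^ S ^ S == I
```

This only holds when both sides start from the same seed and stay aligned, which is why such scramblers are reset at a known point in the data stream.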

  11. Tayyab Ahmed
    July 30th, 2009 at 07:38 | #11

    Evgeni,

    Thanks for the description of parallel scrambler generator.

    I have also compared your generated scrambler with the Xilinx 10G PCS reference design scrambler, and the results do not match. You can also look at xapp775 and compare.

    Regards,
    Ahmed

  12. Evgeni
    July 30th, 2009 at 21:43 | #12

    Ahmed,

    I looked at the scramble.v file in xapp775. The scrambler implemented there is different from what my tool generates.
    Let me explain why.


      line 95: assign `TPDA Scr_wire[6] = TXD_input[6]^Scrambler_Register[32]^Scrambler_Register[51];
      line 226: Scrambler_Register[57] <= `TPDB Scr_wire[6];
      line 221: TXD_Scr[65:0] <= `TPDB {Scr_wire[63:0], Sync_header[1:0]};

    That is, the next state of the scrambler LFSR is a function of both the current state of the scrambler LFSR and the input data.

    What my tool generates is a scrambler in which the next state of the LFSR is a function of the current state only.

    Both approaches use the same polynomial (G = 1 + x^39 + x^58), but the results are different.

    Unfortunately, different protocols use different scrambler approaches given the same polynomial.
    Even within the 802.3-2005_section4 spec, the WIS scrambler in section 50.3.3 has a scrambler LFSR that is independent of the input data, whereas the 66/64 bit scrambler in section 49.2.6 has a scrambler LFSR that depends on the input data.

    This is an important difference and I'll need to provide an option to generate scrambler code either way.
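
The data-dependent variant can be illustrated with a small Python model. This is only a sketch under assumptions: it models a serial self-synchronous scrambler for G = 1 + x^39 + x^58 in which the shift register holds the previous 58 scrambled bits. The function names and the bit ordering are hypothetical and are not taken from xapp775 or from the generated code.

```python
TAPS = (39, 58)   # delays of the two feedback terms in 1 + x^39 + x^58
WIDTH = max(TAPS)
MASK = (1 << WIDTH) - 1


def tap_xor(state):
    """XOR of the register bits that are 39 and 58 positions old."""
    value = 0
    for delay in TAPS:
        value ^= (state >> (delay - 1)) & 1
    return value


def self_sync_scramble(bits, state=0):
    """The register shifts in the scrambled output, so it depends on the data."""
    out = []
    for data_bit in bits:
        scrambled = data_bit ^ tap_xor(state)
        out.append(scrambled)
        state = ((state << 1) | scrambled) & MASK
    return out


def self_sync_descramble(bits, state=0):
    """The register shifts in the received (still scrambled) bits."""
    out = []
    for scrambled in bits:
        out.append(scrambled ^ tap_xor(state))
        state = ((state << 1) | scrambled) & MASK
    return out


data = [(i * i + i // 3) % 2 for i in range(200)]
line = self_sync_scramble(data, state=MASK)
assert self_sync_descramble(line, state=MASK) == data
# With a mismatched initial state the descrambler recovers after 58 bits.
assert self_sync_descramble(line, state=0)[WIDTH:] == data[WIDTH:]
```

In this model the descrambler is not the same circuit as the scrambler, because its register is fed from the received bits, which is one way to see why the two approaches give different results for the same polynomial.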

  13. Tayyab Ahmed
    July 31st, 2009 at 08:25 | #13

    Evgeni,

    Thanks for the analysis.

    What I understand is that your scrambler generator generates code for a “frame-synchronous scrambler”, while there is another type of scrambler called a “self-synchronous scrambler”. That is, in 802.3-2005_section4, 50.3.3 refers to the frame-synchronous scrambler while 49.2.6 refers to the self-synchronous scrambler.

    Regards,
    Ahmed.

  15. Jacques
    September 8th, 2009 at 08:53 | #15

    Evgeni,

    Thanks for the description of parallel scrambler generator.

    Your scrambler generator is generating the code for “frame-synchronous scrambler” while there is another type of scrambler which is called “self synchronous scrambler”.

    Did you provide an option to generate scrambler code either way ?
    And to generate descrambler code for “self synchronous descrambler” ?

    Regards,
    Jacques.

  16. September 8th, 2009 at 09:05 | #16

    Hi Jacques,

    Not yet. I was asked the exact same question by a couple of other users. I’ll provide that option sometime later.

    Thanks,
    Evgeni

  17. otkim
    October 20th, 2009 at 10:34 | #17

    @Evgeni
    How do I generate a parallel descrambler?
    You said in a reply above that their implementations are the same. Does it mean the two circuits are identical?
    If I’ve misunderstood, let me know how to make it.

    Regards,
    otkim

  18. October 20th, 2009 at 21:00 | #18

    Hi otkim,

    Yes, descrambler circuit is typically the same as the scrambler.

    Thanks,
    Evgeni

  19. otkim
    October 22nd, 2009 at 04:37 | #19

    I am implementing an SDI interface in an FPGA.
    Now I am trying to make a frame synchronizer (word framing circuit).
    In the SD-SDI interface, the frame sync pattern is a 30-bit word {0x3ff, 0x000, 0x000} in the unscrambled data, so an SDI receiver has to find the sequence after the descrambler.
    For the same purpose, is it possible to find the scrambled sequence of the sync pattern before the descrambler?

    Regards,
    otkim

  20. October 22nd, 2009 at 12:02 | #20

    Hi otkim,

    Can you point me to the SD-SDI specification – I’ll take a look.

    Thanks,
    Evgeni

  21. hongwei guo
    December 29th, 2009 at 08:19 | #21

    good job!!Many thanks!!you give me more time in my job!!

  22. March 3rd, 2010 at 10:37 | #22

    I can see that it has been a while since anyone has replied or posted on this site.
    So is it still monitored, and is it still possible to ask a question? I have one regarding the scrambler generator.

    But I won’t start explaining my problem, since it will take time, and it would be a waste if no one will answer it anyway.

    Regards Vinther.

  23. March 3rd, 2010 at 12:37 | #23

    Hi Vinther,

    I’ll try to answer your question.

    Thanks,
    Evgeni

  24. March 4th, 2010 at 03:15 | #24

    OK, this is a nice site and a lot of help to people like me who can’t just sit down and calculate this outright. However, I need a 64-bit-wide x^7+x^6+1 scrambler, and I tried your generator, but it doesn’t work.

    So I tried your generator with a 1-bit-wide bus to see if I could make heads or tails of how it works.

    Your generator produces this for an x^7+x^6+1 scrambler with a 1-bit-wide bus:

    "
    lfsr_c(0) <= lfsr_q(6);
    lfsr_c(1) <= lfsr_q(0);
    lfsr_c(2) <= lfsr_q(1);
    lfsr_c(3) <= lfsr_q(2);
    lfsr_c(4) <= lfsr_q(3);
    lfsr_c(5) <= lfsr_q(4);
    lfsr_c(6) <= lfsr_q(5) xor lfsr_q(6);

    data_c(0) <= data_in(0) xor lfsr_q(6);
    "

    But what i need it to generate is this ( also represented in 1 bit wide bus)

    "
    lfsr_c(0) <= lfsr_q(5) xor lfsr_q(6);
    lfsr_c(1) <= lfsr_q(0);
    lfsr_c(2) <= lfsr_q(1);
    lfsr_c(3) <= lfsr_q(2);
    lfsr_c(4) <= lfsr_q(3);
    lfsr_c(5) <= lfsr_q(4);
    lfsr_c(6) <= lfsr_q(5);

    data_c(0) <= data_in(0) xor lfsr_q(6);
    "

    And according to wiki this must then be x^-7+x^-6+1, if you look under the Additive (synchronous) scramblers section.

    But I can’t get your generator to generate this. Is this something you can help me with? And do you know what the difference between the two types is? I know the above one works because I’ve implemented it already; however, it runs 64 times slower than what I need.

    Thanks in advance

    /vinther

  25. March 4th, 2010 at 06:55 | #25

    Hi Vinther,

    I’m generating the scrambler LFSR in Galois notation, which is more popular. You’re asking to generate it in Fibonacci notation. There is a procedure for converting the feedback taps between the two to produce an equivalent implementation. Please try G(x) = x^7+x^1+1.

    Wiki entry on LFSRs has some information about this.

    Thanks,
    Evgeni

  26. March 4th, 2010 at 07:36 | #26

    @Evgeni
    Thank you, I can see that. However, I have to start with another init value, in this case “1111110”, to get the right result. Now I just have to see if that works with a 64-bit-wide bus. Otherwise I have to use my backup plan and keep all the values in memory and XOR them with the data.

    Thank you for your help

    /Vinther

  27. March 4th, 2010 at 11:01 | #27

    Well it worked.

    Thanks a lot for your help

  28. Ashok
    August 2nd, 2010 at 00:12 | #28

    Why don’t you use variables and generate statement… It will be easier to read and very few lines…

  29. August 2nd, 2010 at 12:41 | #29

    Hi Ashok,

    There are several ways to generate the code. I just picked this one as the most “basic”. As for generate statements, they are specific to Verilog-2001, and many developers are still using ’95.
    I guess having different options for the code generation will be a nice feature.

    Also, using generate will result in less efficient code, at least for FPGA tools. I discuss it a bit in my Circuit Cellar article

    Thanks,
    Evgeni

  30. Jordy
    August 16th, 2010 at 04:21 | #30

    Hi Evgeni,
    in your generated code there is "data_out <= scram_en ? data_c : data_out;"
    I think it should be "data_out <= scram_en ? data_c : data_in;"
    In your code, if scram_en == 0, then data_out <= data_out;
    this code can never work, data_out will always be the reset value…
    thank you

  31. Jordy
    August 16th, 2010 at 04:43 | #31

    @Jordy
    I’m sorry for my mistake, 🙂
    you are right, it means hold the value for next scram_en valid.
    thank you

  32. kernel
    October 19th, 2010 at 04:27 | #32

    Hi Evgeni,

    Thanks for a very nice tool for generating scramblers.
    Are you planning to add option to generate ‘self synchronous scrambler’?

  33. October 19th, 2010 at 07:26 | #33

    Hi,

    At some point I will add the ’self synchronous scrambler’, since there are quite a few requests from the users.

    Thanks,
    Evgeni

  34. Chaitanya
    October 20th, 2010 at 00:23 | #34

    Hi Evgeni,

    Thanks for a nice tool and providing us deep insight about working of parallel scrambler. It was of great help. Thanks you once again.

  35. Kuoping
    November 6th, 2010 at 07:13 | #35

    Hi,

    Do you provide stand-alone application for scrambler?

  36. November 6th, 2010 at 09:53 | #36

    Hi,

    At this moment I don’t provide a stand-alone application for the parallel scrambler generator. Can you tell why you need it, and what cannot be done with the online version ?

    Thanks,
    Evgeni

  37. anne
    December 8th, 2010 at 02:42 | #37

    hi,
    I’m confused by the signal ‘scram_rst’. Can you please explain the need for that signal? Is it similar to a data valid signal?

  38. December 8th, 2010 at 08:44 | #38

    Hi,

    scram_rst enables initialization of lfsr_q register with all-1’s. scram_rst is not essential; the same initialization can be done with the ‘rst’ signal.

    Thanks,
    Evgeni

  39. SCHOCH
    February 9th, 2011 at 07:08 | #39

    Hi,

    Thank you for your tool, very useful,
    as I used to do parallel scramblers by “hand”.

    Have you looked into the self-synchronous scrambler?
    Any idea of the modifications needed?

    thanks a lot .

  40. February 9th, 2011 at 08:48 | #40

    Hi,

    Several users have asked for a self-synchronous scrambler. I think my current approach will still work, but it’ll require quite a few modifications.

    Thanks,
    Evgeni

  41. Lam Nguyen
    March 5th, 2011 at 19:55 | #41

    Thank you

  42. Naveen
    March 9th, 2011 at 02:28 | #42

    Hi,
    In my scrambler, the data width is 64 and the polynomial width is 58; my polynomial is 1+x^39+x^58. I got the scrambler, but I am still a little confused about the descrambler. Please explain how the descrambler works with the same data width, that is 64 bits.

  43. March 9th, 2011 at 07:04 | #43

    Hi Naveen,

    In most cases, descrambler code is identical to scrambler. It’s usually described in a protocol specification, unless you implement something custom.

    Thanks,
    Evgeni

  44. chan
    April 5th, 2011 at 09:52 | #44

    Hi,

    I used the PCIe scrambler from your website and I am trying to instantiate it in the altera FPGA. I am not sure at what point am I supposed to enable the scrambler?

    At this point, I am looking at a PCIe x1 lane; the deserializer is bringing out an 8-bit output for this one lane. At this point the deserializer data is scrambled but 8b10b decoded. You have the scram_en, scram_rst and rst signals. I have tied them to the PCIe reset signal. Is it OK to use the PCIe reset signal to enable the scrambler? In this application, I am trying to descramble the 8-bit scrambled data from the deserializer. I am assuming that the same scrambler module can be used to descramble.

    wire [7:0] descrambled_data;

    scrambler ds1(
    .data_in(rxdata0_ext[7:0]),
    .scram_en(pcie_rstn),
    .scram_rst(~pcie_rstn),
    .data_out(descrambled_data[7:0]),
    .rst(~pcie_rstn),
    .clk(pclk_in)
    );

  45. April 5th, 2011 at 10:56 | #45

    Hi,

    Reset input is used to initialize scrambler LFSR to all-ones.
    Enable is used to qualify the data (data valid).
    If your data is valid in every clock cycle, then it should work.

    Thanks,
    Evgeni

  46. Matt
    July 19th, 2011 at 05:49 | #46

    Hi Evgeni,

    Nice tool. Has there been any progress on adding the ability to generate a self synchronous scrambler for Ethernet applications?

    Thanks,

    — Matt

  47. iambeast
    October 16th, 2011 at 22:19 | #47

    Hi, I want to descramble USB 3.0 data. Can I use its scrambler code?

  48. October 17th, 2011 at 07:21 | #48

    Hi,

    Yes, the descrambler in USB 3.0 is the same as the scrambler. It’s stated on page 451 in Appendix B.1 of the spec.

    Thanks,
    Evgeni

  49. iambeast
    October 17th, 2011 at 18:31 | #49

    I used VHDL for descrambling USB 3.0 data; the code was generated by the Scrambler Generator Tool. The scrambler code has Data Width=16, but I need descrambler code with Data Width=32. Is that OK?

  50. October 17th, 2011 at 19:35 | #50

    Hi,

    That should work. That’s the whole purpose of this tool – to generate scrambler/descrambler code with any data width.

    Thanks,
    Evgeni
