Comments

  1. August 26th, 2012 at 21:44 | #1

    Hi Dmitry,

    Yes, there is a typo.
    Swapping bits in 0x6522df69 will get 0x9add2096
    Swizzling all bits in 0x9add2096 will get 0x6904bb59

    Thanks,
    Evgeni
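
The two values above can be checked with a short Python sketch. It assumes that "swapping" means inverting every bit and "swizzling" means reversing the order of all 32 bits; that reading is inferred from the numbers themselves, and the helper names are illustrative.

```python
def invert32(value):
    """Flip every bit of a 32-bit word."""
    return ~value & 0xFFFFFFFF


def reverse32(value):
    """Reverse the order of the 32 bits (bit 0 becomes bit 31, and so on)."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


inverted = invert32(0x6522DF69)
swizzled = reverse32(inverted)
print(hex(inverted))  # 0x9add2096
print(hex(swizzled))  # 0x6904bb59
```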

  2. Jamil
    November 7th, 2013 at 07:41 | #2

    CRC parallel calculation
    I have implemented Verilog code for a parallel CRC8 with the help of your document “A Practical Parallel CRC Generation Method”. Many thanks for this.

    I do have a question regarding the CRC polynomial: how can I insert the CRC polynomial (0x107) as a parameter in my design? Is it possible?

    Thank you
    Jamil

  3. November 20th, 2013 at 06:49 | #3

    @Jamil

    Hi Jamil,

    One way to parametrize the CRC polynomial is to do a bitwise logic AND with the taps, for example, using a “for” loop in Verilog.

    Thanks,
    Evgeni
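
The AND-with-the-taps idea can be sketched in Python as a software model rather than as synthesizable Verilog. The function name `crc_serial` and its arguments are illustrative, and the sketch assumes an MSB-first CRC with a zero initial value.

```python
def crc_serial(data, poly, width, init=0):
    """Bit-serial CRC, MSB first; `poly` holds the taps without the top term."""
    mask = (1 << width) - 1
    state = init & mask
    for byte in data:
        for i in range(7, -1, -1):
            feedback = ((state >> (width - 1)) & 1) ^ ((byte >> i) & 1)
            # AND the feedback bit with every tap: either all taps or none.
            taps = poly & (mask if feedback else 0)
            state = ((state << 1) & mask) ^ taps
    return state


# 0x107 is x^8 + x^2 + x + 1; the taps below the top term are 0x07.
print(hex(crc_serial(b"123456789", poly=0x07, width=8)))  # 0xf4
```

Because the polynomial only enters through the AND, changing `poly` changes the CRC without touching the loop, which is the property a Verilog parameter would rely on.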

  4. Kevin
    May 9th, 2014 at 07:21 | #4

    Hi Evgeni,
    I read your paper and understood the general idea of parallel CRC calculation, but not in depth. Is it possible to implement the parallel CRC calculation using purely combinational logic (an XOR array)? That is, I load in the data and get the CRC checksum instantly. In the code generated by the website, it looks like the CRC of the next block of data depends on the CRC of the previous block of data, which does not seem reasonable to me.
    Thanks for any help.
    Kevin

  5. May 9th, 2014 at 08:14 | #5

    Hi Kevin,

    This is the idea behind parallel CRC. The disadvantage of the parallel CRC is that it requires more XOR gates and more levels of logic as the data gets wider.
    It’s possible to implement, for example, a CRC with 4096-bit data, and calculate it instantly on 4096-bit data. But in practice it’s implemented with 32-bit data (it can be another value as well), and so it’d take 4096/32=128 clocks. That’s why there are registers on the output of the CRC to store intermediate results.

    Thanks,
    Evgeni
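
The role of the output register can be illustrated with a small Python model. It assumes the CRC-32 polynomial 0x04C11DB7, MSB-first data and an all-ones reset value; `crc_update` and `reg` are illustrative names, not taken from the generated code.

```python
def crc_update(state, word, nbits, poly=0x04C11DB7, width=32):
    """Advance the CRC register by `nbits` of data, MSB first."""
    mask = (1 << width) - 1
    for i in range(nbits - 1, -1, -1):
        feedback = ((state >> (width - 1)) & 1) ^ ((word >> i) & 1)
        state = ((state << 1) & mask) ^ (poly if feedback else 0)
    return state


message = int.from_bytes(bytes(range(256)) * 2, "big")  # 4096 bits of sample data

# One very wide combinational step over all 4096 bits.
one_shot = crc_update(0xFFFFFFFF, message, 4096)

# 128 clocks of 32 bits each; `reg` plays the role of the output register.
reg = 0xFFFFFFFF
for clock in range(128):
    shift = 4096 - 32 * (clock + 1)
    reg = crc_update(reg, (message >> shift) & 0xFFFFFFFF, 32)

print(one_shot == reg)  # True
```

Both paths give the same result; the narrower one trades 128 clocks for far fewer XOR gates per step.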

  6. Kevin
    May 9th, 2014 at 09:16 | #6

    @Evgeni
    OK, thanks. I am only using 8-13 bit data, so area is not a concern. If I want it calculated instantly, I just need to follow the method you described in the paper and ignore the regs, right? In this case, I should just need to tie the signals that used to be the previous CRC state to all ‘1’s, right?

  7. May 9th, 2014 at 09:24 | #7

    Hi Kevin,

    That’s right. Most of the CRC specifications use ‘1’s to initialize the CRC. But strictly speaking, it doesn’t need to be the case.

    Thanks,
    Evgeni

  8. Kevin
    May 9th, 2014 at 09:39 | #8

    @Evgeni
    Thank you so much. I think I figured out the problem. Initialization with ‘1’ or ‘0’ may lead to a different CRC checksum, but both are compliant with the same polynomial. In standard arithmetic computation, we are used to putting ‘0’ when doing the long division; that’s the only difference in practical design.
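
A quick way to see the effect of the initial value is to run the same polynomial with both seeds. This is a minimal sketch assuming the 16-bit polynomial 0x1021 and MSB-first data; `crc16` is an illustrative helper.

```python
def crc16(data, init, poly=0x1021):
    """MSB-first CRC-16 with a selectable initial register value."""
    state = init
    for byte in data:
        state ^= byte << 8
        for _ in range(8):
            if state & 0x8000:
                state = ((state << 1) ^ poly) & 0xFFFF
            else:
                state = (state << 1) & 0xFFFF
    return state


msg = b"123456789"
print(hex(crc16(msg, init=0x0000)))  # 0x31c3
print(hex(crc16(msg, init=0xFFFF)))  # 0x29b1

# With a zero seed, extra leading zero bytes leave the checksum unchanged.
print(crc16(b"\x00" + msg, init=0x0000) == crc16(msg, init=0x0000))  # True
```

The last line hints at one practical reason many specifications choose an all-ones seed: a zero register cannot see leading zeros.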

  9. Denis
    July 3rd, 2014 at 02:25 | #9

    Hello. Your article about the CRC generator is very interesting. But replacing the “for … loop” algorithm with “XOR” is an advantage only for a bad synthesizer like Xilinx. I used Altera Quartus for your example and the two methods gave the same result.
    Denis

  10. Terry Cornall
    August 27th, 2014 at 19:10 | #10

    Hi Evgeni
    Thanks very much for your article in outputlogic.com/my-stuff/circuit-cellar-january-2010-crc.pdf
    I found it very helpful in starting to grasp the concepts behind the design of parallel CRC generation.
    I did notice that maybe there is a typo on page 41 where you present a set of equations for Mout. I think the first line which is:
    Mout[0]=Min[1]^Min[4]^Min[0]^Min[3]
    should be
    Mout[0]=Min[1]^Min[4]^Nin[0]^Nin[3]

    Thanks again,
    Terry

  11. August 27th, 2014 at 20:57 | #11

    Hi Terry,

    That’s right, there is a typo. It was found a long time ago, here: http://outputlogic.com/?p=158#comment-8235

    Thanks,
    Evgeni

  12. Dick Karaus
    March 30th, 2015 at 04:10 | #12

    Has anyone tried the CRC for the CoaXPress standard? It is a 32-bit CRC based on IEEE 802.3. But, as noted in the standard, the input bits are reversed, the output CRC is reversed, etc. They give an example message packet in the standard and its computed CRC, but I can’t for the life of me reproduce the CRC in their example. The standard is here and the example is on page 35 in the Comment. As they mention in the body of the text, the 8B/10B code K27.7 is treated as D27.7, which is hex FB. They don’t exactly say what the input data length is, but I have tried 8-bit, 16-bit and 32-bit inputs to no avail. I have tried negating the output CRC… no good. If anyone has had success at this, please let me know. Thanks.

  13. March 30th, 2015 at 06:18 | #13

    Have you tried 1-bit data, all-zeros?

  14. Dick Karaus
    March 31st, 2015 at 04:07 | #14

    Evgeni,
    I’m not sure what you mean. I have been trying to recreate the example in the CoaXPress standard so that I know my CRC calculation code is correct. They give an example message with its calculated CRC. It seems simple enough. I just can’t seem to recreate it.

  15. Dick Karaus
    March 31st, 2015 at 10:15 | #15

    I found my error. All is right with the world again.

  16. Frank Bruno
    July 17th, 2015 at 05:08 | #16

    Hi. Great site and really useful.
    I’m trying to generate a scrambler for 100GE. I am using a 640-bit bus with a 57-bit LFSR, 1+x^38+x^57. I am getting the following error when it completes:

    Internal Server Error
    The server encountered an internal error or misconfiguration and was unable to complete your request.
    Please contact the server administrator, and inform them of the time the error occurred, and anything you might have done that may have caused the error.
    More information about this error may be available in the server error log.
    Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

    Thanks,
    Frank

  17. July 17th, 2015 at 09:05 | #17

    Hi Frank,

    The time it takes to generate scrambler code grows quadratically with the number of data bits. Therefore, it’s quite possible that the server times out. There is a command-line utility to generate CRC with wide data to overcome this kind of problem. But unfortunately, I haven’t made public a similar utility to generate scrambler code.

    Thanks,
    Evgeni

  18. shravan
    January 10th, 2016 at 21:23 | #18

    Hi

    In the scrambler, when the input is all 0’s, the generator gives an output of all 0’s. But as per the scrambler definition, long sequences should be changed into a mix of 0’s and 1’s.

    In the above case, the all-0’s input is not affected.

    Can you please explain why the output does not change with an all-0’s input?

    Thanks,
    Shravan kumar

  19. shravan
    January 10th, 2016 at 22:09 | #19

    Hi Evgeni,

    Thanks for your response,
    As per your comments, I tried an initial scrambler value of all 1’s with a scrambler input of all 0’s. It still gives an all-0’s output, because the input is XORed with x^39+x^58 (1 xor 1 = 0), so 0 (taps XORed) xor 0 (input) gives an output of 0. So the output is not affected by this initial value (if the initial value were a sequence like 10101010..., it would be affected).

    I am using the self-synchronous scrambler (10GBASE-R, IEEE 802.3-2012_section4.pdf, 49.2.6 Scrambler).

    Can you please refer to the above document?

    Thanks,
    Shravan kumar

  20. January 10th, 2016 at 22:40 | #20

    Hi Shravan,

    Seeding with all-1’s is just one option to implement a scrambler. The Ethernet specification you’re referring to implements it differently to avoid the problem with long 0 sequences. The specification requires a new scrambler seed for each packet. In addition, the data is NRZ and differentially encoded.

    Thanks,
    Evgeni
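
The arithmetic from comment 19 can be replayed with a bit-serial model. This is only a sketch under stated assumptions: a self-synchronous scrambler for 1+x^39+x^58 whose output bit is shifted into a 58-bit state, with the taps read at state positions 38 and 57. It is not a transcription of the generated code or of the IEEE text, and `scramble` is an illustrative name.

```python
def scramble(bits, seed, taps=(38, 57), length=58):
    """Self-synchronous scrambler: each output bit is shifted into the state."""
    mask = (1 << length) - 1
    state = seed & mask
    out = []
    for bit in bits:
        y = bit
        for tap in taps:
            y ^= (state >> tap) & 1
        state = ((state << 1) | y) & mask
        out.append(y)
    return out


zeros = [0] * 64
all_ones_seed = scramble(zeros, seed=(1 << 58) - 1)
zero_seed = scramble(zeros, seed=0)

print(all_ones_seed[:39] == [0] * 39)  # True
print(all_ones_seed[39])               # 1
print(any(zero_seed))                  # False
```

Under these assumptions, the all-ones seed cancels itself only while both taps still see seed bits; once the zeros that were shifted in reach the first tap, ones start to appear. An all-zero seed with all-zero input stays at zero indefinitely.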

  21. John Vogel
    May 9th, 2016 at 14:22 | #21

    Hi Evgeni, I just wanted to thank you for this wonderful website. You totally rock!

    Best,
    John

  22. John Vogel
    May 10th, 2016 at 03:47 | #22

    Hi Evgeni, I want to understand how I can design a circuit, and variants of it with different polynomials, similar to the one your software produced (scrambler module for data[31:0], lfsr[22:0]=1+x^2+x^5+x^8+x^16+x^21+x^23;). What paper or textbook would be the place to find the information? I want to understand the details of the math and algorithms that your software implements, so that I can understand and design the circuits myself.

    Thanks,
    John

  23. May 10th, 2016 at 10:54 | #23

    Hi John,

    You can start with this article:
    http://outputlogic.com/my-stuff/circuit-cellar-january-2010-crc.pdf

    It’s about CRC, but the math used in scramblers is similar.

    Thanks,
    Evgeni

  24. sanjayy
    June 3rd, 2016 at 04:08 | #24

    Hi Evgeni,
    I was going through your paper “A Practical Parallel CRC Generation Method” at http://outputlogic.com/my-stuff/parallel_crc_generator_whitepaper.pdf
    and saw there is a typo in equation
    Mout[0]= Min[1] ^ Min[4] ^ Min[0] ^ Min[3] at page 43
    instead the equation should come out to be
    Mout[0]= Min[1] ^ Min[4] ^ Nin[0] ^ Nin[3]

    Can you please have a look at it.

    Thanks,
    sanjay

  25. June 3rd, 2016 at 06:59 | #25

    Hi Sanjay,

    Yes, it’s a known typo.

    Thanks,
    Evgeni

  26. Fenghua Feng
    August 3rd, 2016 at 05:25 | #26

    Hi Evgeni,

    Thank you so much for the wonderful tool. It’s really useful.

    I generated Verilog code for a 16-bit CRC with polynomial X^16 + X^12 + X^5 + 1. The input data width is 16 bits. It seems that, in the Verilog code, the input data is first XORed with the shift register bit by bit, then the shift register shifts 16 times, resulting in the new shift register values for the next cycle. Is this the standard generation process? I have also seen methods where the input data is fed into the shift register bit by bit during the shifting operation. The results from the two methods are different. I am confused about which one I should use. If I use one method and the receiver uses another, will it work correctly?

    Thanks,
    Fenghua

  27. Fenghua Feng
    August 3rd, 2016 at 11:03 | #27

    Hi Evgeni,

    I figured it out now. It seems that MSB first and LSB first will give different results.

    Thanks.
    Fenghua
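
The effect of bit order is easy to reproduce in a software model. The sketch below assumes polynomial 0x1021 with a zero initial value; `crc16_bits` and `to_bits` are illustrative helpers.

```python
def crc16_bits(bits, poly=0x1021, init=0x0000):
    """Shift a sequence of bits through a 16-bit CRC register."""
    state = init
    for bit in bits:
        feedback = ((state >> 15) & 1) ^ bit
        state = ((state << 1) & 0xFFFF) ^ (poly if feedback else 0)
    return state


def to_bits(data, msb_first):
    """Serialize bytes into bits in the chosen order."""
    order = range(7, -1, -1) if msb_first else range(8)
    return [(byte >> i) & 1 for byte in data for i in order]


msg = b"123456789"
msb = crc16_bits(to_bits(msg, msb_first=True))
lsb = crc16_bits(to_bits(msg, msb_first=False))
print(hex(msb))                          # 0x31c3
print(hex(int(f"{lsb:016b}"[::-1], 2)))  # 0x2189
```

The second printed value is the LSB-first result with its 16 bits reversed, which is how reflected CRC variants are usually reported. Transmitter and receiver have to agree on the bit order, otherwise the checks will not match.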

  28. Daniyal Khurram
    August 7th, 2016 at 13:28 | #28

    Hi Evgeni,
    I just implemented your code to calculate CRC-6. I have the following question: the parallel CRC generation method uses a serial CRC generator (called recursively) to calculate the H1 and H2 table values, which it then uses to come up with a refined parallel implementation. If I want to calculate the CRC register with the initial values set to all 0’s instead of 1’s, can I do so just by setting the initial LFSR value to all 0’s during reset, or would I need to calculate the table values again as well and then come up with an XOR implementation using them?

  29. August 7th, 2016 at 20:27 | #29

    Hi Daniyal,

    H1 and H2 matrices are independent of the initial value of CRC.

    Thanks,
    Evgeni
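
The independence from the initial value can be checked with a small model. This sketch assumes a 5-bit CRC with polynomial x^5 + x^2 + 1 and 4-bit data, with data bit 0 entering first; `data_rows` and `state_rows` are illustrative stand-ins for the two matrices, built from one-hot inputs.

```python
def crc_step(state, data, poly=0x05, m=5, n=4):
    """Serial reference: shift n data bits (bit 0 first) into an m-bit CRC."""
    for i in range(n):
        feedback = ((state >> (m - 1)) & 1) ^ ((data >> i) & 1)
        state = ((state << 1) & ((1 << m) - 1)) ^ (poly if feedback else 0)
    return state


# Rows from one-hot inputs; the reset value plays no part in building them.
data_rows = [crc_step(0, 1 << i) for i in range(4)]   # state held at 0
state_rows = [crc_step(1 << i, 0) for i in range(5)]  # data held at 0


def crc_parallel(state, data):
    """XOR together the rows selected by the set bits of state and data."""
    out = 0
    for i, row in enumerate(data_rows):
        if (data >> i) & 1:
            out ^= row
    for i, row in enumerate(state_rows):
        if (state >> i) & 1:
            out ^= row
    return out


for init in (0b00000, 0b11111):
    print(all(crc_parallel(init, d) == crc_step(init, d) for d in range(16)))  # True
```

The same rows serve both reset values because the CRC update is linear over XOR: the initial value is just another input to the equations, not part of them.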

  30. semo
    November 29th, 2016 at 05:46 | #30

    How can we generate a pseudo-random number of length 79 using an LFSR?

  31. Rakshitha A
    November 28th, 2018 at 05:29 | #31

    For data 1001 and generator polynomial 1011, I got a sequence of CRC_output values like 7, 2, 1, 0, 6, 4, 3. The actual output for this should be 110. How do I know which one is the CRC? Could someone please explain? Thanks in advance.

  33. January 24th, 2019 at 08:56 | #33

    Hi Evgeni,

    Many thanks for your work on this topic – but I need you to check it for me.

    I put your generated VHDL into SIMULINK for the polynomial X^7 + X^6 + 1 (which is used in SONET/SDH) with 4-bit data.
    I take it that with additive (side) PRBS scramblers, the scrambler and the descrambler are the same circuit, unlike in multiplicative scramblers, where the data is fed in and out at opposite ends of the delay chain.

    So I put a scrambler in series with a descrambler, i.e. two instances of the same implementation of your VHDL,
    but only 1 bit worked; the other 3 remained scrambled.

    I can implement and try the VHDL, but that will take me a bit longer; I just need to create a test bench to drive it.
    I can send you a picture of the SIMULINK Circuit I constructed as an email attachment for you to see what I have done.
    or even the SIMULINK .SLX file in case you have SIMULINK or maybe a free SIMULINK viewer ?
    It appears to be quite simple in SIMULINK to construct a circuit model from your VHDL (i.e. I think my circuit is right),
    which is why I am wondering how these parallel forms of the polynomial are generated.
    Is there a book, paper or standard maths that describes changing the LFSR to a parallel implementation?

    I don’t understand how you generate the terms for a parallel scrambler architecture,
    it doesn’t seem to match what I see in the textbook “High Speed Serdes Devices and Applications”, page 142.
    Actually yours looks more correct, theirs looks over simplified.

    See the following link and arrow to page 142.

    https://books.google.co.uk/books?id=Cx3r0H-4AhEC&pg=PA141&lpg=PA141&dq=scrambler+and+descramblers&source=bl&ots=voSExBbg_H&sig=ACfU3U1SXzU09_1AmbAsEP-lF2ZZ90NDdQ&hl=en&sa=X&ved=2ahUKEwjwx7Kx64HgAhXdTBUIHchPC8Q4FBDoATADegQIBRAB#v=onepage&q=scrambler%20and%20descramblers&f=false

    IF you engage with me on this, I will buy your book 🙂 and our company could offer you some consultancy.
    We are using MATLAB/SIMULINK for a software-defined radio project. The Scrambler block is not supported
    for auto HDL coding in SIMULINK, so we’re creating the scrambler as a circuit of XORs and D-types that will code.

    I can get additive (Fibonacci) and Galois forms working for bit-serial LFSRs,
    but so far, no joy in turning things into parallel.

    Also, I don’t quite know why I can’t just split the input stream (say 1 byte wide) into 8 bit-serial scramblers
    and then recombine it in the same way after the descramblers.
    It perhaps uses more hardware resource, but overall not much hardware resource either way?
    If I did use multiple bit-serial scramblers, my XOR gates would reduce in fan-out and depth,
    allowing faster operation.

    Best Regards,
    Mike Brewin
    Design Engineer, ECS, England.

  34. January 24th, 2019 at 20:38 | #34

    Hi Mike,

    I’m going to reply to your email.

    Thanks,
    Evgeni

  35. Rucha
    July 24th, 2019 at 12:31 | #35

    Hi Evgeni,

    I implemented your algorithm and generated CRC-64-ECMA for single-bit and for 256-bit data. I made a testbench in which I gave a data input of 0 to both systems. In the serial implementation, I get different CRC values as the clock keeps advancing. I do not understand why that happens, and if so, what the final value of my CRC is. Also, the serial and parallel CRCs do not seem to match in any clock cycle. What am I doing wrong? Could you please email me at your convenience at rucha_95@yahoo.com

  36. July 24th, 2019 at 13:58 | #36

    Hi Rucha,

    There are several reasons why serial and parallel implementations don’t match. One is incorrect bit and byte order.
    I think most of the questions in this thread deal with the exact same issue. You might want to review it first.

    Thanks,
    Evgeni

  37. Rucha
    July 31st, 2019 at 13:22 | #37

    Hi,

    I have the parallel implementation, and my CRC is generated in the first clock for some input data. Now I want to test the correctness of the data: take the data, append the CRC at the end, and XOR this with the CRC that we received. The result should be zero. How do I go about implementing this?

  38. July 31st, 2019 at 15:09 | #38

    Hi Rucha,

    The result of the CRC check depends on the way the CRC is appended to the original data. But in most cases it’s not zero for a reason: to detect trailing zero sequences in data. It’s usually called a “magic number”, which you need to compare against.

    Thanks,
    Evgeni
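
The constant residue can be seen with the standard CRC-32 from Python's `zlib` module. This sketch assumes the CRC is appended least significant byte first; `append_crc` is an illustrative helper.

```python
import struct
import zlib


def append_crc(payload):
    """Append the CRC-32 of the payload, least significant byte first."""
    return payload + struct.pack("<I", zlib.crc32(payload))


for payload in (b"123456789", b"parallel CRC", b""):
    residue = zlib.crc32(append_crc(payload))
    print(hex(residue))  # 0x2144df1c each time
```

`zlib.crc32` applies the final inversion, so the constant it reports is 0x2144DF1C; the raw register value before that inversion is 0xDEBB20E3. A different byte order or a different CRC gives a different constant, or none at all.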

  39. Naveen
    September 4th, 2019 at 01:14 | #39

    Hi Evgeni,

    Thanks for sharing the knowledge.
    I have seen CRC implementations based on a mask table. Is there such a way of implementing it? If yes, which method is better for implementing CRC-16: polynomial or mask_table?
    Also, can I get a reference for the mask table implementation procedure?

    Thanks,
    M.M.Naveen Kumar

  40. John
    January 3rd, 2020 at 13:10 | #40

    Hi Evgeni,

    I have been using your parallel CRC generator and I am getting a correct output when I compare it to a serial CRC generator. I notice, however, that the results I’m getting are listed as CRC-32 (MPEG-2), which means the input and output are not reflected. CRC-32 (JAMCRC) uses the same polynomial and initial value, but it has a reflected input and output. I’ve tried swapping the bit order of my incoming data and the output CRC, but I am not able to reproduce the serial result for this version. Do you know if it is possible to generate this output using this calculator, or does it require different logic for the data to be operated on?

    Thanks,
    John

  41. thien
    January 4th, 2020 at 00:20 | #41

    Hi.
    First of all, your RTL code is a wonderful resource.
    When I use your code to generate a 16-bit CRC (CCITT) with poly x16+x12+x5+x0 and compare with the result at https://crccalc.com/
    it shows that the result of your code (poly x16+x12+x5+x0) belongs to the CRC-16/CCITT-FALSE algorithm.
    However, my reference book shows the result as the CRC-16/MCRF4XX algorithm, which has the same poly 0x1021 and init 0xFFFF.
    I am wondering whether you have any code for this situation.
    Please, this would help me a lot.
    Thanks

  42. January 5th, 2020 at 18:10 | #42

    Hi John,
    I did a quick search of JAMCRC. It says that it’s simply the bitwise-not of the standard CRC-32, whatever you call “reflected output”.
    If that’s the case, it’s trivial to change generated output by adding “~” (bitwise-not).
    Thanks,
    Evgeni
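
Both relationships can be checked in a few lines of Python. The sketch assumes the usual parameter sets: MPEG-2 as polynomial 0x04C11DB7, all-ones init, no reflection and no final XOR, and JAMCRC as the same with reflected input and output. `crc32_mpeg2` and `reflect` are illustrative helpers.

```python
import zlib


def crc32_mpeg2(data):
    """Poly 0x04C11DB7, init all ones, MSB first, no reflection, no final XOR."""
    state = 0xFFFFFFFF
    for byte in data:
        state ^= byte << 24
        for _ in range(8):
            if state & 0x80000000:
                state = ((state << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                state = (state << 1) & 0xFFFFFFFF
    return state


def reflect(value, nbits):
    """Reverse the order of the lowest `nbits` bits."""
    return int(f"{value:0{nbits}b}"[::-1], 2)


msg = b"123456789"
jamcrc = zlib.crc32(msg) ^ 0xFFFFFFFF  # bitwise-not of the standard CRC-32
via_mpeg2 = reflect(crc32_mpeg2(bytes(reflect(b, 8) for b in msg)), 32)

print(hex(crc32_mpeg2(msg)))  # 0x376e6e7
print(hex(jamcrc))            # 0x340bc6d9
print(jamcrc == via_mpeg2)    # True
```

Starting from a non-reflected MPEG-2 style core, the bit order has to be reversed within each input byte and then across the whole 32-bit result; reversing only one of the two does not reproduce JAMCRC.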

  43. Tony Duong
    January 14th, 2020 at 03:49 | #43

    HI,

    I am trying to generate the scrambler for 1+x^43 as *User define* with the poly mentioned. However, the generated code does not seem to do anything at all, since lfsr_q *NEVER* changes regardless of what the input is.

    So if the start seed is all F’s, then the output is just the inverse of the input, nothing else.

    Any comment on that?

    Thanks,
    Tony

  44. Diego
    March 30th, 2020 at 07:15 | #44

    Hi Evgeni,
    I need to design a parallel CRC generator with a 32-bit data width and a 20-bit polynomial. I tried your CRC generator with the parameters I mentioned, but if my message word also has 32 bits and I need to compute the checksum of each word per cycle, the lfsr_q register must be initialized for every word, right? I mean, I don’t have to assign lfsr_c to it every clock cycle; rather, I have to assert reset to 1 every cycle.
    On the other hand, I was reading the Campobello paper that you referenced in your article, and it says that for this method the number of bits of the message (k) and the degree of the polynomial (m) must be multiples of the number of bits to be processed in parallel
    (w). But in my case I have k=w=32 and m=20, so that condition is not met. I tried anyway, asserting reset to 1 every clock cycle as I mentioned above, and it seems to work. So I’m not sure what is going on.
    I hope you can clarify my doubts.
    Greetings,
    Diego

  46. March 30th, 2020 at 10:44 | #46

    Hi Diego,

    The 20-bit CRC output becomes the LFSR initialization value for the next clock. In the very first clock, the LFSR is typically initialized with all-1s. But that’s not always the case. Some protocols might use a different value.

    Thanks,
    Evgeni

  47. Diego
    March 31st, 2020 at 08:01 | #47

    @Evgeni
    I think I understand now: the checksum bits are computed for a frame, not for a single word.
    On the other hand, I’m not sure if I’m misunderstanding it, but there is a restriction in Campobello’s paper that says that the degree of the generator polynomial (m) and the length of the message to be processed (k) are both multiples of the number of bits to be processed in parallel (w). So, according to that, I cannot design a CRC generator with m=20 and w=32, is that right? Regardless of that restriction, I tried anyway with the polynomial 0xC1ACF and data_width=32, and it seems to work.
    Thank you very much for your time and patience; I’m a desperate student learning about this along the way.

  48. March 31st, 2020 at 09:29 | #48

    Hi Diego,

    You can have a CRC and data of any width, with any polynomial; there is absolutely no restriction. But only certain classes of polynomials have the desired error detection properties. If your CRC and data are too wide, you’d have performance issues (clock frequency in an FPGA or ASIC). Also, if the CRC is too wide, it’s going to increase the communication overhead. A parallel CRC is exactly the same as a serial 1-bit CRC with an unrolled loop.

    Thanks,
    Evgeni
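
The unrolled-loop view can be made concrete by running the serial loop on symbols instead of values. The sketch below uses the 20-bit CRC and 32-bit data from this exchange; reading 0xC1ACF as the taps below an implicit x^20 term is an assumption, as are MSB-first data and all the names used.

```python
M, W, POLY = 20, 32, 0xC1ACF  # POLY read as taps below an implicit x^20 term


def serial(state, word):
    """Reference: 32 one-bit steps, MSB of the word first."""
    for k in range(W - 1, -1, -1):
        feedback = ((state >> (M - 1)) & 1) ^ ((word >> k) & 1)
        state = ((state << 1) & ((1 << M) - 1)) ^ (POLY if feedback else 0)
    return state


def unroll():
    """Run the same loop on symbols. Each register bit becomes a bitmask of the
    inputs XORed into it: bits 0..M-1 are old state bits, bits M and up are data."""
    regs = [1 << i for i in range(M)]
    for k in range(W - 1, -1, -1):
        feedback = regs[M - 1] ^ (1 << (M + k))
        regs = [(regs[i - 1] if i else 0) ^ (feedback if (POLY >> i) & 1 else 0)
                for i in range(M)]
    return regs


EQUATIONS = unroll()  # one XOR equation per output bit


def parallel(state, word):
    """Evaluate every XOR equation on concrete state and data values."""
    inputs = state | (word << M)
    return sum((bin(eq & inputs).count("1") & 1) << i
               for i, eq in enumerate(EQUATIONS))


frame = [0xDEADBEEF, 0x00000001, 0x12345678]
a = b = (1 << M) - 1  # reset once, at the start of the frame
for word in frame:
    a, b = serial(a, word), parallel(b, word)
print(a == b)  # True
```

Nothing in the unrolling needs the CRC width to divide the data width, and the register is reset once per frame rather than once per word.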

  49. Diego
    March 31st, 2020 at 10:22 | #49

    @Evgeni
    Understood, thank you very much Evgeni!
    Greetings

  50. August 22nd, 2020 at 13:02 | #50

    The generators have been invaluable to me. It’s a terrific resource!

    The website is definitely running, but it looks like the generators aren’t running. I specify the parameters and in step 2 I only get the message:
    Specify polynomial coefficients

    It’s a lot to ask, but would you consider sharing your code?

    Thanks!
