1. August 26th, 2012 at 21:44 | #1

Hi Dmitry,

Yes, there is a typo.
Swapping bits in 0x6522df69 will get 0x9add2096
Swizzling all bits in 0x9add2096 will get 0x6904bb59

Thanks,
Evgeni
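
The two numbers above can be checked with a short Python sketch. It assumes that "swapping" means inverting every bit and that "swizzling" means reversing the order of all 32 bits; those readings are an interpretation, chosen because they reproduce the quoted values. The helper names are made up for this illustration.

```python
def invert32(value: int) -> int:
    """Bitwise NOT, limited to 32 bits."""
    return ~value & 0xFFFFFFFF

def reverse32(value: int) -> int:
    """Reverse the order of all 32 bits (bit 0 becomes bit 31, and so on)."""
    return int(f"{value:032b}"[::-1], 2)

inverted = invert32(0x6522DF69)
swizzled = reverse32(inverted)
print(hex(inverted), hex(swizzled))
```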

2. November 7th, 2013 at 07:41 | #2

CRC parallel calculation
I have implemented Verilog code for a parallel CRC8 with the help of your document “A Practical Parallel CRC Generation Method”. Many thanks for this.

I do have a question regarding the CRC polynomial: how can I insert the CRC polynomial (0x107) as a parameter in my design? Is it possible?

Thank you
Jamil

3. November 20th, 2013 at 06:49 | #3

Hi Jamil,

One way to parametrize the CRC polynomial is to do a bitwise logical AND with the taps, for example, using a “for” loop in Verilog.

Thanks,
Evgeni
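
A minimal software sketch of that idea, in Python rather than Verilog: the polynomial is passed in as a parameter and the feedback bit is ANDed with each tap inside a loop. The function names and the choice of MSB-first bit order are assumptions made for this illustration, not taken from the generated code.

```python
def bits_msb_first(data: bytes):
    """Yield the bits of each byte, most significant bit first."""
    for byte in data:
        for i in range(7, -1, -1):
            yield (byte >> i) & 1

def crc_serial(bits, taps: int, width: int, init: int = 0) -> int:
    """Bit-serial CRC with the polynomial passed in as a parameter.

    `taps` is the polynomial without its top x^width term,
    e.g. 0x07 for the CRC-8 polynomial 0x107.
    """
    mask = (1 << width) - 1
    crc = init
    for bit in bits:
        feedback = ((crc >> (width - 1)) & 1) ^ bit
        crc = (crc << 1) & mask
        # AND the feedback bit with every tap, one register bit at a time,
        # the way a "for" loop over the taps would in HDL.
        for i in range(width):
            crc ^= (feedback & (taps >> i) & 1) << i
    return crc

crc8 = crc_serial(bits_msb_first(b"123456789"), taps=0x07, width=8)
print(hex(crc8))
```

With taps 0x07, a zero initial value and no bit reflection, this should match the commonly published CRC-8 check value for the string "123456789". Changing the polynomial only means changing the `taps` argument.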

4. May 9th, 2014 at 07:21 | #4

Hi Evgeni,
I read your paper and understood the general idea of parallel CRC calculation, but not in depth. However, is it possible to implement the parallel CRC calculation using purely combinational logic (an XOR array)? That is, if I load in the data, can I get the CRC checksum instantly? In the code generated by the website, it looks like the CRC of the next block of data depends on the CRC of the previous block of data, which does not seem reasonable to me.
Thanks for any help.
Kevin

5. May 9th, 2014 at 08:14 | #5

Hi Kevin,

This is the idea behind parallel CRC. The disadvantage of the parallel CRC is that it requires more XOR gates and more levels of logic as the data goes wider.
It’s possible to implement, for example, 4096-bit CRC, and calculate it instantly on 4096-bit data. But in practice it’s implemented as 32-bit data (can be other value as well), and so it’d take 4096/32=128 clocks. That’s why there are registers on the output of the CRC to store intermediate results.

Thanks,
Evgeni

6. May 9th, 2014 at 09:16 | #6

@Evgeni
OK, thanks. I am only using 8-13 bit data, so area is not a concern. If I want it calculated instantly, I just need to follow the method you described in the paper and ignore the regs, right? In this case, I just need to tie the signals that used to be the previous CRC state to all ‘1’s, right?

7. May 9th, 2014 at 09:24 | #7

Hi Kevin,

That’s right. Most of the CRC specifications use ‘1’s to initialize the CRC. But strictly speaking, it doesn’t need to be the case.

Thanks,
Evgeni

8. May 9th, 2014 at 09:39 | #8

@Evgeni
Thank you so much. I think I figured out the problem. Initialization with ‘1’ or ‘0’ may lead to a different CRC checksum, but both are compliant with the same polynomial. In standard arithmetic computation, we are used to putting ‘0’ when doing the long division; that’s the only difference in practical design.

9. July 3rd, 2014 at 02:25 | #9

Hello. Your article about the CRC generator is very interesting. But replacing the “for … loop” algorithm with “XOR” equations has an advantage only for a bad synthesizer like Xilinx. I used Altera Quartus for your example, and the two methods give the same result.
Denis

10. August 27th, 2014 at 19:10 | #10

Hi Evgeni
Thanks very much for your article in outputlogic.com/my-stuff/circuit-cellar-january-2010-crc.pdf
I found it very helpful in starting to grasp the concepts behind the design of parallel CRC generation.
I did notice that maybe there is a typo on page 41 where you present a set of equations for Mout. I think the first line which is:
Mout[0]=Min[1]^Min[4]^Min[0]^Min[3]
should be
Mout[0]=Min[1]^Min[4]^Nin[0]^Nin[3]

Thanks again,
Terry

11. August 27th, 2014 at 20:57 | #11

Hi Terry,

That’s right, there is a typo. It was found a long time ago, here: http://outputlogic.com/?p=158#comment-8235

Thanks,
Evgeni

12. March 30th, 2015 at 04:10 | #12

Has anyone tried the CRC for the CoaXPress standard? It is a 32-bit CRC based on IEEE 802.3. But, as noted in the standard, the input bits are reversed, the output CRC is reversed, and so on. They give an example message packet in the standard along with its computed CRC, but I can’t for the life of me reproduce the CRC from their example. The standard is here and the example is on page 35 in the Comment. As they mention in the body of the text, the 8B/10B K27.7 codes are treated as D27.7, which is hex FB. They don’t say exactly what the input data length is, but I have tried 8-bit, 16-bit and 32-bit inputs to no avail. I have tried negating the output CRC, with no luck. If anyone has had success with this, please let me know. Thanks.

13. March 30th, 2015 at 06:18 | #13

Have you tried 1-bit data, all-zeros?

14. March 31st, 2015 at 04:07 | #14

Evgeni,
I’m not sure what you mean. I have been trying to recreate the example in the CoaXPress standard so that I know my CRC calculation code is correct. They give an example message with its calculated CRC. It seems simple enough. I just can’t seem to recreate it.

15. March 31st, 2015 at 10:15 | #15

I found my error. All is right with the world again.

16. July 17th, 2015 at 05:08 | #16

Hi. Great site and really useful.
I’m trying to generate a scrambler for 100GE. I am using a 640-bit bus with a 57-bit LFSR, 1+x^38+x^57. I am getting the following error when it completes:

Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, and inform them of the time the error occurred, and anything you might have done that may have caused the error.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

Thanks,
Frank

17. July 17th, 2015 at 09:05 | #17

Hi Frank,

The time it takes to generate scrambler code grows quadratically with the number of data bits. Therefore, it’s quite possible that the server times out. There is a command-line utility to generate a CRC with wide data to overcome this kind of problem. But unfortunately, I haven’t made public such a utility to generate scrambler code.

Thanks,
Evgeni

18. January 10th, 2016 at 21:23 | #18

Hi

When we give the scrambler an all-0’s input, the generator gives an all-0’s output. But by the definition of a scrambler, long runs should be turned into a mix of 0’s and 1’s.

In the above case, the all-0’s input is not affected.

Can you please explain why the output does not change with an all-0’s input?

Thanks,
Shravan kumar

19. January 10th, 2016 at 22:09 | #19

Hi Evgeni,

As per your comments, I tried an all-1’s scrambler initial value with an all-0’s scrambler input. It still gives an all-0’s output, because the taps x^39 and x^58 are XORed together (1 xor 1 = 0), so 0 (XORed taps) xor 0 (input) gives an output of 0. So the initial value has no effect (an initial value like the sequence 10101010… would have an effect).

I am using a self-synchronous scrambler (10GBASE-R, IEEE 802.3-2012_section4.pdf, 49.2.6 Scrambler).

Could you please refer to the above document?

Thanks,
Shravan kumar

20. January 10th, 2016 at 22:40 | #20

Hi Shravan,

Seeding with all-1’s is just one option for implementing a scrambler. The Ethernet specification you’re referring to implements it differently to avoid the problem with long 0 sequences. The specification requires a new scrambler seed for each packet. In addition, the data is NRZ and differentially encoded.

Thanks,
Evgeni

21. May 9th, 2016 at 14:22 | #21

Hi Evgeni, I just wanted to thank you for this wonderful website. You totally rock!

Best,
John

22. May 10th, 2016 at 03:47 | #22

Hi Evgeni, I want to understand how I can design a circuit, and variants of it with different polynomials, similar to the one your software produced (scrambler module for data[31:0], lfsr[22:0]=1+x^2+x^5+x^8+x^16+x^21+x^23); what paper or textbook would be the place to find the information? I want to understand the details of the math and algorithms that your software implemented, so that I can understand and design the circuits myself.

Thanks,
John

23. May 10th, 2016 at 10:54 | #23

Hi John,

http://outputlogic.com/my-stuff/circuit-cellar-january-2010-crc.pdf

It’s about CRC, but the math used in scramblers is similar.

Thanks,
Evgeni

24. June 3rd, 2016 at 04:08 | #24

Hi Evgeni,
I was going through your paper “A Practical Parallel CRC Generation Method” at http://outputlogic.com/my-stuff/parallel_crc_generator_whitepaper.pdf
and saw there is a typo in equation
Mout[0]= Min[1] ^ Min[4] ^ Min[0] ^ Min[3] at page 43
instead the equation should come out to be
Mout[0]= Min[1] ^ Min[4] ^ Nin[0] ^ Nin[3]

Can you please have a look at it.

Thanks,
sanjay

25. June 3rd, 2016 at 06:59 | #25

Hi Sanjay,

Yes, it’s a known typo.

Thanks,
Evgeni
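
The corrected equation can be checked by brute force against a bit-serial LFSR. The sketch below assumes the paper's example is a 5-bit CRC with polynomial x^5 + x^2 + 1 and 4-bit data fed most significant bit first; those parameters are not stated in this thread, so treat them as assumptions. Min is the current CRC state and Nin is the input data.

```python
WIDTH, DATA_WIDTH, TAPS = 5, 4, 0x05  # assumed: x^5 + x^2 + 1 with 4-bit data

def bit(value: int, i: int) -> int:
    return (value >> i) & 1

def serial_step(m_in: int, n_in: int) -> int:
    """Advance the serial LFSR by DATA_WIDTH bits, feeding Nin MSB first."""
    crc = m_in
    for i in range(DATA_WIDTH - 1, -1, -1):
        feedback = bit(crc, WIDTH - 1) ^ bit(n_in, i)
        crc = (crc << 1) & ((1 << WIDTH) - 1)
        if feedback:
            crc ^= TAPS
    return crc

def corrected(m_in: int, n_in: int) -> int:
    # Mout[0] = Min[1] ^ Min[4] ^ Nin[0] ^ Nin[3]
    return bit(m_in, 1) ^ bit(m_in, 4) ^ bit(n_in, 0) ^ bit(n_in, 3)

def as_printed(m_in: int, n_in: int) -> int:
    # Mout[0] = Min[1] ^ Min[4] ^ Min[0] ^ Min[3]
    return bit(m_in, 1) ^ bit(m_in, 4) ^ bit(m_in, 0) ^ bit(m_in, 3)

cases = [(m, n) for m in range(1 << WIDTH) for n in range(1 << DATA_WIDTH)]
corrected_ok = all(bit(serial_step(m, n), 0) == corrected(m, n) for m, n in cases)
printed_ok = all(bit(serial_step(m, n), 0) == as_printed(m, n) for m, n in cases)
print("corrected matches:", corrected_ok, "| printed matches:", printed_ok)
```

Under these assumptions only the version with Nin[0] and Nin[3] agrees with the serial LFSR for every state and data value.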

26. August 3rd, 2016 at 05:25 | #26

Hi Evgeni,

Thank you so much for the wonderful tool. It’s really useful.

I generated Verilog code for a 16-bit CRC with polynomial X^16 + X^12 + X^5 + 1. The input data width is 16 bits. It seems that, in the Verilog code, the input data is first XORed with the shift register bit by bit, and then the shift register shifts 16 times, resulting in the new shift register values for the next cycle. Is this the standard generation process? I have also seen methods where the input data is fed into the shift register bit by bit during the shifting operation. The results of the two methods are different. I am confused about which one I should use. If I use one method and the receiver uses the other, will it work correctly?

Thanks,
Fenghua

27. August 3rd, 2016 at 11:03 | #27

Hi Evgeni,

I figured it out now. It seems that MSB first and LSB first will give different results.

Thanks.
Fenghua

28. August 7th, 2016 at 13:28 | #28

Hi Evgeni,
I just implemented your code to calculate CRC-6. I have the following question: the parallel CRC generation method uses a serial CRC generator (called recursively) to calculate the H1 and H2 table values, which it then uses to come up with a refined parallel implementation. If I wanted to calculate the CRC with the initial register value set to all 0’s instead of all 1’s, as we are doing now, can I just set the initial lfsr value to all 0’s during reset, or would I need to calculate the table values again as well and then derive a new XOR implementation from them?

29. August 7th, 2016 at 20:27 | #29

Hi Daniyal,

H1 and H2 matrices are independent of the initial value of CRC.

Thanks,
Evgeni
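
A small sketch of why that holds: the matrix columns are obtained by running the serial step on single set bits, so the initial value never enters their construction. The CRC-6 polynomial (x^6 + x + 1), the 8-bit data width, and the names H_STATE and H_DATA are assumptions for this illustration; the two lists play the role of the state and data matrices discussed above.

```python
WIDTH, DATA_WIDTH, TAPS = 6, 8, 0x03  # assumed CRC-6: x^6 + x + 1, 8-bit data

def serial_step(crc: int, word: int) -> int:
    """DATA_WIDTH single-bit LFSR steps, feeding the word MSB first."""
    for i in range(DATA_WIDTH - 1, -1, -1):
        feedback = ((crc >> (WIDTH - 1)) ^ (word >> i)) & 1
        crc = (crc << 1) & ((1 << WIDTH) - 1)
        if feedback:
            crc ^= TAPS
    return crc

# One column per input bit: the next state when only that bit is set.
# Note that no initial CRC value appears anywhere in this construction.
H_STATE = [serial_step(1 << i, 0) for i in range(WIDTH)]
H_DATA = [serial_step(0, 1 << i) for i in range(DATA_WIDTH)]

def parallel_step(crc: int, word: int) -> int:
    """XOR together the columns selected by the set bits of crc and word."""
    nxt = 0
    for i, column in enumerate(H_STATE):
        if (crc >> i) & 1:
            nxt ^= column
    for i, column in enumerate(H_DATA):
        if (word >> i) & 1:
            nxt ^= column
    return nxt

def crc_of(words, init, step):
    crc = init
    for word in words:
        crc = step(crc, word)
    return crc

message = [0x12, 0x34, 0x56]
for init in (0x00, 0x3F):
    same = crc_of(message, init, parallel_step) == crc_of(message, init, serial_step)
    print(f"init={init:#04x}: parallel matches serial -> {same}")
```

The same columns serve both initial values; only the starting contents of the register change, which is what setting a different reset value does in hardware.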

30. November 29th, 2016 at 05:46 | #30

How can we generate a pseudo-random number of length 79 using an LFSR?

31. November 28th, 2018 at 05:29 | #31

For data 1001 and generator polynomial 1011, I got a sequence of CRC_output values like 7, 2, 1, 0, 6, 4, 3. The actual output for this should be 110. How do I know which one is the CRC? Could someone please explain? Thanks in advance.
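
For reference, the expected value 110 can be reproduced with textbook modulo-2 long division: append as many zeros as the degree of the generator, then XOR the generator in wherever the leading bit is 1. This is a plain software sketch with a made-up function name; it does not model the clock-by-clock CRC_output sequence of the generated hardware.

```python
def crc_remainder(data_bits: str, generator: str) -> str:
    """Modulo-2 long division of data (padded with zeros) by the generator."""
    degree = len(generator) - 1
    work = [int(b) for b in data_bits] + [0] * degree
    gen = [int(b) for b in generator]
    for i in range(len(data_bits)):
        if work[i]:
            # Subtract (XOR) the generator aligned at position i.
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-degree:])

remainder = crc_remainder("1001", "1011")
print(remainder)
```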

33. January 24th, 2019 at 08:56 | #33

Hi Evgeni,

Many thanks for your work on this topic – but I need you to check it for me.

I put your generated VHDL into SIMULINK for the polynomial X^7 + X^6 + 1 (which is used in SONET/SDH) with 4-bit data.
I take it that with additive (side) PRBS scramblers, the scrambler and the descrambler are the same circuit, unlike multiplicative scramblers, where the data is fed in and out at opposite ends of the delay chain.

So I put a scrambler in series with a descrambler, i.e. 2x the same implementation of your VHDL,
but only 1 bit worked; the other 3 remained scrambled.

I can implement and try the VHDL, but that will take me a bit longer; I just need to create a test bench to drive it.
I can send you a picture of the SIMULINK circuit I constructed as an email attachment so you can see what I have done,
or even the SIMULINK .SLX file in case you have SIMULINK or maybe a free SIMULINK viewer.
It appears to be quite simple to construct a circuit model in SIMULINK from your VHDL (i.e. I think my circuit is right),
which is why I am wondering how these parallel forms of the polynomial are generated.
Is there a book, paper, or standard maths that describes converting the LFSR to a parallel implementation?

I don’t understand how you generate the terms for a parallel scrambler architecture;
it doesn’t seem to match what I see in the textbook “high speed serdes devices and applications”, page 142.
Actually, yours looks more correct; theirs looks oversimplified.

See the following link and arrow to page 142.

If you engage with me on this, I will buy your book 🙂 and our company could offer you some consultancy.
We are using MATLAB/SIMULINK for a software-defined radio project. The Scrambler block is not supported
for automatic HDL coding in SIMULINK, so we’re creating the scrambler as a circuit of XORs and D-types that will code.

I can get additive (Fibonacci) and Galois forms working for bit-serial LFSRs,
but so far, no joy in turning things into parallel.

Also, I don’t quite know why I can’t just split the input stream (say 1 byte wide) into 8 bit-serial scramblers
and then recombine it the same way after the descramblers.
It perhaps uses more hardware resource, but overall not much hardware resource either way.
If I did use multiple bit-serial scramblers, my XOR gates would reduce in fan-out and depth,
allowing faster operation.

Best Regards,
Mike Brewin
Design Engineer, ECS, England.

34. January 24th, 2019 at 20:38 | #34

Hi Mike,

Thanks,
Evgeni

35. July 24th, 2019 at 12:31 | #35

Hi Evgeni,

I implemented your algorithm and generated CRC-64-ECMA for a single bit and for 256 bits. I made a testbench in which I gave a data input of 0 to both systems. In the serial implementation, I get different CRC values as the clock keeps advancing. I do not get why that happens. And if so, what is the final value of my CRC? Also, the serial and parallel CRCs do not seem to match in any clock cycle. What am I doing wrong? Could you please email me at your convenience at rucha_95@yahoo.com

36. July 24th, 2019 at 13:58 | #36

Hi Rucha,

There are several reasons why serial and parallel implementations might not match. One is incorrect bit and byte order.
I think most of the questions in this thread deal with the exact same issue. You might want to review it first.

Thanks,
Evgeni

37. July 31st, 2019 at 13:22 | #37

Hi,

I have the parallel implementation, and my CRC is generated in the first clock for some input data. Now I want to test the correctness of the data: take the data, append the CRC at the end, and XOR this with the CRC that we received. The result should be zero. How do I go about implementing this?

38. July 31st, 2019 at 15:09 | #38

Hi Rucha,

The result of the CRC check depends on the way the CRC is appended to the original data. But in most cases it’s not zero for a reason: to detect trailing zero sequences in the data. The value you need to compare against is usually called the “magic number”.

Thanks,
Evgeni
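
As a software illustration of the magic number, here is a sketch using the standard CRC-32 from Python's zlib module (not the hardware from this site). It assumes the CRC is appended least significant byte first, which is the usual convention for this reflected CRC; with a different append order the constant residue does not appear. The function names are made up for this example.

```python
import struct
import zlib

def append_crc32(payload: bytes) -> bytes:
    """Return the payload followed by its CRC-32, least significant byte first."""
    return payload + struct.pack("<I", zlib.crc32(payload))

def check_value(frame: bytes) -> int:
    """CRC-32 computed over the whole frame, including the appended CRC."""
    return zlib.crc32(frame)

MAGIC = 0x2144DF1C  # constant result for any intact frame (after the final inversion)

for payload in (b"", b"hello", b"\x00" * 16):
    frame = append_crc32(payload)
    print(hex(check_value(frame)), check_value(frame) == MAGIC)
```

The receiver runs the same CRC over data plus CRC and compares against the constant instead of zero. The raw register value before the final inversion is the complement of this constant, 0xDEBB20E3.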

39. September 4th, 2019 at 01:14 | #39

Hi Evgeni,

Thanks for sharing the knowledge.
I have seen CRC implementations based on a mask table. Is there such a way of implementing it? If so, which method is better for implementing CRC-16: the polynomial or the mask table?
Also, can I get a reference for the mask table implementation procedure?

Thanks,
M.M.Naveen Kumar

40. January 3rd, 2020 at 13:10 | #40

Hi Evgeni,

I have been using your parallel CRC generator and I am getting correct output when I compare it to a serial CRC generator. I notice, however, that the results I’m getting match what is listed as CRC-32 (MPEG-2), which means the input and output are not reflected. CRC-32 (JAMCRC) uses the same polynomial and initial value, but it has reflected input and output. I’ve tried swapping the bit order of my incoming data and of the output CRC, but I am not able to reproduce the serial result for this version. Do you know if it is possible to generate this output using this calculator, or does it require different logic for the data to be operated on?

Thanks,
John

41. January 4th, 2020 at 00:20 | #41

Hi.
First of all, your RTL code is a wonderful resource.
When I use your code to generate a 16-bit CRC (CCITT) with poly x16+x12+x5+x0 and compare it with the result at https://crccalc.com/,
it shows that the result of your code (poly x16+x12+x5+x0) belongs to the CRC-16/CCITT-FALSE algorithm,
whilst my reference book shows the result for the CRC-16/MCRF4XX algorithm, which has the same poly 0x1021 and init 0xFFFF.
I am wondering whether you have any code for this situation.
Please, this would help me a lot.
Thanks
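
The two algorithm names differ only in bit reflection, so one can be derived from the other in software. The sketch below is a plain bitwise model, not the generated RTL: it computes the non-reflected CRC-16/CCITT-FALSE and then obtains CRC-16/MCRF4XX by reversing the bits of every input byte and of the 16-bit result. Function names are made up for this illustration.

```python
def crc16_ccitt_false(data: bytes, init: int = 0xFFFF) -> int:
    """Non-reflected CRC-16, poly 0x1021, bytes processed MSB first."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits."""
    return int(f"{value:0{width}b}"[::-1], 2)

def crc16_mcrf4xx(data: bytes) -> int:
    """Same polynomial and init, but reflected input bytes and reflected output."""
    reflected_input = bytes(reflect(b, 8) for b in data)
    return reflect(crc16_ccitt_false(reflected_input), 16)

print(hex(crc16_ccitt_false(b"123456789")), hex(crc16_mcrf4xx(b"123456789")))
```

In hardware the equivalent would be to reverse the bit order within each input byte before the CRC logic and to reverse the bits of the final CRC, leaving the XOR equations themselves unchanged.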

42. January 5th, 2020 at 18:10 | #42

Hi John,
I did a quick search for JAMCRC. It says that it’s simply the bitwise-not of the standard CRC-32, whatever you call “reflected output”.
If that’s the case, it’s trivial to change the generated output by adding “~” (bitwise-not).
Thanks,
Evgeni
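
A software sketch of how the variants relate. The standard CRC-32 (as in Python's zlib) is itself a reflected CRC, so JAMCRC is its bitwise-not, while CRC-32 (MPEG-2) is the non-reflected form. Starting from a non-reflected implementation, JAMCRC can be reached by reversing the bits within each input byte and then reversing the 32-bit result. The function names are made up for this example, and the bitwise loop is a reference model rather than the generated code.

```python
import zlib

POLY = 0x04C11DB7

def crc32_mpeg2(data: bytes, init: int = 0xFFFFFFFF) -> int:
    """Non-reflected CRC-32, bytes processed MSB first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc

def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits."""
    return int(f"{value:0{width}b}"[::-1], 2)

def jamcrc_via_zlib(data: bytes) -> int:
    """JAMCRC as the bitwise-not of the standard (reflected) CRC-32."""
    return ~zlib.crc32(data) & 0xFFFFFFFF

def jamcrc_via_mpeg2(data: bytes) -> int:
    """JAMCRC from the non-reflected CRC: reflect each byte in, reflect 32 bits out."""
    reflected_input = bytes(reflect(b, 8) for b in data)
    return reflect(crc32_mpeg2(reflected_input), 32)

sample = b"123456789"
print(hex(crc32_mpeg2(sample)), hex(jamcrc_via_zlib(sample)), hex(jamcrc_via_mpeg2(sample)))
```

Note that the input reflection is per byte. With a 32-bit data bus, reversing all 32 bits of a word also reverses the byte order within the word, which may be one reason a straightforward bit swap does not reproduce the serial result.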

43. January 14th, 2020 at 03:49 | #43

HI,

I am trying to generate the scrambler for 1+x^43 as *User define* with the poly mentioned. However, the generated code does not seem to do anything at all, since lfsr_q *NEVER* changes regardless of what the inputs are.

So if the start seed is all F’s, then the output is just an inversion of the input, nothing else.

Any comment on that?

Thanks,
Tony

44. March 30th, 2020 at 07:15 | #44

Hi Evgeni,
I need to design a parallel CRC generator with a 32-bit data width and a 20-bit polynomial. I tried your CRC generator with the parameters I mentioned, but if my message word also has 32 bits and I need to compute the checksum of one word per cycle, the lfsr_q register must be re-initialized for every word, right? I mean, I don’t have to assign it the lfsr_c value on every clock cycle; rather, I have to assert reset to 1 on every cycle.
On the other hand, I was reading the Campobello paper that you referenced in your article, and it says that for this method the number of bits of the message (k) and the degree of the polynomial (m) must be multiples of the number of bits to be processed in parallel (w). But in my case I have k=w=32 and m=20, so that condition is not met. I tried anyway, asserting reset to 1 on every clock cycle as I mentioned above, and it seems to work. So I’m not sure what is going on.
I hope you can clarify my doubts.
Greetings,
Diego

46. March 30th, 2020 at 10:44 | #46

Hi Diego,

The 20-bit CRC output becomes the LFSR initialization value for the next clock. In the very first clock, the LFSR is typically initialized with all-1s. But that’s not always the case. Some protocols might use a different value.

Thanks,
Evgeni

47. March 31st, 2020 at 08:01 | #47

@Evgeni
I think I understand now: the checksum bits are computed over a frame, not over a single word.
On the other hand, I’m not sure if I’m misunderstanding it, but there is a restriction in Campobello’s paper that says that the degree of the generator polynomial (m) and the length of the message to be processed (k) must both be multiples of the number of bits to be processed in parallel (w). So, according to that, I cannot design a CRC generator with m=20 and w=32, is that right? Regardless of that restriction, I tried anyway with the polynomial 0xC1ACF and data_width=32, and it seems to work.

48. March 31st, 2020 at 09:29 | #48

Hi Diego,

You can have a CRC and data of any width, with any polynomial; there is absolutely no restriction. But only a certain class of polynomials has the desired error detection properties. If your CRC and data are too wide, you’d have performance issues (clock frequency in FPGA or ASIC). Also, if the CRC is too wide, it’s going to increase the communication overhead. A parallel CRC is exactly the same as a serial 1-bit CRC with the loop unrolled.

Thanks,
Evgeni
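
The "unrolled loop" view can be sketched in software for the m=20, w=32 case from this exchange. The serial loop is run once on symbols instead of numbers, which yields one XOR equation per CRC bit; evaluating those equations in a single step then gives the same result as 32 serial steps. Treating 0xC1ACF as the 20 low-order taps, the MSB-first data order, and all names are assumptions made for this illustration.

```python
import random

WIDTH, DATA_WIDTH, TAPS = 20, 32, 0xC1ACF  # how 0xC1ACF maps to taps is assumed

def serial_step(crc: int, word: int) -> int:
    """Numeric reference: DATA_WIDTH single-bit LFSR steps, MSB first."""
    for i in range(DATA_WIDTH - 1, -1, -1):
        feedback = ((crc >> (WIDTH - 1)) ^ (word >> i)) & 1
        crc = (crc << 1) & ((1 << WIDTH) - 1)
        if feedback:
            crc ^= TAPS
    return crc

def unroll():
    """Run the same loop on symbols; each CRC bit becomes a set of XOR terms."""
    state = [{("q", i)} for i in range(WIDTH)]
    for i in range(DATA_WIDTH - 1, -1, -1):
        feedback = state[WIDTH - 1] ^ {("d", i)}  # set XOR cancels repeated terms
        shifted = [set()] + state[:-1]
        state = [s ^ feedback if (TAPS >> k) & 1 else s
                 for k, s in enumerate(shifted)]
    return state

EQUATIONS = unroll()

def parallel_step(crc: int, word: int) -> int:
    """Evaluate the unrolled XOR equations in a single step."""
    values = {"q": crc, "d": word}
    result = 0
    for k, terms in enumerate(EQUATIONS):
        out_bit = 0
        for name, i in terms:
            out_bit ^= (values[name] >> i) & 1
        result |= out_bit << k
    return result

random.seed(1)
trials = [(random.getrandbits(WIDTH), random.getrandbits(DATA_WIDTH)) for _ in range(200)]
print(all(parallel_step(c, w) == serial_step(c, w) for c, w in trials))
print("terms in bit 0:", len(EQUATIONS[0]))
```

Nothing in this construction requires the data width to be related to the CRC width, which is consistent with the m=20, w=32 design working in practice.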

49. March 31st, 2020 at 10:22 | #49

@Evgeni
Understood, thank you very much Evgeni!
Greetings

50. August 22nd, 2020 at 13:02 | #50

The generators have been invaluable to me. It’s a terrific resource!

The website is definitely running, but it looks like the generators aren’t running. I specify the parameters and in step 2 I only get the message:
Specify polynomial coefficients