Book: 100 Power Tips for FPGA Designers

This book is a collection of articles on various aspects of FPGA design: synthesis, simulation, porting ASIC designs, floorplanning and timing closure, design methodologies, performance, area and power optimizations, RTL coding, IP core selection, and many others.

The book is intended for system architects, design engineers, and students who want to improve their FPGA design skills. Both novice and seasoned logic and hardware engineers can find bits of useful information.

This book is written by a practicing FPGA logic designer and contains many illustrations, code examples, and scripts. Rather than providing information applicable to all FPGA vendors, this edition focuses on the Xilinx Virtex-6 and Spartan-6 FPGA families. Code examples are written in Verilog HDL.

Download excerpt from the book
Download source code, projects, and scripts

Paperback edition on Amazon.com, Amazon.de, and Amazon.co.uk

Number of pages: 474
Publisher: CreateSpace

Kindle edition on Amazon.com

The book can be read in color on a PC or Mac using the free Kindle for PC or Kindle for Mac application.
It can also be read on an iPhone or iPad using the free Kindle for iPhone or Kindle for iPad application.

Readers based in India can purchase the book on Flipkart.com

Chinese-speaking readers can purchase the book on PHEI

Google eBook edition
The book can be read in color on a PC, Mac, or tablet/iPad. An extensive preview is available.

ePub edition on Barnes and Noble
The book can also be read using the free Nook for PC or Adobe Digital Editions applications, or on other eReaders that support the ePub format.

Any questions, comments, or suggestions about the book are welcome.

  1. Divya Patel
    June 3rd, 2011 at 04:52 | #1

    I have a question about the performance results of the different counters in your book. Can you elaborate on why the LFSR counter is slower than the binary counter?

  2. June 3rd, 2011 at 05:38 | #2

    Hi Divya,

    The reason is that there is a comparator on the output of the LFSR. Without the comparator, the LFSR counter becomes a shift register and will have higher performance.
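
    To make the comparator's role concrete, here is a minimal sketch of an LFSR counter. It is not the code from the book: the module name, the 8-bit width, the tap positions (a commonly listed maximal-length set for 8 bits), the seed, and the terminal value are all assumptions chosen for illustration.

```verilog
// Sketch of an 8-bit LFSR counter with a terminal-count comparator.
// Module name, width, taps, SEED and TERMINAL are illustrative assumptions.
module lfsr_counter #(
    parameter [7:0] SEED     = 8'h01,  // must be non-zero: all-zeros locks up an XOR LFSR
    parameter [7:0] TERMINAL = 8'h80   // state treated as "count reached"
) (
    input  wire       clk,
    input  wire       reset,           // synchronous reset
    output reg  [7:0] lfsr,
    output wire       done
);
    // Feedback is a single XOR of the tapped bits.
    // Without the comparator below, this is just a shift register plus one XOR.
    wire feedback = lfsr[7] ^ lfsr[5] ^ lfsr[4] ^ lfsr[3];

    // The comparator: an 8-bit equality check on the LFSR output.
    // Its result feeds back into the register, so it sits in the register-to-register path.
    assign done = (lfsr == TERMINAL);

    always @(posedge clk)
        if (reset || done)
            lfsr <= SEED;                   // restart the sequence
        else
            lfsr <= {lfsr[6:0], feedback};  // shift left, insert feedback bit
endmodule
```

    In this sketch the path from the register through the equality check and back to the register is longer than the shift-and-XOR path alone, which is the effect described above.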


  3. June 7th, 2011 at 07:28 | #3

    A link to German FPGA news site with a comment on the book: http://www.fpga-news.de/2011/06/neues-aus-der-fpga-welt-23/
    Here is the link translating the page into English

  4. June 30th, 2011 at 13:01 | #4


    Can you explain a bit more why “asynchronous reset nets usually have more relaxed timing constraints comparing to the synchronous reset”?

    Thanks for the great book!


  5. June 30th, 2011 at 16:25 | #5


    Designers who use asynchronous resets are usually less concerned about all the registers coming out of reset in the same clock cycle. If they were, they would use a synchronous reset instead.

    For that reason, there is no point in applying tight timing constraints to asynchronous reset nets. In fact, asynchronous reset nets don’t even participate in timing analysis by default. You need to add “ENABLE=reg_sr_o” and “ENABLE=reg_sr_r” constraints (Xilinx UCF format) to include them (the default for Xilinx V6/S6 differs from earlier architectures). And even if you do want to constrain async reset nets, none of the constraints (FROM-TO, MAXDELAY, etc., in Xilinx UCF format) can guarantee a minimum delay, only a maximum. That’s important because in large designs with high-fanout async reset nets, routing delays can reach several ns, and the difference between the longest and shortest reset net can exceed a clock period.


  6. July 9th, 2011 at 11:04 | #6

    I see what you mean. It’s not the asynchronous nature of the reset that makes it more relaxed, it’s just that it isn’t usually included in the static timing analysis. That’s a good point.

    In some FPGA technologies – like Spartan-6 – it’s also technically possible, although not recommended, to use a global clock network to distribute a global reset signal. Do you have any experience with that?


  7. July 9th, 2011 at 11:15 | #7


    Low-skew clock networks can be used not only for resets, but also for other high-fanout control signals. I’ve successfully used clock enables routed as global clocks in large pipelines.


  8. July 25th, 2011 at 16:55 | #8

    Here is a review of the book in EETimes

  9. don jk
    July 27th, 2011 at 01:27 | #9

    Please provide the source code in VHDL too.

  10. July 27th, 2011 at 07:16 | #10

    Unfortunately, VHDL code is not available, only Verilog. Most of the examples are simple enough, and you might want to consider using Verilog to VHDL converters.


  11. Alexey
    July 29th, 2011 at 01:10 | #11


    What is the maximum working frequency you have achieved in practice?

    What is your best result?

    Thank you.

  12. July 29th, 2011 at 07:30 | #12

    Hi Alexey,

    I have worked with designs in which small parts run at a frequency close to the maximum supported by that particular FPGA: 500-550MHz. On average (if it’s correct to apply it in this context), I deal with 200-250MHz frequencies.


  13. vipin
    July 30th, 2011 at 00:57 | #13

    Where can I buy your book in India? I can only pay in Indian currency.

  14. July 30th, 2011 at 02:23 | #14


    Ok, let’s figure that out. A few questions:
    Are you interested in the paperback or the electronic version? Can you pay with PayPal? What websites do you use to buy books and other products?


  15. vipin
    August 1st, 2011 at 08:57 | #15


    I generally use http://www.flipkart.com for any online purchase. Many other Verilog books are listed there. I would love to have a paperback version: a proper paperback, not a scanned one.


  16. vipin
    August 2nd, 2011 at 01:54 | #16


    Nothing would be better than making your book available on Flipkart. Most people here use that site for online purchases because of free shipping and assured genuineness. I feel it will be the best platform if you want to launch your book in a country like India, where design and development work is at an all-time high.
    I do not have a PayPal account. :(

  17. August 4th, 2011 at 19:01 | #17

    A reader has asked about the ASIC gates to Virtex-6 logic cells conversion on page 187: 1 logic cell = 15 ASIC gates.
    The first “official” mention of a Xilinx logic cell to ASIC gate conversion is in the Xilinx Zynq-7000 product brief: http://www.xilinx.com/publications/prod_mktg/zynq7000/Product-Brief.pdf , note 3 on page 2. This number is close to what I’ve seen in ASIC emulation projects using Virtex-5 and Virtex-6 devices.

  18. mathu
    August 4th, 2011 at 20:36 | #18

    Hi sir,
    Can all microprocessor architectures be implemented in an FPGA?

  19. August 4th, 2011 at 20:42 | #19


    Tip #65 in the book discusses some of the processor architectures supported in Xilinx FPGAs – either soft cores or embedded.


  20. August 5th, 2011 at 12:32 | #20

    A reader pointed out the following error. In Tip #36 “FPGA CONFIGURATION” on page 179, the configuration time formula should read:
    Config time = bitstream size / ( clock frequency * data width )
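
    As a quick sanity check of the corrected formula, here is a worked example. The numbers are hypothetical (a 40 Mbit bitstream, a 50 MHz configuration clock, and an 8-bit data bus) and are not taken from the book or from any device datasheet.

```latex
t_{\text{config}}
  = \frac{\text{bitstream size}}{f_{\text{clk}} \times \text{data width}}
  = \frac{40 \times 10^{6}\ \text{bits}}{(50 \times 10^{6}\ \text{Hz}) \times (8\ \text{bits})}
  = 0.1\ \text{s}
```

    With the same hypothetical bitstream and clock, a 1-bit serial interface would take 0.8 s, which shows why the data width belongs in the denominator.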

  21. August 5th, 2011 at 15:20 | #21

    And yet another comment from a reader. On page 69, in the “Inferring Register” section, the code for a register with both sync and async resets should be as follows:

    // Register with asynchronous reset (areset) and synchronous reset (sreset)
    always @( posedge clk , posedge areset )
      if( areset )
        dout <= 1'b1;   // asynchronous reset
      else if( sreset )
        dout <= 1'b0;   // synchronous reset
      else
        dout <= din;

  22. September 20th, 2011 at 07:33 | #22

    Hi Evgeni,

    in Tip #22, when computing the MTBF for a clock-domain crossing, where do you get the values for the T0, T and tau parameters from?


  23. September 20th, 2011 at 20:08 | #23

    Hi Guy,

    I discussed the MTBF formula with an engineer knowledgeable in the subject matter. T0,T, and tau are an example, and don’t necessarily relate to Xilinx FPGAs characteristics.
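
    For readers following along, the commonly cited synchronizer MTBF estimate uses parameters with these names. Whether Tip #22 uses exactly this form is an assumption, and every number below is hypothetical rather than a characteristic of any Xilinx device. Here T is the time available for a metastable output to resolve, tau is the resolution time constant of the flip-flop, T0 is the metastability window, and f_clk and f_data are the sampling clock and data toggle rates.

```latex
\text{MTBF} = \frac{e^{T/\tau}}{T_0 \, f_{\text{clk}} \, f_{\text{data}}}

% Hypothetical values: T = 4 ns, tau = 0.1 ns, T_0 = 0.1 ns,
% f_clk = 100 MHz, f_data = 10 MHz
\text{MTBF} = \frac{e^{40}}{(10^{-10}\ \text{s})(10^{8}\ \text{Hz})(10^{7}\ \text{Hz})}
  \approx \frac{2.35 \times 10^{17}}{10^{5}\ \text{s}^{-1}}
  \approx 2.35 \times 10^{12}\ \text{s}
```

    Because T/tau sits in the exponent, the result is extremely sensitive to tau and T0, which is why device-specific values matter.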


  24. September 21st, 2011 at 00:50 | #24

    @Evgeni I just posted the question on a Xilinx forum (http://forums.xilinx.com/t5/Spartan-Family-FPGAs/Synchronizer-Mean-Time-Between-Failure-MTBF-Estimation/td-p/178822) — let’s see if we can get an answer there.


  25. Raul Huertas
    October 11th, 2011 at 20:35 | #25

    I’ve just received my book, I’m so excited! content looks better than expected :)

  26. November 8th, 2011 at 03:29 | #26

    The book is available on flipkart.com for readers based in India.

  27. dkk
    January 18th, 2012 at 01:13 | #27

    Where can I get an electronic version of the book?

  28. January 18th, 2012 at 02:49 | #28


    There are three options to get an electronic version: Amazon Kindle, Barnes & Noble Nook, and Google eBook. Kindle and Nook versions can be read on a PC, just like a PDF.


  29. Vincent Mirian
    March 8th, 2012 at 09:46 | #29

    Hi Evgeni,

    I read about the primitive component CFGLUT5 in section 80 of the book. Can you provide more detail on the example with the pattern_matcher?

    Are there other uses for this component? Do the tools leverage this component in synthesis optimization or in some other way, and if so, how? Any help would be appreciated.

    Thank you in advance,

  30. March 8th, 2012 at 11:23 | #30

    Hi Vincent,

    The idea is to reconfigure a LUT with a pattern using the CFGLUT5 primitive, instead of the more “traditional” approach of storing mask, match, and data in registers and using regular LUTs to do the comparison. That way the circuit uses far fewer logic resources.

    CFGLUT5 is a fairly new Xilinx primitive that became available starting from Virtex-5 chips. As far as I know, current synthesis and p&r tools don’t automatically infer it, if that’s what you’re asking. And it’s not a trivial thing to do, because of the way the primitive is programmed.

    Another use of this primitive could be a small re-programmable ROM.
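
    The idea of a LUT whose contents are reloaded at run time can be sketched behaviorally. This is not the CFGLUT5 primitive and not the pattern_matcher code from the book: the module and port names are hypothetical, and the shift order is a choice made for this model, not a statement about how the real primitive is programmed.

```verilog
// Behavioral sketch of a run-time reconfigurable 5-input LUT.
// NOT the Xilinx CFGLUT5 primitive; names and shift order are assumptions.
module reconfig_lut5 (
    input  wire       clk,
    input  wire       cfg_en,   // when high, shift one configuration bit in per clock
    input  wire       cfg_in,   // serial configuration data
    input  wire [4:0] addr,     // 5-bit value to look up (e.g. data being matched)
    output wire       match     // table entry selected by addr
);
    // LUT contents: one bit for each of the 32 possible 5-bit input values.
    reg [31:0] table_bits = 32'h0;

    // Loading a complete new table takes 32 clock cycles.
    // The first bit shifted in ends up in table_bits[31].
    always @(posedge clk)
        if (cfg_en)
            table_bits <= {table_bits[30:0], cfg_in};

    // The lookup itself is a single 32:1 selection with no separate
    // mask, match, or data registers.
    assign match = table_bits[addr];
endmodule
```

    In this model a "don't care" bit in the pattern needs no mask register: it is expressed by setting the table entries for both values of that input bit. The 32-cycle load is the price paid for the smaller area.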


  31. Vincent Mirian
    March 10th, 2012 at 15:02 | #31

    Hi Evgeni,

    Thank you for your reply. I looked into the example further. I have a few questions which I hope you can answer or guide me to the solution:

    1- Example 2 has a one-bit output. Example 1 using the CFGLUT5 has a 32-bit output. Would we need additional LUTs to reduce the output to one-bit?

    2- To load the CFGLUT5, would we need 32 CLK cycles? If so, the trade-off between the examples is speed vs. area?

    3- This is a minor note, but I calculate 240 flip flops (not 180). Did I miss something?

    Thanks again for the book… soo much knowledge ;-)

  32. March 12th, 2012 at 03:56 | #32

    Hi Vincent,

    You’re right about 1 and 2. But even with LUTs to reduce to 1-bit output, CFGLUT5 utilizes less area.
    Data, data mask, and pattern match require 80 registers each. So it’s 180 in total. You can look into the pattern_match.v RTL and its MAP report in the accompanying source code.


    April 14th, 2012 at 22:41 | #33

    hello everyone,

    I am new to this book, so I would appreciate everyone’s help with any questions. Thank you.

  34. Rick
    May 9th, 2012 at 13:17 | #34

    Just got the book … it looks awesome.

    Is there a consolidated errata list available for download that I can print out and keep with the book?


  35. May 9th, 2012 at 14:13 | #35

    Hi Rick,

    There isn’t much errata, and it’s all in the comments.


  36. Arjun BK
    June 21st, 2012 at 01:50 | #36

    I am trying to implement a ping-pong (dual) buffer. Can you help me out?

  37. hewraz
    August 5th, 2012 at 12:54 | #37

    Dear Evgeni
    I have some experience in FPGA design with VHDL, but now I want to start designing with SystemC for Xilinx FPGAs.
    Which tools do you recommend, and do you think they lead to appropriate practical results on Xilinx FPGAs?


  38. August 6th, 2012 at 07:45 | #38

    Hi Hewraz,

    Xilinx tools – ISE, Vivado – are good for whatever you want to do.


  39. Ash
    August 20th, 2012 at 20:38 | #39

    Hi Evgeni !

    I am interested in getting an e-book version. However, this seems not applicable in South Asia (Singapore).. Please guide me on how to get an e-version in Singapore!

    Thanks !

  40. August 20th, 2012 at 21:09 | #40

    Hi Ash,

    There are three options to get an electronic version: Amazon, Barnes & Noble, or Google Books.
    One or more of those options must be working in Singapore.


  41. David Li
    October 4th, 2012 at 03:40 | #41

    Hi Evgeni,

    Could you give me a free copy of the book, as I will teach an MSc-level FPGA module in spring 2013? I’m happy to include your book in the reading list of the course.



  42. October 4th, 2012 at 09:37 | #42

    Hi David,

    Is a soft copy OK (Kindle or PDF format), or do you want a paperback?


  43. October 4th, 2012 at 09:40 | #43


    1. The short answer: the same way as regular modules. IP cores can be packaged as Verilog/VHDL source code or as a post-synthesis netlist, and can be encrypted or not.
    2. Using custom bridge logic.


  44. KingNDD
    October 5th, 2012 at 08:29 | #44

    Regarding the MicroBlaze processor, can you please give a brief description of its various interfaces?
    DPLB: Data interface, Processor Local Bus
    DOPB: Data interface, On-chip Peripheral Bus
    DLMB: Data interface, Local Memory Bus (BRAM only)
    IPLB: Instruction interface, Processor Local Bus
    IOPB: Instruction interface, On-chip Peripheral Bus
    ILMB: Instruction interface, Local Memory Bus (BRAM only)

  45. Salman
    May 15th, 2013 at 09:29 | #45

    I want to use an FPGA as a switch for controlling the speed of an induction motor, and I need help.

  46. PRABHU
    June 23rd, 2013 at 04:18 | #46

    I want to use an FPGA with a decoder. The decoder may be connected to an RF receiver. I want to know how the FPGA works with these components.

  47. June 23rd, 2013 at 07:04 | #47


    Whether you can use an FPGA or not depends on the interface of that RF receiver chip.


  48. nilesh
    July 4th, 2013 at 02:30 | #48

    Hello Evgeni,
    Can you email me the e-book in PDF format at my address (shilankarnilesh09@gmail.com)?

  49. July 5th, 2013 at 10:39 | #49

    Hi Nilesh,

    Thanks for your interest in my book. The book can only be purchased in paperback or Kindle format.


  50. Sam Reaves
    August 21st, 2013 at 17:50 | #50


    I am trying to implement a Synchronous Decade counter with CE in and CE out in a Spartan 3 device that will have a 50% duty cycle. Do you know where I can find a schematic or code for such a device? I have the counter working and all of the outputs seem as they should even on the target FPGA but when I cascade two counters only the first one in a chain has the proper outputs the others do not clock. Any suggestions would be greatly appreciated.

