New Embedded Solution for Debugging FPGAs

Exostiv Labs recently announced that its EXOSTIV solution for Intel FPGAs will be available in December 2016. Providing up to 200,000 times more visibility on an FPGA than other solutions, EXOSTIV enables the debugging and verification of FPGA board prototypes at speed of operation. It provides extended visibility on internal nodes over long periods of time with minimal impact on the FPGA resources. Thus, you can discover issues related to complex interactions between numerous IPs when simulation is impracticable.

EXOSTIV for Intel FPGAs will be released in December 2016 with support for Arria 10 devices first. Pricing starts at $5,100.

Source: Exostiv Labs 

Low Latency 48-Port FPGA Networking Appliance

BittWare and LDA Technologies are collaborating on a low-latency 48-port FPGA networking appliance. The LDA e4 is a 10/25-Gbps-capable FPGA board enclosure that repurposes the serial links on BittWare’s PCIe FPGA boards into high-speed Ethernet ports.

Features, benefits, and specs:

  • 6″ FPGA-to-port trace lengths
  • Layer 1 replication, support for various CPUs and operating systems
  • A high-accuracy clock source enables accurate timestamping
  • Enables out-of-band management and a zero configuration option

Source: BittWare

FPGA Board Support Packages Simplify App Dev

BittWare recently announced the availability of Arria 10 FPGA Board Support Packages (BSPs) for Altera’s OpenCL SDK 16.0.2. With BittWare’s OpenCL BSPs, you can start developing applications for Altera’s Arria 10 1150GX FPGA using OpenCL.

Using OpenCL, you can code your systems and algorithms in a high-level C-based framework and directly create FPGA programming files from a pure software development flow. The applications are endless, from use in data centers to defense/aerospace systems.

BittWare’s Arria 10 BSPs are well suited for acceleration applications such as machine learning. The High Performance Computing (HPC) BSP is the traditional OpenCL model, using a host that moves data to the accelerator system over PCI Express (PCIe). The BSP platform is the standard platform for OpenCL accelerators. In addition, BittWare can provide custom BSPs specifically tailored to your requirements.

BittWare offers an OpenCL Developer’s Bundle comprising a low-profile Arria 10 1150GX FPGA-based PCIe board, BittWorks Lite II software tools, Altera’s OpenCL SDK, and Altera’s Quartus II. You can also get the Developer’s Bundle with a Stratix V board.

The Arria 10 OpenCL Bundle and BSP are currently available. Contact BittWare for pricing.

Source: BittWare

Software-Programmable FPGAs

Modern workloads demand higher computational capabilities at low power consumption and cost. As traditional multi-core machines do not meet the growing computing requirements, architects are exploring alternative approaches. One solution is hardware specialization in the form of application-specific integrated circuits (ASICs) to perform tasks at higher performance and lower power than software implementations. The cost of developing custom ASICs, however, remains high. Reconfigurable computing fabrics, such as field-programmable gate arrays (FPGAs), offer a promising alternative to custom ASICs. FPGAs couple the benefits of hardware acceleration with flexibility and lower cost.

FPGA-based reconfigurable computing has recently taken the spotlight in academia and industry, as evidenced by Intel’s high-profile acquisition of Altera and Microsoft’s recent announcement that it will deploy thousands of FPGAs to speed up Bing search. In the coming years, we should expect hardware/software co-designed systems supported by reconfigurable computing to become common. Conventional RTL design methodologies, however, cannot productively manage the growing complexity of the algorithms we wish to accelerate using FPGAs. Consequently, FPGA programmability is a major challenge that must be addressed both technologically, by leveraging high-level software abstractions (e.g., languages and compilers), run-time analysis tools, and readily available libraries and benchmarks, and scholastically, through the education of rising hardware/software engineers.

Recent efforts related to software-programmable FPGAs have focused on designing high-level synthesis (HLS) compilers. Inspired by classical C-to-gates tools, HLS compilers automatically transform programs written in traditional untimed software languages to timed hardware descriptions. State-of-the-art HLS tools include Xilinx’s Vivado HLS (C/C++) and SDAccel (OpenCL) as well as Altera’s OpenCL SDK. Although HLS is effective at translating C/C++ or OpenCL programs to RTL hardware, compilers are only a part of the story in realizing truly software-programmable FPGAs.

Efficient memory management is central to software development. Unfortunately, unlike traditional software programming, current FPGA design flows require application-specific memories to sustain high-performance hardware accelerators. Features such as dynamic memory allocation, pointer chasing, complex data structures, and irregular memory access patterns are also ill-supported by FPGAs. In lieu of basic software memory abstraction techniques, experts must design custom hardware memories. More extensible software memory abstractions would instead facilitate software programmability of FPGAs.

In addition to high-level programming and memory abstractions, run-time analysis tools such as debuggers and profilers are essential to software programming. Hardware debuggers and profilers in the form of hardware/software co-simulation tools, however, are not ready for tackling exascale systems. In fact, one of the biggest barriers to realizing software-programmable FPGAs is the hours, even days, it takes to generate bitstreams and run hardware/software co-simulators. Lengthy compilation and simulation times cause debugging and profiling to consume the majority of FPGA development cycles and deter agile software development practices. The effect is compounded when FPGAs are integrated into heterogeneous systems with CPUs and GPUs over complex memory hierarchies. New tools, following architectural simulators, may aid in rapidly gathering performance, power, and area utilization statistics for FPGAs in heterogeneous systems. Another solution to long compilation and simulation times is using overlay architectures. Overlay architectures mask the FPGA’s bit-level configurability with a fixed network of simple processing nodes. The fixed hardware in overlay architectures enables faster programmability at the expense of the finer-grained, bit-level parallelism of FPGAs.

Another key facet of software programming is readily available libraries and benchmarks. Current FPGA development is marred by vendor-specific IP cores that span limited domains. As FPGAs become more software-programmable, we should expect to see more domain experts providing vendor-agnostic FPGA-based libraries and benchmarks. Realistic, representative, and reproducible vendor-agnostic libraries and benchmarks will not only make FPGA development more accessible but also serve as reference solutions for developers.

Finally, the future of software-programmable FPGAs lies not only in technological advancements but also in educating the next generation of hardware/software co-designing engineers. Software engineers are rarely concerned with the downstream architecture except when exercising expert optimizations. Higher-level abstractions and run-time analysis tools will improve FPGA programmability, but developers will still need a working knowledge of FPGAs to design competitive hardware accelerators. Following reference libraries and benchmarks, software engineers must become fluent with the notions of pipelining, unrolling, and partitioning memory into local SRAM blocks and hardened IPs. Terms like throughput, latency, area utilization, power, and cycle time will enter the software engineering vernacular.

Recent advances in HLS compilers have demonstrated the feasibility of software-programmable FPGAs. Now, a combination of higher-level abstractions, run-time analysis tools, libraries and benchmarks must be pioneered alongside trained hardware/software co-designing engineers to realize a cohesive software engineering infrastructure for FPGAs.

Udit Gupta earned a BS in Electrical and Computer Engineering at Cornell University. He is currently studying toward a PhD in Computer Science at Harvard University. Udit’s past research includes exploring software-programmable FPGAs by leveraging intelligent design automation tools and evaluating high-level synthesis compilers with realistic benchmarks. He is especially interested in vertically integrated systems, exploring the computing stack from applications, tools, languages, and compilers to downstream architectures.

New FPGA Board Based on the Xilinx UltraScale VU190 Device

BittWare recently released a new COTS PCIe board based on Xilinx’s 20-nm UltraScale VU190 FPGA. The XUSP3R is a 3/4-length PCIe board that offers up to four Gen3 x8 PCIe interfaces, along with four front panel QSFP28 cages, supporting 16 lanes of 25 Gbps or 4 lanes of 100 Gbps, including 100 GbE. Four DIMM sockets support massive memory configurations including up to 256 GB of DDR4 memory across four 72-bit wide banks.

Alternatively, each DIMM socket can be populated with BittWare’s dual bank QDR DIMMs, each providing 576 Mb of QDR-II+. An optional Hybrid Memory Cube (HMC) module with up to 4 GB is also available that can be populated in addition to, and independent of, the DIMMs. Together, these features make the XUSP3R well suited for a variety of data center and networking applications, including compute acceleration, network processing, cybersecurity, and storage.

The board also offers features and tools for simplified development and integration. A comprehensive Board Management Controller (BMC) with host software support for advanced system monitoring simplifies platform management. A complete software tool suite and FPGA development/project examples are also available.

The XUSP3R’s features and specs:

  • High-performance Xilinx Virtex UltraScale 190/160/125
  • Up to four independent PCIe Gen3 x8 interfaces
  • Four QSFP28 cages for 4x 100GbE, 16x 25GbE, 4x 40GbE, or 16x 10GbE (or combinations thereof)
  • Four DIMM sites that support DDR4-2133 SDRAM, QDR-IV, and QDR-II+
  • Optional HMC Module (in addition to, and independent of, the DIMM sites)
  • Board Management Controller for Intelligent Platform Management
  • USB 2.0 for programming, debug, or control with optional integrated Platform Cable USB functionality
  • Timestamping and synchronization support
  • Complete software support with BittWare’s BittWorks II Toolkit
  • FPGA development kit for FPGA board support IP and integration

The XUSP3R board is in production and shipping now. Contact BittWare for more details and pricing.

Source: BittWare

New Dev Kit for Xilinx FPGA-Enabled Accelerator Cards

BittWare recently announced upcoming availability of an OpenPOWER CAPI Developer’s Kit for its Xilinx FPGA-enabled accelerator cards. The kit is intended to give you a fast way to connect the Xilinx All Programmable FPGA to a CAPI-enabled IBM POWER8 system.

The kit includes:

  • BittWare XUSP3S FPGA accelerator card, which is a ¾-length PCIe board featuring the Xilinx Virtex UltraScale VU095, four QSFPs for 4× 100 GbE, and flexible memory configurations with up to 64 GB of memory and support for Hybrid Memory Cube (HMC)
  • IBM Power Service Layer (PSL) IP to provide the connection to the POWER8 chip
  • CAPI host support library
  • An example CAPI design


BittWare’s OpenPOWER CAPI Developer’s Kit is scheduled to be available in Q2 2016.

Source: BittWare

An Introduction to Verilog

If you are new to programming FPGAs and CPLDs or looking for a new design language, Kareem Matariyeh has the solution for you. In this article, he introduces you to Verilog. Although the hardware description language has been used in the ASIC industry for years, it has all the tools to help you implement complex designs, such as creating a VGA interface or writing to an Ethernet controller. Matariyeh writes:

Programmable logic has been around for well over two decades. Today, due to larger and cheaper devices on the market, FPGAs and CPLDs are finding their way into a wide array of projects, and there is a plethora of languages to choose from. VHDL is the popular choice outside of the U.S. It is preferred if you need a strongly typed language. However, the focus of this article will be on another popular language called Verilog, which is a hardware description language that is similar to the C language.

Typically, Verilog is used in the ASIC design industry. Companies such as Sun Microsystems, Advanced Micro Devices, and NVIDIA use Verilog to verify and test new processor architectures before committing to physical silicon and post-fab verification. However, Verilog can be used in other ways, including implementing complex designs such as a VGA interface. Another complex design such as an Ethernet controller can also be written in Verilog and implemented in a programmable device.

This article is mostly tailored to engineers who need to learn Verilog and do not know or know little about the language. Those who know VHDL will benefit from reading this article as well and should be able to pick up Verilog fairly quickly after reviewing the example listings and referring to the Resources at the end of the article. This article does not go over hardware, but I have included some links that will help you learn more about how the hardware interacts with this language at the end.


First, it is best to know what variable types are available in Verilog. The basic types available are: binary, integer, and real. Other types are available but they are not used as often as these three. Keep everything in the binary number system as much as possible because type casting can cause post-implementation issues, but not all writers are the same. Binary and integer types have the ability to use other values such as “z” (high impedance) and “x” (unknown, or don’t care). Both are nice to have around when you want a shared bus between designs or a bus to the outside world. Binary types can be assigned by giving an integer value. However, there are times when you want to assign or look at a specific bit. Some of the listings use this notation. In case you are curious, it looks like this: X’wY, where X is the word size, w is the number base (b for binary, h for hex), and Y is the value. Any value without this is considered an integer by default. Keeping everything in binary, however, can become a pain in the neck, especially when dealing with numbers larger than 8 bits.
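The sized-literal notation described above can be sketched as follows. This is a minimal simulation-only example, not one of the article's listings; the module and signal names are made up for illustration.

```verilog
// Sized literals follow the pattern <width>'<base><value>.
// All names here are hypothetical.
module literal_demo;
  reg [7:0] a, b, c, d;
  reg [7:0] bus;

  initial begin
    a   = 8'b1010_0101;  // 8-bit binary; underscores are ignored
    b   = 8'hA5;         // the same value written in hex
    c   = 8'd165;        // the same value written in decimal
    d   = 165;           // unsized: treated as an integer, then cut down to 8 bits
    bus = 8'bzzzz_zzzz;  // every bit set to high impedance

    $display("a=%b b=%h c=%d d=%d", a, b, c, d);
    $display("bus=%b", bus);
    // Bit and part selects look at specific bits of a word
    $display("bit 0 of a = %b, upper nibble of a = %b", a[0], a[7:4]);
  end
endmodule
```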

Table 1 shows some of the variable types that are available in Verilog. Integer is probably the most useful one to have around because it’s 32 bits long and helps you keep track of numbers easily. Note that integer is a signed type but can also be set with all “z” or “x.” Real is not used that much; when it is assigned to an integer or binary type, the number is converted to an integer (the language standard specifies rounding to the nearest integer). It is best to keep this in mind when using the real type, granted it is the least popular compared to binary and integer. When any design is initialized in a simulator, the initial values of a binary and integer are all “x.” Real, on the other hand, is 0.0 because it cannot use “x.” There are other types that are used when interconnecting within and outside of a design. They are included in the table, but won’t be introduced until later.
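The initial values mentioned above can be checked in a simulator with a short sketch like the one below. It is an illustration only; the variable names are arbitrary.

```verilog
// Prints the three basic types before and after assignment.
// Names are hypothetical.
module init_demo;
  reg [3:0] nibble;  // "binary" type: starts as all x
  integer   count;   // 32-bit signed: starts as all x
  real      ratio;   // starts as 0.0 because a real cannot hold x

  initial begin
    $display("before: nibble=%b count=%d ratio=%f", nibble, count, ratio);

    nibble = 4'bzzzz;  // binary types can hold z
    count  = -5;       // integers are signed
    ratio  = 2.5;

    $display("after:  nibble=%b count=%d ratio=%f", nibble, count, ratio);
  end
endmodule
```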

Some, but not all, operators from C are in Verilog. Some of the operators available in Verilog are in Table 2. It isn’t a complete list, but it contains most of the more commonly used operators. Like C, Verilog can understand operations and perform implicit casting (i.e., adding an integer with a 4-bit word and storing it into a binary register or even a real); typically this is frowned on mostly due to the fact that implicit casting in Verilog can open a new can of worms and cause issues when running the code in hardware. As long as casting does not give any erroneous results during an operation, there should be no show-stoppers in a design. Signed operation happens only if integers and real types are used in arithmetic (add, subtract, multiply) operations.
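As a rough illustration of why implicit casting deserves care, the sketch below adds an integer to a 4-bit word and stores the result in registers of two different widths. The names are hypothetical and the example is meant for simulation only.

```verilog
// Mixing an integer with a 4-bit word. The sum is computed at
// integer width, then silently cut down to fit the destination.
module cast_demo;
  integer   i;
  reg [3:0] nibble;
  reg [3:0] small;
  reg [7:0] wide;

  initial begin
    i      = 20;
    nibble = 4'd9;

    wide  = i + nibble;  // the full sum fits in 8 bits
    small = i + nibble;  // only the lower 4 bits of the sum are kept

    $display("wide=%d small=%d", wide, small);
  end
endmodule
```

No warning is required by the language when the upper bits are dropped, which is one way such code can behave differently from what the writer intended.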


In Verilog, designs are called modules. A module defines its ports and contains the implementation code. If you think of the design as a black box, Verilog code typically looks like a black box with the top missing. Languages like Verilog and VHDL encourage black box usage because it can make code more readable, make debugging easier, and encourage code reuse. In Verilog, multiple code implementations cannot have the same module name. This is in stark contrast to VHDL, where architectures can share the same entity name. The only way to get around this in Verilog is to copy a module and rename it.

In Listing 1, a fairly standard shift register inserts a binary value at the end of a byte every clock cycle. If you’re experienced with VHDL, you can see that there aren’t any library declarations. This is mainly due to the fact that Verilog originated from an interpretive foundation. However, there are include directives that can be used to add external modules and features. The first lines after the module statement define the module’s port directions and types with the reserved words input and output. There is another declaration called inout, which is bidirectional but not in the listing. A module’s input and output ports can use integer and real, but binary is recommended if it is a top-level module.
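The article's Listing 1 is not reproduced here. A minimal sketch of a shift register of the kind described, assuming a single-bit input named din and an 8-bit output named q (both names are assumptions), might look like this:

```verilog
// Shifts one new bit into the low end of a byte on every
// rising clock edge. Port names are hypothetical.
module shift_reg (clk, din, q);
  input        clk;  // clock
  input        din;  // bit to insert
  output [7:0] q;    // current contents of the byte

  reg [7:0] q;       // same name as the output port, so the port holds its value

  always @(posedge clk)
  begin
    q <= (q << 1) | din;  // logical left shift, then place din in bit 0
  end
endmodule
```

There is no reset in this sketch, so in simulation q starts as all “x” and becomes fully defined after eight clock edges.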

The reg statement essentially acts like a storage unit. Because it has the same name as the output port it acts like one item. Using reg this way is helpful because its storage ability allows the output to remain constant while system inputs change between clock cycles. There is another kind of statement called wire. It is used to tie more than one module together or drive combinational designs. It will appear in later listings.
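For contrast with reg, here is a small sketch of wire used in a purely combinational design. The module and its ports are invented for illustration.

```verilog
// A wire holds no state; it simply carries whatever drives it.
// Names are hypothetical.
module and_or (a, b, c, y);
  input  a, b, c;
  output y;          // an output is a wire unless declared otherwise

  wire ab;           // internal net between the two gates

  assign ab = a & b; // continuous assignment, re-evaluated when a or b changes
  assign y  = ab | c;
endmodule
```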

The next line of code is the always statement or block. You want to have a begin and end statement for it. If you know VHDL, this is the same as the process statement and works in the same fashion. If you are completely new to programmable logic in general, it works like this: “For every action X that happens on signals indicated in the sensitivity list, follow these instructions.” In some modules, there is usually a begin and an end statement. This is the equivalent of curly braces seen in C/C++. It’s best to use these with decision structures (i.e., always, if, and case) as much as possible.
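The sensitivity list and the begin/end pairs can be seen together in a small sketch such as the following two-input multiplexer. It is an illustrative example with made-up names, not code from the article.

```verilog
// The block runs whenever any signal in the sensitivity list changes.
// Names are hypothetical.
module mux2 (sel, a, b, y);
  input  sel, a, b;
  output y;

  reg y;  // assigned inside an always block, so it must be declared reg

  always @(sel or a or b)
  begin
    if (sel)
    begin
      y = b;
    end
    else
    begin
      y = a;
    end
  end
endmodule
```

Although y is declared as reg, no storage element is implied here because y is assigned on every path through the block.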

Finally, the last statement is a logical left shift operation. Verilog bitwise operators in some instances need the keyword assign for the operation to happen. The compiler will tell you if an assign statement is missing. From there, the code does its insertion operation and then waits for the next positive edge of the clock. This was a pretty straightforward example; unfortunately, it doesn’t do much. The best way to get around that is to add more features using functions, tying-in more modules, or using parameters to increase flexibility.


Tasks and functions make module implementation clearer. Both are best used when redundant code or complex actions need to be split up from the main source. There are some differences between tasks and functions.

A task can call other tasks and functions, while a function can call only other functions. A task does not return a value; it modifies a variable that is passed to it as an output. Passing items to a task is also optional. Functions, on the other hand, must return one and only one value and must have at least one value passed to them to be valid. Tasks are well-suited for test benches because they can hold delay and control statements. Functions, however, have to be able to run within one time unit to work. This means functions should not be used for test benches or simulations that require delays or use sequential designs. Experimenting is a good thing because these constructs are helpful.

There is one cardinal rule to follow when using a function or task. They have to be defined within the module, unlike VHDL where functions are defined in a package to get maximum flexibility. Tasks and functions can be defined in a separate file and then attached to a module with an include statement. This enables you to reuse code in a project or across multiple projects. Both tasks and functions can use types other than binary for their input and output ports, giving you even more flexibility.

Listing 2 contains a function that essentially acts like a basic ALU. Depending on what is passed to the function, the function will process the information and return the calculated integer value. Tasks work in the same way, but the structure is a little different when dealing with inputs and outputs. As I said before, one of the major differences between a task and a function is that the former can have multiple outputs, rather than just one. This gives you the ability to make a task more complicated internally, if need be.
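The article's Listing 2 is not reproduced here. As a hedged sketch of the idea, a function that behaves like a very small ALU could be written as below. The operation encoding and all names are assumptions made for this example.

```verilog
// A function returns exactly one value by assigning to its own name.
// Opcode encoding and names are hypothetical.
module alu_demo (op, a, b, result);
  input  [1:0]  op;
  input  [7:0]  a, b;
  output [31:0] result;

  function integer alu;
    input [1:0] f_op;
    input [7:0] x;
    input [7:0] y;
    begin
      case (f_op)
        2'b00:   alu = x + y;  // add
        2'b01:   alu = x - y;  // subtract
        2'b10:   alu = x & y;  // bitwise AND
        default: alu = x | y;  // bitwise OR
      endcase
    end
  endfunction

  // The function contains no delays, so it can be used in a continuous assignment
  assign result = alu(op, a, b);
endmodule
```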

Listing 3 is an example of a task in action with more than one output. Note how it is implemented the same way as a function. It has to be defined and called within the module in order to work. But rather than define the task explicitly within the module, the task is defined in a separate file and an include directive is added in the module code just to show how functions and tasks can be defined outside of a module and available for other modules to use.
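The article's Listing 3 is not reproduced here. The sketch below shows a task with two outputs, defined inline so that the example is self-contained. The file name in the comment and all identifiers are hypothetical.

```verilog
// A task can have several outputs and may contain delays.
module task_demo;
  reg [7:0] a, b;
  reg [7:0] sum;
  reg       carry;

  // This task could instead live in its own file and be pulled in
  // at this point with an include directive, e.g. a hypothetical
  // file named add_task.v.
  task add_with_carry;
    input  [7:0] x;
    input  [7:0] y;
    output [7:0] s;
    output       c;
    begin
      #1;              // a delay is allowed in a task, but not in a function
      {c, s} = x + y;  // 9-bit result split across the two outputs
    end
  endtask

  initial begin
    a = 8'd200;
    b = 8'd100;
    add_with_carry(a, b, sum, carry);
    $display("sum=%d carry=%b", sum, carry);
  end
endmodule
```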


If too much is added to a module, it can become so large that debugging and editing become a chore. Doing this also minimizes code reuse to the point where new counters and state machines are being recreated when just using small modules/functions from a previous project is more than adequate. A good way to get around these issues is by making multiple modules in the same file or across multiple files and creating an instantiation of that module within an upper-level module to use its abilities. Multiple modules are good to have for a pipelined system. This enables you to use the same kind of module over multiple areas of a system. Older modules can also be used this way so less time is used on constant recreation.

That is the idea of code reuse in a nutshell. Now I will discuss an example of code reuse and multiple modules. The shift register from Listing 1 is having its data go into an even parity generator and the result from both modules is output through the top-level module in Listing 4. All of this is done across multiple files in one listing for easier reading. In all modular designs, there is always a module called a top-level entity, where all of the inputs and outputs of a system connect to the physical world. It is also where lower-level entities are spawned. Subordinates can spawn entities below themselves as well (see Figure 1).
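The article's Listing 4 is not reproduced here. A self-contained sketch of the same arrangement, with a shift register feeding an even parity generator under a top-level module, might look like the following. All module, instance, and port names are assumptions.

```verilog
// Lower-level module 1: shift register (hypothetical names)
module shift_reg (clk, din, q);
  input        clk, din;
  output [7:0] q;
  reg    [7:0] q;

  always @(posedge clk)
  begin
    q <= (q << 1) | din;
  end
endmodule

// Lower-level module 2: even parity generator
module even_parity (data, parity);
  input  [7:0] data;
  output       parity;

  // Reduction XOR is 1 when data holds an odd number of 1s,
  // so data plus the parity bit always has an even number of 1s.
  assign parity = ^data;
endmodule

// Top-level module: its ports connect to the outside world
module top (clk, din, q, parity);
  input        clk, din;
  output [7:0] q;
  output       parity;

  wire [7:0] q;  // net tying the shift register to the parity generator

  shift_reg   u_shift  (.clk(clk), .din(din), .q(q));
  even_parity u_parity (.data(q), .parity(parity));
endmodule
```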

Think of it as a large black box with smaller black boxes connected with wires and those small black boxes have either stuff or even smaller black boxes. Pretty neat, but it can get annoying. Imagine a situation where a memory controller for 10-bit addressing is created and then the address length needs to be extended to 16 bits. That can be a lot of files to go through to change 10 to 16. However, with parameters all that needs to be changed is one value in one file and it’s all done.


Parameters are great to have around in Verilog and can make code reuse even more attractive. Parameters allow words to take the place of a numerical value like #define in C, but with some extra features such as overriding. Parameters can be put in length descriptors, making it easy to change the size of an output, input, or variable. For example, if a VGA generator had a color depth of 8 bits but needed to be changed to 32-bit color depth, then instead of changing the locations where the value occurs, only the value of the parameter would be changed and when the module was recompiled it would be able to display 32-bit color. The same can be done for memory controllers and other modules that have ports, wires, or registers with 1 bit or more in size. Parameters can also be overridden. This is performed just before or when a module is instantiated. This is helpful if the module needs to be the same all the time across separate projects that are using the same source, but needs to be a little different for another project. Parameters can also be used in functions and tasks as long as the parameter is in the same file the implementation code is in. Parameters with functions and tasks give Verilog the flexibility of a VHDL package, granted it really isn’t a package, because the implementation is located in a module and not in a separate construct.
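To make the address-width scenario concrete, here is a minimal sketch of a small memory whose sizes are set by parameters. It is an illustration with made-up names, not a memory controller from the article.

```verilog
// Changing ADDR_WIDTH from 10 to 16 resizes the address port and
// the memory array without touching any other line.
module simple_ram (clk, we, addr, din, dout);
  parameter ADDR_WIDTH = 10;
  parameter DATA_WIDTH = 8;

  input                   clk, we;
  input  [ADDR_WIDTH-1:0] addr;
  input  [DATA_WIDTH-1:0] din;
  output [DATA_WIDTH-1:0] dout;

  reg [DATA_WIDTH-1:0] dout;
  reg [DATA_WIDTH-1:0] mem [0:(1 << ADDR_WIDTH) - 1];  // 2^ADDR_WIDTH words

  always @(posedge clk)
  begin
    if (we)
      mem[addr] <= din;  // write on the clock edge when enabled
    dout <= mem[addr];   // registered read of the addressed word
  end
endmodule
```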

There are many ways to override parameters. One way is by using the defparam keyword, which explicitly changes the value of the parameter in the instantiated module before it is invoked. Another way is by overriding the parameter when the module is being invoked. Listing 5 shows how both are done with dummy modules that already have defined parameters. The defparam method is from an older version of the language, so depending on the version of Verilog being used, make sure to pick the right method.
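The article's Listing 5 is not reproduced here. The two override styles can be sketched with a dummy module as follows. The module names, instance names, and the use of the Verilog-2001 named override form are choices made for this example.

```verilog
// Dummy module with a default parameter value
module dummy (d, q);
  parameter WIDTH = 8;

  input  [WIDTH-1:0] d;
  output [WIDTH-1:0] q;

  assign q = ~d;
endmodule

module override_demo (a16, y16, a32, y32);
  input  [15:0] a16;
  output [15:0] y16;
  input  [31:0] a32;
  output [31:0] y32;

  // Method 1: override the parameter as the module is instantiated
  dummy #(.WIDTH(16)) u16 (.d(a16), .q(y16));

  // Method 2: instantiate with the default, then override with defparam
  dummy u32 (.d(a32), .q(y32));
  defparam u32.WIDTH = 32;
endmodule
```

Keeping the override next to the instantiation, as in the first method, makes it easier to see which width each instance actually uses.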

Evaluation Boards for SuperSpeed USB-to-FIFO Bridge ICs

FTDI recently launched a new family of evaluation/development modules to encourage the implementation of its next-generation USB interfacing technology. Its FT600/1Q USB 3.0 SuperSpeed ICs are in volume production and backed up by the UMFT60XX offering. The family comprises four models that provide different FIFO bus interfaces and data bit widths. With these modules, the operational parameters of FT600/1Q devices can be fully assessed and interfacing with external hardware undertaken, such as FPGA platforms.

At 78.7 mm × 60 mm, the UMFT600A and UMFT601A each have a high-speed mezzanine card (HSMC) interface with 16-bit-wide and 32-bit-wide FIFO buses, respectively. The UMFT600X and UMFT601X measure 70 mm × 60 mm and incorporate field-programmable mezzanine card (FMC) connectors with 16-bit-wide and 32-bit-wide FIFO buses, respectively.

The HSMC interface is compatible with most Altera FPGA reference design boards, while the FMC connector delivers the same functionality in relation to Xilinx boards. Fully compatible with USB 3.0 SuperSpeed (5 Gbps), USB 2.0 High Speed (480 Mbps), and USB 2.0 Full Speed (12 Mbps) data transfer, the UMFT60XX modules support two parallel slave FIFO bus protocols with an achievable data burst rate of around 400 MBps. The multi-channel FIFO mode can handle up to four logic channels. It is complemented by the 245 synchronous FIFO mode, which is optimized for more straightforward operation.

Source: FTDI

Encapsulated 80-A Digital Power Module for FPGAs, Processors, & Memory

Intersil Corp. recently announced the industry’s first 80-A fully encapsulated digital DC/DC PMBus power module that provides point-of-load (POL) conversions for advanced FPGAs, DSPs, ASICs, processors, and memory. The ISL8273M is a complete step-down regulated power supply that delivers up to 80-A output current and operates from industry-standard 5- or 12-V input power rails. Multiphase current sharing of up to four ISL8273M power modules enables you to create a 320-A solution with output voltages as low as 0.6 V. The compact (18 mm × 23 mm) ISL8273M provides high power density and performance for increasingly space-constrained data center equipment and wireless communications infrastructure systems.

The ISL8273M digital power module leverages a patented ChargeMode control architecture that delivers superior efficiencies, with up to 94% peak efficiency and better than 90% efficiency on most conversions. It also provides a single clock cycle fast transient response to output current load steps common in FPGAs and DSPs processing power bursts.

The 80-A ISL8273M further distances itself from competitive digital power modules by delivering 2× higher output current. Its proprietary High Density Array (HDA) package offers unmatched electrical and thermal performance through a single-layer conductive package substrate that reduces lead inductance and dissipates heat primarily through the system board.

Key specs and features:

  • 80-A digital switch mode power supply with current sharing, multiphase and multi-module support for up to 320-A power rails
  • Wide input voltage range from 4.5 to 14 V and programmable Vout from 0.6 to 2.5 V
  • PMBus-enabled solution for full system configuration, telemetry, and monitoring of all conversions and operating parameters
  • Up to 94% peak conversion efficiency with 1% output voltage accuracy
  • Single clock cycle transient response
  • Programmable Vout, soft-start, soft-stop, sequencing, margining and under-voltage, over-voltage, under-current, over-current, under-temperature and over-temperature
  • Monitors Vin, Vout, Iout, temperature, duty cycle, switching frequency, power good and faults
  • Internal nonvolatile memory saves module configuration parameters and fault logging
  • Compact, thermally-enhanced high density array (HDA) package simplifies thermal management, solution positioning and PCB routing

The ISL8273M, available now in a thermally enhanced 18 mm × 23 mm × 7.5 mm HDA package, costs $69 in 1,000-piece quantities. The ISL8273MEVAL1Z 80-A digital module evaluation board is available to speed time-to-market and priced at $89.

Source: Intersil Corp.

New Arria 10 Boards Target Cyber/Security, SigInt, & Acceleration

BittWare recently announced two new boards in its Altera Arria 10 FPGA product roadmap to complement its existing Arria 10 3U VPX and PCIe offerings: A10PED and A10XM4.

The A10PED is a dual Arria 10 PCIe full-length Gen3 x16 card that supports either the 660 or 1150 KLE size FPGAs (GX), with one supporting an optional SoC (SX) with dual ARM. Primarily targeting signal and network packet processing applications, the board provides 28 lanes of serial I/O up to 10.325 Gbps each, with support for high-accuracy time stamping. Featuring 4x 260-pin DDR4 SODIMMs and a Hybrid Memory Cube (HMC), the A10PED will support up to 68 GB of memory with a peak aggregate memory bandwidth of over 175 GB/sec (not including I/O or PCIe). For latency-sensitive applications, some or all of the DDR4 SODIMMs can be replaced with proprietary QDR-II/IV SRAM SODIMMs. These memory options, coupled with full support for Altera’s OpenCL tools, also make this board compelling for acceleration and co-processing applications.

The A10XM4 Arria 10 XMC (VITA 42) Module provides network interface (NIC) and cyber/security capabilities in addition to host/carrier acceleration for applications in radar, EW, networking, and SigInt. In addition, it will support full conduction cooling. Compatible with any standard XMC carrier, the A10XM4 features an Arria 10 GX FPGA with two lanes of 10 GigE, along with up to 16 GB of memory and a PCIe Gen3 x8 interface to the host. BittWare’s NIC application example and OpenCL BSP will greatly simplify the integration and development of cyber/security additions to and off-loading of standard host applications.

The A10PED full-length PCIe board will be available in Q4 2015 and the A10XM4 XMC board will be available in Q1 2016. Contact BittWare for configurations, pricing, and details.

Source: BittWare

Radiation-Tolerant FPGA Kit

Microsemi recently announced the availability of the RTG4 FPGA Development Kit for high-bandwidth space applications. The innovative kit provides space designers an evaluation and development platform for applications such as data transmission, serial connectivity, and more.

The development kit provides all the necessary reference material to evaluate and adopt RTG4 technology quickly. You don’t need to build a test board and assemble the device onto the board. The RTG4 Development Kit is ideal for evaluating and designing for remote sensing space payloads, radar and imaging, and spectrometry. Other applications include mobile satellite services (MSS) communication satellites, high-altitude aviation, medical electronics, and civilian nuclear power plant control.

RTG4 FPGAs feature reprogrammable flash configuration, which makes prototyping easier. Reprogrammable flash technology offers complete immunity to radiation-induced configuration upsets in the harshest radiation environments, without the configuration scrubbing required with SRAM FPGA technology. RTG4 supports space applications requiring up to 150,000 logic elements and up to 300 MHz of system performance.

The RTG4 Development Kit’s features and specs:

  • One RT4G150 device in a ceramic package with 1,657 pins
  • Two 1GB DDR3 synchronous dynamic random access memory (SDRAM)
  • 2GB SPI flash memory
  • PCI Express Gen 1 interface
  • One pair SMA connectors for testing of the full-duplex SERDES channel
  • Two FMC connectors with HPC/LPC pinout for expansion
  • RJ45 interface for 10/100/1000 Ethernet
  • USB micro-AB connector
  • Embedded Flashpro5 programmer and external programming header
  • Current measurement test points

The RTG4 Development Kit features an RT4G150 device offering more than 150,000 logic elements in a ceramic package with 1,657 pins. Kits are available now for purchase.

Source: Microsemi

FPGA-Based Storage Reference Design Doubles NAND Flash Life

Altera Corp. recently developed a storage reference design based on its Arria 10 SoCs that doubles the life of NAND flash. In addition, it can increase the number of program-erase cycles by up to 7×. The design features an Arria 10 SoC with an integrated dual-core ARM Cortex A9 processor in an optimized, single-chip solution. It uses a Mobiveil SSD controller and NVMdurance NAND optimization software. This reference design provides improved performance and flexibility in NAND utilization while reducing the cost of the NAND array by increasing the lifetime of data center equipment.

Mobiveil’s controller supports multi-core architectures, enabling threads to run on each core with their own queue and interrupt without any locks required. NVMdurance’s NAND flash optimization software monitors the NAND flash’s condition and automatically adjusts the control parameters in real time. The reference design also features end-to-end data protection, encryption and compression, and optimizes throughput and power consumption, all in a small silicon footprint.

Altera’s NAND storage reference design is available today.

Source: Altera Corp.

ZestET2-NJ Gigabit Ethernet FPGA Module

Orange Tree Technologies recently launched the ZestET2-NJ high-performance Gigabit Ethernet FPGA module, which comprises a Gigabit Ethernet processing engine, Xilinx Artix-7 FPGA, DDR3 memory, and general-purpose I/O. Delivering the maximum sustained Ethernet bandwidth of over 100 MBps in both directions simultaneously, it is aimed at data acquisition and control applications in markets such as industrial vision, radar, sonar, and medical imaging.

The Xilinx Artix-7 XC7A35T FPGA, which has more than 33,000 logic cells, 1.8 Mb of Block RAM, and 90 DSP slices, is tightly coupled with 512 MB of 400-MHz DDR3 SDRAM, giving it an ample memory bandwidth of 1.6 GBps for high-speed processing and formatting of streaming data. With ease of integration in mind, there are 105 FPGA I/O pins available for connection to the user’s equipment.
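As a rough check on the 1.6-GBps figure, the sketch below computes the theoretical peak bandwidth of a double-data-rate memory from its clock rate and bus width. The 16-bit bus width is an assumption, since the announcement does not state it, and the function name is purely illustrative. Sustained bandwidth in a real design will be lower because of refresh and bus turnaround.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Theoretical peak bandwidth of a DDR memory interface in MB/s.
 * DDR moves data on both clock edges, so there are two transfers per
 * clock cycle, each carrying bus_bits / 8 bytes.
 * Illustrative helper; not taken from any vendor library.
 */
static uint32_t ddr_peak_mbytes_per_s(uint32_t clock_mhz, uint32_t bus_bits)
{
    assert(bus_bits % 8u == 0u);  /* whole bytes per transfer */
    return clock_mhz * 2u * (bus_bits / 8u);
}
```

With a 400-MHz clock and the assumed 16-bit bus, ddr_peak_mbytes_per_s(400, 16) returns 1600, that is, 1.6 GBps.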

Orange Tree’s proprietary GigEx chip handles the entire TCP/IP stack at over 100 MBps in each direction simultaneously. It enables the User FPGA to be dedicated entirely to the application for maximum efficiency. The module measures just 40 × 50 mm, making it ideal for integration into your products.

Source: Orange Tree Technologies

USB-to-FPGA Communications: A Case Study of the ChipWhisperer-Lite

Sending data from a computer to an FPGA is often required. This might be FPGA configuration data, register settings, or streaming data. An easy solution is to use a USB-connected microcontroller instead of a dedicated interface chip, which allows you to offload certain tasks into the microcontroller.

In Circuit Cellar 299 (June 2015), Colin O’Flynn writes:

Often your FPGA-based project will require computer communication and some housekeeping tasks. A popular solution is the use of a dedicated USB interface chip, and a soft-core processor in the FPGA for housekeeping tasks.

For an open-source hardware project I recently launched, I decided to use an external USB microcontroller instead of a dedicated interface chip. I suspect you’ll find a lot of useful design tidbits you can use for yourself—and, because it’s open source, getting details of my designs doesn’t involve industrial espionage!

The design is called the ChipWhisperer-Lite (see Photo 1). This device is a training aid for learning about side-channel power analysis of cryptographic implementations. Side-channel power analysis uses measurements of small power variations during execution of the cryptographic algorithms to break the implementation of the algorithm.

Photo 1: This shows the ChipWhisperer-Lite, which contains a Xilinx Spartan 6 LX9 FPGA and Atmel SAM3U2C microcontroller. The remaining circuitry involves the power supplies, ADC, analog processing, and a development device which the user programs with some cryptographic algorithm they are analyzing.

In a previous article, “Build a SoC Over Lunch” (Circuit Cellar 289, 2014), I made the case for using a soft-core processor in an FPGA. In this article I’ll play the devil’s advocate by arguing that using an external microcontroller is a better choice. Of course, the truth lies somewhere in between: in this example, the requirement of having a high-speed USB interface makes an external microcontroller more cost-effective, but this won’t always be the case.

This article assumes you require computer communication as part of your design. There are many options for this. The easiest from a hardware perspective is to use a USB-Serial converter, and many projects use such a system. The downside is a fairly slow interface, and the requirement of designing a serial protocol.

A more advanced option is to use a USB adapter with a parallel interface, such as the FTDI FT2232H. These can achieve very high-speed data rates, basically up to the limit of the USB 2.0 interface. The downside is that this option still requires you to implement some protocol on your FPGA for many applications, and it offers few extra features (for example, if you need housekeeping tasks).

The solution I came to is the use of a USB microcontroller. They are widely available from most vendors with USB 2.0 high-speed (full 480 Mbps data rate) interfaces, and allow you to handle not only the USB interface, but also the various housekeeping tasks that your system will require. The USB microcontroller will also likely be around the same price as (or possibly cheaper than) the equivalent specialized interface chip.

When selecting a microcontroller, I recommend finding one with an external memory bus interface. This external memory bus is normally designed to allow you to map devices such as SRAM or DRAM into the memory space of the microcontroller. In our case we’ll actually be mapping FPGA registers into the microcontroller memory space, which means we don’t need any protocol for communication with the FPGA.


Figure 1: This figure shows the basic connections used for memory-mapping the FPGA into the microcontroller memory space. Depending on your requirements, you can add some additional custom lines, such as a flag to indicate different FPGA register banks to use, as only a 9-bit address bus is used in this example.

I selected an Atmel SAM3U2C microcontroller, which has a USB 2.0 high-speed interface. This microcontroller is low-cost and available in TQFP package, which is convenient if you plan on hand assembling prototype boards. The connections between the FPGA and microcontroller are shown in Figure 1.

On the FPGA, it is easy to map this data bus into registers. This means that to configure some feature in the FPGA, you can just directly write into a register. Or if you are transferring data, you can read from or write to a block-RAM (BRAM) implemented in the FPGA.
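To make the memory-mapping idea concrete, here is a minimal C sketch of how microcontroller firmware might access FPGA registers through the external memory bus. It assumes an 8-bit data bus alongside the 9-bit address bus mentioned in Figure 1. The register offsets (REG_CTRL, BRAM_PORT), the FPGA_BASE_ADDR macro, and the helper names are all hypothetical and are not taken from the ChipWhisperer-Lite firmware. On real hardware you would define FPGA_BASE_ADDR as the chip-select window address from your microcontroller's datasheet. Without it, a plain array stands in for the FPGA so the code can be built and tried on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of the FPGA's window in the MCU memory space. */
enum {
    FPGA_ADDR_BITS  = 9,                    /* 9-bit address bus */
    FPGA_WINDOW_LEN = 1 << FPGA_ADDR_BITS,  /* 512 byte-wide locations */
    REG_CTRL        = 0x000,                /* hypothetical control register */
    BRAM_PORT       = 0x100                 /* hypothetical start of a BRAM region */
};

#ifdef FPGA_BASE_ADDR
/* Target build: the window sits at a fixed address on the external bus. */
static volatile uint8_t *const fpga = (volatile uint8_t *)FPGA_BASE_ADDR;
#else
/* Host build: an ordinary array simulates the FPGA address space. */
static volatile uint8_t fpga_sim[FPGA_WINDOW_LEN];
static volatile uint8_t *const fpga = fpga_sim;
#endif

/* Write one FPGA register. 'volatile' keeps the compiler from
 * optimizing away or reordering the bus access. */
static inline void fpga_write(uint16_t addr, uint8_t value)
{
    assert(addr < FPGA_WINDOW_LEN);
    fpga[addr] = value;
}

/* Read one FPGA register. */
static inline uint8_t fpga_read(uint16_t addr)
{
    assert(addr < FPGA_WINDOW_LEN);
    return fpga[addr];
}

/* Copy a run of consecutive FPGA locations (for example, BRAM contents)
 * into a local buffer. */
static void fpga_read_block(uint16_t addr, uint8_t *dst, uint16_t len)
{
    assert((uint32_t)addr + len <= FPGA_WINDOW_LEN);
    for (uint16_t i = 0; i < len; i++) {
        dst[i] = fpga[addr + i];
    }
}
```

Configuring a feature then becomes a single call such as fpga_write(REG_CTRL, 0x01), with no packet framing or command parsing on either side. In the host simulation a read returns whatever was last written. On real hardware, what a read returns depends on how the FPGA design implements that register.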

New High-Performance VC Z Series Cameras

Vision Components recently announced the availability of its new intelligent camera series VC Z. The embedded systems offer real-time image processing suitable for demanding high-speed and line scan applications. All models are equipped with Xilinx’s Zynq module, which combines a dual-core ARM Cortex-A9 running at 866 MHz with an integrated FPGA.

The new camera is based on the board camera series VCSBC nano Z. With a footprint of 40 × 65 mm, these compact systems are especially easy to integrate into machines and plants. They are optionally available with one or two remote sensor heads and thus suitable for stereo applications. You can choose between two enclosed camera types: the VC nano Z, which has housing dimensions of 80 × 45 × 20 mm, and the VC pro Z, which measures 90 × 58 × 36 mm and can be fitted with a lens and integrated LED illumination. The new operating system VC Linux ensures optimal interaction between hardware and software.

Source: Vision Components