FPGA Partial Reconfiguration

Many field-programmable gate array (FPGA) design modules have parameters, such as particular clock and I/O drive settings, that can only be adjusted during implementation, not at runtime. However, being able to make such adjustments in an FPGA design during operation is convenient.

In Circuit Cellar’s June issue, columnist Colin O’Flynn addresses how to use partial reconfiguration (PR) to sidestep such restrictions. Using difference-based PR, you can adjust digital clock module (DCM) attributes, I/O drive strength, and even look-up-table (LUT) equations, he says. O’Flynn’s article describes how he used PR on a Xilinx Spartan-6 FPGA to solve a specific problem. The article excerpt below elaborates:

Perfect Timing
The digital clock module (DCM) in a Xilinx Spartan-6 FPGA has a variety of features, including the ability to add an adjustable phase shift onto an input clock.

There are two types of phase shifts: fixed and variable. A fixed shift can vary from approximately –360° to 360° in 1.4° steps. A variable shift enables shifting over a smaller range, which is approximately ±5 nS in 30-ps steps.

Note the actual range and step size varies considerably for different operating conditions. Hence the problem: the provided variable phase shift interface is only useful for small phase shifts; any major phase shift must be fixed at design time.

To demonstrate how PR can be used to fix the problem, I’ll generate a design that implements a DCM block and use PR to dynamically reconfigure the DCM.

Figure 1—The system’s general block diagram is shown. A digital clock module (DCM) block synthesizes a new clock from the system oscillator and then outputs the clock to an I/O pin. The internal configuration access port (ICAP) interface is used to load configuration data. The serial interface connects the ICAP interface to a computer via a first in, first out (FIFO) buffer.

Figure 1—The system’s general block diagram is shown. A digital clock module (DCM) block synthesizes a new clock from the system oscillator and then outputs the clock to an I/O pin. The internal configuration access port (ICAP) interface is used to load configuration data. The serial interface connects the ICAP interface to a computer via a first in, first out (FIFO) buffer.

Streaming Bits
I used Xilinx’s ISE design suite to generate a design (see Figure 1). I did the usual step of creating the entire FPGA bitstream, which could then be programmed into the FPGA. The FPGA bitstream is essentially a completely binary blob tells you nothing about your design. The “FPGA native circuit description (NCD)” file is one step above the FPGA bitstream. The NCD file contains a complete description of the design mapped to the blocks available in your specific chip with all the routing of signals between those blocks and useful net names.

The NCD file contains enough information for you to do some edits to the FPGA design. Then you can use the NCD file to generate a new binary blob (bitstream) for programming. A critical part of PR is understanding that when you download this new blob, you can only download the difference between the original file and the new file. This is known as “difference-based PR,” and it is the only type of PR I’ll be discussing.

So what’s in the bitstream? The bitstream actually contains several different commands sent to the FPGA device, which instructs the FPGA to load configuration information, tells it the address information where the data is going to be loaded, and finally sends the actual data to load. Given a bitstream file, you can actually parse this information out. I’ve posted a Python script on ProgrammableLogicInPractice.com that does this for the Spartan-6 device.

A frame is the smallest portion of an FPGA that can be configured. The frame’s size varies per device. (For the Spartan-6 I used in this article, it is 65 × 16-bit words, or 1,040 bits per frame.) You must reload the entire frame if anything inside it changes, which brings me to the first “gotcha.” When using PR, everything inside that “frame” will be reloaded (i.e., parts of your design that haven’t changed may become temporarily invalid because they share a configuration frame with the part of your design that has changed).

O’Flynn’s full article goes on to explain more about framing, troubleshooting challenges along the way, and completing the reconfiguration. The article is available in the June issue, now available for membership download or single-issue purchase.

To further assist readers, O’Flynn has posted more information on the website that complements his monthly Circuit Cellar column, including a video of his project running.

“You can also see an example of how I integrated PR into my open-source ChipWhisperer project, which uses PR to dynamically program a phase shift into the DCMs inside the FPGA,” he says.

All-Programmable SoC Solution

Anyone creating a complex, powerful digital design may want to turn to a single device that integrates high-speed processing and programmable logic.

In Circuit Cellar’s April issue, columnist Colin O’Flynn explores using the Xilinx Zynq  Z-7020 All Programmable SoC (system-on-a-chip) as part of the Avnet ZedBoard development board.

“I used a Xilinx Zynq SoC device, although Altera offers several flavors of a similar device (e.g., the Cyclone V SoC, the Arria V SoC, and the Arria 10 SoC), and Microsemi offers the SmartFusion2 SoC FPGA,” O’Flynn says in his article. “The Xilinx and Altera devices feature a dual-core ARM Cortex-A9 processor, whereas the Microsemi devices feature a less powerful Cortex-M3 processor. You may not need a dual-core A9 processor, so ‘less powerful’ may be an advantage.”

While O’Flynn’s article introduces the ZedBoard, he notes many of its specifics also apply to the MicroZed board, a less expensive option with a smaller SoC. Xilinx’s Zynq device has many interesting applications made highly accessible through the ZedBoard and MicroZed boards, he says.

O’Flynn’s discussion of the Zynq SoC device includes the following excerpt. (The April issue, which includes O’Flynn’s full article, is available for membership download or single-issue purchase.)

WHERE’S THE BEEF?
Originally, I had planned to describe a complete demo project in this article. I was going to demonstrate how to use a combination of a custom peripheral and some of the hard cores to stream data from a parallel ADC device into DDR memory. But there wasn’t enough room to introduce the tools and cover the demo, so I decided to introduce the Zynq device (using the ZedBoard).

A demo project is available at ProgrammableLogicInPractice.com. Several tutorials for the Zynq device are available at xilinx.com and zedboard.org, so there isn’t any point in duplicating work! I’ve linked to some specific tutorials from the April 2014 post on ProgrammableLogicInPractice.com. Photo 1 shows the hardware I used, which includes a ZedBoard with my custom OpenADC board connected through the I/O lines.

An Avnet ZedBoard is connected to the OpenADC. The OpenADC provides a moderate-speed ADC (105 msps), which interfaces to the programmable logic (PL) fabric in Xilinx’s Zynq device via a parallel data bus. The PL fabric then maps itself as a peripheral on the hard-core processing system (PS) in the Zynq device to stream this data into the system DDR memory.

Photo 1: An Avnet ZedBoard is connected to the OpenADC. The OpenADC provides a moderate-speed ADC (105 msps), which interfaces to the programmable logic (PL) fabric in Xilinx’s Zynq device via a parallel data bus. The PL fabric then maps itself as a peripheral on the hard-core processing system (PS) in the Zynq device to stream this data into the system DDR memory.

FPGA PROCESSOR DESIGN 101
Even if you’re experienced in FPGA design, you may not have used Xilinx tools for processor-specific design. These tools include the Xilinx Platform Studio (XPS) and the Xilinx Software Development Kit (SDK). Before the advent of hard-core processors (e.g., Zynq), there have long existed soft-core processors, including the popular Xilinx MicroBlaze soft processor. The MicroBlaze system is completely soft core, so you can use the XPS tool to define the peripherals you wish to include. For the Zynq device, several hard-core peripherals are always present and you can choose to add additional soft-core (i.e., use the FPGA fabric) peripherals.

In a future article I will discuss different soft-core processor options, including some open-source third-party ones that can be programmed from the Arduino environment. For now, I’ll examine only the Xilinx tools, which are applicable to the Zynq device, along with the MicroBlaze core.

The ARM cores in the Zynq device are well suited to run Linux, which gives you a large range of existing code and tools to use in your overall solution. If you don’t need those tools, you can always run on “bare metal” (e.g., without Linux), as the tools will generate a complete base project for you that compiles and tests the peripherals (e.g., printing “Hello World” out the USART). To give you a taste of this, I’ve posted a demo video of bringing up a simple “Hello World” project in both Linux and bare metal systems on ProgrammableLogicInPractice.com.

The FPGA part of the Zynq device is called the programmable logic (PL) portion. The ARM side is called the processing system (PS) portion. You will find a reference to the SoC’s PL or PS portion throughout most of the tutorials (along with this article), so it’s important to remember which is which!

For either system, you’ll be starting with the XPS software (see Photo 2). This software is used to design your hardware platform (i.e., the PL fabric), but it also gives you some customization of the PS hard-core peripherals.

This is the main screen of the Xilinx Platform Studio (XPS) when configuring a Zynq design. On the left you can see the list of available soft-core peripherals to add to the design. You can configure any of the hard-core peripherals by choosing to enable or disable them, along with selecting from various possible I/O connections. Additional screens (not shown) enable you to configure peripherals addressing information, configure I/O connections for the soft-core peripherals, and connect peripherals to various available extension buses.

Photo 2: This is the main screen of the Xilinx Platform Studio (XPS) when configuring a Zynq design. On the left you can see the list of available soft-core peripherals to add to the design. You can configure any of the hard-core peripherals by choosing to enable or disable them, along with selecting from various possible I/O connections. Additional screens (not shown) enable you to configure peripherals addressing information, configure I/O connections for the soft-core peripherals, and connect peripherals to various available extension buses.

MAKING THE CONNECTION
For example, clicking on the list of hard-core peripherals opens the options dialogue so you can enable or disable each peripheral along with routing the I/O connections. The ZedBoard’s Zynq device has 54 multipurpose I/O (MIO) lines that can be used by the peripherals, which are split into two banks. Each bank can use different I/O standards (e.g., 3.3 and 1.5 V).

Enabling all the peripherals would take a lot more than 54 I/O lines. Therefore, most of the I/O lines share multiple functions on the assumption that every peripheral doesn’t need to be connected. Many of the peripherals can be connected to several different I/O locations, so you (hopefully) don’t run into two peripherals needing the same I/O pin.

Almost all of the peripheral outputs can be routed to the PL fabric as well under the name EMIO, which is a dedicated 64-bit bus that connects to the PL fabric. If you simply wish to get more I/O pins, you can configure these extra pins from within XPS. But you can also use this EMIO bus to control existing cores in your FPGA fabric using peripherals on the Zynq device.

Assume you had an existing FPGA design where you had an FPGA core doing some processing connected to a microcontroller or computer via I2C, SPI, or serial. You could simply connect this core to the appropriate PS peripheral and port the existing code onto the Zynq processor by changing the low-level calls to use the Zynq peripherals. You may eventually wish to change this interface to the peripheral bus, the AMBA Advanced eXtensible Interface (AXI), for better performance. However, using standard peripherals to interface to a PL design can still be useful for many cores for which you have extensive existing code.

The MIO/EMIO pins can even be used in a bit-banging fashion, so if you need a special device or core control logic, it’s possible to quickly develop this in software. You can then move to a hardware peripheral for considerably better performance.

O’Flynn’s article goes on to discuss in greater detail the internal buses, peripherals, and taking a design from hardware to software. For more, refer to Circuit Cellar‘s  April issue and related application notes posted at O’Flynn’s companion site ProgrammableLogicInPractice.com.

Programmable Logic Video Lessons

Interested in learning more about programmable logic? You’re in luck. Colin O’Flynn’s first article in his “Programmable Logic in Practice” column appears in Circuit Cellar’s October 2013 issue. To accompany his work, Colin is producing informative videos for you to view after reading his articles.

In the first video, Colin covers the topic of adding the Xilinx ChipScope ILA/VIO core using automatic and manual insertion with ISE.

 

Since 2002, Circuit Cellar has published several of O’Flynn’s articles. O’Flynn  is an engineer and lecturer at Dalhousie University in Halifax, Nova Scotia. He earned a Master’s in applied science from Dalhousie and pursued further graduate studies in cryptographic systems. Over the years, he has developed a wide variety of skills ranging from electronic assembly (including SMDs) to FPGA design in Verilog and VHDL to high-speed PCB design.

New CC Columnist to Focus on Programmable Logic

We’d like to introduce you to Colin O’Flynn, who will begin writing a bimonthly column titled “Programmable Logic In Practice” for Circuit Cellar beginning with our October issue.

Colin at his workbench

You may have already “met.” Since 2002, Circuit Cellar has published five articles from this Canadian electrical engineer, who is also a lecturer at Dalhousie University in Halifax, Nova Scotia, and a product developer.

Colin has been fascinated with embedded electronics since he was a child and his father gave him a few small “learn to solder” kits. Since then, he has constructed many projects, earned his master’s in applied science from Dalhousie, pursued graduate studies in cryptographic systems, and become an engineering consultant. Over the years, he has developed broad skills ranging from electronic assembly (including SMDs), to FPGA design in Verilog and VHDL, to high-speed PCB design.

And he likes to share what he knows, which makes him a good choice for Circuit Cellar.

Binary Explorer

One of his most recent  projects was a Binary Explorer Board, which he developed for use  in teaching a digital logic course at Dalhousie. It fulfilled his (and his students’) need for a simple programmable logic board with an integrated programmer, several switches and LEDs, and an integrated breadboard. He is working to develop the effective and affordable board into a product.

In the meantime, he is also planning some interesting column topics for Circuit Cellar.

He is interested in a range of possible topics, including circuit board layout for high-speed FPGAs; different methods of configuring an FPGA; design of memory into FPGA circuits;

Colin’s LabJack-based battery tester

use of tools such as Altera’s OpenCV libraries to design programmable logic using C code; use of vendor-provided and open-source soft-core microcontrollers; design of a PCI-Express interface for your FPGA; and addition of a USB 3.0 interface to your FPGA.

That’s just a short list reflecting his interest in programmable logic technologies, which have become increasingly popular with engineers and designers.

To learn more about Colin’s interests, check out our February interview with him, his YouTube channel of technical videos, and, of course, his upcoming columns in Circuit Cellar.

 

 

The Future of FPGAs (CC 25th Anniversary Preview)

Field-programmable gate arrays (FPGAs) have been around for more than two decades. What does the future hold for this technology? According to Halifax, Canada-based electrical engineering consultant Colin O’Flynn, current FPGA-related research and recent innovations seem to presage a coming revolution in digital system design, and this could lead to striking fast advances in several fields of engineering.

In the upcoming Circuit Cellar 25th Anniversary Issue—which is slated for publication in early 2013—O’Flynn shares his thoughts on the future of FPGA technology. He writes:

Field-programmable gate arrays (FPGAs) provide a powerful means to design digital systems (see Photo 1). Rather than writing a software program, you can design a number of hardware blocks to perform your tasks at blazing speeds…

Photo 1: Source: C. O’Flynn, CC 25th Anniversary issue

Microcontrollers have long played the peripheral game: the integration of easy-to-use dedicated peripherals onto the same physical chip as your digital core. FPGAs, it would seem, have no use for dedicated logic, since you can just design everything exactly as you desire. But dedicated logic has its advantages.

Beyond technical advantages, such as lower power consumption or smaller area with dedicated cores compared to programmable cores, dedicated cores can also reduce development effort. For example, current technology sees FPGAs with integrated high-end ARM cores, capable of running Linux on the integrated hard-core. Anyone familiar with setting up Linux on an ARM-based microprocessor can use this, without needing to learn about how one develops cores and peripherals on the FPGA itself.
Beyond integrating digital cores to simplify development, you can expect to see the integration of analog peripherals. Looking at the microcontroller market, you can find a variety of tightly integrated SoC devices with analog and digital on a single device. For instance, a variety of radio devices contain a complete RF front-end combined with a digital microcontroller. While current FPGA devices offer very limited analog peripherals (most have none), having a FPGA with an integrated high-speed ADC or DAC would be the making of a highly flexible radio-on-a-chip platform. The high development cost and lack of a current market has meant this remains only an interesting idea. To see where this market comes from, let’s look at some applications for such an FPGA.

Software-Defined Radio
Software-defined radio (SDR) takes a curious approach to receiving radio waves: digitize it all, and let software sort it out. The radio front-end is simple. Typically, the center frequency of interest is just downshifted to the baseband, everything else is filtered out, and a high-speed ADC digitizes it. All the demodulation and decoding then can be down in software. Naturally, this can require some fast sampling speeds. Anything from 20 to 500 MSps is fairly typical for these systems. Dealing with this much data is suited to FPGAs, since one can generate blocks to perform all the different functions that operate simultaneously…

Circuit Cellar’s Circuit Cellar 25th Anniversary Issue will be available in early 2013. Stay tuned for more updates on the issue’s content.

Q&A: Hai (Helen) Li (Academic, Embedded System Researcher)

Helen Li came to the U.S. from China in 2000 to study for a PhD at Purdue University. Following graduation she worked for Intel, Qualcomm, and Seagate. After about five years of working in industry, she transitioned to academia by taking a position at the Polytechnic Institute of New York University, where she teaches courses such as circuit design (“Introduction to VLSI”), advanced computer architecture (“VLSI System and Architecture Design”), and system-level applications (“Real-Time Embedded System Design”).

Hai (Helen) Li

In a recent interview Li described her background and provided details about her research relating to spin-transfer torque RAM-based memory hierarchy and memristor-based computing architecture.

An abridged version of the interview follows.

NAN: What were some of your most notable experiences working for Intel, Qualcomm, and Seagate?

HELEN: The industrial working experience is very valuable to my whole career life. At Seagate, I led a design team on a test chip for emerging memory technologies. Communication and understanding between device engineers and design communities is extremely important. The joined effects from all the related disciplines (not just one particular area anymore) became necessary. The concept of cross layers (including device/circuit/architecture/system) cooptimization, and design continues in my research career.

NAN: In 2009, you transitioned from an engineering career to a career teaching electrical and computer engineering at the Polytechnic Institute of New York University (NYU). What prompted this change?

HELEN: After five years of working at various industrial companies on wireless communication, computer systems, and storage, I realized I am more interested in independent research and teaching. After careful consideration, I decided to return to an academic career and later joined the NYU faculty.

NAN: How long have you been teaching at the Polytechnic Institute of NYU? What courses do you currently teach and what do you enjoy most about teaching?

HELEN: I have been teaching at NYU-Poly since September 2009. My classes cover a wide range of computer engineering, from basic circuit design (“Introduction to VLSI”), to advanced computer architecture (“VLSI System and Architecture Design”), to system-level applications (“Real-Time Embedded System Design”).

Though I have been teaching at NYU-Poly, I will be taking a one-year leave of absence from fall 2012 to summer 2013. During this time, I will continue my research on very large-scale integration (VLSI) and computer engineering at University of Pittsburgh.

I enjoy the interaction and discussions with students. They are very smart and creative. Those discussions always inspire new ideas. I learn so much from students.

Helen and her students are working on developing a 16-Kb STT-RAM test chip.

NAN: You’ve received several grants from institutions including the National Science Foundation and the Air Force Research Lab to fund your embedded systems endeavors. Tell us a little about some of these research projects.

HELEN: The objective of the research for “CAREER: STT-RAM-based Memory Hierarchy and Management in Embedded Systems” is to develop an innovative spin-transfer torque random access memory (STT-RAM)-based memory hierarchy to meet the demands of modern embedded systems for low-power, fast-speed, and high-density on-chip data storage.

This research provides a comprehensive design package for efficiently integrating STT-RAM into modern embedded system designs and offers unparalleled performance and power advantages. System architects and circuit designers can be well bridged and educated by the research innovations. The developed techniques can be directly transferred to industry applications under close collaborations with several industry partners, and directly impact future embedded systems. The activities in the collaboration also include tutorials at the major conferences on the technical aspects of the projects and new course development.

The main goal of the research for “CSR: Small Collaborative Research: Cross-Layer Design Techniques for Robustness of the Next-Generation Nonvolatile Memories” is to develop design technologies that can alleviate the general robustness issues of next-generation nonvolatile memories (NVMs) while maintaining and even improving generic memory specifications (e.g., density, power, and performance). Comprehensive solutions are integrated from architecture, circuit, and device layers for the improvement on the density, cost, and reliability of emerging nonvolatile memories.

The broader impact of the research lies in revealing the importance of applying cross-layer design techniques to resolve the robustness issues of the next-generation NVMs and the attentions to the robust design context.

The research for “Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques” was inspired by memristors, which have recently attracted increased attention since the first real device was discovered by Hewlett-Packard Laboratories (HP Labs) in 2008. Their distinctive memristive characteristic creates great potentials in future computing system design. Our objective is to investigate process-variation aware memristor modeling, design methodology for memristor-based computing architecture, and exploitation of circuit techniques to improve reliability and density.

The scope of this effort is to build an integrated design environment for memristor-based computing architecture, which will provide memristor modeling and design flow to circuit and architecture designers. We will also develop and implement circuit techniques to achieve a more reliable and efficient system.

An electric car model controlled by programmable emerging memories is in the developmental stages.

NAN: What types of projects are you and your students currently working on?

HELEN: Our major efforts are on device modeling, circuit design techniques, and novel architectures for computer systems and embedded systems. We primarily focus on the potentials of emerging devices and leveraging their advantages. Two of our latest projects are a 16-Kb STT-RAM test chip and an electric car model controlled by programmable emerging memories.

The complete interview appears in Circuit Cellar 267 (October 2012).

CC266: Microcontroller-Based Data Management

Regardless of your area of embedded design or programming expertise, you have one thing in common with every electronics designer, programmer, and engineering student across the globe: almost everything you do relates to data. Each workday, you busy yourself with acquiring data, transmitting it, repackaging it, compressing it, securing it, sharing it, storing it, analyzing it, converting it, deleting it, decoding it, quantifying it, graphing it, and more. I could go on, but I won’t. The idea is clear: manipulating and controlling data in its many forms is essential to everything you do.

The ubiquitous importance of data is what makes Circuit Cellar’s Data Acquisition issue one of the most popular each year. And since you’re always seeking innovative ways to obtain, secure, and transmit data, we consider it our duty to deliver you a wide variety of content on these topics. The September 2012 issue (Circuit Cellar 266) features both data acquisition system designs and tips relating to control and data management.

On page 18, Brian Beard explains how he planned and built a microcontroller-based environmental data logger. The system can sense and record relative light intensity, barometric pressure, relative humidity, and more.

a: This is the environmental data logger’s (EDL) circuit board. b: This is the back of the EDL.

Data acquisition has been an important theme for engineering instructor Miguel Sánchez, who since 2005 has published six articles in Circuit Cellar about projects such as a digital video recorder (Circuit Cellar 174), “teleporting” serial communications via the ’Net (Circuit Cellar 193), and a creative DIY image-processing system (Circuit Cellar 263). An informative interview with Miguel begins on page 28.

Turn to page 38 for an informative article about how to build a compact acceleration data acquisition system. Mark Csele covers everything you need to know from basic physics to system design to acceleration testing.

This is the complete portable accelerometer design. with the serial download adapter. The adapter is installed only when downloading data to a PC and mates with an eight pin connector on the PCB. The rear of the unit features three powerful
rare-earth magnets that enable it to be attached to a vehicle.

In “Hardware-Accelerated Encryption,” Patrick Schaumont describes a hardware accelerator for data encryption (p. 48). He details the advanced encryption standard (AES) and encourages you to consider working with an FPGA.

This is the embedded processor design flow with FPGA. a: A C program is compiled for a softcore CPU, which is configured in an FPGA. b: To accelerate this C program, it is partitioned into a part for the software CPU, and a part that will be implemented as a hardware accelerator. The softcore CPU is configured together with the hardware accelerator in the FPGA.

Are you now ready to start a new data acquisition project? If so, read George Novacek’s article “Project Configuration Control” (p. 58), George Martin’s article “Software & Design File Organization” (p. 62), and Jeff Bachiochi’s article “Flowcharting Made Simple” (p. 66) before hitting your workbench. You’ll find their tips on project organization, planning, and implementation useful and immediately applicable.

Lastly, on behalf of the entire Circuit Cellar/Elektor team, I congratulate the winners of the DesignSpark chipKIT Challenge. Turn to page 32 to learn about Dean Boman’s First Prize-winning energy-monitoring system, as well as the other exceptional projects that placed at the top. The complete projects (abstracts, photos, schematic, and code) for all the winning entries are posted on the DesignSpark chipKIT Challenge website.

FPGA-Based VisualSonic Design Project

The VisualSonic Studio project on display at Design West last week was as innovative as it was fun to watch in operation. The design—which included an Altera DE2-115 FPGA development kit and a Terasic 5-megapixel CMOS Sensor (D5M)—used interactive tokens to control computer-generated music.

at Design West 2012 in San Jose, CA (Photo: Circuit Cellar)

I spoke with Allen Houng, Strategic Marketing Manager for Terasic, about the project developed by students from National Taiwan University. He described the overall design, and let me see the Altera kit and Terasic sensor installation.

A view of the kit and sensor (Photo: Circuit Cellar)

Houng also he also showed me the design in action. To operate the sound system, you simply move the tokens to create the sound effects of your choosing. Below is a video of the project in operation (Source: Terasic’s YouTube channel).