
FPGAs for AI and Machine Learning

Written by Al Mahmud Al Mamun

Reconfigurable Chip for Embedded Solutions

FPGA chips offer millions of logic gates and a reconfigurable architecture that suits artificial intelligence (AI) and machine learning (ML) workloads, allows the entire processing chain to be optimized, and fits neural network infrastructures.

  • What is a field-programmable gate array (FPGA)?

  • Why is FPGA important for artificial intelligence and machine learning?

  • What are the benefits of FPGAs?

  • When should you use FPGAs?

  • What are the applications of FPGAs?

  • Xilinx XA Artix-7

  • Intel Stratix 10 NX 2100 FPGA

  • iCE40 UltraPlus FPGA

  • Microchip PolarFire FPGA

Field-programmable gate array (FPGA) chips let you reprogram their logic gates, so an existing configuration can be overwritten with a new custom circuit. This reconfigurability makes FPGAs useful for artificial intelligence (AI) and machine learning (ML) and suitable for a wide range of applications. The chips help accelerate both development and data processing, and they are flexible, scalable, and reusable in embedded systems.

The global AI market size was estimated at $136.6 billion in 2022 (base year for estimation 2021, historical data from 2017 to 2020) and is expected to reach $1.8 trillion by 2030, a 38.1% compound annual growth rate (CAGR) over the 2022 to 2030 forecast period [1]. The global market for FPGA chips is growing due to the increasing adoption of AI and ML technologies for edge computing in data centers. The global FPGA market size was estimated at $6 billion in 2021 and is expected to reach $14 billion by 2028, a 12% CAGR over the 2022 to 2028 forecast period [2].
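As a rough sanity check on these projections, compound growth can be computed directly. The sketch below assumes simple annual compounding from the estimate year to the target year; the helper name `project` and the choice of eight and seven compounding years are assumptions made for illustration and do not come from the cited reports.

```python
def project(start_value, cagr, years):
    """Return the value after compounding `cagr` once per year for `years` years."""
    return start_value * (1.0 + cagr) ** years

# AI market: $136.6 billion in 2022 at a 38.1% CAGR, compounded over 8 years to 2030.
ai_2030 = project(136.6, 0.381, 8)

# FPGA market: $6 billion in 2021 at a 12% CAGR, compounded over 7 years to 2028
# (assumption: growth is counted from the 2021 estimate).
fpga_2028 = project(6.0, 0.12, 7)

print(f"AI market in 2030: about ${ai_2030:,.0f} billion")
print(f"FPGA market in 2028: about ${fpga_2028:.1f} billion")
```

Under these assumptions the AI figure lands near $1,800 billion, or about $1.8 trillion, and the FPGA figure lands between $13 billion and $14 billion.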

BRIEF REVIEW OF FPGA

An FPGA combines an array of logic blocks, a means of programming those blocks, and the interconnections among them. It can be customized for many uses by changing its configuration, which is specified in a hardware description language (HDL) such as Verilog or VHDL. A well-designed configuration lets an FPGA perform close to an application-specific integrated circuit (ASIC), and for data processing acceleration the chips can outperform a central processing unit (CPU) or graphics processing unit (GPU).

Xilinx invented the FPGA in 1985. Its first device, the XC2064 (Figure 1), offered 800 gates and was produced on a 2.0µm process [3]. In the late 1980s, the Naval Surface Warfare Center initiated a research project, proposed by Steve Casselman, to build an FPGA with more than 600,000 reprogrammable gates; the project was successful and the design was patented in 1992 [4]. In the early 2000s, FPGAs reached millions of reprogrammable gates, and in 2013 Xilinx introduced the Virtex XCVU440 All Programmable 3D IC, which offers 50 million (equivalent) ASIC logic gates.

Figure 1
The XC2064 FPGA with 800 gates and a 2.0µm process

Each FPGA manufacturer has its own architecture, but products generally consist of configurable logic blocks, configurable I/O blocks, and programmable interconnect. FPGAs use three basic types of programmable elements: static RAM, anti-fuses, and flash EPROM. The chips are available with a range of system gate counts and include resources such as shift registers, logic cells, and look-up tables.
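To illustrate why a look-up table makes logic reprogrammable, here is a minimal software model. It is a teaching sketch only, not a description of any vendor's logic cell: the class name `LookUpTable` and its methods are hypothetical. The idea is that the logic function lives in a small table of configuration bits, so rewriting the table changes the circuit without changing the hardware.

```python
class LookUpTable:
    """Toy model of a k-input look-up table (LUT)."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        # One configuration bit per input combination, all zero initially.
        self.config = [0] * (2 ** num_inputs)

    def program(self, func):
        """Fill the configuration bits from a Python function of the input bits."""
        for address in range(len(self.config)):
            bits = [(address >> i) & 1 for i in range(self.num_inputs)]
            self.config[address] = int(bool(func(*bits)))

    def evaluate(self, *bits):
        """Use the input bits as an address and return the stored output bit."""
        address = sum(bit << i for i, bit in enumerate(bits))
        return self.config[address]


lut = LookUpTable(2)

# Configure the LUT as an AND gate.
lut.program(lambda a, b: a & b)
and_result = [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]

# Reprogram the same LUT as an XOR gate.
lut.program(lambda a, b: a ^ b)
xor_result = [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]

print(and_result)  # [0, 0, 0, 1]
print(xor_result)  # [0, 1, 1, 0]
```

A real device wires many such tables together through the programmable interconnect, and the configuration bits are typically held in one of the programmable element types listed above.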

To select an FPGA, analyze three things: memory, performance, and interface requirements. FPGA chips are available with several types of memory, including CAM, flash, RAM, dual-port RAM, ROM, EEPROM, FIFO, and LIFO. The logic families of FPGA chips include crossbar switch technology (CBT), gallium arsenide (GaAs), integrated injection logic (I2L), and silicon on sapphire (SOS). There are four basic IC package types for FPGA chips: ball grid array (BGA), quad flat package (QFP), single in-line package (SIP), and dual in-line package (DIP).

FPGA CHIPS FOR AI AND MACHINE LEARNING

AI and ML play key roles in the modern technology revolution in every application area that involves large amounts of data and real-time data processing. 5G deployment has already begun, bringing high speeds and the capacity to transfer vast amounts of data, which creates new opportunities for AI and ML. General-purpose computing systems are not enough for such rapidly growing technology, and parallel computing becomes necessary. Because the technology environment changes so rapidly, ASICs become difficult and very costly to use. FPGA chips, with their reconfigurable architecture, are well suited to these purposes, and developers are deploying FPGA solutions (Figure 2).

Figure 2
FPGA chips with a reconfigurable architecture are well suited to AI and ML.

Automotive and industrial control systems collect data from sensors or measuring devices, process it, and take action by issuing commands through several elements. A control system is tied to the instrumentation used for real-time data processing, manufacturing, and process control, and it ranges in size from a few modular panel-mounted controllers to large interconnected distributed control systems. FPGAs enable optimization of entire process controller applications in the industrial and automotive fields. The latest FPGA chips open up opportunities to build controller systems designed for specific applications.

ML algorithms and artificial neural networks (ANNs) have become more sophisticated and require huge amounts of data for training and validation to gain higher accuracy and minimize the error rate. These systems are generally task-specific and touch almost every type of application, so they need reprogrammable solutions that fit each purpose. Using FPGAs to build a neural network infrastructure can deliver higher performance.

AUTOMOTIVE SOLUTIONS

With the revolution in AI and ML, the automotive industry is also growing rapidly. Automotive systems are mostly application-specific, and supporting each purpose calls for an application-specific chip configuration to accelerate data processing. Because the industry increasingly requires a variety of applications, it is difficult to find an application-specific chip for every area. In this situation, an FPGA configured for the specific application can be a strong solution, delivering high performance with scalability.

Many manufacturers offer FPGA chips that can be reconfigured for application-specific automotive solutions. Xilinx XA Artix-7 FPGAs can be useful for your automated systems. The XA Artix-7 (Figure 3) is an automotive-grade FPGA optimized for the lowest cost and power, with small form-factor packaging for automotive applications, and it allows designers to leverage more logic per watt. Xilinx’s UltraScale+ MPSoC devices, a separate family, integrate a 64-bit quad-core ARM Cortex-A53 and a dual-core ARM Cortex-R5-based processing system. These flexible and scalable FPGA solutions are ideally suited for automotive platforms, including driver assistance and automated driving systems. A strong network of automotive-specific third-party ecosystems helps accelerate design productivity.

Figure 3
Xilinx XA Artix-7 is an automotive-grade FPGA that offers increased system performance through 100,000 logic cells.

The Artix-7 increases system performance through 100,000 logic cells, 264 GMAC/s of DSP, 52 Gb/s of I/O bandwidth, DDR3 interfaces up to 800 Mb/s, and an integrated block for PCI Express, while its Ethernet AVB and CAN interfaces provide effective video and data distribution. The chip integrates advanced analog mixed-signal (AMS) technology with independent dual 12-bit, 1 MSPS, 17-channel analog-to-digital converters (ADCs). The FPGA delivers programmable and scalable densities and processing speeds for different automotive applications. XA Artix-7 FPGAs meet automotive-grade requirements, with temperature ranges of -40°C to +100°C for I-Grade and -40°C to +125°C for Q-Grade. The chip combines performance, power efficiency, and functional safety.

NLP AND REAL-TIME VIDEO SOLUTION

Natural language processing (NLP) requires processing and analyzing large amounts of data and involves speech recognition, understanding, and generation. Real-time video analytics is a powerful technology that allows monitoring and identification of violations, troubling behaviors, and unusual actions. It involves several areas of video processing, such as object detection, facial recognition, and anomaly detection. Efficiently configured FPGA chips are very effective in NLP and real-time video analytics systems that use machine learning algorithms and ANNs. Many manufacturers offer FPGAs that provide high-level processing performance for natural language and real-time video.

The Intel Stratix 10 NX 2100 (Figure 4) is Intel’s first AI-optimized FPGA for high-bandwidth, low-latency AI acceleration applications. The chips suit NLP applications such as speech recognition and speech synthesis, and real-time video analytics applications such as content recognition and video pre-processing or post-processing. You can also use them for AI-based security applications, including fraud detection, deep packet inspection, and congestion control identification. They support extending AI+ large models across the multi-node solution.

Figure 4
Intel Stratix 10 NX 2100 FPGA embeds AI Tensor Blocks and supports extending AI+ large models across the multi-node solution.

The Stratix 10 NX FPGA embeds AI Tensor Blocks that are tuned for the matrix-matrix and vector-matrix multiplications common in AI computations, and the blocks are designed to carry out those computations efficiently. Its integrated memory stacks allow large, persistent AI models to be stored on-chip, which lowers latency and provides large memory bandwidth to prevent memory-bound performance problems in large models. The device offers up to 96 non-return-to-zero (NRZ) transceivers and up to 36 four-level pulse-amplitude modulation (PAM4) transceivers, with maximum data rates of 28.9Gbps and 57.8Gbps, respectively.
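The vector-matrix multiplication that such blocks accelerate can be sketched in plain Python. This is a conceptual illustration only and not Intel's API or the tensor block's actual data path: the 8-bit integer range, the shared scale factor, and the function names `quantize` and `vector_matrix_multiply` are all assumptions chosen to show low-precision multiplies feeding a wider integer accumulator.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers using a shared scale (assumed scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def vector_matrix_multiply(vector, matrix):
    """Multiply a length-N vector by an N x M matrix with integer accumulation."""
    columns = len(matrix[0])
    result = [0] * columns
    for j in range(columns):
        acc = 0  # wide accumulator for the narrow products
        for i, x in enumerate(vector):
            acc += x * matrix[i][j]
        result[j] = acc
    return result

scale = 0.5
activations = quantize([1.0, -2.0, 0.5], scale)
weights = [quantize(row, scale) for row in [[0.5, 1.0], [1.5, -0.5], [2.0, 0.0]]]

outputs_int = vector_matrix_multiply(activations, weights)
# Undo both scale factors to get back to real-valued outputs.
outputs = [y * scale * scale for y in outputs_int]

print(outputs_int)  # [-6, 8]
print(outputs)      # [-1.5, 2.0]
```

In hardware, the inner multiply-accumulate loop is what gets unrolled into many parallel units, which is why keeping the weights close to those units matters for latency.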

The PAM4 transceivers enable multi-node AI inference solutions, reducing or eliminating connectivity bandwidth as a limiting factor in multi-node designs and providing scalable connectivity that adapts to your requirements. The transceivers incorporate hard IP such as PCIe Gen3, 50/100G Ethernet, and 10/25/100G Ethernet.

INTELLIGENT EDGE SOLUTION

At the intelligent edge, data is generated, analyzed, interpreted, and acted on. Its major categories include operational technology edges, information technology edges, and IoT edges. An intelligent edge is a set of connected devices and systems that gather and analyze data and develop solutions related to data, users, or both.

An intelligent edge makes a business more efficient by reducing unexpected delays, costs, and risks. Deploying FPGAs provides good solutions for data load, tasks, and real-time operation. An AI engine on an FPGA helps resolve the trade-off between performance and latency. FPGAs suit the intelligent edge because of their configurability, low latency, parallel computing capability, and flexibility.

You can use the iCE40 UltraPlus FPGA chip (Figure 5) from Lattice Semiconductor for your intelligent edge solutions. With 5,000 lookup tables, the chip can implement the neural networks for pattern matching that are needed to bring always-on intelligence to the edge.

Figure 5
The iCE40 UltraPlus FPGA chip is designed with 5,000 lookup tables and delivers high performance in signal processing using DSP blocks and the soft neural network IPs and compiler for flexible AI/ML implementation.

The UltraPlus delivers low-power AI and ML solutions with flexible interfaces and lets designers eliminate the latency associated with cloud intelligence at a lower cost. Its variety of interfaces and protocols addresses connectivity issues arising from the rapidly growing system complexity of smart homes, factories, and cities. The FPGA chip provides low-power computation for higher levels of intelligence, and multiple packages are available to fit a wide range of application needs.

Lattice Semiconductor is expanding its mobile FPGA product family with the iCE40 UltraPlus, which delivers 1.1 Mbit of RAM, twice the digital signal processing blocks, and improved I/O over previous generations. The chip has a flexible logic architecture in two variants, UP3K and UP5K, with 2,800 and 5,280 four-input LUTs respectively, customizable I/O, and up to 80Kbits of dual-port and 1Mbit of single-port embedded memory. It delivers high performance in signal processing using DSP blocks, along with soft neural network IPs and a compiler for flexible AI/ML implementation.
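On-chip memory is often the first constraint when fitting a neural network onto a small edge FPGA. The sketch below estimates whether the weights of a small fully connected network fit in the 1Mbit of single-port memory quoted above. The network shape, the weight precisions, the helper name `weights_fit`, and the reading of 1Mbit as 1024 x 1024 bits are all assumptions for illustration; biases, activations, and tool overhead are ignored.

```python
# Single-port embedded memory from the figures above (assumed to be 1024 x 1024 bits).
SPRAM_BITS = 1 * 1024 * 1024

def weights_fit(layer_sizes, bits_per_weight, budget_bits):
    """Return (total_bits, fits) for the weights of a fully connected network."""
    total_weights = sum(
        n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    total_bits = total_weights * bits_per_weight
    return total_bits, total_bits <= budget_bits

# Hypothetical network: 256 inputs, two hidden layers, 4 outputs.
layers = [256, 256, 64, 4]

bits_8, fits_8 = weights_fit(layers, 8, SPRAM_BITS)
bits_16, fits_16 = weights_fit(layers, 16, SPRAM_BITS)

print(f"8-bit weights:  {bits_8} bits, fits: {fits_8}")
print(f"16-bit weights: {bits_16} bits, fits: {fits_16}")
```

For this hypothetical network, the 8-bit version fits within the budget while the 16-bit version does not, which shows why reduced-precision weights are attractive on devices of this size.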

EMBEDDED VISION SOLUTION

Embedded vision integrates a camera and a processing board, which opens up several new possibilities (Figure 6). Such systems have a wide range of applications, including autonomous vehicles, digital dermoscopy, medical vision, and other cutting-edge uses. Embedded vision systems are often deployed for specific purposes that call for application-specific chips for processing and operation. An FPGA with an application-specific configuration can deliver high performance and efficiency for embedded vision solutions. As vision technology grows rapidly, FPGA chips enable powerful processing across a wide range of applications through their flexible reconfiguration and performance capability.

You can select the PolarFire FPGA chip (Figure 7) from Microchip for your embedded vision system. The family offers a variety of solutions for smart embedded vision, such as video, imaging, and machine learning IP and tools for accelerating system designs. The chips come with a cost-optimized architecture and power optimization capability. PolarFire FPGA chips support process optimizations for 100K/500K LE devices, transceiver performance optimized for 12.7Gbps, 1.6Gbps I/Os supporting DDR4/DDR3/LPDDR3, and LVDS-hardened I/O gearing logic with CDR. The integrated hard IP includes DDR PHY, PCIe endpoint/root port, and a crypto processor.

Figure 6
FPGA chips enable powerful processing that can deliver high performance and efficiency for embedded vision solutions.
Figure 7
Microchip’s PolarFire FPGA chip offers a large variety of solutions for smart embedded vision and supports process optimizations for 100K/500K LE devices.

The PolarFire FPGA family has five product models: MPF050 (48K logic elements and 176 total I/O), MPF100 (109K logic elements and 296 total I/O), MPF200 (192K logic elements and 364 total I/O), MPF300 (300K logic elements and 512 total I/O), and MPF500 (481K logic elements and 584 total I/O). The devices deliver high performance with low power and small form factors across industrial, medical, broadcast, automotive, aerospace, and defense applications.
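Choosing among the five models comes down to matching logic and I/O requirements against the table of devices. The sketch below picks the smallest listed device that satisfies both, using only the logic element and I/O counts given above. The function name `smallest_device` and the 20% logic headroom are assumptions for illustration; a real selection would also weigh memory, transceivers, package, and cost.

```python
# (model, logic elements, total I/O), taken from the figures listed above.
POLARFIRE_DEVICES = [
    ("MPF050", 48_000, 176),
    ("MPF100", 109_000, 296),
    ("MPF200", 192_000, 364),
    ("MPF300", 300_000, 512),
    ("MPF500", 481_000, 584),
]

def smallest_device(required_les, required_io, margin=0.2):
    """Return the smallest model meeting both needs, or None if none does.

    `margin` is an assumed headroom on logic elements (20% by default).
    """
    needed_les = required_les * (1.0 + margin)
    for name, logic_elements, total_io in POLARFIRE_DEVICES:
        if logic_elements >= needed_les and total_io >= required_io:
            return name
    return None

# A design needing 90K logic elements and 200 I/O pins.
choice_a = smallest_device(90_000, 200)
# A design limited by I/O rather than logic: 150K logic elements, 400 I/O pins.
choice_b = smallest_device(150_000, 400)
# A design too large for any listed device once headroom is added.
choice_c = smallest_device(450_000, 100)

print(choice_a, choice_b, choice_c)  # MPF100 MPF300 None
```

The second case shows that the binding constraint is not always logic: the MPF200 has enough logic elements for that design but too few I/O pins.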

CONCLUSIONS

FPGA chips are lightweight, come in small form factors, and have very low power consumption, and they can process huge amounts of data faster than CPUs and GPUs. The chips are easy to deploy in the rapidly growing AI and ML fields. AI is everywhere, including in systems such as satellites where hardware upgrades are very expensive, and there FPGAs provide long-term solutions with flexibility.

FPGA chips offer a complete ecosystem solution, and system-on-chip (SoC) FPGA chips will expand their applicability with real-time compilation and automatic FPGA program generation to meet next-generation technology demands.

REFERENCES
[1] https://www.researchandmarkets.com/reports/4375395/artificial-intelligence-market-size-share-and
[2] https://www.gminsights.com/industry-analysis/field-programmable-gate-array-fpga-market-size
[3] https://www.xilinx.com/publications/archives/xcell/Xcell32.pdf
[4] https://web.archive.org/web/20070412183416/http://filebox.vt.edu/users/tmagin/history.htm
For publish: https://books.google.com.bd/books?id=7Xi5BQAAQBAJ

RESOURCES
Xilinx | www.xilinx.com/products/silicon-devices/fpga/xa-artix-7.html
Intel | www.intel.com/content/www/us/en/products/sku/213092/intel-stratix-10-nx-2100-fpga/specifications.html
Lattice Semiconductor | www.latticesemi.com/en/Products/FPGAandCPLD/iCE40UltraPlus
Microchip | https://www.microchip.com/en-us/products/fpgas-and-plds/fpgas/polarfire-fpgas/polarfire-mid-range-fpgas

PUBLISHED IN CIRCUIT CELLAR MAGAZINE • AUGUST 2022 #385


Al Mahmud Al Mamun is a former Editor-in-Chief (March 2022 to July 2022) of Circuit Cellar magazine. He has a background in engineering, research, and publishing and a master’s degree in Computer Science and Engineering. His passions include robotics, artificial intelligence, and smart technologies.



Copyright © KCK Media Corp.
All Rights Reserved
