Linaro Launches Two 96Boards SOM Specifications

Linaro has launched two SOM specifications for 96Boards—a Compute Module spec and a Wireless spec. It has also released two board designs based on the Compute spec, along with a 96Boards SOM Carrier board compatible with those two boards.

Linaro, the Arm-backed open source collaborative engineering organization, has announced the publication of version 1.0 of 96Boards System-on-Module (SOM) specifications. 96Boards is Linaro’s initiative to build a single software and hardware community across low-cost development boards based on Arm technology. Along with the new specifications, the company has rolled out two board designs: the TB-96AI based on a Rockchip RK3399Pro processor, and the TB-96AIoT based on the newer Rockchip RK1808 processor.

We’ve [LinuxGizmos.com] covered a couple of RK3399Pro-based boards within the last four months, including Geniatech’s DB3399 Pro, Vamrs’ Toybrick RK3399Pro SBC and the crowdfunded Khadas Edge-1S SBC from Shenzhen Wesion’s Khadas project. The newer Rockchip RK1808, announced in January at CES, is essentially a “lite,” lower-power version of the RK3399Pro with the same Network Processing Unit (NPU). See further down for more details on the RK1808.

The launch of the new 96Boards specifications provides developers with a SOM solution that is compatible across SoCs. According to Linaro, SOM solutions today use a variety of different connector solutions, including the SO-DIMM connectors used for DRAM and the Mini Module Plus (MMP) connectors used on certain specialist boards. Until now, says Linaro, there has been no solution offering flexible I/O, a robust mounting mechanism and a standard form factor. The goal of the new 96Boards SOM specifications is to enable plug-and-play compatibility between a whole range of different SOM solutions.

Two 96Boards SOM specifications have been launched: The Compute Module Specification and the Wireless Specification. Both specifications encourage the development of reliable and cost-effective embedded platforms for building end-products. The specifications have been proposed, created and reviewed by the current 96Boards Steering Committee Members.

The Compute Module Specification defines a SOM with a generic module-to-carrier board interface, independent of the specific SoC on the module. The Compute Module addresses the application requirements of segments including industrial automation, smart devices, gateway systems, automotive, medical, robotics and retail POS systems. Two form factors are defined, SOM-CA and SOM-CB, with a maximum of four 100-pin connectors. The X1 connector is mandatory on all SOMs. The defined interfaces are shown in the table below.


Compute Module Spec — Defined Interfaces
The Wireless Specification defines a SOM for interchangeable wireless module applications, supporting standard and/or proprietary wireless technologies such as 802.15.4, BLE, WiFi, LoRa, NB-IoT and LTE-M. The specification is designed to evolve to support multiple products and future wireless standards. The two form factors are defined as SOM-WA and SOM-WB, with the pinouts specified in the table below.


Wireless Spec Pinouts
TB-96AI

The TB-96AI can be combined with a backplane to form a complete industrial application motherboard and be applied to various embedded artificial intelligence fields. The TB-96AI’s RK3399Pro processor pairs a dual-core Arm Cortex-A72 with a quad-core Cortex-A53. The processor runs at frequencies up to 1.8 GHz and integrates a Mali-T860 MP4 quad-core graphics processor. The chip’s integrated NPU supports 8-bit and 16-bit operation. With computing power of 3.0 TOPS, the NPU can meet the needs of various AI applications such as vision and audio.

 
TB-96AI, front and back
The TB-96AI supports multiple display output interfaces (DP 1.2, HDMI 2.0, MIPI-DSI and eDP) with dual-screen mirrored or independent display. Video decoding covers 4K VP9, 4K 10-bit H.265/H.264 and 1080p in multiple formats (VC-1, MPEG-1/2/4, VP8), while video encoding supports 1080p in H.264 and VP8 formats. The board is compatible with multiple AI frameworks: the design supports TensorFlow Lite and the Android NN API, and the AI software tools support import, mapping and optimization of Caffe and TensorFlow models, allowing developers to easily use AI technology.
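As a rough illustration of what such model-import tools do when they map float models onto an 8-bit NPU, here is a minimal sketch of affine (scale-based) quantization. This is not Rockchip’s actual toolchain; the function names and values are invented:

```python
# Hypothetical sketch of 8-bit affine quantization, the standard technique
# behind "8-bit operation" NPUs. Not Rockchip's actual conversion tool.

def quantize(values, num_bits=8):
    """Map a list of floats to signed num_bits integers plus a scale factor."""
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for int8
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

def quantized_dot(xq, x_scale, wq, w_scale):
    """Integer multiply-accumulate (what the NPU runs), rescaled to float."""
    acc = sum(a * b for a, b in zip(xq, wq))
    return acc * x_scale * w_scale

x = [0.5, -1.0, 0.25]        # toy activations
w = [0.1, 0.2, -0.4]         # toy weights
xq, xs = quantize(x)
wq, ws = quantize(w)
approx = quantized_dot(xq, xs, wq, ws)
exact = sum(a * b for a, b in zip(x, w))
```

The integer result lands within a fraction of a percent of the float dot product, which is why 8-bit NPUs can trade a little precision for large gains in throughput and power.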

TB-96AIoT

The TB-96AIoT, meanwhile, is equipped with the RK1808 AIoT chip. According to Linaro, the TB-96AIoT also provides rich interfaces and strong scalability. Beyond that, little other detail on the TB-96AIoT is provided in the announcement.

The Rockchip RK1808 processor used on the TB-96AIoT features a dual-core Cortex-A35 CPU, NPU computing performance of up to 3.0 TOPS, a VPU supporting 1080p video encode/decode, a microphone array with a hardware VAD function, and camera video input with a built-in ISP. The RK1808 boasts lower power consumption thanks in part to being built on a 22 nm FD-SOI process, which shrinks power consumption by about 30 percent compared with mainstream 28 nm processes at the same performance, according to Rockchip. Built-in 2 MB of system-level SRAM enables DDR-free operation for always-on use, and the hardware VAD function provides low-power monitoring and far-field wake-up, features all suited to IoT applications.
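The hardware VAD’s role, watching the microphone stream for speech-like energy so the rest of the chip can sleep, can be sketched in software. This is purely illustrative; the RK1808’s actual VAD block and its thresholds are not described in the announcement:

```python
# Minimal energy-based voice activity detection (VAD) sketch. The threshold
# and frame contents are invented for illustration.

def frame_energy(samples):
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in samples) / len(samples)

def vad(frames, threshold=0.01):
    """Return the indices of frames whose energy crosses the wake threshold."""
    return [i for i, f in enumerate(frames) if frame_energy(f) > threshold]

silence = [0.001] * 160                  # 10 ms of near-silence at 16 kHz
speech = [0.3, -0.25, 0.2, -0.3] * 40    # a loud, speech-like frame
frames = [silence, speech, silence]
print(vad(frames))                       # prints [1]: only the speech frame
```

A hardware block doing this comparison lets the always-on path run from the RK1808’s on-chip SRAM, waking the CPU (and DRAM, if fitted) only when something worth processing arrives.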

Both the TB-96AI and TB-96AIoT SOM designs are available for purchase from Beiqicloud.com—sign in required. A story by cnx-software points out that Vamrs is also involved because of the “ToyBrick” reference on the boards’ silkscreen.

96Boards SOM Carrier Board

The 96Boards SOM Carrier Board is compatible with both the TB-96AI and TB-96AIoT. It is designed to suit different markets and demonstrates how easy it is to support multiple different SOMs.


96Boards SOM carrier board
There wasn’t much detail on the carrier board spelled out in the announcement, although this detail graphic was provided:


96Boards SOM carrier board detail
Further information

More information on the new SOM specifications can be found on the announcement page. You can learn more about Linaro’s engineering work on the Linaro and 96Boards websites. Beiqicloud is the 96Boards Compute SOM lead partner. For more information about the SOM boards and carrier board, visit Beiqicloud’s products page.

This article originally appeared on LinuxGizmos.com on April 2.

Linaro | www.linaro.org

Latest UP Board Combines Whiskey Lake with AI Core X Modules

By Eric Brown

Aaeon has posted specs for a Linux-ready “UP Xtreme” SBC with a 15 W, 8th Gen Whiskey Lake-U CPU, up to 16 GB DDR4 and 128 GB eMMC, 2x GbE, 6x USB, SATA and optional AI Core X modules via M.2 and mini-PCIe.

Aaeon’s community-backed UP project, which most recently brought us the Intel Apollo Lake based UP Squared and UP Core Plus SBCs, has announced a UP Xtreme hacker board built around Intel’s 8th Gen Whiskey Lake U-series Core processors. This is likely the fastest open-spec, community-backed SBC around, depending on your definition.


 
UP Xtreme and block diagram
Despite lacking full schematics, the UP boards barely qualify for our catalog of open-spec Linux hacker boards. However, DFRobot’s maker-oriented LattePanda boards, including the Kaby Lake based LattePanda Alpha, do not. In any case, the 1.6 GHz/2.6 GHz, dual-core, quad-thread Core m3-7Y30 on the LattePanda Alpha would not match the performance of the quad-core UP Xtreme models. Other boards that come close include Hardkernel’s more fully open-spec, quad-core Gemini Lake based Odroid-H2.

The only SBCs we’ve seen announced with the 14nm fabricated Whiskey Lake are Congatec’s 3.5-inch Conga-JC370 and thin Mini-ITX Conga-IC370. The Whiskey Lake U-series chips are notable for providing quad-core configurations with the same 15 W TDPs as Intel’s earlier dual-core U-series chips. The quad-core models offer a performance increase of up to 40 percent compared to previous U-series processors.

Aaeon appears to support all five Core i7/i5/i3 models, all but one of which are dual-threaded. The models range from the 1.8 GHz (4.6 GHz Turbo), quad-core Core i7-8565U to the 1.8 GHz (3.9 GHz Turbo), dual-core Core i3-8145U. Congatec clocks the latter’s base speed at up to 2.1 GHz, but Aaeon lists only a 1.8 GHz base frequency for all the models.

The Whiskey Lake processors integrate Intel Gen9 UHD Graphics 620 with 24 EUs. They’re also notable for supporting USB 3.1 Gen2 with up to a 10 Gbps transfer rate. Sadly, however, the UP Xtreme does not include a USB 3.1 port, perhaps to reduce costs.

Even so, the board is not likely to make our under-$200 cut-off for the hacker board catalog. As noted in the CNXSoft post that first revealed the SBC, the lowest-cost Whiskey Lake model, the i3-8145U, sells for $281, suggesting the lowest UP Xtreme price might be about $350 to $400.

At 120 x 120mm, this is the largest UP board yet. The SBC supports up to 16GB DDR4 and up to 128GB eMMC. In addition to offering a powered SATA interface, there’s a SATA option on the M.2 “B/M” key slot, and mSATA is available via the similarly multi-purpose mini-PCIe slot, which is accompanied by a SIM slot. An M.2 Key E slot is also onboard.



UP Xtreme detail view

The stacked HDMI and DisplayPort connectors will no doubt give you 4K video, and you can probably get triple 4K displays if you use the onboard eDP header, which includes backlight support. Audio headers are also available.

The UP Xtreme is further equipped with 2x GbE and 4x USB 3.0 ports, plus additional USB and RS232/422/485 headers. There’s also a pair of STM32 I/O headers, which may offer GPIO related to the STM32 MCU. Like other UP boards, further expansion is available via a 40-pin “HAT” GPIO connector, which suggests it can run some Raspberry Pi HATs.

AI Core X support

There’s no explanation for the 100-pin docking connector, which appears to offer four different options for I/O daughtercards (see spec list below). The UP Core Plus offers dual 100-pin connectors for various AI-enhanced add-ons such as the Cyclone 10 GX based AI Plus and the Myriad 2 based Vision Plus. However, the brief marketing copy on the UP Xtreme teaser page suggests that the UP Xtreme’s touted AI capabilities are instead delivered via the M.2 and mini-PCIe slots.



AI Core X models
Aaeon notes the ability to add AI Core X Neural Compute Engine modules with 1 TOPS of neural acceleration performance. Equipped with Intel’s new Movidius Myriad X VPU, which also drives Intel’s new Neural Compute Stick 2, the AI Core X modules are available in a variety of M.2 and mini-PCIe models.



AI Core X specs
The Myriad X VPU based AI Core X modules are also available now for the UP Core Plus. The Myriad X VPU provides a dedicated hardware neural network inference accelerator to deliver up to 10X higher performance than the Myriad 2 “for applications requiring multiple neural networks running simultaneously.”

Specifications listed for the UP Xtreme include:

  • Processor — Intel 8th Gen “Whiskey Lake” U-series — 2x or 4x Whiskey Lake @ 1.8GHz (up to 3.9 GHz or 4.6 GHz Turbo) with Intel Gen9 UHD Graphics 620 (24 EU) at 300 MHz base and 1 GHz max dynamic; Intel 300 series chipset
  • Memory — up to 16 GB of DDR4 via dual sockets
  • Storage:
    • 16GB to 128GB eMMC 5.1
    • SATA with SATA power
    • M.2 Key B/M with support for 2x SATA, and mini-PCIe with support for mSATA (see expansion below)
  • Networking — 2x Gigabit Ethernet ports (Intel i210/i211 and 1219LM)
  • Media I/O:
    • DisplayPort
    • HDMI port
    • eDP with backlight header
    • I2S audio and audio out/mic in with ALC887 codec
  • Other I/O:
    • 4x USB 3.0 host ports
    • 2x USB 2.0 headers
    • 2x RS232/422/485 (10-pin Fintek F81801 connectors)
    • HSUART
    • 2x STM32 I/O headers
  • Expansion:
    • 40-pin “HAT” header — By MAX5: 28x GPIO, 2x SPI, 2x I2C, ADC, I2S, 2x PWM, UART, 3V3, 5V, GND
    • 100-pin docking connector for 1) 12V, GND; 2) 3x PCIe x1; 3) 2x PCIe x1 or USB 3.0; 4) 2x USB 2.0
    • M.2 Key B/M (2242/2280) with 2x PCIe/2x SATA
    • M.2 Key E (2230) with PCIe/USB 2.0
    • Mini-PCIe slot for mSATA/USB 2.0 with SIM slot
  • Other features — RTC with battery; heatsink; humidity resistance; optional AI Core X modules via M.2 or mini-PCIe
  • Power — Lockable 12-65V DC input; power button
  • Operating temperatures — 0 to 60°C
  • Dimensions — 120 x 120mm
  • Operating system — Linux (Ubuntu, Yocto); Android; Windows 10

Further information

No pricing or availability information was provided for the UP Xtreme. More information may be found on the UP community’s UP Xtreme product page.

This article originally appeared on LinuxGizmos.com on March 19.

Aaeon UP | up-board.org

COMe Type 7 Card Sports AMD EPYC Embedded 3000 Processor

Congatec has introduced its first Server-on-Module (SoM) with AMD embedded server technology. The new conga-B7E3 Server-on-Module with AMD EPYC Embedded 3000 processor offers up to 52% more instructions per clock compared to legacy architectures, according to the company. Use cases include Industry 4.0, smart robot cells with collaborative robotics, autonomous robotic and logistics vehicles, as well as virtualized on-premise equipment in harsh environments to perform functions such as industrial routing, firewall security and VPN technologies, optionally in combination with various real-time controls and neural network computing for Artificial Intelligence (AI).

Also attractive for edge server deployments is support for an extended temperature range (-40 to 85°C) on selected versions and the comprehensive RAS (reliability, availability and serviceability) features common to all versions. Edge applications benefit from hardware-integrated virtualization and a comprehensive security package that includes Secure Boot, Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV), as well as a secure migration channel between two SEV-capable platforms. Support is also provided for IPsec with integrated crypto acceleration. With SEV, even the server administrator does not have access to an encrypted Virtual Machine (VM). This is important for the high security required by many edge server services, which must enable multi-vendor applications in Industry 4.0 automation while effectively warding off sabotage attempts by hackers.

The conga-B7E3 COM Express Type 7 modules are equipped with AMD EPYC Embedded 3000 processors with 4, 8, 12 or 16 high-performance cores, support simultaneous multi-threading (SMT), and take up to 96 GB of DDR4-2666 RAM in the COM Express Basic form factor and up to 1 TB in full custom designs. Measuring just 125 x 95 mm, the COM Express Basic Type 7 module supports up to 4x 10 GbE and up to 32 PCIe Gen 3 lanes. For storage, the module can even integrate an optional 1 TB NVMe SSD and offers 2x SATA Gen 3.0 ports for conventional drives.

Further interfaces include 4x USB 3.1 Gen 1 and 4x USB 2.0, as well as 2x UART, GPIO, I2C, LPC and SPI. Attractive features also include seamless support for dedicated high-end GPUs and improved floating-point performance, which is essential for emerging AI and HPC applications. Congatec also offers advanced cooling solutions for its COM Express Type 7 Server-on-Modules that match the processor, support fanless cooling even beyond 65 W TDP, and can be adapted to customers’ housings if required. This allows OEMs to integrate maximum processor performance into their designs, as performance is often limited by the system’s cooling capacity. OS support is provided for Linux and Yocto, as well as Microsoft Windows 10 and Windows Server.

Congatec | www.congatec.com

 

Tiny, Octa-Core Arm Module Targets AI on the Edge

By Eric Brown

Qualcomm’s octa-core Snapdragon 660 appeared on Intrinsyc’s Open-Q 660 HDK Mini-ITX dev kit back in 2017 and also showed up on the Inforce 6560 Pico-ITX SBC announced in February. Now Intrinsyc has returned with a tiny compute module implementation: the $225 Open-Q 660 µSOM (micro System on Module) measures only 50 x 25 mm.


 
Open-Q 660 μSOM, front and back
Applications for the Open-Q 660 μSOM include on-device artificial intelligence, enhanced gaming, power optimization, device management, security, and advanced photography and image processing jobs such as camera and audio tuning. Intrinsyc mentions a development kit that will connect to the module via its 3x 100-pin board-to-board connectors, but there were no further details.

The module runs Android 9.0 on the Snapdragon 660 (Qualcomm SDA660), which is claimed to offer up to 20 percent higher CPU performance and 30 percent higher graphics performance compared to the similarly octa-core Snapdragon 653. The Snapdragon 660 is also faster than the octa-core Snapdragon 625 and almost identical Snapdragon 626 thanks to its use of Cortex-A73-like “Kryo” cores.

The 14nm fabricated SoC has 4x Kryo cores clocked at 2.2 GHz and 4x clocked at 1.84 GHz, as well as a 650 MHz Adreno 512 GPU. The module’s AI potential is unlocked via dual Spectra 160 ISPs and a Hexagon 680 DSP with Hexagon Vector eXtensions (HVX), which supports Caffe2 and TensorFlow for machine learning and image processing.



Open-Q 660 μSOM
The Open-Q 660 μSOM has the same footprint as the Snapdragon 820 based Open-Q 820 µSOM. The module ships with a combo eMCP chip with 32GB eMMC and 4GB of dual-channel, 1866MHz LPDDR4 SDRAM.

The module integrates a 2.4/5 GHz 802.11a/b/g/n/ac 2×2 MU-MIMO WiFi radio via a Qualcomm WCN3990 module, supported by a 5 GHz external PA and U.FL antenna connectors. Bluetooth 5.x is also on board.

The Open-Q 660 μSOM is equipped with 2x 4-lane MIPI-DSI interfaces for up to 2560 x 1600 displays, plus DP 1.4 for up to 4K at 30 Hz or 2K at 60 Hz. Support for up to 24-megapixel cameras is derived from 3x 4-lane MIPI-CSI connections, with an I2C controller for each camera port plus 2x camera flash control signals.

Audio features include a SLIMBus interface for external Qualcomm codecs plus optional Qualcomm Fluence support. You also get 4- and 2-lane MI2S interfaces for external audio devices, a Soundwire link for digital amps, and 2x PDM-based digital mic interfaces.

The Open-Q 660 μSOM supports single USB 3.1 Gen1 Type-C and USB 2.0 host ports plus 4-bit SD 3.0, 8x BLSP (UART, I2C, SPI), and configurable GPIOs. The module provides a PMIC and battery charging circuitry and offers a 3.6V to 4.2V input and a -10 to 70°C operating range.

Further information

The Open-Q 660 µSOM is available for pre-order at $225 in single quantities, with shipments due in April. More information may be found in Intrinsyc’s Open-Q 660 µSOM announcement, product page, and shopping page.

This article originally appeared on LinuxGizmos.com on March 25.

Intrinsyc | www.intrinsyc.com

i.MX 8M SoC-Based Solution Enables Immersive 3D Audio

NXP Semiconductors has announced its Immersiv3D audio solution for the smart home market. The solution combines NXP software on its i.MX 8M Mini applications processor and will support both Dolby Atmos and DTS:X immersive audio technologies in future devices that integrate the i.MX 8M Mini SoC. The i.MX 8M Mini also brings smart capabilities like voice control to a broader range of consumer devices including soundbars, smart speakers, and AV receivers with the option for adding additional speakers to distribute smart voice control and immersive audio throughout the home.

TVs and audio systems are becoming more advanced thanks in large part to the development of Dolby Atmos and DTS:X. Both technologies are a leap forward from surround sound, transporting listeners with moving audio that fills the room and flows all around them. Listeners will feel like they’re inside the action as the sounds of people, places, things and music come alive with breathtaking realism. NXP’s Immersiv3D audio solution is designed to enable OEMs to bring to market affordable consumer audio devices capable of supporting Dolby Atmos and DTS:X in their next-generation devices.

Conventional audio system designs use Digital Signal Processors (DSPs) to deliver the complex, controlled, low-latency audio processing needed for audio and video synchronization. But embedded systems have evolved over time, and today’s advanced processor cores are capable of handling the latest 3D audio formats, provided the audio system is designed to take advantage of them. In conjunction with the NXP i.MX 8M family of processors, the Immersiv3D audio solution introduces an advanced approach that integrates scalable audio processing into the SoC’s Arm cores. This eliminates the need for expensive discrete DSPs and moves away from once-proprietary DSP design foundations in favor of licensable cores.

The solution delivers high-end audio features such as immersive multi-channel audio playback, natural language processing and voice capabilities to fit today’s digitally savvy connected consumer. The NXP Immersiv3D audio solution gives audio developers, designers and integrators a leap forward in adding intelligence and Artificial Intelligence (AI) functionality while reducing cost. This includes development of enhancements such as selective noise canceling, where only certain sound elements (like car traffic) are removed, and speech processing, such as changing a speaker’s dialect or language.

The solution introduces an easy-to-use, low-cost enablement for voice capability expansion. Audio systems built using NXP’s Immersiv3D with the i.MX 8M Mini applications processor will give consumers the flexibility to add different audio speakers, regardless of brand, to stream simultaneous and synchronized audio with voice control from their systems.

NXP showcased its i.MX applications processor family including Immersiv3D at the CES 2019 show.

NXP Semiconductors | www.nxp.com

Tool Extension Enables Neural Networking on STM32 MCUs

STMicroelectronics has extended its STM32CubeMX ecosystem by adding advanced Artificial Intelligence (AI) features. AI uses trained artificial neural networks to classify data signals from motion and vibration sensors, environmental sensors, microphones and image sensors more quickly and efficiently than conventional handcrafted signal processing. With STM32Cube.AI, developers can now convert pre-trained neural networks into C code that calls functions in optimized libraries that run on STM32 MCUs.

STM32Cube.AI comes with ready-to-use software function packs that include example code for human activity recognition and audio scene classification. These code examples are immediately usable with the ST SensorTile reference board and the ST BLE Sensor mobile app. Additional support such as engineering services is available for developers through qualified partners inside the ST Partner Program and the dedicated AI and Machine Learning (ML) STM32 online community. ST demonstrated applications developed using STM32Cube.AI running on STM32 MCUs in a private suite at CES, the Consumer Electronics Show, in Las Vegas, January 8-12.

The STM32Cube.AI extension pack can be downloaded inside ST’s STM32CubeMX MCU configuration and software code-generation ecosystem. Today, the tool supports Caffe, Keras (with TensorFlow backend), Lasagne, ConvnetJS frameworks and IDEs including those from Keil, IAR and System Workbench.
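To give a feel for the conversion step, from a trained network to C tables that feed optimized kernels, here is a toy weight-table generator. This is not ST’s tool; the array name and layout are invented, and a real converter also emits quantized activations and kernel-call scaffolding:

```python
# Hypothetical sketch: render a trained layer's weight matrix as C source,
# the kind of artifact a model-to-C converter like STM32Cube.AI produces.

def emit_c_weights(name, weights):
    """Render a 2-D Python weight matrix as a C const array definition."""
    rows = ",\n    ".join(
        "{" + ", ".join(f"{w:.6f}f" for w in row) + "}" for row in weights
    )
    return (
        f"const float {name}[{len(weights)}][{len(weights[0])}] = {{\n"
        f"    {rows}\n}};\n"
    )

# Toy 2x2 dense-layer weights standing in for a trained Keras model's layer.
w = [[0.12, -0.5], [0.9, 0.03]]
print(emit_c_weights("dense1_weights", w))
```

On the MCU, the generated inference code walks tables like this with the vendor’s optimized multiply-accumulate kernels instead of interpreting the original framework’s graph.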

The FP-AI-SENSING1 software function pack provides examples of code to support end-to-end motion (human-activity recognition) and audio (audio-scene classification) applications based on neural networks. This function pack leverages ST’s SensorTile reference board to capture and label the sensor data before the training process. The board can then run inferences of the optimized neural network. The ST BLE Sensor mobile app acts as the SensorTile’s remote control and display.

The comprehensive toolbox, consisting of the STM32Cube.AI mapping tool and application software examples running on the small-form-factor, battery-powered SensorTile hardware, together with the partner program and dedicated community support, offers a fast and easy path to neural-network implementation on STM32 devices.

STMicroelectronics | www.st.com

 

Chip-Level Solutions Feed AI Needs

Embedded Supercomputing

Gone are the days when supercomputing meant big, rack-based systems in an air conditioned room. Today, embedded processors, FPGAs and GPUs are able to do AI and machine learning operations, enabling new types of local decision making in embedded systems.

By Jeff Child, Editor-in-Chief

Embedded computing technology has evolved well past the point where complete system functionality on a single chip is remarkable. Today, the levels of compute performance and parallel processing on an IC mean that what were once supercomputing levels of capability can now be implemented in chip-level solutions.

While supercomputing has become a generalized term, what system developers are really interested in is crafting artificial intelligence, machine learning and neural networking using today’s embedded processing. Supplying the technology for these efforts are the makers of leading-edge embedded processors, FPGAs and GPUs. In these tasks, GPUs are used for general-purpose computing on GPUs, a technique known as GPGPU computing.

With all that in mind, embedded processor, GPU and FPGA companies have rolled out a variety of solutions over the last 12 months, aimed at performing AI, machine learning and other advanced computing functions for several demanding embedded system application segments.

FPGAs Take AI Focus

Back in March, FPGA vendor Xilinx announced plans to launch a new FPGA product category it calls the adaptive compute acceleration platform (ACAP). Following up on that, in October the company unveiled Versal, the first of its ACAP implementations. Versal ACAPs combine scalar processing engines, adaptable hardware engines and intelligent engines with advanced memory and interfacing technologies to provide heterogeneous acceleration for any application. Even more importantly, according to Xilinx, the Versal ACAP’s hardware and software can be programmed and optimized by software developers, data scientists and hardware developers alike, enabled by a host of tools, software, libraries, IP, middleware and frameworks that facilitate industry-standard design flows.

Built on TSMC’s 7-nm FinFET process technology, the Versal portfolio combines software programmability with domain-specific hardware acceleration and adaptability. The portfolio includes six series of devices architected to deliver scalability and AI inference capabilities for a host of applications across different markets, from cloud to networking to wireless communications to edge computing and endpoints.

The portfolio includes the Versal Prime series, Premium series and HBM series, which are designed to deliver high performance, connectivity, bandwidth, and integration for the most demanding applications. It also includes the AI Core series, AI Edge series and AI RF series, which feature the AI Engine (Figure 1). The AI Engine is a new hardware block designed to address the emerging need for low-latency AI inference for a wide variety of applications and also supports advanced DSP implementations for applications like wireless and radar.

Figure 1
Xilinx’s AI Engine is a new hardware block designed to address the emerging need for low-latency AI inference for a wide variety of applications. It also supports advanced DSP implementations for applications like wireless and radar.

It is tightly coupled with the Versal Adaptable Hardware Engines to enable whole application acceleration, meaning that both the hardware and software can be tuned to ensure maximum performance and efficiency. The portfolio debuts with the Versal Prime series, delivering broad applicability across multiple markets and the Versal AI Core series, delivering an estimated 8x AI inference performance boost compared to industry-leading GPUs, according to Xilinx.

Low-Power AI Solution

Following the AI trend, back in May Lattice Semiconductor unveiled Lattice sensAI, a technology stack that combines modular hardware kits, neural network IP cores, software tools, reference designs and custom design services. In September the company unveiled expanded features of the sensAI stack designed for developers of flexible machine learning inferencing in consumer and industrial IoT applications. Building on the ultra-low power (1 mW to 1 W) focus of the sensAI stack, Lattice released new IP cores, reference designs, demos and hardware development kits that provide scalable performance and power for always-on, on-device AI applications.

Embedded system developers can build a variety of solutions enabled by sensAI. They can build stand-alone iCE40 UltraPlus/ECP5 FPGA based always-on, integrated solutions with latency, security and form factor benefits. Alternatively, they can use the iCE40 UltraPlus as an always-on processor that detects key phrases or objects and wakes up a high-performance AP SoC/ASIC for further analytics only when required, reducing overall system power consumption. Finally, they can use the scalable performance/power benefits of the ECP5 for neural network acceleration, along with its I/O flexibility to seamlessly interface to on-board legacy devices, including sensors and low-end MCUs, for system control.

Figure 2
Human face detection application example. The iCE40 UltraPlus enables AI with an always-on image sensor while consuming less than 1 mW of active power.

Updates to the sensAI stack include a new CNN (convolutional neural network) Compact Accelerator IP core for improved accuracy on iCE40 UltraPlus FPGAs and an enhanced CNN Accelerator IP core for improved performance on ECP5 FPGAs. Software tools include an updated neural network compiler tool with improved ease of use and both Caffe and TensorFlow support for iCE40 UltraPlus FPGAs. Also provided are reference designs and demos enabling human presence detection and hand gesture recognition (Figure 2). New iCE40 UltraPlus development platform support includes a Himax HM01B0 UPduino shield and the DPControl iCEVision board. …

Read the full article in the December issue of Circuit Cellar (#341).

Don’t miss out on upcoming issues of Circuit Cellar. Subscribe today!

Note: We’ve made the October 2017 issue of Circuit Cellar available as a free sample issue. In it, you’ll find a rich variety of the kinds of articles and information that exemplify a typical issue of the current magazine.

COM Express Card Sports 3 GHz Core i3 Processor

Congatec has introduced a Computer-on-Module for the entry level of high-end embedded computing, based on Intel’s latest Core i3-8100H processor platform. The module’s 16 fast PCIe Gen 3.0 lanes make it suited for new artificial intelligence (AI) and machine learning applications requiring multiple GPUs for massive parallel processing.

The new conga-TS370 COM Express Basic Type 6 Computer-on-Module with quad-core Intel Core i3-8100H processor offers a 45 W TDP configurable to 35 W, has 6 MB of cache and provides up to 32 GB of dual-channel DDR4-2400 memory. Compared to the preceding 7th generation of Intel Core processors, the improved memory bandwidth also helps to increase the graphics and GPGPU performance of the integrated new Intel UHD 630 graphics, which additionally features an increased maximum dynamic frequency of up to 1.0 GHz for its 24 execution units. It supports up to three independent 4K displays at up to 60 Hz via DP 1.4, HDMI, eDP and LVDS.

Embedded system designers can now switch from eDP to LVDS purely by modifying the software without any hardware changes. The module further provides exceptionally high bandwidth I/Os including 4x USB 3.1 Gen 2 (10 Gbit/s), 8x USB 2.0 and 1x PEG and 8 PCIe Gen 3.0 lanes for powerful system extensions including Intel Optane memory. All common Linux operating systems as well as the 64-bit versions of Microsoft Windows 10 and Windows 10 IoT are supported. Congatec’s personal integration support rounds off the feature set. Additionally, Congatec also offers an extensive range of accessories and comprehensive technical services, which simplify the integration of new modules into customer-specific solutions.

Congatec | www.congatec.com

MPU Targets AI-Based Imaging Processing

Renesas Electronics has developed the new RZ/A2M microprocessor (MPU) to expand the use of its embedded AI (e-AI) solutions to high-end applications. The new MPU delivers 10 times the image processing performance of its predecessor, the RZ/A1, and incorporates Renesas’ exclusive Dynamically Reconfigurable Processor (DRP), which achieves real-time image processing at low power consumption. This allows embedded devices such as smart appliances, service robots and compact industrial machinery to perform camera-based image recognition and other AI functions while maintaining low power consumption, accelerating the realization of intelligent endpoints.

Currently, there are several challenges to using AI in the operational technology (OT) field, such as difficulty transferring large amounts of sensor data to the cloud for processing, and delays waiting for AI judgments to be transferred back from the cloud. Renesas already offers AI unit solutions that can detect previously invisible faults in real time by minutely analyzing oscillation waveforms from motors or machines. To accelerate the adoption of AI in the OT field, Renesas has developed the RZ/A2M with DRP, which makes possible image-based AI functionality requiring larger volumes of data and more powerful processing performance than achievable with waveform measurement and analysis.

Since real-time image processing can be accomplished while consuming very little power, battery-powered devices can perform tasks such as real-time image recognition based on camera input, biometric authentication using fingerprints or iris scans, and high-speed scanning by handheld scanners. This solves several issues associated with cloud-based approaches, such as the difficulty of achieving real-time performance, assuring privacy and maintaining security.

The RZ/A2M with DRP is a new addition to the RZ/A Series lineup of MPUs equipped with large capacity on-chip RAM, which eliminates the need for external DRAM. The RZ/A Series MPUs address applications employing human-machine interface (HMI) functionality, and the RZ/A2M adds to this capability with features ideal for applications using cameras. It supports the MIPI camera interface, widely used in mobile devices, and is equipped with a DRP for high-speed image processing.
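To give a feel for the kind of per-pixel arithmetic such an image pipeline accelerates, here is a minimal NumPy sketch of a single edge-detection stage. The kernel, frame data and function name are purely illustrative assumptions and have nothing to do with Renesas’ actual DRP toolchain, which reconfigures dedicated hardware rather than running Python.

```python
import numpy as np

def convolve2d(img, kernel):
    """Naive 2D convolution: the per-pixel multiply-accumulate work
    that a reconfigurable image processor parallelizes in hardware."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

# 3x3 Laplacian kernel, a common edge-detect stage in recognition pipelines
laplacian = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=float)

frame = np.zeros((8, 8))
frame[2:6, 2:6] = 1.0            # a bright square on a dark background
edges = convolve2d(frame, laplacian)
print(edges.shape)               # (6, 6)
```

The response is zero inside the uniform square and nonzero along its border, which is exactly the property recognition front-ends exploit before handing features to a classifier.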

Renesas has also boosted network functionality with the addition of two-channel Ethernet support, and enhanced secure functionality with an on-chip hardware encryption accelerator. These features enable safe and secure network connectivity, making the new RZ/A2M best suited for a wide range of systems employing image recognition, from home appliances to industrial machinery.

Samples of the RZ/A2M with DRP are available now. The RZ/A2M MPUs are offered with a development board, reference software, and DRP image-processing library, allowing customers to begin evaluating HMI function and image processing performance. Mass production is scheduled to start in the first quarter of 2019, and monthly production volume for all RZ/A2M versions is anticipated to reach a combined 400,000 units by 2021.

Renesas Electronics | www.renesas.com

SoC Provides Neural Network Acceleration

BrainChip claims to be the first company to bring a production spiking neural network architecture to market. Called the Akida Neuromorphic System-on-Chip (NSoC), the device is small, low cost and low power, making it well suited for edge applications such as advanced driver assistance systems (ADAS), autonomous vehicles, drones, vision-guided robotics, surveillance and machine vision systems. Its scalability allows users to network many Akida devices together to perform complex neural network training and inferencing for many markets including agricultural technology (AgTech), cybersecurity and financial technology (FinTech).

According to Lou DiNardo, BrainChip CEO, Akida, which is Greek for ‘spike,’ represents the first in a new breed of hardware solutions for AI. Artificial intelligence at the edge is going to be as significant and prolific as the microcontroller.

The Akida NSoC uses a pure CMOS logic process, ensuring high yields and low cost. Spiking neural networks (SNNs) are inherently lower power than traditional convolutional neural networks (CNNs), as they replace the math-intensive convolutions and back-propagation training methods with biologically inspired neuron functions and feed-forward training methodologies. BrainChip’s research has determined the optimal neuron model and training methods, bringing unprecedented efficiency and accuracy. Each Akida NSoC has effectively 1.2 million neurons and 10 billion synapses, representing 100 times better efficiency than neuromorphic test chips from Intel and IBM. Comparisons to leading CNN accelerator devices show similar performance gains of an order of magnitude better images/second/watt running industry standard benchmarks such as CIFAR-10 with comparable accuracy.
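The biologically inspired neuron functions mentioned above are typically variants of the leaky integrate-and-fire (LIF) model. The following plain-Python sketch is an illustration only, with leak, weight and threshold values chosen arbitrarily; BrainChip’s actual neuron model is proprietary and certainly more elaborate.

```python
def lif_neuron(input_spikes, weight=0.6, leak=0.9, threshold=1.0):
    """Toy leaky integrate-and-fire neuron: the membrane potential
    decays each step, integrates weighted input spikes, and fires
    (emits a spike and resets) when it crosses the threshold."""
    v = 0.0
    out = []
    for s in input_spikes:
        v = v * leak + weight * s   # leak, then integrate the input
        if v >= threshold:
            out.append(1)           # fire an output spike
            v = 0.0                 # reset the membrane potential
        else:
            out.append(0)
    return out

print(lif_neuron([1, 1, 0, 1, 1, 1, 0, 0]))
```

Note there is no convolution or back-propagation anywhere: the neuron’s event-driven update is a handful of additions and comparisons, which is where the claimed power advantage of SNNs over CNNs comes from.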

The Akida NSoC is designed for use as a stand-alone embedded accelerator or as a co-processor. It includes sensor interfaces for traditional pixel-based imaging, dynamic vision sensors (DVS), Lidar, audio, and analog signals. It also has high-speed data interfaces such as PCI-Express, USB, and Ethernet. Embedded in the NSoC are data-to-spike converters designed to optimally convert popular data formats into spikes to train and be processed by the Akida Neuron Fabric.

Spiking neural networks are inherently feed-forward dataflows, for both training and inference. Ingrained within the Akida neuron model are innovative training methodologies for supervised and unsupervised training. In the supervised mode, the initial layers of the network train themselves autonomously, while in the final fully-connected layers, labels can be applied, enabling these networks to function as classification networks. The Akida NSoC is designed to allow off-chip training in the Akida Development Environment, or on-chip training. An on-chip CPU is used to control the configuration of the Akida Neuron Fabric as well as off-chip communication of metadata.
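In that supervised mode, the labeled final layer effectively classifies on accumulated spike activity. As a loose illustration, the sketch below assigns a label to a spike-count vector by nearest match against made-up signatures; the data, labels and nearest-centroid rule are all assumptions for the example, not Akida’s actual training method.

```python
def spike_counts(spike_trains):
    """Collapse each neuron's spike train into a firing count."""
    return [sum(train) for train in spike_trains]

# Hypothetical labeled spike-count signatures for the final layer
signatures = {"circle": [9, 2, 1], "square": [1, 8, 7]}

def classify(counts):
    """Nearest-signature matching: a stand-in for the label
    assignment a fully-connected classification layer performs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(signatures, key=lambda lbl: dist(signatures[lbl], counts))

trains = [[1, 1, 1, 1, 1, 1, 1, 1, 0], [0, 1, 0], [1, 0]]
print(classify(spike_counts(trains)))   # circle
```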

The Akida Development Environment is available now for early access customers to begin the creation, training, and testing of spiking neural networks targeting the Akida NSoC. The Akida NSoC is expected to begin sampling in Q3 2019.

BrainChip | www.brainchip.com

SDR Meets AI in a Mash-Up of Jetson TX2, Artix-7 and 2×2 MIMO

By Eric Brown

A Philadelphia-based startup called Deepwave Digital has gone to Crowd Supply to launch its “Artificial Intelligence Radio – Transceiver” (AIR-T) SBC. The AIR-T is a software defined radio (SDR) platform for the 300 MHz to 6 GHz range with AI and deep learning hooks designed for “low-cost AI, deep learning, and high-performance wireless systems,” says Deepwave Digital. The 170 mm x 170 mm Mini-ITX board is controlled by an Ubuntu stack running on an Arm hexa-core powered Nvidia Jetson TX2 module. There’s also a Xilinx Artix-7 FPGA and an Analog Devices AD9371 RFIC 2×2 MIMO transceiver.


 
AIR-T with Jetson TX2 module

The AIR-T is available through Aug. 14 for $4,995 on Crowd Supply with shipments due at the end of November. Deepwave Digital has passed the halfway point to its $20K goal, but it’s already committed to building the boards regardless of the outcome.

The AIR-T is designed for researchers who want to apply the deep learning powers of the Jetson TX2’s 256-core Pascal GPU and its CUDA libraries to the SDR capabilities provided by the Artix 7 and AD9371 transceiver. The platform can function as a “highly parallel SDR, data recorder, or inference engine for deep learning algorithms,” and provides for “fully autonomous SDR by giving the AI engine complete control over the hardware,” says Deepwave Digital. Resulting SDR applications can process bandwidths greater than 200 MHz in real-time, claims the company.

The software platform is built around “custom and open” Ubuntu 16.04 software running on the Jetson TX2, as well as custom FPGA blocks that interface with the open source GNU Radio SDR development platform.

The combined stack enables developers to avoid coding CUDA or VHDL. You can prototype in GNU Radio, and then optionally port it to Python or C++. More advanced users can program the Artix 7 FPGA and Pascal GPU directly. AIR-T is described as an “open platform,” but this would appear to refer to the software rather than hardware.
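As a flavor of the kind of prototype that might later be ported, here is a minimal NumPy example that finds the strongest tone in a block of complex (IQ) samples. The sample rate, tone frequency and noise level are assumed values, and plain NumPy stands in here for the GNU Radio blocks and FPGA/GPU acceleration the AIR-T stack actually provides.

```python
import numpy as np

fs = 1_000_000                  # assumed complex sample rate: 1 MS/s
f_tone = 250_000                # assumed signal of interest at +250 kHz
n = 4096                        # samples per processing block

t = np.arange(n) / fs
iq = np.exp(2j * np.pi * f_tone * t)          # clean complex tone
rng = np.random.default_rng(0)
iq += 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))  # noise

spectrum = np.abs(np.fft.fft(iq))             # magnitude spectrum
peak_bin = int(np.argmax(spectrum))
peak_hz = peak_bin * fs / n                   # FFT bin -> frequency (0..fs)
print(peak_hz)                                # 250000.0
```

A signal-detection stage like this is the sort of block one would prototype in GNU Radio, verify in Python, and only then push down into CUDA kernels or FPGA logic for real-time bandwidths.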



AIR-T software flow

The AIR-T enables the development of new wireless technologies, where AI can help maximize resources with today’s increasingly limited spectrum. Potential capabilities include autonomous signal identification and interference mitigation. The AIR-T can also be used for satellite and terrestrial communications. The latter includes “high-power, high-frequency voice communications to 60GHz millimeter wave digital technology,” says Deepwave.

Other applications include video, image, and audio recognition. You can “demodulate a signal and apply deep learning to the resulting image, video, or audio data in one integrated platform,” says the company. The product can also be used for electrical engineering or applied physics research.


Jetson TX2

Nvidia’s Jetson TX2 module features 2x high-end “Denver 2” cores, 4x Cortex-A57 cores, and the 256-core Pascal GPU with CUDA libraries for running machine learning algorithms. The TX2 also supplies the AIR-T with 8 GB of LPDDR4 RAM, 32 GB of eMMC 5.1, and 802.11ac Wi-Fi and Bluetooth.

The Xilinx Artix-7 provides 75k logic cells. The FPGA interfaces with the Analog Devices AD9371 (PDF) dual RF transceiver designed for 300 MHz to 6 GHz frequencies. The AD9371 features 2x RX and 2x TX channels at 100 MHz for each channel, as well as auxiliary observation and sniffer RX channels.

The AIR-T is further equipped with a SATA port and a microSD slot loaded with the Ubuntu stack, as well as GbE, USB 3.0, USB 2.0 and 4K-ready HDMI ports. You also get DIO, an external LO input, a PPS and 10 MHz reference input, and a power supply. It typically runs on 22 W, or as little as 14 W with reduced GPU usage. Other features include 4x MCX-to-SMA cables and an optional enclosure.

Further information

The Artificial Intelligence Radio – Transceiver (AIR-T) is available through Aug. 14 for $4,995 on Crowd Supply — at a 10 percent discount from retail — with shipments due at the end of November. More information may be found on the AIR-T Crowd Supply page and the Deepwave Digital website.

This article originally appeared on LinuxGizmos.com on July 18.

Deepwave Digital | www.deepwavedigital.com

FPGA Solutions Evolve to Meet AI Needs

Brainy System ICs

Long gone now are the days when FPGAs were thought of as simple programmable circuitry for interfacing and glue logic. Today, FPGAs are powerful system chips with on-chip processors, DSP functionality and high-speed connectivity.

By Jeff Child, Editor-in-Chief

Today’s FPGAs have now evolved to the point that calling them “systems-on-chips” is redundant. It’s now simply a given that the high-end lines of the major FPGA vendors have general-purpose CPU cores on them. Moreover, the flavors of signal processing functionality on today’s FPGA chips are ideally suited to the kind of system-oriented DSP functions used in high-end computing. And even better, they’ve enabled AI (Artificial Intelligence) and Machine Learning kinds of functionalities to be implemented into much smaller, embedded systems.

In fact, over the past 12 months, most of the leading FPGA vendors have been rolling out solutions specifically aimed at using FPGA technology to enable AI and machine learning in embedded systems. The two main FPGA market leaders Xilinx and Intel’s Programmable Solutions Group (formerly Altera) have certainly embraced this trend, as have many of their smaller competitors like Lattice Semiconductor and QuickLogic. Meanwhile, specialists in so-called e-FPGA technology like Achronix and Flex Logix have their own compelling twist on FPGA system computing.

Project Brainwave

Exemplifying the trend toward FPGAs facilitating AI processing, Intel’s high-performance line of FPGAs is its Stratix 10 family. According to Intel, the Stratix 10 FPGAs are capable of 10 TFLOPS, or 10 trillion floating point operations per second (Figure 1). In May, Microsoft debuted its Azure Machine Learning Hardware Accelerated Models powered by Project Brainwave, integrated with the Microsoft Azure Machine Learning SDK. Azure’s architecture is developed with Intel FPGAs and Intel Xeon processors.

Figure 1
Stratix 10 FPGAs are capable of 10 TFLOPS or 10 trillion floating point operations per second.

Intel says its FPGA-powered AI is able to achieve extremely high throughput that can run ResNet-50, an industry-standard deep neural network requiring almost 8 billion calculations without batching. This is possible using FPGAs because the programmable hardware—including logic, DSP and embedded memory—enable any desired logic function to be easily programmed and optimized for area, performance or power. And because this fabric is implemented in hardware, it can be customized and can perform parallel processing. This makes it possible to achieve orders of magnitudes of performance improvements over traditional software or GPU design methodologies.

In one application example, Intel cites an effort in which Canada’s National Research Council (NRC) is helping to build the next-generation Square Kilometer Array (SKA) radio telescope, to be deployed in remote regions of South Africa and Australia where viewing conditions are most ideal for astronomical research. The SKA will be the world’s largest radio telescope, 10,000 times faster and with image resolution 50 times greater than the best radio telescopes we have today. This increased resolution and speed results in an enormous amount of image data, with the telescopes processing the equivalent of a year of Internet data every few months.

NRC’s design embeds Intel Stratix 10 SX FPGAs at the Central Processing Facility located at the SKA telescope site in South Africa to perform real-time processing and analysis of collected data at the edge. High-speed analog transceivers allow signal data to be ingested in real time into the core FPGA fabric. After that, the programmable logic can be parallelized to execute any custom algorithm optimized for power efficiency, performance or both, making FPGAs the ideal choice for processing massive amounts of real-time data at the edge.

ACAP for Next Gen

For its part, Xilinx’s high-performance product line is its Virtex UltraScale+ device family (Figure 2). According to the company, these provide the highest performance and integration capabilities in a FinFET node, including the highest signal processing bandwidth at 21.2 TeraMACs of DSP compute performance. They deliver on-chip memory density with up to 500 Mb of total on-chip integrated memory, plus up to 8 GB of HBM Gen2 integrated in-package for 460 GB/s of memory bandwidth. Virtex UltraScale+ devices provide capabilities with integrated IP for PCI Express, Interlaken, 100G Ethernet with FEC and Cache Coherent Interconnect for Accelerators (CCIX).

Figure 2
Virtex UltraScale+ FPGAs provide a signal processing bandwidth at 21.2 TeraMACs. They deliver on-chip memory density with up to 500 Mb of total on-chip integrated memory, plus up to 8 GB of HBM Gen2 integrated in-package for 460 GB/s of memory bandwidth.

Looking to the next phase of system performance, Xilinx in March announced its strategy toward a new FPGA product category it calls the adaptive compute acceleration platform (ACAP). Touted as going beyond the capabilities of an FPGA, an ACAP is a highly integrated multi-core heterogeneous compute platform that can be changed at the hardware level to adapt to the needs of a wide range of applications and workloads. An ACAP’s adaptability, which can be applied dynamically during operation, delivers levels of performance and performance per watt unmatched by CPUs or GPUs, says Xilinx…

Read the full article in the August issue (#337) of Circuit Cellar

Multiphase PMICs Boast High Efficiency and Small Footprint

Renesas Electronics has announced three programmable power management ICs (PMICs) that offer high power efficiency and small footprint for application processors in smartphones and tablets: the ISL91302B, ISL91301A and ISL91301B PMICs. The PMICs also deliver power to artificial intelligence (AI) processors, FPGAs and industrial microprocessors (MPUs). They are also well suited for powering the supply rails in solid-state drives (SSDs), optical transceivers, and a wide range of consumer, industrial and networking devices. The ISL91302B dual/single output, multiphase PMIC provides up to 20 A of output current and 94% peak efficiency in a 70 mm2 solution size that is more than 40% smaller than competitive PMICs.

In addition to the ISL91302B, Renesas’ ISL91301A triple output PMIC and ISL91301B quad output PMIC both deliver up to 16 A of output power with 94% peak efficiency. The new programmable PMICs leverage Renesas’ R5 Modulation Technology to provide fast single-cycle transient response, digitally tuned compensation, and ultra-high 6 MHz (maximum) switching frequency during load transients. These features make it easier for power supply designers to design boards with 2 mm x 2 mm, 1 mm low-profile inductors, small capacitors and only a few passive components.

Renesas PMICs also do not require external compensation components or external dividers to set operating conditions. Each PMIC dynamically changes the number of active phases for optimum efficiency at all output currents. Their low quiescent current, superior light load efficiency, regulation accuracy, and fast dynamic response significantly extend battery life for today’s feature-rich, power hungry devices.
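The idea of dynamically changing the number of active phases can be illustrated with a toy controller in Python. The 5 A per-phase limit and four-phase maximum below are assumptions chosen for the example, and the simple ceiling-divide decision rule is a simplification, not Renesas’ R5 modulator.

```python
import math

def active_phases(load_a, max_per_phase=5.0, total_phases=4):
    """Pick the fewest phases that can carry the load: fewer active
    phases means lower switching losses at light load, while more
    phases share current at heavy load for better efficiency."""
    needed = max(1, math.ceil(load_a / max_per_phase))
    return min(needed, total_phases)

def per_phase_current(load_a, **kw):
    """Return (active phase count, balanced current per phase)."""
    n = active_phases(load_a, **kw)
    return n, load_a / n

print(per_phase_current(3.0))    # light load: one phase carries 3.0 A
print(per_phase_current(18.0))   # heavy load: four phases at 4.5 A each
```

A real controller also has to add and drop phases smoothly to avoid output transients, which is the part the marketing copy is pointing at with “smooth phase adding and dropping.”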

Key Features of ISL91302B PMIC:

  • Available in three factory configurable options for one or two output rails:
    • Dual-phase (2 + 2) configuration supporting 10 A from each output
    • Triple-phase (3 + 1) configuration supporting 15 A from one output and 5 A from the second output
    • Quad-phase (4 + 0) configuration supporting 20 A from one output
  • Small solution size: 7 mm x 10 mm for 4-phase design
  • Input supply voltage range of 2.5 V to 5.5 V
  • I2C or SPI programmable Vout from 0.3 V to 2 V
  • R5 modulator architecture balances current loads with smooth phase adding and dropping for power efficiency optimization
  • Provides 75 μA quiescent current in discontinuous current mode (DCM)
  • Independent dynamic voltage scaling for each output
  • ±0.7% system accuracy for -10°C to 85°C with remote voltage sensing
  • Integrated telemetry ADC senses phase currents, output current, input/output voltages, and die temperature, enabling PMIC diagnostics during operation
  • Soft-start and fault protection against under voltage (UV), over voltage (OV), over current (OC), over temperature (OT), and short circuit

Key Features of ISL91301A and ISL91301B PMICs

  • Available in two factory configurable options:
    • ISL91301A: dual-phase, three output rails configured as 2+1+1 phase
    • ISL91301B: single-phase, four output rails configured as 1+1+1+1 phase
  • 4 A per phase for 2.8 V to 5.5 V supply voltage
  • 3 A per phase for 2.5 V to 5.5 V supply voltage
  • Small solution size: 7 mm x 10 mm for 4-phase design
  • I2C or SPI programmable Vout from 0.3 V to 2 V
  • Provides 62 μA quiescent current in DCM
  • Independent dynamic voltage scaling for each output
  • ±0.7% system accuracy for -10°C to 85°C with remote voltage sensing
  • Soft-start and fault protection against UV, OV, OC, OT, and short circuit

Pricing and Availability

The ISL91302B dual/single output PMIC is available now in a 2.551 mm x 3.670 mm WLCSP package and is priced at $3.90 in 1k quantities. For more information on the ISL91302B, please visit: www.intersil.com/products/isl91302B.

The ISL91301A triple-output PMIC and ISL91301B quad-output PMIC are available now in 2.551 mm x 2.87 mm, 42-ball WLCSP packages, both priced at $3.12 in 1k quantities. For more information on the ISL91301A, please visit: www.intersil.com/products/isl91301A. For more information on the ISL91301B, please visit: www.intersil.com/products/isl91301B.

Renesas Electronics | www.renesas.com

Movidius AI Acceleration Technology Comes to a Mini-PCIe Card

By Eric Brown

UP AI Core (front)

As promised by Intel when it announced an Intel AI: In Production program for its USB stick form factor Movidius Neural Compute Stick, Aaeon has launched a mini-PCIe version of the device called the UP AI Core. It similarly integrates Intel’s AI-infused Myriad 2 Vision Processing Unit (VPU). The mini-PCIe connection should provide faster response times for neural networking and machine vision compared to connecting to a cloud-based service.

UP AI Core (back)

The module, which is available for pre-order at $69 for delivery in April, is designed to “enhance industrial IoT edge devices with hardware accelerated deep learning and enhanced machine vision functionality,” says Aaeon. It can also enable “object recognition in products such as drones, high-end virtual reality headsets, robotics, smart home devices, smart cameras and video surveillance solutions.”


UP Squared

The UP AI Core is optimized for Aaeon’s Ubuntu-supported UP Squared hacker board, which runs on Intel’s Apollo Lake SoCs. However, it should work with any 64-bit x86 computer or SBC equipped with a mini-PCIe slot that runs Ubuntu 16.04. Host systems also require 1 GB of RAM and 4 GB of free storage. That presents plenty of options for PCs and embedded computers, although the UP Squared is currently the only x86-based, community-backed SBC equipped with a mini-PCIe slot.

Myriad 2 architecture

Aaeon had few technical details about the module, except to say it ships with 512 MB of DDR RAM and offers ultra-low power consumption. The UP AI Core’s mini-PCIe interface likely provides a faster response time than the USB link used by Intel’s $79 Movidius Neural Compute Stick. Aaeon makes no claims to that effect, however, perhaps to avoid disparaging Intel’s Neural Compute Stick or other USB-based products that might emerge from the Intel AI: In Production program.

Intel’s Movidius Neural Compute Stick

It’s also possible that the performance difference between the two products is negligible, especially compared with the difference between either local processing solution and a cloud connection. Cloud-based connections for accessing neural networking services suffer from problems with latency, network bandwidth, reliability, and security, says Aaeon. The company recommends using the Linux-based SDK to “create and train your neural network in the cloud and then run it locally on AI Core.”

Performance issues aside, because a mini-PCIe module is usually embedded inside a computer, it provides more security than a USB stick. On the other hand, that same trait hinders ease of mobility. Unlike the UP AI Core, the Neural Compute Stick can run on an Arm-based Raspberry Pi, but only with the help of the Raspbian Stretch desktop or an Ubuntu 16.04 VirtualBox instance.

In 2016, before it was acquired by Intel, Movidius launched its first local-processing version of the Myriad 2 VPU technology, called the Fathom. This Ubuntu-driven USB stick, which miniaturized the technology in the earlier Myriad 2 reference board, is essentially the same technology that re-emerged as Intel’s Movidius Neural Compute Stick.

UP AI Core, front and back

Neural network processors can significantly outperform traditional computing approaches in tasks like language comprehension, image recognition, and pattern detection. The vast majority of such processors — which are often repurposed GPUs — are designed to run on cloud servers.

AIY Vision Kit

The Myriad 2 technology can translate deep learning frameworks like Caffe and TensorFlow into its own format for rapid prototyping. This is one reason why Google adopted the Myriad 2 technology for its recent AIY Vision Kit for the Raspberry Pi Zero W. The kit’s VisionBonnet pHAT board uses the same Movidius MA2450 chip that powers the UP AI Core. On the VisionBonnet, the processor runs Google’s open source TensorFlow machine intelligence library for neural networking, enabling visual perception processing at up to 30 frames per second.

Intel and Google aren’t alone in their desire to bring AI acceleration to the edge. Huawei released its Kirin 970 SoC for the Mate 10 Pro phone with a neural processing coprocessor, and Qualcomm followed up with the Snapdragon 845 SoC and its own neural accelerator. The Snapdragon 845 will soon appear in the Samsung Galaxy S9, among other phones, and is also heading for some high-end embedded devices.

Last month, Arm unveiled two new Project Trillium AI chip designs intended for use as mobile and embedded coprocessors. Available now is Arm’s second-gen Object Detection (OD) Processor for optimizing visual processing and people/object detection. Due this summer is a Machine Learning (ML) Processor, which will accelerate AI applications including machine translation and face recognition.

Further information

The UP AI Core is available for pre-order at $69 for delivery in late April. More information may be found at Aaeon’s UP AI Core announcement and its UP Community UP AI Edge page for the UP AI Core.

Aaeon | www.aaeon.com

This article originally appeared on LinuxGizmos.com on March 6.

NVIDIA Graphics Tapped for Mercedes-Benz MBUX AI Cockpit

At the CES show last month, Mercedes-Benz unveiled its NVIDIA-powered MBUX infotainment system, a next-gen car cabin experience that can learn and adapt to driver and passenger preferences, thanks to artificial intelligence.

According to NVIDIA, all the key MBUX systems are built together with NVIDIA, and they’re all powered by NVIDIA. The announcement comes a year after NVIDIA CEO Jensen Huang joined Mercedes-Benz execs on stage at CES 2017 and said that their companies were collaborating on an AI car that would be ready in 2018.

Powered by NVIDIA graphics and deep learning technologies, the Mercedes-Benz User Experience, or MBUX, has been designed to deliver beautiful new 3D touch-screen displays. It can be controlled with a new voice-activated assistant that can be summoned with the phrase “Hey, Mercedes.” It’s an intelligent learning system that adapts to the requirements of customers, remembering such details as seat and steering wheel settings, lights and other comfort features.

The MBUX announcement highlights the importance of AI to next-generation infotainment systems inside the car, even as automakers race to put AI to work helping vehicles navigate the world around them autonomously. The new infotainment system aims to use AI to adapt itself to drivers and passengers, automatically suggesting your favorite music for the drive home or offering directions to a favorite restaurant at dinner time. It’s also a system that will benefit from “over-the-air” updates delivering new features and capabilities.

Debuting this month (February) in the new Mercedes-Benz A-Class, MBUX will power dramatic wide-screen displays that provide navigation, infotainment and other capabilities, touch-control buttons on the car’s steering wheel, and an intelligent assistant that can be summoned with a voice command. It’s an interface that can change its look to reflect the driver’s mood, whether they’re seeking serenity or excitement, and understand the way a user talks.

NVIDIA | www.nvidia.com