A Comparison of NVIDIA’s Embedded AI Computing Platform Options

Created: November 27, 2019 | Updated: March 30, 2021


Each new embedded AI computing platform blurs the lines between man and machine...

When I first got started with machine learning applications, specifically text classification and sentiment analysis, I was working with an eCommerce client and building everything on a web server. Given the huge amount of data we continue to work with on a daily basis, it only made sense to build our platform on a dedicated server. For some AI applications, deploying your solution in a server or data center environment still makes sense and will continue to be the best course of action in the future.

Given the current workloads being placed on cloud-based AI solutions, designers can take an environmentally friendly and computationally efficient step by removing these solutions from the data center environment and placing them at the edge. The recent explosion of AI computing hardware platforms is helping to move computing workloads away from data centers to the edge and is enabling a new wave of AI/ML-enabled products.

NVIDIA is arguably the leader in this space. The company has released a number of modules with varying levels of computational power. Let’s look at the various options available for embedded engineers and how these options can be integrated into a fully customized board.

Why Use a Specialized AI Computing Platform?

This is a fair question, particularly when one looks at the broad range of single board computers (SBCs) available on the market. The answer comes down to processing power. Newer AI-specialized boards provide greater parallelization than the fastest SBCs available on the market. You’ll tend to see GPUs with a high number of cores on an AI computing platform that is intended for embedded systems, while SBCs will typically contain a general-purpose processor without the same level of parallelization.

This isn’t to say that a typical SBC like a Raspberry Pi Compute Module can’t be used for AI/ML tasks; it certainly can. Different AI-driven applications require different levels of processing power, RAM, and storage space, all of which will determine latency. No matter the AI-enabled application, you should choose your AI computing platform based on the amount of data you need to process and the algorithm that runs your solution.

For simpler ML tasks, such as prediction from numerical data using a simple neural network or related regression model, SBCs may be perfectly capable of handling the computational load. More intensive tasks that involve model training with huge datasets, particularly image or video classification, take much more processing power and memory to quickly produce accurate results. For typical SBCs to measure up to AI-specific hardware, they often need to be built as clusters to accommodate numerous sensors and distribute computationally intensive workloads among multiple processors. NVIDIA’s AI computing platform options and other AI-specific modules are designed with sufficient computing power, making clustering unnecessary.
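To give a sense of how light the "simpler ML tasks" mentioned above can be, here is a minimal sketch of a regression model in plain Python. The function names and the sensor readings are hypothetical and chosen only for illustration; the point is that fitting and evaluating a model like this takes a handful of arithmetic operations, which is well within reach of an ordinary SBC.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance of x and y, and unnormalized variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical sensor readings (input, measured output), not real data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
model = fit_line(xs, ys)
print(predict(model, 5.0))
```

Image or video classification replaces these few multiplications with millions or billions per input, which is where the parallel hardware discussed here starts to matter.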


You can move your application outside the data center with the right GPU-enabled AI computing platform.

NVIDIA’s AI-Capable Options

Currently, NVIDIA offers four AI-specialized options in its Jetson series that are well suited to embedded applications, each with a different level of processing power:

Nano. This module has the lowest price point and is an excellent entry-level option. It delivers 472 GFLOPS and can run multiple neural networks in parallel. It connects to a carrier board through a 260-pin SO-DIMM edge connector.
TX2 Series. At a higher price point than the Nano, this series brings 1.3 TFLOPS of processing power on 256 GPU cores in a similar footprint. It also includes 32 GB of onboard storage and can interface with other devices via GPIO, I2C, I2S, SPI, CAN, and UART.
Xavier NX. This module offers even greater performance with a 384-core GPU and 21 TOPS of computational power. Like the Nano, it plugs into a carrier board through a 260-pin SO-DIMM edge connector.
AGX Xavier Series. This is the most powerful option in the series. Whichever module you choose becomes the foundation of your computing environment, and making the wrong choice here can limit the performance of your application.
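One rough way to use the peak figures quoted in the list above is to estimate the throughput a workload needs and see which modules clear that bar. The sketch below is only a back-of-the-envelope aid: the function names, the workload numbers, and the 30% utilization factor are all assumptions, and the quoted figures are vendor peak numbers that may refer to different numeric precisions. The AGX Xavier Series is left out because no figure is given for it here.

```python
# Peak throughput as quoted in the list above, in units of 10^9 operations
# per second. These are peak numbers at possibly different precisions, so
# treat them only as rough upper bounds.
QUOTED_PEAK_GOPS = {
    "Nano": 472.0,
    "TX2 Series": 1300.0,
    "Xavier NX": 21000.0,
}


def required_gops(ops_per_inference, inferences_per_second, utilization=0.3):
    """Peak throughput needed if only a fraction of peak is usable.

    utilization is a hypothetical derating factor, not a measured value.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return ops_per_inference * inferences_per_second / utilization / 1e9


def candidate_modules(ops_per_inference, inferences_per_second, utilization=0.3):
    """Return the modules whose quoted peak meets the derated requirement."""
    need = required_gops(ops_per_inference, inferences_per_second, utilization)
    return [name for name, peak in QUOTED_PEAK_GOPS.items() if peak >= need]


# Hypothetical workload: 4 billion operations per frame at 30 frames/s.
print(candidate_modules(4e9, 30))
```

An estimate like this only narrows the field. Memory capacity, power budget, and measured benchmarks on your actual model should drive the final choice.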

All of these modules are built around Arm-based CPU cores, meaning the embedded software developer will find plenty of resources for building software for them. The Nano and TX2 Series modules are the better fit for deployment across a number of devices in the field. They have the processing power and onboard memory required to execute a number of complex AI/ML algorithms. Example applications include image/video classification, prediction using information from an array of sensors, small robots, and speech recognition.

In contrast, the Xavier NX and AGX Xavier Series boards provide enough power to train complex AI/ML models on extremely large datasets that would otherwise take hours on a dedicated general-purpose server. In my work with text classification and sentiment scoring, training a multinomial naive Bayes classifier on a scant 41,000 scored text entries takes hours on a dedicated server. Imagine training the same classifier with video, audio, or image data; the time required simply becomes prohibitive, especially in an embedded environment. Something like an Xavier NX or AGX Xavier Series module is ideal for training, after which the model can be deployed to a Nano or TX2 Series module in the field.
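For readers who haven't worked with the type of classifier mentioned above, here is a minimal sketch of a multinomial naive Bayes text classifier in plain Python. It is not the implementation used in the project described here; the class name, the whitespace tokenization, the add-one smoothing, and the four-sentence training set are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Bag-of-words multinomial naive Bayes with add-one smoothing."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocabulary.update(words)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny made-up training set, for illustration only.
docs = ["great product fast shipping", "terrible quality broke quickly",
        "love it works great", "awful support terrible experience"]
labels = ["positive", "negative", "positive", "negative"]
clf = MultinomialNB().fit(docs, labels)
print(clf.predict("great quality"))
```

Training here is just counting words, so the cost grows with the size of the corpus and the vocabulary. Prediction needs only the stored counts, which is why a trained model can be handed off to a smaller module in the field.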

Modular Boards with the NVIDIA Jetson Nano COM

The NVIDIA Jetson Nano board is ideal for modular electronics applications in that it can be easily placed in a customized motherboard and integrated with an array of sensors and other peripherals. This module is ideal for edge computing with a deployed AI/ML model for data-intensive applications. Rather than using a cloud-deployed equivalent, where data needs to be sent to a data center for processing and instructions sent back to the embedded module, placing a Jetson Nano into a customized board gives you an embedded solution that gathers and processes large amounts of data directly in the field.
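The cloud round trip described above can be put into rough numbers. The sketch below compares the two paths with simple arithmetic; every figure in it (payload size, uplink speed, round-trip time, inference times) is a hypothetical placeholder rather than a measurement, and the function names are made up for this example. It also ignores the size of the response sent back from the data center.

```python
def cloud_latency_ms(payload_bytes, uplink_mbps, round_trip_ms, server_infer_ms):
    """Time to upload one sample, run inference remotely, and get a reply."""
    # Convert bytes to bits, divide by the link rate in bits/s, scale to ms.
    upload_ms = payload_bytes * 8 / (uplink_mbps * 1e6) * 1e3
    return upload_ms + round_trip_ms + server_infer_ms


def edge_latency_ms(local_infer_ms):
    """On-device inference has no network component."""
    return local_infer_ms


# All numbers below are hypothetical placeholders, not measurements.
cloud = cloud_latency_ms(payload_bytes=250_000, uplink_mbps=10,
                         round_trip_ms=60, server_infer_ms=5)
edge = edge_latency_ms(local_infer_ms=40)
print(f"cloud: {cloud:.0f} ms, edge: {edge:.0f} ms")
```

Even when the server runs the model faster than the embedded module, the upload time can dominate for image or video payloads, which is the case edge deployment is meant to address.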

Working with a COM like the Jetson Nano is easiest when you take a modular approach to building your board. This essentially means you are taking advantage of standardized electrical connections between a number of modules to build out your new product, rather than taking time to design every portion of the platform from scratch. The modular design tools in Upverter are ideal for modular SBC design; you’ll be able to access the Jetson Nano and immediately bring it into your board alongside other peripherals, such as WiFi capabilities, an array of sensors, and standard components like power regulation modules.


I built this AI-capable board in Upverter in about 10 minutes

If you’re an AI entrepreneur and you want to quickly deploy the first version of your new product, then using the modular design tools in Upverter is a great way to get started. This is especially true if you’re a software developer, or if you have a great idea but lack PCB design experience. The design interface in Upverter is so easy to use that you can create a fully functional, fully manufacturable AI solution in a matter of minutes.

Take a look at the Jetson Nano module in Upverter if you’re interested in taking advantage of a cutting-edge AI computing platform. Upverter also gives you access to industry-standard COMs and popular modules, allowing you to create production-ready hardware for nearly any embedded AI application.

Take a look at some Gumstix customer success stories or contact us today to learn more about our products, design tools, and services.


About Author

Zachariah Peterson has an extensive technical background in academia and industry. He currently provides research, design, and marketing services to companies in the electronics industry. Prior to working in the PCB industry, he taught at Portland State University and conducted research on random laser theory, materials, and stability. His background in scientific research spans topics in nanoparticle lasers, electronic and optoelectronic semiconductor devices, environmental sensors, and stochastics. His work has been published in over a dozen peer-reviewed journals and conference proceedings, and he has written 1000+ technical blogs on PCB design for a number of companies. He is a member of IEEE Photonics Society, IEEE Electronics Packaging Society, and the American Physical Society, and he currently serves on the INCITS Quantum Computing Technical Advisory Committee.