Providing Higher-Performance Solutions for Rapidly Growing Network Edge AI Applications

Network edge artificial intelligence applications such as presence detection and object counting are becoming increasingly popular, and designers increasingly demand low-power, small-size network edge AI solutions that do not compromise performance. The latest version of Lattice’s sensAI technology stack, for ECP5 and iCE40 UltraPlus FPGAs, provides designers with the hardware platforms, IP, software tools, reference designs, and design services required to implement low-power, high-performance AI at the network edge.

Lattice Semiconductor White Paper

August 2019


Contents

Chapter 1 Summary

Chapter 2 Leveraging the Advantages of FPGAs

Chapter 3 Major Updates

Chapter 4 sensAI Design Cases

Chapter 5 Conclusion


Summary

The market for low-cost, high-performance network edge solutions is increasingly competitive. Leading market research firms predict that the network edge solutions market will explode over the next six years: IHS predicts that by 2025 more than 40 billion devices will be operating at the network edge, and market intelligence firm Tractica forecasts that by then more than 2.5 billion network edge devices will ship each year.

With the emergence of a new generation of network edge applications, designers increasingly want solutions that combine low power consumption and small size without sacrificing performance. Driving these new AI solutions is a growing list of network edge applications, such as presence detection for smart doorbells and security cameras in home control, object counting for inventory in retail, and object and presence detection in industrial settings. On one hand, the market demands solutions with higher performance than ever before; on the other, latency, bandwidth, privacy, power, and cost concerns keep designers from relying on cloud computing resources to perform the analysis.

At the same time, performance, power, and cost constraints vary from application to application. As the data demands of always-on network edge applications continue to drive demand for cloud-based services, designers must still solve the traditional problems of power consumption, board area, and cost. How can developers satisfy increasingly stringent system requirements for power (milliwatts) and size (5 mm² to 100 mm²)? No single conventional approach meets all of these performance requirements at once.

Leveraging the Advantages of FPGAs

Lattice FPGAs are uniquely positioned to meet the rapidly changing market demands of network edge devices. One way designers can quickly bring more computing resources to network edge devices without relying on the cloud is to use the parallel processing capabilities of FPGAs to accelerate neural network performance. In addition, by using low-density FPGAs in small packages optimized for low-power operation, designers can meet the stringent power and size constraints of new consumer and industrial applications. For example, Lattice’s iCE40 UltraPlus™ and ECP5™ product families support network edge solutions with power consumption from as low as 1 mW up to 1 W, and hardware platform footprints from 5.5 mm² to 100 mm². By combining ultra-low power, high performance, and high accuracy with comprehensive support for legacy interfaces, these FPGAs give network edge device developers the flexibility they need to meet changing design requirements.


Figure 1: Lattice Semiconductor’s low-power, small-size FPGAs provide the right combination of performance and features to support artificial intelligence applications at the network edge

To meet this demand and accelerate development, Lattice launched the industry’s first such technology stack, sensAI™, which gives designers all the tools needed to develop low-power, high-performance network edge devices for smart homes, smart factories, smart cities, and smart cars. sensAI targets the growing demand for AI-enabled network edge devices, providing comprehensive hardware and software solutions for implementing low-power, always-on AI functionality in smart devices running at the network edge. Launched in 2018, it lets designers seamlessly create new designs or update existing ones, with low-power AI inference optimized for these new application requirements.

What does this comprehensive design ecosystem include? First, Lattice’s modular hardware platforms, such as the iCE40 UPduino 2.0 with HM01B0 Shield development board and the ECP5-based Embedded Vision Development Kit (EVDK), provide a solid foundation for application development. The UPduino suits AI designs that need only a few milliwatts, while the EVDK supports applications that draw more power but typically stay below 1 W.

Soft IP can easily be instantiated in an FPGA to accelerate neural network development, so the sensAI development kit includes a CNN accelerator IP core that lets designers implement deep learning applications on iCE40 UltraPlus FPGAs. sensAI also provides a complete, parameterizable CNN accelerator IP core for Lattice’s ECP5 FPGAs. These cores support variable quantization, which lets designers trade data accuracy against power consumption.
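The accuracy-versus-power trade-off behind variable quantization can be illustrated with a short sketch (a hypothetical illustration, not Lattice’s actual IP): fewer weight bits shrink memory traffic and logic, at the cost of larger rounding error.

```python
import numpy as np

def quantize(weights, n_bits, frac_bits):
    """Round weights to signed fixed-point: n_bits total, frac_bits fractional."""
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return np.clip(np.round(weights * scale), lo, hi) / scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.3, size=1000)          # toy layer weights

for bits in (16, 8, 4):
    err = np.abs(w - quantize(w, bits, bits - 1)).mean()
    print(f"{bits:2d}-bit weights: mean abs quantization error = {err:.5f}")
```

As the bit width drops, the mean error grows; whether 8 or 4 bits is acceptable depends on how much accuracy a given application can give up for power.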

Lattice’s sensAI technology stack lets designers explore design options and trade-offs through an easy-to-use tool flow. Designers can train networks with industry-standard frameworks such as Caffe, TensorFlow, and Keras. The development environment also provides a neural network compiler that maps a trained network model to a fixed-point representation and supports variable quantization of weights and activations. The compiler helps designers analyze, simulate, and compile different types of networks for Lattice’s accelerator IP cores without RTL experience. Designers can then implement the full FPGA design using traditional FPGA design tools such as Lattice Radiant and Diamond.

To speed design implementation, sensAI offers a growing set of reference designs and demos, including face recognition, gesture detection, keyword detection, human presence detection, face tracking, object counting, and speed-sign detection. Finally, design teams usually need a certain amount of expertise to complete a design; to meet this need, Lattice has partnered with numerous design service companies around the world to support customers with limited AI/ML expertise.


Figure 2: Lattice sensAI is a complete set of hardware and software solutions for developing artificial intelligence applications at the network edge

Major Updates

To meet the rapidly growing performance requirements of AI at the network edge, Lattice released a sensAI update in 2019 that boosts performance and streamlines the design process. The updated sensAI performs up to 10 times better than the previous version, thanks to multiple optimizations: an updated CNN IP core and neural network compiler, 8-bit activation quantization, intelligent layer merging, a dual-DSP engine, and optimized memory access.

In the latest version, the updated neural network compiler supports 8-bit input data, and the memory access sequence has been greatly optimized. As a result, accesses to external memory are cut in half, and higher-resolution images can be used as input; with higher-resolution images, the solution is naturally more accurate.

To further boost performance, Lattice optimized the convolutional layers in the sensAI neural network to reduce the time spent on convolution calculations: doubling the number of convolution engines in the device cut convolution time by approximately 50%.

Lattice improved sensAI’s performance without increasing power consumption, so designers can choose devices with fewer gates in the ECP5 FPGA family. Optimized demos illustrate the gains. One demo, optimized for low-power operation, detects people using a CMOS image sensor feeding a VGG8 network at 64 x 64 x 3 resolution; running on an iCE40 UltraPlus FPGA at 5 frames per second, it consumes only 7 mW. A second, performance-optimized demo for people-counting applications also uses a CMOS image sensor, feeding a VGG8 network at 128 x 128 x 3 resolution; running on an ECP5-85K FPGA at 30 frames per second, it consumes 850 mW.
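The two demos above imply very different energy budgets per inference; dividing power by frame rate gives the energy cost of each frame (simple arithmetic on the figures quoted above):

```python
# Energy per frame implied by the two demo operating points quoted above.
demos = {
    "iCE40 UltraPlus, VGG8 @ 64x64x3": (0.007, 5),    # (watts, frames/s)
    "ECP5-85K, VGG8 @ 128x128x3":      (0.850, 30),
}
for name, (watts, fps) in demos.items():
    mj_per_frame = watts / fps * 1000   # W / (frames/s) = J/frame -> mJ/frame
    print(f"{name}: {mj_per_frame:.2f} mJ per frame")
```

The low-power demo spends about 1.4 mJ per inference, while the higher-resolution, higher-frame-rate demo spends roughly 28 mJ, which is the kind of power/performance trade-off Figure 3 summarizes.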


Figure 3: These reference designs show the power consumption and performance options offered by sensAI

At the same time, sensAI gives users a seamless design experience. It supports more neural network models and machine learning frameworks, shortening the design cycle, and new customizable reference designs simplify the development of common network edge solutions such as object counting and presence detection. Meanwhile, the ecosystem of design partners continues to expand, providing users with important design services. With all this, Lattice gives developers the key tools they need to replicate or adapt these designs. For example, the block diagram below shows the comprehensive set of components Lattice provides, including trained models, training data sets, training scripts, updated neural network IP, and the neural network compiler.


Figure 4: sensAI’s design process includes industry-leading machine learning frameworks, training data and scripts, neural network IP, and other resources necessary to design and train network edge AI devices

Lattice has also expanded its support for machine learning frameworks and is committed to a seamless user experience. The initial version of sensAI supported Caffe and TensorFlow, and the latest version adds support for Keras, an open-source neural network library written in Python that runs on top of TensorFlow, the Microsoft Cognitive Toolkit, or Theano. Keras aims to help engineers implement deep neural networks quickly, giving users a friendly, modular, and extensible environment that accelerates prototyping. Keras was originally conceived as an interface rather than a standalone machine learning framework; its high level of abstraction lets developers speed up the development of deep learning models.

To further simplify use, Lattice has updated the sensAI neural network compiler, which now automatically selects the most accurate fixed-point scaling (the number of fractional bits) when converting a machine learning model into a firmware file. The sensAI update also adds a hardware debugging tool that lets users read and write each layer of the neural network. After software simulation, engineers still need to know how their network performs on actual hardware; with this tool, they can see hardware results in just a few minutes.
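In spirit, the compiler’s automatic scale selection can be sketched as a search over fractional-bit counts, picking the split that minimizes quantization error for a given layer (an illustrative sketch only, not Lattice’s actual algorithm):

```python
import numpy as np

def best_frac_bits(weights, total_bits=8):
    """Try every fractional-bit split of a signed fixed-point format and
    return the one with the smallest mean quantization error."""
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    errors = {}
    for frac in range(total_bits):
        scale = 2 ** frac
        q = np.clip(np.round(weights * scale), lo, hi) / scale
        errors[frac] = np.abs(weights - q).mean()
    return min(errors, key=errors.get)

rng = np.random.default_rng(1)
print(best_frac_bits(rng.normal(0, 0.05, 5000)))  # small weights: many frac bits
print(best_frac_bits(rng.normal(0, 4.0, 5000)))   # large weights: few frac bits
```

Layers with small-magnitude weights get more fractional bits (finer resolution), while layers with large-magnitude weights need more integer bits to avoid clipping, which is why a per-layer automatic choice beats a single global setting.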

In addition, the latest version of sensAI is supported by a growing number of companies that provide design services and product development expertise optimized for low-power, always-on network edge devices. These companies help customers build network edge AI devices by seamlessly updating existing designs or developing complete solutions for specific applications.

sensAI Design Cases

Lattice’s new higher-performance solutions can be used in the following four accelerator design cases. In the first (Figure 5), design engineers use sensAI to build a solution that operates standalone. This system architecture lets them develop always-on, integrated solutions on Lattice iCE40 UltraPlus or ECP5 FPGAs with low latency and high security, and spare FPGA resources can be used for system control. A typical application uses independently operating sensors to detect and count people.


Figure 5: Use sensAI as a standalone AI processing solution at the edge of the network

Designers also use sensAI to develop two different types of preprocessing solutions. In the first (Figure 6), the designer pairs Lattice sensAI with a low-power iCE40 UltraPlus FPGA to preprocess sensor data, minimizing the cost of transferring data to an SoC or the cloud for analysis. In a smart doorbell, for example, sensAI first reads data from the image sensor; if it determines that the visitor is not a person (a cat, say), the system does not wake the SoC or connect to the cloud for further processing, minimizing data transmission costs and power consumption. If the preprocessing system determines that the object at the door is a person, it wakes the SoC for further processing. This greatly reduces the amount of data the system must process while lowering its power requirements, which is essential for always-on network edge applications.
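The wake decision described above amounts to a small gating function in front of the SoC. Here is a hypothetical sketch (class names and the confidence threshold are illustrative, not from Lattice’s design):

```python
# Two-stage wake flow: a tiny always-on classifier gates whether the SoC wakes.
WAKE_CLASSES = {"person"}
CONFIDENCE_THRESHOLD = 0.8      # illustrative value

def gate_frame(label, confidence):
    """Decide whether the low-power preprocessing stage should wake the SoC."""
    if label in WAKE_CLASSES and confidence >= CONFIDENCE_THRESHOLD:
        return "wake_soc"       # person detected: hand off for full processing
    return "stay_asleep"        # cat, empty porch, low confidence: keep sleeping

print(gate_frame("cat", 0.95))     # -> stay_asleep
print(gate_frame("person", 0.91))  # -> wake_soc
```

The power savings come from how rarely the gate opens: the SoC and the cloud link stay idle for every frame the FPGA classifier rejects.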


Figure 6: In this case, sensAI will preprocess the sensor data to determine whether the data needs to be sent to the SoC for further processing

In the second preprocessing application, designers use an ECP5 FPGA for neural network acceleration (Figure 7). In this case, they exploit the flexibility of the ECP5’s I/O to connect various existing on-board devices (such as sensors) to a low-end MCU, achieving highly flexible system control.


Figure 7: The second system architecture also uses preprocessing. Designers can use ECP5 and sensAI to preprocess sensor data to enhance the comprehensive performance of the neural network

Designers can also use sensAI accelerators in post-processing systems (Figure 8). A growing number of design cases show that while many companies have proven MCU-based solutions, they want to add AI functionality without replacing components or redesigning the system. In some cases, however, their MCUs lack the necessary performance; typical examples are smart industrial or smart home applications where images must be filtered before analysis. Designers can either add another MCU and endure a time-consuming design verification process, or insert an accelerator between the MCU and the data center for post-processing, minimizing the amount of data sent to the cloud. This approach is particularly attractive to IoT device developers who want to add AI capabilities.


Figure 8: Enhance the MCU-based design through sensAI, allowing existing designs to support AI functions at the edge of the network

Conclusion

Clearly, the next few years will be a critical period for the market for always-on, smart network edge devices. As applications grow more complex, designers will urgently need tools that deliver higher performance at low power. The latest version of Lattice’s sensAI technology stack, combined with ECP5 and iCE40 UltraPlus FPGAs, gives designers the hardware platforms, IP, software tools, reference designs, and design services they need to beat competitors and quickly develop successful solutions.


