VT512
public project
2306Q   

This project gives the Raspberry Pi an on-device image classification solution. Instead of relying on cloud-based processing, the Raspberry Pi classifies captured images locally and immediately.

The image data captured by the Raspberry Pi's camera is sent to the hardware accelerator implemented in this project. The MobileNetV3-Small component of the accelerator extracts relevant features from the image, capturing important visual patterns and details.

The extracted features are then passed to the Vision Transformer (ViT) module, which analyzes global relationships and context across the image. The ViT model performs the classification from these features, identifying the objects or patterns present in the image.
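
To illustrate what "global relationships" means here, the following is a minimal sketch of single-head scaled dot-product self-attention, the operation at the core of ViT models. It is not the accelerator's implementation: the identity query/key/value projections, the function names, and the toy `patch_features` values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head attention with identity Q/K/V projections (a simplification).

    Each output token is a weighted average of ALL input tokens, which is
    how every image patch can take every other patch into account.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Mix all tokens according to the attention weights.
        mixed = [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        outputs.append(mixed)
    return outputs


# Hypothetical feature vectors for three image patches.
patch_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(patch_features)
```

Because the weights for each token sum to one, every output row is a blend of all input rows; a real ViT adds learned projections, multiple heads, and feed-forward layers on top of this step.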

Once classification is complete, the resulting object label and, if applicable, the segmented image are sent back to the Raspberry Pi. The Raspberry Pi thus obtains the classification results locally, with no need for cloud-based processing or external communication. The speed and efficiency of on-device classification enable real-time or near-real-time analysis and decision-making.
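
The data flow described above (captured image, feature extraction, classification, label returned) can be sketched as follows. This is only an outline of the stages under stated assumptions: `classify_on_device`, `toy_extract`, `toy_classify`, and `LABELS` are hypothetical names, and the toy stages are trivial stand-ins for the MobileNetV3-Small and ViT modules, not models of them.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[int]]  # rows of grayscale pixel values (assumed format)


def classify_on_device(
    image: Image,
    extract_features: Callable[[Image], List[float]],
    classify_features: Callable[[List[float]], int],
    labels: Sequence[str],
) -> str:
    """Run the two stages in order and map the class index to a label."""
    features = extract_features(image)         # stand-in for MobileNetV3-Small
    class_index = classify_features(features)  # stand-in for the ViT module
    return labels[class_index]


def toy_extract(image: Image) -> List[float]:
    """Hypothetical feature extractor: mean brightness of each row."""
    return [sum(row) / len(row) for row in image]


def toy_classify(features: List[float]) -> int:
    """Hypothetical classifier: threshold on the average feature value."""
    mean = sum(features) / len(features)
    return 1 if mean >= 128 else 0


LABELS = ["dark", "bright"]  # hypothetical label set

frame = [[200, 220], [240, 210]]
label = classify_on_device(frame, toy_extract, toy_classify, LABELS)
```

Passing the stages in as callables keeps the host-side flow independent of how each stage is realized, whether in software for testing or on the accelerator.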

Overall, this project gives the Raspberry Pi image classification capabilities, enabling fast, autonomous processing of captured images without relying on external resources. It brings immediate classification, privacy, and reduced latency to edge computing applications.

Owner
Anthony Kung
Organization URL

http://anth.dev

Version

0.0.0

Category

acc

Process

sky130A