Vision Tracker | AI-Powered Smart Tracking Camera

by Mukesh_Sankhla in Circuits > Cameras


A100.gif
DSC03381.JPG
DSC03383.JPG

In this project, we’re going to build a smart pan-and-tilt tracking system for the HuskyLens V2 — turning it into an interactive AI vision module that can physically follow what it sees.

The HuskyLens V2 is already a powerful standalone AI camera with built-in models for face recognition, object tracking, hand gesture detection, pose estimation, and more. But what if it could do more than just detect? What if it could react and move in real time?

That’s exactly what we’re building.

Using two servo motors and an ESP32-C6, we’ll create a smooth pan-and-tilt mechanism that allows the HuskyLens V2 to physically track a target. As the AI detects a face, object, hand, or pose, the ESP32-C6 reads the tracking data and adjusts the servo angles accordingly — keeping the subject centered in the frame.

This project extends the capabilities of HuskyLens beyond simple detection and brings it into the physical world. Instead of just identifying objects, the system actively follows them — opening possibilities for:

  1. Smart surveillance systems
  2. Interactive robots
  3. Auto-tracking cameras
  4. AI-powered turrets
  5. Smart classroom or lab demonstrations

By the end of this tutorial, you’ll have a fully functional AI-powered tracking camera system that combines embedded systems, servo control, and real-time machine vision — perfect for makers, robotics enthusiasts, and AI explorers.

Let’s get started!

Supplies

DSC03292.JPG
DSC03295.JPG
DSC03297.JPG
DSC03298.JPG
DSC03299.JPG
DSC03274.JPG

1x HuskyLens V2 — Vision AI camera module

1x Beetle ESP32-C6 — Microcontroller for servo control and HuskyLens communication

2x DSS-M15S Servo Motors — For pan and tilt motion

About HuskyLens V2

A2.gif
DSC0330511.JPG
DSC03303.JPG

The HuskyLens V2 is an embedded AI vision module designed for makers and projects that require real-time visual intelligence. Unlike traditional cameras that simply capture images, HuskyLens V2 processes vision tasks on-device using its built-in AI capabilities — there’s no need for an external computer or cloud connection.

Key Technical Features

  1. Dual-core Kendryte K230 Processor – Provides dedicated AI acceleration with up to 6 TOPS performance for efficient edge inference.
  2. 1 GB LPDDR4 Memory – Ensures smooth model execution and real-time processing.
  3. 8 GB eMMC 5.1 Storage – Stores models, configurations, and logs without external storage.
  4. 2 MP Camera Sensor – High-quality image capture optimized for AI detection and tracking.
  5. 20+ Built-in AI Models – Ready-to-use vision algorithms, including face detection and recognition, object tracking, hand gesture detection, pose estimation, color detection, tag recognition, and more.
  6. Custom Model Support – Ability to train your own AI models and deploy them directly to the device.
  7. Interactive Touch Display – 2.4″ touchscreen UI for easy configuration and real-time feedback.
  8. Communication Interfaces – USB-C for direct connection, UART/I²C for microcontroller integration, and optional Wi-Fi module support.

HuskyLens V2 runs AI inference locally — meaning video frames are processed on the device itself using optimized neural network models. When a target (e.g., face or object) is detected, HuskyLens calculates positional data such as X/Y coordinates relative to the frame center. This data can be shared with a microcontroller like the ESP32-C6 over UART or I²C, enabling real-time motion control — which is exactly what we’ll use for this pan-and-tilt tracking setup.

CAD & 3D Printing

A3.gif
P3.png
DSC03279.JPG
DSC03280.JPG
DSC03282.JPG
DSC03289.JPG
DSC03286.JPG
DSC03281.JPG

To build a clean and compact tracking system, I designed the complete pan-and-tilt assembly in Autodesk Fusion 360. The design focuses on stability, proper weight distribution, and easy assembly while keeping wiring neatly managed inside the structure.

The mechanism consists of four main parts:

Part 1 – Pan Motor Mount

  1. This part securely holds the pan servo motor in place. It ensures proper alignment and provides a rigid base so the entire system rotates smoothly without wobble.

Part 2 – Pan Base

  1. The pan base attaches directly to the pan motor horn. It rotates along with the motor and includes a dedicated mounting provision for the tilt servo motor. This creates the second axis of movement.

Part 3 – Bottom Housing

This is the structural foundation of the system. It serves multiple purposes:

  1. Houses the ESP32-C6
  2. Holds the pan motor securely
  3. Provides internal space to tuck in extra wires for clean cable management
  4. Includes a precise cutout for the Type-C port of the ESP32-C6 for easy programming and power access

Part 4 – Tilt Arm

The tilt mechanism is designed as a two-part assembly:

  1. The lower section mounts directly onto the tilt servo horn
  2. The top section holds the HuskyLens V2, secured using two screws

This structure keeps the camera stable while allowing smooth vertical motion.


All parts were printed on my Bambu Lab P1S using Black PLA.

You can:

  1. Download the ready-to-print STL files and print them directly
  2. Or download the original Fusion 360 file to modify dimensions, adapt to different servos, or customize the mounting system based on your requirements


Pan Motor Assembly

DSC03310.JPG
DSC03311.JPG
DSC03312.JPG
DSC03314.JPG
DSC03315.JPG
DSC03316.JPG
DSC03317.JPG
DSC03320.JPG
  1. Take Part-1 (Pan Motor Mount) and one DSS-M15S servo motor.
  2. Remove the small screws from the servo casing.
  3. Insert the servo motor into Part-1 so it sits firmly in the mounting slot.
  4. Reinstall and tighten the screws to close the servo casing.
  5. Use 4x mounting screws to secure the servo firmly to Part-1.
  6. Check that the servo is properly aligned and does not move inside the mount.

Circuit Connection

H2 - C6.png
DSC03321.JPG
DSC03323.JPG
DSC03325.JPG
DSC03327.JPG
DSC03330.JPG
DSC03328.JPG
DSC03332.JPG
DSC03334.JPG

Now let’s connect everything together as shown in the diagram.

1. Servo Connections (Pan & Tilt)

Both servos have three wires:

  1. Red → VCC (5V / VIN)
  2. Black/Brown → GND
  3. Yellow/Orange → Signal

Pan Servo

  1. Signal → GPIO 5 of the Beetle ESP32-C6
  2. VCC → VIN
  3. GND → GND

Tilt Servo

  1. Signal → GPIO 4 of the Beetle ESP32-C6
  2. VCC → VIN
  3. GND → GND
Important: Make sure all GNDs are connected together (ESP32 + both servos + HuskyLens).


2. HuskyLens V2 Connection (I2C Mode)

On the HuskyLens V2:

  1. Go to Settings → Protocol Type
  2. Select I2C

Now connect using the 4-pin Gravity cable:

  1. Green (SDA) → GPIO 19 (ESP32-C6 SDA)
  2. Blue (SCL) → GPIO 20 (ESP32-C6 SCL)
  3. Red (VCC) → VIN
  4. Black (GND) → GND

3. Power

  1. The ESP32-C6 can be powered using the Type-C port.
  2. VIN powers both servos and the HuskyLens.
If using high-torque servos, consider using an external 5V supply for stable operation.


Route all wires through the opening in Part-3 (bottom housing) before final assembly.


  1. Place the ESP32-C6 inside its dedicated slot in Part-3.
  2. Align the Type-C port with the side opening of the housing.
  3. Use 2× M2 screws to secure the ESP32-C6 firmly in place.

Pan Base Assembly

DSC03335.JPG
DSC03338.JPG
DSC03339.JPG
DSC03340.JPG
DSC03341.JPG

1. Initialize the Servos

Before closing the assembly, we need to center both servos.

  1. Connect the ESP32-C6 to your PC using the Type-C cable.
  2. Open the Arduino IDE (or your preferred environment).
  3. Upload the servo initialization code.
#include <ESP32Servo.h>

Servo tiltServo;
Servo panServo;

#define TILT_PIN 4
#define PAN_PIN 5

void setup() {
  Serial.begin(115200);

  // Attach servos
  tiltServo.attach(TILT_PIN);
  panServo.attach(PAN_PIN);

  // Move both servos to center (90°)
  tiltServo.write(90);
  panServo.write(90);

  Serial.println("Servos moved to center position (90 degrees)");
}

void loop() {
  // Nothing here — servos stay centered
}

This code moves both the pan and tilt servos to their center position.

⚠️ Centering the servos before mechanical assembly is important. It ensures proper alignment and prevents limited rotation or strain after mounting.

Wait until both servos move to their center position.

2. Attach the Bottom Housing (Part-3)

  1. Take Part-3 (Bottom Housing).
  2. Carefully route the wires inside if not already positioned.
  3. Snap Part-3 onto the existing pan motor assembly.

Make sure it:

  1. Fully covers the exposed pan motor section
  2. Sits flush without pinching wires
  3. Aligns properly with the Type-C opening

Once snapped in place, the base structure is complete and ready for the tilt assembly in the next step.

Tilt Motor Assembly

DSC03343.JPG
DSC03344.JPG
DSC03346.JPG
DSC03348.JPG
DSC03349.JPG
DSC03350.JPG

1. Install the Tilt Servo

  1. Take Part-2 (Pan Base with Tilt Mount).
  2. Insert the tilt servo motor into the dedicated slot in Part-2.
  3. Use 4 screws to firmly secure the servo in place.

2. Attach the Pan Base to the Pan Servo

  1. Take the assembled base (with centered pan servo).
  2. Align Part-2 with the pan servo horn.
  3. Carefully place it onto the servo shaft.


Tilt Assembly

DSC03353.JPG
DSC03355.JPG
DSC03357.JPG
DSC03358.JPG
DSC03359.JPG
DSC03360.JPG

1. Install the Servo Horn

  1. Take the tilt arm main part.
  2. Insert the disc servo horn into the provided slot inside the part.
  3. Use 2 screws to secure the horn firmly in place.
  4. Make sure the horn sits flat and does not move.

2. Attach to the Tilt Servo

  1. Ensure the tilt servo is still in the center (90°) position.
  2. Align the tilt arm with the servo shaft.
  3. Carefully place it onto the servo spline.
  4. Insert and tighten the center screw to lock it in place.
⚠️ Make sure the arm is straight before tightening, so the movement range is balanced.

3. Attach the Second Tilt Arm Part

  1. Take the second tilt arm piece.
  2. Align it with the mounted tilt arm section.
  3. Use 2 screws to secure both parts together (as shown in the images).

Once tightened, check the movement manually to ensure smooth tilt motion without obstruction.

The tilt mechanism is now complete and ready for mounting the HuskyLens V2.

HuskyLens Mount Assembly

A102.gif
DSC03368.JPG
DSC03369.JPG
DSC03370.JPG
DSC03373.JPG
DSC03377.JPG
DSC03379.JPG

1. Connect the Gravity Cable

  1. Take the Gravity I2C cable coming from the ESP32-C6.
  2. Connect it to the HuskyLens V2 Gravity port.
  3. Ensure the connector is fully inserted and properly aligned.
  4. Gently route the cable so it does not interfere with tilt movement.

2. Mount the HuskyLens V2

  1. Place the HuskyLens V2 onto the top mounting holes of the tilt arm.
  2. Align the mounting holes.
  3. Insert 2× M3 screws through the bracket.
  4. Tighten the screws securely, but do not over-tighten.

Code and Working

A101.gif
DSC03381.JPG
DSC03383.JPG

1. Upload the Tracking Code

  1. Open Arduino IDE on your PC.
  2. Copy and paste the provided tracking code into a new sketch.
  3. Go to Tools → Board → Select “Beetle ESP32-C6”.
  4. Select the correct COM Port from Tools → Port.
  5. Click Upload.
#include <Wire.h>
#include "DFRobot_HuskylensV2.h"
#include <ESP32Servo.h>

HuskylensV2 huskylens;

// Servo objects
Servo tiltServo;
Servo panServo;

// Servo pins
#define TILT_PIN 4
#define PAN_PIN 5

// Camera resolution (HuskyLens default)
#define FRAME_WIDTH 320
#define FRAME_HEIGHT 240

// Current servo angles
int panAngle = 90;
int tiltAngle = 90;

// Tuning (adjust if movement too fast/slow)
float panGain = 0.05;
float tiltGain = 0.05;

void setup() {
  Serial.begin(115200);
  Wire.begin();

  // Attach servos
  tiltServo.attach(TILT_PIN);
  panServo.attach(PAN_PIN);

  // Center servos initially
  tiltServo.write(tiltAngle);
  panServo.write(panAngle);

  while (!huskylens.begin(Wire)) {
    Serial.println("HuskyLens Begin failed!");
    delay(100);
  }

  huskylens.switchAlgorithm(ALGORITHM_FACE_RECOGNITION);
  delay(2000);
}

void loop() {
  if (huskylens.getResult(ALGORITHM_ANY)) {
    while (huskylens.available(ALGORITHM_ANY)) {
      Result *result =
          static_cast<Result *>(huskylens.popCachedResult(ALGORITHM_ANY));

      int x = result->xCenter;
      int y = result->yCenter;

      Serial.print("Face Center: ");
      Serial.print(x);
      Serial.print(", ");
      Serial.println(y);

      // Calculate error from center
      int errorX = x - FRAME_WIDTH / 2;
      int errorY = y - FRAME_HEIGHT / 2;

      // Adjust angles
      panAngle -= errorX * panGain;
      tiltAngle += errorY * tiltGain;

      // Constrain angles
      panAngle = constrain(panAngle, 0, 180);
      tiltAngle = constrain(tiltAngle, 0, 180);

      // Move servos
      panServo.write(panAngle);
      tiltServo.write(tiltAngle);
    }
  }

  delay(20);
}

Wait for the code to compile and upload successfully.

2. How It Works

Once the code is uploaded:

  1. Power on the system.
  2. Open any AI mode on the HuskyLens V2 — for example: Face Tracking, Object Tracking, Hand Gesture, or Pose Detection.

As soon as a target is detected:

  1. The HuskyLens V2 sends bounding box and tracking data (X, Y coordinates and ID) to the ESP32-C6 over I2C.
  2. The ESP32-C6 processes this data and calculates how far the object is from the center of the frame.
  3. Based on the position difference, it adjusts the pan and tilt servo angles.

This keeps the detected subject centered in real time.

The system now acts as a fully AI-powered smart tracking camera — capable of physically following faces, objects, hands, or poses using real-time vision data.

Face Recognition

A4.gif

Face Recognition detects faces, shows facial key points (eyes, nose, mouth), and can learn & recognize multiple people.

  1. White box → Face detected
  2. Colored box + ID + % → Face recognized
  3. Example: Face ID1 97% — ID1 is the first learned face, 97% is the confidence level

RGB Status Light (Back Side):

  1. 🔵 Blue → Face detected
  2. 🟡 Yellow → Learning face
  3. 🟢 Green → Recognized face

How to Learn a Face

  1. Select Face Recognition mode
  2. Align face inside white box
  3. Make sure center crosshair is inside box
  4. Press Button-A (top-right)
  5. Face saved as ID1, ID2, etc.

Important Parameters (Quick Guide)

  1. Forget IDs → Deletes all learned faces
  2. Multi-Face Acceleration → Faster tracking (slightly lower accuracy)
  3. Detection Threshold → Low detects easily (may detect false faces); High is strict detection
  4. Recognition Threshold → Low matches easily (more false matches); High is strict matching
  5. NMS Threshold → Removes duplicate overlapping boxes
  6. Face Features → Show/hide key points
  7. Set Name → Assign custom name to ID
  8. Display Name → Show/hide name on screen
  9. Restore Defaults → Reset everything
  10. Export Model → Save learned faces (up to 5 models)
  11. Import Model → Load saved faces into another HuskyLens

Export / Import (Quick Flow)

Export:

Face Recognition → Export Model → Select model number → Save

Import:

Copy .json & .bin files → Paste into new device → Import Model → Select same number

Object Recognition

A5.gif

The Object Recognition feature of the HuskyLens V2 can identify 80 preset object types.

During detection, it automatically frames the target object and displays its name along with a confidence score.

Object Tracking

A6.gif

Object Tracking allows you to learn and track one custom object at a time.

How to Learn an Object

  1. Enter Object Tracking mode
  2. Align camera with target object
  3. Touch & drag on screen to frame the object
  4. Release to complete learning

Tracking Result

When the learned object appears:

  1. Screen shows a colored bounding box
  2. Displays: Obj: ID1 66%

Where:

  1. Obj = Default name
  2. ID1 = First learned object
  3. 66% = Confidence level


Color Recognition

A7.gif

This function enables detection, learning, recognition, and tracking of specified colors.


Object Classification

A9.gif
  1. Classifies objects into 1000 predefined categories
  2. Uses built-in AI model (no training required)
  3. Displays object name + confidence level
  4. Example: Laptop 92%

⚠ Unlike Object Recognition:

  1. Does NOT show bounding boxes
  2. Does NOT provide object position (no X–Y coordinates)

Best for: Identifying what an object is, not where it is.

Self-Learning

A8.gif

This feature allows capturing multi-angle images of any object, learning, and recognizing any custom object.

Instance Segmentation

A10.gif
  1. Detects objects and outlines their exact shape (contours)
  2. Assigns a unique mask to each detected object
  3. Helps in measuring object area and shape
  4. Supports up to 80 object categories
  5. Categories are the same as Object Recognition mode

Unlike simple bounding boxes, this feature marks the precise boundary of each object for better visual understanding.

Hand Gesture Recognition

A11.gif
  1. Detects the palm and 21 key points
  2. Shows all finger joints in real-time
  3. Supports learning, recognizing, and tracking gestures

21 Key Points Include:

  1. 1 wrist point
  2. 4 joints per finger — thumb, index, middle, ring, and little finger (each finger has: root, first joint, second joint, fingertip)

Perfect for gesture control and interactive AI projects

Pose Recognition

A12.gif
  1. Detects human body in the image
  2. Identifies and plots 17 body key points
  3. Can learn, recognize, and track different poses
  4. Detects multiple people at the same time
  5. Works from different angles
  6. Can predict some hidden (occluded) joints

17 Key Points Include:

  1. Nose
  2. Eyes (Left & Right)
  3. Ears (Left & Right)
  4. Shoulders (Left & Right)
  5. Elbows (Left & Right)
  6. Wrists (Left & Right)
  7. Hips (Left & Right)
  8. Knees (Left & Right)
  9. Ankles (Left & Right)

Great for fitness tracking, gesture control, and smart interaction projects.

License Plate Recognition

A13.gif
  1. Detects vehicle license plates in the scene
  2. Displays the plate number on screen
  3. Supports learning specific license plates
  4. Can recognize and track learned plates
  5. Shows ID and confidence level

Useful for parking systems, smart gates, and access control.

Character Recognition (OCR)

A14.gif
  1. Uses OCR (Optical Character Recognition)
  2. Detects Chinese and English text on screen
  3. Displays the recognized text content
  4. Can learn, recognize, and track characters


  1. All detected text areas are shown with bounding boxes
  2. Only the text block closest to the center crosshair is recognized
  3. Recognized text appears at the top-left of the box

Useful for smart readers, label scanning, and text-based automation projects

Line Tracking

A15.gif
  1. Detects lines with strong color contrast from the background
  2. Marks different detected paths with different colors
  3. Works in real-time
  4. Ideal for line-following robots and path tracking projects

Perfect for smart cars and autonomous navigation

Face Emotion Recognition

A16.gif
  1. Recognizes 7 facial expressions: Anger, Disgust, Fear, Happiness, Neutral, Sadness, Surprise
  2. Expressions are pre-trained at the factory
  3. No manual learning required
  4. Displays detected emotion on screen

Great for emotion-based AI interaction projects.

QR Code Recognition

A17.gif
  1. Detects and reads QR codes
  2. Displays the encoded information on screen
  3. Can learn and recognize specific QR codes
  4. Supports tracking detected QR codes
  5. Allows users to assign custom names

Perfect for smart login systems, inventory tracking, and interactive projects

Barcode Recognition

A19.gif
  1. Detects barcodes in the scene
  2. Displays the encoded information
  3. Supports learning and recognizing specific barcodes
  4. Can track detected barcodes
  5. Allows custom naming of barcodes

Useful for billing systems, inventory management, and automation projects

RTSP Video Streaming

A20.gif
  1. Streams live video wirelessly using RTSP
  2. View camera output on mobile, laptop, or PC
  3. Works with RTSP-supported apps (e.g., VLC)
  4. See real-time AI detection results on other devices

Perfect for monitoring, demos, and remote AI projects

HUSKYLENS 2 Microscope Lens Module

A1211.gif
DSC03815.JPG
DSC03816.JPG
DSC03814.JPG
DSC03817.JPG
DSC03819.JPG

The HUSKYLENS 2 Microscope Lens Module is a special attachment designed to convert the HUSKYLENS 2 into a smart digital microscope.

  1. Uses a 6mm focal length lens
  2. Provides 30× magnification
  3. Equipped with 2MP GC2093 sensor
  4. Can detect details as small as ~3 μm

This allows the device to not only observe microscopic objects but also identify them using AI.

It expands HUSKYLENS 2 from a vision sensor to an AI-powered microscope system.

3D Printed Microscope Stand for HUSKYLENS 2

DSC03820.JPG
DSC03831.JPG
DSC03833.JPG
DSC03836.JPG

I designed a 3D printed microscope stand specifically for the HUSKYLENS 2.

  1. Provides stable vertical mounting
  2. Maintains proper focus distance
  3. Ideal for microscopic observation and inspection

Since HUSKYLENS 2 supports custom model installation, you can:

  1. Train your own AI image model
  2. Deploy it directly on the device
  3. Perform AI-based microscopic analysis

This enables powerful applications like:

  1. Smart biology experiments
  2. Automated micro quality inspection
  3. AI-powered STEM learning

The stand transforms the HUSKYLENS 2 into a complete AI Microscopy System.

Conclusion

DSC03383.JPG

In this project, we transformed the HUSKYLENS 2 from a powerful AI camera into a physically interactive tracking system.

By integrating two servo motors with the ESP32-C6, we built a smooth pan-and-tilt mechanism that allows the camera not just to detect — but to react and follow in real time.

The system now:

  1. Reads AI detection data
  2. Adjusts servo angles dynamically
  3. Keeps the subject centered automatically

What started as a vision module has become a complete AI-powered tracking platform.

This project demonstrates how embedded systems, servo control, and machine vision can work together to bridge the gap between digital intelligence and physical movement.

With this foundation, you can expand into:

  1. Smart surveillance
  2. Interactive robotics
  3. Auto-tracking cameras
  4. AI-based classroom demos
  5. Advanced maker and research projects

You’ve now built more than just a tracker — you’ve built a system where AI doesn’t just see… it moves.

Keep building. Keep experimenting.