ESP32-S3-EYE Documentation:

This user guide will help you get started with ESP32-S3-EYE v2.1 and will also provide more in-depth information.

9/21/2024, 6 min read

The ESP32-S3-EYE is a small-sized AI development board produced by Espressif. It is based on the ESP32-S3 SoC and ESP-WHO, Espressif’s AI development framework. It features a 2-megapixel camera, an LCD display, and a microphone, which are used for image recognition and audio processing.

ESP32-S3-EYE offers plenty of storage, with an 8 MB Octal PSRAM and an 8 MB flash. It also supports image transmission via Wi-Fi and debugging through a Micro-USB port. With ESP-WHO, you can develop a variety of AIoT applications, such as smart doorbells, surveillance systems, and facial recognition time clocks.

1.1. Overview

The ESP32-S3-EYE board consists of two parts: the main board (ESP32-S3-EYE-MB), which integrates the ESP32-S3-WROOM-1 module, camera, MicroSD card slot, digital microphone, USB port, and function buttons; and the sub board (ESP32-S3-EYE-SUB), which contains an LCD display. The main board and sub board are connected through pin headers.

1.2. ESP32-S3-EYE Block Diagram

The block diagram below presents the main components of the ESP32-S3-EYE-MB main board (on the left) and the ESP32-S3-EYE-SUB sub board (on the right), as well as the interconnections between components.

Compared with a basic ESP32-based development board, the ESP32-S3-EYE provides the following additional features:

  • PSRAM: 8 MB Octal PSRAM

  • Flash: 8 MB flash

  • LCD display: Yes

  • Accelerometer: Yes

  • Alternative power supply: External battery (optional)

  • USB-to-UART bridge: Not needed. Functionality is provided by the ESP32-S3 USB Serial/JTAG interface.

  • Antenna connector: Not needed. The antenna is provided by the ESP32-S3-WROOM-1 module.

1.3. Components on the ESP32-S3-EYE-MB Main Board

The following sections describe the key components on the main board and the sub board, respectively.

The key components of the board are described from front view to back view, starting from the camera, in an anti-clockwise direction.

  • Camera: The OV2640 camera has 2 million pixels, a 66.5° field of view, and a maximum resolution of 1600x1200. You can change the resolution when developing applications.

  • Module Power LED: The LED (green) turns on when USB power is connected to the board. If it is not turned on, it indicates either that USB power is not supplied or that the 5 V to 3.3 V LDO is broken. Software can configure GPIO3 to set different LED statuses (turned on/off, flashing) for different statuses of the board. Note that GPIO3 must be set up in open-drain mode. Pulling GPIO3 up may burn the LED.

  • Pin Headers: Connect to the female headers on the sub board.

  • 5 V to 3.3 V LDO: Power regulator that converts a 5 V supply into a 3.3 V output for the module.

  • Digital Microphone: The digital I2S MEMS microphone features 61 dB SNR and –26 dBFS sensitivity, working at 3.3 V.

  • FPC Connector: Connects the main board and the sub board.

  • Function Buttons: There are six function buttons on the board. Users can assign any function to them as needed, except for the RST button.

  • ESP32-S3-WROOM-1: The ESP32-S3-WROOM-1 module embeds the ESP32-S3R8 chip variant that provides Wi-Fi and Bluetooth 5 (LE) connectivity, as well as dedicated vector instructions for accelerating neural network computing and signal processing. On top of the integrated 8 MB Octal SPI PSRAM offered by the SoC, the module also comes with 8 MB flash, allowing for fast data access. The ESP32-S3-WROOM-1U module is also supported.

  • MicroSD Card Slot: Used for inserting a MicroSD card to expand memory capacity.

  • 3.3 V to 1.5 V LDO: Power regulator that converts a 3.3 V supply into a 1.5 V output for the camera.

  • 3.3 V to 2.8 V LDO: Power regulator that converts a 3.3 V supply into a 2.8 V output for the camera.

  • USB Port: A Micro-USB port used for 5 V power supply to the board, as well as for communication with the chip via GPIO19 and GPIO20.

  • Battery Soldering Points: Used for soldering a battery socket to connect an external Li-ion battery that can serve as an alternative power supply to the board. If you use an external battery, make sure it has a built-in protection circuit and fuse. The recommended battery specifications are: capacity > 1000 mAh, output voltage 3.7 V, input voltage 4.2 V – 5 V.

  • Battery Charger Chip: 1 A linear Li-ion battery charger (ME4054BM5G-N) in ThinSOT package. The power source for charging is the USB Port.

  • Battery Red LED: When the USB power is connected to the board and a battery is not connected, the red LED blinks. If a battery is connected and being charged, the red LED turns on. When the battery is fully charged, it turns off.

  • Accelerometer: Three-axis accelerometer (QMA7981) for screen rotation, etc.
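The camera entry above gives a maximum resolution of 1600x1200 and notes that the resolution can be changed, which matters because frame buffers compete for the 8 MB PSRAM. The following is a rough host-side sizing sketch in Python, not firmware code. It assumes uncompressed RGB565 output at 2 bytes per pixel (an assumption; JPEG output would be smaller), and it ignores whatever else the application keeps in PSRAM. The function names are hypothetical.

```python
# Back-of-envelope frame buffer sizing for the camera on this board.
# Assumption: uncompressed RGB565 frames, 2 bytes per pixel.

PSRAM_BYTES = 8 * 1024 * 1024   # 8 MB Octal PSRAM, as listed in the feature list
BYTES_PER_PIXEL_RGB565 = 2      # assumption: RGB565 pixel format


def frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL_RGB565):
    """Return the size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


def frames_that_fit(width, height, budget=PSRAM_BYTES):
    """Return how many whole uncompressed frames fit in the memory budget."""
    return budget // frame_bytes(width, height)


if __name__ == "__main__":
    # Compare the maximum resolution with a much smaller one.
    for w, h in [(1600, 1200), (320, 240)]:
        print(f"{w}x{h}: {frame_bytes(w, h)} bytes per frame, "
              f"{frames_that_fit(w, h)} frame(s) fit in 8 MB")
```

Under these assumptions a full-resolution frame takes 3,840,000 bytes, so only two such frames would fit in PSRAM even if nothing else used it. This is one reason to pick a lower resolution for applications that need several buffers.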

1.4. Components on the ESP32-S3-EYE-SUB Sub Board

The key components of the board are described from front view to back view, starting from the LCD display, in an anti-clockwise direction.

1.5. Default Firmware and Function Test

Each ESP32-S3-EYE board comes with pre-built default firmware that allows you to test its functions including voice wake-up, voice command recognition, face detection and recognition.

To test the board's functions, you need the following hardware:

  • 1 x ESP32-S3-EYE

  • 1 x USB 2.0 cable (Standard-A to Micro-B), for USB power supply

Before powering up your board, please make sure that it is in good condition with no obvious signs of damage. Both the main board and the sub board should be firmly connected together. Then, follow the instructions described below:

  1. Connect the board to a power supply through the USB Port using a USB cable. While the board is powered up, you will notice the following responses:

      • The Module Power LED turns on for a few seconds, indicating that the default firmware is being loaded.

      • The Module Power LED turns off, indicating the default firmware has been loaded. The board enters human face recognition mode by default.

      • The LCD display shows live video streaming.

At this point, the board is ready for further instructions. You can control the board with either function buttons or voice commands. Function button control is described first:

  • Face the camera so that your whole face is visible on the screen for the board to detect. Once a human face is detected, the board displays a blue rectangle.

  • Press MENU so that the board enters an ID (starting from 1) for a detected human face.

  • Press UP+ so that the board starts face recognition. Once a face is recognized, the board displays the entered face ID. If it doesn’t know the face, it displays “WHO?”.

  • Press PLAY to delete the latest face ID. The board displays "XXX ID(S) LEFT".
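The button flow above amounts to a small ID register: MENU adds an ID, UP+ looks one up, and PLAY removes the latest one. The following toy model in Python illustrates that bookkeeping only. It is not how the default firmware is implemented. The class name `FaceIdBook` is hypothetical, a face is represented by an opaque string instead of real face features, and the sketch assumes that "XXX" in the on-screen message is the number of remaining IDs.

```python
class FaceIdBook:
    """Toy model of the button-driven face ID flow.

    Assumption: a face is an opaque string key. The real firmware compares
    face features, which this sketch does not attempt.
    """

    def __init__(self):
        self._faces = []  # index 0 holds the face enrolled as ID 1

    def enroll(self, face):
        """MENU: store the detected face and return its ID (IDs start from 1)."""
        self._faces.append(face)
        return len(self._faces)

    def recognize(self, face):
        """UP+: return the stored ID as text, or "WHO?" for an unknown face."""
        if face in self._faces:
            return str(self._faces.index(face) + 1)
        return "WHO?"

    def delete_latest(self):
        """PLAY: drop the most recently enrolled ID and report how many remain."""
        if self._faces:
            self._faces.pop()
        # Assumption: "XXX" in the displayed message is the remaining ID count.
        return f"{len(self._faces)} ID(S) LEFT"


if __name__ == "__main__":
    book = FaceIdBook()
    print(book.enroll("face-a"))     # MENU with a face in view
    print(book.recognize("face-a"))  # UP+ with the same face
    print(book.recognize("face-b"))  # UP+ with a face that was never enrolled
    print(book.delete_latest())      # PLAY
```

Keeping the IDs in enrollment order makes "delete the latest face ID" a simple pop from the end of the list.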

To control the board with voice commands, follow the instructions below:

  1. Complete step 1 described previously and notice responses from the board.

  2. Activate the board with the default English wake word “Hi ESP”. When the wake word is detected, the Module Power LED will turn on, indicating that the board is ready for a speech command.

  3. Say an English speech command to control the board. Once a speech command is recognized, the Module Power LED will blink. The supported English speech commands in face recognition mode are listed below:

Default English Speech Commands: Response

  • Enter face: The board enters a human face ID.

  • Recognize face: The board displays a recognized human face ID, or "WHO?" for an unrecognized face.

  • Delete face: The board deletes the latest human face ID entered and displays "XXX ID(S) LEFT".

  4. After waking up the board as described in step 2, you can also switch the board's working mode with speech commands. Once a speech command is recognized, the Module Power LED will blink.

Speech Commands for Different Working Modes: Response

  • Face recognition: The board displays a blue rectangle if a human face is detected.

  • Motion detection: The board displays a solid blue rectangle in upper-left corner if motion is detected.

  • Display only: The board displays only live video streaming.

  • Stop working: The board does nothing and displays Espressif logo.
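The voice control steps above can be read as a two-stage state machine: the board ignores speech until it hears the wake word, then accepts either a face command or a mode switch. The following Python sketch models that flow for illustration only. The class name, the mode labels, and the returned strings are hypothetical. The sketch also assumes that the board returns to waiting for the wake word after each recognized command, which the text does not state.

```python
WAKE_WORD = "hi esp"

# Commands listed above for face recognition mode.
FACE_COMMANDS = {"enter face", "recognize face", "delete face"}

# Mode-switching commands listed above; the mode names are hypothetical labels.
MODE_COMMANDS = {
    "face recognition": "FACE_RECOGNITION",
    "motion detection": "MOTION_DETECTION",
    "display only": "DISPLAY_ONLY",
    "stop working": "STOPPED",
}


class VoiceControl:
    """Toy model of the wake word and speech command flow."""

    def __init__(self):
        self.mode = "FACE_RECOGNITION"  # the board boots into this mode
        self.awake = False
        self.led = "off"                # Module Power LED: off, on, or blink

    def hear(self, phrase):
        """Process one recognized phrase and return a short description."""
        text = phrase.strip().lower()
        if not self.awake:
            if text != WAKE_WORD:
                return "ignored"        # nothing happens before the wake word
            self.awake = True
            self.led = "on"             # LED on: ready for a speech command
            return "ready"
        if text in MODE_COMMANDS:
            self.mode = MODE_COMMANDS[text]
            result = "mode: " + self.mode
        elif text in FACE_COMMANDS and self.mode == "FACE_RECOGNITION":
            result = "face command: " + text
        else:
            return "unknown command"    # stay awake and keep listening
        self.led = "blink"              # LED blinks once a command is recognized
        self.awake = False              # assumption: one command per wake-up
        return result
```

For example, feeding the phrases "Hi ESP" and then "Motion detection" to `hear()` switches `mode` to the motion detection label, while "Enter face" on its own is ignored because the wake word has not been heard yet.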

You have now had your first experience with the board. The following sections provide further information about how to flash firmware onto the board, configuration options, related resources, and more.

2. Start Application Development

This section provides instructions on how to set up the hardware and software and flash firmware onto the board for application development.

2.1. Required Hardware

  • 1 x ESP32-S3-EYE

  • 1 x USB 2.0 cable (Standard-A to Micro-B), for USB power supply and flashing firmware onto the board

  • 1 x Computer running Windows, Linux, or macOS

2.2. Hardware Setup

Prepare the board for loading of the first sample application:

  • Connect the board with the computer through the USB Port using a USB cable. The Module Power LED should turn on. Assuming that a battery is not connected, the Battery Red LED will blink.

  • Now the board is ready for software setup.

2.3. Software Setup

After hardware setup, you can proceed with preparation of development tools. Go to the reference guide to ESP-WHO, which will walk you through the following steps:

  • Get ESP-IDF which provides a common framework to develop applications for ESP32-S3 in C language.

  • Get ESP-WHO which is an image processing platform that runs on ESP-IDF.

  • Run Examples that are provided by ESP-WHO.
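As a rough illustration of the last two steps, the Python sketch below only builds and prints the command lines a developer might run. It does not execute anything. It assumes that ESP-IDF is already installed and its environment is active in the shell, so that `idf.py` is on the PATH. The example directory and the serial port are hypothetical placeholders, and the ESP-WHO guide remains the authoritative source for the exact steps.

```python
import shlex

# Public ESP-WHO repository on GitHub.
ESP_WHO_REPO = "https://github.com/espressif/esp-who.git"


def setup_commands(example_dir, port):
    """Return the command lines for fetching ESP-WHO and flashing one example.

    example_dir and port are placeholders supplied by the caller; this
    function only builds argument lists and never runs them.
    """
    return [
        # Get ESP-WHO together with its submodules.
        ["git", "clone", "--recursive", ESP_WHO_REPO],
        # Select the ESP32-S3 as the build target for the chosen example.
        ["idf.py", "-C", example_dir, "set-target", "esp32s3"],
        # Build, flash over the given serial port, then open the serial monitor.
        ["idf.py", "-C", example_dir, "-p", port, "flash", "monitor"],
    ]


if __name__ == "__main__":
    # Hypothetical values: replace with a real example path and your port.
    for cmd in setup_commands("path/to/example", "/dev/ttyACM0"):
        print(shlex.join(cmd))
```

Returning argument lists instead of a single shell string keeps quoting explicit, so the same lists could later be passed to `subprocess.run` if you decide to automate the steps.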