Radxa Dragon Q6A - AI Quick Start Guide

This guide enables hardware-accelerated AI on the Radxa Dragon Q6A. We will run Llama 3.2 (an LLM) on the NPU and Whisper (speech-to-text) on the CPU to create a fully voice-interactive system.

Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble (T7 image or newer)
Status: ✅ Verified working (Jan 2026)


πŸ› οΈ Step 1: System Preparation

Run these commands once to install drivers and set permissions.

1. Install Dependencies

sudo apt update
sudo apt install -y fastrpc fastrpc-dev libcdsprpc1 radxa-firmware-qcs6490 \
    python3-pip python3.12-venv libportaudio2 ffmpeg git
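
To confirm the driver packages landed before moving on, a quick package query is enough:

# Both fastrpc and the cdsprpc runtime should appear as installed (ii)
dpkg -l | grep -E 'fastrpc|cdsprpc'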

2. Set Permanent NPU Permissions

This ensures you don't get "Permission Denied" errors after rebooting.

sudo tee /etc/udev/rules.d/99-fastrpc.rules << 'EOF'
KERNEL=="fastrpc-*", MODE="0666"
SUBSYSTEM=="dma_heap", KERNEL=="system", MODE="0666"
EOF

# Apply immediately
sudo udevadm control --reload-rules
sudo udevadm trigger
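
Verify that the rules took effect; the FastRPC nodes and the DMA heap should now be world read/write (crw-rw-rw-):

ls -l /dev/fastrpc-* /dev/dma_heap/system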

3. Create Python Virtual Environment

We use a virtual environment to prevent "Dependency Hell" with system packages.

# Create and activate
python3 -m venv ~/qai-venv
source ~/qai-venv/bin/activate

# Install AI tools (Whisper, Audio libraries)
pip install --upgrade pip
pip install "qai-hub-models[whisper-small]" librosa sounddevice

🦙 Step 2: Set Up Llama 3.2 (NPU)

We use the 4096-token-context model for better conversation memory.

1. Download Model

(Note: requires ~2 GB of free space.)

# Ensure you are NOT in the venv for this part (using system tools for binary download)
deactivate 2>/dev/null

# Install downloader
pip3 install modelscope --break-system-packages

# Download
mkdir -p ~/llama-4k && cd ~/llama-4k
modelscope download --model radxa/Llama3.2-1B-4096-qairt-v68 --local_dir .

# Make the runner executable
chmod +x genie-t2t-run
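
Sanity check: the runner binary and the Genie config used by the chat script below should both be present.

ls -lh ~/llama-4k/genie-t2t-run ~/llama-4k/htp-model-config-llama32-1b-gqa.json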

2. Create "Chat" Shortcut

Create a simple script to run the NPU model.

cd ~/llama-4k
cat << 'EOF' > chat
#!/bin/bash
cd ~/llama-4k
export LD_LIBRARY_PATH="$(pwd):$LD_LIBRARY_PATH"

# Llama 3 Prompt Format
PROMPT="<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n$1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

./genie-t2t-run -c htp-model-config-llama32-1b-gqa.json -p "$PROMPT"
EOF
chmod +x chat

Test it: ~/llama-4k/chat "What is the capital of France?"


πŸŽ™οΈ Step 3: Setup Whisper (Speech-to-Text)

We run Whisper Small on the CPU. It is lightweight enough to run quickly without needing the more involved NPU compilation step.

1. Create "Transcribe" Script

cat << 'EOF' > ~/transcribe.sh
#!/bin/bash
# Wrapper to run Whisper in the virtual environment

source ~/qai-venv/bin/activate
python3 -m qai_hub_models.models.whisper_small.demo --audio-file "$1" 2>/dev/null | grep "Transcription:" | sed 's/Transcription: //'
EOF
chmod +x ~/transcribe.sh

2. Verify Audio

Download a sample file to test the system.

wget https://github.com/ggerganov/whisper.cpp/raw/master/samples/jfk.wav -O jfk.wav
~/transcribe.sh jfk.wav

Expected Output: "And so my fellow Americans..."
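
You can also test with your own voice before wiring everything together (assumes a working microphone; see Step 4 for device setup):

# Record 3 seconds at 16 kHz mono, then transcribe it
arecord -d 3 -f S16_LE -r 16000 -c 1 -t wav test.wav
~/transcribe.sh test.wav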


🤖 Step 4: The "Jarvis" Demo (Voice to AI)

Combine both tools! This script records your voice, converts it to text, sends it to Llama, and prints the answer.

1. Create the Voice Assistant Script

cat << 'EOF' > ~/voice-chat.sh
#!/bin/bash

echo "πŸ”΄ Recording... (Press Ctrl+C to stop, or wait 5 seconds)"
arecord -d 5 -f cd -r 16000 -c 1 -t wav my_voice.wav 2>/dev/null
echo "βœ… Processing..."

# 1. Speech to Text (Whisper)
USER_TEXT=$(~/transcribe.sh my_voice.wav)
echo "πŸ—£οΈ  You said: $USER_TEXT"

if [ -z "$USER_TEXT" ]; then
    echo "❌ No speech detected."
    exit 1
fi

# 2. Text to Intelligence (Llama NPU)
echo "πŸ€– AI Thinking..."
~/llama-4k/chat "$USER_TEXT"
EOF
chmod +x ~/voice-chat.sh

2. Run It

Plug in a USB microphone and run:

~/voice-chat.sh
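
If recording fails, list the capture devices ALSA sees and, if needed, point arecord at the right card (the plughw:1,0 below is only an example; adjust to your mic):

# List available capture devices
arecord -l

# Record from a specific card/device explicitly
arecord -D plughw:1,0 -d 5 -f S16_LE -r 16000 -c 1 -t wav my_voice.wav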

📊 Performance Summary

| Component | Model | Processor | Performance |
|---|---|---|---|
| Brain | Llama 3.2 1B (4096) | NPU (Hexagon) | ~15 tokens/sec (real-time) |
| Ears | Whisper Small | CPU (Kryo) | ~2 sec for 5 sec of audio |
| Memory | System RAM | Shared | ~2.5 GB total used |
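
To sanity-check these numbers on your own board, a rough wall-clock measurement is enough (not a formal benchmark):

time ~/llama-4k/chat "Write one sentence about dragons."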

πŸ› Troubleshooting

| Issue | Solution |
|---|---|
| Permission denied (/dev/fastrpc) | Run the Step 1 udev commands and reboot. |
| genie-t2t-run: not found | Ensure you are in ~/llama-4k and run chmod +x genie-t2t-run. |
| ModuleNotFoundError (Whisper) | Run source ~/qai-venv/bin/activate before using python. |
| EOFError (Whisper) | The audio file is corrupt/empty. Re-download or re-record. |
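
For the EOFError case, ffprobe (installed alongside ffmpeg in Step 1) quickly shows whether a recording is usable:

# A zero-byte or headerless file will report an error here
ffprobe -v error -show_entries format=duration,size -of default=noprint_wrappers=1 my_voice.wav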