This guide enables hardware-accelerated AI on the Radxa Dragon Q6A. We will run Llama 3.2 (an LLM) on the NPU and Whisper (speech recognition) on the CPU to create a fully voice-interactive system.
Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble (T7 Image or newer)
Status: ✅ Verified Working (Jan 2026)
Run these commands once to install drivers and set permissions.
sudo apt update
sudo apt install -y fastrpc fastrpc-dev libcdsprpc1 radxa-firmware-qcs6490 \
python3-pip python3.12-venv libportaudio2 ffmpeg git
The udev rules below ensure you don't get "Permission Denied" errors on the NPU device nodes after rebooting.
sudo tee /etc/udev/rules.d/99-fastrpc.rules << 'EOF'
KERNEL=="fastrpc-*", MODE="0666"
SUBSYSTEM=="dma_heap", KERNEL=="system", MODE="0666"
EOF
# Apply immediately
sudo udevadm control --reload-rules
sudo udevadm trigger
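To confirm the rules took effect, check the device nodes directly. The exact node names can vary slightly between kernel builds, so treat this as a quick sanity check:
# Both should report mode 0666 (crw-rw-rw-)
ls -l /dev/fastrpc-*
ls -l /dev/dma_heap/system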
We use a virtual environment to prevent "Dependency Hell" with system packages.
# Create and activate
python3 -m venv ~/qai-venv
source ~/qai-venv/bin/activate
# Install AI tools (Whisper, Audio libraries)
pip install --upgrade pip
pip install "qai-hub-models[whisper-small]" librosa sounddevice
We use the 4096-context model for better conversation memory. (Note: the download requires ~2 GB of disk space.)
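Before downloading, confirm you have enough free space:
# Check available space on the home partition
df -h ~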
# Ensure you are NOT in the venv for this part (using system tools for binary download)
deactivate 2>/dev/null
# Install downloader
pip3 install modelscope --break-system-packages
# Download
mkdir -p ~/llama-4k && cd ~/llama-4k
modelscope download --model radxa/Llama3.2-1B-4096-qairt-v68 --local_dir .
# Make the runner executable
chmod +x genie-t2t-run
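It is worth sanity-checking the download before continuing. The exact file list depends on the model package, but you should at least see the genie-t2t-run binary, its JSON config, and the model weights:
# List the downloaded files with sizes
ls -lh ~/llama-4k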
Create a simple script to run the NPU model.
cd ~/llama-4k
cat << 'EOF' > chat
#!/bin/bash
cd ~/llama-4k
export LD_LIBRARY_PATH="$(pwd):$LD_LIBRARY_PATH"
# Llama 3 Prompt Format
PROMPT="<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n$1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
./genie-t2t-run -c htp-model-config-llama32-1b-gqa.json -p "$PROMPT"
EOF
chmod +x chat
Test it: ~/llama-4k/chat "What is the capital of France?"
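The chat script handles one prompt per invocation. If you want an interactive session, a minimal loop like the sketch below works; note that each turn is independent, since this wrapper does not feed previous answers back into the prompt:
cat << 'EOF' > ~/llama-4k/chat-loop
#!/bin/bash
# Repeatedly read a line and pass it to the single-shot chat script
while true; do
    read -r -p "You: " LINE || break   # Ctrl+D exits
    [ -z "$LINE" ] && continue         # skip empty input
    ~/llama-4k/chat "$LINE"
done
EOF
chmod +x ~/llama-4k/chat-loop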
We run Whisper Small on the CPU. It is lightweight enough to run quickly without needing a complex NPU compilation step.
cat << 'EOF' > ~/transcribe.sh
#!/bin/bash
# Wrapper to run Whisper in the virtual environment
source ~/qai-venv/bin/activate
python3 -m qai_hub_models.models.whisper_small.demo --audio-file "$1" 2>/dev/null | grep "Transcription:" | sed 's/Transcription: //'
EOF
chmod +x ~/transcribe.sh
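The transcriber expects a WAV file. If your audio is in another format or sample rate, ffmpeg (installed in Step 1) can convert it to the 16 kHz mono WAV that Whisper expects; the filenames here are placeholders:
# Convert any audio file to 16 kHz mono WAV
ffmpeg -i input.mp3 -ar 16000 -ac 1 input-16k.wav
~/transcribe.sh input-16k.wav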
Download a sample file to test the system.
wget https://github.com/ggerganov/whisper.cpp/raw/master/samples/jfk.wav -O jfk.wav
~/transcribe.sh jfk.wav
Expected Output: "And so my fellow Americans..."
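To get a rough latency number for your own board, wrap the call in time; results vary with CPU clocks and background load:
# Rough latency check
time ~/transcribe.sh jfk.wav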
Combine both tools! This script records your voice, converts it to text, sends it to Llama, and prints the answer.
cat << 'EOF' > ~/voice-chat.sh
#!/bin/bash
echo "π΄ Recording... (Press Ctrl+C to stop, or wait 5 seconds)"
arecord -d 5 -f cd -r 16000 -c 1 -t wav my_voice.wav 2>/dev/null
echo "β
Processing..."
# 1. Speech to Text (Whisper)
USER_TEXT=$(~/transcribe.sh my_voice.wav)
echo "π£οΈ You said: $USER_TEXT"
if [ -z "$USER_TEXT" ]; then
echo "β No speech detected."
exit 1
fi
# 2. Text to Intelligence (Llama NPU)
echo "π€ AI Thinking..."
~/llama-4k/chat "$USER_TEXT"
EOF
chmod +x ~/voice-chat.sh
Plug in a USB microphone and run:
~/voice-chat.sh
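If recording fails or the transcript comes back empty, check that ALSA actually sees your microphone. Card and device numbers vary by board and USB port, so adjust the hw: address to match your own arecord -l output:
# List capture devices
arecord -l
# Record 3 seconds from a specific card (e.g. card 1, device 0), then play it back
arecord -D hw:1,0 -d 3 -f S16_LE -r 16000 -c 1 test.wav
aplay test.wav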
| Component | Model | Processor | Performance |
|---|---|---|---|
| Brain | Llama 3.2 1B (4096) | NPU (Hexagon) | ~15 tokens/sec (Real-time) |
| Ears | Whisper Small | CPU (Kryo) | ~2 sec for 5 sec audio |
| Memory | System RAM | Shared | ~2.5 GB Total Used |
| Issue | Solution |
|---|---|
| Permission denied (/dev/fastrpc) | Run the Step 1 udev commands and reboot. |
| genie-t2t-run: not found | Ensure you are in ~/llama-4k and run chmod +x genie-t2t-run. |
| ModuleNotFoundError (Whisper) | Run source ~/qai-venv/bin/activate before using Python. |
| EOFError (Whisper) | The audio file is corrupt or empty. Re-download or re-record. |
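To check all of these at once, a small diagnostic sketch like the following covers the common failure points from the table. The script name is arbitrary, and the paths are just the conventions used in this guide:
cat << 'EOF' > ~/ai-doctor.sh
#!/bin/bash
# Quick health check for the common issues above
echo "--- FastRPC device nodes ---"
ls -l /dev/fastrpc-* 2>/dev/null || echo "MISSING: apply the Step 1 udev rules and reboot"
echo "--- Llama runner ---"
[ -x ~/llama-4k/genie-t2t-run ] && echo "OK" || echo "MISSING: check ~/llama-4k and chmod +x genie-t2t-run"
echo "--- Whisper environment ---"
~/qai-venv/bin/python3 -c "import qai_hub_models" 2>/dev/null && echo "OK" || echo "BROKEN: reinstall packages inside ~/qai-venv"
EOF
chmod +x ~/ai-doctor.sh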