@Foadsf
Foadsf / README.md
Created February 2, 2026 22:09
Radxa Dragon Q6A: Offline Audio Transcriber CLI (Whisper CPU) - A polished CLI tool for converting audio to text.

🎙️ Radxa Dragon Q6A: Audio Transcriber CLI

Hardware: Radxa Dragon Q6A (Qualcomm QCS6490)
Engine: OpenAI Whisper (CPU via HuggingFace Transformers)
Interface: Python CLI with Rich UI

Overview

This is a polished, "production-ready" Command Line Interface (CLI) for transcribing audio files directly on the Radxa Dragon. It automatically handles file inputs, generates text transcripts with matching filenames, and provides a beautiful visual status during processing.

While the NPU is great for LLMs, we run Whisper on the CPU here to ensure maximum compatibility with various audio formats and to support the complex decoding logic required for long-form transcription.
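The "matching filenames" behavior mentioned above can be sketched in a few lines. This is a minimal illustration only, assuming `pathlib`-style path handling; the gist does not show the CLI's actual helper names, so `transcript_path` and `save_transcript` are hypothetical:

```python
from pathlib import Path


def transcript_path(audio_file: str) -> Path:
    """Derive the output .txt path next to the input audio file.

    e.g. recordings/meeting.wav -> recordings/meeting.txt
    """
    return Path(audio_file).with_suffix(".txt")


def save_transcript(audio_file: str, text: str) -> Path:
    """Write the transcript alongside the source audio, matching its name."""
    out = transcript_path(audio_file)
    out.write_text(text, encoding="utf-8")
    return out
```

The actual transcription step in the gist runs Whisper on the CPU via HuggingFace Transformers; only the filename convention is shown here.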

@Foadsf
Foadsf / README.md
Created February 2, 2026 21:59
Radxa Dragon Q6A: Offline Voice Assistant "Jarvis" (NPU Llama + CPU Whisper)

🤖 Radxa Dragon Q6A: Offline "Jarvis" Assistant

Verified: February 2026
Hardware: Radxa Dragon Q6A (Qualcomm QCS6490)
OS: Ubuntu 24.04 (Noble)
Status: ✅ Production Ready

The "Hybrid" Architecture

Building a truly responsive voice assistant on embedded hardware requires balancing workloads. We use the NPU (Hexagon DSP) for the heavy lifting (LLM) and the CPU (Kryo) for real-time sensory tasks.
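The split described above can be sketched as a producer/consumer pipeline: the CPU "ears" feed transcripts into a queue that the NPU "brain" drains. The two worker functions below are stand-ins for illustration only; the real system runs Whisper on the CPU and a QNN-backed Llama runtime on the NPU:

```python
import queue
import threading


# Hypothetical stubs: the real workloads are Whisper (CPU) and Llama (NPU).
def cpu_transcribe(audio_chunk: str) -> str:
    return f"transcript of {audio_chunk}"


def npu_generate_reply(prompt: str) -> str:
    return f"reply to: {prompt}"


def run_pipeline(audio_chunks):
    """CPU 'ears' produce prompts; the NPU 'brain' consumes them concurrently."""
    prompts = queue.Queue()
    replies = []

    def ears():
        for chunk in audio_chunks:
            prompts.put(cpu_transcribe(chunk))
        prompts.put(None)  # sentinel: no more audio

    def brain():
        while (prompt := prompts.get()) is not None:
            replies.append(npu_generate_reply(prompt))

    workers = [threading.Thread(target=ears), threading.Thread(target=brain)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return replies
```

Decoupling the two stages through a queue is what lets the CPU keep capturing and transcribing audio while the NPU is still generating the previous reply.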

Radxa Dragon Q6A AI Assistant: The "Hybrid" Approach

Verified: February 2026
Hardware: Radxa Dragon Q6A (Qualcomm QCS6490)
OS: Ubuntu 24.04 (Noble)

The Goal

To create a "Jarvis-like" voice assistant that runs entirely on-device (offline) with low latency.

The Challenge & Lessons Learned

Running Whisper Speech Recognition on Radxa Dragon Q6A NPU

TL;DR: Whisper-Small runs on the QCS6490 NPU at ~506ms for 30 seconds of audio — 140x faster than CPU inference.

This guide documents how to run OpenAI's Whisper speech recognition model on the Radxa Dragon Q6A's Hexagon NPU using ONNX Runtime with QNN Execution Provider.
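The TL;DR numbers can be sanity-checked with simple arithmetic: 506 ms for 30 s of audio gives a real-time factor well under 1 on the NPU, while the implied CPU time (140x slower) would be slower than real time:

```python
# Figures from the TL;DR above.
npu_seconds = 0.506    # NPU inference time for a 30 s clip
audio_seconds = 30.0
speedup = 140          # claimed NPU-vs-CPU speedup

rtf_npu = npu_seconds / audio_seconds    # ~0.017: far faster than real time
cpu_seconds = npu_seconds * speedup      # ~70.8 s implied for the same clip
rtf_cpu = cpu_seconds / audio_seconds    # ~2.36: slower than real time
```

In other words, the NPU makes long-form transcription practical on this board, whereas naive CPU inference of the same model would fall behind the audio stream.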

Hardware & Software

  • Board: Radxa Dragon Q6A (8GB variant, ~$140)
  • SoC: Qualcomm QCS6490

Radxa Dragon Q6A - AI Server Quick Start

Turn the Radxa Dragon Q6A into a self-hosted AI appliance. Features:

  • Brain: Llama 3.2 1B (4096 context) running on NPU (Real-time).
  • Ears: Whisper Small running on CPU (Fast).
  • Interface: Open WebUI (ChatGPT-style) accessible over WiFi.

Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble (T7 Image or newer)

Radxa Dragon Q6A - AI Quick Start Guide

This guide enables Hardware Accelerated AI on the Radxa Dragon Q6A. We will run Llama 3.2 (LLM) on the NPU and Whisper (Speech) on the CPU to create a fully voice-interactive system.

Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble (T7 Image or newer)
Status: ✅ Verified Working (Jan 2026)


Radxa Dragon Q6A - NPU Quick Start Guide

Run Llama 3.2 1B (4096 Context) on the 12 TOPS Hexagon NPU.

Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble (T7 Image or newer)
Status: ✅ Verified Working (Jan 29, 2026)


Radxa Dragon Q6A - NPU Quick Start Guide

Run Llama 3.2 1B on the 12 TOPS Hexagon NPU.

Hardware: Radxa Dragon Q6A (QCS6490)
OS: Ubuntu 24.04 Noble
Last tested: January 2026 (T7 image)


Radxa Dragon Q6A NPU Troubleshooting Guide

Status: 🔴 UNRESOLVED - Community help needed
Last Updated: January 29, 2026
Hardware: Radxa Dragon Q6A (QCS6490, 8GB RAM)
Goal: Run AI inference (Llama 3.2 1B) on the Hexagon NPU (12 TOPS)


Table of Contents