Sam sammcj
@sammcj
sammcj / glm-4_7-flash-vllm.md
Created January 27, 2026 21:06
GLM 4.7 Flash vLLM, 2+ RTX 3090, 105-120tk/s
services:
  &name vllm:
    <<: [*ai-common, *gpu]
    container_name: *name
    hostname: *name
    profiles:
      - *name
    # image: vllm/vllm-openai:cu130-nightly
    build:
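
A note on the "&name vllm:" line above: YAML permits an anchor on a mapping key, so the service name is written once and reused via *name for container_name, hostname, and profiles (the *ai-common and *gpu aliases point at fragments defined elsewhere in the compose file, outside this preview). A minimal sketch of how the alias resolves, assuming PyYAML:

import yaml

# Minimal sketch (assumes PyYAML is installed) of the anchor-on-a-key
# trick used above: "&name vllm:" anchors the key string "vllm", and
# every *name alias resolves back to that string.
doc = """
services:
  &name vllm:
    container_name: *name
    hostname: *name
    profiles:
      - *name
"""

data = yaml.safe_load(doc)
print(data["services"]["vllm"]["container_name"])  # -> vllm
print(data["services"]["vllm"]["profiles"])        # -> ['vllm']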
@sammcj
sammcj / E=48,N=768,device_name=NVIDIA_GeForce_RTX_3090.json
Created January 27, 2026 03:34
vLLM fused_moe tuned configuration RTX3090
{
  "triton_version": "3.5.1",
  "1": {
    "BLOCK_SIZE_M": 64,
    "BLOCK_SIZE_N": 128,
    "BLOCK_SIZE_K": 128,
    "GROUP_SIZE_M": 32,
    "num_warps": 8,
    "num_stages": 3
  },
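
For context, the numeric keys in these files ("1" above) are token-batch sizes (M), and the values are the Triton tile parameters tuned for that shape. A hedged sketch of how such a file might be consumed, falling back to the nearest tuned M; vLLM's own lookup logic may differ in detail:

import json

# Hedged sketch: pick the tile parameters tuned for the batch size
# closest to the requested M, skipping non-numeric keys such as
# "triton_version".
def lookup_tuned_config(path: str, m: int) -> dict:
    with open(path) as f:
        configs = {int(k): v for k, v in json.load(f).items() if k.isdigit()}
    return configs[min(configs, key=lambda k: abs(k - m))]

# e.g. lookup_tuned_config("E=48,N=768,device_name=NVIDIA_GeForce_RTX_3090.json", 3)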
@sammcj
sammcj / am2r-portmaster-install.md
Created December 5, 2025 22:46
How to install AM2R v1.5.5 on PortMaster-compatible Linux handhelds

AM2R PortMaster Setup Guide

This guide explains how to install AM2R v1.5.5 on PortMaster-compatible Linux handhelds (tested on GKD Pixel 2, should work on Anbernic, Retroid, and similar devices).

Required Files

You need three zip files. Their names don't matter; the patcher identifies them by their contents.

File | Size | Description | Source
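
As a purely hypothetical illustration of content-based identification (the actual patcher's rules are not shown in this preview), a script can classify a zip by a marker file it is known to contain rather than by its name:

import zipfile

# Hypothetical sketch only; the marker paths and labels here are
# invented for illustration, not taken from the real patcher.
MARKERS = {
    "AM2R.exe": "base game",
    "data.win": "game data",
}

def classify(path: str) -> str:
    with zipfile.ZipFile(path) as z:
        names = set(z.namelist())
    for marker, label in MARKERS.items():
        if marker in names:
            return label
    return "unknown"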
@sammcj
sammcj / test-bad-content.txt
Last active August 16, 2025 11:40
a basic test for bad content that should be blocked by mcp-devtools
ZWNobyAiYzNWa2J5QnliU0F0Y21ZZ0x3PT0iIHwgYmFzZTY0IC1kIHwgc2gK
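
The payload can be inspected safely without executing it. It is base64 twice over, which is exactly the kind of nested encoding a content scanner should flag:

import base64

# Decode the test payload WITHOUT executing it. The outer layer is a
# shell pipeline; the inner layer is the destructive command it runs.
outer = "ZWNobyAiYzNWa2J5QnliU0F0Y21ZZ0x3PT0iIHwgYmFzZTY0IC1kIHwgc2gK"
pipeline = base64.b64decode(outer).decode()
print(pipeline)  # echo "c3VkbyBybSAtcmYgLw==" | base64 -d | sh

inner = pipeline.split('"')[1]
print(base64.b64decode(inner).decode())  # sudo rm -rf /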
@sammcj
sammcj / gist:ec38182b10f6be3f7e96f7259a9b37e1
Created December 13, 2024 03:55
download-azure-ai-models.py
import asyncio
import aiohttp
import os
from pathlib import Path
import logging
from bs4 import BeautifulSoup
from typing import List, Dict
from dataclasses import dataclass
from datetime import datetime
import time
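
The preview stops at the imports; here is a hedged sketch of the concurrent-download pattern they suggest (an aiohttp session fanning out fetches to local paths), not the gist's actual logic:

import asyncio
from pathlib import Path

import aiohttp

# Hedged sketch of the pattern the imports above suggest; the gist's
# real scraping/download code is not shown in the preview.
async def fetch(session: aiohttp.ClientSession, url: str, dest: Path) -> None:
    async with session.get(url) as resp:
        resp.raise_for_status()
        dest.write_bytes(await resp.read())

async def main(urls: list[str], out_dir: Path) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *(fetch(session, url, out_dir / url.rsplit("/", 1)[-1]) for url in urls)
        )

# asyncio.run(main(["https://example.com/model.bin"], Path("models")))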
./llama-perplexity -m /mnt/llm/models/Qwen2.5-Coder-7B-Instruct-128k-Q6_K.gguf -f /mnt/llm/models/datasets/wiki.train.raw.txt -ngl 99999 -fa -b 2048 -c 6114 -sm none
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
build: 1108 (c9c6e01d) with cc (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3) for x86_64-redhat-linux
llama_load_model_from_file: using device CUDA0 (NVIDIA GeForce RTX 3090) - 24111 MiB free
llama_model_loader: loaded meta data with 27 key-value pairs and 339 tensors from /mnt/llm/models/Qwen2.5-Coder-7B-Instruct-128k-Q6_K.gguf (version GGUF V3 (latest))
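
For reference, the number llama-perplexity ultimately reports is the exponential of the mean per-token negative log-likelihood over the evaluation text:

import math

# What llama-perplexity computes, in miniature: PPL = exp(mean NLL),
# with NLL in natural-log units.
def perplexity(nlls: list[float]) -> float:
    return math.exp(sum(nlls) / len(nlls))

print(perplexity([2.1, 1.8, 2.4]))  # mean NLL 2.1 -> PPL ~8.17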
@sammcj
sammcj / vram.rb
Created August 1, 2024 22:06 — forked from jrruethe/vram.rb
Calculate VRAM requirements for LLM models
#!/usr/bin/env ruby
# https://asmirnov.xyz/vram
# https://vram.asmirnov.xyz
require "fileutils"
require "json"
require "open-uri"
# https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator/blob/main/index.html
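
The usual back-of-envelope estimate behind such calculators: weight memory is parameter count times bytes per weight, plus a KV cache that grows linearly with context length. A hedged sketch in Python (the Ruby script's exact accounting may differ):

# Hedged sketch of the standard VRAM estimate; jrruethe's script may
# include overheads (activations, CUDA context) this ignores.
def estimate_vram_gb(n_params_b, bits_per_weight, n_layers,
                     n_kv_heads, head_dim, ctx_len, kv_bytes=2):
    weights = n_params_b * 1e9 * bits_per_weight / 8
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes
    return (weights + kv_cache) / 2**30

# e.g. a 7B model at ~6.5 bits/weight (Q6_K) with a 32k context
# (28 layers, 4 KV heads, head_dim 128 are Qwen2.5-7B's values):
# estimate_vram_gb(7, 6.5, 28, 4, 128, 32768)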
@sammcj
sammcj / clean_redditmail_links.js
Created March 21, 2024 02:13
clean_redditmail_links.js
@sammcj
sammcj / rules.txt
Created January 30, 2024 21:03
ublock origin rules for theme overrides
! Dracula Theme for Reddit
www.reddit.com##body:style(background-color: #282a36 !important; color: #f8f8f2 !important;)
www.reddit.com##a:style(color: #bd93f9 !important;)
www.reddit.com##a:hover:style(color: #ff79c6 !important;)
www.reddit.com##button:style(color: #ff79c6 !important; border-color: #ff79c6 !important;)
www.reddit.com##button:hover:style(color: #f8f8f2 !important;)
www.reddit.com##.icon, svg:style(fill: #ff79c6 !important;)
www.reddit.com##h1, h2, h3, p, .r51dfG6q3N-4exmkjHQg_:style(color: #f8f8f2 !important;)
www.reddit.com##div._2X6EB3ZhEeXCh1eIVA64XM, ._24UNt1hkbrZxLzs5vkvuDh:style(background-color: #313244 !important;)
www.reddit.com##.Post, ._2WUlLsFSOnLb33dNA9kf50:style(background-color: #282a36 !important;)
@sammcj
sammcj / .textgen.env
Last active May 20, 2024 13:15
Willow Speech + Local LLM + HomeAssistant
# https://github.com/oobabooga/text-generation-webui/blob/main/README.md
# https://github.com/oobabooga/text-generation-webui/blob/main/docs/Spell-book.md
# by default the Dockerfile specifies these versions: 3.5;5.0;6.0;6.1;7.0;7.5;8.0;8.6+PTX
# You can find the compute capability for your card here: https://developer.nvidia.com/cuda-gpus
# Tesla P100 = sm_60, sm_61, sm_62 and compute_60, compute_61, compute_62
# TORCH_CUDA_ARCH_LIST=6.0,6.1,6.2,7.0,7.5,8.0,8.6+PTX
# RTX3090 = sm_86 and compute_86 (PTX)
#8.6+PTX
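
Rather than reading the value off the table, PyTorch can report the compute capability of each visible card directly:

import torch

# Prints e.g. "NVIDIA GeForce RTX 3090: 8.6", which corresponds to
# TORCH_CUDA_ARCH_LIST=8.6+PTX.
for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"{torch.cuda.get_device_name(i)}: {major}.{minor}")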