Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save jscott3201/ad69c4ffbd79f18b11a0f6a94c94fadf to your computer and use it in GitHub Desktop.

Select an option

Save jscott3201/ad69c4ffbd79f18b11a0f6a94c94fadf to your computer and use it in GitHub Desktop.
A drop-in replacement chat template for google/gemma-4-31B-it tuned for open-source agentic coding harnesses.
{#---------------------------------------------------------------------
custom_pub_chat_template_gemma4.jinja
=====================================
A public, harness-friendly fork of Google's Gemma 4 chat template,
tuned for open-source agentic coding harnesses like:
- anomalyco/opencode (https://github.com/anomalyco/opencode)
- earendil-works/pi (https://github.com/earendil-works/pi)
- openclaw, OpenHarness, similar Claude-Code-style harnesses
WHY THIS FORK EXISTS
--------------------
The upstream chat template at google/gemma-4-31B-it is correct for
chat use, but four real edge cases bite agentic coding harnesses:
1. tool_call.arguments arriving as a JSON string (Vercel AI SDK and
several OpenAI-compatible adapters serialize this way) is silently
wrapped in extra braces by the upstream template, producing invalid
Gemma 4 DSL like call:fn{{"city":"Tokyo"}} — nested braces, JSON
colons, and quoted keys, none of which the model was trained on.
Symptom: degraded tool-call accuracy, mysterious arguments collapse
to {} on repeated calls.
2. Prior-turn reasoning is dropped from history. The model card says
"historical model output should only include the final response,"
but agentic harnesses doing multi-step tool calls benefit
materially from keeping the prior reasoning visible. The Qwen3.6
analogue of this bug is documented at:
https://github.com/earendil-works/pi/issues/3325
Symptom (on Qwen and Gemma alike): after 2-3 turns, every tool call
collapses to arguments: {} even though the model's prior reasoning
correctly identified the parameters it needed.
3. enable_thinking defaults to FALSE in the upstream template, and
most OpenAI-compatible adapters drop unknown request fields:
https://github.com/anomalyco/opencode/issues/24264
So the harness ends up with thinking permanently off, agentic
tool-call accuracy suffers, and there's no obvious failure signal.
4. JSON null values inside tool_call.arguments render as the bare
string "None" (Python repr of None survives Jinja). Optional
fields are very common in coding tools (find_files, search:
pattern=..., language=null) and this corrupts the prompt silently.
This fork is also forked from a private engineering fork used in
internal harnesses; the public copy reuses the same five patches but
adds expanded comments, removes references to private design docs,
and ships with a self-contained pytest conformance suite.
PATCH INVENTORY (full details next to each patch site below)
------------------------------------------------------------
P1 format_argument: emit JSON null instead of bare "None"
P2 enable_thinking defaults to TRUE
P3 tool_call.arguments as string: RAISE instead of silent corruption
P4 preserve_thinking kwarg (default TRUE) keeps prior <|channel>
P5 fix HF discussion #62 turn-tag close asymmetry
INVARIANT
---------
With enable_thinking=False AND preserve_thinking=False passed
explicitly, this template renders byte-for-byte identical to the
upstream verbatim template on every input that doesn't hit P1, P3,
or P5's bug sites. The conformance suite at
tests/test_custom_pub_chat_template.py
locks this in across 21 representative cases.
USAGE
-----
Server side (e.g. vLLM or SGLang):
--chat-template /path/to/custom_pub_chat_template_gemma4.jinja
Harness side: no changes required for the common case. If you need
to force defaults off (e.g. to match upstream behaviour exactly):
{
"extra_body": {
"chat_template_kwargs": {
"enable_thinking": false,
"preserve_thinking": false
}
}
}
For opencode-style providers, this maps to the `chat_template_args`
field in models config; for pi, set thinkingFormat appropriately
in the provider's compat block and pi will inject these kwargs.
PINS
----
Forked from google/gemma-4-31B-it @ fcf2302760ae9c6e528a8dbba9dd636e56848237
Fork date: 2026-05-22
License: Apache 2.0 (same as upstream)
Maintainer: see repo README
---------------------------------------------------------------------#}
{%- macro format_parameters(properties, required, filter_keys=false) -%}
{%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
{%- set ns = namespace(found_first=false) -%}
{%- for key, value in properties | dictsort -%}
{%- set add_comma = false -%}
{%- if not filter_keys or key not in standard_keys -%}
{%- if ns.found_first %},{% endif -%}
{%- set ns.found_first = true -%}
{{ key }}:{
{%- if value['description'] -%}
description:<|"|>{{ value['description'] }}<|"|>
{%- set add_comma = true -%}
{%- endif -%}
{%- if value['type'] | upper == 'STRING' -%}
{%- if value['enum'] -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
enum:{{ format_argument(value['enum']) }}
{%- endif -%}
{%- elif value['type'] | upper == 'ARRAY' -%}
{%- if value['items'] is mapping and value['items'] -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
items:{
{%- set ns_items = namespace(found_first=false) -%}
{%- for item_key, item_value in value['items'] | dictsort -%}
{%- if item_value is not none -%}
{%- if ns_items.found_first %},{% endif -%}
{%- set ns_items.found_first = true -%}
{%- if item_key == 'properties' -%}
properties:{
{%- if item_value is mapping -%}
{{- format_parameters(item_value, value['items']['required'] | default([])) -}}
{%- endif -%}
}
{%- elif item_key == 'required' -%}
required:[
{%- for req_item in item_value -%}
<|"|>{{- req_item -}}<|"|>
{%- if not loop.last %},{% endif -%}
{%- endfor -%}
]
{%- elif item_key == 'type' -%}
{%- if item_value is string -%}
type:{{ format_argument(item_value | upper) }}
{%- else -%}
type:{{ format_argument(item_value | map('upper') | list) }}
{%- endif -%}
{%- else -%}
{{ item_key }}:{{ format_argument(item_value) }}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
}
{%- endif -%}
{%- endif -%}
{%- if value['nullable'] %}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
nullable:true
{%- endif -%}
{%- if value['type'] | upper == 'OBJECT' -%}
{%- if value['properties'] is defined and value['properties'] is mapping -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
properties:{
{{- format_parameters(value['properties'], value['required'] | default([])) -}}
}
{%- elif value is mapping -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
properties:{
{{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
}
{%- endif -%}
{%- if value['required'] -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
required:[
{%- for item in value['required'] | default([]) -%}
<|"|>{{- item -}}<|"|>
{%- if not loop.last %},{% endif -%}
{%- endfor -%}
]
{%- endif -%}
{%- endif -%}
{%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
type:<|"|>{{ value['type'] | upper }}<|"|>}
{%- endif -%}
{%- endfor -%}
{%- endmacro -%}
{%- macro format_function_declaration(tool_data) -%}
declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
{%- set params = tool_data['function']['parameters'] -%}
{%- if params -%}
,parameters:{
{%- if params['properties'] -%}
properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
{%- endif -%}
{%- if params['required'] -%}
required:[
{%- for item in params['required'] -%}
<|"|>{{- item -}}<|"|>
{{- ',' if not loop.last -}}
{%- endfor -%}
],
{%- endif -%}
{%- if params['type'] -%}
type:<|"|>{{- params['type'] | upper -}}<|"|>}
{%- endif -%}
{%- endif -%}
{%- if 'response' in tool_data['function'] -%}
{%- set response_declaration = tool_data['function']['response'] -%}
,response:{
{%- if response_declaration['description'] -%}
description:<|"|>{{- response_declaration['description'] -}}<|"|>,
{%- endif -%}
{%- if response_declaration['type'] | upper == 'OBJECT' -%}
type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
{%- endif -%}
{%- endif -%}
}
{%- endmacro -%}
{%- macro format_argument(argument, escape_keys=True) -%}
{#- P1 (public fork): emit JSON null for None values rather than the
bare string "None". Jinja's default coercion of Python's None
goes through str(None) -> "None", which then leaks into the
Gemma 4 DSL as a literal token the model has never been trained
on. Common bite path: a coding tool's optional argument
(language=null in a find-files call, after=null in a search,
etc.) → upstream emits after:None in the DSL → model
confusion. We emit after:null instead, matching the JSON wire
format the model has actually seen.
Branch ordering: `is none` must precede `is string`, `is
mapping`, `is sequence`, etc., because None matches NONE of
them in Jinja's type tests but the final else-branch
({{ argument }}) would otherwise stringify it. -#}
{%- if argument is none -%}
{{- 'null' -}}
{%- elif argument is string -%}
{{- '<|"|>' + argument + '<|"|>' -}}
{%- elif argument is boolean -%}
{{- 'true' if argument else 'false' -}}
{%- elif argument is mapping -%}
{{- '{' -}}
{%- set ns = namespace(found_first=false) -%}
{%- for key, value in argument | dictsort -%}
{%- if ns.found_first %},{% endif -%}
{%- set ns.found_first = true -%}
{%- if escape_keys -%}
{{- '<|"|>' + key + '<|"|>' -}}
{%- else -%}
{{- key -}}
{%- endif -%}
:{{- format_argument(value, escape_keys=escape_keys) -}}
{%- endfor -%}
{{- '}' -}}
{%- elif argument is sequence -%}
{{- '[' -}}
{%- for item in argument -%}
{{- format_argument(item, escape_keys=escape_keys) -}}
{%- if not loop.last %},{% endif -%}
{%- endfor -%}
{{- ']' -}}
{%- else -%}
{{- argument -}}
{%- endif -%}
{%- endmacro -%}
{%- macro strip_thinking(text) -%}
{%- set ns = namespace(result='') -%}
{%- for part in text.split('<channel|>') -%}
{%- if '<|channel>' in part -%}
{%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
{%- else -%}
{%- set ns.result = ns.result + part -%}
{%- endif -%}
{%- endfor -%}
{{- ns.result | trim -}}
{%- endmacro -%}
{%- macro format_tool_response_block(tool_name, response) -%}
{{- '<|tool_response>' -}}
{%- if response is mapping -%}
{{- 'response:' + tool_name + '{' -}}
{%- for key, value in response | dictsort -%}
{{- key -}}:{{- format_argument(value, escape_keys=False) -}}
{%- if not loop.last %},{% endif -%}
{%- endfor -%}
{{- '}' -}}
{%- else -%}
{{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
{%- endif -%}
{{- '<tool_response|>' -}}
{%- endmacro -%}
{%- set ns = namespace(prev_message_type=None) -%}
{%- set loop_messages = messages -%}
{#- P2 (public fork): default enable_thinking to TRUE.
Why: Gemma 4's upstream template defaults enable_thinking to False
(or undefined). This is wrong for agentic coding harnesses for two
reasons:
1. Google's own model card: thinking "significantly enhances
function-calling accuracy" — and tool calling IS the core
contract that coding harnesses use the model for. Defaulting it
off means most opencode/pi users see degraded tool accuracy and
have no obvious way to fix it.
2. Most OpenAI-compatible SDKs (notably Vercel AI SDK used by
opencode) strip unknown request fields, so a harness that tries
to pass chat_template_kwargs.enable_thinking=true per request
has it silently dropped. See:
https://github.com/anomalyco/opencode/issues/24264
Flipping the SERVER-SIDE default to True makes "the agentic
happy-path" the default and lets harnesses that explicitly want
chat-only behaviour override it to false per request:
{"extra_body":{"chat_template_kwargs":{"enable_thinking":false}}}
After this `set`, enable_thinking is unconditionally defined as a
bool, so downstream `is defined` guards are dropped. -#}
{%- set enable_thinking = enable_thinking | default(true) -%}
{{- bos_token -}}
{#- Handle System/Tool Definitions Block -#}
{%- if enable_thinking or tools or messages[0]['role'] in ['system', 'developer'] -%}
{{- '<|turn>system\n' -}}
{#- Inject Thinking token at the very top of the FIRST system turn -#}
{%- if enable_thinking -%}
{{- '<|think|>\n' -}}
{%- set ns.prev_message_type = 'think' -%}
{%- endif -%}
{%- if messages[0]['role'] in ['system', 'developer'] -%}
{%- if messages[0]['content'] is string -%}
{{- messages[0]['content'] | trim -}}
{%- elif messages[0]['content'] is sequence -%}
{%- for item in messages[0]['content'] -%}
{{- item['text'] | trim + ' '-}}
{%- endfor -%}
{%- endif -%}
{%- set loop_messages = messages[1:] -%}
{%- endif -%}
{%- if tools -%}
{%- for tool in tools %}
{{- '<|tool>' -}}
{{- format_function_declaration(tool) | trim -}}
{{- '<tool|>' -}}
{%- endfor %}
{%- set ns.prev_message_type = 'tool' -%}
{%- endif -%}
{{- '<turn|>\n' -}}
{%- endif %}
{#- P4 (public fork): preserve_thinking kwarg, default TRUE.
Why: upstream's reasoning re-emission gate fires only when an
assistant message (a) carries `reasoning`/`reasoning_content`,
(b) has tool_calls, AND (c) is AFTER the last user message. That
third clause is what causes the canonical multi-turn-tool-loop
breakage:
User: "find files matching '*.py' in src"
Assistant: (reasoning=...calling find_files...) tool_call:
find_files(pattern='*.py', dir='src')
Tool: [result list]
User: "now look for '*.ts' too"
Assistant: (reasoning=...) tool_call: find_files(pattern={}, dir={})
↑↑↑ arguments collapse to empty here because the prior
reasoning the model would have learned to imitate is
invisible — the previous-turn <|channel> was dropped.
The same shape was reported on Qwen3.6 and resolved by the
preserve_thinking kwarg there:
https://github.com/earendil-works/pi/issues/3325
Gemma 4's model card says "historical model output should only
include the final response" — that guidance is correct for plain
chat but actively harmful for multi-turn agentic tool calling. P4
optionally drops the (c) gate so prior reasoning stays visible to
the model on subsequent turns.
Set preserve_thinking=false to recover upstream behaviour exactly
(used by the conformance suite to verify byte-identity). -#}
{%- set preserve_thinking = preserve_thinking | default(true) -%}
{#- Pre-scan: find last user message index for reasoning guard -#}
{%- set ns_turn = namespace(last_user_idx=-1) -%}
{%- for i in range(loop_messages | length) -%}
{%- if loop_messages[i]['role'] == 'user' -%}
{%- set ns_turn.last_user_idx = i -%}
{%- endif -%}
{%- endfor -%}
{#- Loop through messages -#}
{%- for message in loop_messages -%}
{%- if message['role'] != 'tool' -%}
{%- set ns.prev_message_type = None -%}
{%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
{#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
{%- set prev_nt = namespace(role=None, found=false) -%}
{%- if loop.index0 > 0 -%}
{%- for j in range(loop.index0 - 1, -1, -1) -%}
{%- if not prev_nt.found -%}
{%- if loop_messages[j]['role'] != 'tool' -%}
{%- set prev_nt.role = loop_messages[j]['role'] -%}
{%- set prev_nt.found = true -%}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
{%- if not continue_same_model_turn -%}
{{- '<|turn>' + role + '\n' }}
{%- endif -%}
{#- Render reasoning/reasoning_content as thinking channel.
Upstream gate (all three required to re-emit):
(a) the message carries reasoning or reasoning_content,
(b) the message has tool_calls,
(c) the message is after the last user message in history.
P4 (public fork): when preserve_thinking is true (default), drop
clause (c) so prior assistant turns' <|channel> blocks survive.
See the long P4 comment above the pre-scan for why this matters
for agentic tool loops. The (b) gate stays — re-emitting a
<|channel> on a finalised text-only assistant turn is not in
the model's training distribution. -#}
{%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
{%- set thinking_gate = (loop.index0 > ns_turn.last_user_idx) or preserve_thinking -%}
{%- if thinking_text and thinking_gate and message.get('tool_calls') -%}
{{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
{%- endif -%}
{%- if message['tool_calls'] -%}
{%- for tool_call in message['tool_calls'] -%}
{%- set function = tool_call['function'] -%}
{{- '<|tool_call>call:' + function['name'] + '{' -}}
{%- if function['arguments'] is mapping -%}
{%- set ns_args = namespace(found_first=false) -%}
{%- for key, value in function['arguments'] | dictsort -%}
{%- if ns_args.found_first %},{% endif -%}
{%- set ns_args.found_first = true -%}
{{- key -}}:{{- format_argument(value, escape_keys=False) -}}
{%- endfor -%}
{%- elif function['arguments'] is none -%}
{#- P3 (public fork): None / missing arguments is
valid (means: call this tool with no args).
Emit an empty {} via the empty for-loop above. -#}
{%- else -%}
{#- P3 (public fork): refuse string (or any other
non-mapping) arguments rather than silently
corrupting the prompt.
Bug surface: many OpenAI-compatible SDKs (most
notably Vercel AI SDK, used by opencode) hand
tool_call.arguments back as a JSON-encoded
STRING — e.g. '{"city":"Tokyo"}' — rather
than the already-deserialized object. The
upstream Gemma 4 template silently emits this
string verbatim inside an extra pair of
braces, producing invalid Gemma 4 DSL:
call:fn{{"city":"Tokyo"}}
(nested braces, JSON colons, quoted keys —
none of which the model has been trained on).
The model usually still produces a plausible
response, which makes the bug INSIDIOUS: it
looks like a quality problem with the model,
not a prompt-corruption bug in the harness.
Fix: harnesses MUST deserialize
tool_calls[].function.arguments
exactly once on ingest and store the object.
See the canonical pi-side discussion:
https://github.com/earendil-works/pi/issues/3325
We raise here so the bug surfaces at the
server (an obvious HTTP error to debug)
rather than as a quiet model-output
regression. -#}
{{- raise_exception(
"custom_pub_chat_template_gemma4: "
"tool_calls[].function.arguments must be a JSON "
"object (mapping). Got a "
~ (function['arguments'] | string | length | string)
~ "-char "
~ (function['arguments'].__class__.__name__ if function['arguments'].__class__ is defined else 'non-mapping')
~ ". This is almost always the harness handing back "
"a JSON-encoded STRING rather than the deserialized "
"object. Deserialize once on ingest and store the "
"object. See: github.com/earendil-works/pi/issues/3325"
) -}}
{%- endif -%}
{{- '}<tool_call|>' -}}
{%- endfor -%}
{%- set ns.prev_message_type = 'tool_call' -%}
{%- endif -%}
{%- set ns_tr_out = namespace(flag=false) -%}
{%- if message.get('tool_responses') -%}
{#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
{%- for tool_response in message['tool_responses'] -%}
{{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
{%- set ns_tr_out.flag = true -%}
{%- set ns.prev_message_type = 'tool_response' -%}
{%- endfor -%}
{%- elif message.get('tool_calls') -%}
{#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
{%- set ns_tool_scan = namespace(stopped=false) -%}
{%- for k in range(loop.index0 + 1, loop_messages | length) -%}
{%- if ns_tool_scan.stopped -%}
{%- elif loop_messages[k]['role'] != 'tool' -%}
{%- set ns_tool_scan.stopped = true -%}
{%- else -%}
{%- set follow = loop_messages[k] -%}
{#- Resolve tool_call_id to function name -#}
{%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
{%- for tc in message['tool_calls'] -%}
{%- if tc.get('id') == follow.get('tool_call_id') -%}
{%- set ns_tname.name = tc['function']['name'] -%}
{%- endif -%}
{%- endfor -%}
{#- Handle content as string or content-parts array -#}
{%- set tool_body = follow.get('content') -%}
{%- if tool_body is string -%}
{{- format_tool_response_block(ns_tname.name, tool_body) -}}
{%- elif tool_body is sequence and tool_body is not string -%}
{%- set ns_txt = namespace(s='') -%}
{%- for part in tool_body -%}
{%- if part.get('type') == 'text' -%}
{%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
{%- endif -%}
{%- endfor -%}
{{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
{%- for part in tool_body -%}
{%- if part.get('type') == 'image' -%}
{{- '<|image|>' -}}
{%- elif part.get('type') == 'audio' -%}
{{- '<|audio|>' -}}
{%- elif part.get('type') == 'video' -%}
{{- '<|video|>' -}}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{- format_tool_response_block(ns_tname.name, tool_body) -}}
{%- endif -%}
{%- set ns_tr_out.flag = true -%}
{%- set ns.prev_message_type = 'tool_response' -%}
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- set captured_content -%}
{%- if message['content'] is string -%}
{%- if role == 'model' -%}
{{- strip_thinking(message['content']) -}}
{%- else -%}
{{- message['content'] | trim -}}
{%- endif -%}
{%- elif message['content'] is sequence -%}
{%- for item in message['content'] -%}
{%- if item['type'] == 'text' -%}
{%- if role == 'model' -%}
{{- strip_thinking(item['text']) -}}
{%- else -%}
{{- item['text'] | trim -}}
{%- endif -%}
{%- elif item['type'] == 'image' -%}
{{- '<|image|>' -}}
{%- set ns.prev_message_type = 'image' -%}
{%- elif item['type'] == 'audio' -%}
{{- '<|audio|>' -}}
{%- set ns.prev_message_type = 'audio' -%}
{%- elif item['type'] == 'video' -%}
{{- '<|video|>' -}}
{%- set ns.prev_message_type = 'video' -%}
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- endset -%}
{{- captured_content -}}
{%- set has_content = captured_content | trim | length > 0 -%}
{#- P5 (public fork): symmetric continuation close-suppression
for HF discussion #62.
The bug: upstream's open suppression at the top of this
iteration drops the `<|turn>model\n` header when the
previous non-tool message was also assistant — but the
close below ALWAYS emits `<turn|>\n`. Two back-to-back
text-only assistant messages therefore render as:
<|turn>model\npart 1<turn|>\npart 2<turn|>\n
That's one open, two closes — malformed. The model
(Google-confirmed in HF discussion #62) sees it as a
truncated and re-opened turn, which destabilises long
multi-step agentic histories that accumulate consecutive
assistant messages.
Fix: forward-scan for the next non-tool message. If it is
another assistant AND this iteration is a TEXT-ONLY
assistant message (no tool_calls, no tool_responses), the
next iteration will continue this same turn frame, so
suppress this iteration's close and emit a single `\n` so
the two contents don't byte-glue together.
The narrowing condition (`not message.get('tool_calls')
and not ns_tr_out.flag`) is critical: the tool-call +
tool-response chain MUST close normally so the model still
sees a balanced turn frame around the `<|tool_response>`
block. Conformance test T13 locks this in. -#}
{%- set next_nt = namespace(role=None, found=false) -%}
{%- for j in range(loop.index0 + 1, loop_messages | length) -%}
{%- if not next_nt.found -%}
{%- if loop_messages[j]['role'] != 'tool' -%}
{%- set next_nt.role = loop_messages[j]['role'] -%}
{%- set next_nt.found = true -%}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
{%- set continues_into_next = (
role == 'model'
and next_nt.role == 'assistant'
and not message.get('tool_calls')
and not ns_tr_out.flag
) -%}
{%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
{{- '<|tool_response>' -}}
{%- elif continues_into_next -%}
{{- '\n' -}}
{%- elif not (ns_tr_out.flag and not has_content) -%}
{{- '<turn|>\n' -}}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
{%- if add_generation_prompt -%}
{%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
{{- '<|turn>model\n' -}}
{#- When thinking is disabled, the upstream contract is to
pre-fill an empty `<|channel>thought\n<channel|>` block so
the model skips reasoning. After P2's set at the top of
the file, `enable_thinking` is unconditionally a bool, so
the upstream `| default(false)` is unnecessary. (It also
had a Jinja precedence trap: `|` binds tighter than `not`,
parsing as `not (enable_thinking | default(false))`. The
simple `not enable_thinking` form is equivalent and
clearer.) -#}
{%- if not enable_thinking -%}
{{- '<|channel>thought\n<channel|>' -}}
{%- endif -%}
{%- endif -%}
{%- endif -%}
@SourceCodeplz

Copy link
Copy Markdown

thank you for this!

@hashangit

Copy link
Copy Markdown

I was getting some errors when trying this template with Pi and LM Studio inference.
Fixed version here:
https://gist.github.com/hashangit/97dcd4ea33dc19c9f4e2d40877c34738/revisions?diff=split&w

(Thank you for the original)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment