$ ./deploy/openshift/deploy-to-openshift.sh --kserve --no-observability
[SUCCESS] Logged in as cluster-admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Installing KServe and LLMInferenceService CRDs...
[INFO] InferenceService CRD already installed.
[INFO] LLMInferenceService CRD already installed.
[INFO] cert-manager namespace already present.
deployment.apps/cert-manager condition met
deployment.apps/cert-manager-webhook condition met
deployment.apps/cert-manager-cainjector condition met
deployment.apps/kserve-controller-manager condition met
[SUCCESS] KServe webhook service has ready endpoints
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:anyuid added: "llmisvc-controller-manager"
deployment.apps/llmisvc-controller-manager restarted
deployment.apps/llmisvc-controller-manager condition met
[SUCCESS] LLMInferenceService webhook has ready endpoints
[INFO] Ensuring LLMInferenceServiceConfig templates...
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-decode-template unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-decode-worker-data-parallel unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-prefill-template unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-prefill-worker-data-parallel unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-router-route unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-scheduler unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-template unchanged
llminferenceserviceconfig.serving.kserve.io/kserve-config-llm-worker-data-parallel unchanged
configmap/inferenceservice-config patched (no change)
[SUCCESS] All KServe CRDs already installed.
deployment.apps/llmisvc-controller-manager condition met
[INFO] Found 1 node(s) with GPU resources
[INFO] Granting privileged SCC to default service account for model download...
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "default"
[INFO] Deploying Qwen 0.6B LLMInferenceService on GPU...
llminferenceservice.serving.kserve.io/qwen-0-6b created
[INFO] Waiting for Qwen LLMInferenceService to be ready (this may take several minutes for model download)...
llminferenceservice.serving.kserve.io/qwen-0-6b condition met
[INFO] KServe mode: Deploying semantic-router with KServe backend...
==================================================
vLLM Semantic Router - KServe Deployment
==================================================
Configuration:
Namespace: vllm-semantic-router-system
Simulator Mode: false
LLMInferenceService: qwen-0-6b
Model Name: Qwen/Qwen3-0.6B
Embedding Model: all-MiniLM-L12-v2
Storage Class: <cluster default>
Models PVC Size: 10Gi
Cache PVC Size: 5Gi
Dry Run: false
Step 1: Validating prerequisites...
✓ OpenShift CLI found
✓ Logged in as cluster-admin
✓ Namespace exists: vllm-semantic-router-system
✓ LLMInferenceService exists: qwen-0-6b
✓ LLMInferenceService is ready
Creating stable ClusterIP service for predictor: qwen-0-6b
✓ Predictor service ClusterIP: 172.30.108.88 (stable across pod restarts)
Step 2: Generating manifests...
✓ Generated: configmap-router-config.yaml
✓ Generated: configmap-envoy-config.yaml
✓ Generated: serviceaccount.yaml
✓ Generated: pvc.yaml
✓ Generated: peerauthentication.yaml
✓ Generated: deployment.yaml
✓ Generated: service.yaml
✓ Generated: route.yaml
Step 3: Deploying to OpenShift...
serviceaccount/semantic-router unchanged
persistentvolumeclaim/semantic-router-models created
persistentvolumeclaim/semantic-router-cache created
configmap/semantic-router-kserve-config created
configmap/semantic-router-envoy-kserve-config created
Skipping PeerAuthentication (Istio CRD not found).
deployment.apps/semantic-router-kserve created
service/semantic-router-kserve created
route.route.openshift.io/semantic-router-kserve created
route.route.openshift.io/semantic-router-kserve-api created
✓ Resources deployed successfully
Step 4: Waiting for deployment to be ready...
This may take a few minutes while models are downloaded...
Waiting for pod... (1/36)
Waiting for pod... (2/36)
Initializing... (downloading models)
Initializing... (downloading models)
Initializing... (downloading models)
Initializing... (downloading models)
Initializing... (downloading models)
Initializing... (downloading models)
Waiting for pod... (9/36)
Waiting for pod... (10/36)
Waiting for pod... (11/36)
Waiting for pod... (12/36)
Quick status (init logs):
Downloaded sentence-transformers/all-MiniLM-L12-v2
All models downloaded successfully!
Model download complete!
total 40
drwxrwsr-x. 8 root 1001240000 4096 Jan 30 06:04 .
drwxr-xr-t. 4 root root 33 Jan 30 06:04 ..
drwxr-sr-x. 6 1001240000 1001240000 4096 Jan 30 06:04 all-MiniLM-L12-v2
drwxr-sr-x. 3 1001240000 1001240000 4096 Jan 30 06:04 category_classifier_modernbert-base_model
drwxr-sr-x. 3 1001240000 1001240000 4096 Jan 30 06:04 jailbreak_classifier_modernbert-base_model
drwxrws---. 2 root 1001240000 16384 Jan 30 06:04 lost+found
drwxr-sr-x. 3 1001240000 1001240000 4096 Jan 30 06:04 pii_classifier_modernbert-base_model
drwxr-sr-x. 3 1001240000 1001240000 4096 Jan 30 06:04 pii_classifier_modernbert-base_presidio_token_model
Setting proper permissions...
Creating cache directories...
Model download complete!
Waiting for pod... (13/36)
Waiting for pod... (14/36)
Waiting for pod... (15/36)
Waiting for pod... (16/36)
Waiting for pod... (17/36)
Waiting for pod... (18/36)
Waiting for pod... (19/36)
Waiting for pod... (20/36)
Waiting for pod... (21/36)
Waiting for pod... (22/36)
Waiting for pod... (23/36)
Waiting for pod... (24/36)
Waiting for pod... (25/36)
Waiting for pod... (26/36)
✓ Pod is ready: semantic-router-kserve-5696479cbd-r9p6d
✓ External URL: https://semantic-router-kserve-vllm-semantic-router-system.apps.brent.pcbk.p1.openshiftapps.com
==================================================
Deployment Complete!
==================================================
Next steps:
1. Set the route:
ENVOY_ROUTE=semantic-router-kserve-vllm-semantic-router-system.apps.brent.pcbk.p1.openshiftapps.com
2. Test model auto-routing:
curl -k -X POST https://semantic-router-kserve-vllm-semantic-router-system.apps.brent.pcbk.p1.openshiftapps.com/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"Explain the elements of a contract under common law and give a simple example."}]}'
3. View logs:
oc logs -l app=semantic-router -c semantic-router -n vllm-semantic-router-system -f
For more information, see: /semantic-router/deploy/kserve/README.md
[SUCCESS] KServe deployment complete
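The "Waiting for pod... (n/36)" lines in the log above look like a bounded polling loop. The script's own code is not shown here, so the following is only a minimal sketch of that pattern: the function name `wait_until`, its argument order, and the message text are hypothetical, not taken from deploy-to-openshift.sh.

```shell
# Hypothetical helper (not from the deploy script): retry a command until it
# succeeds or the attempt budget is used up.
# Usage: wait_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Note: plain POSIX sh, so the loop variables are globals rather than locals.
wait_until() {
  max_attempts=$1
  interval=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Run the probe command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    echo "Waiting... (${attempt}/${max_attempts})" >&2
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  # Every attempt failed.
  return 1
}
```

For example, `wait_until 36 5 oc wait --for=condition=Ready pod -l app=semantic-router -n vllm-semantic-router-system --timeout=5s` would poll pod readiness up to 36 times. The 5-second interval and timeout are assumptions; only the 36-attempt limit and the label appear in the log.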
Validation
$ API_ROUTE=$(oc get route semantic-router-kserve-api -n vllm-semantic-router-system -o jsonpath='{.spec.host}')
$ curl -k -X POST "https://${API_ROUTE}/api/v1/classify/intent" \
-H "Content-Type: application/json" \
-d '{"text":"What is machine learning?"}' | jq .
{
"classification": {
"category": "other",
"confidence": 0,
"processing_time_ms": 75
},
"recommended_model": "Qwen/Qwen3-0.6B",
"routing_decision": "low_confidence_general",
"matched_signals": {}
}
$ curl -k -X POST https://semantic-router-kserve-vllm-semantic-router-system.apps.brent.pcbk.p1.openshiftapps.com/v1/chat/completions -H "Content-Type: application/json" -d '{"model":"auto","messages":[{"role":"user","content":"Explain the elements of a contract under common law and give a simple example."}]}' | jq .
{
"id": "chatcmpl-377fa1ba-e859-455b-99fc-4af49ca9be46",
"object": "chat.completion",
"created": 1769754086,
"model": "Qwen/Qwen3-0.6B",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "<think>\nOkay, the user wants me to explain the elements of a contract under common law and give a simple example. Let me start by recalling the main elements of a contract. First, there must be an offer, right? So the parties have to make a clear promise. Then there's acceptance, which is the other party agreeing. Next, the consideration, which is something of value exchanged. Maybe a promise to do something. Then there's the validity of the terms, like if the contract is made in good faith or if it's void due to fraud or mistake. Also, the performance, so both parties must fulfill their promises. And finally, the termination, like if one party can cancel it.\n\nLet me check if I'm missing anything. Oh, right, there's also the issue of intent and legal capacity. For example, if one party is a minor, they can't enter a contract. Also, the validity of the contract, like if it's in a place where it's illegal. The example needs to be simple. Maybe a common scenario like buying a house. The buyer offers the price, the seller agrees, they exchange money, and both meet their obligations. But I need to make sure the elements are covered clearly. Let me structure it step by step and use a simple example to illustrate each element. Alright, that should cover it.\n</think>\n\nUnder **common law**, a contract must satisfy **six essential elements** to be valid. Here's a simple breakdown and a **simple example**:\n\n---\n\n### **Elements of a Contract under Common Law:**\n\n1. **Offer**: One party (the **offeror**) makes a clear and specific promise to another party (the **offeree**) to do or not do something. \n *Example*: A buyer agrees to buy a house for $20,000. \n\n2. **Acceptance**: The other party (the **offeree**) agrees to the same terms. \n *Example*: The buyer accepts the offer to buy the house. \n\n3. **Consideration**: Money, goods, or services exchanged between the parties. \n *Example*: The buyer pays the seller $20,000 in exchange for the house. 
\n\n4. **Capacity**: Both parties must have legal capacity (i.e., be of age, mentally sound, and not minors). \n *Example*: A minor cannot enter a contract. \n\n5. **Validity**: The contract must be made **in good faith**, **without fraud**, or **without legal violation**. \n *Example*: A contract made in a public place is valid. \n\n6. **Performance**: Both parties must fulfill their promises. \n *Example*: The buyer pays the seller, and the seller delivers the house. \n\n---\n\n### **Example of a Contract:**\n\n**Buyer** offers to purchase a house for $20,000. \n**Seller** agrees to sell the house. \nThey exchange money (the buyer pays $20,000), and both fulfill their obligations. \n\nThis contract is valid if all elements are met.",
"refusal": null,
"annotations": null,
"audio": null,
"function_call": null,
"tool_calls": [],
"reasoning": null,
"reasoning_content": null
},
"logprobs": null,
"finish_reason": "stop",
"stop_reason": null,
"token_ids": null
}
],
"service_tier": null,
"system_fingerprint": null,
"usage": {
"prompt_tokens": 24,
"total_tokens": 662,
"completion_tokens": 638,
"prompt_tokens_details": null
},
"prompt_logprobs": null,
"prompt_token_ids": null,
"kv_transfer_params": null
}
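In the response above, `reasoning_content` is null and the model's `<think>` block arrives inline in `content`. When checking the routed answer by eye, it can help to drop that block. This is a rough sketch only: `strip_think` is a hypothetical helper name, and it assumes the opening and closing tags each sit on their own line, as they do in this response.

```shell
# Hypothetical helper: remove a <think>...</think> block from text on stdin.
# Assumes each tag is on its own line; a tag pair sharing one line is not handled.
strip_think() {
  # Delete every line from the one containing <think> through the one
  # containing </think>, inclusive.
  sed '/<think>/,/<\/think>/d'
}

# Small self-contained demonstration with made-up text.
printf '<think>\nplanning notes\n</think>\n\nFinal answer\n' | strip_think
```

Against the live endpoint, the same filter could follow `jq -r '.choices[0].message.content'` in the pipeline, since `-r` prints the string with real newlines.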