Models
The Models page provides a unified view of all AI/ML models deployed in your OtterScale cluster. It aggregates information about model deployments and their artifacts, allowing you to monitor status, manage resources, and perform lifecycle operations.
Introduction
The Models page displays a list of all models. The table includes the following columns (as shown in the UI):
| Column | Description |
|---|---|
| Name | The name of the model. |
| Model Name | The unique identifier of the model (modelName/id). |
| Namespace | The Kubernetes namespace where the model is deployed. |
| Status | The current status of the model (e.g., Running, Pending). |
| Description | The description of the model. |
| Prefill | Prefill configuration: vGPU memory %, replica, tensor (if available). |
| Decode | Decode configuration: vGPU memory %, replica, tensor (if available). |
| First Deployed | Timestamp of first deployment. |
| Last Deployed | Timestamp of last deployment. |
| GPU Relation | GPU resource relation (shown only if status is ‘deployed’). |
| Test | Button that opens a dialog for testing the model API (available only when the model is in the “ready” state). |
| Actions | Management actions (update, delete, etc.). |
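To make the conditional columns concrete, here is a minimal Python sketch of one table row and the two visibility rules described above (GPU Relation only for the 'deployed' status, Test only in the “ready” state). The class names, field names, and the case-insensitive string comparison are all assumptions made for illustration; they are not taken from the OtterScale API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoleConfig:
    """Prefill or Decode settings. Field names are hypothetical."""
    vgpu_memory_percent: int
    replica: int
    tensor: Optional[int] = None  # the table shows this only "if available"


@dataclass
class ModelRow:
    """One row of the Models table. A hypothetical shape, not the real API."""
    name: str
    model_name: str
    namespace: str
    status: str
    description: str = ""
    prefill: Optional[RoleConfig] = None
    decode: Optional[RoleConfig] = None


def shows_gpu_relation(row: ModelRow) -> bool:
    # GPU Relation is shown only if the status is 'deployed'.
    # Case-insensitive matching is an assumption of this sketch.
    return row.status.lower() == "deployed"


def can_test(row: ModelRow) -> bool:
    # The Test button is available only in the "ready" state.
    return row.status.lower() == "ready"


if __name__ == "__main__":
    row = ModelRow(
        name="demo",
        model_name="example/demo-model",  # placeholder identifier
        namespace="default",
        status="deployed",
        prefill=RoleConfig(vgpu_memory_percent=50, replica=1),
    )
    print(shows_gpu_relation(row), can_test(row))
```

Keeping the two rules as separate predicates mirrors the table, where each column states its own display condition.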
Pods Table
The Pods Table (in the details view) lists all pods managed by the model, if any are available.
Click the expand icon at the beginning of a row to view detailed pod information.
| Column | Description |
|---|---|
| Pod | The unique name of the pod. |
| Phase | The current lifecycle phase of the pod (e.g., Running, Pending). |
| Ready | Number of ready containers vs total containers. |
| Restarts | Number of times containers in the pod have restarted. |
| Conditions | The most recent condition or error status for the pod. |
| Time to First Token | The sum of time (in seconds) taken to generate the first token for requests to this pod. |
| Request Latency | The 95th percentile end-to-end request latency (in seconds) for this pod. |
| Log | Click to open the log view for the pod. |
| Create Time | The timestamp when the pod was created. |
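As a rough illustration of how two of these columns can be read, the sketch below formats a Ready value and computes a 95th percentile from a list of latency samples using the nearest-rank method. The function names and the sample data are hypothetical, and the nearest-rank method is an assumption; the value shown in the UI may be derived differently (for example, from histogram buckets in the monitoring backend).

```python
import math
from typing import Sequence


def ready_ratio(ready: int, total: int) -> str:
    """Format ready containers vs total containers, e.g. '1/2'."""
    return f"{ready}/{total}"


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up end-to-end latencies in seconds for a single pod.
    latencies = [0.20, 0.30, 0.25, 1.50, 0.40, 0.35, 0.30, 0.28, 0.31, 0.29]
    print(ready_ratio(1, 2))
    print(percentile_nearest_rank(latencies, 95))
```

A high percentile such as p95 is dominated by the slowest requests, so a single outlier in a small sample can determine the reported value.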
Manage Models
You can manage the lifecycle of your models using the Actions menu.
Model Actions
The Actions menu (three dots icon) for each model provides the following operations:
Create
Create a new model.
- Select Create from the actions menu or click the Create button.
- You can search for models using the cloud icon next to the input box, or select a model from your model artifacts by clicking the archive icon.
- Fill in the model configuration (name, namespace, prefill/decode, description, etc.).
- Confirm to deploy the model.
Update
Modify the configuration of an existing model (such as prefill/decode settings or the description).
- Select Update from the actions menu.
- Edit the desired fields.
- Confirm to apply the changes.
Delete
Delete a model.
- Select Delete from the actions menu.
- Confirm deletion.