Models

The Models page provides a unified view of all AI/ML models deployed in your OtterScale cluster. It aggregates information about model deployments and their artifacts, allowing you to monitor status, manage resources, and perform lifecycle operations.

The Models page displays a list of all models. The table includes the following columns (as shown in the UI):

| Column | Description |
| --- | --- |
| Name | The name of the model. |
| Model Name | The unique identifier of the model (modelName/id). |
| Namespace | The Kubernetes namespace where the model is deployed. |
| Status | The current status of the model (e.g., Running, Pending). |
| Description | The description of the model. |
| Prefill | Prefill configuration: vGPU memory %, replica, tensor (if available). |
| Decode | Decode configuration: vGPU memory %, replica, tensor (if available). |
| First Deployed | Timestamp of first deployment. |
| Last Deployed | Timestamp of last deployment. |
| GPU Relation | GPU resource relation (shown only if status is ‘deployed’). |
| Test | Test button for the model API (only available when the model is in the “ready” state; opens a dialog to test the model). |
| Actions | Management actions (update, delete, etc.). |

The Pods Table (in the details view) lists all pods managed by this model, if available.
Click the expand icon at the beginning of a row to view detailed pod information.

| Column | Description |
| --- | --- |
| Pod | The unique name of the pod. |
| Phase | The current lifecycle phase of the pod (e.g., Running, Pending). |
| Ready | Number of ready containers vs. total containers. |
| Restarts | Number of times containers in the pod have restarted. |
| Conditions | The most recent condition or error status for the pod. |
| Time to First Token | The sum of time (in seconds) taken to generate the first token for requests to this pod. |
| Request Latency | The 95th percentile end-to-end request latency (in seconds) for this pod. |
| Log | Clickable; opens the log view for the pod. |
| Create Time | The timestamp when the pod was created. |
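
To make the Ready, Restarts, and Request Latency columns more concrete, here is a minimal Python sketch of how values of this kind can be derived. It is not OtterScale's implementation. It assumes a pod object shaped like the Kubernetes API's JSON (`status.containerStatuses[].ready` and `restartCount`) and uses a simple nearest-rank percentile over raw latency samples. The function names `summarize_pod` and `p95` are hypothetical, and the real UI may compute its metrics differently (for example, from histogram metrics).

```python
from __future__ import annotations


def summarize_pod(pod: dict) -> dict:
    """Derive Phase, Ready, and Restarts values from a Kubernetes-style pod dict."""
    status = pod.get("status", {})
    containers = status.get("containerStatuses", [])
    ready = sum(1 for c in containers if c.get("ready"))
    restarts = sum(c.get("restartCount", 0) for c in containers)
    return {
        "pod": pod.get("metadata", {}).get("name", ""),
        "phase": status.get("phase", "Unknown"),
        "ready": f"{ready}/{len(containers)}",  # ready containers vs. total
        "restarts": restarts,                   # summed over all containers
    }


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies (seconds)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


# Illustrative data only; the pod name and numbers are made up.
example_pod = {
    "metadata": {"name": "example-model-decode-0"},
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {"name": "server", "ready": True, "restartCount": 1},
            {"name": "sidecar", "ready": False, "restartCount": 2},
        ],
    },
}

row = summarize_pod(example_pod)
latency_p95 = p95([float(i) for i in range(1, 101)])
print(row, latency_p95)
```

A pod with one of two containers ready appears as `1/2` in the Ready column, and the restart counts of its containers are added together.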

You can manage the lifecycle of your models using the Actions menu (three-dot icon) shown for each model. It provides the following operations:

Create a new model.

  1. Select Create from the actions menu or click the Create button.
  2. You can search for models using the cloud icon next to the input box, or select a model from your model artifacts by clicking the archive icon.
  3. Fill in the model configuration (name, namespace, prefill/decode, description, etc.).
  4. Confirm to deploy the model.
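
The configuration fields collected in step 3 can be pictured as a small data structure. The sketch below is purely illustrative. The class names `StageConfig` and `ModelConfig`, the Python field names, and the validation rules (a vGPU memory percentage between 1 and 100, at least one replica) are assumptions based on the columns described on this page, not OtterScale's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class StageConfig:
    """Hypothetical settings for one stage (prefill or decode)."""
    vgpu_memory_percent: int
    replica: int = 1
    tensor: int = 1

    def validate(self, stage: str) -> list[str]:
        errors = []
        # Assumed rule: a percentage must fall in the range 1..100.
        if not 1 <= self.vgpu_memory_percent <= 100:
            errors.append(f"{stage}: vGPU memory % must be between 1 and 100")
        # Assumed rule: replica and tensor must be positive integers.
        if self.replica < 1:
            errors.append(f"{stage}: replica must be at least 1")
        if self.tensor < 1:
            errors.append(f"{stage}: tensor must be at least 1")
        return errors


@dataclass
class ModelConfig:
    """Hypothetical form contents for the Create dialog."""
    name: str
    namespace: str
    prefill: Optional[StageConfig] = None
    decode: Optional[StageConfig] = None
    description: str = ""

    def validate(self) -> list[str]:
        errors = []
        if not self.name:
            errors.append("name is required")
        if not self.namespace:
            errors.append("namespace is required")
        for stage_name, stage in (("prefill", self.prefill), ("decode", self.decode)):
            if stage is not None:
                errors.extend(stage.validate(stage_name))
        return errors


# Illustrative values only.
good = ModelConfig(
    name="demo-model",
    namespace="default",
    prefill=StageConfig(vgpu_memory_percent=50, replica=1, tensor=1),
    decode=StageConfig(vgpu_memory_percent=50, replica=2, tensor=1),
    description="Example deployment",
)
bad = ModelConfig(name="", namespace="default",
                  prefill=StageConfig(vgpu_memory_percent=150, replica=0))

print(good.validate())
print(bad.validate())
```

Checking the values before confirming mirrors what a form would do: the first configuration passes, while the second reports a missing name and out-of-range prefill settings.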

Modify the configuration of an existing model (such as prefill/decode, description, etc.).

  1. Select Update from the actions menu.
  2. Edit the desired fields.
  3. Confirm to apply the changes.

Delete a model.

  1. Select Delete from the actions menu.
  2. Confirm deletion.