Resource Management APIs

APIs for monitoring and controlling deployed AI resources.

GET /api/resources/[resourceId]/status

Get real-time status, health, metrics, and cost data for a deployed resource.

Response (200 OK - SageMaker Endpoint)

{
  "success": true,
  "resource": {
    "id": "resource-uuid",
    "name": "my-llama-endpoint",
    "type": "sagemaker",
    "status": "InService",
    "health": "healthy",
    "endpoint_url": "https://runtime.sagemaker.eu-north-1.amazonaws.com/...",
    "region": "eu-north-1",
    "instance_type": "ml.g4dn.xlarge",
    "instance_count": 1,
    "model_name": "Llama 2 7B",
    "created_at": "2024-01-10T10:00:00Z",
    "last_updated": "2024-01-10T12:00:00Z"
  },
  "metrics": {
    "response_time_ms": 342,
    "throughput_per_minute": 87,
    "error_rate_percent": 0.5,
    "cpu_utilization": 35,
    "memory_utilization": 48
  },
  "costs": {
    "hourly_cost": 0.85,
    "daily_cost": 20.40,
    "monthly_estimate": 612.00,
    "last_updated": "2024-01-10T12:00:00Z",
    "pricing_model": "hourly"
  }
}
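
The cost fields in this sample are mutually consistent: 0.85 × 24 = 20.40 and 20.40 × 30 = 612.00. The Python sketch below reproduces that arithmetic. The 24-hour day and 30-day month are inferred from the sample values, not from a documented formula, and `project_costs` is a hypothetical helper name.

```python
def project_costs(hourly_cost: float, days_per_month: int = 30) -> dict:
    """Project daily and monthly cost from an hourly rate.

    Assumption: daily = hourly * 24 and monthly = daily * 30, inferred
    from the sample response rather than from a documented formula.
    """
    daily_cost = round(hourly_cost * 24, 2)
    monthly_estimate = round(daily_cost * days_per_month, 2)
    return {
        "hourly_cost": hourly_cost,
        "daily_cost": daily_cost,
        "monthly_estimate": monthly_estimate,
    }


# Using the hourly rate from the SageMaker sample response.
projection = project_costs(0.85)
print(projection)
```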

Response (200 OK - Bedrock Model)

{
  "success": true,
  "resource": {
    "id": "resource-uuid",
    "name": "my-nova-assistant",
    "type": "bedrock",
    "status": "InService",
    "health": "healthy",
    "endpoint_url": "https://bedrock-runtime.eu-north-1.amazonaws.com/...",
    "region": "eu-north-1",
    "instance_type": null,
    "instance_count": null,
    "model_name": "Amazon Nova Pro",
    "created_at": "2024-01-10T10:00:00Z",
    "last_updated": "2024-01-10T12:00:00Z"
  },
  "metrics": {
    "response_time_ms": 150
  },
  "costs": {
    "hourly_cost": 0,
    "daily_cost": 0,
    "monthly_estimate": 0,
    "last_updated": "2024-01-10T12:00:00Z",
    "pricing_model": "token",
    "per_1k_input_tokens_usd": 0.0008,
    "per_1k_output_tokens_usd": 0.0032
  }
}
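
For token-priced resources the hourly, daily, and monthly fields are 0, so a client has to derive spend from the per-1k-token rates instead. The following is a minimal Python sketch of that calculation. `estimate_request_cost` is a hypothetical helper, and the token counts in the usage lines are made-up illustration values.

```python
def estimate_request_cost(costs: dict, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from a token-priced `costs` object."""
    if costs.get("pricing_model") != "token":
        raise ValueError("resource is not token-priced")
    # Rates are quoted per 1,000 tokens, so scale the counts accordingly.
    input_cost = input_tokens / 1000 * costs["per_1k_input_tokens_usd"]
    output_cost = output_tokens / 1000 * costs["per_1k_output_tokens_usd"]
    return input_cost + output_cost


# The `costs` object from the Bedrock sample response (timestamp omitted).
bedrock_costs = {
    "hourly_cost": 0,
    "daily_cost": 0,
    "monthly_estimate": 0,
    "pricing_model": "token",
    "per_1k_input_tokens_usd": 0.0008,
    "per_1k_output_tokens_usd": 0.0032,
}

# Illustrative request: 2,000 input tokens and 500 output tokens.
request_cost = estimate_request_cost(bedrock_costs, 2000, 500)
print(f"{request_cost:.4f}")
```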

Status Values

Status         Description
InService      Resource is operational and accepting requests
Creating       Resource is being created
Updating       Resource is being updated
Deleting       Resource is being deleted
Failed         Resource operation failed
OutOfService   Resource is not operational
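
A client will often want to collapse these values into a few coarse states before deciding what to do next. The Python sketch below shows one way to do that. The category names (`ready`, `in_progress`, `not_operational`, `unknown`) are hypothetical. The comparison is case-insensitive because the control responses in this document report lowercase values such as "updating", while the status endpoint reports "Updating".

```python
IN_PROGRESS = {"creating", "updating", "deleting"}
NOT_OPERATIONAL = {"failed", "outofservice"}


def classify_status(status: str) -> str:
    """Map a documented status value to a coarse, client-side category."""
    key = status.strip().lower()
    if key == "inservice":
        return "ready"
    if key in IN_PROGRESS:
        return "in_progress"
    if key in NOT_OPERATIONAL:
        return "not_operational"
    # Anything outside the documented list is reported, not guessed at.
    return "unknown"


for value in ("InService", "updating", "Failed", "Paused"):
    print(value, "->", classify_status(value))
```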

POST /api/resources/[resourceId]/control

Perform a control action (start, stop, restart, or delete) on a deployed resource. Supported actions vary by platform; stop is rejected for both Bedrock and SageMaker resources.

Request Body

{
  "action": "restart",  // 'start' | 'stop' | 'restart' | 'delete'
  "confirm": true       // Required for delete action
}
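
The `//` comments in the sample above are annotations only; standard JSON does not allow comments, so they must not be sent. The Python sketch below builds a valid request body and enforces the delete confirmation on the client side before any request is made. `build_control_request` is a hypothetical helper name, not part of the API.

```python
import json

VALID_ACTIONS = ("start", "stop", "restart", "delete")


def build_control_request(action: str, confirm: bool = False) -> str:
    """Return the JSON body for a control request, validating it first."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "delete" and not confirm:
        # Mirrors the documented 400 error for unconfirmed deletes.
        raise ValueError("delete requires confirm=True")
    payload = {"action": action}
    if action == "delete":
        payload["confirm"] = True
    return json.dumps(payload)


print(build_control_request("restart"))
print(build_control_request("delete", confirm=True))
```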

Response (200 OK - Start)

{
  "success": true,
  "message": "Endpoint start initiated",
  "action": "start",
  "status": "updating"
}

Response (200 OK - Restart)

{
  "success": true,
  "message": "Endpoint restart initiated",
  "action": "restart",
  "status": "updating"
}

Response (200 OK - Delete)

{
  "success": true,
  "message": "Endpoint deletion initiated",
  "action": "delete",
  "status": "deleting"
}

Error Responses

// 400 Bad Request - Missing confirmation for delete
{
  "success": false,
  "error": "Delete action requires confirmation. Set confirm: true in request body."
}

// 400 Bad Request - Bedrock resources cannot be started or stopped
{
  "success": false,
  "error": "Bedrock resources cannot be started/stopped (they are managed by AWS)"
}

// 400 Bad Request - SageMaker stop not supported
{
  "success": false,
  "error": "SageMaker endpoints cannot be stopped. Use delete to remove the endpoint entirely."
}

// 404 Not Found
{
  "success": false,
  "error": "Resource not found in AWS (may have been deleted externally)"
}
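
Every error body above has the same shape: `success` is false and `error` carries a human-readable message. The Python sketch below turns such a response into an exception, assuming the caller has already decoded the HTTP status code and JSON body. `ControlError` and `check_control_response` are hypothetical names.

```python
class ControlError(Exception):
    """Raised when a control request does not succeed."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def check_control_response(status_code: int, body: dict) -> dict:
    """Return the body on success; raise ControlError otherwise."""
    if status_code == 200 and body.get("success") is True:
        return body
    raise ControlError(status_code, body.get("error", "unknown error"))


try:
    check_control_response(
        400,
        {
            "success": False,
            "error": "SageMaker endpoints cannot be stopped. "
                     "Use delete to remove the endpoint entirely.",
        },
    )
except ControlError as exc:
    print(exc.status_code, exc.message)
```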

Platform-Specific Actions

Platform      Supported Actions
Bedrock       delete only (removes from tracking)
SageMaker     start, restart, delete
NVIDIA NIM    delete only (removes from tracking)
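
The table above can be encoded as a lookup so that unsupported actions are caught before a request is sent. In this Python sketch the keys "bedrock" and "sagemaker" match the `type` values in the sample status responses. The key "nim" is an assumption, since this document does not show the type value for NVIDIA NIM resources.

```python
SUPPORTED_ACTIONS = {
    "bedrock": {"delete"},
    "sagemaker": {"start", "restart", "delete"},
    "nim": {"delete"},  # assumption: actual type value for NVIDIA NIM is not documented here
}


def is_action_supported(resource_type: str, action: str) -> bool:
    """Return True if the platform table lists `action` for `resource_type`."""
    return action in SUPPORTED_ACTIONS.get(resource_type, set())


print(is_action_supported("sagemaker", "restart"))
print(is_action_supported("bedrock", "start"))
```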

⚠️ Important Notes

  • SageMaker Operations: All operations (start, restart, delete) take 5-10 minutes to complete.
  • Bedrock & NIM: These are always-on services and cannot be started or stopped.
  • Delete Confirmation: Delete actions always require confirm: true to prevent accidental deletion.
  • Status Polling: Poll the /status endpoint to track operation progress.
  • Costs: SageMaker endpoints continue to incur costs until they are deleted.

Example: Restart Workflow

// 1. Initiate restart
POST /api/resources/resource-uuid/control
{
  "action": "restart"
}

// 2. Poll status every 30 seconds
GET /api/resources/resource-uuid/status

// Response will show "Updating" status
{
  "success": true,
  "resource": {
    "status": "Updating",
    "health": "unknown",
    ...
  }
}

// 3. Continue polling until status returns to "InService"
{
  "success": true,
  "resource": {
    "status": "InService",
    "health": "healthy",
    ...
  }
}
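
The polling steps above can be sketched in Python as follows. To keep the example self-contained, the HTTP call is abstracted behind a caller-supplied `fetch_status` function that returns the decoded JSON body of GET /api/resources/[resourceId]/status. `wait_until_in_service` is a hypothetical helper, the 15-minute timeout is an assumption chosen to exceed the 5-10 minutes quoted above, and treating "Failed" as the only terminal error is likewise an assumption.

```python
import time
from typing import Callable


def wait_until_in_service(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the resource reports InService, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status()
        status = body["resource"]["status"]
        if status == "InService":
            return body
        if status == "Failed":
            raise RuntimeError("resource entered Failed status")
        if time.monotonic() >= deadline:
            raise TimeoutError("resource did not reach InService in time")
        sleep(interval_s)


# Usage with canned responses standing in for real HTTP calls.
canned = iter([
    {"success": True, "resource": {"status": "Updating", "health": "unknown"}},
    {"success": True, "resource": {"status": "Updating", "health": "unknown"}},
    {"success": True, "resource": {"status": "InService", "health": "healthy"}},
])
waits = []  # records each requested delay instead of actually sleeping
final = wait_until_in_service(lambda: next(canned), sleep=waits.append)
print(final["resource"]["status"], len(waits))
```

Injecting `sleep` keeps the loop testable without real delays; in production the default `time.sleep` applies the 30-second interval from step 2.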