Deployment APIs

Endpoints for making AI models available on three platforms: Amazon Bedrock, AWS SageMaker, and NVIDIA Hosted NIM.

POST /api/deploy/bedrock

Configure access for a Bedrock model. Bedrock models are managed by AWS, so no actual deployment is needed.

Request Body

{
  "modelId": "amazon.nova-pro-v1:0",
  "endpointName": "my-nova-assistant",
  "region": "eu-north-1",  // Optional
  "dryRun": true           // Optional, default: true
}

Response (200 OK - Dry Run)

{
  "success": true,
  "dryRun": true,
  "plan": {
    "modelId": "amazon.nova-pro-v1:0",
    "endpointName": "my-nova-assistant",
    "region": "eu-north-1",
    "modelDetails": {
      "providerName": "Amazon",
      "inputModalities": ["TEXT", "IMAGE"],
      "outputModalities": ["TEXT"],
      "inferenceTypesSupported": ["ON_DEMAND"]
    },
    "invokeEndpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/model/...",
    "pricing": {
      "note": "Bedrock models are pay-per-token, no infrastructure costs",
      "billingModel": "ON_DEMAND or INFERENCE_PROFILE based on model support"
    }
  },
  "message": "Bedrock models are accessed via API - no endpoint deployment needed"
}

Response (200 OK - Real Deploy)

{
  "success": true,
  "message": "Bedrock model access configured",
  "endpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/model/...",
  "modelId": "amazon.nova-pro-v1:0",
  "region": "eu-north-1"
}
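As an illustration, a client might assemble the request like this. This is a minimal sketch, not part of the API itself: the base URL `http://localhost:3000`, the `Content-Type: application/json` header, and the helper name `build_bedrock_request` are assumptions. The field names and the `dryRun` default come from the request body above.

```python
import json
import urllib.request


def build_bedrock_request(base_url, model_id, endpoint_name, region=None, dry_run=True):
    """Build (but do not send) a POST request for /api/deploy/bedrock.

    `base_url` is a hypothetical host; point it at wherever this API is served.
    """
    body = {
        "modelId": model_id,
        "endpointName": endpoint_name,
        "dryRun": dry_run,  # documented default is true
    }
    if region is not None:
        body["region"] = region  # optional field, omitted when not given

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/deploy/bedrock",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_bedrock_request(
    "http://localhost:3000",  # assumed base URL
    "amazon.nova-pro-v1:0",
    "my-nova-assistant",
    region="eu-north-1",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect before any call is made.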

POST /api/deploy/sagemaker

Deploy a model to AWS SageMaker with dedicated infrastructure.

Request Body

{
  "endpointName": "my-llama-endpoint",
  "instanceType": "ml.g4dn.xlarge",
  "modelPackageArn": "arn:aws:sagemaker:...",  // Optional
  "inferenceImage": "763104351884.dkr.ecr.eu-north-1.amazonaws.com/...",  // Optional
  "modelDataUrl": "s3://bucket/model.tar.gz",  // Optional
  "executionRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  // Optional
  "dryRun": true  // Optional, default: true
}

Response (200 OK - Dry Run)

{
  "success": true,
  "dryRun": true,
  "plan": {
    "endpointName": "my-llama-endpoint",
    "instanceType": "ml.g4dn.xlarge",
    "roleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
    "vpc": {
      "subnets": ["subnet-12345678", "subnet-87654321"],
      "securityGroups": ["sg-12345678"]
    },
    "model": {
      "modelPackageArn": "arn:aws:sagemaker:...",
      "inferenceImage": "763104351884.dkr.ecr...",
      "modelDataUrl": "s3://bucket/model.tar.gz"
    }
  }
}

Response (200 OK - Real Deploy)

{
  "success": true,
  "message": "Endpoint creation started",
  "endpointName": "my-llama-endpoint",
  "endpoint": "https://runtime.sagemaker.eu-north-1.amazonaws.com/endpoints/my-llama-endpoint/invocations"
}

Error Responses

// 400 Bad Request
{
  "success": false,
  "error": "SAGEMAKER_SUBNET_IDS and SAGEMAKER_SECURITY_GROUP_IDS must be set"
}

// 500 Internal Server Error
{
  "success": false,
  "error": "Failed to deploy SageMaker endpoint"
}
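The three SageMaker response shapes above (dry-run plan, real deploy, error) can be told apart by the `success` and `dryRun` fields. The following sketch shows one way a client might do that. The function name `summarize_sagemaker_response` and the summary strings are hypothetical. Only the field names are taken from the examples above.

```python
def summarize_sagemaker_response(status: int, body: dict) -> str:
    """Return a one-line summary of a /api/deploy/sagemaker response."""
    if not body.get("success"):
        # Error responses carry a human-readable "error" string.
        return f"failed ({status}): {body.get('error', 'unknown error')}"

    if body.get("dryRun"):
        # Dry-run responses describe the plan without provisioning anything.
        plan = body.get("plan", {})
        return f"dry run ok: {plan.get('endpointName')} on {plan.get('instanceType')}"

    # Real deploys report only that creation has started.
    return f"started: {body.get('endpointName')} at {body.get('endpoint')}"


print(summarize_sagemaker_response(
    400,
    {"success": False,
     "error": "SAGEMAKER_SUBNET_IDS and SAGEMAKER_SECURITY_GROUP_IDS must be set"},
))
```

Because a real deploy returns as soon as creation has started, a success summary here does not mean the endpoint is ready to serve traffic.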

POST /api/deploy/nims

Verify connectivity to NVIDIA Hosted NIM and return invocation endpoint metadata.

Request Body

{
  "modelId": "mistralai/mistral-7b-instruct-v0.3",
  "dryRun": false  // Optional, default: false
}

Response (200 OK)

{
  "success": true,
  "provider": "nvidia-nim",
  "modelId": "mistralai/mistral-7b-instruct-v0.3",
  "endpoint": "https://integrate.api.nvidia.com/v1/chat/completions",
  "message": "NIM Hosted API reachable"
}

Error Responses

// 400 Bad Request
{
  "success": false,
  "error": "NVIDIA_API_KEY is not configured"
}

// 502 Bad Gateway
{
  "success": false,
  "error": "NIM verify failed: 401 Unauthorized"
}
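The two NIM error cases point at different causes: a 400 means the server itself is missing configuration, while a 502 means the upstream NVIDIA API rejected the verification call. The sketch below turns them into hints. It assumes the 502 message always follows the `NIM verify failed: <status> ...` pattern seen in the single example above, which may not hold for every failure. The function name is hypothetical.

```python
import re


def explain_nim_failure(status: int, body: dict) -> str:
    """Map a failed /api/deploy/nims response to a short troubleshooting hint."""
    error = body.get("error", "")

    if status == 400:
        # Server-side configuration problem, e.g. a missing NVIDIA_API_KEY.
        return "configuration: " + error

    if status == 502:
        # Assumed message format: "NIM verify failed: <3-digit status> <reason>"
        match = re.search(r"NIM verify failed: (\d{3})", error)
        upstream = int(match.group(1)) if match else None
        if upstream == 401:
            return "upstream rejected the credentials (401); check NVIDIA_API_KEY"
        return f"upstream verification failed (status {upstream})"

    return f"unexpected status {status}: {error}"


print(explain_nim_failure(502, {"success": False, "error": "NIM verify failed: 401 Unauthorized"}))
```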

⚠️ Deployment Notes

  • Bedrock: No actual deployment, just API access configuration
  • SageMaker: Real infrastructure provisioning (takes 5-10 minutes)
  • NVIDIA NIM: Connectivity verification only, no infrastructure
  • Dry Run: Always test with dryRun=true first. It is the default for Bedrock and SageMaker, but NIM defaults to dryRun=false
  • Costs: SageMaker incurs hourly costs; the others are pay-per-use
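The dry-run advice above can be enforced in client code. The sketch below always sends the payload with `dryRun` set to true first and only repeats it with `dryRun` set to false if the plan succeeded. The `post` callable is a hypothetical stand-in for whatever HTTP client is used. It is assumed to take the JSON body as a dict and return the parsed JSON response.

```python
from typing import Callable


def deploy_with_dry_run(post: Callable[[dict], dict], payload: dict) -> dict:
    """Run a dry run first; perform the real deploy only if the dry run succeeds.

    Any "dryRun" value already present in `payload` is overridden.
    """
    plan = post({**payload, "dryRun": True})
    if not plan.get("success"):
        # Stop here so that no infrastructure is provisioned after a failed plan.
        return plan
    return post({**payload, "dryRun": False})


# Usage with a fake transport that records what would have been sent.
calls = []

def fake_post(body: dict) -> dict:
    calls.append(body)
    return {"success": True, "dryRun": body["dryRun"]}

result = deploy_with_dry_run(
    fake_post,
    {"endpointName": "my-llama-endpoint", "instanceType": "ml.g4dn.xlarge"},
)
```

Passing the transport in as a function keeps the ordering logic testable without touching a real endpoint, which matters most for SageMaker, where a real deploy starts billable infrastructure.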

Required Environment Variables

Platform      Required Variables
Bedrock       STAQUE_AWS_REGION
              STAQUE_AWS_ACCESS_KEY_ID
              STAQUE_AWS_SECRET_ACCESS_KEY
SageMaker     STAQUE_AWS_REGION
              STAQUE_AWS_ACCESS_KEY_ID
              STAQUE_AWS_SECRET_ACCESS_KEY
              SAGEMAKER_EXECUTION_ROLE_ARN
              SAGEMAKER_SUBNET_IDS
              SAGEMAKER_SECURITY_GROUP_IDS
NVIDIA NIM    NVIDIA_API_KEY
              NIM_BASE_URL (optional)
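A quick pre-flight check can catch missing variables before any deploy call is made. The sketch below encodes the required variables listed above. The platform keys (`bedrock`, `sagemaker`, `nims`) and the function name `missing_env` are illustrative choices, and `NIM_BASE_URL` is left out because it is optional.

```python
import os

# Required variables per platform, as listed in the table above.
AWS_COMMON = [
    "STAQUE_AWS_REGION",
    "STAQUE_AWS_ACCESS_KEY_ID",
    "STAQUE_AWS_SECRET_ACCESS_KEY",
]
REQUIRED_ENV = {
    "bedrock": AWS_COMMON,
    "sagemaker": AWS_COMMON + [
        "SAGEMAKER_EXECUTION_ROLE_ARN",
        "SAGEMAKER_SUBNET_IDS",
        "SAGEMAKER_SECURITY_GROUP_IDS",
    ],
    "nims": ["NVIDIA_API_KEY"],  # NIM_BASE_URL is optional, so not checked
}


def missing_env(platform: str, env=None) -> list:
    """Return the required variables that are unset or empty for `platform`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[platform] if not env.get(name)]


if __name__ == "__main__":
    for platform in REQUIRED_ENV:
        print(platform, "missing:", missing_env(platform))
```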