Deployment APIs
APIs for deploying AI models to different platforms.
POST /api/deploy/bedrock
Configure access to a Bedrock model. Bedrock models are fully managed by AWS, so no infrastructure deployment is needed.
Request Body
```json
{
"modelId": "amazon.nova-pro-v1:0",
"endpointName": "my-nova-assistant",
"region": "eu-north-1", // Optional
"dryRun": true // Optional, default: true
}
```

Response (200 OK - Dry Run)
```json
{
"success": true,
"dryRun": true,
"plan": {
"modelId": "amazon.nova-pro-v1:0",
"endpointName": "my-nova-assistant",
"region": "eu-north-1",
"modelDetails": {
"providerName": "Amazon",
"inputModalities": ["TEXT", "IMAGE"],
"outputModalities": ["TEXT"],
"inferenceTypesSupported": ["ON_DEMAND"]
},
"invokeEndpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/model/...",
"pricing": {
"note": "Bedrock models are pay-per-token, no infrastructure costs",
"billingModel": "ON_DEMAND or INFERENCE_PROFILE based on model support"
}
},
"message": "Bedrock models are accessed via API - no endpoint deployment needed"
}
```

Response (200 OK - Real Deploy)
```json
{
"success": true,
"message": "Bedrock model access configured",
"endpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/model/...",
"modelId": "amazon.nova-pro-v1:0",
"region": "eu-north-1"
}
```
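For orientation, a minimal TypeScript sketch of a dry-run call follows. The service's own base URL is not given in this document, so `BASE_URL` and the `STAQUE_API_URL` fallback variable are assumptions; the request and response fields mirror the schemas above.

```typescript
// Sketch: configure Bedrock access via a dry run first.
// BASE_URL / STAQUE_API_URL are assumptions -- substitute your deployment's origin.
const BASE_URL = process.env.STAQUE_API_URL ?? "http://localhost:3000";

async function configureBedrock(): Promise<void> {
  const res = await fetch(`${BASE_URL}/api/deploy/bedrock`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      modelId: "amazon.nova-pro-v1:0",
      endpointName: "my-nova-assistant",
      region: "eu-north-1", // optional
      dryRun: true,         // optional, default: true
    }),
  });
  const data = await res.json();
  if (!data.success) throw new Error(data.error);
  // On a dry run, inspect the plan before re-running with dryRun: false.
  console.log(data.plan?.invokeEndpoint, data.plan?.modelDetails);
}

configureBedrock().catch(console.error);
```

Inspecting `plan.modelDetails` and `plan.invokeEndpoint` on the dry run confirms the model resolves before the real call is made.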
POST /api/deploy/sagemaker
Deploy a model to AWS SageMaker with dedicated infrastructure.
Request Body
```json
{
"endpointName": "my-llama-endpoint",
"instanceType": "ml.g4dn.xlarge",
"modelPackageArn": "arn:aws:sagemaker:...", // Optional
"inferenceImage": "763104351884.dkr.ecr.eu-north-1.amazonaws.com/...", // Optional
"modelDataUrl": "s3://bucket/model.tar.gz", // Optional
"executionRoleArn": "arn:aws:iam::123456789012:role/SageMakerRole", // Optional
"dryRun": true // Optional, default: true
}
```

Response (200 OK - Dry Run)
```json
{
"success": true,
"dryRun": true,
"plan": {
"endpointName": "my-llama-endpoint",
"instanceType": "ml.g4dn.xlarge",
"roleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
"vpc": {
"subnets": ["subnet-12345678", "subnet-87654321"],
"securityGroups": ["sg-12345678"]
},
"model": {
"modelPackageArn": "arn:aws:sagemaker:...",
"inferenceImage": "763104351884.dkr.ecr...",
"modelDataUrl": "s3://bucket/model.tar.gz"
}
}
}
```

Response (200 OK - Real Deploy)
```json
{
"success": true,
"message": "Endpoint creation started",
"endpointName": "my-llama-endpoint",
"endpoint": "https://runtime.sagemaker.eu-north-1.amazonaws.com/endpoints/my-llama-endpoint/invocations"
}
```

Error Responses
```json
// 400 Bad Request
{
"success": false,
"error": "SAGEMAKER_SUBNET_IDS and SAGEMAKER_SECURITY_GROUP_IDS must be set"
}
// 500 Internal Server Error
{
"success": false,
"error": "Failed to deploy SageMaker endpoint"
}
```
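Since this endpoint provisions real infrastructure, the dry-run-first pattern matters most here. The sketch below (same assumed `BASE_URL` as in the Bedrock example) requests the plan, then starts the actual deployment; field names follow the schemas above.

```typescript
// Sketch of the dry-run-first workflow for SageMaker.
// BASE_URL is an assumption, as in the Bedrock example.
const BASE_URL = process.env.STAQUE_API_URL ?? "http://localhost:3000";

async function deploySageMaker(dryRun: boolean) {
  const res = await fetch(`${BASE_URL}/api/deploy/sagemaker`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      endpointName: "my-llama-endpoint",
      instanceType: "ml.g4dn.xlarge",
      dryRun,
    }),
  });
  const data = await res.json();
  if (!res.ok || !data.success) {
    throw new Error(data.error ?? `HTTP ${res.status}`);
  }
  return data;
}

async function main(): Promise<void> {
  // 1. Dry run: review role, VPC, and model source without provisioning anything.
  const { plan } = await deploySageMaker(true);
  console.log("Plan:", plan);

  // 2. Real deploy: starts endpoint creation (expect 5-10 minutes to complete).
  const { endpointName } = await deploySageMaker(false);
  console.log("Creation started for:", endpointName);
}

main().catch(console.error);
```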
POST /api/deploy/nims
Verify connectivity to NVIDIA Hosted NIM and return invocation endpoint metadata.
Request Body
```json
{
"modelId": "mistralai/mistral-7b-instruct-v0.3",
"dryRun": false // Optional, default: false
}
```

Response (200 OK)
```json
{
"success": true,
"provider": "nvidia-nim",
"modelId": "mistralai/mistral-7b-instruct-v0.3",
"endpoint": "https://integrate.api.nvidia.com/v1/chat/completions",
"message": "NIM Hosted API reachable"
}
```

Error Responses
```json
// 400 Bad Request
{
"success": false,
"error": "NVIDIA_API_KEY is not configured"
}
// 502 Bad Gateway
{
"success": false,
"error": "NIM verify failed: 401 Unauthorized"
}
```
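Verification is a single call; a minimal sketch (same assumed `BASE_URL`) follows. Note that upstream failures surface as 502s, so an invalid key appears as `NIM verify failed: 401 Unauthorized` rather than a local error.

```typescript
// Sketch: verify that the hosted NIM API is reachable for a given model.
// BASE_URL is an assumption, as in the earlier examples.
const BASE_URL = process.env.STAQUE_API_URL ?? "http://localhost:3000";

async function verifyNim(): Promise<void> {
  const res = await fetch(`${BASE_URL}/api/deploy/nims`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ modelId: "mistralai/mistral-7b-instruct-v0.3" }),
  });
  const data = await res.json();
  // On success this logs the hosted invocation endpoint; on 502, the upstream error.
  console.log(data.success ? data.endpoint : data.error);
}

verifyNim().catch(console.error);
```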
⚠️ Deployment Notes
- Bedrock: No actual deployment, just API access configuration
- SageMaker: Real infrastructure provisioning (takes 5-10 minutes)
- NVIDIA NIM: Connectivity verification only, no infrastructure
- Dry Run: Always test with dryRun=true first
- Costs: SageMaker incurs hourly costs, others are pay-per-use
Required Environment Variables
| Platform | Required Variables |
|---|---|
| Bedrock | STAQUE_AWS_REGION, STAQUE_AWS_ACCESS_KEY_ID, STAQUE_AWS_SECRET_ACCESS_KEY |
| SageMaker | STAQUE_AWS_REGION, STAQUE_AWS_ACCESS_KEY_ID, STAQUE_AWS_SECRET_ACCESS_KEY, SAGEMAKER_EXECUTION_ROLE_ARN, SAGEMAKER_SUBNET_IDS, SAGEMAKER_SECURITY_GROUP_IDS |
| NVIDIA NIM | NVIDIA_API_KEY, NIM_BASE_URL (optional) |
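Missing variables otherwise surface only at request time (for example, the SageMaker 400 error above), so it can help to validate configuration at startup. A minimal TypeScript sketch, assuming the grouping in the table:

```typescript
// Sketch: fail fast when a platform's required variables are unset.
// Names come from the table above; the grouping by platform is an assumption.
const REQUIRED: Record<string, string[]> = {
  bedrock: [
    "STAQUE_AWS_REGION",
    "STAQUE_AWS_ACCESS_KEY_ID",
    "STAQUE_AWS_SECRET_ACCESS_KEY",
  ],
  sagemaker: [
    "STAQUE_AWS_REGION",
    "STAQUE_AWS_ACCESS_KEY_ID",
    "STAQUE_AWS_SECRET_ACCESS_KEY",
    "SAGEMAKER_EXECUTION_ROLE_ARN",
    "SAGEMAKER_SUBNET_IDS",
    "SAGEMAKER_SECURITY_GROUP_IDS",
  ],
  nims: ["NVIDIA_API_KEY"], // NIM_BASE_URL is optional
};

function assertEnv(platform: string): void {
  const missing = (REQUIRED[platform] ?? []).filter((name) => !process.env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing env vars for ${platform}: ${missing.join(", ")}`);
  }
}

assertEnv("sagemaker"); // throws before any AWS call if configuration is incomplete
```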