AWS Bedrock Deployment

Learn how to deploy and configure AWS Bedrock foundation models in Staque IO.

💡 Key Concept

AWS Bedrock models are fully managed by AWS - no infrastructure deployment is needed. The "deployment" process in Staque IO simply configures API access and registers the model for use.

Prerequisites

Before deploying a Bedrock model, ensure you have:

  • AWS account with Bedrock access enabled
  • AWS IAM credentials configured
  • Model access granted in AWS Bedrock console
  • Required environment variables set

Required Environment Variables

STAQUE_AWS_REGION=eu-north-1
STAQUE_AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
STAQUE_AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
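Before deploying, it can help to fail fast if any of these variables are missing. A minimal sketch (the variable names match the list above; everything else is illustrative):

```python
import os

REQUIRED_VARS = (
    "STAQUE_AWS_REGION",
    "STAQUE_AWS_ACCESS_KEY_ID",
    "STAQUE_AWS_SECRET_ACCESS_KEY",
)

def missing_env_vars(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```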

Step 1: Request Model Access

Before using any Bedrock model, you must request access through the AWS Console:

  1. Navigate to AWS Bedrock Console
  2. Go to "Model Access" in the left sidebar
  3. Click "Manage Model Access"
  4. Select the models you want to use (e.g., Amazon Nova, Claude, Titan)
  5. Submit the access request
  6. Wait for approval (instant for most models)

Note: Some models like Claude 3 may require manual approval from AWS. Most Amazon Nova and Titan models are approved instantly.

Step 2: Deploy via Staque IO UI

To deploy a Bedrock model through the Staque IO interface:

  1. Navigate to Get Started
    • Click "Get Started" in the navigation
    • Describe your use case to get AI recommendations
    • Or skip recommendations and go directly to model selection
  2. Select Platform
    • Choose "AWS Bedrock" as the platform
    • Browse available foundation models
  3. Choose Model
    • Select your desired model (e.g., amazon.nova-pro-v1:0)
    • Review model capabilities and pricing
    • Check available modalities (text, image, etc.)
  4. Configure Deployment
    • Enter a conversation title
    • Specify your use case
    • Review the configuration
  5. Deploy
    • Click "Deploy" to create the conversation
    • The system will verify model access
    • Deployment completes instantly (no infrastructure to provision)

Step 3: Deploy via API

You can also deploy Bedrock models programmatically using the API:

Dry Run (Recommended First)

POST /api/deploy/bedrock
Content-Type: application/json
Authorization: Bearer <your-token>

{
  "modelId": "amazon.nova-pro-v1:0",
  "endpointName": "my-nova-assistant",
  "region": "eu-north-1",
  "dryRun": true
}

// Response
{
  "success": true,
  "dryRun": true,
  "plan": {
    "modelId": "amazon.nova-pro-v1:0",
    "endpointName": "my-nova-assistant",
    "region": "eu-north-1",
    "modelDetails": {
      "providerName": "Amazon",
      "inputModalities": ["TEXT", "IMAGE"],
      "outputModalities": ["TEXT"],
      "inferenceTypesSupported": ["ON_DEMAND"]
    },
    "invokeEndpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/...",
    "pricing": {
      "note": "Bedrock models are pay-per-token, no infrastructure costs",
      "billingModel": "ON_DEMAND or INFERENCE_PROFILE"
    }
  }
}
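The dry-run request above can be issued from any HTTP client. A minimal Python sketch that builds (but does not send) the request using only the standard library; the base URL and token are placeholders for your own Staque IO host and credentials:

```python
import json
import urllib.request

def build_deploy_request(base_url, token, model_id, endpoint_name,
                         region, dry_run=True):
    """Build the POST /api/deploy/bedrock request shown above."""
    body = json.dumps({
        "modelId": model_id,
        "endpointName": endpoint_name,
        "region": region,
        "dryRun": dry_run,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/deploy/bedrock",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )

# Sending it is a one-liner once built:
#   with urllib.request.urlopen(req) as resp:
#       plan = json.load(resp)
```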

Actual Deployment

POST /api/deploy/bedrock
Content-Type: application/json
Authorization: Bearer <your-token>

{
  "modelId": "amazon.nova-pro-v1:0",
  "endpointName": "my-nova-assistant",
  "region": "eu-north-1",
  "dryRun": false
}

// Response
{
  "success": true,
  "message": "Bedrock model access configured",
  "endpoint": "https://bedrock-runtime.eu-north-1.amazonaws.com/model/amazon.nova-pro-v1:0/invoke",
  "modelId": "amazon.nova-pro-v1:0",
  "region": "eu-north-1"
}

Create Conversation with Deployed Model

POST /api/conversations
Content-Type: application/json
Authorization: Bearer <your-token>

{
  "title": "My AI Assistant",
  "use_case": "general-purpose",
  "deployed_resource": {
    "resource_name": "Nova Pro Assistant",
    "resource_type": "bedrock",
    "aws_resource_id": "amazon.nova-pro-v1:0",
    "region": "eu-north-1",
    "estimated_hourly_cost": 0
  }
}

// Response
{
  "success": true,
  "conversation_id": "uuid-here",
  "resource_id": "resource-uuid-here",
  "message": "Conversation and resource created successfully"
}

Available Models

Amazon Nova Models

Model ID               | Capabilities                    | Best For
amazon.nova-pro-v1:0   | Text + Image input, Text output | Complex reasoning, analysis
amazon.nova-lite-v1:0  | Text only                       | Fast, cost-effective tasks
amazon.nova-micro-v1:0 | Text only                       | Ultra-low latency, simple tasks

Anthropic Claude Models

  • anthropic.claude-3-5-sonnet-20241022-v2:0 - Latest Claude 3.5 Sonnet
  • anthropic.claude-3-sonnet-20240229-v1:0 - Claude 3 Sonnet
  • anthropic.claude-3-haiku-20240307-v1:0 - Claude 3 Haiku (fast & efficient)

Amazon Titan Models

  • amazon.titan-text-premier-v1:0 - High-performance text generation
  • amazon.titan-text-express-v1 - Fast, efficient text generation

Model-Specific Configuration

Nova Models - Inference Profiles

Nova models require region-specific inference profiles. Staque IO handles this automatically:

// Original model ID
"amazon.nova-pro-v1:0"

// Automatically converted to inference profile
"eu.amazon.nova-pro-v1:0"  // For EU regions
"us.amazon.nova-pro-v1:0"  // For US regions

System Prompts

Customize the behavior of your Bedrock models with system prompts:

POST /api/bedrock/system-prompt
Content-Type: application/json
Authorization: Bearer <your-token>

{
  "modelId": "amazon.nova-pro-v1:0",
  "systemPrompt": "You are a financial analyst AI specialized in risk assessment..."
}

// Response
{
  "success": true,
  "message": "System prompt updated successfully"
}

Cost Management

Understanding Bedrock Pricing

Bedrock uses token-based pricing with no infrastructure costs:

  • No idle costs - Only pay when you use the model
  • Per-token pricing - Charged separately for input and output tokens
  • No infrastructure - No instance types or hourly rates

Example Pricing (Amazon Nova Pro)

Token Type | Price per 1K tokens
Input      | $0.0008
Output     | $0.0032
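At these rates, per-request cost is a simple linear function of token counts. A quick sketch using the Nova Pro rates above:

```python
# Amazon Nova Pro on-demand rates from the table above (USD per 1K tokens).
INPUT_RATE = 0.0008
OUTPUT_RATE = 0.0032

def request_cost_usd(tokens_in: int, tokens_out: int) -> float:
    """Estimated cost of a single request at Nova Pro rates."""
    return tokens_in / 1000 * INPUT_RATE + tokens_out / 1000 * OUTPUT_RATE

# e.g. 2,000 input tokens and 500 output tokens:
# 2.0 * $0.0008 + 0.5 * $0.0032 = $0.0032
```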

Track Usage and Costs

GET /api/usage/current?conversationId=<conversation-id>

// Response
{
  "success": true,
  "mtd": {
    "cost_usd": 12.45,
    "tokens_in": 125000,
    "tokens_out": 187500,
    "tokens_total": 312500,
    "requests": 850
  },
  "last24h": {
    "cost_usd": 2.34,
    "tokens_in": 15000,
    "tokens_out": 22500,
    "tokens_total": 37500,
    "requests": 102
  }
}
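The response above is easy to turn into per-request and per-token averages for budgeting. A small helper operating on the month-to-date block (the derived field names are made up for this example):

```python
def usage_summary(usage):
    """Derive average-cost figures from a /api/usage/current payload."""
    mtd = usage["mtd"]
    return {
        "cost_per_request": mtd["cost_usd"] / mtd["requests"],
        "cost_per_1k_tokens": mtd["cost_usd"] / mtd["tokens_total"] * 1000,
    }
```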

Best Practices

Model Selection

  • Nova Pro: Use for complex reasoning, multimodal tasks, and when quality is paramount
  • Nova Lite: Use for routine text tasks where speed and cost matter
  • Nova Micro: Use for simple, high-volume tasks requiring ultra-low latency
  • Claude 3.5 Sonnet: Use for coding tasks, complex analysis, and when you need the absolute best quality

Performance Optimization

  • Use shorter system prompts to reduce input token costs
  • Implement conversation history trimming for long threads
  • Choose the right model for your use case (don't over-provision)
  • Monitor token usage and adjust max_tokens settings
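Conversation-history trimming from the list above can be as simple as keeping the newest messages that fit a token budget. A minimal sketch; the 4-characters-per-token estimate is a rough assumption, and a real tokenizer should replace it:

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def trim_history(messages, budget_tokens):
    """Keep the newest messages whose combined estimate fits the budget.

    `messages` is a list of {"role": ..., "content": ...} dicts, oldest first.
    """
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```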

Security

  • Use IAM roles with least-privilege access
  • Rotate AWS credentials regularly
  • Enable CloudTrail logging for Bedrock API calls
  • Review and approve model access permissions carefully

Troubleshooting

Common Issues

Error: "Model not found or not accessible"

Cause: Model access not granted in AWS Bedrock console

Solution: Request model access through AWS Bedrock console → Model Access → Manage Model Access

Error: "AWS credentials not configured"

Cause: Missing or invalid environment variables

Solution: Ensure STAQUE_AWS_ACCESS_KEY_ID, STAQUE_AWS_SECRET_ACCESS_KEY, and STAQUE_AWS_REGION are set

Error: "Throttling exception"

Cause: Exceeded request rate limits

Solution: Implement exponential backoff retry logic or request quota increase from AWS
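A minimal retry helper for the throttling case might look like the following. The retryable exception type, base delay, and retry count are placeholders for whatever your client raises and your rate limits allow:

```python
import random
import time

def with_backoff(call, retries=5, base_delay=0.5, retry_on=(Exception,),
                 sleep=time.sleep):
    """Retry `call` with exponential backoff and full jitter."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries; surface the throttling error
            # Full jitter: sleep a random fraction of the doubled delay.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The `sleep` parameter is injectable so the helper can be tested without actually waiting.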

Error: "Invalid inference profile"

Cause: Incorrect region prefix for Nova models

Solution: Staque IO handles this automatically. Ensure your STAQUE_AWS_REGION is correctly set.

Next Steps