Chat APIs

APIs for interacting with deployed AI models through conversations.

POST /api/chat/thread

Send a message to a deployed AI model within a conversation thread. Supports creating new threads or continuing existing ones.

Request Body

{
  "message": "Hello! How can you help me today?",
  "conversationId": "conversation-uuid",
  "resourceId": "resource-uuid",
  "threadId": "thread-uuid"  // Optional - omit to create new thread
}
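
The request body above can be assembled in a few lines. The following is a minimal sketch using only the Python standard library. The host in `BASE_URL`, the `Content-Type` header, and the helper name `build_thread_request` are assumptions for illustration; this section does not document authentication, so none is shown.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical host; replace with the address of your deployment.
BASE_URL = "https://example.invalid"


def build_thread_request(message: str, conversation_id: str, resource_id: str,
                         thread_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /api/chat/thread request."""
    if not (message and conversation_id and resource_id):
        # Mirrors the documented 400 error for missing required fields.
        raise ValueError("message, conversationId, and resourceId are required")
    payload = {
        "message": message,
        "conversationId": conversation_id,
        "resourceId": resource_id,
    }
    if thread_id is not None:
        # Leave the key out entirely to create a new thread.
        payload["threadId"] = thread_id
    return urllib.request.Request(
        f"{BASE_URL}/api/chat/thread",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect and test.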

Response (200 OK)

{
  "success": true,
  "threadId": "thread-uuid",
  "messages": [
    {
      "role": "user",
      "content": "Hello! How can you help me today?",
      "timestamp": "2024-01-10T12:00:00Z",
      "metadata": {
        "username": "john",
        "resource_id": "resource-uuid"
      }
    },
    {
      "role": "assistant",
      "content": "Hello! I can assist you with...",
      "timestamp": "2024-01-10T12:00:01Z",
      "metadata": {
        "model_id": "amazon.nova-pro-v1:0",
        "model_response_time": 0.823,
        "tokens_used": 42,
        "username": "john"
      },
      "tokens_in": 15,
      "tokens_out": 27,
      "tokens_total": 42,
      "latency_ms": 823,
      "model_id": "amazon.nova-pro-v1:0"
    }
  ]
}
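
A client usually needs only the newest assistant message from this response. The sketch below assumes the response has already been decoded into a Python dict with the fields shown above; the function name `last_assistant_reply` is hypothetical.

```python
from typing import Any, Dict, Optional


def last_assistant_reply(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the most recent assistant message, or None if there is none."""
    if not response.get("success"):
        # Error envelopes carry an "error" string instead of messages.
        raise RuntimeError(response.get("error", "unknown error"))
    # Messages are listed oldest first in the sample, so scan from the end.
    for msg in reversed(response.get("messages", [])):
        if msg.get("role") == "assistant":
            return msg
    return None
```

The returned dict exposes the per-message fields from the sample, such as `content`, `tokens_total`, and `latency_ms`.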

Error Responses

// 400 Bad Request
{
  "success": false,
  "threadId": "",
  "messages": [],
  "error": "Message, conversation ID, and resource ID are required"
}

// 404 Not Found
{
  "success": false,
  "threadId": "",
  "messages": [],
  "error": "Resource not found"
}

// 500 Internal Server Error
{
  "success": false,
  "threadId": "",
  "messages": [],
  "error": "NVIDIA NIM API error: 401 Unauthorized"
}
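
All three error bodies share the same envelope, so a client can turn any of them into a single exception type. This is a minimal sketch; `ChatApiError` and `parse_chat_response` are hypothetical names, and the function assumes the caller already has the HTTP status code and the raw response text.

```python
import json


class ChatApiError(Exception):
    """Raised when the response envelope reports a failure."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_chat_response(status: int, body: str) -> dict:
    """Decode a chat response, raising ChatApiError for error envelopes."""
    data = json.loads(body)
    if status != 200 or not data.get("success", False):
        raise ChatApiError(status, data.get("error", "unknown error"))
    return data
```

Callers can then branch on `status`: a 400 or 404 points to a problem with the request itself, while a 500 may wrap an upstream provider error, as in the last example above.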

GET /api/chat/thread

Retrieve an existing conversation thread with all messages.

Query Parameters

Parameter       Required  Description
threadId        Yes       Thread UUID
conversationId  Yes       Conversation UUID
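
Both parameters travel in the query string. The sketch below builds the URL with the standard library so that values are percent-encoded correctly; `BASE_URL` is a hypothetical host and `thread_url` is an illustrative helper name.

```python
from urllib.parse import urlencode

# Hypothetical host; replace with the address of your deployment.
BASE_URL = "https://example.invalid"


def thread_url(thread_id: str, conversation_id: str) -> str:
    """Build the GET /api/chat/thread URL with both required parameters."""
    if not thread_id or not conversation_id:
        raise ValueError("threadId and conversationId are both required")
    query = urlencode({"threadId": thread_id, "conversationId": conversation_id})
    return f"{BASE_URL}/api/chat/thread?{query}"
```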

Response (200 OK)

{
  "success": true,
  "threadId": "thread-uuid",
  "messages": [
    {
      "role": "user",
      "content": "Hello!",
      "timestamp": "2024-01-10T12:00:00Z"
    },
    {
      "role": "assistant",
      "content": "Hello! How can I help you?",
      "timestamp": "2024-01-10T12:00:01Z"
    }
  ]
}

POST /api/chat

Legacy endpoint. Sends a message to a deployed AI model without thread context.

Request Body

{
  "message": "Analyze these biomarker values: Glucose: 180 mg/dL",
  "conversationId": "conversation-uuid",
  "resourceId": "resource-uuid"
}

Response (200 OK)

{
  "success": true,
  "response": "Based on the glucose level of 180 mg/dL...",
  "metadata": {
    "model_response_time": 1.234,
    "tokens_used": 156,
    "model_id": "amazon.nova-pro-v1:0",
    "user_id": "user-uuid",
    "username": "john"
  }
}

GET /api/chat/legacy

Legacy endpoint. Retrieves messages from the old chat_messages table.

Query Parameters

Parameter       Required  Description
conversationId  Yes       Conversation UUID

Response (200 OK)

{
  "success": true,
  "messages": [
    {
      "role": "user",
      "content": "Hello!",
      "timestamp": "2024-01-10T12:00:00Z",
      "metadata": {}
    },
    {
      "role": "assistant",
      "content": "Hello! How can I help you?",
      "timestamp": "2024-01-10T12:00:01Z",
      "metadata": {
        "model_id": "amazon.nova-pro-v1:0",
        "tokens_used": 42
      }
    }
  ],
  "source": "legacy_chat_messages",
  "count": 2
}

💡 Best Practices

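  • Use Threads: Prefer /api/chat/thread over the legacy /api/chat endpoint so that each message is sent with its thread context
  • Thread IDs: Store the returned threadId client-side and send it with every follow-up message to continue the same thread
  • Error Handling: Implement retry logic for transient failures; do not retry 400 or 404 responses, which indicate a problem with the request
  • Streaming: Consider implementing streaming for real-time responses
  • Token Limits: Monitor conversation length (for example, via tokens_total on assistant messages) to avoid context window limits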
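
Two of the practices above, storing the thread ID and retrying transient failures, can be combined in a small client-side wrapper. This is a sketch under stated assumptions: `TransientError`, `send_with_retry`, and `ChatSession` are hypothetical names, and `send` stands for whatever function actually posts the payload to /api/chat/thread and returns the decoded response. Which failures count as transient is left to that function.

```python
import time
from typing import Any, Callable, Dict, Optional


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts)."""


def send_with_retry(send: Callable[[Dict[str, Any]], Dict[str, Any]],
                    payload: Dict[str, Any], max_attempts: int = 3,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call send(payload), retrying TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload)
        except TransientError:
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


class ChatSession:
    """Remembers the threadId so follow-up messages continue the same thread."""

    def __init__(self, send: Callable[[Dict[str, Any]], Dict[str, Any]],
                 conversation_id: str, resource_id: str,
                 sleep: Callable[[float], None] = time.sleep):
        self._send = send
        self._sleep = sleep
        self.conversation_id = conversation_id
        self.resource_id = resource_id
        self.thread_id: Optional[str] = None  # unknown until the first reply

    def say(self, message: str) -> Dict[str, Any]:
        payload = {"message": message,
                   "conversationId": self.conversation_id,
                   "resourceId": self.resource_id}
        if self.thread_id:
            payload["threadId"] = self.thread_id
        reply = send_with_retry(self._send, payload, sleep=self._sleep)
        self.thread_id = reply["threadId"]
        return reply
```

Injecting `send` and `sleep` keeps the wrapper independent of any HTTP library and lets it be exercised without network access or real delays.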

Usage Tracking

All chat interactions automatically track token usage and costs. View usage via:

GET /api/usage/current?conversationId=conversation-uuid

Response:
{
  "success": true,
  "mtd": {
    "cost_usd": 12.45,
    "tokens_in": 125000,
    "tokens_out": 187500,
    "tokens_total": 312500,
    "requests": 850
  },
  "last24h": {
    "cost_usd": 2.34,
    "tokens_in": 15000,
    "tokens_out": 22500,
    "tokens_total": 37500,
    "requests": 102
  }
}
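
The two windows in this response lend themselves to simple derived metrics. The sketch below computes cost per request and cost per 1,000 tokens for one window; `summarize_usage` is a hypothetical helper and the metrics are illustrative, not values the API returns.

```python
from typing import Any, Dict


def summarize_usage(window: Dict[str, Any]) -> Dict[str, Any]:
    """Derive per-request and per-1k-token cost from an 'mtd' or 'last24h' block."""
    requests = window["requests"]
    tokens = window["tokens_total"]
    cost = window["cost_usd"]
    return {
        "cost_per_request_usd": cost / requests if requests else 0.0,
        "cost_per_1k_tokens_usd": cost / tokens * 1000 if tokens else 0.0,
        # Sanity check: input plus output tokens should equal the total.
        "tokens_consistent": window["tokens_in"] + window["tokens_out"] == tokens,
    }
```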