For Chatbot Builders

One API for every chatbot you ship

WhatsApp, Telegram, Discord, Slack, or embedded web chat — stream responses from 16 AI models through one OpenAI-compatible endpoint. Per-user caps, auto-failover, zero prompt retention.

Start free — 2 calls, no card

Works with every messaging platform

💬

WhatsApp

Via Meta Cloud API webhook

✈️

Telegram

Via Bot API — 5 min setup

🎮

Discord

Via discord.js / interactions

💼

Slack

Events API + Bolt SDK

🌐

Web chat

Any frontend, streaming SSE

📱

iOS / Android

Mobile SDKs call the API directly

📞

Voice / Twilio

Pipe STT → AIPower → TTS

🔌

Custom

Any HTTP webhook works

Any platform that supports HTTP webhooks works — the AIPower API is the LLM layer, not the messaging layer.

The chatbot pattern — streaming replies

Stream tokens to the user as they arrive. Works identically across all 16 models.

// Node.js — works inside your webhook handler (WhatsApp / Telegram / etc.)
import OpenAI from "openai";

const aipower = new OpenAI({
  baseURL: "https://api.aipower.me/v1",
  apiKey: process.env.AIPOWER_API_KEY,
});

// In-memory session store — swap for Redis/DynamoDB in production
const conversationHistory: Record<string, OpenAI.ChatCompletionMessageParam[]> = {};

async function handleUserMessage(userId: string, text: string) {
  const stream = await aipower.chat.completions.create({
    model: "auto",  // DeepSeek V3 by default — 91% cheaper than GPT-5.4
    stream: true,
    user: userId,   // Tag requests per-user for billing + analytics
    messages: [
      { role: "system", content: "You are a friendly assistant." },
      ...(conversationHistory[userId] ?? []),  // Empty on a user's first message
      { role: "user", content: text },
    ],
  });

  let full = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content || "";
    full += delta;
    await sendToPlatform(userId, delta);  // WhatsApp / Telegram / etc.
  }
  (conversationHistory[userId] ??= []).push(
    { role: "user", content: text },
    { role: "assistant", content: full },
  );
}

Pick the right model for your chatbot

| Chatbot type | Model | Cost/M | Why |
| --- | --- | --- | --- |
| General Q&A / support | `auto` → DeepSeek V3 | $0.34 | Cheap, fast, knows English + Chinese |
| High-quality assistant | `auto-best` → Claude Opus | $30 | Best reasoning & writing |
| Real-time typing (<500 ms) | `auto-fast` → Qwen Turbo | $0.12 | Lowest first-token latency |
| Coding bot (cursor-style) | `auto-code` → Claude Sonnet | $3.45 | 78% SWE-bench |
| WeChat / Chinese market | `qwen/qwen-plus` | $0.13 | Best CN, cheaper than GPT |
| Free-tier / demo bot | `zhipu/glm-4-flash` | $0.01 | Nearly free |
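If you run several bot types off one backend, the table above can be collapsed into a small lookup helper. The `BotKind` names are our own illustration, not an API concept — only the model ID strings come from the table:

```typescript
// Hypothetical helper — maps a bot type from the table above to the
// model ID you pass as `model` in the request body.
type BotKind = "support" | "premium" | "realtime" | "coding" | "cn" | "demo";

function pickModel(kind: BotKind): string {
  const table: Record<BotKind, string> = {
    support:  "auto",              // DeepSeek V3 — cheap default
    premium:  "auto-best",         // Claude Opus — best reasoning & writing
    realtime: "auto-fast",         // Qwen Turbo — lowest first-token latency
    coding:   "auto-code",         // Claude Sonnet — strong on code
    cn:       "qwen/qwen-plus",    // best Chinese-language quality
    demo:     "zhipu/glm-4-flash", // nearly free
  };
  return table[kind];
}
```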

Protect against abuse & runaway cost

Tag each request with a user ID. Set daily spending caps. Bail when users hit their quota.

// Option 1: Account-wide cap
// Set daily_cap_cents=500 in your dashboard → auto-halt at $5/day

// Option 2: Per-user cap (your app-side)
const userDailySpend = await db.getSpendToday(userId);
if (userDailySpend > 50) {  // 50 cents = ~150k free-tier tokens
  return "You've hit your daily limit. Upgrade to continue.";
}

// Option 3: Per-user analytics
const res = await aipower.chat.completions.create({
  model: "auto",
  user: userId,  // Tag it — query /api/usage/logs?user=...
  messages: [...],
});
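To decide whether a request fits under a cap *before* sending it, you can pre-estimate cost from the per-1M-token rates in the pricing table. This is an illustrative sketch — the rates below are copied from the table above and treated as flat (real input/output token rates may differ per model):

```typescript
// Flat per-1M-token rates in cents, from the pricing table (assumption:
// one blended rate per model; check real input/output pricing).
const PRICE_PER_M_CENTS: Record<string, number> = {
  "auto": 34,             // $0.34/M
  "auto-best": 3000,      // $30/M
  "qwen/qwen-plus": 13,   // $0.13/M
};

function estimateCents(model: string, tokens: number): number {
  const rate = PRICE_PER_M_CENTS[model] ?? 34;  // default to the auto rate
  return (tokens / 1_000_000) * rate;
}

// Gate a request against the user's daily cap before calling the API.
function underCap(spentCents: number, model: string, tokens: number, capCents = 50): boolean {
  return spentCents + estimateCents(model, tokens) <= capCents;
}
```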

Your chatbot never goes down from upstream

🛡️

Auto-failover

When OpenAI 5xx's or Anthropic rate-limits, requests transparently re-route to a backup provider. Your bot keeps responding.

⚡

Cloudflare edge

Gateway runs on Cloudflare Workers — 99.99% uptime, <50ms global latency to the gateway.

🔒

Zero prompt retention

We don't store chat content — only billing metadata (user tag, token counts, model). That makes the gateway suitable for bots in regulated industries.

Chatbot FAQ

Do you support streaming (SSE)?

Yes. Every model supports `stream: true`. The response is OpenAI-compatible — any library that works with OpenAI streaming works here.

Can I run a bot for 10,000 concurrent users?

Yes. We've had users push 200+ concurrent streaming sessions through a single API key without issue. For 10k+ users, shard across multiple keys for rate-limit isolation.
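One way to shard is a stable hash from user ID to key, so each user always hits the same key and per-key rate-limit accounting stays consistent. The key values below are placeholders — load real keys from your secrets manager:

```typescript
// Placeholder key list — substitute your real AIPower API keys.
const API_KEYS = ["AIPOWER_KEY_0", "AIPOWER_KEY_1", "AIPOWER_KEY_2"];

// Simple stable string hash (31-based, unsigned) so one user always
// maps to the same key across requests and restarts.
function keyForUser(userId: string): string {
  let h = 0;
  for (const ch of userId) h = (h * 31 + ch.codePointAt(0)!) >>> 0;
  return API_KEYS[h % API_KEYS.length];
}
```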

What about hallucinations / off-topic answers?

Use system prompts with clear constraints, and use a router: start with cheaper models (DeepSeek) for common questions, escalate to Claude Opus only when confidence is low. We have a `/routing` page showing the pattern.
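A minimal version of that router can triage before the first call. The heuristic below (keyword patterns plus message length) is a placeholder of our own — swap in whatever confidence signal you trust, such as a classifier score or retrieval hit rate:

```typescript
// Escalate-on-hard-question sketch: cheap model by default, premium
// model only when the question looks difficult or high-stakes.
function routeModel(question: string): string {
  const hardSignals = [/\bwhy\b/i, /compare/i, /legal|contract|medical/i];
  const looksHard =
    hardSignals.some((re) => re.test(question)) || question.length > 400;
  return looksHard ? "auto-best" : "auto";
}
```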

Can I store conversation history?

AIPower doesn't store it — you do. Pass the full message array on each request. We recommend Redis/DynamoDB for session state; the gateway is stateless.
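Because you resend the full message array every turn, it pays to trim history so context length (and cost) stays bounded. A minimal sketch, assuming one turn is a user/assistant message pair:

```typescript
// Keep only the last `maxTurns` exchanges before sending history back
// to the API. Drop-oldest-first is the simplest policy; summarization
// of the dropped prefix is a common refinement.
type Msg = { role: "user" | "assistant"; content: string };

function trimHistory(history: Msg[], maxTurns = 10): Msg[] {
  return history.slice(-maxTurns * 2);  // one turn = user msg + assistant msg
}
```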

What if my user is in China?

Route them to Chinese models (`qwen-plus`, `deepseek-chat`, `kimi`) — faster in-region, and they'll handle Chinese input better than GPT. We serve both CN and global traffic from the same API.

How do I prevent prompt injection?

Treat user input as untrusted content. Don't let users override your system prompt by embedding `<|system|>` markers. Our docs have an input-sanitization pattern. Use structured outputs (JSON schema) where possible to limit response shapes.
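A minimal sanitizer along those lines strips role-marker tokens before the user's text ever reaches the message array. The two patterns below are illustrative — extend the list for the marker syntax of the models you route to:

```typescript
// Strip tokens that could let user input masquerade as a system or
// assistant message. Not a complete defense — pair with constrained
// system prompts and structured outputs.
function sanitizeUserInput(text: string): string {
  return text
    .replace(/<\|[a-z_]+\|>/gi, "")             // <|system|>, <|assistant|>, ...
    .replace(/\[\/?(system|assistant)\]/gi, "") // [system] / [/system] style markers
    .trim();
}
```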

Building something else?

Launch your chatbot today.

2 free trial calls. +100 bonus on first $5 top-up. OpenAI SDK drop-in — no rewrite.