⚡ Quick Start
Get started with the UPIP AI API in under 60 seconds:
1. Get Your API Key
Sign up for a free account (no credit card required) at upip.company/signup.html
2. Make Your First Request
curl https://upip.company/api/ai-api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen2.5:7b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
3. Use in Your Code
const response = await fetch('https://upip.company/api/ai-api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'qwen2.5:7b',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});
const data = await response.json();
console.log(data.choices[0].message.content);
✓ OpenAI Compatible: Use the OpenAI Python library by changing just the base URL!
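from openai import OpenAI

# Requires openai>=1.0; the older openai.ChatCompletion interface was removed in 1.0
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://upip.company/api/ai-api/v1",
)
response = client.chat.completions.create(
    model="qwen2.5:7b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)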
💰 Pricing
Save 79-99% compared to OpenAI, Anthropic, and Google AI.
| Plan | Price | Tokens/mo | Images/mo | Audio/mo | Rate Limit |
| --- | --- | --- | --- | --- | --- |
| Free | $0 | 100K | 10 | 30 minutes | 30 req/min |
| Startup | $19/mo | 2M | 200 | 600 minutes | 60 req/min |
| Professional | $49/mo | 10M | 1,000 | 3,000 minutes | 120 req/min |
| Enterprise | $249/mo | 100M | 10,000 | 30,000 minutes | 600 req/min |
💸 Real Savings Example: 100M tokens with OpenAI costs $625/month.
With UPIP AI API? Just $249/month. That's $376/month in savings!
🧠 Available Models
Text Generation (Chat Completions)
qwen2.5:3b - Fast, efficient 3B parameter model
qwen2.5:7b - Balanced 7B parameter model (recommended)
llama3.2:3b - Meta's Llama 3.2 3B
phi3:3.8b - Microsoft's Phi-3 model
gemma2:9b - Google's Gemma 2 9B
mistral:7b - Mistral AI 7B
codellama:7b - Code-specialized Llama
70B+ models: llama3.3:70b, qwen2.5:72b, mixtral:8x7b
Premium Models (Cloud API)
gemini-2.0-flash-exp - Google's latest Gemini 2.0
claude-3-5-sonnet-20241022 - Anthropic's Claude 3.5
Image Generation
stable-diffusion-xl - High-quality image generation
Audio (Speech-to-Text)
whisper-large-v3 - OpenAI's Whisper model
⚡ Smart Routing: Small models (3B-7B) run on our fast GPU layer.
Large models (70B+) automatically route to our dedicated inference layer for optimal performance.
📡 API Endpoints
POST /v1/chat/completions
Text generation and chat completions (OpenAI compatible)
{
"model": "qwen2.5:7b",
"messages": [
{"role": "user", "content": "Hello!"}
],
"max_tokens": 500,
"temperature": 0.7
}
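The request body above can be sent from Python with only the standard library. The following is a minimal sketch, not an official client: the helper names `build_json_request`, `chat_payload`, and `send` are hypothetical, the base URL is the one used in the Quick Start, and the response is assumed to follow the OpenAI shape that the Quick Start reads (`choices[0].message.content`).

```python
import json
import urllib.request

BASE_URL = "https://upip.company/api/ai-api/v1"  # base URL from the Quick Start


def build_json_request(api_key, path, payload):
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat_payload(prompt, model="qwen2.5:7b", max_tokens=500, temperature=0.7):
    """Mirror the request body shown above for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def send(request):
    """Send a prepared request and decode the JSON response (network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_json_request(key, "/chat/completions", chat_payload("Hello!")))` performs the actual request. Because the builder only needs a path and a JSON body, the same helper should also fit `/images/generations` with a body such as `{"prompt": "...", "n": 1, "size": "1024x1024"}`.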
POST /v1/images/generations
Generate images with Stable Diffusion XL
{
"prompt": "A beautiful sunset over mountains",
"n": 1,
"size": "1024x1024"
}
POST /v1/audio/transcriptions
Transcribe audio with Whisper
FormData:
"file": audio_file.mp3
"model": "whisper-large-v3"
GET /v1/models
List all available models
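A short sketch of calling this endpoint from Python follows. The response format is not shown on this page, so `extract_model_ids` assumes the OpenAI list shape (`{"data": [{"id": ...}, ...]}`); treat that, and all three function names, as assumptions to verify against a real response.

```python
import json
import urllib.request

MODELS_URL = "https://upip.company/api/ai-api/v1/models"


def build_models_request(api_key):
    """Build (but do not send) the GET request for /v1/models."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload):
    """Pull model IDs out of a response, ASSUMING the OpenAI list shape."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key):
    """Send the request and return the model IDs (network call)."""
    with urllib.request.urlopen(build_models_request(api_key)) as response:
        return extract_model_ids(json.load(response))
```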
🔐 Authentication
All API requests require authentication using your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Your API key format: upip_live_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
⚠️ Keep Your API Key Secret: Never expose your API key in client-side code or public repositories.
Use environment variables or secure backend storage.
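One way to follow this advice in a backend script is to read the key from the environment at startup and fail early if it is missing. This is a sketch under assumptions: the variable name `UPIP_API_KEY` and both function names are hypothetical, and the prefix check assumes every key starts with `upip_live_` as in the format shown above.

```python
import os

KEY_PREFIX = "upip_live_"  # prefix taken from the key format shown above


def load_api_key(env_var="UPIP_API_KEY"):
    """Read the API key from the environment and fail early if it looks wrong."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    if not key.startswith(KEY_PREFIX):
        raise RuntimeError(f"{env_var} does not start with {KEY_PREFIX!r}")
    return key


def auth_headers(api_key):
    """Return the Authorization header expected by every endpoint."""
    return {"Authorization": f"Bearer {api_key}"}
```

Set the variable in the shell (`export UPIP_API_KEY=...`) or in your deployment's secret store rather than committing it to the repository.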
⚡ Rate Limits
Rate limits vary by subscription tier:
- Free: 30 requests/minute
- Startup: 60 requests/minute
- Professional: 120 requests/minute
- Enterprise: 600 requests/minute
Rate limit information is returned in response headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1234567890
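A client can use these headers to pause instead of sending requests that would be rejected. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this page does not state; the function names are hypothetical.

```python
import time


def min_interval(requests_per_minute):
    """Even spacing (in seconds) between requests that stays within a tier's limit."""
    return 60.0 / requests_per_minute


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, or 0.0 if quota remains.

    `headers` is any mapping of header names to string values. A plain dict is
    case-sensitive, so the names must match the casing shown above.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # ASSUMPTION: the reset value is a Unix timestamp in seconds
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

For example, on the Free tier `min_interval(30)` gives 2.0 seconds between requests, and a response with `X-RateLimit-Remaining: 0` can be followed by `time.sleep(seconds_until_reset(headers))` before retrying.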