Google Gemini

Google's Gemini models support multimodal input and very long context windows.

Basic information

Official API: /v1/models/{model}:generateContent
Unified gateway format: /api/openai/v1/chat/completions
Authentication: the x-goog-api-key request header or the key query parameter
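
The two authentication options can be sketched as follows. This is a minimal illustration that only builds the request and never sends it; the host is taken from the curl example in this document, and the helper name `build_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"  # host from the curl example


def build_request(model: str, api_key: str, body: dict,
                  use_header: bool = True) -> urllib.request.Request:
    """Build (but do not send) a generateContent request with either auth style."""
    url = f"{BASE_URL}/v1/models/{model}:generateContent"
    headers = {"Content-Type": "application/json"}
    if use_header:
        # Option 1: pass the key in the x-goog-api-key request header
        headers["x-goog-api-key"] = api_key
    else:
        # Option 2: pass the key as the `key` query parameter
        url = f"{url}?{urllib.parse.urlencode({'key': api_key})}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")
```

The header form keeps the key out of URLs, which tend to end up in access logs, so it is usually the safer default.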

Generate content

Method: POST
Official path: /v1/models/{model}:generateContent
Unified path: /api/openai/v1/chat/completions

Path parameters

Field	Required	Description
`model`	Yes	Model name, for example `gemini-1.5-pro` or `gemini-1.5-flash`

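Query parameters

Field	Required	Description
`key`	Conditional	API key; required only when it is not supplied in the `x-goog-api-key` request header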

Request headers

Name	Required	Description
`Content-Type`	Yes	Request content type; always `application/json`

Request body

Field	Type	Required	Description
`contents`	`array`	Yes	List of conversation turns
`generationConfig`	`object`	No	Generation settings
`safetySettings`	`array`	No	Safety settings
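
As an illustration of how the three top-level fields fit together, here is a minimal sketch of a request body. The nested keys (`temperature`, `maxOutputTokens`, `category`, `threshold`) and their values come from Google's public API rather than from this page, so treat them as assumptions to verify against the model you use.

```python
import json

# A request body using all three documented top-level fields.
body = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Summarize the plot of Hamlet in two sentences."}],
        }
    ],
    "generationConfig": {
        "temperature": 0.2,      # lower values give more deterministic output
        "maxOutputTokens": 256,  # upper bound on generated tokens
    },
    "safetySettings": [
        {
            "category": "HARM_CATEGORY_HARASSMENT",
            "threshold": "BLOCK_MEDIUM_AND_ABOVE",
        }
    ],
}

payload = json.dumps(body)  # JSON string to send as the POST body
```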

Request examples

Official format

curl "https://generativelanguage.googleapis.com/v1/models/gemini-1.5-pro:generateContent?key=$GOOGLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Hello, Gemini!"
      }]
    }]
  }'

Unified format

curl https://your-domain.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
    "model": "gemini-1.5-pro",
    "messages": [
      {"role": "user", "content": "Hello, Gemini!"}
    ]
  }'
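
The two request shapes differ mainly in how a conversation is represented: `messages` with a `content` string in the unified format, `contents` with `parts` in the official one. This page does not describe how the gateway translates between them, so the following is only a plausible sketch of such a mapping; the function name `messages_to_contents` and the decision to skip unknown roles are assumptions.

```python
def messages_to_contents(messages: list[dict]) -> list[dict]:
    """Convert OpenAI-style chat messages into Gemini-style `contents`."""
    # Gemini uses "model" where the OpenAI format uses "assistant"
    role_map = {"user": "user", "assistant": "model"}
    contents = []
    for message in messages:
        role = role_map.get(message["role"])
        if role is None:
            # Roles such as "system" have no direct equivalent in `contents`;
            # this sketch simply skips them.
            continue
        contents.append({"role": role, "parts": [{"text": message["content"]}]})
    return contents
```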

TypeScript

import { GoogleGenerativeAI } from '@google/generative-ai'

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!)
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' })

const result = await model.generateContent('Hello, Gemini!')
const response = result.response
console.log(response.text())

Python

import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel('gemini-1.5-pro')

response = model.generate_content('Hello, Gemini!')
print(response.text)

Successful response

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Hello! How can I help you today?"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 3,
    "candidatesTokenCount": 8,
    "totalTokenCount": 11
  }
}
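
When calling the REST endpoint directly rather than through an SDK, the reply text has to be pulled out of the nested structure by hand. A minimal sketch, assuming the response has the shape shown above (the helper name `extract_text` is hypothetical):

```python
def extract_text(response: dict) -> str:
    """Concatenate the text parts of the first candidate, or return ''."""
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


# Same data as the sample response above
sample = {
    "candidates": [
        {
            "content": {
                "parts": [{"text": "Hello! How can I help you today?"}],
                "role": "model",
            },
            "finishReason": "STOP",
            "index": 0,
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 3,
        "candidatesTokenCount": 8,
        "totalTokenCount": 11,
    },
}

reply = extract_text(sample)
total_tokens = sample["usageMetadata"]["totalTokenCount"]
```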

Rate limits

Requests per minute: 60
Tokens per minute: 32000
Free-tier limits are usually lower
Response headers:
- x-ratelimit-limit
- x-ratelimit-remaining
- x-ratelimit-reset
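
A client can use these headers to decide whether to pause before the next request. The sketch below is hedged: this page does not say what unit `x-ratelimit-reset` uses, so the code assumes it is the number of seconds until the window resets. Check the actual header values before relying on it.

```python
def seconds_to_wait(headers: dict) -> float:
    """Return how long to pause before the next request, based on rate-limit headers."""
    # HTTP header names are case-insensitive, so normalize them first
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # ASSUMPTION: the reset header holds seconds until the quota window resets
    return float(lowered.get("x-ratelimit-reset", "1"))
```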

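Streaming responses

Streaming uses the streamGenerateContent endpoint
Protocol: SSE (server-sent events)
Each chunk usually contains a complete candidate structure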

const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' })
const result = await model.generateContentStream('Tell me a story')

for await (const chunk of result.stream) {
  const chunkText = chunk.text()
  process.stdout.write(chunkText)
}
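
Without an SDK, the SSE stream has to be parsed by hand. The following sketch assumes each event arrives as a single `data:` line carrying one JSON chunk with the candidate structure described above; multi-line events and other SSE fields are not handled, and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Join the text parts of the first candidate across all chunks."""
    pieces = []
    for chunk in iter_sse_chunks(lines):
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)
```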

Function calling

Gemini supports function calling through function declarations.

const model = genAI.getGenerativeModel({
  model: 'gemini-1.5-pro',
  tools: [{
    functionDeclarations: [{
      name: 'get_weather',
      description: 'Get the current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' }
        },
        required: ['location']
      }
    }]
  }]
})

const result = await model.generateContent("What's the weather in Paris?")
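
Declaring a function is only half of the loop: when the model decides to call it, the reply contains a `functionCall` part, and the caller is expected to run the function and send the result back. A minimal sketch of that dispatch step, operating on plain dictionaries in the REST wire format. The `get_weather` implementation and its canned data are hypothetical, and the exact part shape should be checked against the API version you target.

```python
def get_weather(location: str) -> dict:
    """Hypothetical local implementation; returns canned data."""
    return {"location": location, "forecast": "sunny", "temperature_c": 21}


# Maps declared function names to local callables
HANDLERS = {"get_weather": get_weather}


def handle_function_call(part: dict) -> dict:
    """Run the function named in a `functionCall` part and wrap its result
    as a `functionResponse` part for the next request."""
    call = part["functionCall"]
    handler = HANDLERS[call["name"]]
    result = handler(**call.get("args", {}))
    return {"functionResponse": {"name": call["name"], "response": result}}


# Example part, shaped like what the model might return for the Paris question
part = {"functionCall": {"name": "get_weather", "args": {"location": "Paris"}}}
reply_part = handle_function_call(part)
```

The returned part would then be appended to the conversation in a follow-up request so the model can phrase the final answer.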

Context Caching

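Can significantly reduce the cost of repeated input in long-context scenarios
When an implicit cache hit occurs, the input price is discounted automatically
The system currently tracks mainly cache-read tokens
Whether explicit caching is supported, and how it is priced, depends on the documentation for the specific model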
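
To see whether caching is taking effect, one option is to compare cached tokens with total prompt tokens in `usageMetadata`. The sketch below assumes the response reports cache reads in a `cachedContentTokenCount` field and that `promptTokenCount` includes those cached tokens. Neither detail is stated on this page, so verify both for your model.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of prompt tokens that were served from cache (0.0 to 1.0)."""
    prompt_tokens = usage.get("promptTokenCount", 0)
    # ASSUMPTION: cache-read tokens are reported under this key
    cached_tokens = usage.get("cachedContentTokenCount", 0)
    if prompt_tokens == 0:
        return 0.0
    return cached_tokens / prompt_tokens
```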

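Models

Model	Context window (tokens)	Max output (tokens)	Price (input / output, USD per 1M tokens)	Capabilities
Gemini 1.5 Pro	`1000000`	`8192`	`1.25 / 5`	Streaming, function calling, vision
Gemini 1.5 Flash	`1000000`	`8192`	`0.075 / 0.3`	Streaming, function calling, vision
Gemini 1.0 Pro	`32760`	`2048`	`0.5 / 1.5`	Streaming, function calling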
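
The prices in the table can be combined with `usageMetadata` to estimate the cost of a single request. A minimal sketch: the prices are copied from the table, the model identifiers used as dictionary keys are assumed to correspond to the display names, and cache discounts are ignored.

```python
# (input, output) price in USD per 1M tokens, copied from the table above
PRICES_PER_MILLION = {
    "gemini-1.5-pro": (1.25, 5.0),
    "gemini-1.5-flash": (0.075, 0.3),
    "gemini-1.0-pro": (0.5, 1.5),
}


def estimate_cost_usd(model: str, usage: dict) -> float:
    """Estimate request cost from prompt and candidate token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    input_cost = usage.get("promptTokenCount", 0) * input_price / 1_000_000
    output_cost = usage.get("candidatesTokenCount", 0) * output_price / 1_000_000
    return input_cost + output_cost
```

For example, one million input tokens plus one million output tokens on Gemini 1.5 Pro comes to 1.25 + 5 = 6.25 USD under these prices.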

SDK

Language	Package	Install command	Repository
TypeScript	`@google/generative-ai`	`npm install @google/generative-ai`	`https://github.com/google/generative-ai-js`
Python	`google-generativeai`	`pip install google-generativeai`	`https://github.com/google/generative-ai-python`

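Common error codes

Code	HTTP status	Meaning	Suggested action
`INVALID_ARGUMENT`	`400`	Invalid argument	Check parameter formats and values
`PERMISSION_DENIED`	`403`	Permission denied	Check the API key
`RESOURCE_EXHAUSTED`	`429`	Quota exhausted	Wait for the quota to reset, or upgrade
`INTERNAL`	`500`	Internal error	Retry later
`UNAVAILABLE`	`503`	Service unavailable	Retry with exponential backoff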
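
The retry advice in the table can be sketched as a small wrapper. This is an illustration under stated assumptions rather than a recommended implementation: `ApiError` is a hypothetical exception carrying one of the codes above, `send` is any callable that performs the request, and the delay parameters are arbitrary.

```python
import random
import time

# Codes the table suggests waiting on or retrying
RETRYABLE = {"RESOURCE_EXHAUSTED", "INTERNAL", "UNAVAILABLE"}


class ApiError(Exception):
    """Hypothetical exception carrying an error code from the table above."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_backoff(send, max_attempts: int = 5, base_delay: float = 1.0,
                      sleep=time.sleep):
    """Call send(), retrying retryable errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.code not in RETRYABLE or last_attempt:
                raise  # non-retryable error, or retries exhausted
            # Delay doubles each attempt (base, 2x, 4x, ...) plus random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists so the wait can be stubbed out in tests; errors such as `INVALID_ARGUMENT` are raised immediately because retrying an invalid request cannot succeed.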

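Best practices

Use the long context window to process large documents
Use Context Caching to reduce cost in long-context scenarios
Use safety settings to filter inappropriate content
Choose a Pro or Flash model according to task complexity
Combine multimodal capabilities to process images together with text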
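
As an illustration of the last point, an image and a text prompt can travel in the same `contents` entry as separate parts. A minimal sketch, assuming the REST format accepts base64-encoded image bytes in an `inlineData` part; the helper name is hypothetical.

```python
import base64


def image_and_text_contents(image_bytes: bytes, mime_type: str, prompt: str) -> list[dict]:
    """Build a `contents` list with one image part followed by one text part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "parts": [
                {"inlineData": {"mimeType": mime_type, "data": encoded}},
                {"text": prompt},
            ],
        }
    ]
```

Inline data increases the request size by about a third because of base64 encoding, so this approach suits small images best.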