Gemini Native generateContent API — API Reference

The Gemini Native API lets you interact with Gemini models using Google’s native request format, including the contents and generationConfig structure. All Gemini models accessible through this endpoint support both image and video analysis, making it the right choice whenever you need multimodal capabilities with the full flexibility of the native Gemini protocol.

Base URL: http://apillm.globalaiopc.com/gw_llm_power

Endpoints:

POST /v1/models/{model}:generateContent — Standard (non-streaming) response
POST /v1/models/{model}:streamGenerateContent — Streaming response
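
The two endpoints differ only in the method name after the colon, so a request URL can be assembled from the base URL, a model ID, and a streaming flag. The following is a minimal sketch; the helper name `endpoint_url` is illustrative and not part of the API.

```python
BASE_URL = "http://apillm.globalaiopc.com/gw_llm_power"


def endpoint_url(model: str, stream: bool = False) -> str:
    """Return the full URL for a standard or streaming request."""
    method = "streamGenerateContent" if stream else "generateContent"
    return f"{BASE_URL}/v1/models/{model}:{method}"


print(endpoint_url("gemini-2.5-pro"))
print(endpoint_url("gemini-2.5-pro", stream=True))
```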

Authentication

Authenticate every request using the Authorization header with your API key:

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
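
In client code, these two headers can be built once and reused for every call. The sketch below assumes the key is kept in an environment variable; the variable name `GEMINI_GATEWAY_API_KEY` and the helper `auth_headers` are hypothetical choices for illustration.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Build the two headers that every request carries."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# GEMINI_GATEWAY_API_KEY is a hypothetical variable name used only in this sketch.
headers = auth_headers(os.environ.get("GEMINI_GATEWAY_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control; any secret store would serve the same purpose.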

Supported Models

All Gemini models available through this endpoint support image and video analysis.

Model	Variants
`gemini-2.5-flash-lite`	Standard only
`gemini-2.5-pro`	Standard, `-official`, `-low`
`gemini-3-flash-preview`	Standard, `-official`, `-low`
`gemini-3.1-flash-lite-preview`	Standard only
`gemini-3.1-pro-preview`	Standard, `-official`, `-low`

Model Suffix Reference

Suffix	Description
(none)	Standard / stable version
`-official`	Official version
`-low`	Budget version
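
The two tables above can be combined into a small lookup that rejects variants a model does not offer. This sketch assumes the suffix is appended directly to the base model name (for example `gemini-2.5-pro-low`), which the tables imply but do not state outright; the helper name `model_id` is illustrative.

```python
# Variants per model, copied from the Supported Models table ("" means standard).
VARIANTS = {
    "gemini-2.5-flash-lite": {""},
    "gemini-2.5-pro": {"", "-official", "-low"},
    "gemini-3-flash-preview": {"", "-official", "-low"},
    "gemini-3.1-flash-lite-preview": {""},
    "gemini-3.1-pro-preview": {"", "-official", "-low"},
}


def model_id(base: str, suffix: str = "") -> str:
    """Return base + suffix, rejecting combinations the table does not list."""
    if base not in VARIANTS:
        raise ValueError(f"unknown model: {base}")
    if suffix not in VARIANTS[base]:
        raise ValueError(f"{base} has no {suffix!r} variant")
    # Assumption: the suffix is appended directly to the base model name.
    return base + suffix
```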

Request Parameters

contents

array

required

The array of message content objects that make up the conversation. Each object typically contains a role and a parts array.

contents[].role

string

The role of the message author. Use `user` for human turns and `model` for earlier model turns in a multi-turn conversation.

contents[].parts

array

required

An array of content parts for the message. For plain-text input, each part is an object with a text field. Parts in the native Gemini multimodal format (inline images, video, and so on) are also supported.
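
For multimodal input, Google's native format carries binary data in an `inlineData` part with a `mimeType` and base64-encoded `data`. The sketch below builds such a part alongside a text part. It assumes this gateway passes `inlineData` through unchanged, and the helper name `image_part` is illustrative.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a native Gemini inlineData part."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }


# Placeholder bytes stand in for a real image read with open(path, "rb").
parts = [{"text": "Describe this image."}, image_part(b"\x89PNG placeholder")]
```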

contents[].parts[].text

string

The text content of the part.
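
Putting the `contents` fields together, a multi-turn conversation is a list of alternating `user` and `model` entries, each with its own `parts` array. A minimal sketch follows; the helper name `turn` and the sample messages are illustrative.

```python
def turn(role: str, text: str) -> dict:
    """Build one contents entry holding a single text part."""
    if role not in ("user", "model"):
        raise ValueError("role must be 'user' or 'model'")
    return {"role": role, "parts": [{"text": text}]}


contents = [
    turn("user", "What is nucleus sampling?"),
    turn("model", "It samples from the most probable tokens up to a threshold."),
    turn("user", "How does temperature interact with it?"),
]
```

The final entry is the new user message; earlier entries replay the conversation so far.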

generationConfig.temperature

number

Controls the randomness of the model’s output. Lower values produce more focused, deterministic responses; higher values produce more creative output.

generationConfig.topP

number

Nucleus sampling parameter. The model samples only from the smallest set of most-probable tokens whose cumulative probability reaches topP.

generationConfig.maxOutputTokens

integer

The maximum number of tokens the model may generate in its response.
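
The three `generationConfig` fields are independent, so a request can set any subset of them. The sketch below builds the object from only the values supplied, on the assumption that the service applies its own defaults to omitted fields. The helper name and the sample values are illustrative.

```python
def generation_config(temperature=None, top_p=None, max_output_tokens=None) -> dict:
    """Return a generationConfig dict containing only the fields that were set."""
    fields = {
        "temperature": temperature,
        "topP": top_p,
        "maxOutputTokens": max_output_tokens,
    }
    return {name: value for name, value in fields.items() if value is not None}


# A focused configuration: low randomness and a short reply.
focused = generation_config(temperature=0.2, max_output_tokens=256)
```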

systemInstruction.parts[].text

string

An optional system prompt that sets the context and behavior for the model. Provide this as a text part within the systemInstruction object.
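
Note that `systemInstruction` sits at the top level of the request body, beside `contents`, rather than inside it. A minimal sketch of a body with and without a system prompt; the helper name `request_body` and the sample strings are illustrative.

```python
import json
from typing import Optional


def request_body(user_text: str, system_text: Optional[str] = None) -> dict:
    """Build a single-turn request body, adding systemInstruction only if given."""
    body = {"contents": [{"role": "user", "parts": [{"text": user_text}]}]}
    if system_text:
        body["systemInstruction"] = {"parts": [{"text": system_text}]}
    return body


print(json.dumps(request_body("Summarize this paragraph.", "Answer in one sentence."), indent=2))
```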

Response Fields

candidates[].content.parts[].text

string

The text content generated by the model.

candidates[].finishReason

string

The reason the model stopped generating. Common values include STOP (natural end) and MAX_TOKENS (token limit reached).
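
A practical use of `finishReason` is detecting replies that were cut off by `maxOutputTokens`. The sketch below checks the first candidate only; the helper name `was_truncated` is illustrative.

```python
def was_truncated(response: dict) -> bool:
    """True if the first candidate stopped because it hit the token limit."""
    candidates = response.get("candidates") or []
    return bool(candidates) and candidates[0].get("finishReason") == "MAX_TOKENS"
```

When this returns True, a caller might retry with a larger `maxOutputTokens` or ask the model to continue.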

usageMetadata.promptTokenCount

integer

The number of tokens in the input contents and system instruction.

usageMetadata.candidatesTokenCount

integer

The number of tokens in the generated response candidates.

usageMetadata.totalTokenCount

integer

The total number of tokens used in the request (prompt + candidates).
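
The response fields above can be gathered into one flat summary for logging or billing. This sketch reads the first candidate and joins its text parts; the helper name `summarize` and the output key names are illustrative.

```python
def summarize(response: dict) -> dict:
    """Pull the generated text, finish reason, and token counts from a response."""
    candidate = response["candidates"][0]
    text = "".join(part.get("text", "") for part in candidate["content"]["parts"])
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finish_reason": candidate.get("finishReason"),
        "prompt_tokens": usage.get("promptTokenCount", 0),
        "output_tokens": usage.get("candidatesTokenCount", 0),
        "total_tokens": usage.get("totalTokenCount", 0),
    }
```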

Code Examples

curl -X POST "http://apillm.globalaiopc.com/gw_llm_power/v1/models/gemini-3.1-pro-preview:generateContent" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Explain the history of artificial intelligence"}]
      }
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1024
    }
  }'
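
The same request can be made from Python using only the standard library. This is a sketch under the same assumptions as the curl command (same URL, placeholder key, same payload). The `send` helper is illustrative, and the final call is left commented out so the snippet runs without network access.

```python
import json
import urllib.request

URL = ("http://apillm.globalaiopc.com/gw_llm_power"
       "/v1/models/gemini-3.1-pro-preview:generateContent")

payload = {
    "contents": [
        {"role": "user",
         "parts": [{"text": "Explain the history of artificial intelligence"}]}
    ],
    "generationConfig": {"temperature": 0.7, "maxOutputTokens": 1024},
}

request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": "Bearer YOUR_API_KEY",
             "Content-Type": "application/json"},
    method="POST",
)


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


# response = send(request)  # uncomment after substituting a real API key
```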

Example Response

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "The history of artificial intelligence spans decades, progressing through three major phases: early rule-based systems in the 1950s–1980s, the rise of machine learning in the 1990s–2010s, and the current era of large language models and deep learning."
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 52,
    "totalTokenCount": 62
  }
}

To receive a streaming response, replace :generateContent with :streamGenerateContent in the request URL. The API will return a series of incremental response chunks in the native Gemini streaming format.
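
On the client side, streamed chunks are usually stitched back into one reply. The sketch below assumes each chunk, once decoded, has the same `candidates` shape as the standard response. How chunks are framed on the wire is not specified here, so it starts from already-parsed dictionaries; the helper name `collect_stream` is illustrative.

```python
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Concatenate text from the first candidate of each chunk.

    Returns the full text and the last finishReason seen (None if absent).
    """
    pieces = []
    finish_reason = None
    for chunk in chunks:
        for candidate in chunk.get("candidates", []):
            if candidate.get("index", 0) != 0:
                continue  # this sketch follows only the first candidate
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
            finish_reason = candidate.get("finishReason", finish_reason)
    return "".join(pieces), finish_reason
```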

This endpoint follows the native Google Gemini protocol. For complete details on multimodal input, function calling, safety settings, and advanced generation configuration, refer to the Google Gemini generateContent documentation.
