GLM 4.1V 9B Thinking

Zhipu AI | chat, vision, reasoning

API ID: thudm/glm-4.1v-9b-thinking

Input Price: $0.04 / 1M tokens
Output Price: $0.14 / 1M tokens

About GLM 4.1V 9B Thinking

GLM 4.1V 9B Thinking is a budget-friendly vision-language model from Zhipu AI with built-in reasoning ("thinking") and a 66k-token context window, suitable for conversations, content creation, image understanding, and general AI tasks.

💰 Price Ranking
Ranked #604 by lowest price among 950 chat models

Model Specifications

Context Length: 66k
Max Output: not specified
Release Date: 2025-07-11
Capabilities: chat, vision, reasoning
Input Modalities: image, text
Output Modalities: text

Best For

  • Complex reasoning, math problems, multi-step logic
  • Image analysis, document understanding, visual Q&A
  • Conversations, content writing, general assistance

Consider Alternatives For

  • Simple Q&A (cheaper models available)

💰 Real-World Cost Examples

Estimated monthly costs for common use cases:

  • Personal AI Assistant: $0.06/month (50 conversations/day, ~500 tokens each)
  • Customer Service Bot: $1.77/month (1,000 tickets/day, ~800 tokens each)
  • Data Analysis Pipeline: $2.44/month (500 analyses/day, ~2k tokens each)
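The estimates above can be reproduced with simple arithmetic from the listed rates ($0.04 input / $0.14 output per 1M tokens). A minimal sketch, assuming a 50/50 input/output token split (the page does not state the split behind its own figures, so actual estimates may differ):

```python
# Rough monthly-cost estimator at the listed GLM 4.1V 9B Thinking rates.
INPUT_PER_M = 0.04   # USD per 1M input tokens
OUTPUT_PER_M = 0.14  # USD per 1M output tokens

def monthly_cost(requests_per_day, tokens_per_request,
                 input_fraction=0.5, days=30):
    """Estimate monthly USD cost. input_fraction is an assumption:
    the share of total tokens that are input (prompt) tokens."""
    total_tokens = requests_per_day * tokens_per_request * days
    input_tokens = total_tokens * input_fraction
    output_tokens = total_tokens - input_tokens
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# 50 conversations/day at ~500 tokens each stays under a dime per month.
print(round(monthly_cost(50, 500), 2))
```

With a 50/50 split the customer-service example comes out near $2.16/month rather than the listed $1.77, which suggests the page assumes a more input-heavy mix.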

Zhipu AI Model Lineup

Compare all models from Zhipu AI to find the best fit

Model | Input | Output | Context | Capabilities
GLM 4.1V 9B Thinking (current) | Free | Free | 66k | chat, vision, reasoning
GLM 4 9B | Free | Free | 32k | chat
GLM 4 32B | Free | Free | 33k | chat
GLM Z1 Rumination 32B | Free | Free | 32k | chat, reasoning

Similar Models from Other Providers

Cross-brand alternatives with similar capabilities

  • Alibaba Qwen Qwen3 8B: $0.04 input / $0.14 output, 41k context
  • Alibaba Qwen Qwen3 235B A22B Instruct 2507: $0.07 input / $0.10 output, 262k context
  • Amazon Nova Micro 1.0: $0.04 input / $0.14 output, 128k context
  • Google Gemma 3 12B: $0.04 input / $0.13 output, 131k context

🚀 Quick Start

Get started with GLM 4.1V 9B Thinking API

OpenAI-compatible SDK

from openai import OpenAI

# Point the client at your provider's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.provider.com/v1",  # replace with your provider's base URL
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="thudm/glm-4.1v-9b-thinking",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
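Since the model accepts image input, a multimodal request uses the OpenAI-compatible content-parts message shape. A sketch of building such a request body, assuming the provider accepts remote image URLs (some OpenAI-compatible endpoints require base64 data URIs instead):

```python
# Build a multimodal message in the OpenAI-compatible content-parts
# format: a list of typed parts instead of a plain string.
def vision_messages(prompt: str, image_url: str) -> list:
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

messages = vision_messages("Describe this image.",
                           "https://example.com/chart.png")
# Pass to the client exactly as in the text-only example:
# client.chat.completions.create(
#     model="thudm/glm-4.1v-9b-thinking", messages=messages)
```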