Zai-org
GLM-4.5
GLM-4.5 Series Models are foundation models specifically engineered for intelligent agents. The flagship GLM-4.5 integrates 355 billion total parameters (32 billion active), unifying reasoning, coding, and agent capabilities to address complex application demands. As a hybrid reasoning system, it offers dual operational modes:
- Thinking Mode: enables complex reasoning, tool invocation, and strategic planning
- Non-Thinking Mode: delivers low-latency responses for real-time interactions
This architecture bridges high-performance AI with adaptive functionality for dynamic agent environments.
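In practice, the mode switch is usually exposed as a request parameter. The sketch below only assembles a chat-completion payload; the `thinking` field name, its `{"type": ...}` shape, and the `glm-4.5` model id are assumptions for illustration, so check the provider's API documentation for the exact parameter names.

```python
# Sketch: building request payloads for the two operational modes.
# The "thinking" field and its values are assumed, not confirmed API names.

def build_request(prompt: str, thinking: bool) -> dict:
    """Assemble a chat-completion payload selecting Thinking or Non-Thinking mode."""
    return {
        "model": "glm-4.5",  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical switch: deep reasoning vs. fast low-latency replies.
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }

# Thinking mode for a planning task, non-thinking for a quick lookup.
slow = build_request("Plan a multi-step web-research agent.", thinking=True)
fast = build_request("What's the capital of France?", thinking=False)
print(slow["thinking"]["type"], fast["thinking"]["type"])
```

Only the payload construction is shown; an actual call would POST this dict to the chat endpoint with an API key.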
GLM-4.5V
Z.ai's GLM-4.5V sets a new standard in visual reasoning, achieving state-of-the-art performance among open-source models across 42 benchmarks. Beyond benchmarks, it excels in real-world applications through hybrid training, enabling comprehensive visual understanding: image and video analysis, GUI interaction, complex document processing, and precise visual grounding. In China's GeoGuessr challenge, GLM-4.5V surpassed 99% of 21,000 human players within 16 hours and reached 66th place within a week. Built on the GLM-4.5-Air foundation and inheriting GLM-4.1V-Thinking's approach, it leverages a 106B-parameter MoE architecture for scalable, efficient performance. The model bridges advanced AI research with practical deployment, delivering unmatched visual intelligence.
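Tasks like visual grounding pair an image with a text instruction in a single multimodal message. A minimal sketch, assuming an OpenAI-style message format with `image_url` content parts (whether GLM-4.5V's serving endpoint uses exactly this shape is an assumption):

```python
import base64

def image_message(image_bytes: bytes, question: str) -> list:
    """Build an OpenAI-style multimodal message pairing an image with a question.

    The content-part structure ("image_url" + data URL) is an assumed,
    commonly used format, not a confirmed GLM-4.5V API contract.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
            {"type": "text", "text": question},
        ],
    }]

# A grounding-style request: ask for the location of a UI element.
msgs = image_message(b"<png bytes here>",
                     "Return the bounding box of the Submit button.")
print(msgs[0]["content"][1]["text"])
```

The same message shape covers the other use cases above (document QA, GUI interaction) by varying the text instruction.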
GLM-4.1V-9B-Thinking
GLM-4.1V-9B-Thinking is an open-source Vision-Language Model (VLM) jointly released by Zhipu AI and Tsinghua University's KEG Lab, designed specifically for complex multimodal cognitive tasks. Built on the GLM-4-9B-0414 base model, it integrates Chain-of-Thought (CoT) reasoning and employs reinforcement-learning strategies, significantly enhancing its cross-modal reasoning capabilities and stability. As a lightweight 9B-parameter model, it strikes a strong balance between deployment efficiency and performance: across 28 authoritative benchmark evaluations, it matches or surpasses the 72B-parameter Qwen-2.5-VL-72B on 18 of them. The model excels at tasks such as image-text understanding, mathematical and scientific reasoning, and video comprehension, and supports 4K-resolution images with arbitrary aspect ratios.