Available Models

FreeInference provides access to multiple state-of-the-art LLMs for coding agents and IDEs.

Model Overview

| Model ID | Name | Context Length | Max Output | Features |
|----------|------|----------------|------------|----------|
| llama-3.3-70b-instruct | Llama 3.3 70B Instruct | 131K tokens | 8K tokens | Function calling, Structured output |
| llama-4-scout | Llama 4 Scout | 128K tokens | 16K tokens | Function calling, Structured output |
| llama-4-maverick | Llama 4 Maverick | 128K tokens | 16K tokens | Function calling, Structured output, Multimodal (text+image) |
| glm-4.5 | GLM-4.5 | 128K tokens | 96K tokens | Function calling, Structured output, Bilingual (Chinese/English) |
| glm-4.5-air | GLM-4.5-Air | 128K tokens | 96K tokens | Function calling, Structured output, Bilingual (Chinese/English) |
| glm-4.6 | GLM-4.6 | 200K tokens | 128K tokens | Function calling, Structured output, Bilingual (Chinese/English), Thinking mode |
| deepseek-r1 | DeepSeek R1 | 64K tokens | 8K tokens | Function calling, Structured output |
| qwen3-coder-30b | Qwen3 Coder 30B | 32K tokens | 8K tokens | Function calling, Structured output |
| minimax-m2 | MiniMax M2 | 196K tokens | 8K tokens | Function calling, Structured output |
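The context and output limits above determine which models can serve a given request. As a sketch, a small helper (using the token limits from this page) can filter models by whether the prompt plus the requested completion fits:

```python
# Context windows and max-output caps, in tokens, as listed on this page.
MODELS = {
    "llama-3.3-70b-instruct": {"context": 131_072, "max_output": 8_192},
    "llama-4-scout": {"context": 128_000, "max_output": 16_384},
    "llama-4-maverick": {"context": 128_000, "max_output": 16_384},
    "glm-4.5": {"context": 128_000, "max_output": 96_000},
    "glm-4.5-air": {"context": 128_000, "max_output": 96_000},
    "glm-4.6": {"context": 200_000, "max_output": 128_000},
    "deepseek-r1": {"context": 64_000, "max_output": 8_000},
    "qwen3-coder-30b": {"context": 32_768, "max_output": 8_192},
    "minimax-m2": {"context": 196_608, "max_output": 8_192},
}

def models_that_fit(prompt_tokens: int, completion_tokens: int) -> list[str]:
    """Return model IDs whose context window holds prompt + completion
    and whose max-output cap allows the requested completion length."""
    return [
        model_id
        for model_id, spec in MODELS.items()
        if prompt_tokens + completion_tokens <= spec["context"]
        and completion_tokens <= spec["max_output"]
    ]
```

For example, a 150K-token prompt with a 10K-token completion budget only fits glm-4.6: minimax-m2 has room in its context window but caps output at 8K tokens.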


Model Details

Llama 3.3 70B Instruct

Model ID: llama-3.3-70b-instruct

  • Context length: 131,072 tokens

  • Max output: 8,192 tokens

  • Quantization: bf16

  • Input modalities: text

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes
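Every model on this page advertises function calling. Assuming FreeInference exposes an OpenAI-compatible chat completions API (an assumption, not confirmed here), a tool-call request body with a hypothetical `get_weather` tool would look like:

```python
import json

# Hypothetical request body in the OpenAI-compatible chat completions
# shape; the actual FreeInference API format is an assumption here.
payload = {
    "model": "llama-3.3-70b-instruct",
    "messages": [
        {"role": "user", "content": "What is the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)  # what an HTTP client would send
```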


Llama 4 Scout

Model ID: llama-4-scout

  • Context length: 128,000 tokens

  • Max output: 16,384 tokens

  • Quantization: fp8

  • Input modalities: text

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes


Llama 4 Maverick

Model ID: llama-4-maverick

  • Context length: 128,000 tokens

  • Max output: 16,384 tokens

  • Quantization: fp8

  • Input modalities: text, image

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes
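Llama 4 Maverick is the only model here that accepts image input. Assuming the common OpenAI-style multimodal message format (an assumption for this provider), an image is passed as a content part alongside the text, e.g. as a base64 data URL:

```python
# Multimodal message sketch: text plus an inline image, using the
# OpenAI-style content-parts format (assumed, not confirmed here).
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this screenshot."},
        {
            "type": "image_url",
            # placeholder data URL; a real request embeds the full base64 image
            "image_url": {"url": "data:image/png;base64,<BASE64_IMAGE>"},
        },
    ],
}

request = {"model": "llama-4-maverick", "messages": [message]}
```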


GLM-4.5

Model ID: glm-4.5

  • Context length: 128,000 tokens

  • Max output: 96,000 tokens

  • Quantization: fp8

  • Input modalities: text

  • Output modalities: text

  • Language support: Chinese, English

  • Function calling: Yes

  • Structured output: Yes


GLM-4.5-Air

Model ID: glm-4.5-air

  • Context length: 128,000 tokens

  • Max output: 96,000 tokens

  • Quantization: fp8

  • Input modalities: text

  • Output modalities: text

  • Language support: Chinese, English

  • Function calling: Yes

  • Structured output: Yes


GLM-4.6

Model ID: glm-4.6

  • Context length: 200,000 tokens

  • Max output: 128,000 tokens

  • Quantization: fp8

  • Input modalities: text

  • Output modalities: text

  • Language support: Chinese, English

  • Function calling: Yes

  • Structured output: Yes

  • Thinking mode: Yes

  • Tool streaming: Yes
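GLM-4.6 is the only model here with thinking mode and tool streaming. The exact request field is provider-specific; the `thinking` object used by some GLM-serving APIs is assumed below, not confirmed for FreeInference:

```python
# Request sketch for GLM-4.6 with thinking mode on. The `thinking`
# field name is an assumption borrowed from other GLM-serving APIs.
request = {
    "model": "glm-4.6",
    "messages": [
        {"role": "user", "content": "Plan a refactor of this module."}
    ],
    "thinking": {"type": "enabled"},  # assumed field name
    "stream": True,  # with tool streaming, tool-call deltas arrive incrementally
}
```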


DeepSeek R1

Model ID: deepseek-r1

  • Context length: 64,000 tokens

  • Max output: 8,000 tokens

  • Quantization: bf16

  • Input modalities: text

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes


Qwen3 Coder 30B

Model ID: qwen3-coder-30b

  • Context length: 32,768 tokens

  • Max output: 8,192 tokens

  • Quantization: bf16

  • Input modalities: text

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes


MiniMax M2

Model ID: minimax-m2

  • Context length: 196,608 tokens

  • Max output: 8,192 tokens

  • Quantization: bf16

  • Input modalities: text

  • Output modalities: text

  • Function calling: Yes

  • Structured output: Yes
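All models advertise structured output. Assuming the widely used `response_format` / `json_schema` request shape (again an assumption for this provider), a request that constrains a model's output to a fixed schema would look like:

```python
# Structured-output sketch: constrain the reply to a JSON schema.
# The `response_format` shape is assumed, not confirmed for FreeInference;
# the lint_finding schema is a hypothetical example.
schema = {
    "type": "object",
    "properties": {
        "file": {"type": "string"},
        "line": {"type": "integer"},
        "severity": {"type": "string", "enum": ["info", "warning", "error"]},
    },
    "required": ["file", "line", "severity"],
}

request = {
    "model": "qwen3-coder-30b",
    "messages": [
        {"role": "user", "content": "Lint this file and report the worst issue."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "lint_finding", "schema": schema},
    },
}
```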


Switching Models

To use a different model, change the model name in your IDE configuration:

  • Cursor: Select the model from the dropdown in settings.

  • Codex: Edit ~/.codex/config.toml:

    model = "glm-4.6"  # Change to any model ID

  • Roo Code / Kilo Code: Select the model from the dropdown in the extension settings.
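For Codex, a fuller config sketch follows the CLI's `model_providers` convention; the provider name, base URL, and env key below are hypothetical, not confirmed values for FreeInference:

```toml
# Sketch of ~/.codex/config.toml with a custom provider block.
model = "glm-4.6"
model_provider = "freeinference"

[model_providers.freeinference]
name = "FreeInference"
base_url = "https://api.freeinference.example/v1"  # hypothetical URL
env_key = "FREEINFERENCE_API_KEY"                  # hypothetical env var
```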