Structured outputs are a feature that forces a language model's response to match a developer-supplied JSON Schema exactly, using constrained decoding to make invalid tokens impossible to sample.
What is Structured Outputs (Constrained Decoding)?
Structured Outputs is a model capability that guarantees a language model's response conforms to a developer-provided JSON Schema. Rather than asking the model in the prompt to please return JSON and hoping it complies, the API enforces the schema during token generation so the returned text always parses and always matches the declared field names, types, and required keys.
The underlying mechanism is constrained decoding, sometimes called constrained sampling. At each generation step the model produces a probability distribution over the vocabulary, and the decoder masks out any token that would violate the grammar derived from the schema. Because forbidden tokens are removed before sampling, the model literally cannot emit a string that breaks the structure.
- Schema adherence is enforced at decode time, not requested in the prompt.
- Output is valid JSON that maps directly to a typed object.
- Differs from older JSON mode, which only guaranteed syntactic JSON, not a specific shape.
How constrained decoding works
The provider compiles the JSON Schema into a formal grammar, typically a context-free grammar or a finite-state machine. During inference the decoder tracks which tokens are legal given the partial output so far. After the model computes logits for the next token, tokens that the grammar disallows are set to negative infinity, so their probability after the softmax is effectively zero. Sampling then proceeds normally over the remaining valid tokens.
This approach changes the search space rather than the model weights. The model still chooses content based on its learned distribution, but the set of choices is restricted to those that keep the output on a path toward a complete, schema-valid document. The result is that field names, nested objects, enums, and required properties all appear exactly as specified.
- Schema is translated into a grammar the decoder can check token by token.
- Invalid next-tokens are masked before sampling, guaranteeing structural validity.
- Content quality still depends on the model; only the shape is guaranteed.
Reliability and how it differs from JSON mode
On OpenAI's evaluation of complex JSON Schema following, the gpt-4o-2024-08-06 model with Structured Outputs scored 100 percent schema adherence, compared with substantially lower rates for prompting alone. The earlier JSON mode guaranteed only that the output was syntactically valid JSON; it did not guarantee that the JSON matched a particular schema, so a response could be well-formed yet missing required fields or using the wrong types.
Structured Outputs is exposed in two places: as a response_format of type json_schema for general completions, and as strict: true inside a function or tool definition for function calling. Setting strict to true makes the arguments the model produces conform to the function's parameter schema, which removes a common class of agent failures where a tool call cannot be parsed.
- JSON mode: valid JSON syntax only.
- Structured Outputs: valid JSON that matches a named schema, with required keys and types enforced.
- Function calling gains the same guarantee through strict: true on the tool definition.
Code example
The example below requests a response constrained to a schema using the OpenAI Python SDK. The json_schema response format with strict set to true makes every field in the schema mandatory and disallows extra properties.
When to use structured outputs
Structured outputs fit any pipeline where a downstream program consumes the model's response: extraction into a database, populating UI components, classification with a fixed label set, or tool and function calling inside an agent. By moving validation into the decoder, the application can skip retry-and-reparse loops that handle malformed JSON.
Limitations apply. The guarantee covers shape, not truth, so a model can still return a schema-valid object with hallucinated values. Very large or deeply recursive schemas can be rejected or slow compilation, and not every model snapshot supports the feature. Treating the schema as a contract for structure while validating semantic correctness separately is the safe pattern.
- Ideal for extraction, classification, and agent tool calls.
- Removes parse-and-retry error handling for JSON.
- Does not validate that values are factually correct.
Key takeaways
- Structured outputs force model responses to match a JSON Schema by masking invalid tokens during decoding.
- Constrained decoding changes what the model can sample, not its weights, so structure is guaranteed and content is not.
- It supersedes JSON mode, which guaranteed only valid JSON syntax rather than a specific schema.
- On OpenAI's complex-schema evaluation, gpt-4o-2024-08-06 with structured outputs reached 100 percent schema adherence.
- Use strict: true for function calling to make tool-call arguments reliably parseable in agents.
Frequently asked questions
Related terms
Related reading
Put the idea into practice
MemX is an AI memory app built on these ideas: store anything, skip the folders, and find it again by asking in plain English.
Try MemX Free