Streaming chat completion
Streams the response from a chat completion model, one message at a time.
You will typically use this action to build backends for chat clients.
Example
The example above shows a Flow that provides chat completions to a chat client. We're using PostgreSQL vector search to enable RAG, and SQL Server to store and retrieve the chat history so that users can ask follow-up questions.
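Conceptually, the backend forwards the model's output to the client chunk by chunk instead of waiting for the complete reply. A minimal sketch of that pattern is below; `fake_stream` is a hypothetical stand-in for the real model stream, and the names are illustrative, not part of the action itself:

```python
# A minimal sketch of the streaming pattern. fake_stream stands in
# for the provider's real token stream; all names are illustrative.
def fake_stream():
    # Yields the completion in small chunks, as a model stream would.
    for token in ["Stream", "ing ", "works", "!"]:
        yield token

def relay(stream):
    # Forward each chunk to the chat client as it arrives, and keep
    # the full text so it can be appended to the chat history later.
    parts = []
    for chunk in stream:
        print(chunk, end="", flush=True)  # send to the chat client
        parts.append(chunk)
    return "".join(parts)

full_reply = relay(fake_stream())
```

In a real backend the loop body would write each chunk to the client connection (for example over server-sent events or a websocket) instead of printing it.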
Properties
| Name | Type | Description |
|---|---|---|
| Title | Optional | The title of the action. |
| Connection | Required | Defines the connection to OpenAI. |
| Model id | Required | The ID of the model to use. |
| System prompt | Optional | A system-level instruction that guides the model's behavior and response style. |
| User prompt | Required | The input message from the user, which the model processes to generate a response. |
| History | Optional | A record of past interactions that provides conversational context, helping the model maintain continuity across follow-up questions. |
| Context | Optional | Additional context information to influence the response (if supported). |
| Temperature | Optional | Controls the randomness of responses. Lower values produce more focused output; higher values produce more creative output. Default: 0.7. |
| Max completion tokens | Optional | Limits the number of tokens in the model's response. |
| Result variable name | Optional | The name of the variable that stores the generated AI response. Default: `response`. |
| Description | Optional | Additional details or notes about the chat completion setup. |
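To illustrate how the prompt-related properties fit together, here is a hedged sketch of combining the System prompt, History, and User prompt into the single message list a chat model consumes. The function name and message shapes are assumptions for illustration, not the action's actual implementation:

```python
# Illustrative only: how System prompt, History, and User prompt
# might be merged into one ordered message list for the model.
def build_messages(system_prompt, history, user_prompt):
    messages = []
    if system_prompt:
        # The system prompt, if set, leads the conversation.
        messages.append({"role": "system", "content": system_prompt})
    # History supplies past user/assistant turns, oldest first,
    # so the model can handle follow-up questions.
    messages.extend(history)
    # The current user prompt always comes last.
    messages.append({"role": "user", "content": user_prompt})
    return messages

msgs = build_messages(
    "You are a helpful assistant.",
    [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello! How can I help?"},
    ],
    "Summarize our chat so far.",
)
```

Storing each completed exchange back into History (as the SQL Server step in the example does) is what lets the next call carry the full conversation forward.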