Intella Assist: OpenAI-compatible Chat Completions request parameters

Overview


Intella Assist supports adjusting several parameters used in OpenAI-compatible Chat Completions API requests.
These parameters control aspects such as randomness, response length, and repetition behavior.

Supported Parameters


temperature
Type: Double (range 0.0 to 2.0)
Controls the randomness of generated responses.
A value of 0 produces deterministic results, 1 gives balanced variability, and higher values (up to 2) increase randomness and creativity.

top_p
Type: Double (range 0.0 to 1.0)
An alternative to the temperature parameter.
It uses “nucleus sampling”: the model only considers tokens within the top top_p probability mass, so a value of 0.1 restricts sampling to the tokens that make up the top 10% of probability. In most cases, only one of temperature or top_p should be used.

max_tokens
Type: Integer (1 to model-specific limit)
Defines the maximum number of tokens (roughly words or pieces of words) that the model can generate in its response.
The upper limit depends on the model: for example, GPT-4-1106 has a context window of approximately 128k tokens, which the prompt and the generated response together must fit within.

presence_penalty
Type: Double (range –2.0 to 2.0)
Encourages the model to introduce new topics.
Higher values reduce the likelihood of repeating earlier content and push the model to explore different ideas.

frequency_penalty
Type: Double (range –2.0 to 2.0)
Reduces repetition of identical words or phrases.
Higher values result in more varied and diverse language output.
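
For reference, these parameters are sent at the top level of the Chat Completions request body, alongside the required model and messages fields. The snippet below is purely illustrative; the model name and values are placeholders rather than recommended settings, and top_p is omitted because it should not be combined with temperature.

{
  "model": "gpt-4-1106-preview",
  "messages": [
    { "role": "user", "content": "Summarize the attached email thread." }
  ],
  "temperature": 0.2,
  "max_tokens": 1024,
  "presence_penalty": 0.0,
  "frequency_penalty": 0.5
}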

Configuration


To adjust these parameters, add one or more of the following keys to your user.prefs file:

IntellaAssistReqParamTemperature
IntellaAssistReqParamTopP
IntellaAssistReqParamMaxTokens
IntellaAssistReqParamPresencePenalty
IntellaAssistReqParamFrequencyPenalty
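
As an illustration, assuming the user.prefs file uses the standard key=value syntax of a Java properties file, the entries could look like the following (the values are placeholders within the documented ranges, not recommended settings):

IntellaAssistReqParamTemperature=0.2
IntellaAssistReqParamMaxTokens=1024
IntellaAssistReqParamPresencePenalty=0.0
IntellaAssistReqParamFrequencyPenalty=0.5

Define only the keys you actually want to override; omitted keys are simply not sent to the LLM provider (see Notes below).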

File Location


The user.prefs file is stored in one of the following locations, depending on the product you are using:

C:\Users\<USERNAME>\AppData\Roaming\Intella\prefs\user.prefs
C:\Users\<USERNAME>\AppData\Roaming\Intella Investigator\prefs\user.prefs
C:\Users\<USERNAME>\AppData\Roaming\Intella Connect\prefs\user.prefs

Notes
- Changes take effect after restarting the product.
- Use either temperature or top_p, but not both.
- Increasing max_tokens allows longer answers but may raise processing time and token usage.
- If these parameters are not defined, they are not included in the request sent to the LLM provider.
In that case, the provider’s own default values (if any) will apply automatically.