Google Gemini API
Google uses a different API parameter list than OpenAI.
Google Token Count
Google caches its LLM responses and delivers a handful of tokens in each sub-reply. This prevents the Synthonnel software from calculating comparable values for some performance metrics.
Tokens per Second and Token Count for Google responses are therefore not directly comparable to other Inference Providers.
Time to 1st Token and Total Time are accurate and are comparable to other Inference Providers.
Parameters
The exact parameter name must be used, followed by the equals sign ( = ), then the value:
maxOutputTokens = 1024
Whitespace such as tabs and spaces is ignored:
topK = 20
temperature = 1.0
Comments have a hash ( # ) at the beginning of the line and are ignored:
#not_used = null
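Putting these rules together, a complete parameter block might look like the following sketch. The values shown are illustrative only, not recommendations:
# generation settings for this connection
temperature = 0.7
maxOutputTokens = 1024
topP = 0.95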
Definitions
The official Google Gemini documentation has details about the parameters themselves.
https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini#parameters
Unused
- Some parameters (such as role and parts) are handled by the Synthonnel software.
- Some parameters (such as inlineData) are complex datatypes and not used by Synthonnel at this time.
- Some parameters (such as fileUri and mimeType) do not make sense in Synthonnel usage so are not included.
Supported
Parameter | Data Type | Range |
---|---|---|
temperature | float | 0.0 - 1.0 (gemini-1.0-pro-001) |
temperature | float | 0.0 - 2.0 (gemini-1.0-pro-002) |
maxOutputTokens | integer | 1 to model dependent |
topK | integer | 1 to 40 |
topP | float | 0.0 to 1.0 |
stopSequences | string or array | valid string or array |
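As a sketch, a parameter block that sets each supported parameter to an in-range value could look like this. The values are illustrative, and stopSequences is shown as a single plain string on the assumption that one is accepted:
temperature = 1.0
maxOutputTokens = 2048
topK = 40
topP = 0.95
stopSequences = END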
The Google Gemini API has some configurable guardrail parameters for responses. These parameters accept only specific values:
Parameter | Allowed Values |
---|---|
category | HARM_CATEGORY_SEXUALLY_EXPLICIT |
 | HARM_CATEGORY_HATE_SPEECH |
 | HARM_CATEGORY_HARASSMENT |
 | HARM_CATEGORY_DANGEROUS_CONTENT |
threshold | BLOCK_NONE |
 | BLOCK_LOW_AND_ABOVE |
 | BLOCK_MEDIUM_AND_ABOVE |
 | BLOCK_ONLY_HIGH |
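A guardrail setting in the same parameter syntax might look like the sketch below. Whether Synthonnel accepts more than one category/threshold pair per connection is not covered here, so a single pair is shown as an assumption:
# loosen blocking for a single category (illustrative values)
category = HARM_CATEGORY_DANGEROUS_CONTENT
threshold = BLOCK_ONLY_HIGH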