Chat
Generate text from text.
Create chat completion
Given a list of messages belonging to a chat history, generate a response.
Required attributes
- Name
messages
- Type
- array
- Description
A list of messages representing a chat history. It is essentially the context used by the model to generate a response.
- Name
model
- Type
- string
- Description
The model used for chat completions.
If the model name is "default", the chat model from the configuration is used (see Documentation » Configuration for details).
If the model name follows the format repo-owner/repo-name/model-name, the indicated model is used and, if it is not present locally, it will be downloaded from Hugging Face. If it cannot be downloaded, Edgen responds with an error. Example: "TheBloke/neural-chat-7B-v3-3-GGUF/neural-chat-7b-v3-3.Q4_K_M.gguf".
If the model name contains just a file name, e.g. "my-model.bin", Edgen will try to use the file of that name in the data directory as defined in the configuration. If the file does not exist there, Edgen responds with an error.
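The three accepted forms of the model field can be sketched as request-body fragments. The Hugging Face path below is the documented example; "my-model.bin" is a hypothetical file name:

```json
{ "model": "default" }
{ "model": "TheBloke/neural-chat-7B-v3-3-GGUF/neural-chat-7b-v3-3.Q4_K_M.gguf" }
{ "model": "my-model.bin" }
```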
Optional attributes
- Name
frequency_penalty
- Type
- float
- Description
A number in [-2.0, 2.0]. A higher number decreases the likelihood that the model repeats itself.
- Name
logit_bias
- Type
- map
- Description
A map of token IDs to bias values in [-100.0, +100.0]. Adds a bias to those tokens before sampling; a value of -100.0 prevents the token from being selected at all. You could use this to, for example, prevent the model from emitting profanity.
- Name
max_tokens
- Type
- integer
- Description
The maximum number of tokens to generate. If None, generation terminates at the first stop token or the end of the sequence.
- Name
n
- Type
- integer
- Description
How many completion choices to generate for the input. 1 by default. You can use this to generate several sets of completions for the same prompt.
- Name
presence_penalty
- Type
- float
- Description
A number in [-2.0, 2.0]. Positive values "increase the model's likelihood to talk about new topics."
- Name
seed
- Type
- integer
- Description
The random number generator seed for the session. Random by default.
- Name
stop
- Type
- string or array
- Description
A stop phrase or set of stop phrases. The server will pause emitting completions if it appears to be generating a stop phrase, and will terminate completions if a full stop phrase is detected. Stop phrases are never emitted to the client.
- Name
stream
- Type
- bool
- Description
If true, stream the output as it is computed by the server, instead of returning the whole completion at the end. You can use this to live-stream completions to a client.
- Name
response_format
- Type
- string
- Description
The format of the response stream. This is always assumed to be JSON, which is non-conformant with the OpenAI spec.
- Name
temperature
- Type
- float
- Description
The sampling temperature, in [0.0, 2.0]. Higher values make the output more random.
- Name
top_p
- Type
- float
- Description
Nucleus sampling. If you set this value to 0.1 (10%), only the tokens making up the top 10% of probability mass are considered for sampling, preventing very low-probability tokens from being selected.
- Name
tools
- Type
- array
- Description
A list of tools made available to the model.
- Name
tool_choice
- Type
- string
- Description
If present, the tool that the user has chosen to use. OpenAI states:
- none prevents any tool from being used,
- auto allows any tool to be used, or
- you can provide a description of the tool entirely instead of a name.
- Name
user
- Type
- string
- Description
A unique identifier for the end user creating this request. This is used for telemetry and user tracking, and is unused within Edgen.
- Name
one_shot
- Type
- bool
- Description
Indicate if this is an isolated request, with no associated past or future context. This may allow for optimisations in some implementations. Default: false.
- Name
context_hint
- Type
- integer
- Description
A hint for how big a context will be.
Warning
An unsound hint may severely drop performance and/or inference quality, and in some cases even cause Edgen to crash. Do not set this value unless you know what you are doing.
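As a sketch, a request body combining several of the optional attributes above might look like the following. The parameter values are illustrative only, not recommendations:

```json
{
  "model": "default",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "frequency_penalty": 0.5,
  "stop": ["</s>"],
  "stream": false
}
```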
Request
curl http://localhost:33322/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key-required" \
-d '{
"model": "default",
"messages": [
{
"role": "system",
"content": "You are EdgenChat, a helpful AI assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
Response
{"id":"f403d6f4-4826-40b1-8798-77e4837e5041","choices":[{"message":{"role":"assistant","content":"Hello! How can I help you today?","name":null,"tool_calls":null},"finish_reason":null,"index":0}],"created":1708958149,"model":"main","system_fingerprint":"edgen-0.1.3","object":"text_completion","usage":{"completion_tokens":0,"prompt_tokens":0,"total_tokens":0}}
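To receive the completion incrementally instead, set stream to true in the request body; the rest of the request is unchanged. A sketch based on the example above:

```json
{
  "model": "default",
  "messages": [
    { "role": "system", "content": "You are EdgenChat, a helpful AI assistant." },
    { "role": "user", "content": "Hello!" }
  ],
  "stream": true
}
```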
Chat completion status
Shows the current status of the chat completions endpoint (e.g. downloads).
Response attributes
- Name
active_model
- Type
- string
- Description
The model that is currently active for this endpoint.
- Name
download_ongoing
- Type
- bool
- Description
The model for this endpoint is currently being downloaded.
- Name
download_progress
- Type
- number
- Description
The progress of the ongoing model download in percent.
- Name
last_errors
- Type
- string[]
- Description
Errors that occurred recently on this endpoint.
Request
curl http://localhost:33322/v1/chat/completions/status \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key-required"
Response
{"active_model":"neural-chat-7b-v3-3.Q4_K_M.gguf","download_ongoing":false,"download_progress":100,"last_errors":["Custom { kind: PermissionDenied, error: \"verboten\" }"]}
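While a model download is in progress, the status response would instead report the ongoing download. A sketch with hypothetical values:

```json
{
  "active_model": "neural-chat-7b-v3-3.Q4_K_M.gguf",
  "download_ongoing": true,
  "download_progress": 42,
  "last_errors": []
}
```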