Chat

Generate text from text.


POST http://localhost:33322/v1/chat/completions

Create chat completion

Given a list of messages belonging to a chat history, generate a response.

Required attributes

  • Name
    messages
    Type
    array
    Description

    A list of messages representing a chat history. It is essentially the context used by the model to generate a response.

  • Name
    model
    Type
    string
    Description

    The model used for chat completions.

    • If the model name is "default", the chat model from the configuration is used (see Documentation » Configuration for details).

    • If the model name follows the format repo-owner/repo-name/model-name, the indicated model is used. If it is not present locally, it is downloaded from Hugging Face. If it cannot be downloaded, Edgen responds with an error. Example: "TheBloke/neural-chat-7B-v3-3-GGUF/neural-chat-7b-v3-3.Q4_K_M.gguf".

    • If the model name contains just a file name, e.g. "my-model.bin", Edgen tries to use the file of this name in the data directory defined in the configuration. If the file does not exist there, Edgen responds with an error.
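The three naming rules above can be sketched as a small classifier. This is only an illustration of the documented rules: the function name classify_model_name and its return labels are hypothetical, and Edgen's actual parsing may differ.

```python
def classify_model_name(name: str) -> str:
    """Classify a model name according to the three documented rules.

    Hypothetical helper for illustration only; not part of Edgen.
    """
    if name == "default":
        # Rule 1: use the chat model from the configuration.
        return "configured-default"
    parts = name.split("/")
    if len(parts) == 3 and all(parts):
        # Rule 2: repo-owner/repo-name/model-name, fetched if not present.
        return "hugging-face"
    if len(parts) == 1 and name:
        # Rule 3: a bare file name, looked up in the data directory.
        return "local-file"
    # Anything else is not covered by the documented rules.
    return "unrecognized"


for example in (
    "default",
    "TheBloke/neural-chat-7B-v3-3-GGUF/neural-chat-7b-v3-3.Q4_K_M.gguf",
    "my-model.bin",
):
    print(example, "->", classify_model_name(example))
```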

Optional attributes

  • Name
    frequency_penalty
    Type
    float
    Description

    A number in [-2.0, 2.0]. A higher number decreases the likelihood that the model repeats itself.

  • Name
    logit_bias
    Type
    map
    Description

    A map of token IDs to [-100.0, +100.0]. Adds a percentage bias to those tokens before sampling; a value of -100.0 prevents the token from being selected at all. You could use this to, for example, prevent the model from emitting profanity.

  • Name
    max_tokens
    Type
    integer
    Description

    The maximum number of tokens to generate. If not set, generation terminates at the first stop token or the end of sentence.

  • Name
    n
    Type
    integer
    Description

    How many choices to generate for each prompt. 1 by default. You can use this to generate several sets of completions for the same prompt.

  • Name
    presence_penalty
    Type
    float
    Description

    A number in [-2.0, 2.0]. Positive values "increase the model's likelihood to talk about new topics."

  • Name
    seed
    Type
    integer
    Description

    The random number generator seed for the session. Random by default.

  • Name
    stop
    Type
    string or array
    Description

    A stop phrase or set of stop phrases. The server will pause emitting completions if it appears to be generating a stop phrase, and will terminate completions if a full stop phrase is detected. Stop phrases are never emitted to the client.

  • Name
    stream
    Type
    bool
    Description

    If true, stream the output as it is computed by the server, instead of returning the whole completion at the end. You can use this to live-stream completions to a client.

  • Name
    response_format
    Type
    string
    Description

    The format of the response stream. This is always assumed to be JSON, which is non-conformant with the OpenAI spec.

  • Name
    temperature
    Type
    float
    Description

    The sampling temperature, in [0.0, 2.0]. Higher values make the output more random.

  • Name
    top_p
    Type
    float
    Description

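    Nucleus sampling. Only the most likely tokens whose combined probability reaches this value are considered for sampling. For example, with a value of 0.1, only the tokens comprising the top 10% of probability mass are used, which prevents sampling of very low-probability tokens.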

  • Name
    tools
    Type
    array
    Description

    A list of tools made available to the model.

  • Name
    tool_choice
    Type
    string
    Description

    If present, the tool that the user has chosen to use. OpenAI states:

    • none prevents any tool from being used,
    • auto allows any tool to be used, or
    • you can provide a description of the tool entirely instead of a name.

  • Name
    user
    Type
    string
    Description

    A unique identifier for the end user creating this request. It is intended for telemetry and user tracking; Edgen itself does not use it.

  • Name
    one_shot
    Type
    bool
    Description

    Indicates whether this is an isolated request, with no associated past or future context. This may allow for optimisations in some implementations. Default: false.

  • Name
    context_hint
    Type
    integer
    Description

    A hint for how big a context will be.

    Warning

    An unsound hint may severely drop performance and/or inference quality, and in some cases even cause Edgen to crash. Do not set this value unless you know what you are doing.
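As a rough sketch of how the optional attributes combine with the required ones, the helper below builds a request body and checks the numeric ranges listed above before anything is sent. The function name build_chat_request and the client-side validation are assumptions made for illustration; this reference does not say whether Edgen itself rejects or adjusts out-of-range values.

```python
import json

# Inclusive ranges taken from the attribute descriptions above.
DOCUMENTED_RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
}


def build_chat_request(messages, model="default", **options):
    """Return a request body dict (hypothetical helper, not part of Edgen).

    Raises ValueError if an option falls outside its documented range.
    """
    for name, (low, high) in DOCUMENTED_RANGES.items():
        value = options.get(name)
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
    for token_id, bias in (options.get("logit_bias") or {}).items():
        if not (-100.0 <= bias <= 100.0):
            raise ValueError(f"logit_bias for token {token_id} must be in [-100.0, 100.0]")
    body = {"model": model, "messages": messages}
    # Leave out options that were not set, so the server applies its defaults.
    body.update({key: value for key, value in options.items() if value is not None})
    return body


body = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=64,
    stop=["\n\n"],
    seed=42,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialised with json.dumps and sent as the body of the POST request.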


Request

POST /v1/chat/completions
curl http://localhost:33322/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key-required" \
-d '{
  "model": "default",
  "messages": [
    {
      "role": "system",
      "content": "You are EdgenChat, a helpful AI assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}'

Response

{"id":"f403d6f4-4826-40b1-8798-77e4837e5041","choices":[{"message":{"role":"assistant","content":"Hello! How can I help you today?","name":null,"tool_calls":null},"finish_reason":null,"index":0}],"created":1708958149,"model":"main","system_fingerprint":"edgen-0.1.3","object":"text_completion","usage":{"completion_tokens":0,"prompt_tokens":0,"total_tokens":0}}
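The same request can be sketched in Python with only the standard library. The URL, headers and body mirror the curl example; the function names create_chat_completion and first_reply are hypothetical, and the sample below is a shortened copy of the response shown above rather than live output.

```python
import json
import urllib.request

BASE_URL = "http://localhost:33322/v1"  # address used in the curl example


def create_chat_completion(messages, model="default", base_url=BASE_URL):
    """POST a chat history to the endpoint and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer no-key-required",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def first_reply(completion):
    """Return the assistant text of the first choice in a completion."""
    return completion["choices"][0]["message"]["content"]


# Shortened copy of the documented response, used here instead of a live call.
sample = json.loads(
    '{"id":"f403d6f4-4826-40b1-8798-77e4837e5041",'
    '"choices":[{"message":{"role":"assistant",'
    '"content":"Hello! How can I help you today?",'
    '"name":null,"tool_calls":null},"finish_reason":null,"index":0}],'
    '"created":1708958149,"model":"main"}'
)
print(first_reply(sample))
```

With a running server, first_reply(create_chat_completion([{"role": "user", "content": "Hello!"}])) would return the generated text.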

GET http://localhost:33322/v1/chat/completions/status

Chat completion status

Shows the current status of the chat completions endpoint (e.g. downloads).

Response attributes

  • Name
    active_model
    Type
    string
    Description

    The model that is currently active for this endpoint.

  • Name
    download_ongoing
    Type
    bool
    Description

    The model for this endpoint is currently being downloaded.

  • Name
    download_progress
    Type
    number
    Description

    The progress of the ongoing model download in percent.

  • Name
    last_errors
    Type
    string[]
    Description

    Errors that occurred recently on this endpoint.

Request

GET /v1/chat/completions/status
curl http://localhost:33322/v1/chat/completions/status \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key-required"

Response

{"active_model":"neural-chat-7b-v3-3.Q4_K_M.gguf","download_ongoing":false,"download_progress":100,"last_errors":["Custom { kind: PermissionDenied, error: \"verboten\" }"]}
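A client might turn the status response into a one-line summary, for example to show download progress before sending the first chat request. The sketch below is only illustrative: the function name describe_status and the wording of the summaries are hypothetical, and the field names follow the example response for this endpoint.

```python
import json


def describe_status(status):
    """Summarise a status response in one line (hypothetical helper)."""
    model = status["active_model"]
    if status.get("download_ongoing"):
        return f"downloading {model}: {status['download_progress']}%"
    errors = status.get("last_errors") or []
    if errors:
        return f"{model} ready, {len(errors)} recent error(s)"
    return f"{model} ready"


# Sample modelled on the example response for this endpoint.
sample = json.loads(
    r'''{"active_model":"neural-chat-7b-v3-3.Q4_K_M.gguf",
         "download_ongoing":false,
         "download_progress":100,
         "last_errors":["Custom { kind: PermissionDenied, error: \"verboten\" }"]}'''
)
print(describe_status(sample))
```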