Edgen is configured through a file where you can define your models' locations, select which model each endpoint uses, set the number of threads Edgen may use, and more.

| Config Name | Description | Default Value |
|---|---|---|
| `threads` | Number of CPU threads for processing | `<number_physical_cores> - 1` |
| `default_uri` | Default URI for communication | `http://` |
| `chat_completions_models_dir` | Directory for chat completions models | `<DATA_DIR>/edgen/models/chat/completions` |
| `chat_completions_model_name` | Name of chat completions model | `neural-chat-7b-v3-3.Q4_K_M.gguf` |
| `chat_completions_model_repo` | HuggingFace repo for chat completions | `TheBloke/neural-chat-7B-v3-3-GGUF` |
| `audio_transcriptions_models_dir` | Directory for audio transcriptions models | `<DATA_DIR>/edgen/models/audio/transcriptions` |
| `audio_transcriptions_model_name` | Name of audio transcriptions model | `ggml-distil-small.en.bin` |
| `audio_transcriptions_model_repo` | HuggingFace repo for audio transcriptions | `distil-whisper/distil-small.en` |
| `gpu_policy` | Policy to choose how a model gets loaded | `!always_device` |
| `max_request_size` | Maximum size a request can have | 100 Megabytes |
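
The settings above live together in one config file. The following is a minimal sketch, not a verbatim copy of a real config: YAML syntax is assumed here (it matches the `!`-tagged GPU policy values in the table), and all values are the defaults listed above except `threads`, which is shown for an 8-core machine.

```yaml
# Hypothetical Edgen config sketch -- field names taken from the table above;
# the file's location and exact syntax depend on your installation.
threads: 7                     # <number_physical_cores> - 1 on an 8-core machine
chat_completions_model_name: neural-chat-7b-v3-3.Q4_K_M.gguf
chat_completions_model_repo: TheBloke/neural-chat-7B-v3-3-GGUF
audio_transcriptions_model_name: ggml-distil-small.en.bin
audio_transcriptions_model_repo: distil-whisper/distil-small.en
gpu_policy: !always_device     # see "GPU policies" below
  overflow_to_cpu: false
```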

Configuration Paths for DATA_DIR

| Platform | Value | Example |
|---|---|---|
| Linux | `$XDG_DATA_HOME/_project_path_` or `$HOME/.local/share/_project_path_` | `/home/Alex/.local/share/edgen` |
| macOS | `$HOME/Library/Application Support/_project_path_` | `/Users/Alex/Library/Application Support/com.EdgenAI.Edgen` |
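
On Linux, the `DATA_DIR` resolution from the table can be reproduced in a shell one-liner. This is a sketch based on the XDG convention shown above (macOS uses the Application Support path instead):

```shell
# Fall back to $HOME/.local/share when $XDG_DATA_HOME is unset,
# then append the project path ("edgen" on Linux, per the example above).
data_dir="${XDG_DATA_HOME:-$HOME/.local/share}/edgen"
echo "Chat models directory: $data_dir/models/chat/completions"
```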

Model Name and Repo

Model name and repo define which model to use and how to obtain it automatically. If you download the model yourself, copy it into the corresponding model directory and set the `model_name` setting to the file name. In this case, the repo setting is purely informational, for instance:

| Config Name | Your Value |
|---|---|
| `chat_completions_model_name` | `my-model.Q4_K_M.gguf` (the file you copied in; illustrative name) |
| `chat_completions_model_repo` | informational only, e.g. the repo you downloaded from |

If you prefer to let Edgen manage your models, you need to fill in the correct model name and repo, e.g.

| Config Name | Your Value |
|---|---|
| `chat_completions_model_name` | `neural-chat-7b-v3-3.Q4_K_M.gguf` |
| `chat_completions_model_repo` | `TheBloke/neural-chat-7B-v3-3-GGUF` |

In this case, if the model does not exist in the model directory, Edgen will automatically download it for you. You can use the model manager (API Reference » Models) to inspect and delete automatically downloaded models.

GPU policies

Edgen supports the following policies, each with its own sub-settings:

  • `!always_device` - Models are always loaded to a GPU.
    • `overflow_to_cpu` - If true, a model that cannot be loaded to a GPU is loaded into system memory instead; otherwise, Edgen frees GPU memory until the model can be loaded. WARNING: neither of these behaviors is currently implemented.
  • `!always_cpu` - Models are always loaded into system memory.
    • `overflow_to_device` - If true, a model that cannot be loaded into system memory is loaded to a GPU instead; otherwise, Edgen frees system memory until the model can be loaded. WARNING: neither of these behaviors is currently implemented.
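
As a sketch of how a policy and its sub-setting combine, assuming YAML syntax (which the `!`-tagged values above suggest):

```yaml
# Load models into system memory, spilling to a GPU when RAM runs out.
# Note: per the warning above, the overflow behavior is not yet implemented.
gpu_policy: !always_cpu
  overflow_to_device: true
```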