Configuration

The Edgen configuration. It is read from a file where you can define your models' locations, select which model to use for each endpoint, the number of threads Edgen can use and more.

Config Name	Description	Default Value
`threads`	Number of CPU threads for processing	<number_physical_cores> -1
`default_uri`	Default URI for communication	http://127.0.0.1:33322
`chat_completions_models_dir`	Directory for chat completions models	`<DATA_DIR>/edgen/models/chat/completions`
`chat_completions_model_name`	Name of chat completions model	neural-chat-7b-v3-3.Q4_K_M.gguf
`chat_completions_model_repo`	HuggingFace repo for chat completions	TheBloke/neural-chat-7B-v3-3-GGUF
`audio_transcriptions_models_dir`	Directory for audio transcriptions models	`<DATA_DIR>/edgen/models/audio/transcriptions`
`audio_transcriptions_model_name`	Name of audio transcriptions model	ggml-distil-small.en.bin
`audio_transcriptions_model_repo`	HuggingFace repo for audio transcriptions	distil-whisper/distil-small.en
`gpu_policy`	Policy to choose how a model gets loaded	!always_device
`max_request_size`	Maximum size a request can have	100 Megabytes

Configuration Paths for DATA_DIR

Platform	Value	Example
Linux	`$XDG_DATA_HOME/_project_path_` or `$HOME/.local/share/_project_path_`	`/home/Alex/.local/share/edgen`
macOS	`$HOME/Library/Application Support/_project_path_`	`/Users/Alex/Library/Application Support/com.EdgenAI.Edgen`
Windows	`{FOLDERID_RoamingAppData}\_project_path_\data`	`C:\Users\Alex\AppData\Roaming\EdgenAI\Edgen\data`

Model Name and Repo

Model name and repo define the model to use and how to obtain it automatically. If you download the model yourself you just have to copy it to the corresponding model directory and set the model_name setting to the file name. The repo has only informative character in this case, for instance:

Config Name	Your Value
`chat_completions_models_dir`	`<DATA_DIR>/edgen/models/chat/completions`
`chat_completions_model_name`	my-fancy-model
`chat_completions_model_repo`	ModelMaster/fancy-model-1.v1-1.GGUF

If you prefer to let Edgen manage your models, you need to fill in the correct model name and repo, e.g.

Config Name	Your Value
`chat_completions_models_dir`	`<DATA_DIR>/edgen/models/chat/completions`
`chat_completions_model_name`	fancy-model-1.v1-1.gguf
`chat_completions_model_repo`	ModelMaster/fancy-model-1.v1-1.GGUF

In this case, if the model does not exist in the model directory, Edgen will automatically download for you. You can use the model manager (API Reference » Models) to inspect and delete automatically downloaded models.

GPU policies

Edgen supports the following policies, each with their own sub-settings:

!always_device - Models will always get loaded to a GPU.
- overflow_to_cpu - If true, when a model can't be loaded to a GPU, it gets loaded to system memory. Else, Edgen will free GPU memory until the model can be loaded. WARNING: neither of these systems are currently implemented.
!always_cpu - Models will always get loaded to system memory.
- overflow_to_device - If true, when a model can't be loaded to system memory, it gets loaded to a GPU. Else, Edgen will free system memory until the model can be loaded. WARNING: neither of these systems are currently implemented.