Configuration
Edgen reads its configuration from a file, where you can define your models' locations, select which model to use for each endpoint, set the number of threads Edgen can use, and more.
Config Name | Description | Default Value |
---|---|---|
threads | Number of CPU threads for processing | <number_physical_cores> -1 |
default_uri | Default URI for communication | http://127.0.0.1:33322 |
chat_completions_models_dir | Directory for chat completions models | <DATA_DIR>/edgen/models/chat/completions |
chat_completions_model_name | Name of chat completions model | neural-chat-7b-v3-3.Q4_K_M.gguf |
chat_completions_model_repo | HuggingFace repo for chat completions | TheBloke/neural-chat-7B-v3-3-GGUF |
audio_transcriptions_models_dir | Directory for audio transcriptions models | <DATA_DIR>/edgen/models/audio/transcriptions |
audio_transcriptions_model_name | Name of audio transcriptions model | ggml-distil-small.en.bin |
audio_transcriptions_model_repo | HuggingFace repo for audio transcriptions | distil-whisper/distil-small.en |
gpu_policy | Policy to choose how a model gets loaded | !always_device |
max_request_size | Maximum size a request can have | 100 Megabytes |
Configuration Paths for DATA_DIR
Platform | Value | Example |
---|---|---|
Linux | $XDG_DATA_HOME/_project_path_ or $HOME/.local/share/_project_path_ | /home/Alex/.local/share/edgen |
macOS | $HOME/Library/Application Support/_project_path_ | /Users/Alex/Library/Application Support/com.EdgenAI.Edgen |
Windows | {FOLDERID_RoamingAppData}\_project_path_\data | C:\Users\Alex\AppData\Roaming\EdgenAI\Edgen\data |
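
The table above can be read as a small lookup procedure. The following is a minimal sketch, not Edgen's own code: the function name `data_dir` is hypothetical, the project path for each platform is taken from the Example column, and it assumes the `APPDATA` environment variable points at the RoamingAppData folder on Windows.

```python
import ntpath
import posixpath


def data_dir(platform, env):
    """Return DATA_DIR for "linux", "macos" or "windows".

    `env` is a mapping of environment variables (for example os.environ).
    Project paths are copied from the Example column of the table.
    """
    if platform == "linux":
        # Prefer $XDG_DATA_HOME, fall back to $HOME/.local/share.
        base = env.get("XDG_DATA_HOME") or posixpath.join(env["HOME"], ".local", "share")
        return posixpath.join(base, "edgen")
    if platform == "macos":
        return posixpath.join(
            env["HOME"], "Library", "Application Support", "com.EdgenAI.Edgen"
        )
    if platform == "windows":
        # Assumption: APPDATA resolves to {FOLDERID_RoamingAppData}.
        return ntpath.join(env["APPDATA"], "EdgenAI", "Edgen", "data")
    raise ValueError(f"unsupported platform: {platform}")
```

Passing the environment in as a mapping keeps the sketch easy to try with the example user `Alex` on any machine, for instance `data_dir("linux", {"HOME": "/home/Alex"})`.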
Model Name and Repo
Model name and repo define the model to use and how to obtain it automatically. If you download the model yourself, copy it to the corresponding model directory and set the model_name setting to the file name. In this case the repo is purely informative, for instance:
Config Name | Your Value |
---|---|
chat_completions_models_dir | <DATA_DIR>/edgen/models/chat/completions |
chat_completions_model_name | my-fancy-model |
chat_completions_model_repo | ModelMaster/fancy-model-1.v1-1.GGUF |
If you prefer to let Edgen manage your models, fill in the correct model name and repo, for example:
Config Name | Your Value |
---|---|
chat_completions_models_dir | <DATA_DIR>/edgen/models/chat/completions |
chat_completions_model_name | fancy-model-1.v1-1.gguf |
chat_completions_model_repo | ModelMaster/fancy-model-1.v1-1.GGUF |
In this case, if the model does not exist in the model directory, Edgen will automatically download it for you. You can use the model manager (API Reference » Models) to inspect and delete automatically downloaded models.
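
The lookup order described in this section can be sketched as follows. This is an illustration of the documented behavior only, not Edgen's implementation: `resolve_model` is a hypothetical helper, and the "download" result is just a label built from the repo and file name, not a real URL.

```python
from pathlib import Path


def resolve_model(models_dir, model_name, model_repo):
    """Decide whether a model is used from disk or has to be downloaded.

    Returns ("local", path) if the file exists in the model directory,
    otherwise ("download", "<repo>/<name>") as a label for what to fetch.
    """
    candidate = Path(models_dir) / model_name
    if candidate.is_file():
        # A manually copied or previously downloaded file is used as-is;
        # the repo setting plays no role in this branch.
        return ("local", str(candidate))
    # The file is missing, so the repo and name must both be correct.
    return ("download", f"{model_repo}/{model_name}")
```

The sketch shows why the repo is only informative for a model you copied yourself: the repo is consulted only when the file is missing from the model directory.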
GPU policies
Edgen supports the following policies, each with its own sub-setting:
!always_device
- Models will always get loaded to a GPU.
- overflow_to_cpu: If true, when a model can't be loaded to a GPU, it gets loaded to system memory. Otherwise, Edgen will free GPU memory until the model can be loaded. WARNING: neither of these behaviors is currently implemented.
!always_cpu
- Models will always get loaded to system memory.
- overflow_to_device: If true, when a model can't be loaded to system memory, it gets loaded to a GPU. Otherwise, Edgen will free system memory until the model can be loaded. WARNING: neither of these behaviors is currently implemented.
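
The intended semantics of the two policies and their overflow sub-settings can be summarized as a small decision function. This is a sketch of the behavior described above, which Edgen warns is not implemented yet. The function name `placement`, the `fits` flag and the returned labels are all hypothetical.

```python
def placement(policy, overflow, fits):
    """Return where a model would go under the described policy.

    policy:   "always_device" or "always_cpu"
    overflow: value of overflow_to_cpu / overflow_to_device
    fits:     whether the model fits in the policy's primary memory
    """
    targets = {
        "always_device": ("gpu", "cpu"),
        "always_cpu": ("cpu", "gpu"),
    }
    primary, fallback = targets[policy]
    if fits:
        return primary
    if overflow:
        # Overflow enabled: load into the other kind of memory instead.
        return fallback
    # Overflow disabled: free primary memory until the model can be loaded.
    return f"free_{primary}"
```

For example, `placement("always_device", True, False)` yields `"cpu"`, while the same call with `overflow` set to `False` yields `"free_gpu"`.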