TGI Server
The HyperDex SDK provides the HyperDex-Serve Python package for serving LLMs. HyperDex-Serve exposes the LPU (or LPU-GPU hybrid system) through a RESTful API. The API follows the same structure as HuggingFace’s Text-Generation-Inference (TGI) API, which is widely used for serving open-source LLMs.
Requirements and Install Guide
The requirements and installation steps are the same as for the Python API.
Serving Model
hyperdex-serve provides an HTTP server that implements the TGI API. You can start the server with the command below.
Usage
Command Line Interface
Once the HyperDex-Serve package is installed, you can use the hdex-serve command with the following options:
Serving Model
Below is an example of serving a HuggingFace model. The model to be served must be pre-compiled using the HyperDex Compiler SDK.
Tip
After starting the server with the above command, you can test it using the client code from TGI. Example code is available in /opt/hyperdex/examples for reference.
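As a rough illustration of what such a client might look like, the sketch below builds and sends a request in the shape used by TGI's `/generate` endpoint (a JSON body with `inputs` and `parameters`, and a `generated_text` field in the response). It assumes that HyperDex-Serve exposes this endpoint at the same path, which the text above implies but does not spell out. The helper names `build_generate_request` and `generate` are hypothetical, and sending the API key as an `Authorization: Bearer` header is an assumption; check the examples shipped in /opt/hyperdex/examples for the actual header.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_new_tokens=32, api_key=None):
    """Build a TGI-style POST request for the /generate endpoint.

    base_url is the server address, e.g. "http://<host>:<port>", matching
    the --host and --port values the server was started with.
    """
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Assumption: the key passed via --api-key is expected as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def generate(base_url, prompt, **kwargs):
    """Send the request and return the generated text from the JSON response."""
    request = build_generate_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]
```

With a server running, a call such as `generate("http://<host>:<port>", "Hello, my name is")` would return the completion as a string. Only the standard library is used here so the sketch has no extra dependencies.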
Descriptions of HyperDex-Serve Arguments
Arguments | Description |
---|---|
-h, --help | Show the help message and exit. |
--host HOST | Host name. |
--port PORT | Port number. |
--allow-credentials | Allow credentials. |
--allowed-origins ALLOWED_ORIGINS | Allowed origins. |
--allowed-methods ALLOWED_METHODS | Allowed methods. |
--allowed-headers ALLOWED_HEADERS | Allowed headers. |
--api-key API_KEY | If provided, the server will require this key to be presented in the header. |
--served-model-path SERVED_MODEL_PATH | The path to the model checkpoint used in the API. |
--served-model-name SERVED_MODEL_NAME | The model name used in the API. If not specified, the model name will be the same as the HuggingFace name. |
--served-lpu-device-num SERVED_LPU_DEVICE_NUM | The total number of LPU devices used in the API. |
--served-gpu-device-num SERVED_GPU_DEVICE_NUM | The total number of GPU devices used in the API. |
--response-role RESPONSE_ROLE | The role name to return if request.add_generation_prompt=true. |
--ssl-keyfile SSL_KEYFILE | The file path to the SSL key file. |
--ssl-certfile SSL_CERTFILE | The file path to the SSL cert file. |
--root-path ROOT_PATH | FastAPI root_path when the app is behind a path-based routing proxy. |
--verbose | Launch the program in verbose mode. |
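To show how these arguments combine, here is a minimal sketch that assembles a command line from a few of the flags in the table. The helper `build_serve_command` is hypothetical, the model path and other values are placeholders, and treating `--served-model-path` as the one mandatory argument is an assumption rather than documented behavior. The executable name `hdex-serve` is taken from the Command Line Interface section.

```python
import shlex


def build_serve_command(model_path, host=None, port=None,
                        lpu_devices=None, gpu_devices=None,
                        verbose=False, executable="hdex-serve"):
    """Assemble an argument list using flags from the table above."""
    argv = [executable, "--served-model-path", str(model_path)]
    optional_flags = [
        ("--host", host),
        ("--port", port),
        ("--served-lpu-device-num", lpu_devices),
        ("--served-gpu-device-num", gpu_devices),
    ]
    for flag, value in optional_flags:
        if value is not None:  # leave out flags the caller did not set
            argv += [flag, str(value)]
    if verbose:
        argv.append("--verbose")
    return argv


# Placeholder values for illustration only.
command = build_serve_command("/path/to/compiled-model",
                              host="0.0.0.0", port=8080,
                              lpu_devices=1, verbose=True)
print(shlex.join(command))
```

The resulting list can be passed directly to `subprocess.run(command)` to launch the server from a script. Building it as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.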