# vLLM model host for ADK agents
Supported in ADK: Python v0.1.0

Tools such as [vLLM](https://github.com/vllm-project/vllm) allow you to host models efficiently and serve them as an OpenAI-compatible API endpoint. You can use vLLM models through the [LiteLLM](/adk-docs/agents/models/litellm/) library for Python.

## Setup

1. **Deploy Model:** Deploy your chosen model using vLLM (or a similar tool). Note the API base URL (e.g., `https://your-vllm-endpoint.run.app/v1`).
    * *Important for ADK Tools:* When deploying, ensure the serving tool supports and enables OpenAI-compatible tool/function calling. For vLLM, this might involve flags like `--enable-auto-tool-choice` and potentially a specific `--tool-call-parser`, depending on the model. Refer to the vLLM documentation on Tool Use.
2. **Authentication:** Determine how your endpoint handles authentication (e.g., API key, bearer token).

## Integration Example

The following example shows how to use a vLLM endpoint with ADK agents.

```python
import subprocess

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

# --- Example Agent using a model hosted on a vLLM endpoint ---

# Endpoint URL provided by your vLLM deployment
api_base_url = "https://your-vllm-endpoint.run.app/v1"

# Model name as recognized by *your* vLLM endpoint configuration
model_name_at_endpoint = "hosted_vllm/google/gemma-3-4b-it"  # Example from vllm_test.py

# Authentication (Example: using gcloud identity token for a Cloud Run deployment)
# Adapt this based on your endpoint's security
try:
    gcloud_token = subprocess.check_output(
        ["gcloud", "auth", "print-identity-token", "-q"]
    ).decode().strip()
    auth_headers = {"Authorization": f"Bearer {gcloud_token}"}
except Exception as e:
    print(f"Warning: Could not get gcloud token - {e}. Endpoint might be unsecured or require different auth.")
    auth_headers = None  # Or handle error appropriately

agent_vllm = LlmAgent(
    model=LiteLlm(
        model=model_name_at_endpoint,
        api_base=api_base_url,
        # Pass authentication headers if needed
        extra_headers=auth_headers,
        # Alternatively, if endpoint uses an API key:
        # api_key="YOUR_ENDPOINT_API_KEY"
    ),
    name="vllm_agent",
    instruction="You are a helpful assistant running on a self-hosted vLLM endpoint.",
    # ... other agent parameters
)
```
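
If you switch between bearer-token and API-key authentication, it can help to assemble the `LiteLlm` arguments in one place. The sketch below is one possible way to do that using only the standard library. `build_litellm_kwargs` is a hypothetical helper, not part of ADK or LiteLLM; it emits only the keyword names already used in the example above (`model`, `api_base`, `extra_headers`, `api_key`), and it assumes your endpoint expects the `hosted_vllm/` model prefix shown there.

```python
from typing import Any, Dict, Optional

# Provider prefix used in the example above; assumed to apply to your endpoint.
VLLM_PREFIX = "hosted_vllm/"


def build_litellm_kwargs(
    model_id: str,
    api_base: str,
    bearer_token: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for LiteLlm(...) (hypothetical helper)."""
    # Add the provider prefix only if the caller did not include it.
    model = model_id if model_id.startswith(VLLM_PREFIX) else VLLM_PREFIX + model_id

    # Strip trailing slashes so the base URL has a single canonical form.
    kwargs: Dict[str, Any] = {"model": model, "api_base": api_base.rstrip("/")}

    # Bearer tokens travel in an Authorization header; API keys use api_key.
    if bearer_token:
        kwargs["extra_headers"] = {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs


# Usage with the imports from the example above (left commented out so this
# sketch runs without ADK installed):
# agent_model = LiteLlm(**build_litellm_kwargs(
#     "google/gemma-3-4b-it",
#     "https://your-vllm-endpoint.run.app/v1",
#     bearer_token=gcloud_token,
# ))
```

Keeping this logic in a plain function that returns a dictionary means you can unit-test the authentication choice without contacting the endpoint.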