# Authentication & API Key Management kanoa supports multiple authentication methods depending on your backend and environment. This guide covers everything from quick setup to production deployment and security best practices. ## Quick Setup ### Environment Variables (Recommended) The simplest and most secure way to manage API keys: ```bash # Gemini (Google) - Get at https://aistudio.google.com/apikey export GOOGLE_API_KEY="your-google-api-key" # Claude (Anthropic) - Get at https://console.anthropic.com/ export ANTHROPIC_API_KEY="your-anthropic-api-key" # Molmo (Local) - No API key needed! # Models stored in ~/.cache/kanoa/models/molmo/ by default export MOLMO_MODEL_PATH="$HOME/.cache/kanoa/models/molmo" # Optional override ``` ### Using `.env` Files (Local Development) ⚠️ **Security Note**: API keys generate costs for you or your organization. Protect them carefully to avoid unauthorized usage and unexpected charges. #### Option 1: User Config Directory (Recommended) Store `.env` outside the repository for maximum security: ```bash # Create user config directory mkdir -p ~/.config/kanoa # Create .env file cat > ~/.config/kanoa/.env << EOF GOOGLE_API_KEY=your-google-api-key ANTHROPIC_API_KEY=your-anthropic-api-key EOF ``` Load in your code: ```python from pathlib import Path from dotenv import load_dotenv # Load from user config config_dir = Path.home() / ".config" / "kanoa" load_dotenv(config_dir / ".env") from kanoa import AnalyticsInterpreter interpreter = AnalyticsInterpreter(backend='gemini') ``` #### Option 2: Repo Root with Pre-commit Protection For integration tests and CI/CD, you can use repo-root `.env`: ```bash # Create .env in repo root (already in .gitignore) echo "GOOGLE_API_KEY=your-key" > .env echo "ANTHROPIC_API_KEY=your-key" >> .env ``` Load in your code: ```python from dotenv import load_dotenv load_dotenv() # Loads .env from current directory from kanoa import AnalyticsInterpreter interpreter = AnalyticsInterpreter(backend='gemini') ``` The `.env` file is in `.gitignore`, and we use `detect-secrets` pre-commit hook as backup protection. kanoa backends automatically detect environment variables, so you don't need to pass keys explicitly. --- ## Backend-Specific Authentication ### Gemini (Google) #### Local Development ##### Option 1: API Key (Simplest) ```bash export GOOGLE_API_KEY="your-api-key" ``` Then in Python: ```python interpreter = AnalyticsInterpreter(backend='gemini') # Automatically uses GOOGLE_API_KEY ``` ##### Option 2: Application Default Credentials (ADC) For Google Cloud projects: ```bash gcloud auth application-default login ``` Then in Python: ```python interpreter = AnalyticsInterpreter(backend='gemini') # Automatically uses ADC ``` > **Note for Contributors**: Integration tests can use ADC instead of API keys. Just run `gcloud auth application-default login` before running tests. #### Production / CI/CD ##### Service Account with Workload Identity Federation (Recommended) 1. Create a Service Account with `roles/aiplatform.user` 2. Configure Workload Identity Federation for GitHub Actions 3. Use the `google-github-actions/auth` action ##### Service Account Key (Less Secure) 1. Create a Service Account 2. Generate a JSON key 3. Store as GitHub Secret 4. Set `GOOGLE_APPLICATION_CREDENTIALS` in CI ```yaml - name: Authenticate to Google Cloud uses: google-github-actions/auth@v1 with: credentials_json: ${{ secrets.GCP_SA_KEY }} ``` ### Claude (Anthropic) #### Local Development ```bash export ANTHROPIC_API_KEY="your-api-key" ``` Or in Python: ```python interpreter = AnalyticsInterpreter( backend='claude', api_key='your-api-key' # Optional if env var is set ) ``` #### Production / CI/CD Store the API key as a GitHub Secret: ```yaml env: ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} ``` ### Molmo (Local Inference) Molmo runs entirely locally - **no API key required**! #### Default Model Location kanoa stores Molmo models in: ```bash ~/.cache/kanoa/models/molmo/ ``` This follows the XDG Base Directory Specification, consistent with Hugging Face and other tools. #### Download Model ##### Option 1: Download to default location (recommended) ```bash # Set the default XDG-compliant path export MOLMO_MODEL_PATH="$HOME/.cache/kanoa/models/molmo" # 1. Install CLI pip install huggingface_hub # 2. Login (Required for gated models like Molmo) # Get token at: https://huggingface.co/settings/tokens huggingface-cli login # 3. Download model huggingface-cli download allenai/Molmo-7B-D-0924 --local-dir "$MOLMO_MODEL_PATH" ``` ##### Option 2: Custom location ```bash # Set custom path export MOLMO_MODEL_PATH="/path/to/your/models" # Download to custom location huggingface-cli download allenai/Molmo-7B-D-0924 --local-dir "$MOLMO_MODEL_PATH" ``` ##### Option 3: Using Python ```python from huggingface_hub import snapshot_download from pathlib import Path # Download to default XDG location model_path = Path.home() / ".cache" / "kanoa" / "models" / "molmo" snapshot_download(repo_id="allenai/Molmo-7B-D", local_dir=str(model_path)) ``` #### Environment Variables - `MOLMO_MODEL_PATH`: Override default model directory (optional) - `XDG_CACHE_HOME`: Change XDG cache root (defaults to `~/.cache`) #### Requirements - Python 3.11+ - PyTorch (install for your hardware) - GPU recommended (but CPU works) ⚠️ **Disk Space**: Molmo-7B requires ~14GB --- ## Production Deployment ### Cloud Secret Management For production deployments, use dedicated secret management services: #### Google Cloud Secret Manager (Recommended for Gemini) ```python from google.cloud import secretmanager def get_api_key(project_id: str, secret_id: str) -> str: client = secretmanager.SecretManagerServiceClient() name = f"projects/{project_id}/secrets/{secret_id}/versions/latest" response = client.access_secret_version(request={"name": name}) return response.payload.data.decode("UTF-8") # Use in kanoa api_key = get_api_key("my-project", "gemini-api-key") interpreter = AnalyticsInterpreter(backend='gemini', api_key=api_key) ``` #### AWS Secrets Manager ```python import boto3 import json def get_secret(secret_name: str) -> dict: client = boto3.client('secretsmanager') response = client.get_secret_value(SecretId=secret_name) return json.loads(response['SecretString']) # Use in kanoa secrets = get_secret("kanoa-api-keys") interpreter = AnalyticsInterpreter( backend='claude', api_key=secrets['anthropic_api_key'] ) ``` #### HashiCorp Vault ```python import hvac client = hvac.Client(url='http://localhost:8200') client.token = 'your-vault-token' # Read secret secret = client.secrets.kv.v2.read_secret_version(path='kanoa/api-keys') api_key = secret['data']['data']['google_api_key'] interpreter = AnalyticsInterpreter(backend='gemini', api_key=api_key) ``` --- ## Security Best Practices ### Core Principles 1. **Never hardcode API keys** in source code 2. **Use environment variables** for local development 3. **Leverage secret management services** for production 4. **Implement least privilege** access control 5. **Regular key rotation** and monitoring ### Key Rotation Rotate API keys regularly (every 90 days recommended): ```bash # 1. Generate new key from provider console # 2. Update environment variable or secret manager # 3. Test with new key # 4. Revoke old key ``` ### Monitoring Monitor API usage for unusual patterns: ```python from kanoa import AnalyticsInterpreter interpreter = AnalyticsInterpreter(backend='gemini', track_costs=True) # After using costs = interpreter.get_cost_summary() print(f"Total cost: ${costs['total_cost_usd']:.4f}") print(f"Total calls: {costs['total_calls']}") ``` ### Least Privilege Use separate API keys for: - Development - Staging - Production - CI/CD pipelines ### Never Expose Client-Side ⚠️ **Never** include API keys in: - Frontend JavaScript - Mobile apps - Public repositories - Docker images (use secrets injection) --- ## Troubleshooting ### "API key not found" ```bash # Check if environment variable is set echo $GOOGLE_API_KEY # If empty, set it export GOOGLE_API_KEY="your-key" # Or use .env file python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.getenv('GOOGLE_API_KEY'))" ``` ### "Invalid API key" 1. Verify key is correct (copy-paste from console) 2. Check for extra whitespace 3. Ensure key hasn't been revoked 4. Verify billing is enabled (for cloud providers) ### "Permission denied" Ensure your Service Account has the correct roles: - Gemini: `roles/aiplatform.user` - Vertex AI: `roles/aiplatform.user` + `roles/storage.objectViewer` (for PDFs) ### "Your default credentials were not found" Run: ```bash gcloud auth application-default login ``` --- ## For Contributors ### Development Setup 1. **Clone the repository**: ```bash git clone https://github.com/lhzn-io/kanoa.git cd kanoa ``` 2. **Create `.env` file** (not tracked in git): ```bash # .env GOOGLE_API_KEY=your-google-api-key ANTHROPIC_API_KEY=your-anthropic-api-key # Molmo models stored in ~/.cache/kanoa/models/molmo/ by default # MOLMO_MODEL_PATH=/custom/path # Optional override ``` 3. **Install development dependencies**: ```bash pip install -e .[dev] ``` 4. **Run integration tests**: ```bash # Test Gemini pytest -m gemini tests/integration/test_gemini_integration.py # Test Claude pytest -m claude tests/integration/test_claude_integration.py # Test Molmo (requires local model) pytest -m molmo tests/integration/test_molmo_integration.py ``` ### Contributing Guidelines When contributing to kanoa: 1. ✅ **DO**: Use environment variables or `.env` files 2. ✅ **DO**: Add `.env` and `.secrets/` to `.gitignore` 3. ✅ **DO**: Document any new API key requirements 4. ❌ **DON'T**: Commit API keys or secrets 5. ❌ **DON'T**: Hardcode keys in examples or tests 6. ❌ **DON'T**: Include keys in screenshots or documentation --- ## References This authentication strategy is based on best practices from: - **LangChain**: Environment variables, secret management integration - **OpenAI SDK**: `.env` files, never hardcode keys - **Anthropic SDK**: `python-dotenv`, GitHub secret scanning - **Google Cloud**: Secret Manager, ADC (Application Default Credentials)