Health Controller

The Health Controller provides system health monitoring, status checks, and diagnostic information for the VChata platform.

Base Path

/health

Overview

This controller provides essential system monitoring capabilities:
  • 💚 Health Checks - System and service health monitoring
  • 📊 Status Information - Detailed system status and metrics
  • 🔍 Diagnostic Tools - System diagnostics and troubleshooting
  • 📈 Performance Metrics - System performance and resource usage
  • 🛡️ Security Status - Security and authentication status
  • 🔧 Service Dependencies - External service connectivity checks

Authentication & Authorization

  • 🔓 Public Access - The basic health check (GET /health) is publicly accessible for load balancers and monitoring systems
  • 🔐 Diagnostic Access - The detailed, status, metrics, and diagnostic endpoints are called with a Bearer token in the Authorization header
  • 🏢 Organization Scoped - Organization-specific health checks require authentication

Health Checks

Basic Health Check

GET /health
{
  "status": "healthy",
  "timestamp": "2024-01-20T16:00:00Z",
  "uptime": 86400,
  "version": "1.2.3",
  "environment": "production"
}
Description: Basic health check endpoint for load balancers and monitoring systems.
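
A monitoring script might reduce this response to a pass/fail flag plus a readable uptime. The sketch below is illustrative only: the function name summarize_health is hypothetical, and it assumes uptime is reported in seconds, which the sample suggests (86400) but the page does not state.

```python
import json
from datetime import timedelta


def summarize_health(body):
    """Parse a GET /health response body and derive a few convenience fields."""
    payload = json.loads(body)
    return {
        # Only the literal value "healthy" is treated as passing.
        "ok": payload.get("status") == "healthy",
        # Assumption: uptime is a number of seconds.
        "uptime": str(timedelta(seconds=payload.get("uptime", 0))),
        "version": payload.get("version"),
    }


sample = (
    '{"status": "healthy", "timestamp": "2024-01-20T16:00:00Z", '
    '"uptime": 86400, "version": "1.2.3", "environment": "production"}'
)
summary = summarize_health(sample)
print(summary)
```

Fetching the body itself (for example with urllib.request) is left out so the sketch stays independent of any particular host or network setup.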

Detailed Health Check

GET /health/detailed
Authorization: Bearer <token>
{
  "status": "healthy",
  "timestamp": "2024-01-20T16:00:00Z",
  "uptime": 86400,
  "version": "1.2.3",
  "environment": "production",
  "services": {
    "database": {
      "status": "healthy",
      "responseTime": 12,
      "lastChecked": "2024-01-20T16:00:00Z"
    },
    "redis": {
      "status": "healthy",
      "responseTime": 5,
      "lastChecked": "2024-01-20T16:00:00Z"
    },
    "stripe": {
      "status": "healthy",
      "responseTime": 150,
      "lastChecked": "2024-01-20T16:00:00Z"
    },
    "facebook": {
      "status": "healthy",
      "responseTime": 200,
      "lastChecked": "2024-01-20T16:00:00Z"
    }
  },
  "metrics": {
    "cpu": {
      "usage": 45.2,
      "load": 1.8
    },
    "memory": {
      "used": 2048,
      "total": 4096,
      "percentage": 50.0
    },
    "disk": {
      "used": 1024,
      "total": 2048,
      "percentage": 50.0
    }
  }
}
Description: Detailed health check with service dependencies and system metrics.
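
One way to use the services block is to flag dependencies that are either not healthy or slower than a chosen limit. This is a sketch, not part of the API: flag_services and the limit of 100 are hypothetical, and responseTime is compared as a bare number because the page does not state its unit.

```python
def flag_services(detailed, max_response_time):
    """Return the names of services that are not healthy or exceed the limit."""
    flagged = []
    for name, info in detailed.get("services", {}).items():
        unhealthy = info.get("status") != "healthy"
        slow = info.get("responseTime", 0) > max_response_time
        if unhealthy or slow:
            flagged.append(name)
    return sorted(flagged)


# Trimmed copy of the sample response above.
detailed = {
    "services": {
        "database": {"status": "healthy", "responseTime": 12},
        "redis": {"status": "healthy", "responseTime": 5},
        "stripe": {"status": "healthy", "responseTime": 150},
        "facebook": {"status": "healthy", "responseTime": 200},
    }
}
flagged = flag_services(detailed, max_response_time=100)
print(flagged)
```

With the sample values, only the two external integrations exceed the hypothetical limit.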

System Status

Get System Status

GET /health/status
Authorization: Bearer <token>
{
  "status": "operational",
  "timestamp": "2024-01-20T16:00:00Z",
  "services": {
    "api": {
      "status": "operational",
      "uptime": 86400,
      "requestsPerMinute": 1500,
      "averageResponseTime": 250
    },
    "database": {
      "status": "operational",
      "connections": 45,
      "maxConnections": 100,
      "queryTime": 12
    },
    "cache": {
      "status": "operational",
      "hitRate": 85.5,
      "memoryUsage": 512,
      "maxMemory": 1024
    },
    "external": {
      "stripe": {
        "status": "operational",
        "lastCheck": "2024-01-20T16:00:00Z",
        "responseTime": 150
      },
      "facebook": {
        "status": "operational",
        "lastCheck": "2024-01-20T16:00:00Z",
        "responseTime": 200
      }
    }
  },
  "incidents": [],
  "maintenance": []
}
Description: Comprehensive system status including all services and external dependencies.

Get Service Status

GET /health/services/database
Authorization: Bearer <token>
{
  "service": "database",
  "status": "healthy",
  "timestamp": "2024-01-20T16:00:00Z",
  "details": {
    "type": "PostgreSQL",
    "version": "14.5",
    "host": "db.vchata.com",
    "port": 5432,
    "database": "vchata_production",
    "connections": {
      "active": 45,
      "idle": 12,
      "max": 100
    },
    "performance": {
      "queryTime": 12,
      "slowQueries": 2,
      "cacheHitRatio": 95.8
    },
    "replication": {
      "status": "healthy",
      "lag": 0
    }
  }
}
Description: Detailed status information for a specific service, identified by the final path segment (database in this example).

Performance Metrics

Get Performance Metrics

GET /health/metrics?timeRange=1h
Authorization: Bearer <token>
{
  "timestamp": "2024-01-20T16:00:00Z",
  "timeRange": "1h",
  "metrics": {
    "system": {
      "cpu": {
        "usage": 45.2,
        "load": [1.8, 1.5, 1.2],
        "cores": 8
      },
      "memory": {
        "used": 2048,
        "total": 4096,
        "percentage": 50.0,
        "swap": {
          "used": 0,
          "total": 0
        }
      },
      "disk": {
        "used": 1024,
        "total": 2048,
        "percentage": 50.0,
        "io": {
          "read": 125.5,
          "write": 89.2
        }
      },
      "network": {
        "bytesIn": 1024000,
        "bytesOut": 2048000,
        "packetsIn": 15000,
        "packetsOut": 12000
      }
    },
    "application": {
      "requests": {
        "total": 90000,
        "perMinute": 1500,
        "errors": 45,
        "errorRate": 0.05
      },
      "responseTime": {
        "average": 250,
        "p50": 180,
        "p95": 800,
        "p99": 1500
      },
      "throughput": {
        "requestsPerSecond": 25,
        "bytesPerSecond": 1024000
      }
    },
    "database": {
      "connections": {
        "active": 45,
        "idle": 12,
        "max": 100
      },
      "queries": {
        "total": 450000,
        "perSecond": 125,
        "slow": 25
      },
      "performance": {
        "averageQueryTime": 12,
        "cacheHitRatio": 95.8,
        "indexUsage": 98.5
      }
    }
  }
}
Description: Comprehensive performance metrics for system monitoring. The timeRange query parameter (1h in this example) selects the reporting window.
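
The derived fields in the application block can be cross-checked against the raw counters. The sample values are consistent with errorRate being a percentage (45 of 90000 requests) and requestsPerSecond being perMinute divided by 60; that reading is inferred from the sample rather than documented, and the helper names below are hypothetical.

```python
import math


def error_rate_percent(total, errors):
    """Errors as a percentage of total requests (0.0 when there is no traffic)."""
    if total == 0:
        return 0.0
    return 100.0 * errors / total


def per_second(per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


# Values copied from the sample response above.
requests = {"total": 90000, "perMinute": 1500, "errors": 45, "errorRate": 0.05}

computed_rate = error_rate_percent(requests["total"], requests["errors"])
computed_rps = per_second(requests["perMinute"])

# Both derived values match the sample under the percentage assumption.
print(math.isclose(computed_rate, requests["errorRate"]), computed_rps)
```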

Get Historical Metrics

GET /health/metrics/historical?start=2024-01-20T00:00:00Z&end=2024-01-20T23:59:59Z&granularity=1h
Authorization: Bearer <token>
{
  "start": "2024-01-20T00:00:00Z",
  "end": "2024-01-20T23:59:59Z",
  "granularity": "1h",
  "data": [
    {
      "timestamp": "2024-01-20T00:00:00Z",
      "cpu": 42.1,
      "memory": 48.5,
      "disk": 49.8,
      "requests": 1400,
      "responseTime": 245
    },
    {
      "timestamp": "2024-01-20T01:00:00Z",
      "cpu": 38.7,
      "memory": 47.2,
      "disk": 49.8,
      "requests": 1200,
      "responseTime": 230
    }
  ]
}
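Description: Historical performance metrics for trend analysis. The start and end query parameters bound the period, and granularity sets the interval between data points.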
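
A client could build the request path from its three query parameters and then scan the returned data points. The following is a minimal sketch under stated assumptions: historical_metrics_path and slowest_point are hypothetical helpers, and the data list is a trimmed copy of the sample response.

```python
from urllib.parse import urlencode


def historical_metrics_path(start, end, granularity):
    """Build the request path; colons are left unescaped to match the example."""
    query = urlencode(
        {"start": start, "end": end, "granularity": granularity}, safe=":"
    )
    return "/health/metrics/historical?" + query


def slowest_point(data):
    """Return the data point with the highest responseTime."""
    return max(data, key=lambda point: point["responseTime"])


path = historical_metrics_path(
    "2024-01-20T00:00:00Z", "2024-01-20T23:59:59Z", "1h"
)

data = [
    {"timestamp": "2024-01-20T00:00:00Z", "cpu": 42.1, "responseTime": 245},
    {"timestamp": "2024-01-20T01:00:00Z", "cpu": 38.7, "responseTime": 230},
]
worst = slowest_point(data)
print(path)
print(worst["timestamp"])
```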

Diagnostic Tools

System Diagnostics

GET /health/diagnostics
Authorization: Bearer <token>
{
  "timestamp": "2024-01-20T16:00:00Z",
  "diagnostics": {
    "system": {
      "os": "Linux",
      "version": "Ubuntu 20.04.3 LTS",
      "kernel": "5.4.0-89-generic",
      "architecture": "x86_64"
    },
    "runtime": {
      "node": "18.17.0",
      "v8": "10.2.154.26-node.26",
      "platform": "linux",
      "arch": "x64"
    },
    "application": {
      "name": "vchata-backend",
      "version": "1.2.3",
      "environment": "production",
      "uptime": 86400
    },
    "dependencies": {
      "database": "PostgreSQL 14.5",
      "redis": "Redis 6.2.7",
      "stripe": "stripe@12.0.0"
    }
  }
}
Description: System diagnostic information for troubleshooting.

Connectivity Test

POST /health/connectivity-test
Authorization: Bearer <token>
Content-Type: application/json

{
  "services": ["database", "redis", "stripe", "facebook"],
  "timeout": 5000
}

Response:

{
  "timestamp": "2024-01-20T16:00:00Z",
  "results": [
    {
      "service": "database",
      "status": "success",
      "responseTime": 12,
      "error": null
    },
    {
      "service": "redis",
      "status": "success",
      "responseTime": 5,
      "error": null
    },
    {
      "service": "stripe",
      "status": "success",
      "responseTime": 150,
      "error": null
    },
    {
      "service": "facebook",
      "status": "success",
      "responseTime": 200,
      "error": null
    }
  ],
  "summary": {
    "total": 4,
    "successful": 4,
    "failed": 0,
    "averageResponseTime": 91.75
  }
}
Description: Tests connectivity to external services and dependencies.
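
The summary block can be recomputed from the results array, which is a quick way to sanity-check a response. This sketch assumes that any status other than "success" counts as failed and that averageResponseTime is a plain mean of the reported response times; neither rule is stated on this page, and summarize_results is a hypothetical name.

```python
def summarize_results(results):
    """Recompute the summary block from a list of per-service results."""
    successful = sum(1 for r in results if r["status"] == "success")
    times = [r["responseTime"] for r in results if r["responseTime"] is not None]
    return {
        "total": len(results),
        "successful": successful,
        "failed": len(results) - successful,
        # Mean of the available response times, or None if there are none.
        "averageResponseTime": sum(times) / len(times) if times else None,
    }


# Values copied from the sample response above.
results = [
    {"service": "database", "status": "success", "responseTime": 12, "error": None},
    {"service": "redis", "status": "success", "responseTime": 5, "error": None},
    {"service": "stripe", "status": "success", "responseTime": 150, "error": None},
    {"service": "facebook", "status": "success", "responseTime": 200, "error": None},
]
summary = summarize_results(results)
print(summary)
```

For the sample data the mean is (12 + 5 + 150 + 200) / 4 = 91.75, matching the summary shown.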

Organization Health

Get Organization Health

GET /health/organization/org_abc123
Authorization: Bearer <token>
{
  "organizationId": "org_abc123",
  "status": "healthy",
  "timestamp": "2024-01-20T16:00:00Z",
  "health": {
    "accounts": {
      "total": 5,
      "active": 5,
      "inactive": 0,
      "issues": 0
    },
    "integrations": {
      "social": {
        "connected": 3,
        "healthy": 3,
        "issues": 0
      },
      "billing": {
        "status": "active",
        "paymentMethod": "valid",
        "subscription": "active"
      }
    },
    "usage": {
      "apiCalls": {
        "last24h": 15000,
        "limit": 100000,
        "percentage": 15.0
      },
      "storage": {
        "used": 1024,
        "limit": 10240,
        "percentage": 10.0
      }
    },
    "alerts": [],
    "recommendations": [
      {
        "type": "optimization",
        "message": "Consider upgrading your plan for better performance",
        "priority": "low"
      }
    ]
  }
}
Description: Organization-specific health information and recommendations. The organization ID is the final path segment (org_abc123 in this example).
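
The percentage fields in the usage block follow from the used and limit values, so a client can recompute them and warn before a quota is reached. In this sketch the helper names and the 80 percent warning level are hypothetical; only the field names come from the sample response.

```python
def percent_used(used, limit):
    """Usage as a percentage of the limit, rounded to one decimal place."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return round(100.0 * used / limit, 1)


def near_limit(usage, warn_at=80.0):
    """Return the usage categories at or above the warning percentage."""
    api = usage["apiCalls"]
    storage = usage["storage"]
    percentages = {
        "apiCalls": percent_used(api["last24h"], api["limit"]),
        "storage": percent_used(storage["used"], storage["limit"]),
    }
    return [name for name, pct in percentages.items() if pct >= warn_at]


# Values copied from the sample response above.
usage = {
    "apiCalls": {"last24h": 15000, "limit": 100000, "percentage": 15.0},
    "storage": {"used": 1024, "limit": 10240, "percentage": 10.0},
}
print(percent_used(15000, 100000), near_limit(usage))
```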

Health Status Values

System Status

  • healthy - All systems operational
  • degraded - Some services experiencing issues
  • unhealthy - Critical services down
  • maintenance - System under maintenance

Service Status

  • operational - Service running normally
  • degraded - Service experiencing performance issues
  • outage - Service completely unavailable
  • maintenance - Service under maintenance
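
Client code can model the two vocabularies as separate enumerations so that an unexpected value is caught early. This is a sketch with hypothetical class and function names. Note that the sample responses on this page do not follow the split strictly (for example, per-service entries in the detailed health check report "healthy"), so the parser returns None for an unknown value instead of raising.

```python
from enum import Enum


class SystemStatus(str, Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"
    MAINTENANCE = "maintenance"


class ServiceStatus(str, Enum):
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    OUTAGE = "outage"
    MAINTENANCE = "maintenance"


def parse_status(value, kind):
    """Return the matching enum member, or None if the value is not listed."""
    try:
        return kind(value)
    except ValueError:
        return None


print(parse_status("outage", ServiceStatus))
print(parse_status("outage", SystemStatus))
```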

Error Responses

Common Errors

{
  "status": "unhealthy",
  "timestamp": "2024-01-20T16:00:00Z",
  "message": "Service temporarily unavailable",
  "services": {
    "database": {
      "status": "outage",
      "error": "Connection timeout"
    }
  }
}

{
  "status": "error",
  "timestamp": "2024-01-20T16:00:00Z",
  "message": "Internal server error during health check",
  "error": "Database connection failed"
}

Monitoring Integration

Prometheus Metrics

The Health Controller exposes Prometheus-compatible metrics at /health/metrics/prometheus. Sample output:
# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1500

# HELP http_request_duration_seconds HTTP request duration in seconds
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.1"} 800
http_request_duration_seconds_bucket{le="0.5"} 1200
http_request_duration_seconds_bucket{le="1.0"} 1450
http_request_duration_seconds_bucket{le="+Inf"} 1500

# HELP system_cpu_usage CPU usage percentage
# TYPE system_cpu_usage gauge
system_cpu_usage 45.2

# HELP system_memory_usage Memory usage percentage
# TYPE system_memory_usage gauge
system_memory_usage 50.0
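
Histogram buckets in this format are cumulative, so the share of requests that finished within a given bound is that bucket's count divided by the +Inf bucket. The sketch below parses a trimmed copy of the sample output with a deliberately small parser: it handles only lines of the form name{labels} value, skips comments, and ignores anything else (such as lines carrying a timestamp). In practice a Prometheus server or client library would do this work; parse_metrics is a hypothetical helper.

```python
import math
import re

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return (name, labels, value) tuples for simple sample lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # not a plain "name{labels} value" line
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


text = """
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.1"} 800
http_request_duration_seconds_bucket{le="0.5"} 1200
http_request_duration_seconds_bucket{le="1.0"} 1450
http_request_duration_seconds_bucket{le="+Inf"} 1500
system_cpu_usage 45.2
"""

samples = parse_metrics(text)
buckets = {
    labels["le"]: value
    for name, labels, value in samples
    if name == "http_request_duration_seconds_bucket"
}
# Share of requests completed within 0.5 seconds: 1200 of 1500.
fraction_fast = buckets["0.5"] / buckets["+Inf"]
print(fraction_fast)
```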

Grafana Dashboard

Because the metrics are exposed in Prometheus format, they can be scraped by Prometheus and visualized in Grafana dashboards for real-time monitoring.

Alerting

Health Check Alerts

The system can be configured to send alerts based on health check results:
  • Service Down - Alert when critical services become unavailable
  • Performance Degradation - Alert when response times exceed thresholds
  • Resource Usage - Alert when CPU, memory, or disk usage exceeds configured thresholds
  • Error Rate - Alert when error rates exceed acceptable levels
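
The four alert types above can be expressed as simple threshold checks over a metrics snapshot. Everything in this sketch is an assumption made for illustration: the threshold values, the flattened snapshot shape, and the alert labels are hypothetical and do not describe how the platform configures alerting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits; tune to your own environment.
    max_resource_percent: float = 90.0
    max_p95_response_time: float = 1000.0
    max_error_rate: float = 1.0


def evaluate_alerts(snapshot, thresholds=Thresholds()):
    """Return alert labels for a flattened snapshot of health data."""
    alerts = []
    # Service Down: any service reporting an outage or unhealthy state.
    for name, status in snapshot.get("services", {}).items():
        if status in ("outage", "unhealthy"):
            alerts.append("service_down:" + name)
    # Performance Degradation: p95 response time above the limit.
    if snapshot.get("p95", 0) > thresholds.max_p95_response_time:
        alerts.append("performance_degradation")
    # Resource Usage: CPU, memory, or disk percentage above the limit.
    for resource in ("cpu", "memory", "disk"):
        if snapshot.get(resource, 0) > thresholds.max_resource_percent:
            alerts.append("resource_usage:" + resource)
    # Error Rate: error percentage above the limit.
    if snapshot.get("errorRate", 0) > thresholds.max_error_rate:
        alerts.append("error_rate")
    return alerts


snapshot = {
    "services": {"database": "outage", "redis": "operational"},
    "p95": 800,
    "cpu": 95.0,
    "memory": 50.0,
    "disk": 50.0,
    "errorRate": 0.05,
}
alerts = evaluate_alerts(snapshot)
print(alerts)
```

Keeping the thresholds in one immutable object makes it straightforward to supply different limits per environment without changing the checks themselves.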

Integration with Monitoring Tools

  • DataDog - Custom metrics and dashboards
  • New Relic - Application performance monitoring
  • PagerDuty - Incident management and alerting
  • Slack - Real-time notifications and status updates

Security Considerations

Access Control

  • Public Endpoints - Basic health checks are publicly accessible
  • Authenticated Endpoints - Detailed diagnostics require authentication
  • Rate Limiting - Health endpoints are rate limited to prevent abuse
  • IP Restrictions - Sensitive diagnostic endpoints can be IP restricted

Data Protection

  • Minimal Information - Health checks expose only necessary information
  • No Sensitive Data - No passwords, tokens, or personal data in health responses
  • Sanitized Output - All diagnostic output is sanitized before exposure