# Monitoring & Alerts
## Health Endpoints
| Endpoint | Description |
|----------|-------------|
| `/api/health` | API server health |
| `/api/health/db` | Database connectivity |
| `/api/health/blockchain` | Besu node status |
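
The endpoints above can be polled with a small script. The following is a minimal sketch, not the project's actual tooling: the base URL is a placeholder, the helper names `http_status` and `probe_endpoints` are hypothetical, and treating HTTP 200 as "healthy" is an assumption, since the status codes returned by these endpoints are not documented here.

```python
from typing import Callable, Dict
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Paths taken from the table above.
HEALTH_ENDPOINTS = {
    "/api/health": "API server health",
    "/api/health/db": "Database connectivity",
    "/api/health/blockchain": "Besu node status",
}


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except (URLError, OSError):
        # DNS failure, refused connection, timeout, etc.
        return 0


def probe_endpoints(
    base_url: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, bool]:
    """Map each health path to True when it answers with HTTP 200 (assumed healthy)."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in HEALTH_ENDPOINTS}
```

Passing `fetch` as a parameter keeps the probe testable without a running server; in practice `probe_endpoints("http://localhost:3000")` (host and port are placeholders) would use the default HTTP client.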
### Health Response
```json
{
  "status": "healthy",
  "timestamp": "2026-02-09T10:00:00Z",
  "components": {
    "database": "healthy",
    "blockchain": "healthy",
    "cache": "healthy"
  }
}
```
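
A consumer of this response might reduce it to a list of failing components. The sketch below assumes that any component value other than `"healthy"` indicates a problem; the source only shows the healthy case, so other status strings are not known. The function name `unhealthy_components` is hypothetical.

```python
import json
from typing import List


def unhealthy_components(payload: str) -> List[str]:
    """Return the sorted names of components whose state is not "healthy".

    Assumption: any value other than "healthy" counts as a failure.
    """
    body = json.loads(payload)
    components = body.get("components", {})
    return sorted(name for name, state in components.items() if state != "healthy")
```

An empty list means every reported component is healthy.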
## Key Metrics
### Application Metrics
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `http_request_duration_seconds` | API response time | > 2s |
| `http_requests_total` | Request count | - |
| `active_sessions` | Logged-in users | - |
| `queue_depth` | Pending jobs | > 1000 |
### Infrastructure Metrics
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `cpu_usage_percent` | CPU utilization | > 80% |
| `memory_usage_percent` | Memory utilization | > 85% |
| `disk_usage_percent` | Disk utilization | > 90% |
| `db_connection_pool` | Active connections | > 80% of max |
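
The alert thresholds in the Application and Infrastructure tables can be expressed as a simple lookup. This is an illustrative sketch only: the metric names and limits come from the tables, while the function `breached`, the `pool_max` parameter, and the plain-dictionary input format are assumptions rather than the monitoring stack's real interface.

```python
from typing import Dict, List, Optional

# Absolute thresholds copied from the tables above; a metric alerts when it
# is strictly greater than its limit.
THRESHOLDS = {
    "http_request_duration_seconds": 2.0,
    "queue_depth": 1000,
    "cpu_usage_percent": 80.0,
    "memory_usage_percent": 85.0,
    "disk_usage_percent": 90.0,
}


def breached(metrics: Dict[str, float], pool_max: Optional[int] = None) -> List[str]:
    """Return the names of metrics that exceed their alert threshold."""
    alerts = [
        name for name, limit in THRESHOLDS.items() if metrics.get(name, 0) > limit
    ]
    # db_connection_pool is relative: alert above 80% of the pool maximum.
    # pool_max is a hypothetical parameter; the real maximum is deployment-specific.
    if pool_max and metrics.get("db_connection_pool", 0) > 0.8 * pool_max:
        alerts.append("db_connection_pool")
    return alerts
```

Metrics missing from the input are treated as zero and never alert, which keeps the check tolerant of partial scrapes.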
### Business Metrics
| Metric | Description |
|--------|-------------|
| `applications_submitted` | New applications |
| `applications_processed` | Completed processing |
| `sla_breaches` | SLA violations |
| `certificates_issued` | Licenses issued |
## Alert Configuration
### Critical Alerts
- API health check failing
- Database unreachable
- Blockchain node disconnected
- Free disk space < 10%
### Warning Alerts
- Response time > 2 seconds
- Error rate > 1%
- SLA breach count increasing
- Certificate minting failures
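
The critical and warning conditions above can be sketched as a single evaluation function. This is a hedged illustration, not the deployed alerting rules: the `Snapshot` fields and the `evaluate` function are hypothetical, the error rate is assumed to be errors divided by total requests, and the trend-based alerts (SLA breach count increasing, certificate minting failures) are omitted because they need historical data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Snapshot:
    """Hypothetical point-in-time view of the signals used by the alerts."""
    api_healthy: bool
    db_reachable: bool
    blockchain_connected: bool
    disk_free_percent: float
    response_time_seconds: float
    errors: int
    requests: int


def evaluate(s: Snapshot) -> List[Tuple[str, str]]:
    """Return (severity, message) pairs for every condition that fires."""
    alerts: List[Tuple[str, str]] = []
    if not s.api_healthy:
        alerts.append(("critical", "API health check failing"))
    if not s.db_reachable:
        alerts.append(("critical", "Database unreachable"))
    if not s.blockchain_connected:
        alerts.append(("critical", "Blockchain node disconnected"))
    if s.disk_free_percent < 10:
        alerts.append(("critical", "Disk space < 10%"))
    if s.response_time_seconds > 2:
        alerts.append(("warning", "Response time > 2 seconds"))
    # Assumed definition: failed requests as a fraction of all requests.
    rate = s.errors / s.requests if s.requests else 0.0
    if rate > 0.01:
        alerts.append(("warning", "Error rate > 1%"))
    return alerts
```

Critical conditions are checked first so that they appear at the head of the returned list.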
## Dashboard
Access Grafana dashboards at:
```
https://monitoring.tlas.gov.in/grafana
```
Dashboards available:
- System Overview
- Application Processing
- Blockchain Status
- SLA Compliance