Troubleshooting
This page covers common issues encountered during Guardimesh installation and operation, along with diagnostic commands and solutions.
Installation Issues
Scanner pods stuck in CrashLoopBackOff
Symptoms: Scanner pods repeatedly crash and restart.
Diagnosis:
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner --previous
kubectl describe pod -n guardimesh-system <scanner-pod>
Common causes:
| Cause | Solution |
|---|---|
| Missing or invalid API key | Verify the guardimesh-api-key Secret exists and contains a valid key |
| Incorrect backend URL | Check scanner.saas.backendURL in Helm values |
| ClamAV socket not ready | The scanner waits for ClamAV to start — check antivirus container logs |
| Insufficient permissions | Ensure the scanner runs privileged, with hostPID enabled and the host filesystem mounted |
# Check if the API key secret exists
kubectl get secret guardimesh-api-key -n guardimesh-system
# Check antivirus container (ClamAV) logs
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-antivirus
Scanner pods running but 0/5 containers ready
Cause: ClamAV daemon takes time to load signature databases (30–90 seconds with large databases).
Solution: Wait for all containers to become ready. The startup probe gives ClamAV up to 5 minutes.
kubectl get pods -n guardimesh-system -w
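Instead of watching the pod list by hand, the wait can be scripted. The following is a minimal sketch, assuming the scanner pods carry the app.kubernetes.io/component=guardimesh-scanner label used in the Diagnostic Commands section. The function name is illustrative, and the 300-second timeout simply mirrors the 5-minute startup probe window mentioned above.

```shell
# Hypothetical helper: block until the scanner pods report Ready, or give up
# after the 5-minute window that the startup probe allows.
wait_for_scanner_ready() {
  kubectl wait pod \
    -n guardimesh-system \
    -l app.kubernetes.io/component=guardimesh-scanner \
    --for=condition=Ready \
    --timeout=300s
}

# Usage: list the pods for inspection if the wait times out.
# wait_for_scanner_ready || kubectl get pods -n guardimesh-system
```

If the wait times out, the antivirus container logs are the first place to look.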
Operator pod not starting
kubectl logs -n guardimesh-system deployment/guardimesh-operator
kubectl describe deployment guardimesh-operator -n guardimesh-system
Common causes:
- Image pull failure (check imagePullSecrets for private registries)
- Insufficient RBAC (CRDs not installed)
- Resource limits too low
No Scan Results Appearing
Check scanner is shipping results
# Look for successful sends in scanner logs
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "send\|ship\|result"
Check namespace is not skipped
The scanner skips kube-system and guardimesh-system by default, plus any namespaces matching skipNamespacePrefixes (default: openshift-).
# View current effective config (if remote config is enabled)
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "config\|skip"
Verify your test pod is in a namespace that is not excluded.
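The default skip rule described above can be approximated locally to check a namespace name before deploying a test pod. This is only a sketch of the documented defaults, not the scanner's own logic. SKIP_PREFIXES and is_skipped_namespace are illustrative names; if your Helm values override skipNamespacePrefixes, set the variable to match.

```shell
# Illustrative variable mirroring the skipNamespacePrefixes default
# (space-separated list of prefixes).
SKIP_PREFIXES="${SKIP_PREFIXES:-openshift-}"

# Returns 0 if the namespace would be skipped under the documented defaults.
is_skipped_namespace() {
  ns="$1"
  # Namespaces that are skipped by default regardless of prefixes.
  case "$ns" in
    kube-system|guardimesh-system) return 0 ;;
  esac
  # Prefix match against each configured prefix.
  for prefix in $SKIP_PREFIXES; do
    case "$ns" in
      "$prefix"*) return 0 ;;
    esac
  done
  return 1
}

# Usage:
# is_skipped_namespace openshift-monitoring && echo "skipped by default"
```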
Check active scanning is enabled
If active scanning is disabled and scheduled scanning is not configured, the scanner will not scan anything.
# Check scan config in web console, or look at scanner logs for config loading
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | head -50
Check subscription status
If your trial has expired or subscription is inactive, the backend returns 403 and scanners stop shipping.
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "403\|expired\|subscription"
Solution: Renew your subscription or contact support.
Node Limit Exceeded
Symptoms: Scanner logs show node_limit_exceeded errors. New scans are rejected.
Cause: More nodes are reporting than your tier allows.
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep "node_limit"
Solutions:
- Upgrade to a higher tier with more node capacity
- Reduce the number of nodes running the scanner (use nodeSelector or affinity in Helm values)
- Remove unused nodes from the cluster
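To compare against your tier's limit, it can help to count how many nodes currently host a scanner pod. A minimal sketch, assuming the label selector from the Diagnostic Commands section; the function name is illustrative, and the backend's own count of reporting nodes may differ from this snapshot.

```shell
# Hypothetical helper: count distinct nodes that currently host a scanner pod.
count_scanner_nodes() {
  kubectl get pods -n guardimesh-system \
    -l app.kubernetes.io/component=guardimesh-scanner \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
    | sort -u | grep -c .
}

# Usage:
# echo "Scanner is running on $(count_scanner_nodes) node(s)"
```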
ClamAV Issues
ClamAV daemon not starting
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-antivirus
Common causes:
| Cause | Solution |
|---|---|
| Missing signature files | Check puller init container logs: kubectl logs -n guardimesh-system <scanner-pod> -c init-guardimesh-signatures |
| Insufficient memory | ClamAV needs ~2 GB RAM with full databases. Increase antivirusResources.limits.memory |
| Corrupted signature files | Delete and re-pull signatures by restarting the pod |
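One way to confirm the insufficient-memory case is to look at why the previous container instance terminated. Kubernetes records the reason OOMKilled when a container exceeds its memory limit. The helper below is a sketch with an illustrative name; it only reads standard pod status fields.

```shell
# Hypothetical helper: print each container's name and the reason its previous
# instance terminated (empty if it has not been terminated).
last_termination_reasons() {
  pod="$1"
  kubectl get pod "$pod" -n guardimesh-system \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Usage: an OOMKilled entry for guardimesh-antivirus points to the memory limit.
# last_termination_reasons <scanner-pod> | grep OOMKilled
```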
Signature updates failing
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-puller
Check:
- Network connectivity to storage.guardimesh.io (or your internal signature server in air-gap mode)
- API key is valid
- Storage service is healthy
OpenShift-Specific Issues
SecurityContextConstraint (SCC) denied
Symptoms: Pods fail to start with the error "unable to validate against any security context constraint".
Solution: The scanner requires a privileged SCC. Create one or use the existing privileged SCC:
oc adm policy add-scc-to-user privileged -z guardimesh-scanner -n guardimesh-system
Or apply a custom SCC:
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: guardimesh-scanner
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostPID: true
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
  - '*'
users:
  - system:serviceaccount:guardimesh-system:guardimesh-scanner
SELinux denials
If SELinux is enforcing and the scanner cannot read host filesystems:
# Check for SELinux denials (run as root on the affected node)
ausearch -m avc -ts recent | grep guardimesh
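The privileged SCC should handle this. If it does not, ensure the container runs with the SELinux type spc_t (securityContext.seLinuxOptions.type: spc_t in the pod spec). In an SCC, seLinuxContext.type only accepts a strategy such as RunAsAny or MustRunAs, not an SELinux type.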
Network and Connectivity Issues
Scanner cannot reach backend API
# Test connectivity from scanner pod
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-scanner -- \
  curl -s -o /dev/null -w "%{http_code}" https://api.guardimesh.io/healthz
Solutions:
- Check cluster egress rules / NetworkPolicies
- If behind a proxy, set HTTP_PROXY and HTTPS_PROXY via extraEnv
- For custom CA certificates, use scanner.saas.tls.caSecret
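The status code printed by the connectivity test above narrows down which solution applies. The mapping below is an illustrative sketch, not documented backend behavior: it relies only on curl printing 000 when no HTTP response was received and on 407 being the standard Proxy Authentication Required status. The function name is hypothetical.

```shell
# Hypothetical helper: translate the code printed by curl -w "%{http_code}"
# into a likely next step.
interpret_http_code() {
  case "$1" in
    000) echo "no HTTP response: check DNS, egress rules, proxy settings, or TLS trust" ;;
    2??) echo "backend reachable" ;;
    407) echo "proxy requires authentication: check HTTP_PROXY and HTTPS_PROXY" ;;
    *)   echo "backend answered with status $1: network path works, check scanner logs" ;;
  esac
}

# Usage:
# interpret_http_code 000
```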
TLS certificate errors
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "tls\|certificate\|x509"
Solutions:
- For corporate proxies with TLS inspection, provide the CA certificate via scanner.saas.tls.caSecret
- As a last resort (not recommended for production): scanner.saas.tls.skipVerify: true
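To tell whether a TLS-inspecting proxy sits in the path, it can help to look at who issued the certificate that the backend host presents. The sketch below uses standard openssl subcommands; the function name is illustrative, and it assumes openssl is available on a machine or debug pod that shares the scanner's network path (the scanner image itself may not include it).

```shell
# Hypothetical helper: print the subject and issuer of the certificate that a
# host presents on port 443.
show_cert_issuer() {
  host="$1"
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer
}

# Usage:
# show_cert_issuer api.guardimesh.io
```

An issuer that names your organization's proxy rather than a public certificate authority suggests that the proxy's CA certificate needs to be supplied to the scanner.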
Performance Issues
High CPU usage on scanner pods
Common causes:
- Fanotify monitoring on high-write workloads: increase FANOTIFY_DEBOUNCE_SEC or disable fanotify for noisy namespaces
- Large number of pods starting simultaneously: the scan deduplication TTL prevents duplicate scans, but many unique pods will still queue
- ClamAV scanning large files: set resource limits to prevent starvation of application workloads
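To see which of these causes is more likely, check which container is consuming the CPU: heavy use in guardimesh-scanner points toward fanotify or queued scans, while heavy use in guardimesh-antivirus points toward ClamAV. A minimal sketch, assuming metrics-server is installed (as kubectl top requires) and the label selector from the Diagnostic Commands section; the function name is illustrative.

```shell
# Hypothetical helper: per-container CPU and memory usage for scanner pods,
# sorted by CPU.
top_scanner_containers() {
  kubectl top pods -n guardimesh-system \
    -l app.kubernetes.io/component=guardimesh-scanner \
    --containers --sort-by=cpu
}

# Usage:
# top_scanner_containers
```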
Slow scan results
Scan results typically appear in the web console within 30 seconds of detection. If delayed:
- Check scanner pod logs for send failures
- Check the retry buffer (scanner retries failed sends automatically)
- Check BigQuery/PostgreSQL health (for the data pipeline)
Web Console Issues
Cannot log in
- Verify your email/password combination
- Check if the account is activated (check email for activation link)
- Clear browser cookies and try again
- If using OAuth, ensure the OAuth provider is accessible
Scan results not loading
- Check browser console for network errors
- Verify your session is still valid (try logging out and back in)
- Check if the backend-api is healthy: look for failed requests in the Network tab of your browser's developer tools
Diagnostic Commands
Quick reference for common diagnostic commands:
# Overall status
kubectl get pods -n guardimesh-system
kubectl get daemonset -n guardimesh-system
kubectl get events -n guardimesh-system --sort-by='.lastTimestamp' | tail -20
# Scanner logs (last 100 lines)
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-scanner --tail=100
# Antivirus (ClamAV) logs
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-antivirus --tail=50
# Puller logs
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-puller --tail=50
# Scanner version
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-scanner -- \
  curl -s localhost:8086/versionz
# ClamAV version and signature info
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-antivirus -- \
  clamdscan --version
# Resource usage
kubectl top pods -n guardimesh-system
# Check what config the scanner is using
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep "remote config\|applied config"
Getting Support
If you cannot resolve an issue:
- Gather diagnostic information using the commands above
- Note your tier, cluster size, and Kubernetes version
- Contact support at support@guardimesh.com
- Include relevant pod logs, events, and your GuardimeshScanner or GuardimeshPlatform CR YAML
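Gathering the diagnostics can be scripted so that everything lands in one directory. The helper below is a sketch that reuses the kubectl commands from the Diagnostic Commands section; the function name, the default directory name, and the file names are illustrative, and the CR YAML still needs to be exported separately.

```shell
# Hypothetical helper: save pod status, events, version info, and container
# logs into one directory and print the directory path.
collect_guardimesh_diagnostics() {
  out_dir="${1:-guardimesh-diagnostics}"
  ns="guardimesh-system"
  selector="app.kubernetes.io/component=guardimesh-scanner"
  mkdir -p "$out_dir" || return 1

  kubectl get pods -n "$ns" -o wide > "$out_dir/pods.txt" 2>&1
  kubectl get daemonset -n "$ns" > "$out_dir/daemonset.txt" 2>&1
  kubectl get events -n "$ns" --sort-by='.lastTimestamp' > "$out_dir/events.txt" 2>&1
  kubectl version > "$out_dir/version.txt" 2>&1

  # One log file per scanner pod container.
  for c in guardimesh-scanner guardimesh-antivirus guardimesh-puller; do
    kubectl logs -n "$ns" -l "$selector" -c "$c" --tail=100 > "$out_dir/$c.log" 2>&1
  done
  echo "$out_dir"
}

# Usage:
# collect_guardimesh_diagnostics ./support-bundle
```

Review the collected files for sensitive data before attaching them to a support request.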
Next Steps
- Configuration Reference — Verify your settings
- Architecture — Understand component interactions
- Getting Started — Start fresh if needed