Troubleshooting

This page covers common issues encountered during Guardimesh installation and operation, along with diagnostic commands and solutions.

Installation Issues

Scanner pods stuck in CrashLoopBackOff

Symptoms: Scanner pods repeatedly crash and restart.

Diagnosis:

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner --previous
kubectl describe pod -n guardimesh-system <scanner-pod>

Common causes:

Cause	Solution
Missing or invalid API key	Verify the `guardimesh-api-key` Secret exists and contains a valid key
Incorrect backend URL	Check `scanner.saas.backendURL` in Helm values
ClamAV socket not ready	The scanner waits for ClamAV to start — check antivirus container logs
Insufficient permissions	Ensure the scanner runs as privileged with hostPID and host filesystem mount

# Check if the API key secret exists
kubectl get secret guardimesh-api-key -n guardimesh-system

# Check antivirus container (ClamAV) logs
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-antivirus
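When a pod has several containers, it can help to see which one is actually terminating before reading logs. The helper below is a minimal sketch that relies only on standard Pod status fields; the function name `last_terminations` is hypothetical and not part of Guardimesh.

```shell
# Print the last termination reason and exit code of each container in a pod,
# skipping containers that have never terminated.
# Note: init containers are reported separately by Kubernetes and are not covered here.
last_terminations() {
  pod="$1"
  kubectl get pod "$pod" -n guardimesh-system \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}' |
    awk -F'\t' '$2 != "" { printf "%s: %s (exit %s)\n", $1, $2, $3 }'
}

# Usage: last_terminations <scanner-pod>
```

A reason of OOMKilled points at memory limits, while Error with a non-zero exit code usually means the container logs hold the explanation.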

Scanner pods running but 0/5 containers ready

Cause: The ClamAV daemon takes time to load its signature databases (30–90 seconds with large databases).

Solution: Wait for all containers to become ready. The startup probe gives ClamAV up to 5 minutes.

kubectl get pods -n guardimesh-system -w
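If the pod stays partially ready, listing the containers that are holding it back narrows the search. This is a sketch using standard Pod status fields; `not_ready_containers` is a hypothetical helper name.

```shell
# List the containers in a pod whose readiness is not "true".
not_ready_containers() {
  kubectl get pod "$1" -n guardimesh-system \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\n"}{end}' |
    awk -F'\t' '$1 != "" && $2 != "true" { print $1 }'
}

# Usage: not_ready_containers <scanner-pod>
```

If only the antivirus container is listed during the first few minutes, that matches the signature loading delay described above.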

Operator pod not starting

kubectl logs -n guardimesh-system deployment/guardimesh-operator
kubectl describe deployment guardimesh-operator -n guardimesh-system

Common causes:

Image pull failure (check imagePullSecrets for private registries)
Insufficient RBAC permissions, or CRDs not installed
Resource limits too low

No Scan Results Appearing

Check scanner is shipping results

# Look for successful sends in scanner logs
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "send\|ship\|result"

Check namespace is not skipped

The scanner skips kube-system and guardimesh-system by default, plus any namespaces matching skipNamespacePrefixes (default: openshift-).

# View current effective config (if remote config is enabled)
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "config\|skip"

Verify your test pod is in a namespace that is not excluded.
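The default skip rule described above can be expressed as a small local check. This is only an illustration of the documented defaults, not the scanner's actual matching code, and it does not see remotely applied configuration. `SKIP_PREFIXES` is a hypothetical variable standing in for skipNamespacePrefixes.

```shell
# Space-separated prefixes; defaults to the documented "openshift-" prefix.
SKIP_PREFIXES="${SKIP_PREFIXES:-openshift-}"

# Returns 0 if the namespace would be skipped under the default rule, 1 otherwise.
is_skipped_namespace() {
  ns="$1"
  case "$ns" in
    kube-system|guardimesh-system) return 0 ;;
  esac
  for prefix in $SKIP_PREFIXES; do
    case "$ns" in
      "$prefix"*) return 0 ;;
    esac
  done
  return 1
}

# Usage: is_skipped_namespace my-test-ns && echo "skipped" || echo "scanned"
```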

Check active scanning is enabled

If active scanning is disabled and scheduled scanning is not configured, the scanner will not scan anything.

# Check scan config in web console, or look at scanner logs for config loading
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | head -50

Check subscription status

If your trial has expired or your subscription is inactive, the backend returns 403 and scanners stop shipping results.

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "403\|expired\|subscription"

Solution: Renew your subscription or contact support.

Node Limit Exceeded

Symptoms: Scanner logs show node_limit_exceeded errors. New scans are rejected.

Cause: More nodes are reporting than your tier allows.

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep "node_limit"

Solutions:

Upgrade to a higher tier with more node capacity
Reduce the number of nodes running the scanner (use nodeSelector or affinity in Helm values)
Remove unused nodes from the cluster
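To compare the cluster against a tier limit, one option is to count the distinct nodes that host scanner pods. The sketch below uses the scanner label selector shown elsewhere on this page; the limit is a value you supply from your tier, and the way the backend counts nodes may differ from this local count. Both function names are hypothetical.

```shell
# Count distinct nodes that currently run a scanner pod.
scanner_node_count() {
  kubectl get pods -n guardimesh-system \
    -l app.kubernetes.io/component=guardimesh-scanner \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' |
    awk 'NF && !seen[$1]++ { n++ } END { print n + 0 }'
}

# Compare the count with a limit passed as the first argument.
check_node_limit() {
  limit="$1"
  count="$(scanner_node_count)"
  if [ "$count" -gt "$limit" ]; then
    echo "over limit: $count scanner nodes, tier allows $limit"
    return 1
  fi
  echo "within limit: $count of $limit nodes"
}

# Usage: check_node_limit 10
```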

ClamAV Issues

ClamAV daemon not starting

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-antivirus

Common causes:

Cause	Solution
Missing signature files	Check puller init container logs: `kubectl logs -n guardimesh-system <scanner-pod> -c init-guardimesh-signatures`
Insufficient memory	ClamAV needs ~2 GB RAM with full databases. Increase `antivirusResources.limits.memory`
Corrupted signature files	Delete and re-pull signatures by restarting the pod
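To check the memory cause quickly, the antivirus container's configured limit can be read from the pod spec and compared with the roughly 2 GB figure in the table. This is a sketch: `mem_to_mib` and `antivirus_memory_check` are hypothetical helpers, and the conversion handles only the common Ki/Mi/Gi, k/M/G, and plain-byte forms of Kubernetes quantities.

```shell
# Convert a memory quantity such as 2Gi, 1536Mi, 2G, or 2147483648 to whole MiB.
# Exits with status 2 on an unrecognized suffix.
mem_to_mib() {
  awk -v q="$1" 'BEGIN {
    n = q; sub(/[A-Za-z]+$/, "", n)   # numeric part
    u = q; sub(/^[0-9.]+/, "", u)     # unit suffix
    if      (u == "Ki") b = n * 1024
    else if (u == "Mi") b = n * 1024 * 1024
    else if (u == "Gi") b = n * 1024 * 1024 * 1024
    else if (u == "k")  b = n * 1000
    else if (u == "M")  b = n * 1000000
    else if (u == "G")  b = n * 1000000000
    else if (u == "")   b = n
    else exit 2
    printf "%d\n", b / 1048576
  }'
}

# Report whether the guardimesh-antivirus memory limit is below 2048 MiB.
antivirus_memory_check() {
  limit="$(kubectl get pod "$1" -n guardimesh-system \
    -o jsonpath='{.spec.containers[?(@.name=="guardimesh-antivirus")].resources.limits.memory}')"
  if [ -z "$limit" ]; then
    echo "no memory limit set on guardimesh-antivirus"
    return 0
  fi
  mib="$(mem_to_mib "$limit")" || return 2
  if [ "$mib" -lt 2048 ]; then
    echo "limit $limit ($mib MiB) is below the ~2 GB noted above"
    return 1
  fi
  echo "limit $limit ($mib MiB) looks sufficient"
}

# Usage: antivirus_memory_check <scanner-pod>
```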

Signature updates failing

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-puller

Check:

Network connectivity to storage.guardimesh.io (or your internal signature server in air-gap mode)
API key is valid
Storage service is healthy

OpenShift-Specific Issues

SecurityContextConstraint (SCC) denied

Symptoms: Pods fail to start with the error "unable to validate against any security context constraint".

Solution: The scanner requires a privileged SCC. Grant the existing privileged SCC to the scanner service account:

oc adm policy add-scc-to-user privileged -z guardimesh-scanner -n guardimesh-system

Or apply a custom SCC:

apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: guardimesh-scanner
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostPID: true
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
  - '*'
users:
  - system:serviceaccount:guardimesh-system:guardimesh-scanner
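After granting the SCC and restarting the pods, it is worth confirming which SCC a pod was actually admitted under. OpenShift records this in the `openshift.io/scc` pod annotation. The sketch below reads that annotation; `check_scanner_scc` is a hypothetical helper, and it accepts either the built-in privileged SCC or the custom guardimesh-scanner SCC from the example above.

```shell
# Read the SCC that admitted the pod from its openshift.io/scc annotation.
pod_scc() {
  kubectl get pod "$1" -n guardimesh-system \
    -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}

# Report whether the pod runs under one of the two SCCs discussed above.
check_scanner_scc() {
  scc="$(pod_scc "$1")"
  case "$scc" in
    privileged|guardimesh-scanner)
      echo "ok: admitted under $scc" ;;
    "")
      echo "no openshift.io/scc annotation found"
      return 1 ;;
    *)
      echo "admitted under $scc, which may not grant host access"
      return 1 ;;
  esac
}

# Usage: check_scanner_scc <scanner-pod>
```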

SELinux denials

If SELinux is enforcing and the scanner cannot read host filesystems:

# Check for SELinux denials (run on the affected node, not inside the pod)
ausearch -m avc -ts recent | grep guardimesh

The privileged SCC should handle this. If it does not, ensure the container runs with the SELinux type spc_t, which in a pod spec is set as securityContext.seLinuxOptions.type: spc_t.

Network and Connectivity Issues

Scanner cannot reach backend API

# Test connectivity from scanner pod
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-scanner -- \
  curl -s -o /dev/null -w "%{http_code}" https://api.guardimesh.io/healthz

Solutions:

Check cluster egress rules / NetworkPolicies
If behind a proxy, set HTTP_PROXY and HTTPS_PROXY via extraEnv
For custom CA certificates, use scanner.saas.tls.caSecret
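The status code printed by the curl check can be mapped to a likely next step. In this sketch only the 403 association comes from this page (expired or inactive subscription); the other branches reflect general HTTP and curl behaviour, such as curl printing 000 when no HTTP response was received. `classify_http_code` is a hypothetical helper.

```shell
# Suggest a next step for the HTTP status code returned by the connectivity test.
classify_http_code() {
  case "$1" in
    000) echo "no HTTP response: check DNS, egress rules, proxy, or TLS" ;;
    2??) echo "backend reachable" ;;
    403) echo "reachable but forbidden: check subscription status and API key" ;;
    407) echo "proxy authentication required" ;;
    5??) echo "backend or intermediate proxy error" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage: classify_http_code "$(kubectl exec ... -- curl -s -o /dev/null -w '%{http_code}' <url>)"
```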

TLS certificate errors

kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep -i "tls\|certificate\|x509"

Solutions:

For corporate proxies with TLS inspection, provide the CA certificate via scanner.saas.tls.caSecret
As a last resort (not recommended for production): scanner.saas.tls.skipVerify: true

Performance Issues

High CPU usage on scanner pods

Common causes:

Fanotify monitoring on high-write workloads — increase FANOTIFY_DEBOUNCE_SEC or disable fanotify for noisy namespaces
Large number of pods starting simultaneously — the scan deduplication TTL prevents duplicate scans but many unique pods will queue
ClamAV scanning large files — set resource limits to prevent starvation of application workloads
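To see which container is responsible for the CPU load, per-container metrics can be sorted by usage. This sketch assumes metrics-server (or an equivalent metrics API) is available so that `kubectl top` works; `top_cpu_containers` is a hypothetical helper.

```shell
# Print the N busiest containers (default 5) as: millicores, pod, container.
top_cpu_containers() {
  kubectl top pods -n guardimesh-system --containers --no-headers |
    awk '{
      cpu = $3
      if (cpu ~ /m$/) sub(/m$/, "", cpu)   # already in millicores, e.g. 250m
      else cpu = cpu * 1000                # whole cores, e.g. 1
      printf "%d\t%s\t%s\n", cpu, $1, $2
    }' |
    sort -rn | head -n "${1:-5}"
}

# Usage: top_cpu_containers 3
```

A busy guardimesh-antivirus container suggests large file scans, while a busy guardimesh-scanner container is more consistent with fanotify activity or a backlog of queued pods.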

Slow scan results

Scan results typically appear in the web console within 30 seconds of detection. If delayed:

Check scanner pod logs for send failures
Check the retry buffer (scanner retries failed sends automatically)
Check BigQuery/PostgreSQL health (for the data pipeline)

Web Console Issues

Cannot log in

Verify your email/password combination
Check if the account is activated (check email for activation link)
Clear browser cookies and try again
If using OAuth, ensure the OAuth provider is accessible

Scan results not loading

Check browser console for network errors
Verify your session is still valid (try logging out and back in)
Check if the backend-api is healthy: look for failed requests in the Network tab of your browser's developer tools

Diagnostic Commands

Quick reference for common diagnostic commands:

# Overall status
kubectl get pods -n guardimesh-system
kubectl get daemonset -n guardimesh-system
kubectl get events -n guardimesh-system --sort-by='.lastTimestamp' | tail -20

# Scanner logs (last 100 lines)
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-scanner --tail=100

# Antivirus (ClamAV) logs
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-antivirus --tail=50

# Puller logs
kubectl logs -n guardimesh-system -l app.kubernetes.io/component=guardimesh-scanner \
  -c guardimesh-puller --tail=50

# Scanner version
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-scanner -- \
  curl -s localhost:8086/versionz

# ClamAV version and signature info
kubectl exec -n guardimesh-system <scanner-pod> -c guardimesh-antivirus -- \
  clamdscan --version

# Resource usage
kubectl top pods -n guardimesh-system

# Check what config the scanner is using
kubectl logs -n guardimesh-system <scanner-pod> -c guardimesh-scanner | grep "remote config\|applied config"

Getting Support

If you cannot resolve an issue:

Gather diagnostic information using the commands above
Note your tier, cluster size, and Kubernetes version
Contact support at support@guardimesh.com
Include relevant pod logs, events, and your GuardimeshScanner or GuardimeshPlatform CR YAML
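The gathering step can be scripted so that the same files are produced every time. The sketch below reuses commands from the Diagnostic Commands section and writes their output to a directory; the function name, file names, and the 200-line tail are arbitrary choices for illustration. Review the files for sensitive data before sending them.

```shell
# Collect basic diagnostics into a directory (default: ./guardimesh-diagnostics)
# and print the directory path.
collect_diagnostics() {
  out="${1:-guardimesh-diagnostics}"
  ns="guardimesh-system"
  mkdir -p "$out" || return 1

  kubectl version > "$out/version.txt" 2>&1
  kubectl get pods -n "$ns" -o wide > "$out/pods.txt" 2>&1
  kubectl get daemonset -n "$ns" > "$out/daemonsets.txt" 2>&1
  kubectl get events -n "$ns" --sort-by='.lastTimestamp' > "$out/events.txt" 2>&1

  # One log file per scanner pod container.
  for c in guardimesh-scanner guardimesh-antivirus guardimesh-puller; do
    kubectl logs -n "$ns" -l app.kubernetes.io/component=guardimesh-scanner \
      -c "$c" --tail=200 > "$out/logs-$c.txt" 2>&1
  done

  echo "$out"
}

# Usage: collect_diagnostics ./diag && tar czf diag.tar.gz ./diag
```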

Next Steps

Configuration Reference — Verify your settings
Architecture — Understand component interactions
Getting Started — Start fresh if needed
