Zscaler – Wildcards: Optimizing Zscaler ZPA App Segments for Performance and Scale

When deploying Zscaler Private Access (ZPA), it’s tempting to take shortcuts during early rollouts: one giant wildcard app segment (*.corp.local), all TCP/UDP ports open, assigned to every App Connector group. It works… but it’s a silent killer for performance, scalability, and troubleshooting.

Here’s why that approach causes headaches and how to fix it.

How ZPA Health Checks Actually Work

When Health Reporting is enabled, each App Connector proactively checks application reachability based on your app segment definition:

  • FQDN → connector resolves DNS and probes every IP returned.

  • Static IP → connector probes just that IP.

  • TCP ports → TCP 3-way handshake test per port.

  • UDP ports → ICMP probe first; TCP fallback if ICMP fails. (A simplified probe sketch follows this list.)
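
To make these probes concrete, here is a minimal Python sketch of what a reachability check of this kind amounts to. It is an illustration only, not ZPA's internal implementation; the hostname and port are taken from the example below, and running it requires an environment where that name actually resolves.

  import socket

  def tcp_probe(ip, port, timeout=3.0):
      # Attempt a TCP 3-way handshake against ip:port; True if it completes.
      try:
          with socket.create_connection((ip, port), timeout=timeout):
              return True
      except OSError:
          return False

  def resolve_ipv4(fqdn):
      # Resolve an FQDN to every IPv4 address DNS returns (the fan-out).
      infos = socket.getaddrinfo(fqdn, None, family=socket.AF_INET,
                                 type=socket.SOCK_STREAM)
      return sorted({info[4][0] for info in infos})

  # Probe every resolved IP on every TCP port defined in the segment.
  # (A real UDP check starts with ICMP, which needs raw sockets/root,
  # so it is omitted from this sketch.)
  for ip in resolve_ipv4("app1.cordero.me"):
      for port in (443,):
          print(f"{ip}:{port} ->", "Up" if tcp_probe(ip, port) else "Down")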

Example

Suppose you define an app segment for app1.cordero.me with the following settings:

  • TCP ports: 443

  • UDP ports: 5000

If app1.cordero.me resolves to 192.168.20.11 and 192.168.20.12, then each App Connector will check:

  • 192.168.20.11:443

  • 192.168.20.12:443

  • 192.168.20.11:5000

  • 192.168.20.12:5000

That’s 4 checks per cycle — very manageable.

Problems arise when wildcards or broad port ranges multiply this check count into the thousands.

Health Reporting Modes (and the “30 minutes”)

  • Continuous → Connector always probes the defined ports. (Not allowed for wildcards or >10 ports.)

  • On Access → Connector starts probing when a user connects, then continues for up to 30 minutes after the last session. After that, the app’s health shows as Unknown.

  • None → No probes at all. ZPA assumes the app is always reachable.

In all cases, the list of ports you configure drives which probes run. If you define “all ports except 53,” the connector will attempt them all, even if the user only ever uses 443.

The 6,000-Check Rule

App Connectors throttle at ~20 health checks per second.

  • With ~6,000 checks, a cycle takes about 300 seconds (5 minutes).

  • With ~20,000 checks, a cycle stretches to roughly 17 minutes, delaying accurate health reporting and wasting connector capacity.

Quick math:

Checks per connector per cycle =
(# of FQDNs × avg # of IPs returned by DNS + # of static IPs) × (# of TCP ports + # of UDP ports)

A segment defined as “all ports except 53” effectively tells the connector to probe 65,534 ports per IP — instantly blowing past the 6,000-check guidance.
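
The arithmetic is easy to sanity-check. The sketch below simply applies the formula above together with the ~20 checks/second throttle; the two scenarios are the example segment from earlier and a hypothetical wildcard defined as "all ports except 53."

  CHECKS_PER_SECOND = 20  # approximate per-connector health-check throttle

  def checks_per_cycle(fqdns, avg_ips, static_ips, tcp_ports, udp_ports):
      # (# FQDNs x avg IPs per FQDN + # static IPs) x (# TCP ports + # UDP ports)
      return (fqdns * avg_ips + static_ips) * (tcp_ports + udp_ports)

  def cycle_minutes(checks):
      return checks / CHECKS_PER_SECOND / 60

  # Tight example segment: 1 FQDN -> 2 IPs, TCP 443 + UDP 5000.
  tight = checks_per_cycle(1, 2, 0, tcp_ports=1, udp_ports=1)
  print(tight, "checks")  # 4 checks, completed in well under a second

  # "All ports except 53" on the same two IPs: 65,534 ports per IP.
  wild = checks_per_cycle(1, 2, 0, tcp_ports=65534, udp_ports=0)
  print(wild, "checks,", round(cycle_minutes(wild)), "minutes per cycle")  # 131,068 checks, ~109 minutes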

Why Wildcards Hurt Efficiency

  • Every connector probes every port for every FQDN when wildcards are assigned to all groups.

  • DNS fan-out multiplies the number of checks (one hostname → many IPs).

  • ZPA can pick any connector in the group that reports “Up,” leading to unpredictable brokering.

  • Connector capacity is finite (both concurrent sessions and health-check budget).

  • Troubleshooting becomes murky (“which connector/DC is brokering this session?”).

Avoid Double Load Balancing (F5 GTM + ZPA)

Pointing ZPA at an F5 GTM VIP introduces two independent load-balancing algorithms (GTM and ZPA). Their stickiness timers and failover logic don’t coordinate, creating inefficiency and odd failures.

Best practice: Point ZPA app segments directly to the DC-local FQDN/IP and pin them to the connector group in that DC. Keep GTM for non-ZPA clients if needed, but bypass it for ZPA.
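
One way to see the mismatch is to compare what DNS hands back for a GTM-managed name versus a DC-local record. The sketch below is a rough illustration with placeholder hostnames (erp.gslb.corp.local and erp.dc1.corp.local are assumptions, not names from this article); resolver caching can mask GTM's rotation, so treat it as a quick probe rather than a diagnostic.

  import socket
  from collections import Counter

  def resolve_ipv4(fqdn):
      # One DNS lookup; return the answer set as a sorted tuple.
      infos = socket.getaddrinfo(fqdn, None, family=socket.AF_INET,
                                 type=socket.SOCK_STREAM)
      return tuple(sorted({i[4][0] for i in infos}))

  def answer_spread(fqdn, lookups=10):
      # Count how many distinct answer sets repeated lookups return.
      return Counter(resolve_ipv4(fqdn) for _ in range(lookups))

  # A GTM VIP tends to rotate answers by its own algorithm; a DC-local
  # record stays stable, which is what gives ZPA a predictable target.
  for name in ("erp.gslb.corp.local", "erp.dc1.corp.local"):
      print(name, dict(answer_spread(name)))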

Best Practices to Fix It

  1. Start with Top Applications
    Use ZPA Analytics → Top Applications by Bandwidth to prioritize the biggest hitters.

  2. Create tight app segments
    Use explicit FQDNs and only the ports the app actually needs (e.g., 443, 1521, 3389). Avoid global VIP hostnames that hide load balancing behind a single name.

  3. Pin by location (principle of locality)
    DC1 apps → DC1 connector group. DC2 apps → DC2 connector group. If DC2 is DR only, leave it out of the steady-state assignment.

  4. Keep a temporary safety net
    A narrow wildcard tied to one small connector group for short-term migration/troubleshooting. Retire it once stable.

  5. Right-size health reporting
    Shorter intervals (60–120s) for critical apps, longer for low-priority ones. Split GUI/API/RDP segments if needed.

  6. Watch the two dials

    • Per-connector concurrent sessions

    • Per-connector health-check cycle time
      Add connectors or reduce scope if a group runs hot (a quick budget check against the ~6,000-check guidance follows this list).
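
To watch the second dial, it helps to total the planned checks per connector group before assigning segments. The sketch below is a rough planning aid; the segment list, IP counts, and group names are hypothetical, and it just applies the earlier formula against the ~6,000-check guidance.

  from collections import defaultdict

  CHECK_BUDGET = 6000   # ~6,000 checks per cycle per connector (guidance above)

  # Hypothetical plan: (segment, connector group, fqdns, avg IPs, static IPs, tcp ports, udp ports)
  plan = [
      ("erp-dc1",         "CG-DC1-Core", 1,  2, 0, 2,     0),
      ("oracle-dc1",      "CG-DC1-Core", 3,  2, 0, 1,     0),
      ("rdp-dc2",         "CG-DC2-Core", 1,  4, 0, 1,     0),
      ("legacy-wildcard", "CG-DC1-Core", 40, 2, 0, 65534, 0),
  ]

  per_group = defaultdict(int)
  for seg, group, fqdns, ips, statics, tcp, udp in plan:
      per_group[group] += (fqdns * ips + statics) * (tcp + udp)

  for group, checks in per_group.items():
      status = "over budget: split segments or add connectors" if checks > CHECK_BUDGET else "ok"
      print(f"{group}: {checks} checks per cycle ({status})")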

Migration Runbook

  1. Inventory & prioritize: Export Top Apps by Bandwidth. For each app, capture FQDNs, ports, home DC, and failover model.

  2. Map to DCs: Assign apps only to their home DC connector group.

  3. Define specific segments: Replace wildcards with app-specific FQDNs and ports.

  4. Assign cleanly: Only the connector group in the app’s home DC.

  5. Order policies: Put specific app segments above the legacy wildcard.

  6. Plan DR: If an app also lives in DC2 for DR, create a separate DC2 segment but keep it disabled until failover.

  7. Cut over in waves: Top 5 apps first, then the next 10–20. Validate health at each step (a simple wave-planning sketch follows this runbook).

  8. Observe & adjust: Right-size connector groups.

  9. Retire the wildcard: Disable the catch-all segment once migration is complete.

Pro tip: Keep a short-lived rollback wildcard segment disabled. If a cutover breaks, toggle it on, fix, then try again.
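
For step 1 and the wave cutover, a few lines of scripting can turn the exported report into a plan. The sketch below assumes a CSV export with App and BandwidthMB columns; the filename and column names are assumptions, so adjust them to match the actual export.

  import csv

  WAVE_SIZES = [5, 10, 20]   # top 5 first, then the next 10-20, per the runbook

  def plan_waves(csv_path):
      # Sort the export by bandwidth (descending) and chunk it into cutover waves.
      with open(csv_path, newline="") as f:
          rows = sorted(csv.DictReader(f),
                        key=lambda r: float(r["BandwidthMB"]), reverse=True)
      waves, start = [], 0
      for size in WAVE_SIZES:
          waves.append([r["App"] for r in rows[start:start + size]])
          start += size
      waves.append([r["App"] for r in rows[start:]])   # everything left, last
      return waves

  for i, wave in enumerate(plan_waves("top_apps_by_bandwidth.csv"), 1):
      print(f"Wave {i}: {wave}")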

Before vs After

Before: Wildcard Everywhere

User → ZPA → All Connectors Probe (Wildcard + All Ports) → Random DC / VIP → App

High connector CPU, long health-check cycles, unpredictable brokering.

After: Specific & Local

User → ZPA → DC1 Connector Group (Only DC1 app checks) → App (DC1)

Fewer checks, stable stickiness, predictable routing, easier troubleshooting.

FAQ

Q: If my wildcard allows “all ports except 53,” will ZPA test all those ports?
Yes. The connector probes every port you include in the app segment, regardless of which port the user actually needs.

Q: Can I keep F5 GTM in front of apps?
For non-ZPA clients, yes. For ZPA-brokered traffic, no: point the segment at the DC-local FQDN/IP to avoid double load balancing.

Q: What’s a healthy target?
Keep each connector to ~6,000 checks per cycle or fewer and ensure session concurrency stays comfortably below limits. Scale out by adding connectors.

Appendix: App Inventory Template

App / Owner | FQDN(s)            | Ports (TCP/UDP) | Home DC | Connector Group | Health Interval | DR / Failover Notes
ERP         | erp.dc1.corp.local | TCP 443, 1521   | DC1     | CG-DC1-Core     | 60s             | Enable DC2 segment during DR only
RDP Farm    | rdp.dc2.corp.local | TCP 3389        | DC2     | CG-DC2-Core     | 120s            | GUI only; no UDP 3389 required