Issue: PAN-216314
Firewalls play an essential role in network security, and like any other sophisticated technology, they sometimes encounter issues. Palo Alto Networks, one of the leading manufacturers of firewall appliances, had an issue identified as PAN-216314. This issue affected how the firewall handled session timeouts for certain traffic. Let’s break down the issue and how it can be fixed.
Understanding the Issue
Palo Alto firewalls have a DP (Dataplane) functionality that monitors and controls network traffic. One of the features of the DP is to keep track of how long network sessions last and how much data they transmit. This is especially crucial for sessions that are offloaded to a hardware engine for efficient processing.
In the context of this issue, the firewall had two mechanisms for registering these offloaded traffic counters: traffic-based and time-based.
However, after an upgrade, the time-based mechanism was inadvertently disabled. As a result, the DP no longer refreshed session timeout values for some hardware-offloaded sessions, and those sessions were closed prematurely as if they had been idle for too long, even though they were still carrying a small amount of traffic. The sessions affected were those with a low packet rate relative to their session timeout interval.
Digging into the technical details, on PA-3200 series firewalls, the hardware offload engine usually sends periodic stats to DP to enable refreshing session timeouts. However, starting from version 10.1.9, this periodic stats update was disabled. Instead, stats were sent only when a session accumulated approximately 508 packets. The problem with this approach is that if the session didn’t have “enough” packets within the session timeout interval, DP ended up closing these sessions.
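To make the arithmetic concrete, here is a minimal Python sketch of the packet-count condition described above. It is a simplified model, not the firewall's actual logic: it assumes a perfectly steady packet rate, treats the "approximately 508 packets" figure as exact, and assumes the session timer is refreshed only when a stats update arrives. The function names and the 3600-second timeout in the demo are assumptions chosen for illustration.

```python
# Simplified model of the behavior described in the article.
# Assumption: stats reach the DP only after ~508 packets (figure from the article).
STATS_PACKET_THRESHOLD = 508


def seconds_until_stats_update(packets_per_second, threshold=STATS_PACKET_THRESHOLD):
    """Time a steady flow needs to accumulate `threshold` packets."""
    if packets_per_second <= 0:
        return float("inf")  # a silent session never triggers an update
    return threshold / packets_per_second


def would_age_out(packets_per_second, session_timeout_s, threshold=STATS_PACKET_THRESHOLD):
    """True if the timeout expires before the first stats update would arrive."""
    return seconds_until_stats_update(packets_per_second, threshold) > session_timeout_s


if __name__ == "__main__":
    timeout_s = 3600  # assumed session timeout, for illustration only
    flows = [("one keepalive every 30 s", 1 / 30), ("one packet per second", 1.0)]
    for label, pps in flows:
        wait = seconds_until_stats_update(pps)
        verdict = "aged out" if would_age_out(pps, timeout_s) else "kept alive"
        print(f"{label}: about {wait:.0f} s to reach the threshold -> {verdict}")
```

Under these assumptions, any flow slower than roughly 508 packets per timeout interval never triggers a refresh in time, which is why quiet but still active sessions were the ones that got dropped.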
Diving Deeper: The Role of ctr_scan_dis Configuration
To truly understand the crux of the PAN-216314 issue, it’s important to delve into the nitty-gritty of a particular configuration setting – `ctr_scan_dis`.
In Palo Alto firewalls, `ctr_scan_dis` stands for ‘Control Scan Disable’. This is a switch that can have two values: 1 or 0. It controls whether the hardware offloading engine sends periodic statistics to the DP (Dataplane) for sessions that have been offloaded.
- When `ctr_scan_dis` is set to 1, it indicates that the offloading engine is **not** sending periodic statistics to the DP. In this scenario, the DP will only receive statistics when the session accumulates a certain number of packets (around 508 packets as per the default configuration). This can lead to problems if the session doesn’t collect enough packets within the session’s timeout interval. The DP might consider the session as aged-out and close it, even though traffic is still transmitted at a low rate.
- When `ctr_scan_dis` is set to 0, the offloading engine sends periodic statistics to the DP regardless of the number of packets in the session. This ensures that the DP effectively refreshes the session timeout values and does not close sessions prematurely.
In the context of issue PAN-216314, after an upgrade, the `ctr_scan_dis` value was inadvertently set to 1, which disabled the periodic stats updates. This was not the intended behavior and caused the aforementioned session timeout issues.
By setting `ctr_scan_dis` back to 0 through the workaround CLI command, we essentially re-enable the periodic statistics update, ensuring that the session timeout values are refreshed properly and sessions with minimal traffic are not aged-out prematurely.
This configuration is a small but crucial cog in the mechanism that ensures smooth network traffic handling by the firewall.
How to Fix the Issue
Thankfully, there’s a workaround as well as a permanent fix to this issue.
Temporary Workaround:
Administrators can run the following CLI (Command Line Interface) command on each HA (High Availability) node without causing any interruption in traffic:
debug dataplane internal pdt fe100 csr wr_sem_ctrl_ctr_scan_dis value 0
This command sets the control value (`ctr_scan_dis`) back to 0, which is the correct value, re-enabling periodic stats updates.
Permanent Fix:
Alternatively, engineers can upgrade the Palo Alto Firewall’s operating system (PAN-OS) to a version where the issue has been resolved. As of this writing, the following versions are available for upgrade:
- 10.1.9-h3
- 10.1.10 (preferred release)
- 10.2.4
- 11.0.1
Upgrading the PAN-OS to any of the above versions will permanently resolve this issue.
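If you manage several firewalls, a small script can help you compare their PAN-OS versions against the list above. The following Python sketch is only an illustration: the function names are made up for this post, the version format (`major.minor.maintenance` with an optional `-hN` hotfix suffix) is an assumption based on the version strings listed here, and the table covers only the release trains mentioned in this article.

```python
import re

# First fixed release per release train, as listed in this article.
# Stored as (major, minor, maintenance, hotfix).
FIRST_FIXED = {
    (10, 1): (10, 1, 9, 3),  # 10.1.9-h3
    (10, 2): (10, 2, 4, 0),  # 10.2.4
    (11, 0): (11, 0, 1, 0),  # 11.0.1
}


def parse_version(text):
    """Parse strings such as '10.1.9' or '10.1.9-h3' into a 4-tuple of ints."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-h(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, maintenance, hotfix = match.groups()
    return (int(major), int(minor), int(maintenance), int(hotfix or 0))


def includes_fix(text):
    """Return True/False for trains listed above, None for any other train."""
    version = parse_version(text)
    first_fixed = FIRST_FIXED.get(version[:2])
    if first_fixed is None:
        return None  # train not covered by this article; check the release notes
    return version >= first_fixed


if __name__ == "__main__":
    for candidate in ["10.1.9", "10.1.9-h3", "10.1.10", "10.2.3", "11.0.1", "9.1.14"]:
        print(candidate, "->", includes_fix(candidate))
```

Note that `includes_fix` only answers whether a release is at or above the first fixed release in its train. A `False` result for a release older than 10.1.9 does not mean the firewall is affected, since the article describes the behavior as starting in 10.1.9.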
In Conclusion
Network security is an ongoing process, and maintaining the devices that protect your network is crucial. The PAN-216314 issue serves as an example of how a small configuration change can have unintended consequences. Thankfully, with a clear understanding of the issue and the steps outlined above, administrators can easily resolve this problem and ensure the smooth operation of their Palo Alto firewalls. By understanding the intricacies of configurations such as `ctr_scan_dis`, network administrators can be better equipped to diagnose and resolve issues efficiently. Always ensure that your network devices are running the optimal configurations and updated to the latest firmware versions to safeguard against any potential issues.
Verify the Change
If you want to verify whether `ctr_scan_dis` is set to 0 or 1, run the command below.
Before:
admin@PA-3020> debug dataplane internal pdt fe100 csr rd name sem_ctrl
Reading csr sem_ctrl on fe100...
sem_ctrl register values:
----------------------------
[0] some_parameter_1 = 0x0
[1] some_parameter_2 = 0xA
[2] some_parameter_3 = 0x3F
...
[8] ctr_scan_dis = 0x1
...
[16] some_parameter_n = 0x5A
----------------------------
Total 17 entries displayed.
After:
admin@PA-3020> debug dataplane internal pdt fe100 csr rd name sem_ctrl
Reading csr sem_ctrl on fe100...
sem_ctrl register values:
----------------------------
[0] some_parameter_1 = 0x0
[1] some_parameter_2 = 0xA
[2] some_parameter_3 = 0x3F
...
[8] ctr_scan_dis = 0x0
...
[16] some_parameter_n = 0x5A
----------------------------
Total 17 entries displayed.
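If you capture this output in a script (for example, from a saved CLI session log), the value can be pulled out with a short regular expression. The Python sketch below assumes the output looks like the sample shown above, with a line of the form `ctr_scan_dis = 0x1`. Real output may be formatted differently, and the function name is hypothetical.

```python
import re


def read_ctr_scan_dis(cli_output):
    """Return the ctr_scan_dis value as an int, or None if the field is absent.

    Assumes a line such as '[8] ctr_scan_dis = 0x1', as in the sample above.
    """
    match = re.search(r"\bctr_scan_dis\s*=\s*(0[xX][0-9a-fA-F]+|\d+)", cli_output)
    if match is None:
        return None
    raw = match.group(1)
    # Accept both hexadecimal (0x1) and plain decimal (1) notations.
    return int(raw, 16) if raw.lower().startswith("0x") else int(raw, 10)


if __name__ == "__main__":
    sample = """sem_ctrl register values:
[0] some_parameter_1 = 0x0
[8] ctr_scan_dis = 0x1
"""
    value = read_ctr_scan_dis(sample)
    if value is None:
        print("ctr_scan_dis not found in output")
    elif value == 0:
        print("periodic stats updates are enabled")
    else:
        print("periodic stats updates are disabled; the workaround has not been applied")
```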