Out of order Traffic
Just like IP fragmentation, "Out of order Traffic" is not uncommon. For an IPS, the result is often that a threat goes undetected. Traffic is out of order when packets in a specific flow are received in a different order than expected - normally different from what the packet sequence number dictates. This sounds obvious - but it's often hard to figure out, and it often shows up as some weird performance issue on an IPS (in my experience, the device that suffers from this first).
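To make "different than the sequence number dictates" concrete, here is a minimal Python sketch. It assumes you already have the segments of one direction of one flow as (sequence number, payload length) pairs in arrival order; the function name and the sample numbers are made up for illustration, and sequence number wrap-around and retransmissions are deliberately ignored.

```python
from typing import List, Tuple

def find_out_of_order(segments: List[Tuple[int, int]]) -> List[int]:
    """Return the arrival indexes of segments that start below the
    highest byte position already seen in this flow direction.

    segments: (seq, payload_len) pairs in arrival order.
    Simplified: no wrap-around handling, no retransmission detection.
    """
    late = []
    next_expected = None
    for index, (seq, length) in enumerate(segments):
        if next_expected is not None and seq < next_expected:
            late.append(index)
        end = seq + length
        if next_expected is None or end > next_expected:
            next_expected = end
    return late

# Hypothetical arrival order: the segment starting at 1200 shows up
# after the one starting at 1300.
arrivals = [(1000, 100), (1100, 100), (1300, 100), (1200, 100), (1400, 100)]
print(find_out_of_order(arrivals))  # [3]
```

A real analyzer has to tell late segments apart from retransmissions, which this sketch does not attempt.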
The packets in this flow must first be re-assembled - after that they can be matched and checked against filters or algorithms. This re-assembly is "expensive" in terms of computing power. If it's some kind of cloud application it'll most likely scale a long way, but also cost a fortune. If it's done on "traditional hardware" you're going to approach the limits of the device very fast. And exactly this is what I have had to deal with more often than I can count.
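Why re-assembly costs resources can be shown with a small Python sketch. It is not how any particular vendor implements it, just the general idea: segments that arrive early have to be held in memory until the gap in front of them is filled. The function name and sample data are illustrative, and overlapping segments are not handled.

```python
from typing import Dict, List, Tuple

def reassemble(segments: List[Tuple[int, bytes]], first_seq: int) -> Tuple[bytes, int]:
    """Rebuild the byte stream of one flow direction.

    Returns (stream, peak), where peak is the largest number of
    segments that had to be buffered while waiting for a gap to fill.
    """
    pending: Dict[int, bytes] = {}
    stream = bytearray()
    next_seq = first_seq
    peak = 0
    for seq, payload in segments:
        if seq < next_seq:
            continue  # already delivered; treated as a duplicate here
        pending[seq] = payload
        # Deliver everything that has become contiguous.
        while next_seq in pending:
            data = pending.pop(next_seq)
            stream += data
            next_seq += len(data)
        peak = max(peak, len(pending))
    return bytes(stream), peak

in_order = [(0, b"GET "), (4, b"/ind"), (8, b"ex")]
shuffled = [(8, b"ex"), (4, b"/ind"), (0, b"GET ")]
print(reassemble(in_order, 0))   # (b'GET /index', 0)
print(reassemble(shuffled, 0))   # (b'GET /index', 2)
```

Both inputs produce the same stream, but the shuffled one needs buffer space, and an inspection engine cannot match the payload until the buffer drains. Multiply that by every flow on a busy link and the load adds up.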
Re-assembly is done somewhat differently on the various vendor devices - and then maybe even differently yet again in the cloud applications - but maybe not so different after all. On a Tipping Point IPS for example, this is done in the so-called "L threat". The "L threat" is a layer that would more commonly be called "deep inspection" and is essentially a complex type of filter matching. It rips apart the TCP packets, matches them, and sticks them back together unharmed. (To an extent you can remove malicious behavior, or simply block it.) How to monitor this depends on the type of device, but in all variations I'm aware of this specific task can be monitored, since it's one of the major duties of these devices.
It will not always be possible to avoid "Out of order Traffic", but for the sake of everything in line it should be avoided if at all possible. Not only IPSs "suffer" - everything else in line does as well; IPSs just suffer the most obviously. For an IPS, setting it up in another place within the network could help. Obviously this needs to be carefully decided and has its limits; after all, you want to monitor as much traffic as possible. Commonly, this happens if the IPS sits outside a firewall, a loadbalancer or similar. It may not be a problem overall, but should be considered for change if possible. Every day we have more traffic in our networks, and you'll want to squeeze as much out of it as possible. The "reroute" stats on the IPS can help to determine the percentage of packets that are out of order and require re-assembly. With this, you can make an educated guess about how much of the load comes from re-assembly.
Out of Order Traffic details (packet sequence)
If you look at this traffic capture, you will see that the SEQ (sequence) number is not what we would expect to see. It is possible that frames are forwarded earlier than expected by certain devices, or that the packets have traveled over a different route. If an IPS has a set limit on how long re-assembly is allowed to take, it may hand off the traffic delayed and uninspected, adding even more to the chaos.
To properly figure out why this is happening, it is normally necessary to capture traffic at both the sender and the recipient and compare the captures. This may sound obvious and easy - but in reality it's not. The more "official" or "government" a network becomes, the more difficult this will get. Even as a local admin you're going to have to crawl through a whole bunch of request papers to get this approved. But - it will very likely give a reliable and fast answer on the root cause and on whether it's normal or not in this particular environment.
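Comparing the two captures can be sketched in Python as follows. The sketch assumes each capture has already been reduced to a list of per-packet keys for a single flow, here (IP ID, TCP sequence number) pairs in the order they were seen; that key choice is an assumption and will not hold if a device between the capture points rewrites the IP ID. The function name and sample values are illustrative.

```python
from typing import Hashable, List, Tuple

def compare_captures(
    sender: List[Hashable], receiver: List[Hashable]
) -> Tuple[List[Hashable], List[Hashable]]:
    """Compare packet order at two capture points for one flow.

    Returns (reordered, missing):
      reordered - packets that reached the receiver after a packet
                  the sender had transmitted later
      missing   - packets seen at the sender but not at the receiver
    """
    position = {key: i for i, key in enumerate(sender)}
    reordered = []
    highest = -1
    for key in receiver:
        pos = position.get(key)
        if pos is None:
            continue  # not in the sender capture (capture gap, rewritten header, ...)
        if pos < highest:
            reordered.append(key)
        highest = max(highest, pos)
    seen = set(receiver)
    missing = [key for key in sender if key not in seen]
    return reordered, missing

# Hypothetical (ip.id, tcp.seq) keys for one flow.
sent = [(1, 1000), (2, 1100), (3, 1200), (4, 1300)]
received = [(1, 1000), (3, 1200), (2, 1100)]
print(compare_captures(sent, received))
# ([(2, 1100)], [(4, 1300)])
```

If the sender capture is in order and the receiver capture is not, the re-ordering happened somewhere on the path between the two capture points.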
If there are multiple paths between source and destination this can be normal, but from the perspective of an IPS it might need to be re-located in the path to make sure its functionality is not impacted. (A loadbalancer could cause this.)
Packet captures that are created directly on a firewall or loadbalancer are often out of order. This is very likely ok - as long as it's only on this device.
In Wireshark you might be able to put things "in order" by adding the IP ID (ip.id) and / or SEQ (tcp.seq) columns and sorting on them.
Also have a look at "Re-order traffic for analysis". If you really question the IPS, it should be relatively easy to get a capture "before and / or after the IPS" to isolate whether it might be happening on the IPS for whatever reason. It's important to understand how these captures have been created. In my experience, the root cause is never an IPS.
If you look at packet-captures in Wireshark to figure out what's going on, you need to make sure you understand exactly what you're looking at. Wireshark is great, but can lead to false conclusions. In some cases re-transmissions can be marked as out of order - which might not be correct. You must understand where the captures are taken and what's in line between the captures. All of it. The capture point might be the only spot where "this" is the case.
I'm stressing this fact since more often than not, the environment is built differently than expected. I have seen so many cases where the admin-team was unaware of some ancient router somewhere. (Wireshark and understanding the TCP protocol can help find these devices to an extent.)
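The reason a re-transmission and an out of order segment can be confused is that both look the same on the wire: a segment whose sequence number is lower than what has already been seen. An analyzer has to guess from timing. The Python sketch below is loosely modeled on the heuristic described in the Wireshark User's Guide (a late segment counts as out of order only if it arrives within a short threshold, otherwise as a re-transmission). The 3 ms default and the function itself are simplifications and assumptions, not Wireshark's actual code.

```python
# Assumption: threshold used when no initial RTT is known for the flow.
OOO_THRESHOLD_SECONDS = 0.003

def classify_segment(
    seq: int,
    next_expected_seq: int,
    seconds_since_previous_segment: float,
    threshold: float = OOO_THRESHOLD_SECONDS,
) -> str:
    """Guess how an analyzer might label a segment (simplified)."""
    if seq >= next_expected_seq:
        return "in order"
    # The segment starts below data already seen. Timing is the only hint:
    # arriving almost immediately suggests it merely took a slower path,
    # arriving much later suggests the sender re-sent it.
    if seconds_since_previous_segment <= threshold:
        return "out of order"
    return "retransmission"

print(classify_segment(1200, 1400, 0.001))  # out of order
print(classify_segment(1200, 1400, 0.200))  # retransmission
```

The same segment gets a different label depending only on the time delta, and the time delta depends on where the capture was taken. That is why the capture point matters so much.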
Wireshark and out of order traffic
Wireshark · Display Filter Reference: Transmission Control Protocol
Very useful display filters:
tcp.analysis.out_of_order
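The same display filter also works outside the GUI. As a small sketch, the Python snippet below only builds a tshark command line that applies the filter and prints a few fields per matching frame; it does not run anything. It assumes tshark is installed, and "capture.pcap" is a placeholder file name.

```python
import shlex
from typing import List

def tshark_out_of_order_cmd(pcap_path: str) -> List[str]:
    """Build a tshark command listing frames flagged as out of order."""
    return [
        "tshark",
        "-r", pcap_path,                    # read from a capture file
        "-Y", "tcp.analysis.out_of_order",  # display filter
        "-T", "fields",                     # print selected fields only
        "-e", "frame.number",
        "-e", "tcp.stream",
        "-e", "ip.id",
        "-e", "tcp.seq",
    ]

# Print the command so it can be reviewed before running it,
# for example with subprocess.run(cmd, check=True).
print(shlex.join(tshark_out_of_order_cmd("capture.pcap")))
```

Grouping the output by tcp.stream shows quickly whether the problem affects all flows or only a few.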
"A duplicate ACK is a TCP packet sent from a recipient when that recipient receives packets that are out of order. TCP uses the sequence and acknowledgment number elds within its header to reliably ensure that data is received and reassembled in the same order in which it was sent.”
Excerpt From: Chris Sanders, "Practical Packet Analysis, 3rd Edition".
Since we know this now, how can we "fix" this?
Re-order traffic for analysis
Reordercap
A command line tool (part of Wireshark) exists - it's called "reordercap". It's not magic (well, it kind of almost is), but it is very good at its job, which is re-ordering the frames in a capture file by timestamp.
reordercap [options] <infile> <outfile>
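Conceptually, re-ordering by timestamp is just a stable sort on the capture timestamp. The Python sketch below illustrates that idea on made-up (timestamp, label) pairs; it is not how reordercap is implemented and it does not read real capture files.

```python
from typing import List, Tuple

Frame = Tuple[float, str]  # (capture timestamp in seconds, label)

def reorder_by_timestamp(frames: List[Frame]) -> Tuple[List[Frame], int]:
    """Sort frames by timestamp.

    Returns (ordered_frames, moved), where moved is the number of
    positions whose content changed compared to the input.
    """
    # sorted() is stable: frames with identical timestamps keep
    # their original relative order.
    ordered = sorted(frames, key=lambda frame: frame[0])
    moved = sum(1 for before, after in zip(frames, ordered) if before != after)
    return ordered, moved

file_order = [(1.0, "A"), (3.0, "C"), (2.0, "B"), (4.0, "D")]
print(reorder_by_timestamp(file_order))
# ([(1.0, 'A'), (2.0, 'B'), (3.0, 'C'), (4.0, 'D')], 2)
```

Keep in mind what this can and cannot fix: it repairs a file in which frames were written out of time order (a typical result of capturing on a firewall or loadbalancer). It does not change the order in which TCP segments actually arrived at that capture point.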
Common root causes / root cause applications
Gigamon (traffic monitoring), F5 (Security Applications), Loadbalancer, Traffic Shaper, LACP, Ether-Channel, Trunking (not to be confused with Cisco "trunking" to denote a tagged vlan), Link Aggregation.
Possible solutions are not dissimilar to those for fragmentation, which is also discussed.
Possible solutions / mitigations
Reconfigure the device shaping traffic to maintain flow affinity. The configuration change is specific to each vendor. For example, Cisco’s ether-channel setup can balance based on source/destination IP, and that setting prevents flows from being broken up.
Once you have confirmed that the network is not "spraying" packets across ports round-robin style with no regard for flows, the next step is to obtain a packet capture of the traffic from a network appliance other than the IPS. Review these PCAPs to see which traffic is out of order. From there a decision can be made on how to mitigate the issue. When reviewing the PCAPs, look for excessive TCP reassembly from specific IP addresses, unusual VLAN tags for that segment, TCP Selective Acknowledgement (SACK) options turned off, etc.
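The difference between flow affinity and round-robin "spraying" can be shown with a short Python sketch. The CRC32 hash over source and destination IP is purely an illustration; real devices use their own vendor-specific hash and inputs, and all names and addresses here are made up.

```python
import zlib
from itertools import count

def pick_link_by_flow(src_ip: str, dst_ip: str, link_count: int) -> int:
    """Flow affinity: the same src/dst pair always maps to the same link."""
    key = f"{src_ip}>{dst_ip}".encode()
    return zlib.crc32(key) % link_count

def pick_link_round_robin(packet_counter, link_count: int) -> int:
    """Spraying: each packet goes to the next link, regardless of flow."""
    return next(packet_counter) % link_count

# Eight packets of one flow over a 4-link bundle.
by_flow = {pick_link_by_flow("10.0.0.1", "10.0.0.2", 4) for _ in range(8)}
counter = count()
sprayed = {pick_link_round_robin(counter, 4) for _ in range(8)}

print(len(by_flow))  # 1 - every packet of the flow uses the same link
print(len(sprayed))  # 4 - the flow is spread over all links
```

With per-flow hashing, all packets of a flow queue behind each other on one link and stay in order. With spraying, they race each other over links with different queue depths, which is exactly how they end up out of order.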
Most of the time, out of order traffic originates somewhere other than the IPS. In my experience it's rarely caused by it. Consult with your whole network team and consider design changes to the network or the IPS. This is not only about the IPS; the issue is there either way, and in the long run you do want to get rid of it if at all possible.
Additional causes in the environment for out of order packets could be:
- Lost packets at the network level due to packet collisions
- Cable damage
- Faulty hardware
- Lost packets caused at the application level, such as application or server latency, window sizing issues and disabled TCP selective acknowledgements.
- Network traffic prioritization policies or hardware such as QoS or traffic shaping appliances.
It is also completely possible that the out of order traffic is caused by the Internet / ISP and it's totally out of the hands of IT admins, the networking team or anyone in the company.
Another option is to move the IPS away from the perimeter, possibly behind the firewall rather than in front of it.