Rick Hofstede, Flow Export & Analysis Expert, RedSocks B.V.
The IPFIX protocol moved to the Standards Track of the Internet Engineering Task Force (IETF) in September 2013 and the number of flow exporters supporting IPFIX is growing rapidly. So is the number of fields that can be exported using NetFlow or IPFIX. Besides the traditional fields that were already available in the NetFlow era, many more fields are being added by vendors for exporting all sorts of information, such as HTTP hostnames and URLs. In this blog post, I will elaborate on the variety of fields that can be exported, and why some fields may not satisfy the requirements of an (analysis) appliance in general and the RedSocks Malicious Threat Detector (MTD) in particular.
Flow exporters are nowadays widely deployed in enterprise, access and backbone networks. They can typically be found as part of a packet forwarding device, such as a router or switch, or in the form of a dedicated flow export device, commonly referred to as a ‘probe’. These devices aggregate packets into flows and export the resulting flow records, using NetFlow or IPFIX, to flow collectors for storage and analysis. In many cases, flow collectors are integrated in a data analysis appliance, such as the RedSocks MTD.
The exported flow data consists of various fields – Information Elements (IEs) in IPFIX jargon – that describe the properties of a flow. The most common fields are source and destination IP addresses, port numbers and the L4 protocol. It is up to a flow exporter to inform flow collectors about the structure of exported flow records, such that the flow collector knows exactly what to expect and how to process them. This is done by means of so-called templates.
Although it is often believed that IEs in NetFlow and IPFIX are well-defined and understood, we see in practice that misunderstandings and confusion are common when it comes to NetFlow and IPFIX compatibility. In this blog post, we explain why this is the case and provide several examples.
Every field (or IE) features two IDs in NetFlow and IPFIX:
Enterprise ID: used to indicate whether the field is a ‘global’ field, registered by IANA, or a ‘local’ field, defined within the scope of an enterprise/organization. The latter type is also commonly referred to as ‘enterprise-specific’. IANA uses enterprise ID 0 (zero), while others should use their IANA-assigned Private Enterprise Number (PEN). For example, the enterprise ID used for RedSocks-specific fields is 44913, the RedSocks PEN.
Field ID: used to identify a specific field.
The combination of these two IDs uniquely identifies a field, and is used in templates to announce which fields a collector should expect. In addition, a template contains a length specifier for every field, or a reserved constant (65535 in IPFIX) to indicate a variable-length field. Some examples of common fields and their IDs:
Enterprise ID 0, field ID 7: source port
Enterprise ID 0, field ID 8: source IPv4 address
Enterprise ID 0, field ID 150: flow start time, in seconds
Enterprise ID 0, field ID 152: flow start time, in milliseconds
Enterprise ID 35632, field ID 197: HTTP hostname (nTop, enterprise-specific)
Enterprise ID 39499, field ID 1: HTTP hostname (INVEA-TECH/FlowMon Networks, enterprise-specific)
Enterprise ID 44913, field ID 20: HTTP hostname (RedSocks, enterprise-specific)
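To make the role of the two IDs in a template more concrete, here is a minimal Python sketch that decodes IPFIX field specifiers as laid out in RFC 7011: the top bit of the first 16-bit word signals an enterprise-specific field, the remaining 15 bits carry the field ID, the next 16 bits carry the length (65535 meaning variable length), and a 32-bit enterprise ID follows only when the enterprise bit is set. The function name `parse_field_specifiers` and the lookup table `KNOWN_FIELDS` are hypothetical and only mirror the examples listed above; this is not how the RedSocks MTD is implemented, and NetFlow v9 templates are not covered.

```python
import struct

VARIABLE_LENGTH = 0xFFFF  # RFC 7011: a field length of 65535 marks a variable-length field

# Hypothetical lookup table, filled with the example fields from this post only.
KNOWN_FIELDS = {
    (0, 7): "source port",
    (0, 8): "source IPv4 address",
    (0, 150): "flow start time (seconds)",
    (0, 152): "flow start time (milliseconds)",
    (35632, 197): "HTTP hostname (nTop)",
    (39499, 1): "HTTP hostname (INVEA-TECH/FlowMon Networks)",
    (44913, 20): "HTTP hostname (RedSocks)",
}


def parse_field_specifiers(data, field_count):
    """Decode `field_count` IPFIX field specifiers from the bytes in `data`."""
    fields = []
    offset = 0
    for _ in range(field_count):
        raw_id, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        enterprise_id = 0  # enterprise ID 0 means an IANA-registered field
        if raw_id & 0x8000:  # enterprise bit set: a 32-bit enterprise ID follows
            (enterprise_id,) = struct.unpack_from("!I", data, offset)
            offset += 4
        field_id = raw_id & 0x7FFF
        fields.append({
            "enterprise_id": enterprise_id,
            "field_id": field_id,
            "length": None if length == VARIABLE_LENGTH else length,
            # Fields missing from the table stay "unknown" and cannot be analyzed.
            "name": KNOWN_FIELDS.get((enterprise_id, field_id), "unknown"),
        })
    return fields


# Example template body: source port, source IPv4 address, RedSocks HTTP hostname.
example_template = (
    struct.pack("!HH", 7, 2)
    + struct.pack("!HH", 8, 4)
    + struct.pack("!HHI", 0x8000 | 20, VARIABLE_LENGTH, 44913)
)
example_fields = parse_field_specifiers(example_template, 3)
```

A field whose ID pair is not in the table is still parsed correctly, because its length is known from the template, but it ends up labeled "unknown" and carries no meaning for the analysis.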
The fact that data is exported using NetFlow or IPFIX says nothing about (a) the presence of specific fields in a set of flow data, or (b) the semantics of the exported fields. It is up to the implementer of a flow export device to decide which fields are actually supported and exported. For example, although flow records commonly feature a start and end time, this is not mandatory. And if timestamps are included, the vendor decides which timestamp resolutions are supported. The IANA IPFIX IE registry features several definitions of start time fields, one per resolution: seconds, milliseconds, nanoseconds, etc.
As a vendor of a flow data analysis appliance, the MTD, we face many pitfalls when analyzing flow data that comes from various sources, in many different configurations, and from export devices of many different vendors. We will discuss a few of them here, mostly related to IPFIX IEs.
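As an illustration of what this means for a collector, the following Python sketch looks for a flow start time without assuming that any particular field is present. It assumes, purely for the example, that a decoded flow record is a dictionary keyed by (enterprise ID, field ID) pairs with integer values; the function name `flow_start_ms` is hypothetical. Only the two start time fields mentioned earlier in this post (field IDs 150 and 152) are handled.

```python
FLOW_START_SECONDS = (0, 150)       # IANA: flow start time, in seconds
FLOW_START_MILLISECONDS = (0, 152)  # IANA: flow start time, in milliseconds


def flow_start_ms(record):
    """Return (start time in ms, name of the resolution used), or None.

    `record` is assumed to map (enterprise_id, field_id) to a decoded integer.
    """
    if FLOW_START_MILLISECONDS in record:
        return record[FLOW_START_MILLISECONDS], "milliseconds"
    if FLOW_START_SECONDS in record:
        # Scale to milliseconds; the value still only has one-second resolution.
        return record[FLOW_START_SECONDS] * 1000, "seconds"
    return None  # start times are not mandatory, so the caller must cope with this
```

The returned label reflects which field was exported, not how precise the exporter's clock actually is.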
- You can only find data you are looking for: Reading flow data is done using (enterprise and) field IDs, where the field IDs can be considered the ‘primary key’ of a column. If you don’t know which data you’re looking for or the field ID, you can’t find the data. As such, if the RedSocks MTD does not know a particular column, it can never include this column in its analysis. And more concretely, in the context of HTTP hostnames: While the RedSocks MTD supports HTTP hostname fields of RedSocks, INVEA-TECH/FlowMon Networks and nTop, it doesn’t support any other fields until those other fields are explicitly added.
- Different field semantics: Although a specific field may be supported/exported by a flow exporter and known to the analysis software/appliance, we know from experience that its implementation may be different from its specification. Here we list several examples.
- HTTP hostname: There are many different ways to export HTTP hostnames in flow data. Some vendors prepend the protocol specification (e.g., ‘http://’), others don’t. Some vendors include the path (e.g., ‘/page/net.html’), others don’t. Some vendors split HTTP hostname and path over two fields, others don’t. Hence, many different combinations are possible, so even if a field ID is known to the analysis software/appliance, the appliance also needs to know the exact export format to be able to handle the data. For example, assume that one of our heuristics aims at finding hostnames in flow data that start with a digit (0-9). If hostnames are exported without a protocol specification prepended to them, we can check right away whether the first character of the hostname is a digit. Otherwise, we first have to strip the protocol specification to be able to find a digit.
- Timestamps: IPFIX supports timestamps with various resolutions, and every resolution is represented by its own field. What is often not realized is that even though a flow exporter supports millisecond-resolution fields, the timestamps in those fields do not necessarily have that resolution. The achievable resolution is often hardware-bound, and the timestamp granularity of software implementations is very limited. Care should therefore be taken not to overrate the precision of timestamp fields.
- TCP flags: Many older flow exporters that are integrated in packet forwarding devices, and that perform packet forwarding mostly in hardware, are not able to export TCP flags for hardware-switched flows. On a typical campus network, this was found to affect 99.6% of all TCP flows. This is a striking example of an exported field for which the semantics are completely different from the specification.
- Field lengths: The RedSocks MTD features powerful heuristics for analyzing suspicious hostnames in flow data (which will later be extended to HTTP user agent strings too). However, if the exported hostnames are truncated, e.g., because the flow exporter uses fields with a fixed length of 20 characters, the relevant portions of a hostname may be missing, which prevents the MTD from performing its analysis. This is why the RedSocks Probe exports as many fields as possible with a variable length, to ensure that strings are not truncated.
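The HTTP hostname and field length pitfalls above can be sketched in a few lines of Python. The sketch assumes the exported value has already been decoded to a string; the function names `normalize_hostname`, `starts_with_digit` and `maybe_truncated` are hypothetical and do not describe the actual MTD heuristics. The truncation check is only a rough indicator: a value that fills a fixed-length field completely may or may not have been cut off.

```python
import re

# Matches a leading protocol specification such as 'http://' or 'https://'.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def normalize_hostname(value):
    """Strip an optional protocol specification and path from an exported value."""
    host = _SCHEME.sub("", value.strip(), count=1)
    host = host.split("/", 1)[0]  # drop the path if the exporter included one
    return host.lower()


def starts_with_digit(value):
    """Example heuristic: does the hostname itself start with a digit (0-9)?"""
    host = normalize_hostname(value)
    return len(host) > 0 and host[0] in "0123456789"


def maybe_truncated(value, fixed_length):
    """Flag values that fill a fixed-length field entirely.

    `fixed_length` is None for variable-length fields, which are never flagged.
    """
    return fixed_length is not None and len(value) >= fixed_length
```

With this normalization in place, ‘http://4example.com/page/net.html’ and ‘4example.com’ are treated identically by the digit heuristic, regardless of which export format a vendor chose.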
NetFlow and IPFIX are nothing more than wire formats, meaning that support for these protocols only indicates that a device speaks the ‘language’. Which fields are actually exported, and more importantly, the semantics of those fields, are therefore up to the vendor of a flow export device. So is your flow exporter compatible with the RedSocks MTD? When it comes to generic fields, i.e., typically fields that are standardized by IANA (enterprise ID 0), there is a very good chance that you are good to go. However, when it comes to enterprise-specific fields, i.e., typically those related to application-layer visibility, it is wise to consult RedSocks. To ensure optimal compatibility with the RedSocks MTD, we highly recommend the use of our secure flow exporter, the RedSocks Probe.