Solution provider takeaway: User-agent strings, elements of HTTP headers, can be used as a network monitoring tool to reveal information about client networks without affecting network performance or privacy. Learn how to benefit from user-agent strings in this tip by Richard Bejtlich.
Clients of network services often want to know more about their network. In this edition of Traffic Talk, I will demonstrate how user-agent strings can be used as a network monitoring tool to reveal an enormous amount of information with little or no impact on network performance or privacy.
A user-agent string is an element of an HTTP header sent by HTTP clients such as Web browsers. The following HTTP request includes a user-agent string from a Windows XP SP3 system running Firefox, talking to a Squid proxy server.
The user-agent string displays a lot of interesting information that can be used to identify the version of the operating system and application that made the request.
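To make the idea concrete, here is a minimal Python sketch that pulls the User-Agent header out of a raw HTTP request. The function name and the sample request are hypothetical and written only for illustration; they are not taken from the traffic discussed in this tip.

```python
from typing import Optional


def extract_user_agent(raw_request: str) -> Optional[str]:
    """Return the User-Agent header value from a raw HTTP request, or None."""
    # Skip the request line, then stop at the blank line that ends the headers.
    for line in raw_request.splitlines()[1:]:
        if not line.strip():
            break
        name, sep, value = line.partition(":")
        # Header names are case-insensitive, so compare in lowercase.
        if sep and name.strip().lower() == "user-agent":
            return value.strip()
    return None


# Hypothetical request, for illustration only.
sample_request = (
    "GET http://www.example.com/ HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: ExampleBrowser/1.0 (ExampleOS 5.1)\r\n"
    "Accept: */*\r\n"
    "\r\n"
)

print(extract_user_agent(sample_request))
```

Real user-agent values are free-form text, so anything beyond extracting the header (for example, splitting out operating system and application versions) would need rules tuned to the clients on your own network.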
Collecting user-agent strings
Network administrators can collect user-agent strings in two ways. The first is to extract them from proxy logs. For example, a Squid proxy log might contain an entry like the following:
The triple colons ( ::: ) were added intentionally, for reasons that will appear next. The entry in the squid.conf file used to generate this log format is the following:
The second way to gather user-agent strings is to examine network traffic, perhaps using a tool like Httpry. I explained this method in the tip "Network security monitoring using transaction data."
Once you have logs, what can you do with them? Consider the following command that examines Squid proxy logs, extracts the source IP addresses and user-agents, counts unique appearances, and sorts them.
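The same steps can also be sketched in Python, which may be handy where shell tools are not available. This is an illustration under assumptions, not the command used to produce the output in this tip: the function names are hypothetical, the sample lines are invented, and I assume the source IP address is the first triple-colon-separated field and the user-agent is the second. Adjust the field positions to match your own log format.

```python
from collections import Counter

DELIMITER = ":::"  # the triple-colon separator described in this tip


def count_agents(lines, ip_field=0, agent_field=1):
    """Count how often each (source IP, user-agent) pair appears."""
    counts = Counter()
    needed = max(ip_field, agent_field)
    for line in lines:
        fields = line.rstrip("\n").split(DELIMITER)
        if len(fields) <= needed:
            continue  # skip lines that lack the expected fields
        counts[(fields[ip_field].strip(), fields[agent_field].strip())] += 1
    return counts


def sorted_report(counts):
    """Return (ip, agent, count) tuples sorted by IP text, then user-agent."""
    return sorted((ip, agent, n) for (ip, agent), n in counts.items())


# Hypothetical log lines; the real field order depends on squid.conf.
sample_log = [
    "192.0.2.10:::ExampleUpdater/2.0:::http://updates.example.com/a",
    "192.0.2.10:::ExampleUpdater/2.0:::http://updates.example.com/b",
    "192.0.2.11:::ExampleBrowser/1.0:::http://www.example.com/",
]

for ip, agent, n in sorted_report(count_agents(sample_log)):
    print(n, ip, agent)
```

To run this against a real log, pass an open file object instead of the sample list, since iterating over a file yields one line at a time.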
Here we see the field separator (FS) is set to triple colons. In my experience, "traditional" field separators like commas or pipes appear too frequently in HTTP requests to be useful for logging, but you are free to use whatever separator you like. An excerpt of the output of running a command like this on a small live network appears next. I describe a few interesting elements of each group of entries after they are listed.
The three entries above show 192.168.2.104 has been updating its AVG antivirus product.
Now we see a different PC running AVG. It has a different LIC (license) key. Google searches for both keys reveal they are not unique to these systems.
The Adobe program is interesting because it must have checked local proxy settings to do its update. The "Client" entry is extremely interesting because it appears only once.
We can search the proxy logs for that entry:
We can see the system accessed g.microsoft.com at IP address 207.46.216.54 (which falls within Microsoft's 207.46.0.0/16 netblock). So this appears to be related to a Microsoft application.
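Both checks, searching the log for one user-agent and confirming that an address falls inside a netblock, can be sketched in Python. The function names and the sample log lines are hypothetical, and I again assume the user-agent is the second triple-colon-separated field. Only the IP address and netblock in the last line come from this tip.

```python
import ipaddress


def find_entries(lines, agent, agent_field=1, delimiter=":::"):
    """Return log lines whose user-agent field exactly matches `agent`."""
    matches = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) > agent_field and fields[agent_field].strip() == agent:
            matches.append(line.rstrip("\n"))
    return matches


def in_netblock(address, netblock):
    """Return True if `address` falls inside the CIDR range `netblock`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(netblock)


# Hypothetical log lines, for illustration only.
sample_log = [
    "192.0.2.10:::ExampleUpdater/2.0:::http://updates.example.com/a",
    "192.0.2.11:::Client:::http://service.example.com/check",
]

print(find_entries(sample_log, "Client"))
print(in_netblock("207.46.216.54", "207.46.0.0/16"))  # True
```

Note that the netblock check is only arithmetic. It says nothing about who owns the range, which still has to be confirmed with a WHOIS lookup or a similar source.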
This entry is also obscure.
Checking the logs, we see another Microsoft application, perhaps related to Dr. Watson and Windows Defender.
These Python entries are probably not caused by a Windows application. Checking the logs, we see they are used by Ubuntu.
As you can see, you can learn a lot about a network simply by looking at user-agent strings. The very simple network used to generate the logs for this story offered more than 60 different entries for analysis, but I displayed only nine for the sake of brevity. User-agent string mining can be used passively to identify and track applications and systems, for both inventory and security purposes. Consider ways you can use user-agent strings for network monitoring when working with client networks!
Richard Bejtlich is director of incident response for General Electric and author of the TaoSecurity blog.