Monday, October 27, 2014

Just to check

So now we're being reminded that by default our ISP has the ability to both view and modify the traffic we entrust to them.

http://webpolicy.org/2014/10/24/how-verizons-advertising-header-works/

No big shock, but it's still disappointing (at least to me).  The obvious solution is to use a VPN or other encryption (always SSL?) whenever possible, to prevent tampering with your traffic.

Since it's likely that other providers are doing something similar, or will decide to at some time in the future, I decided I wanted an easy way to check for HTTP header tampering.

The result is this script, which simply prints what Apache is able to deduce about a connection.  There must be 1 or 2 million sites which do the same sort of thing, but I wanted one that I control!  And of course, my wheel is rounder than anyone else's.  :-)

There's actually a bit of code which highlights the Verizon Universal ID hash if it's present.  I'll update this if I find out how other providers are also doing this sort of thing  (done ... see below.)

You can try it out at: http://www.sekur1ty.org/foo/bar.pl

(BTW, if you're checking your phone connection, make sure you're not using a Wi-Fi connection when you try this.)

Here's the code:


 

#!/usr/bin/perl -w -t

# Simple CGI program to return what Apache reports about the client and the HTTP request received.

print "Content-type: text/html\n\n";
print "Here is everything that I know about you ...<br><br>";

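$DoLog = 0;
# Log only when asked to (?log=yes); QUERY_STRING may be unset, so default it to ""
if (($ENV{"QUERY_STRING"} // "") =~ /\s*log\s*=\s*yes/i){
    # Append to the log; if it can't be opened, carry on without logging
    if (open(LOG, ">>", "/tmp/bar.log")){
        $DoLog = 1;
        @ltime = localtime(time);   # e.g. Wed Oct 29 01:31:47 UTC 2014
        # localtime() returns the month as 0-11, so add 1 to get the calendar month
        $Time = sprintf ("%02i/%02i/%04i %02i:%02i:%02i",$ltime[4]+1,$ltime[3],$ltime[5]+1900,$ltime[2],$ltime[1],$ltime[0]);
        print LOG "----------------------------------------\n";
        print LOG "$Time\n";
        print "<b>You have requested the uber secret log option. This will save a record of this connection info.</b><br><br>";
    }
}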

print "This is info not directly in the HTTP header, some comes directly from the server (e.g. SERVER_ADDR) and some is derived from your connection (such as REMOTE_ADDR)<br><br>";
# Print the non-header info first, followed by the HTTP header info
print "<code>";
foreach my $key (keys %ENV) {
    if ($key !~ /HTTP_/){
        &do_print ($key);
    }
}

print "</code>";
print "<br>This is what was found in the HTTP header of the request. <i>(Note: Verizon, ACR and other potential tracking hashes highlighted if present.)</i><br><br>";

print "<code>";
foreach my $key (keys %ENV) {
    if ($key =~ /HTTP_/){
        &do_print ($key);
    }
}
print "</code>";

sub do_print {   # HTML-escape the value and print it, highlighting likely tracking headers
    my ($key) = @_;
    my $value = $ENV{$key};

    if ($DoLog){
        print LOG "$key: $value\n";
    }

    # Escape & first so the entities added below aren't escaped a second time
    $value =~ s/\&/&amp;/g;
    $value =~ s/>/&gt;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/\"/&quot;/g;
    $value =~ s/\'/&#39;/g;     # numeric form, since &apos; is not defined in HTML 4
    $value =~ s/\`/&#0096;/g;

    # Verizon's hash is UIDH, others are ACR etc.; match on substrings (uid, acr, msisdn, subno) to catch variations
    if ($key =~ /uid|acr|msisdn|subno/i){
        print "<b>$key: $value</b><br>";
    }
    else{
        print "$key: $value<br>";
    }
}




Update (10/28/2014): It turns out that variations of X-ACR (Anonymous Customer Reference) are also being used.  I've updated the program above to flag those and a few others I've read about:

http://blog.jgc.org/2012/02/mobile-subscriber-leakage-in-http.html


The ACR value appears to be based on a draft RFC:
http://tools.ietf.org/html/draft-uri-acr-extension-04
http://www.gsma.com/oneapi/anonymous-customer-reference-beta/
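
For what it's worth, the same flagging could also be written table-driven, so each match says what kind of identifier it probably is. This is only a sketch: the `tracking_note` sub and the `@TRACKERS` table are names I made up for illustration, the patterns are just the substrings the script already matches on, and (unlike the script) it only looks at HTTP_* variables.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: [ pattern, short description ].
# The patterns are the same substrings the CGI script matches on.
my @TRACKERS = (
    [ qr/uid/i,    "UIDH-style unique ID (Verizon)" ],
    [ qr/acr/i,    "ACR (Anonymous Customer Reference) variant" ],
    [ qr/msisdn/i, "MSISDN (subscriber phone number)" ],
    [ qr/subno/i,  "subscriber number" ],
);

# Return a description if the CGI variable name looks like a tracking
# header, or undef otherwise.  Only HTTP_* variables are considered.
sub tracking_note {
    my ($key) = @_;
    return unless $key =~ /^HTTP_/;
    for my $entry (@TRACKERS) {
        my ($pattern, $note) = @$entry;
        return $note if $key =~ $pattern;
    }
    return;
}

# Example variable names, as Apache would present the request headers
for my $key (qw(HTTP_X_UIDH HTTP_X_ACR HTTP_X_UP_SUBNO HTTP_USER_AGENT)) {
    my $note = tracking_note($key);
    printf "%-20s %s\n", $key, defined $note ? $note : "-";
}
```

Substring matching is deliberately loose, so it can flag an innocent header now and then; that seemed a fair trade for catching variations I haven't seen yet.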

Update (11/13/2014): Just to rub our noses in the fact that our data can be (and is being) modified after we send it across the Internet ... It turns out that Cisco devices have a default setting that modifies SMTP (email) sessions to prevent the negotiation of SMTP over TLS (i.e. email across an encrypted connection).  This is related to the joy above, since the Cisco device actually modifies the data that has been sent, fooling one side of the conversation into believing that the other is refusing to support TLS.

Why does Cisco do this by default?  It turns out the Cisco device wants to inspect the SMTP sessions to prevent malicious activity, and of course it can't do that if the session is encrypted.  Most likely they do this with good intentions, but in today's environment, this is just screaming conspiracy all over the place.  For example:
http://arstechnica.com/tech-policy/2014/11/condemnation-mounts-against-isp-that-sabotaged-users-e-mail-encryption/

In fact, it's a somewhat obscure, but still easily found "feature" on Cisco firewalls:

https://stomp.colorado.edu/blog/blog/2012/12/31/on-smtp-starttls-and-the-cisco-asa/

http://www.cisco.com/c/en/us/td/docs/security/asa/asa-command-reference/I-R/cmdref2/i2.html#pgfId-1765148
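
If you want to check for this kind of rewriting yourself, here is a rough sketch that classifies the lines of an EHLO reply. Treat it as a sketch under assumptions: the `classify_ehlo` name and the sample reply are made up, and the "masked" test assumes the commonly reported symptom, where the STARTTLS keyword shows up replaced by a run of X's (something like 250-XXXXXXXA). Other devices may well rewrite it differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify the lines of an SMTP EHLO reply.
#   'offered' - STARTTLS is advertised (as seen from this side)
#   'masked'  - some extension keyword is just a run of X's (optionally
#               ending in A), the pattern assumed here for a rewritten STARTTLS
#   'absent'  - no STARTTLS and nothing that looks masked
sub classify_ehlo {
    my @lines = @_;
    my $masked = 0;
    for my $line (@lines) {
        # EHLO reply lines look like "250-KEYWORD args" or "250 KEYWORD"
        next unless $line =~ /^250[- ](\S+)/;
        my $keyword = uc $1;
        return 'offered' if $keyword eq 'STARTTLS';
        $masked = 1 if $keyword =~ /^X{4,}A?$/;
    }
    return $masked ? 'masked' : 'absent';
}

# Made-up sample reply, as it might look from behind such a device
my @sample = (
    "250-mail.example.org",
    "250-SIZE 35882577",
    "250-XXXXXXXA",
    "250 8BITMIME",
);
print classify_ehlo(@sample), "\n";
```

You can get a real reply by connecting to a mail server on port 25 and typing EHLO by hand. Comparing the result from two different networks is the interesting part, since a server that simply doesn't support TLS also comes back as 'absent'.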

So while it seems that in this case folks are just trying to do the right thing, they are still willy-nilly changing what we send out across the Internet "for our own good."

Update (10/7/2015): Almost a year after I wrote this, and the problem has only gotten worse.  Verizon has just entered into an arrangement with AOL to merge their profile data (essentially who you are) with AOL's huge database of our browsing habits (derived from AOL's substantial ad network.)  I guess this shouldn't be a surprise, since Verizon bought AOL earlier this year.  I imagine this is exactly why they made that purchase.

This does make me wonder what's next.  If I were to pay a bill by postal mail, perhaps Verizon could extract a DNA sample from the envelope?  Maybe they can work out an exchange with China for the OPM data?   :-)

Here are all the sordid details: https://www.propublica.org/article/verizons-zombie-cookie-gets-new-life