Tuesday, October 8, 2013

YACC


 (YACC:  Yet Another Cool Class - not the parser generator)

I love the low cost online courses that I've taken this summer.  There's nothing like spending a Saturday focused on writing cool programs ... learning something new, with a knowledgeable instructor talking you through the tricky parts.

I just finished taking the second Ruby for Information Security Professionals course offered by Marcus Carey at threatagent.com.  Not surprisingly, I walked away a bit smarter and with a big grin on my face.

While his first class (http://jrnerqbbzrq.blogspot.com/2013/08/more-cool-classes.html) provides an introduction to Ruby in the context of writing Ruby code for Metasploit, this class doesn't touch Metasploit. Instead, it assumes you have a basic familiarity with Ruby, and focuses on various techniques for accessing Open Source Intelligence.  What this means is that he walks you through writing code to pull down information from various on-line sources of public information such as Bing, Twitter, LinkedIn and Shodan. :-)

By visiting several different sources of information, Marcus is able to introduce us to different collection techniques.  For example, Bing provides a really sweet API that gives you access to the full power of their search engine and returns results in easily parsed JSON.  LinkedIn, however, chooses to hoard its information, forcing us to scrape it off their web pages.  Marcus shows us how to reverse engineer LinkedIn pages and use the power of Nokogiri to pull useful information from LinkedIn's cold-dead-hands.  How cool!
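
To give a flavor of the difference, here's my own illustrative sketch of the two styles - the URL, API key parameter and CSS selector below are made-up placeholders, not the real Bing or LinkedIn details:

require 'json'
require 'open-uri'
require 'nokogiri'

# API style: ask for JSON and parse it
results = JSON.parse(open('https://api.example.com/search?q=target&key=YOURKEY').read)
puts results.inspect

# Scraping style: pull down the HTML and pick the data out of it
doc = Nokogiri::HTML(open('https://www.example.com/pub/some-profile'))
doc.css('span.title').each { |node| puts node.text }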

The class is taught via a webinar, where Marcus shares his desktop to demonstrate code as he builds up applications in real time.  While watching Marcus' desktop in one window, we develop the same code in another.  When we have questions, Marcus can just demonstrate the answer for us to see.  This is a great paradigm for teaching a class like this.  However, it works better if you can use two monitors - one with Marcus' desktop and the other showing the window that you're working in.  If you only have one monitor, you'll be switching back and forth between windows a lot. (Maybe pressing your laptop into service to watch the webinar would work.)  He also provides a reference document which shows some of the key code snippets.

The class assumes you've taken his first Ruby course, and while Marcus works hard to bring everybody up to the same level, you'll probably struggle if you've never seen Ruby before.

You need to have a working copy of Ruby, with the 'whois', 'open-uri', 'nokogiri', 'shodan' and 'twitter' Ruby packages installed.  It would behoove you to get these installed ahead of time; I found that I couldn't get 'nokogiri' to install on my preferred Ubuntu system - fortunately it installed with no fuss on my Pentoo system, so I used that for the class.  Lots of folks used Kali, which seemed to work well.

Afterwards, Marcus makes available a video of the entire class.  Great for review.

So here's the bottom line:  For $125, this 8 hour long class is a screaming deal.  It's relevant to what we do, it's very well taught and it's just good wholesome fun!

You can read about it at: https://www.threatagent.com/training/ruby_osint






Tuesday, September 17, 2013

I didn't know that gold can tarnish

It's been accepted for years that cryptography is hard to implement and is full of snake-oil products. The only way to be reasonably sure that encryption is effective is to:

  • Stay current
  • Use robust key lengths
  • Manage keys/passwords carefully 
  • And most importantly, only use FIPS 140-2 validated encryption.  

In general, FIPS 140-2 has always been the gold standard of encryption, and trust in FIPS 140-2 has been a cornerstone of being able to trust most security products available today.

However, that trust is now under attack.

The Ars Technica article below provides an alarming report, describing how between 2006 and 2007 the Taiwanese government issued at least 10,000 flawed smart cards.  These smart cards were designed to be used by Taiwanese citizens for many sensitive transactions, including activities such as submitting tax returns.  The flawed cards had virtually useless encryption, putting at risk any data "protected" by the cards.   The gist of the Ars Technica article is that these failures occurred in spite of the cards being FIPS 140-2 validated, and that the FIPS 140-2 validation process is broken.

But reading the research paper on which the Ars Technica article is based suggests that it's not as simple as that...

Despite being FIPS 140-2 validated, it turns out that the random number generator used by the card (technically, the "Renesas AE45C1 smart card microcontroller" used by the card) "sometimes fails", producing (non)random numbers that can lead to certificates which are easily compromised.  This is exactly the type of failure mode that FIPS 140-2 is designed to catch.  However, the generator on this card had an optional "health check" which was intended to detect when the random number generator was failing.  Not surprisingly, the FIPS 140-2 validation for the card only applies if this health check is enabled.  In other words, if the health check is turned off, as it was on the 10,000 or so broken cards, FIPS 140-2 does not apply and you're on your own using these cards.
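
To make "easily compromised" concrete: as I read the paper, the killer observation is that keys generated from bad randomness can end up sharing a prime factor, and then a simple GCD over the public moduli recovers that factor.  Here's a toy sketch of the idea (tiny numbers, obviously, but the arithmetic is the same for real 1024-bit keys):

p  = 61          # prime accidentally shared by two "random" keys
n1 = p * 53      # first public modulus
n2 = p * 71      # second public modulus

shared = n1.gcd(n2)   # => 61, recovered without factoring anything
puts "shared prime: #{shared}, cofactors: #{n1 / shared} and #{n2 / shared}"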

Here's the way the research report describes the problem (MOICA is the agency which issued the cards):

"Unfortunately, the hardware random-number generator on the AE45C1 smart card microcontroller sometimes fails, as demonstrated by our results. These failures are so extreme that they should have been caught by standard health tests, and in fact the AE45C1 does offer such tests. However, as our results show, those tests were not enabled on some cards. This has now also been confirmed by MOICA. MOICA’s estimate is that about 10000 cards were issued without these tests, and that subsequent cards used a “FIPS mode” (see below) that enabled these tests"

This is pretty standard: if you look at FIPS 140-2 validation reports or Common Criteria evaluations, you'll always see a very precise description of exactly how the product must be configured in order for the validation to apply.   It's common to see that FIPS 140-2 validated software or hardware has a "FIPS mode", which must be enabled for the validation to apply.

Looking at the FIPS 140-2 certificate for at least one version of this chip (https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Zertifizierung/Reporte02/0212a_pdf.pdf), the report specifically says "postprocessing should be included in the users embedded software", which I believe is a requirement to include the health check.

Clearly, this was a horrific failure.  The Taiwanese government issued a bunch of smart cards, used to authenticate citizens and protect sensitive data, that were completely broken.  Yes, the folks producing the card made a critical mistake.  But placing this at the feet of FIPS 140-2 is, IMHO, missing the point.

It would be nice if FIPS 140-2 meant a product was idiot proof, but that's not the way the world works.  Encryption is complicated and has to be done correctly or it doesn't work.  The whole purpose of the FIPS 140-2 testing regime is to ensure that encryption has been rigorously tested under controlled conditions ... and most importantly, to document those conditions so that we know how to use it in a way that can be trusted.  Just because something is FIPS 140-2 validated doesn't mean it's idiot proof or that it can't be configured insecurely.

In any event, despite NIST being very clear about what they did and didn't do when testing this chip, in my opinion their reputation has been badly tarnished, and it will take significant time and effort on their part to undo the damage.  I guess even a gold standard can tarnish sometimes ...

Here's the Ars Technica article describing the failure:
http://arstechnica.com/security/2013/09/fatal-crypto-flaw-in-some-government-certified-smartcards-makes-forgery-a-snap/2/

Here's the actual research paper describing the findings:
http://smartfacts.cr.yp.to/smartfacts-20130916.pdf

A very good overview of the problem provided by the researchers:
http://smartfacts.cr.yp.to/index.html

Tin-Foil Hat Addendum

Given the <euphemism>crisis of trust</euphemism> that the NSA and the US Government are currently going through - including accusations that the NSA has been surreptitiously weakening encryption products - it's very hard to avoid the theory that the NSA might have had a hand in this failure.  That's certainly possible in this case, but there's nothing to suggest that the lab issuing the FIPS 140-2 validation was complicit in this failure.

BTW, given the relationship between Taiwan and Mainland China, I've always assumed that Taiwan is constantly under cyber attack by China.  Putting on my second layer of tin-foil headgear, could this be the result of a Chinese effort, not an American one?  I'm sure the NSA is very good, but China is certainly no cyber-slouch either, and they might have a better pool of human resources on the ground in Taiwan - which would have simplified introducing this vulnerability into the card.

Finally, removing my tin-foil hats for a second, this could simply be a screw up.  Broken products get shipped every day, and encryption errors like this are subtle and hard to notice when present in only a very small percentage of the cards.



Saturday, September 14, 2013

The Law of Unintended Consequences and Biometrics

So here's an interesting twist ...

Generally, the government can't force you to provide information you know, and then use it against you.  Apparently, forcing folks to incriminate themselves is a slippery slope to state sponsored torture - go figure.

As a result, the state can't compel you to give up passwords or encryption keys.  Although it's recently been challenged, and seems to be subject to subtle interpretations of the law, this protection appears to be holding up in court (http://en.wikipedia.org/wiki/Key_disclosure_law#United_States).

But, if your authentication or encryption key is a biometric (e.g. a fingerprint), all bets are off and the state has every right to force you to give them access.  This is despite the fact that the biometric might be more secure from a pure security perspective.

This article talks about that little irony, in the context of Apple's new iPhone - which can use one's fingerprint to protect the information on the phone.

http://www.wired.com/opinion/2013/09/the-unexpected-result-of-fingerprint-authentication-that-you-cant-take-the-fifth/

So, being "more secure" from a technical perspective (assuming you buy into single-factor biometric authentication) does not necessarily translate into better protection from legal intrusion. :-)

Wednesday, August 28, 2013

More Cool Classes


Last weekend I had the opportunity to take another really fun course.  This one was Ruby Programming for Information Security Professionals, offered by Marcus Carey at ThreatAgent.com. (https://www.threatagent.com/training)

It dovetailed very nicely with the Penetration Testing courses I took from Georgia Weidman earlier this summer.  Georgia's courses provided an accelerated introduction to using Metasploit (and some other pentesting tools).

With Georgia's classes under your belt, Marcus' Ruby class gives you one of the tools you need to take using Metasploit to the next level.  Since Metasploit modules (and Metasploit itself) are written in Ruby, Marcus' class gives you the introduction to Ruby that you need to start writing Metasploit modules.  And even if you're not itching to write an exploit module just yet, he teaches more than enough to let you read and understand Metasploit modules - which is itself a very powerful capability.

About 2/3 of the class is spent on an introduction to Ruby, starting with the irb interactive Ruby environment and moving on to the basics of the language.  Ruby turns out to be a delightful language and a pleasure to learn.  Marcus takes the class through the basics of the language using lots of hands-on examples, so it never gets boring.   After we've learned enough Ruby to be "dangerous", we finish off this part of the course writing some quick examples doing things like parsing JSON, accessing a web site, and making DNS queries.  What fun!
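
To give a flavor of those exercises, here are a few one-liners of the sort I mean - my own reconstruction using only the Ruby standard library, not Marcus' actual code:

require 'json'
require 'net/http'
require 'resolv'

# Parse a bit of JSON
record = JSON.parse('{"host": "example.com", "ports": [80, 443]}')
puts record['ports'].inspect

# Fetch a web page
puts Net::HTTP.get(URI('http://example.com/')).length

# Make a DNS query
puts Resolv.getaddress('example.com')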

However, the last 1/3 of the class is the real pay-off.   That's when we start writing a Metasploit module.  The module utilizes some of the code we'd already written, and does a simple DNS reconnaissance of a selected domain.   Utilizing a template provided by Marcus, we go through the basics of producing a module which can be integrated into Metasploit.
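
To give you an idea of what "a module which can be integrated into Metasploit" looks like, here's a stripped-down skeleton in the general shape of a 2013-era auxiliary module.  To be clear, this is my own illustration, not Marcus' template - the name, options and lookups are all placeholders:

require 'msf/core'
require 'resolv'

class Metasploit3 < Msf::Auxiliary

  def initialize(info = {})
    super(update_info(info,
      'Name'        => 'Simple DNS Recon (illustrative)',
      'Description' => 'Looks up address records for a few common hostnames in a domain',
      'Author'      => ['student'],
      'License'     => MSF_LICENSE))

    register_options(
      [
        OptString.new('DOMAIN', [true, 'The target domain', 'example.com'])
      ], self.class)
  end

  def run
    %w(www mail ftp vpn).each do |host|
      fqdn = "#{host}.#{datastore['DOMAIN']}"
      begin
        print_good("#{fqdn} -> #{Resolv.getaddress(fqdn)}")
      rescue Resolv::ResolvError
        vprint_status("#{fqdn} did not resolve")
      end
    end
  end
end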

As with the classes I took from Georgia Weidman, the class is taught via a live webinar.  It's easy to ask questions, and Marcus is very responsive and attentive to his students.  He teaches the class assuming that you're either running Ruby and Metasploit directly, or that you're running Kali.  The only "attacks" are really just accessing public DNS and web sites, so there's no need to provide sacrificial VMs for us to attack.  He provides a written outline for the class, which is very helpful as you work along with him through the examples.  After the class, he provides a video of the webinar, so you can review the class in detail.  Overall, the class is presented in an organized, interesting and professional manner.

As with Georgia's classes, this class is an incredible deal at $125 for the day long class.  If you'd like to read my rant about the cost of training, go back to my review of Georgia's class - which along with Marcus' class, is an example of what our community needs more of.

Since I've taken the class, I've been on an orgy of coding up a module for Metasploit.  It's been a long time since I've been so enthused about a project that I've gone into sleep-deprivation mode to work on it. :-)  I have Marcus to thank for that!

Anyway, here's the bottom line.  Ruby Programming for Information Security Professionals, taught by Marcus Carey is an awesome course.

This class is for you if you have some programming knowledge, but don't know Ruby and want to jump into writing Metasploit modules.  Yes, you can RTFM.  But for a relatively little bit of money, and 8 hours of your time, you can really jump-start the process and go from zero to writing a Metasploit module by the end of the day.  Of course, there's a ton about both Ruby and Metasploit that he doesn't have time to cover, but you will have enough that you can move forward by writing code ... not by just reading about writing code.

Combine this with Georgia's classes (take them first), and you'll be well on your way to being a very competent Metasploiter  (is that a word :-)

BTW, a little while ago I finally looked at Python ... and fell in love.  I've been studying it since then, with the intention of abandoning Perl for Python.  But I have to admit, Ruby really appeals to me and I'm wondering if I may just abandon Python and do all my programming in Ruby. Does that make me a fickle person? :-)

Tuesday, August 20, 2013

Phew! Finally Recovering from DEFCON


This was the second year that I've attended DEFCON "on my own dime", after a gap of about 9 years when I wasn't able to attend.

Last year, my first time back in 8 years or so, I think I was in a state of shock throughout most of the weekend.  Everything had grown so big - with 15,000 folks attending there was a line for almost everything even remotely popular.  But, if you scratched beneath the surface it was still the same DEFCON as before ... with the same passion for playing with anything that couldn't run away, just 15 times bigger and with a slightly more corporate veneer.

This year, I was a bit more prepared.  There were still long lines everywhere - and for some talks the room filled up before everyone who wanted to attend got in.  But with some planning and flexibility, it was a hugely rewarding DEFCON.

What were the high points for me this year?

This year they released the official DEFCON documentary, which was mostly filmed at last year's event.  The documentary explains what DEFCON is about and shows the history of DEFCON.  It's not bad; I learned a good bit about the early history of DEFCON.  It does a really good job of capturing some of the "hacker ethic" which is what makes DEFCON so great.  It also gives a good view into the core group which runs DEFCON every year.  On the con (sorry!) side, it is a bit of a self-absorbed love-fest.  Apparently the documentary was funded by Dark Tangent (Jeff Moss, the person who runs DEFCON), so it shouldn't be a big surprise that only the good side made it out of the cutting room.  But again, I recommend it.  They're giving it away for free; it's up on YouTube and lots of other places: http://youtu.be/3ctQOmjQyYg

I got a huge kick out of the car hacking talks.  Tuners have been hacking auto ECUs for years, figuring out how to rewrite the tuning tables to make a car perform better.  My last track car, a Mazda Miata, had a third-party ECU which completely replaced the Mazda unit, allowing a huge range of custom engine tuning options.  But now cars are so much more like regular computing platforms, and are so much more computerized, that they've become really interesting to the hacking community in general.  Instead of just controlling the engine, now virtually every aspect of a car is controlled by a network of computers.  Think about it: if you drive a car with an auto parallel-parking feature, there's a computer driving your car when it parks for you.  Same thing with crash avoidance, or cruise control that maintains a safe distance from the car in front of you.  So, hacking cars has become a lot more interesting than just tweaking ECUs to run less engine timing.  They didn't talk about it here, but others have been looking at compromising a car's internal network remotely (such as via Bluetooth).  I can't wait to see these threads of work combined.  Here's one video showing some of what they've done: http://youtu.be/oqe6S6m73Zw. Here's the paper describing their work and open source tools: http://youtu.be/3ctQOmjQyYg.  Yes, I said tools - you too can jack into your car's OBD-II port and start injecting traffic onto your car's shared network. :-)

I attended the "Policy Wonk Lounge" which turned out to be a very a un-DEFCON like event.  It was an  informal opportunity for attendees to meet with some relatively high level DC .gov and .mil insiders.  It was also the only event where there was an obvious core of press attending, and it was the first time I've ever been to a meeting which was formally "off the record".   Not surprisingly (to me at least), the DC folks were reasonable, thoughtful folks who really try to do the right thing.  Nothing earth shattering was decided or revealed, but it was really useful to have an open discussion.  Here's the basic description: https://www.defcon.org/html/defcon-21/dc-21-speakers.html#Wonk

Speaking of the Policy Wonk Lounge, this was the year that "Feds" were uninvited to DEFCON in response to the NSA domestic spying issue.  I was wondering just how that would all go down ... and as near as I can tell the big impact was that the NSA didn't have a recruiting table in the vendor room (they had one there last year) or explicitly public talks.  I was pleased that the spirit of tolerance which I always considered a DEFCON hallmark still lived.  There are clearly some sharp political differences between DEFCON attendees, but I personally never saw (or heard) of it becoming an issue.

Remember Pentoo?  It's a Linux distribution focused on penetration testing.  I personally hadn't played with it in a while, and haven't really thought about it recently.  The hot pentesting distribution for the past couple of years has been Kali (nee BackTrack).  But several talks made a point of mentioning that Pentoo still exists, and *some* people like it better than Kali.  The cool thing about Pentoo is that it's being maintained, provides a high-quality alternative to Kali (i.e. a different set of tools to consider) and is based on the Gentoo Linux distribution.  That's what's really great about a conference like DEFCON: you can often read the paper a presenter has written on some topic, but when you attend the talk and the Q&A afterwards, you often pick up all sorts of gems.

Another thing that made me smile: in the hardware hacking area there were a few 3-D printers set up.  One guy had a hacked Kinect, and was using it to make and give out 3-D scans of folks (essentially a scan of your head).  You could use the scan to print a sculpture of your head on a 3-D printer.  Imagine what DEFCON attendees will be showing us with those in a few years!  In fact, a photo-copy shop a block from my house just installed a 3-D printer; we live in interesting times!

I'm already excited about next year at DEFCON ...











Wednesday, July 10, 2013

Sometimes, life just hands you an ice cream cone


Recently, I was just sitting at my computer, when I got a call on my phone.  Unfortunately, I don't have a recording app on my phone (I did on my old one), so this is just the highlights from a few handwritten notes and my memory ...

(call from 212-777-3001)
Me: Hello?
Caller: Hello, this is <mumble>Global Soft<mumble>, we're recording errors on your computer
Me: huh?
(Really? I'm finally getting one of "those" calls)
Caller: we're getting lots of errors from your computer.  viruses, malware, ....
Me: huh?  How do you know about this stuff?
Caller: we receive error messages from your computer.  your computer is infected ... i just need to walk you through a few steps to fix it ...
Me: huh?
...
Me: huh?  I'm sorry, I'm pretty dumb about computers.  How do you know what's wrong with my computer?
Me: huh?  Oh! I know! Do you mean I bought your service when I bought the computer
Caller:  yeah, yeah, that's right.  that's what you did!
...
Caller: ok, I just need you do to a few things ...
Caller: turn on your computer ...
Caller: Let me know when you see your desktop ...
Me: huh? it's on, I'm looking right at it.
Caller: do you see your desktop
Me: huh?  I don't know ... it says dollar sign
Caller: (confused) huh? :-)
Me: huh? I see a dollar sign prompt  (I'm looking at a Linux shell prompt, but was trying to remember what a Wylbur prompt looked like ... If you're wondering: http://en.wikipedia.org/wiki/ORVYL_and_WYLBUR)
Caller: where's your desktop?
Me: huh? what's a desktop? oh! That! there is no desktop.  This is a brand new computer they just gave me
Me: before this we did everything with punched cards ...
Caller:  how do you get to the internet?
Me: huh?  Do you mean how do we do things?  I can submit any card deck you need, the submission desk is just down the hall ...
Caller: Are you at work?  Is this your personal computer?

(... much hilarity ensues while I offer to submit cards and he tries to get me to the desktop and/or internet)

Me: huh?  Of course I'm at work.  I don't have a personal computer
Caller:  Can you get to the Internet from work
Me: I'm not authorized to use the Internet

CLICK! (he finally hung up)

:-)

I am kicking myself a bit.  Not only did I have no way to record the call, but I realized afterwards that I have a throw-away, very vulnerable, Windows-XP virtual machine (from a course I took recently) that would have been a perfect victim.   Unfortunately, I have a feeling that my dyslexia would have kicked in ... and my credit card would have ended up being denied in that case.  :-)

But, pretending I was using punched cards did give me a bit of a giggle.

Update:

Here's an article which gives another example of how somebody else had fun with these guys: http://arstechnica.com/tech-policy/2012/10/i-am-calling-you-from-windows-a-tech-support-scammer-dials-ars-technica/


Update 2: Another article, also from ARS, provides more detail on how one of these operations is run (and how the FTC is taking them down.) http://arstechnica.com/tech-policy/2014/05/stains-of-deceitfulness-inside-the-us-governments-war-on-tech-support-scammers/

Update 3 (9/12/2014): There's now a metasploit module which allows you to turn the tables on these scammers. http://www.scriptjunkie.us/2014/09/exploiting-ammyy-admin-developing-an-0day/

Sunday, June 30, 2013

A Couple of Cool Classes

I've devoted the last couple of Saturdays to taking the first two classes on penetration testing offered by Georgia Weidman. (http://www.bulbsecurity.com/)

The short version of this posting is that I completely recommend them, they're awesome!

The first class, Penetration Testing with Metasploit is exactly what the title promises.  It's the perfect class for someone who, like me, is fairly familiar with the tools of our trade, but has never taken the time to learn how to use Metasploit.  Yes, you can just read a book or the user docs, but learning how to use it by attacking realistic targets is a much better way to learn. (And much more fun!)

Even if you're relatively new to security, I think you can still get a lot from the class.  Here's a test: If I say "Port 80 on localhost", or "cracking hashes from /etc/shadow", does that mean anything to you?  Do you think you can stand up a pre-configured virtual machine using VMware player or VirtualBox?  If your answer to these is "yes", I think you'll be able to participate in this class.  The focus is on using Metasploit, and a few other tools ...  so if you can follow directions, you should be able to keep up.  Keep in mind, the point of Metasploit is to package exploits so that you can use them without knowing the details of how they work.  Even if you don't completely understand the exploits being demonstrated, seeing them in action is extremely valuable.

The class is entirely hands on.  Prior to the class, Georgia sends you two virtual machines, one running Windows XP and one running Ubuntu Linux.  She also instructs you to grab a copy of the Kali virtual machine (Kali, nee BackTrack, is a collection of pentesting tools.)   You'll be shocked to hear that both of the virtual machines she provides have some vulnerabilities.  :-)

Georgia runs the class using an on-line webinar system that lets her talk to everyone while she shares her screen.  She also gives out a set of slides, which provide a written backup to what she's showing.  The basic flow of the class is that you use the Kali VM  to attack the XP and Ubuntu "victim" virtual machines.  On the screen she's sharing, Georgia is running the same exploit you are, discussing it while she demonstrates it.  This is not some instructor reading from a power-point deck, it's more like watching reality TV for hackers ... except you get to play along!  Finally,  the webinar system allows students to submit questions, which Georgia is good about answering quickly and clearly.

Of course, the class is not without glitches.  As an instructor, you can't spin up a bunch of virtual machines on your laptop, interactively run malicious exploits against them and share the entire mess via a webinar/screen-sharing service from your home, without something breaking.  In both classes, some time was lost dealing with glitches, resulting in the class running 9 hours long instead of the scheduled 8.  Even with a few breaks thrown in, 9 hours is a long time.  By the end of each class I was a quivering bowl of Jello ... I have no idea how Georgia was able to keep going for 9 hours.  But each time, while I was pretty fried by the end I was also grinning like a mad man.

After the class is over, Georgia provides access to a video of the class.  She also will be granting students access to a lab network which contains additional machines to practice on.

So here's the best part ... the class costs only $100!

<Rant> I've gotten very frustrated at the cost of decent training these days.  For example, I'm a huge fan of some of the SANS courses, but there's no way I can afford them personally, and many employers simply can't afford to drop that kind of money on training.  I'm fully aware of, and OK with, the profit motive.  But it feels like the best and biggest training organizations are heavy on "what the market will bear", and light on "what's best for the industry".  Thank goodness for events like DEFCON, BSides or SNOWFROC ... without those there would be nothing for those of us who make up the "middle class" of security.</Rant>

In summary, this class is by far the best training deal I've ever encountered.  I learned some valuable skills taught by a real pro, I had a total blast and I didn't have to max out the credit card to do it.

I'm not sure when it'll be offered next, but check out: http://www.bulbsecurity.com/online-security-training/penetration-testing-with-metasploit/ for more information.

The second class, Penetration Testing Level 2, is very much a continuation of the first.   It's assumed you're familiar with the material from the first class, and goes into detail about more sophisticated attacks.  In addition to the VMs from the first class, an additional Windows-7 VM is provided.  Metasploit is still the primary tool, but other tools are also used for more sophisticated attacks.  For example msfvenom, the Social Engineering Toolkit and Hyperion are all used to package exploits. In another exercise,  one of the virtual machines is compromised and then used to pivot and attack a second machine.  These are still "elementary" pentesting techniques, but the hands-on nature of the class really takes it beyond the purely academic and makes it a valuable learning experience.

Penetration Testing Level 2 costs a whopping $200, and is worth every penny.  Again, I'm not sure when it's going to be offered again, check out: http://www.bulbsecurity.com/online-security-training/penetration-testing-level-2/

A couple of recommendations if you take one of these classes:

  • Grab the virtual machines ahead of time and make sure you've got them running well.  If you're building your environment the morning of the class, you're already behind the curve.
  • If possible, use a two monitor setup.  Having Georgia's shared screen on one monitor, and running Kali on the second monitor, is the trick setup for these classes.

It sounds like Georgia may create an entire series of classes along these lines ... at this sort of price point, given the high quality (and fun quotient) of the first two classes, I think that the entire series would be a pretty interesting training option.






Wednesday, June 5, 2013

Password Cracking is an Art

Just a quick posting to recommend the following article:

http://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/

It's easy to think that cracking passwords is a point and click activity ... just grab a big password list, recruit a bunch of processing power and let'r run.  If you think that's how it works, you're wrong.

This article describes the process taken by three separate password cracking experts to attack the same list of password hashes.  They approached the challenge with different tools, different approaches and achieved different results.

The key point (other than some nice tricks) is that, as with many security endeavors, password cracking is both a craft and an art.  To be good at it, you need to know the underlying cryptography, you need to know your tools and you need to know how people behave.  And then, most importantly, you need to develop creative solutions based on your knowledge and hard-earned experience.

As I go about my daily work in this field, I'm often reminded of the passion and craftsmanship I experienced a very long time ago when taking a wood working class.  It was at a top design school, and I was a rank beginner surrounded by folks building beautiful pieces of furniture.  They understood how to make wood do things I could only dream about, things that seemed like magic until you understood how they did it.

Kinda like figuring out that '3e93fb79e0970b6b8229ff8bec22d069' is the hash for 'qeadzcwrsfxv1331'.

:-)
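
(In case that last bit sounds mysterious: "figuring out" mostly means hashing guesses until one matches.  A minimal sketch, using the hash quoted above as the target - the craft, of course, is in generating good guesses:)

require 'digest'

target = '3e93fb79e0970b6b8229ff8bec22d069'
%w(password1 letmein qeadzcwrsfxv1331).each do |guess|
  puts "cracked: #{guess}" if Digest::MD5.hexdigest(guess) == target
end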


Thursday, May 16, 2013

Another adaptation to enhance our survival :-)


Below's a nice little note which points out that since some malware tries to evade analysis by detecting when it's running in a "lab" environment, you can "immunize" your systems by making them look like a lab.

https://community.rapid7.com/community/infosec/blog/2013/05/13/vaccinating-systems-against-vm-aware-malware

In this case, they provide a tool which makes a few simple changes to your system and runs a few programs to simulate running under VMware.  Cute, but of course soon enough the attackers will just evolve more sophisticated ways to detect when their code is really being examined.
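
For illustration, here's roughly the kind of check VM-aware malware performs - a sketch based on a couple of well-known VMware tells on a Linux guest.  (The Rapid7 tool works the other way around, planting artifacts like these; this is just my own example, not their code.)

# Look for a couple of well-known VMware artifacts on a Linux guest.
def looks_like_a_vm?
  tells = []

  # The DMI product name reads "VMware Virtual Platform" on VMware guests
  product = File.read('/sys/class/dmi/id/product_name') rescue ''
  tells << :dmi if product =~ /vmware/i

  # VMware NICs use the MAC prefixes 00:0c:29 and 00:50:56
  macs = Dir.glob('/sys/class/net/*/address').map { |f| File.read(f).strip }
  tells << :mac if macs.any? { |m| m =~ /\A00:(0c:29|50:56)/i }

  !tells.empty?
end

puts looks_like_a_vm? ? 'looks like a lab - going quiet' : 'coast is clear'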

This is the same sort of strategy used by some animals in nature.  If you appear to be something dangerous, predators will leave you alone.  Technically, this is known as Batesian mimicry (http://en.wikipedia.org/wiki/Batesian_mimicry).

One interesting aspect of Batesian mimicry is that even "poor" mimics derive a benefit - it will be interesting to see if that observation holds true in the online contest between hunter and prey.  :-)

Friday, May 10, 2013

Is Hotmail the Only One?

Here's a nice little bit of research out of rutgers.edu. It turns out that Hotmail will shut down your email account after it's been idle for 270 days. Not a crazy policy, and perhaps even one with some security benefit.

But, here's the bad part, they also make your username (aka email address @ hotmail) available for reuse.

The researchers were able to use this detail, combined with the Facebook policy of sending password reset credentials to the email address on record, to take over the Facebook accounts associated with "expired" Hotmail accounts.

This attack was assisted by using some simple scripts which allow easily testing whether a Hotmail account has expired or not. The biggest limitation on the attack is that Facebook generally restricts visibility into an account's email address to "Friends" of the account. In effect, this means automating the attack becomes a tree traversal exercise as one compromises an account, and then attacks any friends of the compromised account who might be vulnerable.

I have a few comments on this.

  1. I understand trying to let folks have the email address they want; that's just good business.  I can even see how letting folks take an address permanently out of the pool of available addresses is begging for abuse ... but we're seeing more and more examples of how stealing an email address opens the gate wide for identity theft.  Facebook is pretty much in the mainstream with their password reset policies.  Now is a good time for Microsoft to change their policy: don't make expired email accounts available for reuse, it's just too easy to abuse.
  2. I sympathize with Facebook.  I've always considered password resets to be a very difficult problem.  Short of having somebody physically present showing ID, how can you really be certain who you're granting account access to?  As I am constantly reminded, "on the Internet, nobody knows you're a dog".  In this case, all you really know is that you're being asked to grant access to somebody who doesn't know the correct password. :-)
  3. Again, this problem is hard when addressed at scale.  Consider the case of Mat Honan, where Apple tried to do something more sophisticated than just fire an email to a stored email address, and yet their process was still shown to be quite vulnerable (http://www.wired.com/gadgetlab/2012/08/apple-amazon-mat-honan-hacking/.)
  4. Can you say Two Factor, Single Sign On?
Here's a good summary:  http://www.net-security.org/secworld.php?id=14892

Here's the full paper: http://precog.iiitd.edu.in/events/psosm2013/9psosm3s-parwani.pdf

As I say in my subject, is Hotmail the only major email provider that allows reuse of email addresses?


Monday, April 29, 2013

Always remember to shred your ship when done with it

We've all seen the hapless user who sells their computer on eBay without wiping the hard drive.

How about selling a Coast Guard patrol boat to the North Koreans without wiping the navigation system?  :-)

What else did they forget to sanitize?

http://www.theregister.co.uk/2013/04/29/japan_coast_guard_forgets_wipe_data_norks/

There's a reason for process and rules, including that annoying check sheet for hardware disposal.

Thursday, April 25, 2013

Ouch!

One of the issues we have to address as security folks is protecting a person's privacy.   If you've ever dealt with Personal Health Information (PHI), you know that there are strict rules about what aspects of a person's identity must be protected when associated with medical data.

In what can only be described as an object lesson in how important this is, the folks at the Data Privacy Lab (at Harvard) conducted an interesting experiment - looking into how many participants in the Personal Genome Project they could identify just by birthdate, sex and zip code.

Amazingly, they identified 200 participants with 84% to 90% accuracy.  Let me repeat that for emphasis ... using just birthdate, zip and sex they were able to link 200 folks to their "anonymous" genomes with good accuracy.  They basically matched data from the genome project against public voter registration data and other public records.
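
The underlying linkage attack is almost embarrassingly simple - conceptually it's just a join on (birthdate, sex, zip).  Something like this sketch, with made-up records:

# Join "anonymous" records to a public roll on (birthdate, sex, zip).
genome_records = [
  { id: 'PGP-0042', dob: '1970-03-14', sex: 'M', zip: '80302' }
]
voter_roll = [
  { name: 'J. Smith', dob: '1970-03-14', sex: 'M', zip: '80302' },
  { name: 'A. Jones', dob: '1981-07-02', sex: 'F', zip: '80303' }
]

genome_records.each do |rec|
  hits = voter_roll.select { |v| v.values_at(:dob, :sex, :zip) == rec.values_at(:dob, :sex, :zip) }
  puts "#{rec[:id]} is probably #{hits.first[:name]}" if hits.size == 1
end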

Here's a web site where they report their findings: http://dataprivacylab.org/projects/pgp/
The full report is at: http://dataprivacylab.org/projects/pgp/1021-1.pdf

Best of all, they have a web site where you can put in your birthdate, sex and zip, and they'll tell you how many folks match in their public records. (http://aboutmyinfo.org/)

I tried it for my info, and there's only one record which matches.  I live in a relatively small town (Boulder, CO), but still I was shocked.  It's a good thing I don't feel a need to hide my identity.

For reference, here's what HIPAA says about data that needs to be protected (thanks Wikipedia, http://en.wikipedia.org/wiki/Protected_health_information):

Under the US Health Insurance Portability and Accountability Act (HIPAA), PHI that is linked based on the following list of identifiers must be treated with special care.
  1. Names
  2. All geographical identifiers smaller than a state, except for the initial three digits of a zip code if, according to the current publicly available data from the Bureau of the Census: the geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and [t]he initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000
  3. Dates (other than year) directly related to an individual
  4. Phone numbers
  5. Fax numbers
  6. Email addresses
  7. Social Security numbers
  8. Medical record numbers
  9. Health insurance beneficiary numbers
  10. Account numbers
  11. Certificate/license numbers
  12. Vehicle identifiers and serial numbers, including license plate numbers;
  13. Device identifiers and serial numbers;
  14. Web Uniform Resource Locators (URLs)
  15. Internet Protocol (IP) address numbers
  16. Biometric identifiers, including finger, retinal and voice prints
  17. Full face photographic images and any comparable images
  18. Any other unique identifying number, characteristic, or code except the unique code assigned by the investigator to code the data

Tuesday, April 23, 2013

Pasting for Gold redux


A fair while ago I posted an entry talking about scraping files from Pastebin (http://jrnerqbbzrq.blogspot.com/2013/02/pasting-for-gold.html).  I even posted some code for a program which would grab copies of selected files for later analysis.

Since then, I've continued to play with it.  It's really great fun to see what folks post to Pastebin when chatting amongst themselves.  As I mentioned in my first posting, the number of system compromises, lists of cracked passwords, "dox'ing", IRC chat logs, and general Internet cruft is impressive.  Viewing these files provides a bit of insight into a few of the dark corners of the Internet, and frankly it can be a bit addicting ... kinda like plopping down in front of a reality TV show and seeing just how childish grown-ups can really be.

I've upgraded the scraping program, primarily to deal with occasional problems with connectivity to Pastebin.  I suspect the problems are due to Pastebin sometimes blocking my connections ... probably via a load balancer which mistakes my earnest efforts for some sort of abuse.  Their terms of use are vague, but in their FAQ (http://pastebin.com/faq), regarding their AUP, they say "Do not aggressively spider the site", and go on to say they'll block you if you do.   I sent them an email asking about this, but they never replied.   You can play with the -w option to increase the delay between connections if you have problems, although the longer you set the interval, the greater the chance you'll miss some files.

As with the first version of this program, it is heavily based on the program written by malc0de:
http://malc0de.com/tools/scripts/pastebin.txt

Here it is.  I call it PasteScrape.pl:

#!/usr/bin/perl -w

#
#Simple perl script to parse pastebin to alert on keywords of interest. 
#1)Install the the LWP and MIME perl modules
#2)Create two text files one called keywords.txt and tracker.txt
#2a)keywords.txt is where you need to enter keywords you wish to be alerted on, one per line.
#3)Edit the code below and enter your smtp server, from email address and to email address. 
#4)Cron it up and receive alerts in near real time
#

########################################################################
# Downloaded 1-29-13 from http://malc0de.com/tools/scripts/pastebin.txt
# by DA - I'm not the author, but I'm afraid that I've had my way with it.
# Changes:
#     Removed email code
#     Added random sleep to be considerate 
#     Added infinite loop to be inconsiderate
#     Added write the matching paste to a separate file (writeHitToFile)
#     Added writing matching expression to writeHitToFile
#     Moved read of regex to inside main loop - catch changes on the fly
#     Added write log of hits to HitList.txt
#     Added getopt and cleaned up a bit
########################################################################

$debugRequested = 0;
$delayInterval = 5;  # Default max delay between queries to web site
$keyWordsFileName = 'keywords.txt';
$fetchErrorCnt = 0;
$tryOneMoreTime = 0;
$webProxy = 0;

use LWP::Simple;
use LWP::UserAgent;

use Getopt::Long;

GetOptions ("h" => \$Help_Option, "d" => \$debugRequested, "w=s" => \$delayInterval, "k=s" => \$keyWordsFileName, 
     "p=s" => \$webProxy );

if ($Help_Option){ &showHelp;}

my $ua = new LWP::UserAgent;
$ua->agent("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1");

if ($webProxy){
    $ua->proxy('http', $webProxy);
}


my $tracking_file = 'tracker.txt';

while (1){

    # Load keywords.  Check the file each loop in case they've changed.
    open (MYFILE, $keyWordsFileName) or die "Couldn't open $keyWordsFileName: $!";
    @keywords = <MYFILE>;
    chomp(@keywords);
    $regex = join('|', @keywords);
    close MYFILE;

    # Set the date for this run
    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
    my $datestring = sprintf("%4d-%02d-%02d",($year + 1900),($mon+1),$mday);
    my $dateTimeString = sprintf("%4d-%02d-%02d %02d:%02d",($year + 1900),($mon+1),$mday, $hour, $min);

    $dir = sprintf("%4d-%02d-%02d",($year + 1900),($mon+1), $mday);

    if ($webProxy){
        $ua->proxy('http', $webProxy);
    }
    my $req = new HTTP::Request GET => 'http://pastebin.com/archive';
    my $res = $ua->request($req);
    $pastebin = $res->content;

    unless (defined $pastebin){
        die "Request from pastebin failed @ $dateTimeString: ($!)\n";
    }

    my @links = getlinks();
    $linkCount = $#links;

    &debugPrint ("\n");  # Just a stupid formatting thing
    print "Starting new batch at $dateTimeString. Save-to dir is $dir. Keywords file is $keyWordsFileName. regex is: $regex\n";
    &debugPrint ("size of \@links: $linkCount\n");
    if (@links) {
        $fetchErrorCnt = 0;
        $tryOneMoreTime = 0;
        foreach $line (@links){
            &RandSleep ($delayInterval);
            if (checkurl($line) == 0){
                my $request = "http://pastebin.com/$line\n";
                my $link = $line;
                if ($webProxy){
                    $ua->proxy('http', $webProxy);
                }
                my $req = new HTTP::Request GET => "$request";
                my $res = $ua->request($req);
                my $content = $res->content;
                my @data = $content;
                if ($debugRequested){
                    &debugPrint ("checking ($linkCount) - http://pastebin.com/$line ... ");
                    $linkCount--;
                }
                foreach $line (@data){
                    if ($content =~ m/\<textarea.*?\)\"\>(.*?)\<\/textarea\>/sgm){
                        @data = $1;
                        foreach $line (@data){
                            if ($line =~ m/($regex)/i){
                                $Match = keyWordMatch ($line);
                                storeurl($link);
                                &debugPrint (" matched $Match ...");
                                &writeHitToFile ($link, $line, $Match);
                            }
                        }
                        next;
                    }
                }
            }
        }
    }
    else {  # Sometimes the fetch fails.  Don't really know why, but we try a few more times before giving up
        unless ($tryOneMoreTime){ # unless we're on the very last try
            print "fetch of links failed - can't say why (guess: $!). Sleeping for a minute ... \n";
            sleep 60;
            print "awake. Trying again\n";
        }
        if (++$fetchErrorCnt >= 10){
            if ($tryOneMoreTime){
                print "That's it, waited an hour and still failing ... Giving up\n";
                exit;
            }
            print "10 failures in a row.  Sleeping for an hour and then trying ONE MORE TIME\n";
            $tryOneMoreTime = 1;
            sleep 3600;
        }
    }
}

sub getlinks{
    my @results;
    if (defined $pastebin) {
        @data = $pastebin;
        foreach $line (@data){
            while ($line =~ m/border\=\"0\"\s\/\>\<a\shref\=\"\/(.*?)"\>/g){
                my $url = $1;
                push (@results, $url);
            }
        }
    }

    return @results;
}

sub storeurl {
    my $url = shift;
    open (FILE,">> $tracking_file") or die("cannot open $tracking_file");
    print FILE $url."\n";
    close FILE;
}

sub checkurl {
    my $url = shift;
    if (-e $tracking_file){
        open (FILE,"<$tracking_file") or die("cannot open $tracking_file for read");
    }
    else {
        return 0;  # File doesn't exist yet
    }
    foreach my $line ( <FILE> ) {
        if ( $line =~ m/$url/i ) {
            &debugPrint ("detected repeat check of $url ");
            return 1;
        }
    }
    return 0;
}

sub RandSleep{
    my $maxSleepTime = pop;
    my $sleepTime = int rand ($maxSleepTime + 1); # Need the +1 since we'll never hit maxSleepTime otherwise

    &debugPrint ("sleeping for $sleepTime ... ");
    sleep $sleepTime;
    &debugPrint ("awake!\n");
}

sub writeHitToFile{

    my $matchingExpression = pop;
    my $Contents = pop;
    my $url = pop;
    chomp ($url);

    unless (-e $dir){
        mkdir $dir or die "could not create directory $dir: $!\n";
    }

    if (-d $dir){
        open (HIT_FILE, ">$dir/$url") or die "could not open $dir/$url for write: $!\n";
        print HIT_FILE "http://pastebin.com/$url matched \"$matchingExpression\"\n" or die "print of url to $dir/$url failed: $!\n";
        print HIT_FILE $Contents or die "print of contents to $dir/$url failed: $!\n";
        close HIT_FILE;

        # Get the current time for the list file entry
        my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
        my $datestring = sprintf("%4d-%02d-%02d %02d:%02d",($year + 1900),($mon+1),$mday, $hour, $min);

        open (HIT_LIST_FILE, ">>HitList.txt") or die "could not open HitList.txt for append: $!\n";
        print HIT_LIST_FILE "$dir/$url - http://pastebin.com/$url matched \"$matchingExpression\" at $datestring\n" or die "print of hit to HitList.txt failed: $!\n";
        close HIT_LIST_FILE;
    }
    else {
        die "$dir exists but is not a directory!\n";
    }
}

sub keyWordMatch{
    my $matchingLine = pop;

    foreach $check (@keywords){
        if ($matchingLine =~ m/$check/i){
            return $check;
        }
    }
    return "No Match";
}

sub showHelp {
    print<<endHelp
$0: [-h] [-d] [-w <Max Wait Interval in seconds>][-p <http proxy>] [-k <Keywords File>]
-h: Show this help message
-d: Print debug output
-w <wait seconds>: Max wait in seconds between fetches.  Each fetch is delayed a random amount between 0 and this value. Default is 5 seconds.
-k <filename>: Name of file with keywords to monitor for.  Each line of the file is text or a perl regular expression. Default is \'keywords.txt\'
-p: Proxy through <http proxy>  (good for use with Zap or Burp)

Track progress via \"tail -f HitList.txt\"
endHelp
 ;
    exit;  # We always exit after showing help
}

sub debugPrint{
    unless ($debugRequested){ return;}

    my $message = pop;
    $saveState=$|; $| = 1;  # Save whether print is buffered, and make unbuffered

    print $message;  # print the message

    $| = $saveState; # return print buffering to previous state
}




To use this program you may need to install the LWP perl module (although it appears to be installed by default).  PasteScrape.pl expects to find a file named 'keywords.txt' which contains search strings (regular expressions) to match against any files it finds (one string per line).  If a file found on Pastebin contains a match, it's saved in a folder named after the current date (yyyy-mm-dd).
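
For example, a minimal keywords.txt might look like this (these particular patterns are just illustrations - use whatever you care about; each line is treated as a Perl regular expression):

password
\.gov
ssn|social security
hacked by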

I've only used it under Linux (Ubuntu) and haven't tested it elsewhere.  But if LWP works on your OS, it should be pretty portable.

In the previous posting, I mentioned that the biggest problem is dealing with the flood of files that I was getting.  There are two ways to deal with this type of problem.  The first is to more strictly limit what you save (via the 'keywords.txt' file), but that's no fun!  The second way is to become more efficient at sorting through the hundreds or thousands of files collected each day.  That's what the second program here is for.

I call it PasteView.pl


#!/usr/bin/perl -w

# This program is a custom hack job to view the saved files from my
# custom hack job Pastebin scraper program. :-)
# 
# The format of the files is as follows: The first line of the file
# contains the full URL and what specific string was matched in the
# paste to result in it being saved.  The rest of the file is the
# contents of the saved paste.
# 
# E.G:
# $ head -10 2013-02-09/017Gy3yZ 
# http://pastebin.com/017Gy3yZ matched "password"
# [10:45:02] [INFO] LaunchFrame.main:161: FTBLaunch starting up (version 1.2.2)
# [10:45:02] [INFO] LaunchFrame.main:162: Java version: 1.6.0_38
# [10:45:02] [INFO] LaunchFrame.main:163: Java vendor: Sun Microsystems Inc.
# [10:45:02] [INFO] LaunchFrame.main:164: Java home: C:\Program Files\Java\jre6
# [...]
#
# So, foreach file in the argv, we look at the first line.  The first
# line describes the primary match that was identified by the Pastebin
# scraper.  After we collected all the matches, we list each of the
# matched strings found (from the first line of each file) and a
# count.  The user will then be prompted to select one "match", and
# will then have the option to view any of the files which correspond
# to the match.
#
# The -m option shortcuts the collection process and just presents
# files which match the -m options.
#
# Timestamp
#
# The -n (show new files) and the -w (write timestamp) options control
# the ability to only view "new" files.  New files are files which
# were created after a previously established timestamp.  The
# timestamp is stored in a file, and is established either via the -w
# option, or via the "w" command while viewing files via the -s
# option.
#
# De-Escape character entities
#
# It"s very common for pastebin files to utilize the HTML character
# entities (e.g. &lt; for "<").  The -e option filters out a small
# subset of these for readability
#
########################################

use Getopt::Long;

$DEBUG = 0;
$debugFileName = "DEBUG.txt";
$choiceCount = 40;   # Default on how many choices to give
$lineLength = 80;    # Default to chop line length
$viewOnlyNewFiles = 0;  # Flag: only look at files created since timestamp
$showMatchingLine = 0; # Flag, show first matching line instead of first line in file
$timeFileName = "lastCheck";  # Where to write timestamp
$lastTimeChecked = 0;

GetOptions ("h" => \$Help_Option, "m=s" => \$matchStringArg, "d" => \$DEBUG, "e" => \$deEscape,
            "n" => \$viewOnlyNewFiles, "w" => \$setNewFilesDate, "l" => \$showMatchingLine,
            "p=s" => \$choiceCount);

if ($Help_Option){ &showHelp;}

if ($DEBUG){  # Open debug output file
    open DEBUG_FILE, ">$debugFileName" or die "open of $debugFileName failed: $!";
    $debugDate = `date`;
    print DEBUG_FILE "Starting debug output to $debugFileName at $debugDate";
    print DEBUG_FILE "Command Line Options:\n";
    if (defined $matchStringArg){
        print DEBUG_FILE "   matchStringArg = $matchStringArg\n";
    }
    if (defined $DEBUG){
        print DEBUG_FILE "   DEBUG = $DEBUG\n";
    }
    if (defined $deEscape){
        print DEBUG_FILE "   deEscape = $deEscape\n";
    }
    if (defined $viewOnlyNewFiles){
        print DEBUG_FILE "   viewOnlyNewFiles = $viewOnlyNewFiles\n";
    }
    if (defined $setNewFilesDate){
        print DEBUG_FILE "   setNewFilesDate = $setNewFilesDate\n";
    }
    if (defined $showMatchingLine){
        print DEBUG_FILE "   showMatchingLine = $showMatchingLine\n";
    }
    if (defined $choiceCount){
        print DEBUG_FILE "   choiceCount = $choiceCount\n";
    }
}

while (1){ # We keep looping through choices until we exit

    # clear out results from last run (if there was one)
    foreach $key (keys %matchCount){delete $matchCount{$key};}
    foreach $key (keys %matchList){ delete $matchList{$key};}
    $totalMatches = 0;
    $matchedFileCount = 0;

    # Handle -n option
    if ($viewOnlyNewFiles){
        $now = time;
        if (-e $timeFileName){   # If there's a timestamp file, use it
            open TIME, "<$timeFileName" or die "open of $timeFileName for read failed: $!";
            $lastTimeChecked = <TIME>;
            close TIME;
        }
        else {  # Otherwise force user to create a timestamp file
            unless ($setNewFilesDate){
                print "-n option invalid since timestamp file \"$timeFileName\" was not found.  Use -w option to establish. Exiting.\n";
                exit;
            }
        }
    }

    # Handle -w option
    if ($setNewFilesDate){   # user has said to create or update timestamp file
        $now = time;
        open TIME, ">$timeFileName" or die "open of $timeFileName for write failed: $!";
        print TIME $now;
        close TIME;
    }

    if ($matchStringArg){
        # User used -m option to select a custom match, skip first
        # loop through files since we know what to match

        if ($DEBUG){print DEBUG_FILE "user selected -m: matchStringArg = $matchStringArg\n";}

        $matchString = $matchStringArg;
    }
    else {

        # Cycle through all the files, look at the first line (contains
        # the match string from PasteScrape).  We'll use this to present
        # the user with a list of matches to select from.

        $totalMatches = 0;

        foreach $fileName (@ARGV){  # we go through each of the files specified by user
            unless (-e $fileName){   # Just in case of a user typo or something
                print "Couldn't find $fileName, exiting\n";
                exit;
            }

            if ($DEBUG){ print DEBUG_FILE "in file examination loop: fileName = $fileName\n";}

            if ($viewOnlyNewFiles){  # user only wants to see new files. Compare this file to timestamp
                @fileStats = stat ($fileName);
                if ($DEBUG){print DEBUG_FILE "file access date = $fileStats[9], lastTimeChecked = $lastTimeChecked\n";}
                if ($fileStats[9] < $lastTimeChecked){
                    next;
                }
            }

            # Now, open the file and examine the first line
            open FILE, "<$fileName" or die "open of $fileName failed: $!";
            $firstLine = <FILE>;
            # if ($firstLine eq ""){die "attempt to read $fileName for -s failed: $!";}
            unless (defined $firstLine){next;}

            $firstLine =~ /.+matched\s+\"(.+)\"/;
            if ($DEBUG){print DEBUG_FILE "  matched = $1\n";}

            # Keep track of how many files "match" each match string
            $matchCount{$1}++;
            $totalMatches++;

            close FILE;
        } # foreach $fileName ...

        # Now that we've looked at each of the requests files, show
        # them to user and see which "match" is of interest
        unless ($DEBUG){system ("/usr/bin/clear");}
        print "$totalMatches total primary matches (as identified by PasteScrape):\n";
        $matchIndex = 0;
        foreach $match (sort byCount keys %matchCount){
            $matchArray[$matchIndex] = $match;
            print "$matchIndex --> ($matchCount{$match}) $match\n";
            $matchIndex++;
        }
        print "$matchIndex --> Provide a custom search string\n";

        print "Select a matching expression to review (\#, \"w\" or \"q\" to quit): ";
        $inLine = <STDIN>;
        if ($inLine =~ /q/i){  # User requested quit
            exit;
        }

        if ($inLine =~ /w/i){  # User requested we reset the timestamp
            $now = time;
            open TIME, ">$timeFileName" or die "open of $timeFileName for write failed: $!";
            print TIME $now;
            close TIME;

            print "New timestamp written, exiting\n";
            exit;
        }

        # So now, the user should have selected which match string to
        # review the files which match.  User selects the number of
        # the match string
        chomp ($inLine);
        unless ($inLine =~ /^\s*\d+\s*$/){   # test for a simple digit input
            unless ($inLine =~ /^\s*$/) {    # exit on empty line, but no error msg
                print "Didn't recognize \"$inLine\" as a valid choice (must be an integer.) Exiting\n";
            }
            exit;
        }
        $matchSelection = int ($inLine);

        if ($DEBUG){print DEBUG_FILE "matchSelecton = $matchSelection\n";}

        unless (($matchSelection >= 0) and ($matchSelection <= ($matchIndex))){ # range check user selection
            print "\"$matchSelection\" isn't a valid selection. Exiting\n";
            exit;
        }

        if ($matchSelection == $matchIndex){  # User selected custom search string
            print "Search string: ";
            $matchString = <STDIN>;
            chomp ($matchString);
            if ($DEBUG){ print DEBUG_FILE "matchString = $matchString (user provided)\n";}
        }
        else {
            $matchString = $matchArray[$matchSelection];    # determine the selected match string from list
            if ($DEBUG){ print DEBUG_FILE "matchString = matchArray[$matchSelection] ($matchArray[$matchSelection])\n";}
        }
    }  # else (present potential matches to user)


    # We've shown the user all the match strings and/or the user has
    # told us which one to look at.  Now cycle through all the files
    # again, and if the first line (or any line, with -l) matches the
    # user selected, add it to the list to present the user.  There
    # may be hundreds of matches, so we need to present them in
    # batches.

FILE_LOOP:
    foreach $fileName (@ARGV){

        if ($DEBUG){ print DEBUG_FILE "fileName = $fileName\n";}

        if ($viewOnlyNewFiles){ # as before, user may only want to consider "new" files.
            @fileStats = stat ($fileName);
            if ($DEBUG){print DEBUG_FILE "file access date = $fileStats[9], lastTimeChecked = $lastTimeChecked\n";}
            if ($fileStats[9] < $lastTimeChecked){
                next;  # skip files which are not "new"
            }
        }

        open FILE, "<$fileName" or die "open of $fileName failed: $!";
        $firstLine = <FILE>;  # contains the "match" string
        unless (defined $firstLine){next;}  # skip if empty
        if ($firstLine !~ /matched/){next;} # File is not in right format, skip it
        if ($showMatchingLine){  # User wants to see the matching line, not the first line in the file
            while ($inLine = <FILE>){
                if ($inLine =~ /$matchString/i){
                    chomp($inLine);
                    $matchList{$fileName} = $inLine;
                    $matchedFileCount++;
                    close FILE;
                    next FILE_LOOP;
                }
            }
        }
        else {
            $secondLine = <FILE>;  # this will give user a hint as to contents of the file
     unless (defined $secondLine){next;}  # skip if empty
     chomp($secondLine);
     close FILE;
     $firstLine =~ /.+matched\s+\&dquo;(.+)\&dquo;/i;  # Does this file match the requested &dquo;match&dquo; strong
     unless (defined $1){
  if ($DEBUG){print DEBUG_FILE &dquo;  --> failed to find match in \&dquo;$firstLine\&dquo;\n&dquo;;}
  next;
     }
     if ($DEBUG){print DEBUG_FILE &dquo;  matched = $1\n&dquo;;}
     if ($matchString eq $1){   # we have a match.  Set the second line aside to show user
  $matchList{$fileName} = $secondLine;
  $matchedFileCount++;
     }
 }
    }

    if ($DEBUG){
        foreach $matchFile (keys %matchList){
            print DEBUG_FILE "$matchFile --> $matchList{$matchFile}\n";
        }
    }

# We've collected the names of all the files which have the "match"
# string in their first line.  We've also collected the second line
# (or first matching line) from each of these files.  The second line
# will often allow the user to determine what type of contents are in
# a file.  Present the list to the user and let her select which ones
# to view using the unix "less" command.  User input is the # of the
# entry to show, user can select multiple entries.

    unless ($DEBUG){system ("/usr/bin/clear");}
    print "Found $matchedFileCount ";

    if ($matchedFileCount == 0){  # ghads, I hate special cases  :-)
        print "matches\n";
    }

    $pickID = 0;
    $totalMatchCount = keys %matchList;
    $matchesShownCount = $choiceCount;
    foreach $matchFile ( keys %matchList){
        if ($pickID == 0){   # print at top of the screen listing next set of matches
            $matchesLeft = $totalMatchCount - $matchesShownCount;
            if ($showMatchingLine){
                if ($matchesLeft <= 0){
                    print "files which contain \"$matchString\" ...\n";
                }
                else {
                    print "files which contain \"$matchString\". $matchesLeft are left after this group ...\n";
                }
            }
            else {
                if ($matchesLeft <= 0){
                    print "files identified by PasteScrape as containing \"$matchString\" ...\n";
                }
                else {
                    print "files identified by PasteScrape as containing \"$matchString\". $matchesLeft are left after this group ...\n";
                }
            }
        }
        $pickList[$pickID] = $matchFile;
        $escapedString = &filterEscapeString($matchList{$matchFile});
        print "$pickID ($matchFile) >> $escapedString\n";   # print each file's info for user
        $matchesShownCount++;
        $pickID++;

        # We only show choiceCount files at a time, to avoid scrolling
        # choices off the screen.

        if ($pickID >= $choiceCount){  # We've completed a batch.  Now see which ones the user wants to see
            print "\nSelect matches to review (separate by \",\" or \".\", \<cr\> for next group, \"\*\" for all, \"q\" to quit): ";
            $inLine = <STDIN>;
            if (length ($inLine) > 1){
                if ($inLine =~ /q/i){  # User requested quit
                    exit;
                }

                if ($inLine =~ /\*/){ # "wildcard" ... user wants to view them all
                    $inLine = "0";
                    foreach $i (1 .. $pickID - 1){  # yeah, it's a hack - build up a fake user input
                        $inLine .= ", $i";
                    }
                }

                @selected = split (/,|\./,$inLine);  # parse user input.
                foreach $selectedID (@selected){
                    $selectedID =~ s/\s+//g;  # lose extraneous spaces in input

                    if ($DEBUG){ print DEBUG_FILE "select = $selectedID, ";}

                    if (($selectedID !~ /^\d+$/) or ($selectedID > $pickID - 1) or ($selectedID < 0)){
                        next;    # range check user input, skip if out of range
                    }

                    $selectedFileName = $pickList[$selectedID];
                    if ($deEscape){  # If user has requested filtering, do it now
                        $selectedFileName = &filterEscapeFile($selectedFileName);
                    }
                    system ("/usr/bin/less -i -p \"$matchString\" $selectedFileName"); # show file to user
                }
            }

            # Prepare for next "batch" of files to consider
            $pickID = 0;
            @selected = ();
            unless ($DEBUG){system ("/usr/bin/clear");}
        }
    }

    if ($pickID != 0){
        print "\nSelect matches to review (separate by \",\" or \".\", \<cr\> for next set, \"\*\" for all, \"q\" to quit): ";
        $inLine = <STDIN>;
        if (length ($inLine) > 1){
            if ($inLine =~ /q/i){  # User requested quit
                exit;
            }

            if ($inLine =~ /\*/){ # user wants to view them all
                $inLine = "0";
                if ($DEBUG) {print DEBUG_FILE "starting in wildcard(2): inLine = $inLine\n";}
                foreach $i (1 .. $pickID - 1){
                    if ($DEBUG) {print DEBUG_FILE "in wildcard loop(2): i = $i, inLine = $inLine\n";}
                    $inLine .= ", $i";
                }
            }

            @selected = split (/,|\./,$inLine);
            foreach $selectedID (@selected){
                $selectedID =~ s/\s+//g;
                if (($selectedID !~ /^\d+$/) or ($selectedID > $pickID - 1) or ($selectedID < 0)){ next;}
                $selectedFileName = $pickList[$selectedID];
                if ($deEscape){
                    $selectedFileName = &filterEscapeFile($selectedFileName);
                }
                system ("/usr/bin/less -i -p \"$matchString\" $selectedFileName");
            }
        }
    }

    if ($matchStringArg){  # We only loop once if user invoked with -m
        if ($DEBUG) {print DEBUG_FILE "done with -m, exiting\n";}
        exit;
    }
}

if ($DEBUG) {print DEBUG_FILE "Fell into exit outside while(1) loop!!!!\n";}
print "unexpected exit!\n";
exit;


sub filterEscapeFile{
    my $fileToFilter = pop;
    my $tmpCopyFile = "/tmp/LogViewTmp";
    my $inLine = "";
    my $outLine = "";

    open IN_FILE, "<$fileToFilter" or die "open of IN_FILE ($fileToFilter) failed: $!";

    open OUT_FILE, ">$tmpCopyFile" or die "open of OUT_FILE ($tmpCopyFile) failed: $!";

    print OUT_FILE "Original unfiltered file: $fileToFilter --> " or die "write of original filename to $tmpCopyFile failed: $!";

    while ($inLine = <IN_FILE>){
        $inLine =~ s/\&quot;/\"/g;
        $inLine =~ s/\&amp;/\&/g;
        $inLine =~ s/\&lt;/\</g;
        $inLine =~ s/\&gt;/\>/g;
        $inLine =~ s/\&ldquo;/\"/g;
        $inLine =~ s/\&rdquo;/\"/g;
        $inLine =~ s/\&lsquo;/\'/g;
        $inLine =~ s/\&rsquo;/\'/g;
        $inLine =~ s/\&hellip;/…/g;

        $inLine =~ s/\e/<ESC>/g;

        print OUT_FILE $inLine or die "write to $tmpCopyFile failed: $!";
    }

    close IN_FILE;
    close OUT_FILE;
    return $tmpCopyFile;
}

sub filterEscapeString{
    my $stringToFilter = pop;

    $stringToFilter =~ s/\&quot;/\"/g;
    $stringToFilter =~ s/\&amp;/\&/g;
    $stringToFilter =~ s/\&lt;/\</g;
    $stringToFilter =~ s/\&gt;/\>/g;
    $stringToFilter =~ s/\&ldquo;/\"/g;
    $stringToFilter =~ s/\&rdquo;/\"/g;
    $stringToFilter =~ s/\&lsquo;/\'/g;
    $stringToFilter =~ s/\&rsquo;/\'/g;
    $stringToFilter =~ s/\&hellip;/…/g;

    if (length ($stringToFilter) > $lineLength){
        $stringToFilter = substr ($stringToFilter, 0, $lineLength);
    }

    $stringToFilter =~ s/\e/<ESC>/g;

    return $stringToFilter;
}

sub byCount {
    return $matchCount{$a} <=> $matchCount{$b};
}

sub showHelp {
    print<<endHelp

Use this program to review files saved by the PasteScrape program.
Files are reviewed using the \"less\" program.

$0: [-h] [-d] [-e] [-n] [-w] [-l] [-m <matchstring>] [-p <line-count>] <files to view>
-h: Show this help message
-d: Save debug output to $debugFileName
-e: Convert common escape characters back to normal (e.g. \"\&lt\;\" to \"\<\")
-n: View only files created since last timestamp was saved to timestamp file
-w: Save current time into timestamp file
-l: When listing matches, show first match in file, not first line in file
-m: Only show files which contain <matchstring>
-p: Print <line-count> matches for second set of pages (default is 40)

Normally, there are two sets of pages shown.  The first page shows the
various matches which were identified by PasteScrape.  It also shows
how many files PasteScrape saved for each match.  When you select a
match from this page, the second set of pages will provide a list of
all the files which contain this match along with a line from each
file to help you identify files of interest.  Those files you select
will then be shown to you via the \"less\" program.

Using the -m option skips the first page and takes you directly to the
second set of pages.  When combined with -l, the entire contents of
each file will be searched for <matchstring>, otherwise all matches
will be based on the primary match identified by PasteScrape.

The following options will be available on the first page:
<\#>: Select the match to view by specifying its number
\"w\": Write the timestamp file and quit the program (see the -n option)
\"q\": To quit the program
You will also have the option to specify a custom search string or
regular expression

After selecting the number of a match to view (or if using -m), you
will be presented with a list of files which match your request.

By default, the first line of each matching file is shown (since this
will often identify the type of file).  With the -l option, the first
matching line in the file will be shown instead.  Please note that
since the -l option searches the entire file for matches, it may
identify more files to review than were identified by PasteScrape
(e.g. a file identified by PasteScrape as containing \"Password\" may
show up when you request matches for \"Username\", since it contains
both.)

When presented with a list of matches, the following commands are
available:
<\#>: Select matches to review (select by number and separate by \",\" or \".\")
\<cr\>: To move to the next page of matches (or back to the first page if done)
\"\*\": To select all the files shown
\"q\": To quit the program

A common usage would be: $0 -n -e -l 2013-04-\*/\*

endHelp
 ;
    exit;  # We always exit after showing help
}



After collecting files for a day or so, running PasteView can be pretty interesting.  Keep in mind that the richness of the files you collect (and how much disk space you fill) depends on the contents of the keywords.txt file used by PasteScrape.pl.  BTW, if keywords.txt contains the line ".*" (no quotes), you'll collect all the files publicly available. :-)
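For reference, a bare-bones keywords.txt might look something like this (each line is plain text or a Perl regular expression, and the monitor re-reads the file on every pass, so you can tweak it on the fly):

password
passwd
anonymous

Note that there's no comment syntax - every line gets folded into the match expression - so only put things in the file that you actually want to match on.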

Both programs take a "-h" command line option to provide a help page.

Thursday, April 4, 2013

We need non-resolvable domain names

Cute.

As ICANN starts to roll out extended domain names, folks are starting to notice potential collisions with domain names that have long been used on internal networks (e.g. ".corp".)  This leads to all sorts of problems when those domains suddenly start resolving to addresses outside the internal network.

Some of those problems include significant security problems, for example with certificates.

This article nicely lays out the problem: http://arstechnica.com/security/2013/04/possible-security-disasters-loom-with-rollout-of-new-top-level-domains/

The obvious solution is for ICANN to designate certain domain names as reserved for internal use, similar to RFC 1918 non-routable IP addresses.  As suggested in the letter referenced in the article linked above, surveys of internal domains already in use provide a list of likely candidates.

Thursday, March 7, 2013

If Kinetic and Cyber had a baby, what would ...


I just couldn't resist pointing out this article in Foreign Policy:

http://killerapps.foreignpolicy.com/posts/2013/03/06/dod_panel_recommends_special_bomber_armed_cyber_deterrent_force

The short version is that the Pentagon is talking about building a kinetic force dedicated to responding to cyber needs.  The most obvious mission would be to retaliate for cyber attacks against our infrastructure.

But I think the really interesting thing is that inevitably this force would also have an offensive mission.

This proposal is all about the bleeding of cyber into kinetic, and vice-versa.

Two more thoughts:

  1. A good reason for this force is to have kinetic assets which are off the cyber grid.  In other words, we would have "really" air-gapped assets which are part of an organization dedicated to surviving a cyber attack.  For most of the Pentagon right now, cyber is a buzz-word which either means career advancement, or pain-in-the-butt make-work (to be dispatched as quickly and with as little effort as possible.)  I like the idea of an organization which deals with kinetic, but really "gets" cyber.
  2. Cyber has already crossed over into the kinetic "real world".  For example, it's very likely that Stuxnet required somebody on site to deploy it (via its USB attack vector.)  Another example: rumor has it that Israel hacked the Syrian air defense grid for their 2007 raid on suspected nuclear materials - http://www.wired.com/dangerroom/2007/10/how-israel-spoo/


BTW, the Stuxnet example above points out that despite your best efforts, you can't guarantee the integrity of your air-gap.  An organization which really understands cyber and defense-in-depth would take that into account.

Update:  Another example of the bleeding of kinetic and cyber is the tactic of "SWATing" somebody.  SWATing is the trick of social engineering a police SWAT team into making an armed response to some victim's location.  Right now, the state of the art is to spoof caller ID while making a 911 call designed to cause a highly intrusive response - typically the call will claim there's an armed hostage situation - inducing a SWAT team to respond to the victim's house.  The potential for physical  harm is obvious.  This attack is clearly kinetic, although it might be more precisely described as a proxy kinetic attack.

One of the leading security researchers/journalist, Brian Krebs, was just the victim of such an attack.  It's worth noting this kinetic attack was accompanied by a cyber DDOS against the site hosting his blog.   In general, I encourage folks to follow Krebs' blog, his work is excellent - but be especially sure to check out his description of this attack: https://krebsonsecurity.com/2013/03/the-world-has-no-room-for-cowards/



Sunday, February 24, 2013

Certificate Owner Identity Theft


Here's yet more reason to believe that our digital certificate infrastructure is broken.

A CA named DigiCert has been issuing certificates for a company which went out of business in 2011.  As recently as November 2012, they issued a certificate in the name of this defunct company - which, not surprisingly, was being used by malware designed for on-line banking fraud.

http://www.h-online.com/security/news/item/Certified-online-banking-trojan-in-the-wild-1808898.html

Reminds me of the (very) old trick of stealing the identity of somebody who died at an early age.  :-(

Saturday, February 23, 2013

Tin Foil Hat News

Tin Foil Hat Query #1
Is it me or ... Is it a huge coincidence that certain of the *big* security companies seem mostly to find state-sponsored malware unleashed by specific states?  States other than the one in which they're based.
  • I'm thinking, for example, of Kaspersky Lab, who seem to spend a lot of their time finding malware allegedly written by the US and our allies.  (I would never be so crass as to mention their KGB connection ... oops)

  • I'm also thinking of Mandiant, who seem to have a remarkably good handle on what the Chinese are unleashing. (I would never be so crass as to mention their US Air Force connection ... oops)

Tin Foil Hat Query #2
Is it me or ... Is it a huge coincidence that within a couple of weeks




Just asking ...

Wednesday, February 20, 2013

Cyberwar Bureaucracies


Mr. President, we must not allow... a mine shaft gap!  (Dr. Strangelove)


Is there a security professional who hasn't seen the Mandiant "APT1" paper?  If you haven't, I'm glad you recovered from your coma.  I recommend that you look it over, or at least read a good summary. It seems a bit long at 76 pages, but it's a quick and relatively easy read.

It's all available (the report, the video, lists of domains, MD5s of malware, ...) at: http://intelreport.mandiant.com/

Yeah, there's even a video.  It's a video capture of some of the hackers' screens while they're doing mundane hacker stuff (copying files, setting up proxies ...)  It's actually pretty boring, kinda like watching an axe-murderer take out the garbage.

I thought it was a fascinating paper. It provides yet another example of how far the security industry has come in the last 5 years.  Especially in terms of scale.  The industry has matured to the point that some companies can see the big picture in a way that those of us in the trenches can never hope to.  Mandiant was able to sift through tons of data, at hundreds of compromised customers, to draw the comprehensive picture they did in their report.  That's fairly new and it's not something I can do at my job or in my little lab, no matter how smart I am.

The most interesting parts of the report (to me) were the conclusions Mandiant draws regarding the size and organization of PLA Unit 61398 (the People's Liberation Army organization responsible for the attacks).  They suggest that Unit 61398 has a staff that numbers in the hundreds, perhaps even in the thousands.  Allegedly, Unit 61398 is so big it needs its own 12-story building.  Mandiant goes on to claim that in order to support the scale of effort they have observed, Unit 61398 needs a staff of programmers, system administrators, linguists, etc. ... not to mention the always requisite managers and financial personnel. :-)

Much as Stuxnet gave us a glimpse into just how technically sophisticated state-sponsored hackers are, this report highlights how hacking is becoming part of the bureaucratic landscape in some countries - with big budgets and big head-counts.  I think that's a new perspective, and a new way to think about  the adversaries we face as security professionals.

So if you buy into this story (more on that below), what are the implications?  Well, for me, the primary implication is that China is certainly not alone.  If they have an organization of hundreds of folks dedicated to attacking just English speaking targets, certainly other countries have similar organizations.  Which countries have both the capabilities and the need to conduct such an effort?  I'll leave it to you to draw up your own list, but I would suggest that any list start with the US and Russia.  

Having waxed eloquent about how interesting this report is, it's important to keep in mind that Mandiant is making a bunch of assumptions when they come to their conclusions.  I think they're pretty up-front about what is fact and what is a deduction, but still.  For example, I didn't see anything which proves that Unit 61398 is using all 12 floors of "their" building, but Mandiant certainly seems to think they need the whole thing.

Here's a good posting by Jeffrey Carr, which raises some questions about this report:
http://jeffreycarr.blogspot.co.uk/2013/02/mandiant-apt1-report-has-critical.html

I'm not qualified to address his posting in detail (I'm not a trained intelligence analyst), but I certainly agree that what Mandiant has presented is not as rigorous as it could be.  BTW, I thought some of the comments to his posting were useful and worth reading as well.


Monday, February 11, 2013

Fun video of an "Attack"

Definitely over the top, and it kinda made me squirm to watch it ... but on the other hand it's way better than most dramatizations of an attack.

They also got on my good side by highlighting two techniques which I think are not appreciated enough - distractor attacks and honeypots.



Probably a good one to casually show management (not surprising given it was made by Deloitte.)

Saturday, February 2, 2013

Pasting for Gold


A little while ago, I happened to go to pastebin.com, and after reading the latest hacker manifesto, I noticed on the upper right the list of "Public Pastes" in the last 20 seconds.  Checking them out, I was surprised to discover that in almost real time I could take a sample of the files that folks were anonymously pasting to Pastebin.  Fascinating!  Lots of code snippets, various configuration files, a couple more hacker manifestos, chat logs, pointers to torrents, amateur fiction, school assignments and lots of other stuff.  It was really quite addictive.

BTW, keep in mind some of these things are Not Safe For Work - however it's all text - you won't have  some sort of picture just pop up unless you follow a URL in a pasted file.  But just to be clear, some of the text is quite offensive.

Digging a bit deeper, I found some perl code to monitor Pastebin for public posts, retrieving pastes that match keywords.   Tweaking it a bit, I've got something which now runs in the background and does the monitoring for me.  The program I started with is at this link:

http://malc0de.com/tools/scripts/pastebin.txt

My modified version:

#!/usr/bin/perl -w

#
#Simple perl script to parse pastebin to alert on keywords of interest. 
#1)Install the LWP and MIME perl modules
#2)Create two text files one called keywords.txt and tracker.txt
#2a)keywords.txt is where you need to enter keywords you wish to be alerted on, one per line.
#3)Edit the code below and enter your smtp server, from email address and to email address. 
#4)Cron it up and receive alerts in near real time
#

########################################################################
# Downloaded 1-29-13 from http://malc0de.com/tools/scripts/pastebin.txt
# by DA - I'm not the author, but I'm afraid that I've had my way with it.
# Changes:
#     Removed email code
#     Added random sleep to be considerate 
#     Added infinite loop to be inconsiderate
#     Added write the matching paste to a separate file (writeHitToFile)
#     Added writing matching expression to writeHitToFile
#     Moved read of regex to inside main loop - catch changes on the fly
#     Added write log of hits to HitList.txt
#     Added getopt and cleaned up a bit
########################################################################

$DEL_DEBUG = 0;
$delayInterval = 5;  # Default max delay between queries to web site
$keyWordsFileName = 'keywords.txt';

use LWP::Simple;
use LWP::UserAgent;

use Getopt::Long;

GetOptions ("h" => \$Help_Option, "d" => \$DEL_DEBUG, "w=s" => \$delayInterval, "k=s" => \$keyWordsFileName );

if ($Help_Option){ &showHelp;}

my $ua = new LWP::UserAgent;
$ua->agent("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1");

my $tracking_file = 'tracker.txt';

while (1){

    # Load keywords.  Check the file each loop in case they've changed.
    open (MYFILE, $keyWordsFileName);
    @keywords = <MYFILE>;
    chomp(@keywords) ;
    $regex = join('|', @keywords);
    close MYFILE;

#Set the date for this run
    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
    my $datestring = sprintf("%4d-%02d-%02d",($year + 1900),($mon+1),$mday);
    $dir = sprintf("%4d-%02d-%02d",($year + 1900),($mon+1), $mday);

    my $req = new HTTP::Request GET => 'http://pastebin.com/archive';
    my $res = $ua->request($req);
    $pastebin = $res->content; 

    my @links = getlinks();
    $linkCount = $#links;

    if ($DEL_DEBUG){print "\n";}  # Just a stupid formatting thing
    print "Starting new batch. Save-to dir is $dir. Keywords file is $keyWordsFileName. regex is: $regex\n";
    if ($DEL_DEBUG){ print "size of \@links: $linkCount\n";}
    if (@links) {
     foreach $line (@links){
         &RandSleep ($delayInterval);
         if  (checkurl($line) == 0){
          my $request = "http://pastebin.com/$line\n";
          my $link = $line;
          my $req = new HTTP::Request GET => "$request";     
          my $res = $ua->request($req);
          my $content = $res->content;
          my @data = $content;
          if ($DEL_DEBUG){#print "-------------------------------------------------\n";
              print "checking ($linkCount) - http://pastebin.com/$line ... ";
              $linkCount--;
          }
          foreach $line (@data){
              if ($content =~ m/\<textarea.*?\)\"\>(.*?)\<\/textarea\>/sgm){     
               @data = $1; 
               foreach $line (@data){
                   if ($line =~ m/($regex)/i){
                    $Match = keyWordMatch ($line);
                    storeurl($link);
                    if ($DEL_DEBUG){ print " matched $Match ...";}
                    &writeHitToFile ($link, $line, $Match);
                   }
               }
              }
          }
         }          
     }
    }
    else {
     die "fetch of links failed - can't say why\n";
    }
}

sub getlinks{
    my @results;
    if (defined $pastebin) {
        @data = $pastebin;
        foreach $line (@data){
            while ($line =~ m/border\=\"0\"\s\/\>\<a\shref\=\"\/(.*?)"\>/g){
                my $url = $1;
             push (@results, $url);        
         }
     }
    }
    
    return @results;
}

sub storeurl {
    my $url = shift;
    open (FILE,">> $tracking_file") or die("cannot open $tracking_file");
    print FILE $url."\n";
    close FILE;
}

sub checkurl {
    my $url = shift;
    open (FILE,"< $tracking_file") or die("cannot open $tracking_file");
    foreach my $line ( <FILE> ) {
     if ( $line =~ m/$url/i ) {
         if ($DEL_DEBUG){print "detected repeat check of $url ";}
         return 1;
     }
    }
    return 0;
}

sub RandSleep{
    my $maxSleepTime = pop;
    my $sleepTime = int rand ($maxSleepTime + 1); # Need the +1 since we'll never hit maxSleepTime otherwise

    if ($DEL_DEBUG){print "sleeping for $sleepTime\n";}
    sleep $sleepTime;
}

sub writeHitToFile{

    my $matchingExpression = pop;
    my $Contents = pop;
    my $url = pop;
    chomp ($url);

    unless (-e $dir){
     mkdir $dir or die "could not create directory $dir: $!\n";
    }

    if (-d $dir){
     open (HIT_FILE, ">$dir/$url") or die "could not open $dir/$url for write: $!\n";
     print HIT_FILE "http://pastebin.com/$url matched \"$matchingExpression\"\n" or die "print of url to $dir/$url failed: $!\n";
     print HIT_FILE $Contents or die "print of contents to $dir/$url failed: $!\n";
     close HIT_FILE;

     # Get the current time for the list file entry
     my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
     my $datestring = sprintf("%4d-%02d-%02d %02d:%02d",($year + 1900),($mon+1),$mday, $hour, $min);

     open (HIT_LIST_FILE, ">>HitList.txt") or die "could not open HitList.txt for append: $!\n";
     print HIT_LIST_FILE "$dir/$url - http://pastebin.com/$url matched \"$matchingExpression\" at $datestring\n" or die "print of hit to HitList.txt failed: $!\n";
     close HIT_LIST_FILE;
    }
    else {
     die "$dir exists but is not a directory!\n";
    }
}

sub keyWordMatch{
    my $matchingLine = pop;

    foreach $check (@keywords){
     if ($matchingLine =~ m/$check/i){
         return $check;
     }
    }
    return "No Match";
}

sub showHelp {
    print<<endHelp
$0: [-h] [-d] [-w <Max Wait Interval in seconds>] [-k <Keywords File>]
Monitor pastebin.com public pastes for keywords of interest, saving matching pastes as they are found
-h: Show this help message
-d: Print debug output
-w <wait seconds>: Max wait in seconds between fetches.  Each fetch is delayed a random amount between 0 and this value. Default is 5 seconds.
-k <filename>: Name of file with keywords to monitor for.  Each line of the file is text or a perl regular expression. Default is \'keywords.txt\'

Track progress via \"tail -f HitList.txt\"
endHelp
     ;
    exit;  # We always exit after showing help
}
 



In my version, I make a point of throttling my accesses since I don't want to abuse their site, so I'm likely only sampling a portion of what's posted to Pastebin.

Even with sampling I've been pulling up a lot of data - e.g. in one 24-hour period I ended up grabbing 1313 pastes containing the word "password" ... several of which document compromised accounts.  In that same period there were 28 pastes with the word "passwd", and 200 with the word "anonymous" in them.  When I have time, this will all go into my password cracking lists.
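Since every hit also gets appended to HitList.txt (one line per saved paste, including the keyword that matched and a timestamp), a rough way to get counts like the ones above is a quick Perl one-liner along these lines (just a sketch - it tallies the quoted match string from each HitList.txt line):

perl -lne '$count{$1}++ if /matched "(.+?)"/; END { printf "%5d  %s\n", $count{$_}, $_ for sort { $count{$b} <=> $count{$a} } keys %count }' HitList.txt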

Just as an example of the sort of thing which turns up, without trying very hard, I came across a portion of the 55,000 Twitter accounts which were compromised last year.  (Like the echo of a scream - they're still bouncing around on the Internet.)   Other things I've stumbled across include the latest call to action by Anonymous (operation #fema) and several password lists posted from recently hacked sites.  As yet another example, I think I came across source code for a couple of programs which appear to be part of Windows NT.

Probably the biggest problem with all this is that there's too much data to easily sort through.  The next round of modifications to my program will focus on finding ways to sort through the results more efficiently.

The bottom line of all this is that monitoring Pastebin can give you a very interesting view into some portions of the Internet.  Of course, there are probably many other, similar, places you can productively monitor.

To use my program you may need to install the LWP perl module (although it appears to be installed by default).  And then let 'er rip!
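Putting the pieces together, here's roughly how I run things - assuming you've saved the monitor as PasteScrape.pl and the viewer as PasteView.pl (adjust the names and options to whatever you're actually using):

perl PasteScrape.pl -w 10 -k keywords.txt &   # collect matching pastes in the background
tail -f HitList.txt                           # watch the hits roll in
perl PasteView.pl -n -e -l 2013-04-*/*        # later on, review just the new files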

Here's some more discussion about monitoring Pastebin: http://isc.sans.org/diary/SCADA+hacks+published+on+Pastebin/12088 https://isc.sans.edu/diary/Quick+Tip%3A+Pastebin+Monitoring+%26+Recon/12091