Monday, January 16, 2012

Extract links from web pages

One or two weeks ago, I tried to download some videos from a website under Linux. Of course it was not possible to simply download the videos with the browser - the videos were secured. A challenge for a hacker. ;) After a quick search in the Ubuntu Software Center I found get-flash-videos. It is a small Perl command line program, capable of retrieving Flash movies from a wide variety of movie sites. But sadly, it only worked once or twice for me. The next day, when I tried the program again, it stopped working. And so began the search for a solution...

For Windows I use URL Snooper 2 from Mouser software. It is a great utility - a very powerful tool in combination with a download manager like the Firefox extension DownThemAll. You can grab nearly any video from the web with ease.

But I did not find any tool like URL Snooper for Linux, so I searched the web for some "inspiration". And finally I found ngrep. It is a command line tool - a bit old (not maintained since 2006) - but easy to use.
Combined with some Perl and DownThemAll it is quite powerful. Not as easy and powerful as URL Snooper, but good enough to download the movies I wanted.
And the best part: the command fits into one single line. ;) (okay, only if you have a big display where 177 characters fit - but it is one line). Of course it could be optimized further and a few characters could be saved.

Because this code is for the extraction of movies from an adult website, I will not post the link here. ;)
Here is the "proof of concept":
sudo ngrep -n 1 -q -d eth0 '(/key.*flv).*Host: ([\d.])' 'tcp and dst port 80' | perl -ne 'm/.*(\/key.*flv).*Host: (\d+\.\d+\.\d+\.\d+).*/; if ($2){ print "http://$2$1\n"; exit; }'
Take it as a source for your inspiration. Try to figure out what the line of code is doing. Ngrep and Perl can be your friends!