<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Mad Philosopher &#187; Search Results  &#187;  denyhosts</title>
	<atom:link href="http://madphilosopher.ca/?s=denyhosts&#038;feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://madphilosopher.ca</link>
	<description>Because being mad is all the rage. A personal weblog of Darren Paul Griffith.</description>
	<lastBuildDate>Sun, 25 Jul 2010 02:12:09 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>URL Scraper Script</title>
		<link>http://madphilosopher.ca/code/url-scraper-script/</link>
		<comments>http://madphilosopher.ca/code/url-scraper-script/#comments</comments>
		<pubDate>Sun, 19 Jun 2005 18:12:00 +0000</pubDate>
		<dc:creator>Darren</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://madphilosopher.ca/url-scraper-script/</guid>
		<description><![CDATA[Description
I wrote the url_scrape.py command-line filter to scrape HTML, XML, and plaintext documents for the URLs that they contain. It reads from standard input and gives the results on standard output, one URL per line.
The heart of the script is the following regular expression, in Python:

url_pattern = re.compile('''[&#34;']http://[^+]*?['&#34;]''')

which looks for URLs between quotes [...]]]></description>
			<content:encoded><![CDATA[<h4>Description</h4>
<p>I wrote the <code>url_scrape.py</code> command-line filter to scrape HTML, XML, and plaintext documents for the URLs that they contain. It reads from standard input and gives the results on standard output, one URL per line.</p>
<p>The heart of the script is the following regular expression, in Python:</p>
<pre>
url_pattern = re.compile('''[&quot;']http://[^+]*?['&quot;]''')
</pre>
<p>which looks for URLs between quotes (<code>""</code> or <code>''</code>) that start with <code>http://</code>.</p>
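<p>As a rough illustration of how such a pattern could be wrapped in a stdin-to-stdout filter, here is a minimal sketch. It is not the downloadable <code>url_scrape.py</code> itself; the function name <code>scrape_urls</code> and the way the surrounding quotes are stripped are assumptions made for the example.</p>

```python
import re
import sys

# The pattern quoted above: an opening quote, http://, a lazy run of
# characters other than '+', then a closing quote.
url_pattern = re.compile('''["']http://[^+]*?['"]''')


def scrape_urls(text):
    """Return the quoted http:// URLs found in text, without the quotes.

    Hypothetical helper for this sketch; the real script may differ.
    """
    return [match[1:-1] for match in url_pattern.findall(text)]


if __name__ == "__main__":
    # Read the whole document from standard input, print one URL per line.
    for url in scrape_urls(sys.stdin.read()):
        print(url)
```

<p>As written, the pattern does not require the opening and closing quotes to be the same kind, and because of <code>[^+]</code> a quoted URL that contains a <code>+</code> is not matched.</p>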
<p>The script is most useful if you mix and match it with <code>sort</code>, <code>uniq</code>, and <code>grep</code>. Unix geeks know the drill. For example:</p>
<pre>
[beaker] ~> wget -O - http://madphilosopher.ca/ | ./url_scrape.py | sort | uniq
http://coralcdn.com/
http://del.icio.us/madphilosopher
http://del.icio.us/madphilosopher/comments
http://denyhosts.sourceforge.net/
http://en.wikipedia.org/wiki/Slashdot_effect
http://geourl.org/near/?p=http://madphilosopher.ca/
http://gmpg.org/xfn/
http://gmpg.org/xfn/11
http://home.online.no/~kafox/blogfiles/
http://iloveradio.org
http://madphilosopher.ca
http://madphilosopher.ca/2003/01/
http://madphilosopher.ca/2003/02/
http://madphilosopher.ca/2003/03/
http://madphilosopher.ca/2003/04/
...
http://madphilosopher.ca/images/delicious_links.gif
http://madphilosopher.ca/images/geourl.gif
http://madphilosopher.ca/images/get_firefox.gif
http://madphilosopher.ca/images/rss2.gif
...
</pre>
<h4>The script</h4>
<p><a href="/doc/url_scrape_py.txt">Download the <code>url_scrape.py</code> script.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://madphilosopher.ca/code/url-scraper-script/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DenyHosts on FreeBSD</title>
		<link>http://madphilosopher.ca/2005/06/denyhosts-on-freebsd/</link>
		<comments>http://madphilosopher.ca/2005/06/denyhosts-on-freebsd/#comments</comments>
		<pubDate>Tue, 07 Jun 2005 15:45:54 +0000</pubDate>
		<dc:creator>Darren</dc:creator>
				<category><![CDATA[FreeBSD]]></category>

		<guid isPermaLink="false">http://madphilosopher.ca/2005/06/denyhosts-on-freebsd/</guid>
		<description><![CDATA[On a tip from the news site RootPrompt, I discovered a small security utility called DenyHosts, which helps Linux systems thwart attacks on their ssh servers. It examines the sshd logs and looks for multiple failed login attempts. It then collects the IP addresses of the offending hosts and writes them out to /etc/hosts.deny [...]]]></description>
			<content:encoded><![CDATA[<p>On a tip from the news site <a href="http://rootprompt.org/">RootPrompt</a>, I discovered a small security utility called <a href="http://denyhosts.sourceforge.net/">DenyHosts</a>, which helps Linux systems thwart attacks on their ssh servers. It examines the sshd logs and looks for multiple failed login attempts. It then collects the IP addresses of the offending hosts and writes them out to <code>/etc/hosts.deny</code> so that these hosts will be blocked from further access to the machine.</p>
<p>Since the server in question is running <a href="http://www.freebsd.org/">FreeBSD</a>, which uses a combined allow/deny syntax in <code>hosts.allow</code> and doesn&#8217;t use <code>hosts.deny</code>, I had to modify the DenyHosts script slightly to get it to work in the FreeBSD context. Basically, I configured DenyHosts to write to a dummy <code>hosts.deny</code> file and then wrapped it in a <code>cron(8)</code> script that concatenates this dummy file with a <code>hosts.allow.template</code> file. Thus <code>hosts.allow</code> is generated dynamically, with the dynamic deny rules first and the static allow rules last.</p>
<p>It seems to be working so far. <img src='http://madphilosopher.ca/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><b>Update from the comments:</b> FreeBSD is now supported in the latest version of DenyHosts.</p>
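<p>The concatenation step described above could be sketched roughly as follows. This is an illustration only, not the wrapper I actually ran from cron: the function name <code>build_hosts_allow</code> and the paths in the comment are hypothetical, and the sketch assumes the lines in the dummy file are already valid <code>hosts.allow</code> rules.</p>

```python
from pathlib import Path


def build_hosts_allow(deny_path, template_path, output_path):
    """Write the dynamic deny rules, then the static template, to output_path.

    Hypothetical helper: assumes the dummy deny file already contains
    rules in the combined hosts.allow syntax.
    """
    deny_rules = Path(deny_path).read_text()
    static_rules = Path(template_path).read_text()

    # Make sure the last deny rule is terminated before the template starts.
    if deny_rules and not deny_rules.endswith("\n"):
        deny_rules += "\n"

    combined = deny_rules + static_rules

    # Write to a temporary file and rename it into place, so that a
    # half-written hosts.allow is never left at output_path.
    tmp_path = Path(str(output_path) + ".tmp")
    tmp_path.write_text(combined)
    tmp_path.replace(output_path)
    return combined


# Example call with placeholder paths (assumptions, not the real layout):
# build_hosts_allow("/etc/hosts.deny.dummy",
#                   "/etc/hosts.allow.template",
#                   "/etc/hosts.allow")
```

<p>Putting the deny rules first matters if the first matching rule wins, so a blocked host needs to be denied before any broader allow rule in the template is reached.</p>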
]]></content:encoded>
			<wfw:commentRss>http://madphilosopher.ca/2005/06/denyhosts-on-freebsd/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
