<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: North by northwest</title>
	<atom:link href="http://www.buzzmachine.com/2006/07/08/north-by-northwest/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.buzzmachine.com/2006/07/08/north-by-northwest/</link>
	<description>by Jeff Jarvis</description>
	<lastBuildDate>Fri, 10 Feb 2012 12:21:30 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Robert Feinman</title>
		<link>http://www.buzzmachine.com/2006/07/08/north-by-northwest/#comment-87512</link>
		<dc:creator>Robert Feinman</dc:creator>
		<pubDate>Sat, 08 Jul 2006 19:35:14 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/index.php/2006/07/08/north-by-northwest/#comment-87512</guid>
		<description>Early in my professional career, some colleagues and I worked on connections between scientific documents. We called it clustering at the time. Citations were a very good predictor of relevance, as were certain patterns of keywords. The computing power and storage capacity of those days weren&#039;t enough to do more than organize specialized scientific subspecialties.

Now that the horsepower is available, it is sad to see that the algorithms used for analysis are rather unsophisticated. One reason sites like Google work is that most people are satisfied with a close match to what they are looking for, rather than requiring high relevance. This weakness extends to our spy agencies as well, as their reliance on wholesale harvesting of electronic data illustrates. If they really had algorithms for discovering patterns of illegal activity, they would be doing targeted data capture.</description>
		<content:encoded><![CDATA[<p>Early in my professional career, some colleagues and I worked on connections between scientific documents. We called it clustering at the time. Citations were a very good predictor of relevance, as were certain patterns of keywords. The computing power and storage capacity of those days weren&#8217;t enough to do more than organize specialized scientific subspecialties.</p>
<p>Now that the horsepower is available, it is sad to see that the algorithms used for analysis are rather unsophisticated. One reason sites like Google work is that most people are satisfied with a close match to what they are looking for, rather than requiring high relevance. This weakness extends to our spy agencies as well, as their reliance on wholesale harvesting of electronic data illustrates. If they really had algorithms for discovering patterns of illegal activity, they would be doing targeted data capture.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

