<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Open Data</title>
	<atom:link href="http://www.buzzmachine.com/2007/03/13/open-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.buzzmachine.com/2007/03/13/open-data/</link>
	<description>by Jeff Jarvis</description>
	<pubDate>Wed, 03 Dec 2008 04:33:26 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Rufus Pollock</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-346056</link>
		<dc:creator>Rufus Pollock</dc:creator>
		<pubDate>Mon, 26 Mar 2007 18:02:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-346056</guid>
		<description>Open data is to media what open source is to technology. Open data is an approach to content creation that explicitly recognizes the value of implicit user data.

The analogy goes further: we can simply port the open source definition to create an open data definition or, more broadly, an open knowledge definition:

&lt;a href="http://www.opendefinition.org/" rel="nofollow"&gt;http://www.opendefinition.org/&lt;/a&gt;.

I think this is a nicer way of going about defining open data than talking about 'recognizing the value implicit in user data' which is fairly vague (plus what about all the other types of data from geographic to genomic?). Even in the context of companies providing various data services to users it seems to me the main point of open data is to reduce lock-in -- not necessarily to recognize the value inherent in the data itself.</description>
		<content:encoded><![CDATA[<p>Open data is to media what open source is to technology. Open data is an approach to content creation that explicitly recognizes the value of implicit user data.</p>
<p>The analogy goes further: we can simply port the open source definition to create an open data definition or, more broadly, an open knowledge definition:</p>
<p><a href="http://www.opendefinition.org/" rel="nofollow">http://www.opendefinition.org/</a>.</p>
<p>I think this is a nicer way of going about defining open data than talking about &#8216;recognizing the value implicit in user data&#8217; which is fairly vague (plus what about all the other types of data from geographic to genomic?). Even in the context of companies providing various data services to users it seems to me the main point of open data is to reduce lock-in &#8212; not necessarily to recognize the value inherent in the data itself.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matthew Hurst</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-344863</link>
		<dc:creator>Matthew Hurst</dc:creator>
		<pubDate>Thu, 15 Mar 2007 14:28:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-344863</guid>
		<description>I think there is some confusion here between object data and meta data which is obtained via analytics. While there is no real debate about who owns the object data (and there are many models which work by simply taking that data and exploiting it with no permission whatsoever - search for example) there seems to be some debate here about who owns the meta data. I suspect that vendors will get value from opening up how they do stuff - e.g. how an influence metric is computed - and derive revenue from the fact that they can do it. In other words, the barrier here may well be simply the scale of the task. For example, it would make sense to disclose how one computes influence, but it still requires a huge amount of infrastructure and historical data to deliver accurate and reliable results.

With things like text analytics, the key is going to be proving the accuracy of the method. It is one thing to extract a bunch of company names or ticker symbols from social media, but how accurate is it? Is there a bias to one type of blog over another? Convincing people of this is a key challenge.

As for opening up data, a number of institutions including Intelliseek/BuzzMetrics and TREC have offered data sets for analysis in (academic) research contexts. Again, one needs to consider the 'owning' of the data with the cost of aggregation and distribution. Sure, we all own our blog posts, but I don't own the infrastructure and distribution channels that various institutions invest in to acquire and analyse that data. Thus I can't perform induction on my ownership and claim ownership of all blog data, the infrastructures that aggregate it and so on.

(BTW, I've worked with Kate in the past but won't damn her with praise here ;-)</description>
		<content:encoded><![CDATA[<p>I think there is some confusion here between object data and meta data which is obtained via analytics. While there is no real debate about who owns the object data (and there are many models which work by simply taking that data and exploiting it with no permission whatsoever - search for example) there seems to be some debate here about who owns the meta data. I suspect that vendors will get value from opening up how they do stuff - e.g. how an influence metric is computed - and derive revenue from the fact that they can do it. In other words, the barrier here may well be simply the scale of the task. For example, it would make sense to disclose how one computes influence, but it still requires a huge amount of infrastructure and historical data to deliver accurate and reliable results.</p>
<p>With things like text analytics, the key is going to be proving the accuracy of the method. It is one thing to extract a bunch of company names or ticker symbols from social media, but how accurate is it? Is there a bias to one type of blog over another? Convincing people of this is a key challenge.</p>
<p>As for opening up data, a number of institutions including Intelliseek/BuzzMetrics and TREC have offered data sets for analysis in (academic) research contexts. Again, one needs to consider the &#8216;owning&#8217; of the data with the cost of aggregation and distribution. Sure, we all own our blog posts, but I don&#8217;t own the infrastructure and distribution channels that various institutions invest in to acquire and analyse that data. Thus I can&#8217;t perform induction on my ownership and claim ownership of all blog data, the infrastructures that aggregate it and so on.</p>
<p>(BTW, I&#8217;ve worked with Kate in the past but won&#8217;t damn her with praise here <img src='http://www.buzzmachine.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Carson</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-344795</link>
		<dc:creator>Jonathan Carson</dc:creator>
		<pubDate>Tue, 13 Mar 2007 21:51:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-344795</guid>
		<description>Jeff - 

We do give away a huge amount of data and analytics, primarily through our Blogpulse.com site, but also through presentation at conferences like the one you are at with Kate (who, by the way, is quite bright and definitely "gets it"), open publication in academic journals, or informal blog posts like the ones you referenced on Matt's blog.

In fact, we put many of our analytic techniques on Blogpulse before we put them into client products.  Floodgate is a good example of this; the first use of the Floodgate technology was in the "Blogpulse Live" application debuted a number of months ago on Blogpulse.com.  We have a whole second iteration of that technology planned for a forthcoming update of Blogpulse, and all of this will happen before those technologies are ever used in client deliverables. Many of our approaches for influencer/social network analysis have also been debuted through Blogpulse.

Jonathan Carson
CEO
Nielsen BuzzMetrics</description>
		<content:encoded><![CDATA[<p>Jeff - </p>
<p>We do give away a huge amount of data and analytics, primarily through our Blogpulse.com site, but also through presentation at conferences like the one you are at with Kate (who, by the way, is quite bright and definitely &#8220;gets it&#8221;), open publication in academic journals, or informal blog posts like the ones you referenced on Matt&#8217;s blog.</p>
<p>In fact, we put many of our analytic techniques on Blogpulse before we put them into client products.  Floodgate is a good example of this; the first use of the Floodgate technology was in the &#8220;Blogpulse Live&#8221; application debuted a number of months ago on Blogpulse.com.  We have a whole second iteration of that technology planned for a forthcoming update of Blogpulse, and all of this will happen before those technologies are ever used in client deliverables. Many of our approaches for influencer/social network analysis have also been debuted through Blogpulse.</p>
<p>Jonathan Carson<br />
CEO<br />
Nielsen BuzzMetrics</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: SixRocks</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-344783</link>
		<dc:creator>SixRocks</dc:creator>
		<pubDate>Tue, 13 Mar 2007 19:47:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-344783</guid>
		<description>I think you're missing the point on what ClearForest is up to. What you describe as far as filtering tags is a layup for them. Where their stuff gets interesting is in real-time semantic analysis of web content. They are taking a "top down" approach to creating the semantic web by extracting meaning from messy text.

I'm one of many who have created mashups based on their web service. Take a look at http://sws.clearforest.com and poke around (and look at the cool mashups listed on the right hand side!).</description>
		<content:encoded><![CDATA[<p>I think you&#8217;re missing the point on what ClearForest is up to. What you describe as far as filtering tags is a layup for them. Where their stuff gets interesting is in real-time semantic analysis of web content. They are taking a &#8220;top down&#8221; approach to creating the semantic web by extracting meaning from messy text.</p>
<p>I&#8217;m one of many who have created mashups based on their web service. Take a look at <a href="http://sws.clearforest.com" rel="nofollow">http://sws.clearforest.com</a> and poke around (and look at the cool mashups listed on the right hand side!).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Publius</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-344781</link>
		<dc:creator>Publius</dc:creator>
		<pubDate>Tue, 13 Mar 2007 19:08:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-344781</guid>
		<description>Why is it our data?  Did we spend the time and money to collect it?</description>
		<content:encoded><![CDATA[<p>Why is it our data?  Did we spend the time and money to collect it?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JamesBruni</title>
		<link>http://www.buzzmachine.com/2007/03/13/open-data/#comment-344761</link>
		<dc:creator>JamesBruni</dc:creator>
		<pubDate>Tue, 13 Mar 2007 15:51:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.buzzmachine.com/2007/03/13/open-data/#comment-344761</guid>
		<description>Seth's ideas about putting "threads" in a "vault" to be sold to marketers, advertisers and PR agencies are ambitious (yet a long way off in the future).   His presentation at NY Tech Meetup a few months back got a lot of reaction.  I don't know how his "Root Exchange" for mortgage leads is doing, but he's definitely got some financing, from notables such as Lew Rainieri, and others.</description>
		<content:encoded><![CDATA[<p>Seth&#8217;s ideas about putting &#8220;threads&#8221; in a &#8220;vault&#8221; to be sold to marketers, advertisers and PR agencies are ambitious (yet a long way off in the future).   His presentation at NY Tech Meetup a few months back got a lot of reaction.  I don&#8217;t know how his &#8220;Root Exchange&#8221; for mortgage leads is doing, but he&#8217;s definitely got some financing, from notables such as Lew Rainieri, and others.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
