Posts Tagged ‘filters’

Mining Twitter for Gold

Tuesday, January 12th, 2010

Finding the 27% of Tweets that Have Value

A recent study by ReadWriteWeb has shown that only 27% of tweets contain information with some value. Many people will point to this and use it to dismiss Twitter as a worthwhile platform. However, this number comes from Twitter’s flexibility. Some people use it to keep in touch with friends, others use it to break news. Some use Twitter for advertising and others use it for sharing information they find on blogs.

It’s this last group that’s the most interesting. It’s the human web. It’s people finding information and sharing it that adds value where search engines cannot.

The problem is finding the tweets that make up this 27% of the stream that holds information of value. Further, 27% doesn’t sound like much until you realize it’s 70+ million tweets per week. The best information on Twitter amounts to a needle in a haystack.

This points to the growing need for filters and recommendation engines for the real-time web. Last week I posted on micro filters and I believe this post by ReadWriteWeb further emphasizes this need.

To leverage the value that Twitter and the whole real-time web hold, we need better tools. We need more filters that go beyond the basics: Twitter lists, follower lists, and individual favorites. For example, value can be attributed to the number of people sharing the same content or the credibility and clout of those sharing it.
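As a rough sketch of that last idea, a link can be scored by adding up the credibility of the distinct people who shared it. The `score_links` function, the credibility numbers, and the default weight for unknown users are all assumptions made up for illustration; how credibility would actually be measured is left open.

```python
from collections import defaultdict


def score_links(shares, credibility, default_credibility=0.1):
    """Rank links by the summed credibility of the distinct users who shared them.

    shares: iterable of (user, url) pairs.
    credibility: dict mapping user -> weight (hypothetical values).
    """
    sharers = defaultdict(set)
    for user, url in shares:
        sharers[url].add(user)  # a set, so repeat shares by one user count once

    scores = {
        url: sum(credibility.get(user, default_credibility) for user in users)
        for url, users in sharers.items()
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


shares = [
    ("alice", "http://example.com/a"),
    ("bob", "http://example.com/a"),
    ("carol", "http://example.com/b"),
    ("alice", "http://example.com/a"),  # duplicate share, ignored
]
credibility = {"alice": 0.9, "bob": 0.5}  # carol falls back to the default
ranked = score_links(shares, credibility)
```

Counting each sharer once keeps a single noisy account from inflating a link, which is one reason to weight by credibility rather than raw share counts.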

If the web is going to evolve beyond search, micro filters will play a huge part in it, but filters alone are not the answer.

Recommendation systems are the other piece of the puzzle. They’re needed to understand user behaviors: what people like and don’t like, what they favorite, what they read, and what they share. Recommendation systems leverage this data and combine it with filters to find the best information that people want to read. This helps us take full advantage of the real-time web without becoming overwhelmed.

To solve the problem of finding the 27% of Tweets that have some value, filters will be used to narrow the stream of information. Then recommendation systems, which have some insight into our past behavior, will be able to narrow the focus even further by taking the information output by these filters and funneling it to us based on our interests. This means that we’ll all be giving up some privacy on the web, but it’s a trade-off we’ll need to make to keep up with the barrage of information.
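To make the second stage concrete, here is a minimal sketch of a recommender that ranks already-filtered tweets against a reader’s past behavior. It assumes a deliberately naive interest profile (word counts from items the reader favorited or shared); the function names and sample data are hypothetical, and a real system would at least drop common words and use richer signals.

```python
from collections import Counter


def build_profile(past_items):
    """Count words across the texts a reader favorited, read, or shared."""
    profile = Counter()
    for text in past_items:
        profile.update(text.lower().split())
    return profile


def recommend(filtered_tweets, profile, top_n=3):
    """Return up to top_n tweets, best match to the profile first."""
    def score(text):
        # Each distinct word contributes its count from the reader's history.
        return sum(profile[word] for word in set(text.lower().split()))

    scored = [(score(text), text) for text in filtered_tweets]
    scored = [pair for pair in scored if pair[0] > 0]  # drop non-matches
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


past = ["python filters tutorial", "python tips"]
candidates = ["new python filters library", "celebrity gossip", "gardening tips"]
picks = recommend(candidates, build_profile(past))
```

The privacy trade-off is visible in the code: `build_profile` only works if the reader’s history is available to it.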

Why Micro Filters are the Future of the Web

Thursday, January 7th, 2010

In 2010, we need to find a better way to filter the web. It’s growing exponentially every day to the point that only the largest server farms can keep up.

Twitter’s API can’t keep up with its own traffic. This will soon change when the firehose is opened up to everyone, but that will just push the problem further downstream. Developers are eager to have access to the firehose of data, but they won’t be able to process it all, nor should they try. And this is only one piece of the real-time web puzzle. Factor in Facebook, Google Wave, LinkedIn’s upcoming API, and many more, and it becomes next to impossible for one company to filter and analyze everything.

To resolve this problem, we need micro filters.

What is a Micro Filter?

A micro filter is a filter that has a unique purpose and is reusable and available to anyone.

One example of a micro filter is a Twitter list. These lists are filters that web applications can use to narrow the firehose and make information gathering manageable. But there’s one problem. Twitter lists don’t filter the information in a meaningful way. You can’t grab every Twitter list on marketing and gather all the marketing tweets. A marketing Twitter list can be as diverse as Twitter itself and can overlap with many other lists outside of marketing.

This is why we need multiple micro filters to get the information we want. A series of filters, when put together, would narrow the focus of information to the data you need for your web application or research project. Running your marketing Twitter lists through a marketing filter would narrow the focus and give you the marketing information you need.
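One way to picture this is to treat each micro filter as a small function that takes tweets in and passes fewer tweets out, so filters can be chained. The sketch below is only an illustration: `chain`, `from_list`, and `about_topic` are made-up names, tweets are plain dictionaries, and the keyword match stands in for a real topic filter.

```python
def chain(*filters):
    """Combine micro filters into one filter that applies them in order."""
    def combined(tweets):
        for micro_filter in filters:
            tweets = micro_filter(tweets)
        return tweets
    return combined


def from_list(members):
    """Micro filter: keep tweets written by members of a Twitter list."""
    members = set(members)
    return lambda tweets: [t for t in tweets if t["user"] in members]


def about_topic(keywords):
    """Micro filter: keep tweets containing any keyword (a crude stand-in)."""
    keywords = {k.lower() for k in keywords}
    return lambda tweets: [
        t for t in tweets if keywords & set(t["text"].lower().split())
    ]


tweets = [
    {"user": "alice", "text": "New SEO tips for your marketing plan"},
    {"user": "alice", "text": "Lunch was great today"},
    {"user": "dave", "text": "marketing budgets are up"},
]
marketing = chain(from_list(["alice", "bob"]), about_topic(["marketing", "seo"]))
result = marketing(tweets)
```

Because every filter has the same shape, each one stays reusable on its own, which is the point of keeping them micro.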

Creating Micro Filters

Creating micro filters is very complex. I used Twitter lists as an example but this is one of the easier filters to build. The complexity increases when you try to create the “marketing” filter in the example above. How do you know what information in a Tweet is related to marketing?

There are several ways to do this:

  1. Hash Tags: Hash tags are great identifiers but they’re not popular enough to filter on. Too much information would be lost.
  2. Open APIs: Take the links from each Tweet, convert each URL to its long format, look it up in Delicious, and check for marketing tags. This works but it has a couple of downsides. First, it requires a lot of processing time. Second, the link may not be tagged in Delicious yet.
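The two tactics above can be combined: check the cheap signal (hash tags) first, and fall back to the expensive link lookup only when needed. This is a sketch under stated assumptions. `expand_url` and `lookup_tags` are hypothetical callables that the caller supplies; they stand in for a URL-expansion service and a bookmark-tag lookup, and no real API is called here.

```python
import re

HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")


def is_on_topic(text, topic_tags, expand_url, lookup_tags):
    """Return True if a tweet's hash tags or its links' tags match the topic."""
    topic_tags = {tag.lower() for tag in topic_tags}

    # Tactic 1: hash tags. Strip URLs first so "#fragment" parts don't count.
    hashtags = {h.lower() for h in HASHTAG.findall(URL.sub("", text))}
    if hashtags & topic_tags:
        return True

    # Tactic 2: expand each link and check the tags other people gave it.
    for short_url in URL.findall(text):
        tags = {tag.lower() for tag in lookup_tags(expand_url(short_url))}
        if tags & topic_tags:
            return True
    return False


# Stub data in place of real services (hypothetical values).
expanded = {"http://bit.ly/x": "http://example.com/post"}
bookmarks = {"http://example.com/post": ["Marketing", "seo"]}
expand = lambda url: expanded.get(url, url)
lookup = lambda url: bookmarks.get(url, [])

by_hashtag = is_on_topic("Great read #marketing", {"marketing"}, expand, lookup)
by_link = is_on_topic("Great read http://bit.ly/x", {"marketing"}, expand, lookup)
no_match = is_on_topic("Lunch http://bit.ly/y", {"marketing"}, expand, lookup)
```

Ordering the checks this way addresses the processing-time downside only partly. An untagged link still falls through as a miss, just as the second downside describes.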

There isn’t a perfect solution, but it’s clear that a combination of tactics is needed to build this “marketing” filter: tactics that go well beyond individuals categorizing other individuals in a social networking platform, as Twitter lists do.

Further, several micro filters could be put together to keep narrowing the focus. You could add a third filter to the example above that shows all marketing information shared within 5 miles of you. This location filter would be the third micro filter and it could be used in many different situations.
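A location filter like this could be sketched with the haversine formula for great-circle distance. The example assumes each tweet is a dictionary with an optional `coords` entry holding a (latitude, longitude) pair; that field name and the `near` helper are illustrative, not part of any Twitter API.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def near(lat, lon, max_miles=5.0):
    """Micro filter: keep tweets geotagged within max_miles of (lat, lon)."""
    def location_filter(tweets):
        return [
            t for t in tweets
            if t.get("coords") is not None
            and miles_between(lat, lon, *t["coords"]) <= max_miles
        ]
    return location_filter


tweets = [
    {"text": "marketing meetup tonight", "coords": (40.7628, -74.0060)},
    {"text": "marketing webinar", "coords": (41.0000, -74.0060)},
    {"text": "marketing tips"},  # no location, so it is dropped
]
nearby = near(40.7128, -74.0060)(tweets)
```

Tweets without a location are dropped rather than guessed at, which is a design choice: a location filter that lets unlocated tweets through would not narrow much.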

In 2010, I expect to see more filters become available to help people focus on the topics that interest them most. Looking at Twitter, it’s clear that filtering is going to become the next big development as people gather more followers, share more information, and expand their presence across more social media platforms.

Currently, Twitter is an unreliable platform for contacting people as the API can’t handle the streams of information going to its most popular residents. Further, at close to 300 million Tweets per week, there’s a lot of great information getting lost in the noise and this isn’t just an issue on Twitter. It’s happening everywhere which is why micro filters are the future of the web.