New Ways To Search
There must be a better way to search the Web than typing in keywords and trawling through a long list of results just to find out if any of them are what you seek. If you know you want to check the features of a digital camera, you don't want to read tips on taking better pictures or hear about special offers on accessories. If you want the address of a business, you have to hope the company put it in a sensible place on its Website. Indeed, the Web, and most Websites, aren’t organized very well and there’s no consistent way to find specific categories of information. Four of the new tools launching at the Demo conference this week take very different approaches to the same problem, but they all use a mix of context, categorization, semantic analysis, and ontology to make sense of the mass of online information and present it more comprehensibly. Try them for yourself and let us know if any of them perform searches better than the big G does.
Evri: Details on Almost Everything
Evri (www.evri.com) started out as a tool rather like the “Sphere” widget that sites like CNN.com use to show related content on the same site and elsewhere on the Web. This technique is one way of finding pages on the same topic as the one you’re already reading, by using natural language techniques to understand what the page is about. Evri mines sites like Wikipedia, Amazon, and Freebase to build an extensive dictionary. It knows about 42 different things called “blue” and it knows that “ten” is a Pearl Jam album as well as a number. It uses rules like “people speak and cities don’t” to avoid getting confused by phrases like “Chief Seattle spoke.” And it puts together the information it derives from Web pages and the connections it finds between them to build a page about each entity (a person, product, or thing) in the knowledge base. This combines search results with standardized information: everything from date of birth, family members, salary, and nicknames for people to revenue and number of employees for companies.
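To make the “people speak and cities don’t” idea concrete, here is a minimal sketch of rule-based disambiguation. It is not Evri’s actual implementation: the tiny knowledge base, the entity IDs, the verb table, and the `disambiguate` function are all hypothetical and exist only to illustrate how a type rule can prune candidates.

```python
# Hypothetical mini knowledge base: each name maps to the entities it might mean.
KNOWLEDGE_BASE = {
    "Seattle": [
        {"id": "seattle_city", "type": "city"},
        {"id": "chief_seattle_person", "type": "person"},
    ],
}

# Hypothetical rules: which entity types can be the subject of each verb.
VERB_SUBJECT_TYPES = {
    "spoke": {"person"},  # people speak and cities don't
}


def disambiguate(name, verb):
    """Return the IDs of candidate entities that can plausibly perform the verb."""
    candidates = KNOWLEDGE_BASE.get(name, [])
    allowed_types = VERB_SUBJECT_TYPES.get(verb)
    if allowed_types is None:
        # No rule for this verb, so every candidate stays in play.
        return [c["id"] for c in candidates]
    return [c["id"] for c in candidates if c["type"] in allowed_types]


if __name__ == "__main__":
    print(disambiguate("Seattle", "spoke"))
```

With the rule in place, “Seattle” followed by “spoke” resolves to the person rather than the city; a verb with no rule leaves both candidates untouched.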
Evri: Filter by Connections, Categories, and Activities
Evri shows the connections between what you asked about and other concepts, as well as quick details and search results. The search results include images and video, and you can filter them in three ways: by connections to other entities (stories about Microsoft that also involve Apple, for example); by the category the page falls into (stories about Windows 7 that concentrate on a key figure like Steve Ballmer or Steven Sinofsky, or iPhone stories about specific software companies); or by the activity involved (athletes competing, musicians releasing albums, or Steve Ballmer disparaging). For other kinds of entities, activities can be similar to categories and connections. Categories can also overlap, but the filters are still a useful way of discovering what kinds of stories are in the search results.
Evri: Toolbar Results
The Evri toolbar lets you search through Evri without visiting the site, but with both methods you're limited to entities that Evri already knows about, so you can search for iPhone but not iPhone apps, or social network but not mass customization. The toolbar can also highlight all the entities on the page that Evri knows about. Hover over a highlighted word to get a concise version of the Evri page, with a definition, top search results, connections, images, and videos (which you can view inside the popup). The search results are skewed more towards news than a Google search for the same terms is; the Evri page for Windows 7, for instance, doesn’t have any links to pages on Microsoft.com.
Xmarks: Mining Bookmarks
A search engine doesn’t need to rely on powerful algorithms if it has enough people classifying sites for it. If you bother to bookmark a Web page in your browser, it means you think it's worth coming back to. When you store your bookmarks in Xmarks (previously Foxmarks) to synchronize them between different machines, the service takes note and scores popular sites based on how many people bookmark them. With 600 million bookmarks, that’s enough information to give a reasonably accurate rating and to do some more analysis.
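Scoring sites by how many people bookmark them can be sketched in a few lines. This is only an illustration under stated assumptions: the `popularity_scores` function and the sample data are hypothetical, and Xmarks has not published how it actually computes its ratings.

```python
from collections import Counter


def popularity_scores(bookmarks_by_user):
    """Count how many distinct users bookmarked each URL.

    bookmarks_by_user maps a user name to the list of URLs that user saved.
    """
    counts = Counter()
    for urls in bookmarks_by_user.values():
        # A set ensures a user who saved the same URL twice is counted once.
        counts.update(set(urls))
    return counts


if __name__ == "__main__":
    sample = {
        "alice": ["mint.com", "wesabe.com", "mint.com"],
        "bob": ["mint.com"],
        "carol": ["wesabe.com", "facebook.com"],
    }
    for url, score in popularity_scores(sample).most_common():
        print(url, score)
```

Counting distinct users rather than raw bookmarks keeps one enthusiastic user from inflating a site’s score.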
Xmarks: Relying on Reputation
Type in a URL on the Xmarks site (there’s a preview here) or click the link that the Xmarks plugin puts in your location bar to see how frequently the page you are visiting is bookmarked, how highly it’s rated, and what sites Xmarks thinks are similar. That’s based on what other sites people bookmark in the same folder. A lot of people put bookmarks for Mint.com and Wesabe.com in a folder called Finance, so they show up as related sites and Xmarks tags them with the finance category. You can click through to find sites in a specific category once you've done a search, but you can’t simply browse by category. The Xmarks plugin (which is available for Firefox now and for IE and Safari in the future) also adds icons to Google searches to mark the three most popular results. Click the icon to get the same ratings and similar site suggestions. Because they’re based on bookmarks, the results in Xmarks tend to go to canonical sites rather than news stories. Everything in the social networking category is a link to a site like Facebook or Bebo, and similarly, sites that get highlighted in Google tend to be official and generic sites. This makes Xmarks as much a reputation service as a search tool.
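The same-folder idea described above, where sites that share a bookmark folder are treated as related and the folder name becomes a category tag, can be sketched as a simple co-occurrence count. The `analyze_folders` function and the sample folders are hypothetical; this is an illustration of the general technique, not Xmarks’ real pipeline.

```python
from collections import Counter, defaultdict
from itertools import permutations


def analyze_folders(folders):
    """Derive related sites and category tags from bookmark folders.

    folders is an iterable of (folder_name, list_of_urls) pairs,
    one pair per folder in some user's bookmarks.
    """
    related = defaultdict(Counter)     # url -> Counter of co-bookmarked urls
    categories = defaultdict(Counter)  # url -> Counter of folder names
    for name, urls in folders:
        unique_urls = set(urls)
        for url in unique_urls:
            # Normalize case so "Finance" and "finance" count as one tag.
            categories[url][name.lower()] += 1
        for a, b in permutations(unique_urls, 2):
            related[a][b] += 1
    return related, categories


if __name__ == "__main__":
    sample = [
        ("Finance", ["mint.com", "wesabe.com"]),
        ("finance", ["mint.com", "wesabe.com", "bank.example"]),
        ("Social", ["facebook.com", "bebo.com"]),
    ]
    related, categories = analyze_folders(sample)
    print(related["mint.com"].most_common(1))
    print(categories["mint.com"].most_common(1))
```

The more folders two sites share, the higher their co-occurrence count, and the most common folder name for a site serves as its category.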
Primal Fusion: Thought Networking
Primal Fusion is a technology rather than a product–the site isn’t even in beta yet, and although you can sign up for an account, you may have to wait to get access. Here’s how it works: You start by typing in keywords that the system uses to search Wikipedia (by default, because it’s a good source of concepts and related ideas). Rather than results, this gives you a “tag cloud” of ideas and concepts. Pick the ones that fit your interests and add them to your “remembered thoughts.” Switch to other sources like Flickr and Yahoo (there are only a handful of sources at the moment) to get results matching these concepts. The ones you choose to remember add new tags to your thought network and you can keep searching and refining the classifications. Stream of consciousness, anyone?
Primal Fusion: Moving Away From The Results Page
Instead of page titles, Primal Fusion shows you the tags and concepts that are associated with your keywords, which you can use to refine or expand your search. Strangely, you can’t really look at the results of your search and categorization from inside Primal Fusion. Instead, you click a button to create a Website with all the info (although this feature is a little unstable). In the future you’ll be able to create a document or an RSS feed instead. Semantic analysis of documents isn’t just complex; it also creates an index much larger than the original document. That just wouldn’t work for the Web, so what Primal Fusion does is create a very compact semantic representation of what you’re thinking about, in real time. It then compares the semantic representation to the concepts it finds, dividing them into content that correlates with what you’re thinking about already and new topics. This is a very sophisticated system, but searches can be slow, and with only a few sources and a basic interface, it’s very much a work in progress.
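As a deliberately simplified picture of that last step, the sketch below treats the “semantic representation” as nothing more than a set of remembered tags and splits newly found concepts into ones that match it and ones that are new. Primal Fusion’s real representation is certainly richer; the `partition_concepts` function and the sample tags are assumptions made purely for illustration.

```python
def partition_concepts(remembered_thoughts, found_concepts):
    """Split found concepts into those already in the thought network and new ones."""
    known = {thought.lower() for thought in remembered_thoughts}
    correlated, new_topics = [], []
    for concept in found_concepts:
        if concept.lower() in known:
            correlated.append(concept)
        else:
            new_topics.append(concept)
    return correlated, new_topics


if __name__ == "__main__":
    remembered = ["digital camera", "lenses"]
    found = ["Lenses", "tripods", "digital camera", "photo printing"]
    print(partition_concepts(remembered, found))
```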
Ensembli: Just the RSS You Care About
With many thousands of RSS feeds available on the Web, you're never going to be able to keep up with everything, nor would you want to: not all the stories on your favorite feed will always interest you. Ensembli is an online feed reader, with a twist–instead of picking the feeds you want, you tell it the topics you're interested in. It searches across all the feeds it covers (currently only 1,200 but the company is adding more and you can request specific feeds) for that topic, but rather than just returning the most recent stories, it tries to show you the most relevant. At first, it does that by collaborative filtering and using agents that predict what you’ll be interested in based on other, similar users. But as you read, ignore, mark as favorites, and delete different stories, Ensembli creates a personal profile for you and uses that as the basis for what stories it sends you. The info it collects includes whether you favor or dislike certain sites, whether you look at shorter or longer pieces, whether you always look at newer stories and delete old ones to clear the page out, and which particular topics you look at in your results. It is all about your behavior.
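A behavior-based profile of this kind can be sketched as a set of running weights that each action nudges up or down. Everything here is hypothetical: the `Profile` class, the action weights, and the story fields are assumptions chosen for illustration, and Ensembli’s real model also considers signals such as story length and age that this sketch leaves out.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each action signals interest.
ACTION_WEIGHTS = {"favorite": 2.0, "read": 1.0, "ignore": -0.5, "delete": -1.0}


class Profile:
    """Tracks per-site and per-topic preferences learned from user actions."""

    def __init__(self):
        self.site_weights = defaultdict(float)
        self.topic_weights = defaultdict(float)

    def record(self, story, action):
        """Update the profile after the user acts on a story."""
        weight = ACTION_WEIGHTS[action]
        self.site_weights[story["site"]] += weight
        for topic in story["topics"]:
            self.topic_weights[topic] += weight

    def score(self, story):
        """Estimate relevance as the sum of the site and topic weights."""
        site_score = self.site_weights.get(story["site"], 0.0)
        topic_score = sum(self.topic_weights.get(t, 0.0) for t in story["topics"])
        return site_score + topic_score

    def rank(self, stories):
        """Return stories ordered from most to least relevant."""
        return sorted(stories, key=self.score, reverse=True)


if __name__ == "__main__":
    profile = Profile()
    profile.record({"site": "a.example", "topics": ["cameras"]}, "favorite")
    profile.record({"site": "b.example", "topics": ["accessories"]}, "delete")
    incoming = [
        {"site": "b.example", "topics": ["accessories"]},
        {"site": "a.example", "topics": ["cameras"]},
    ]
    for story in profile.rank(incoming):
        print(story["site"], profile.score(story))
```

A new user starts with an empty profile and every story scores zero, which is where the collaborative filtering described above would have to fill the gap.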
Ensembli: No Pain, No Gain
With the limited number of feeds and short archive (two weeks so far), Ensembli doesn’t yet give comprehensive results and you won't see particularly personalized results until you've spent some time reading and rating stories. But getting the most relevant stories is worth a little effort. Ensembli quickly learns to stop presenting feeds you dislike, and once you have marked a few stories on the same topic, you will start to see more of those stories.
The Semantic Web?
All four of these tools try to make sense of the mass of information on the Web. Xmarks tells you what matters to other people, as long as you know a Website to start with. At the moment, that tends to be top-level pages, so perhaps Xmarks is most useful for checking which big-name sites are actually worth looking at. You could also use Xmarks to check which sites are legitimate.
If you want to track down information about a specific person or company, Evri is a quick way to do it. Although it’s the most advanced of the four tools, it still doesn’t have quite everything you want to know about and it doesn’t handle more abstract concepts.
But Primal Fusion is all about abstract concepts, and when the service eventually becomes more robust and handles more sources, it will be a very interesting way to explore ideas.
Ensembli has to learn about you, so you need to use it as a way of following topics in which you have an ongoing interest. Once it knows you, a new search will be more relevant to start with–as long as you think about it in the same way you think about those existing interests. It is an investment.
These brand-new tools are helping us towards the holy grail of the semantic Web, if you think of the semantic Web as anything that helps you find better results (the original definition was about having Websites make sense to computers so they can work more efficiently on your behalf). But all these tools need to develop much more before they become truly useful.