I read with some interest a fellow blogger's post on desktop search. Simply finding the information we know exists on our personal computers, home machines and servers is probably one of the biggest productivity challenges today.
In his post, Bert talks about desktop search tools from Google and Yahoo, and another commenter recommends X1. I've used each of these tools and found them to be relatively good at helping me find files on my machines. The problem is that my files are spread across several databases and machines. For example, I often save a document to a thumb drive at work so I can take it home or on the road to work on. Unless I happen to remember where that file is, it is hard or impossible to find using desktop search.
The real problem, at least in my mind, is that the metaphor for saving a document is a file. The trouble is that the filing cabinets the metaphor is based on can easily be opened and searched manually. One can eventually open every drawer and every folder in a filing cabinet, and there's a physical limit to how much stuff can go into one. A virtual file cabinet has no such limit, so the best way to search it is to be exceptionally organized with file names and folder names, and to use a brute force approach, which is what these desktop tools do. They periodically index all the files so that finding the ones that exist on the machine or network is faster.
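To make the indexing idea concrete, here is a minimal sketch of the kind of word index such a tool might build. It is not how Google, Yahoo or X1 actually work; the function names and the sample file paths are made up for illustration, and the "files" are just strings held in memory.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of paths whose text contains it."""
    index = defaultdict(set)
    for path, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample data standing in for files on two different drives.
documents = {
    "C:/work/budget_2006.txt": "Quarterly budget forecast for the sales team",
    "E:/thumb/budget_draft.txt": "Draft budget notes to finish at home",
    "C:/home/recipes.txt": "Apple pie recipe",
}
index = build_index(documents)
matches = search(index, "budget draft")
```

The index only knows about files it was able to read when it was built, which is exactly why a document sitting on an unplugged thumb drive drops out of the results.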
We have another metaphor that we could use at some level - the del.icio.us tagging model. Right now our Microsoft-centric office tools don't have a lot of tagging capability, but once the desktop applications catch up with what's interesting and helpful on the web, we'll be able to "tag" our files with keywords and then find and retrieve them using those tags. Then the problem will be a growing list of tags and keywords to manage.
Eventually what's needed is a corporate taxonomy of tags and keywords to ensure that we aren't creating new tags and keywords with every new document. Some structure is needed since we are creating so much new information all the time. Combining an information structure or hierarchy with the capability for tagging and using indexing search engines will provide a lot more search and find capability.
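As a rough sketch of what tagging against a shared taxonomy could look like, the snippet below only accepts tags from an approved list and folds known synonyms into them. The vocabulary, the synonym table and the TagStore class are all hypothetical names invented for this example, not features of any existing office tool.

```python
# Hypothetical approved vocabulary; a real taxonomy would be maintained centrally.
TAXONOMY = {"finance", "budget", "marketing", "legal"}

# Hypothetical synonyms that map onto approved tags.
SYNONYMS = {"budgets": "budget", "financial": "finance"}


def normalize_tag(raw):
    """Return the approved form of a tag, or raise if it is not in the taxonomy."""
    tag = raw.strip().lower()
    tag = SYNONYMS.get(tag, tag)
    if tag not in TAXONOMY:
        raise ValueError(f"'{raw}' is not in the approved taxonomy")
    return tag


class TagStore:
    """Keeps a mapping from approved tags to the files that carry them."""

    def __init__(self):
        self.files_by_tag = {}

    def tag(self, path, *raw_tags):
        # Validate every tag first so a bad tag never leaves a partial update.
        tags = [normalize_tag(raw) for raw in raw_tags]
        for tag in tags:
            self.files_by_tag.setdefault(tag, set()).add(path)

    def find(self, *raw_tags):
        """Return the files that carry all of the given tags."""
        sets = [self.files_by_tag.get(normalize_tag(raw), set()) for raw in raw_tags]
        return set.intersection(*sets) if sets else set()


store = TagStore()
store.tag("q2_forecast.xls", "Budgets", "finance")
store.tag("campaign_plan.doc", "marketing", "budget")
```

Rejecting unknown tags outright is the strictest possible policy. A friendlier variant might queue new tags for review instead, but the point is the same: the list of tags stays small enough to manage.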
Then we'll need to know not just what data was saved and in which format, but where the data was created and where it was saved. As thumb drives, read/write CD burners and other data transport media and storage locations become available, we don't want to search just our desktops, but any data store. So the hierarchy needs to include not just the tags and data sources, but also the time, date and store where the data was created and recorded.
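One way to picture this is a small catalog in which every saved copy of a document carries its store and timestamp alongside its tags. This is only a sketch under my own assumptions: the FileRecord fields, the store names and the find_records function are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileRecord:
    name: str
    store: str           # where the copy lives, e.g. "work-pc" or "thumb-drive"
    saved_at: datetime   # when this copy was saved
    tags: frozenset      # keywords attached to the document


def find_records(catalog, tag=None, store=None, saved_after=None):
    """Return matching records from any store, newest first."""
    matches = []
    for record in catalog:
        if tag is not None and tag not in record.tags:
            continue
        if store is not None and record.store != store:
            continue
        if saved_after is not None and record.saved_at <= saved_after:
            continue
        matches.append(record)
    return sorted(matches, key=lambda r: r.saved_at, reverse=True)


# Hypothetical catalog covering two data stores.
catalog = [
    FileRecord("proposal.doc", "work-pc", datetime(2006, 5, 20, 17, 0), frozenset({"proposal"})),
    FileRecord("proposal.doc", "thumb-drive", datetime(2006, 5, 22, 21, 30), frozenset({"proposal"})),
    FileRecord("budget.xls", "work-pc", datetime(2006, 5, 21, 9, 15), frozenset({"budget"})),
]
latest_proposal = find_records(catalog, tag="proposal")[0]
```

With the store recorded, the search can tell me that the newest copy of the proposal is the one on the thumb drive, even when that drive is not plugged in.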
We don't have this capability yet with desktop tools, but we can create these mechanisms for data stored in our corporate databases now. Once again the web and open source techniques will lead the way to greater usability in our existing software tools and business processes.



I made a comment on the original, basically saying that keywords and computer-friendly file names are needed only because we don't support an interconnected personal information space...
Posted by: Matthew Cornell | May 23, 2006 at 04:16 PM