Warning: This web at toLearn.net/marketing/ is two years old, it's unattended, and the links are rotting. However, in June 2000, the server recorded over 10,000 page requests during more than 3,000 visitor sessions from dozens of countries. Thus, I'm reluctant to take it down completely.




Search Engines

As of Spring 1998, six or seven search engines dominate the lists in numbers of search requests they process. They have bots (from robots) to automatically, systematically, and frequently scour the Web. The pages they encounter are stored in a database and indexed. An article in the April 3, 1998, issue of Science estimates that the searchable Web has at least 320 million pages and that none of the top six covers much more than a third of them. Northern Light is the newcomer, and it charges for any in-depth searches; many surveys list Webcrawler instead among the top engines.

To learn everything you need to know about search engines, you should read all the material at Search Engine Watch.

A recent survey by Relevant Knowledge confirms the general findings of the Science article and raises these questions.

The Questions and The Answers

Q: Which 34% does Hotbot cover?
A: Who knows?

Q: Will running two different Hotbot searches add up to 68%?
A: No, considerably less.

Q: Will running two different searches on Excite add up to more than 14%?
A: Yes, depending on the bots' frequency.

Q: Will running the same search on Hotbot and Excite add up to more than 34%?
A: Yes, but not much.
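
If you like to see the arithmetic, here is a rough back-of-the-envelope sketch in Python. It rests on an assumption of mine, not on anything the Science article says: that each engine's coverage is an independent random slice of the Web. Real engines tend to find the same popular, well-linked pages, so their overlap is larger and the true combined figure is lower than this optimistic estimate. The function name combined_coverage is made up for the illustration.

```python
def combined_coverage(*fractions):
    """Fraction of the Web covered by at least one engine.

    ASSUMPTION: each engine's coverage is statistically independent
    of the others. Real engines overlap more, so treat the result as
    an optimistic upper estimate.
    """
    missed = 1.0
    for fraction in fractions:
        # Chance that a page is missed by this engine as well.
        missed *= 1.0 - fraction
    return 1.0 - missed


hotbot, excite = 0.34, 0.14  # coverage figures quoted above
print(f"{combined_coverage(hotbot, excite):.1%}")  # prints 43.2%
```

Even under this generous assumption, adding Excite to Hotbot gains you well under the 48% you would get by simply adding the two numbers.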

We do have a pretty good idea of some of what's not covered: pages that do not have links to them from the home page of a domain name or from any page linked to the home page. They are sitting on a server but are available only if you type (or bookmark) the whole URL. If you don't quite understand what I mean by that, rest assured the search engines are great if they give you what you want. If they don't, they're only the beginning of your search.
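
For those who want to see why an unlinked page goes unfound, here is a minimal sketch of a bot following links from a home page. The site, the page names, and the crawl function are all hypothetical; a real bot fetches pages over the network and is far more elaborate. The point is only that a page nobody links to never enters the queue.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE = {
    "/index.htm": ["/courses.htm", "/about.htm"],
    "/courses.htm": ["/mba604.htm"],
    "/about.htm": [],
    "/mba604.htm": ["/index.htm"],
    "/orphan.htm": ["/index.htm"],  # links out, but nothing links TO it
}


def crawl(site, start):
    """Return the set of pages reachable by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


found = crawl(SITE, "/index.htm")
missed = sorted(set(SITE) - found)
print(missed)  # ['/orphan.htm']
```

The orphan page sits on the server the whole time. You can reach it only by typing or bookmarking its full URL.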

If you've been doing this Internet stuff for a while, it probably won't surprise you to learn that common knowledge is often wrong. For example, Yahoo is not a search engine. It's a subject directory compiled by humans that covers maybe 1% of the possible sites. You can browse through it, but you can't search it. You can, however, launch a search while you're at Yahoo.

Yahoo is not a search engine

It probably also won't surprise you to learn that some "search engines" are in fact meta-engines. Most meta-engines simply pass your search terms along to one or more of the top six. MetaCrawler, the one I use most, will collate the results, eliminate the duplicates, and present the rest to you. For many reasons, not least their sense of humor, Dogpile is my favorite when I need to dig a little deeper. Mother Load has a special search for corporate marketing sites. The whole process happens faster than you could slide open two drawers of a library's card catalog.
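
The collate-and-eliminate-duplicates step can be sketched in a few lines of Python. This is my own illustration, not MetaCrawler's actual method: the engine names, the URLs, and the collate function are hypothetical, and treating two URLs as the same page when they differ only by a trailing slash is a crude assumption.

```python
def collate(results_by_engine):
    """Merge per-engine result lists, dropping duplicate URLs.

    results_by_engine maps an engine name to its ranked list of URLs.
    The lists are interleaved rank by rank (every engine's first hit,
    then every engine's second hit, and so on), and a URL is kept only
    the first time it appears.
    """
    merged, seen = [], set()
    longest = max((len(urls) for urls in results_by_engine.values()), default=0)
    for rank in range(longest):
        for urls in results_by_engine.values():
            if rank < len(urls):
                key = urls[rank].rstrip("/")  # crude normalization (assumption)
                if key not in seen:
                    seen.add(key)
                    merged.append(urls[rank])
    return merged


results = {
    "engine_a": ["http://example.com/forensic/", "http://example.org/police"],
    "engine_b": ["http://example.org/police", "http://example.com/forensic",
                 "http://example.net/labs"],
}
for url in collate(results):
    print(url)
```

Five hits go in and three come out, because the two engines agree on two of the pages.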


MetaSites

The search engine sites basically take the keywords you submit and compare them to the index of the database of pages their bots have brought back. The actual engine, the bot, has already done its work and is out there doing it again for the next update. The database is stored on many large hard drives, so it can take a couple of moments to collect the results, rank them, and display them, often with banner ads relevant to the topic. That is, if you search for Cancun, you're likely to see a banner ad for an airline or Travelocity.
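
Here is a toy version of that compare-keywords-to-the-index step, written in Python. It is a sketch under my own assumptions: the three pages are invented, the names build_index and search are hypothetical, and a real engine's index is vastly larger and also handles ranking, which this leaves out.

```python
from collections import defaultdict

# Hypothetical pages the bot has already brought back.
PAGES = {
    "cancun.htm": "cheap flights to Cancun and travel tips",
    "police.htm": "how police solve crimes with forensic science",
    "travel.htm": "travel agents and airline fares",
}


def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


index = build_index(PAGES)
print(sorted(search(index, "travel")))         # ['cancun.htm', 'travel.htm']
print(sorted(search(index, "Cancun travel")))  # ['cancun.htm']
```

Notice that the search never touches the pages themselves. The slow work of reading them was done ahead of time, when the index was built.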

How many matches did you get?

If you got too many or too few, you may want to re-search by adding or omitting keywords. Go to one of the top six engines and try three searches:

red: very general; note the number of matches
travel: much more specific, but note the number of matches
desalinization: specialized, yet note the number of matches

Conclusion: Information Overload. These search engine sites' databases are so huge that one-word searches are not very helpful. Try two or three words at the same time. Note the differences in search conventions: at one site AND links two words; at another site, only + will do that job.
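
To make the difference in conventions concrete, here is a small Python sketch that writes the same two-word search both ways. The function format_query and the convention labels "AND" and "plus" are names I made up; check each site's help page for the syntax it actually expects.

```python
def format_query(words, convention):
    """Join search words so that every word is required.

    convention is "AND" for sites that link words with AND,
    or "plus" for sites that want a + in front of each word.
    """
    if convention == "AND":
        return " AND ".join(words)
    if convention == "plus":
        return " ".join("+" + word for word in words)
    raise ValueError(f"unknown convention: {convention!r}")


print(format_query(["cancun", "travel"], "AND"))   # cancun AND travel
print(format_query(["cancun", "travel"], "plus"))  # +cancun +travel
```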

How are the matches ranked?

Each engine does it a little differently. Knowing how yours does it will help you choose your search terms. If exactly what you're looking for is displayed on the first page, congratulations.

Beyond the search engines

The rest of the time, you're going to have to develop more strategies. Many people at this point turn to specialized searches. Some of them charge a fee. Lexis-Nexis, in Dayton, Ohio, has a million and a half subscribers. Half of them use the service each month, some of them extensively. What's there? Almost 1.5 billion documents. The visual tour of their facility is most enlightening. Their database is larger than the web itself, and it's available over the Internet via the telnet protocol (telnet://) rather than the hypertext protocol (http://), so it's not on the web. You will not always end up with nicely formatted and illustrated .htm pages. But you will end up with a lot of information.

As the Dialog Corporation's home page says, "Quantity of data, by itself, is of little consequence. The challenge is to help people isolate data of real value from an exponentially rising tide of information."

Yes, they're expensive, but they may be worth the money in time saved. Some of this information is copyright-protected and available only from these sources.

The longest list of specialized search engines that I know of is Beaucoup, where you'll find over a thousand in dozens of categories.

Let's say that you're designing a web for the Kenmore Police. Your audience is Joe Smith, a Kenmore citizen who had a not-so-wonderful contact with an officer and who gets on the Web to learn more about the department. One of your objectives might be to give Joe the resources he needs to overcome his TV-induced mirage about police officers and to learn how crimes are really solved.

To tailor that information for Joe, you would scour the Web for the best resources, link to them, and then provide some text explaining where the links go and what you want Joe to get out of them, that is, the reason for his going there.

You might go to a search engine and use the term "forensic," a technical term you picked up and which Joe Smith might not know. The results from the search engine won't end your search. They only begin it.

You would soon find your way to Zeno's Forensic Page. You would find your way there because many other sites link to it and it claims to be the web's best resource. Then you can continue your search because Zeno has done for you what you are trying to do for Joe Smith. You may well visit the Forensic Science Society and their Forensic WebLinks Search, which might get you to the Roanoke County, Virginia, Police Department or the American Society of Questioned Document Experts, which would be way down any list of search engine responses to the term forensic. But it might have exactly what you're looking for. How do you know?

While you're searching, don't forget to consider whether you can use the images or any other part of the pages you visit (design, navigational devices, etc.) as a model for your site.

You must keep asking yourself, What would help Joe Smith understand how police really solve crimes?

I'm trying to take you beyond the search engines to what are often called metasites. They have little content themselves other than links to other sites -- which themselves may be lists of links. As the Web grows, they are part of its maturation.

For example, I'll bet all of you can find something of interest at Voice of the Shuttle. As the work of one person, Alan Liu, VoS is highly selective, it has several dead links, and it has especially useful annotations. A more impersonal but much larger metasite is the World Wide Web Virtual Library. Note its marketing links page. (Your marketing links page, along with a sentence or two description, counts as extra credit for MBA 604.)

Some of the links at these metasites may well show up on a search engine's results page. But these sites themselves wouldn't show up as the result of that same search. Think backwards: what would you search for to get Voice of the Shuttle?

Part of your value as a professional researcher will be your carefully tended, up-to-date list of links to the topics you specialize in. You'll do for your professional topics what Zeno has done for forensic science students and what this web is doing for Medaille marketing students. Whether you share that information via a Web page is up to you.

Wise MBAs --  whatever their specialty or industry or job category or responsibility level -- will know how to search the Web effectively. What are they going to do, ask for the morning off to drive down to the Erie County Public Library?

Psst.... You at the card catalog. Your co-workers back at the office are firing up Netscape.


Tip

At a large site, look for a search option on the home page. This course web has a search option, too. It's not a search engine, the robot part. It's just the word-match part, so it's fast, it's accurate, and it searches the full text of every page.


Don't Read This ...

... if you're liable to get upset about the lack of privacy online.

Have you ever tried to go down to the courthouse to look at public records such as real estate transactions, court filings, etc.? If so, you realize two things:

there's an amazing amount of information there that many people would rather have kept private

it's not easy to get to that information

Yes, it's "public", but there may be a certain wisdom in having it behind high counters. If you're an officer of the court or a licensed investigator, you know what to ask for and how to ask for it. If not, you'll find it considerably harder to access the material. Once you get your hands on it, it's not user-friendly.

What if every record from every courthouse in the country were available online with the speed of a search engine?

Guess what. It is available: KnowX. Not all of it and not back very far and not up-to-the-minute current. It costs a little bit of money for each piece of info. You still have to know how to interpret it and what to do with it. But if I had a big enough budget to troll through those databases and then organize my results, I could probably find out things that lots of people wouldn't want others to know.

Where will it stop?

Does KnowX make you feel more secure or less secure?


Don't try this at home, kids

REVERSE EMAIL LOOKUP

Enter your email address or a friend's. Chances are good (though not assured) that your name or your friend’s will come up. Now, click on that name if it is a link. More information will surface, likely the name, complete mailing address, even the phone number. And of course, street map programs will pinpoint the house, given the street address.

Enter that email address in a good search engine and you will likely see what activity that person has been involved in on the ‘Net (newsgroups and discussion groups posted to, websites constructed, and so on).

And you wonder how someone got your name just from an Email address? I don’t wonder, and neither should you!


FAQ

Frequently asked questions. For starters, how do you pronounce FAQ?

ef-a-que? Try using that in a sentence. Try making it plural. Okay, that one's no good.

fax? Immediate confusion with the machine for transmitting documents.

fak? Most people use this one: Look for the fak. But is the plural the same, as in deer: one deer, two deer; one fak, two fak? Two faxes? What about spelling the plural? How would you use the thing as an adjective?

Meanwhile, FAQes (?) are all over the Web. If you use a search engine, you aren't going to find most of them. However, the USENET newsgroups were the place to be in the ten years between the mid-80's, when the Internet's protocols started getting standardized, and the mid-90's, when the Web brought pictures to the Internet and threaded discussions started popping up on every site from A to Z (including this course web).

As the newsgroups grew in popularity, newcomers would end up asking the same questions over and over. So the oldtimers started making lists of frequently asked questions and posting them frequently to the newsgroup. Newcomers were encouraged to "read the FAQ" before asking a question.

Many of these FAQ are regularly maintained and updated in text form even though many have migrated to Web sites. However, the ones still in text form can make great summaries of the common knowledge about many topics. The best metasites I know are a site about FAQes, MIT's repository, and Ohio State's more selective one. You can also read the newsgroup news.answers, where most FAQes are periodically posted.

I find the FAQes as a whole to be strongest on topics related to computers, to hobbies such as dog breeding, and to well-established academic disciplines such as linguistics.

The accumulated texts of all the newsgroup postings may be huge. But they're all in one place and you can search them quickly at DejaNews. These searches are especially good for turning up experts who may well respond to a polite email enquiry.


coming: the future of search engines

graphic displays and new methods of organization

must-see site GRAPHICAL THESAURUS highly recommended

http://www.thinkmap.com


last update: September 24, 1998
http://toLearn.net/marketing/isearch.htm