Critical response to Precision among World Wide Web Search Services (Search Engines):

-----
6/24/97

1. Oral Defense Comments

The committee that heard the oral defense of the paper suggested possible approaches for a future study of search services. One could try to study the diversity of sites that a search service returns. For example: within the first twenty links returned, how many different active servers are retrieved? Or how many Web pages are returned that are not part of the same network of reference? Another variable one could study is the recency of pages: how long ago were the retrieved pages last updated? (That information may be difficult to find.) Another suggestion was to focus on different subject areas and study how much information is even out there, and what its quality is. Then, are some search services better than others at finding the information in this area?

------
7/22/97

Dear Sirs,

I just finished reading your paper "Precision among World Wide Web Search Services (Search Engines): Alta Vista, Excite, Hotbot, Infoseek, Lycos", and I think you may have some problems with your results with respect to the Alta Vista "a" system. It turns out that when the Alta Vista advanced system is used, the results are returned in no particular order. They are not scored, ranked, or sorted in any way!

I noticed that the Alta Vista "a" system did very poorly on queries 2, 3, 11, 12, and 13 for test 2 (as depicted in Table B2a) as well as for test 3. All of these queries were processed using the Alta Vista "a" search engine instead of the regular Alta Vista search engine. I think this was done because they were all Boolean queries, except for query 13, which was a simple 2-word phrase query. I couldn't understand how the results for those queries could drop off so dramatically from the test 1 numbers to the test 2 numbers, compared with the Excite search results for the same queries.
Then I tried the queries myself on the Alta Vista "a" search engine and found that the results are not ranked. This will surely distort things when you apply the test 2 and test 3 metrics to unranked, unsorted results: your first 20 documents are not the 20 best-matching documents. When I ran the same queries on the regular Alta Vista search site, I got a completely different set of results in the top 20 documents returned. Boolean queries can be done with the regular Alta Vista search engine (and the results are ranked and sorted).

I don't know if you have already discovered these facts, but I believe them to be true, and thought I would share them with you.

--
Rick Hemphill
Naval Command, Control and Ocean Surveillance Center
San Diego, CA 92152
hemphill@nosc.mil

Reply:

Rick's criticism is valid. Therefore, the conclusion should be: if you use Alta Vista in the advanced mode, make sure to use the ranking field. Also, although I have not systematically tested whether this is true, in the future I will lean toward proximity operators rather than Boolean operators. So the poor ranking of Alta Vista in the study may be because we did not use proximity operators and ranking operators.

Vernon Leighton
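
Illustration of the ranking issue discussed in the exchange above: the following is a minimal Python sketch, not the paper's test 2 or test 3 metrics, showing why a first-20 measure can drop sharply when the same documents come back in no particular order. Everything in it is an assumption made for illustration: the function name `precision_at_k`, the synthetic collection of 100 document IDs with 15 marked relevant, and the fixed permutation standing in for "unranked" output. It uses plain unweighted precision over the first 20 results.

```python
def precision_at_k(results, relevant, k=20):
    """Return the fraction of the first k results that are relevant."""
    top = results[:k]
    if not top:
        return 0.0
    hits = sum(1 for doc in top if doc in relevant)
    return hits / len(top)


# Synthetic collection (made up for illustration): 100 document IDs,
# of which the 15 lowest-numbered are treated as relevant.
docs = list(range(100))
relevant = set(range(15))

# Idealized ranked output: relevant documents first.
# sorted() is stable, so ID order is kept within each group.
ranked = sorted(docs, key=lambda d: d not in relevant)

# The same documents in an arbitrary, relevance-blind order
# (a fixed permutation, so the sketch is reproducible).
arbitrary = sorted(docs, key=lambda d: (d * 37) % 100)

print("ranked   :", precision_at_k(ranked, relevant))
print("arbitrary:", precision_at_k(arbitrary, relevant))
```

With these made-up inputs the ranked list scores 0.75 and the arbitrary ordering 0.2, even though both lists contain exactly the same documents. Only the order differs, which is the effect the letter describes.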