| United States Patent | 7,233,943 |
| Modha , et al. | June 19, 2007 |
A method of searching a database of documents, wherein the method includes performing a search of the database using a query to produce query result documents; constructing a word dictionary of words within the query result documents; constructing an out-link dictionary of documents within the database that are pointed to by the query result documents; adding the query result documents to the out-link dictionary; constructing an in-link dictionary of documents within the database that point to the query result documents; and adding the query result documents to the in-link dictionary.
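The dictionary-construction steps recited above can be illustrated with a minimal sketch. The function name `build_dictionaries`, the representation of the database as a mapping from document id to text plus a list of `(source, target)` hyperlink pairs, and the simple lowercase tokenizer are all assumptions made for illustration; the abstract does not specify any of them.

```python
import re

def build_dictionaries(result_ids, text_of, links):
    """Build word, out-link, and in-link dictionaries for a query result set.

    result_ids: ids of the query result documents (hypothetical layout)
    text_of:    mapping of document id -> document text
    links:      (source, target) hyperlink pairs over the whole database
    """
    results = set(result_ids)

    # Word dictionary: distinct words found in the query result documents.
    # The tokenizer is an assumption; the abstract does not define one.
    words = set()
    for doc in result_ids:
        words.update(re.findall(r"[a-z0-9]+", text_of[doc].lower()))

    # Out-link dictionary: documents pointed to by the result documents,
    # with the result documents themselves then added.
    out_dict = {dst for src, dst in links if src in results}
    out_dict |= results

    # In-link dictionary: documents pointing to the result documents,
    # with the result documents themselves then added.
    in_dict = {src for src, dst in links if dst in results}
    in_dict |= results

    return sorted(words), sorted(out_dict), sorted(in_dict)


# Toy database: "a" and "b" are the query results.
text_of = {"a": "Web search clustering", "b": "search engines"}
links = [("a", "x"), ("b", "a"), ("y", "a"), ("z", "q")]
words, out_dict, in_dict = build_dictionaries(["a", "b"], text_of, links)
```

In this toy case the link `("z", "q")` contributes to neither link dictionary, because neither endpoint is a query result document.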
| Inventors: | Modha; Dharmendra Shantilal (San Jose, CA), Spangler; William Scott (San Martin, CA) |
| Assignee: | International Business Machines Corporation (Armonk, NY) |
| Appl. No.: | 10/660,242 |
| Filed: | September 11, 2003 |
| Application Number | Filing Date | Patent Number | Issue Date |
| 09690854 | Oct., 2000 | 6684205 | |
| Current U.S. Class: | 1/1 ; 707/999.003; 707/999.01; 707/E17.108; 715/234 |
| Current International Class: | G06F 17/30 (20060101); G06F 7/00 (20060101) |
| Field of Search: | 707/2,3,5,6,10,104.1,4 715/513,501.1,532 |
| 5787420 | July 1998 | Tukey et al. |
| 5787421 | July 1998 | Nomiyama |
| 5819258 | October 1998 | Vaithyanathan et al. |
| 5835905 | November 1998 | Pirolli et al. |
| 5857179 | January 1999 | Vaithyanathan et al. |
| 5864845 | January 1999 | Voorhees et al. |
| 5895470 | April 1999 | Pirolli et al. |
| 5920859 | July 1999 | Li |
| 6012058 | January 2000 | Fayyad et al. |
| 6038574 | March 2000 | Pitkow et al. |
| 6115708 | September 2000 | Fayyad et al. |
| 6122647 | September 2000 | Horowitz et al. |
| 6256648 | July 2001 | Hill et al. |
| 6298174 | October 2001 | Lantrip et al. |
| 6363379 | March 2002 | Jacobson et al. |
| 6389436 | May 2002 | Chakrabarti et al. |
| 6460036 | October 2002 | Herz |
| 6556983 | April 2003 | Altschuler et al. |
| 6684205 | January 2004 | Modha et al. |
| 6862586 | March 2005 | Kreulen et al. |
Kuo et al., Web Document Classification based on Hyperlinks and Document Semantics, Aug. 2000, PRICAI 2000 Workshop on Text and Web Mining, pp. 44-51. (cited by examiner)
Pirolli et al., Silk from a Sow's Ear: Extracting Usable Structures from the Web, 1996, CHI, pp. 1-9. (cited by examiner)
Terveen et al., Constructing, Organizing, and Visualizing Collections of Topically Related Web Resources, 1999, AT&T, pp. 67-94. (cited by examiner)
Chakrabarti et al., Enhanced hypertext categorization using hyperlinks, 1998, ACM, pp. 307-318. (cited by examiner)
Modha et al., Clustering Hypertext with Applications to Web Searching, 2000, ACM, pp. 143-152. (cited by examiner)
Gurrin et al., A Connectivity Analysis Approach to Increasing Precision in Retrieval from Hyperlinked Documents, pp. 1-10. (cited by examiner)
Neville et al., Clustering Relational Data Using Attribute and Link Information, pp. 1-6. (cited by examiner)
Sougata Mukherjea, Organizing Topic-Specific Web Information, pp. 133-141. (cited by examiner)
Chen, "Structuring and Visualizing the WWW by Generalised Similarity Analysis", In Proceedings of Hypertext, 1997, pp. 177-186. (cited by other)
Foley et al., "Interactive Clustering for Navigating in Hypermedia Systems", ACM Press, 1994, pp. 136-145. (cited by other)
Modha et al., "Concept Decompositions for Large Sparse Text Data Using Clustering", 1999, pp. 1-32. (cited by other)
Silverstein et al., "Analysis of a Very Large Alta Vista Query Log", SRC Technical Note 26, 1998, pp. 1-17. (cited by other)
Chakrabarti, S., Dom, B., Indyk, P., "Enhanced Hypertext Categorization Using Hyperlinks", ACM SIGMOD 1998, Seattle, Washington, pp. 1-12. (cited by other)
Kleinberg, Jon M., "Authoritative Sources in a Hyperlinked Environment", Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1998, IBM Research Report RJ 10076, May 1997, pp. 1-33. (cited by other)
Lawrence, Steve and Giles, C. Lee, "Searching the World Wide Web", Science, vol. 280, Apr. 3, 1998, pp. 98-100. (cited by other)
Larson, Ray R., "Bibliometrics of the World Wide Web: An Exploratory Analysis of the Intellectual Structure of Cyberspace", Proceedings of the 1996 American Society for Information Science Annual Meeting, pp. 1-13. (cited by other)
Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S., Gibson, D., Kleinberg, J., "Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text", WWW7, 1998, pp. 1-14. (cited by other)
Bradley, P.S. and Fayyad, Usama M., "Refining Initial Points for K-Means Clustering", ICML, 1998, pp. 91-99. (cited by other)
Chakrabarti, S., Dom, B.E., Kumar, S.R., Raghavan, P., Rajagopalan, S., Tomkins, A., Kleinberg, J.M., and Gibson, D., "Hypersearching the Web", Scientific American, Jun. 1999, pp. 1-8. (cited by other)
Weiss, R., Velez, B., Sheldon, M.A., Namprempre, C., Szilagyi, P., Duda, A., Gifford, D.K., "HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering", ACM Hypertext, 1996, pp. 180-193. (cited by other)
Mukherjea, S., Foley, J.D., Hudson, S.E., "Interactive Clustering for Navigating in Hypermedia Systems", ACM Hypertext, Sep. 1994, pp. 136-145. (cited by other)
Chen, C., "Structuring and Visualising the Web by Generalised Similarity Analysis", ACM Hypertext, 1997. (cited by other)
Pirolli, P., Pitkow, J., Rao, R., "Silk from a Sow's Ear: Extracting Usable Structures from the Web", ACM, SIGCHI Human Factors Comput., 1996. (cited by other)
Chen, C., Czerwinski, M., "From Latent Semantics to Spatial Hypertext: An Integrated Approach", ACM Hypertext, 1998, pp. 77-86. (cited by other)
Botafogo, R.A., "Cluster Analysis for Hypertext Systems", ACM-SIGIR, Jun. 1993, pp. 116-125. (cited by other)
Rasmussen, E., "Clustering Algorithms", Information Retrieval: Data Structures and Processes, 1992, pp. 419-442. (cited by other)
Hartigan, "Clustering Algorithms", Wiley Publication, Chapter 4, 1975, pp. 84-107. (cited by other)
P. Willett, "Recent Trends in Hierarchic Document Clustering", Inform. Proc. & Management, 1988, pp. 577-597. (cited by other)
Frakes et al., "Information Retrieval: Data Structures & Algorithms", Clustering Algorithms, Chapter 16, pp. 419-442, 1992. (cited by other)
Chen et al., "From Latent Semantics to Spatial Hypertext: An Integrated Approach", In Proceedings of Hypertext, 1998, pp. 77-86. (cited by other)
Weiss et al., "HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering", In Proc. of Hypertext, 1996, pp. 180-193. (cited by other)