Chapter 8: Knowledge Sources for WSD
Eneko Agirre, Mark Stevenson
Abstract
This chapter explores the different sources of linguistic knowledge that can be employed by WSD systems. These knowledge sources are more abstract than the features used by WSD algorithms, which are encoded at the algorithmic level and normally extracted from a lexical resource or a corpus. The chapter begins by listing a comprehensive set of knowledge sources, with examples of their application, and then explains whether this linguistic knowledge may be found in corpora, lexical knowledge bases, or machine-readable dictionaries. An analysis of the knowledge sources used in actual WSD systems is then presented. The best results are often obtained by combining knowledge sources, and the chapter concludes by analyzing experiments on the effect of different knowledge sources and what they imply about the effectiveness of each.
Contents
8.1 Introduction
8.2 Knowledge sources relevant to WSD
8.2.1 Syntactic
Part of speech (KS 1)
Morphology (KS 2)
Collocations (KS 3)
Subcategorization (KS 4)
8.2.2 Semantic
Frequency of senses (KS 5)
Semantic word associations (KS 6)
Selectional preferences (KS 7)
Semantic roles (KS 8)
8.2.3 Pragmatic/Topical
Domain (KS 9)
Topical word association (KS 10)
Pragmatics (KS 11)
8.3 Features and lexical resources
8.3.1 Target-word specific features
8.3.2 Local features
8.3.3 Global features
8.4 Identifying knowledge sources in actual systems
8.4.1 Senseval-2 systems
8.4.2 Senseval-3 systems
8.5 Comparison of experimental results
8.5.1 Senseval results
8.5.2 Yarowsky and Florian (2002)
8.5.3 Lee and Ng (2002)
8.5.4 Martínez et al. (2002)
8.5.5 Agirre and Martínez (2001a)
8.5.6 Stevenson and Wilks (2001)
8.6 Discussion
8.7 Conclusions
Acknowledgments
References