Word Sense Disambiguation In Language Analysis


1 Introduction
Ambiguity is a fundamental characteristic of every language, and English is no exception. A considerable number of English words have more than one meaning. The meaning intended by a speaker or writer can usually be inferred from the context of use.
For example, consider the following sentences: (a) My bank account yields a lot of interest annually. (b) The children are playing on the bank of the river. From the context in which the word “bank” is used, we can infer that sentence (a) refers to a financial institution, while sentence (b) refers to sloping land beside a river. However, human identification of the right …

The output of a word sense disambiguation system is a set of words, each sense-tagged with the appropriate synonym (if any). For the examples above, the sentences can be sense-tagged as follows:
(a) My bank/financial institution/banking concern account yields a lot of interest annually.
(b) The children are playing on the bank/sloping land of the river.
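The word/sense tagging format used in these examples can be illustrated with a short Python sketch. It only shows how tags might be attached once the senses are already known; the function name `sense_tag` and the `tags` mapping are hypothetical, and the sketch assumes whitespace-separated tokens with no attached punctuation.

```python
def sense_tag(sentence, tags):
    """Append '/sense' labels to every token that has an entry in `tags`.

    `tags` maps a lower-cased word to a list of sense labels (an assumed
    format for this illustration); several labels are joined with '/'.
    """
    tagged = []
    for token in sentence.split():
        senses = tags.get(token.lower())
        if senses:
            tagged.append(token + "/" + "/".join(senses))
        else:
            tagged.append(token)
    return " ".join(tagged)


sentence_a = "My bank account yields a lot of interest annually"
sentence_b = "The children are playing on the bank of the river"

tagged_a = sense_tag(sentence_a, {"bank": ["financial institution", "banking concern"]})
tagged_b = sense_tag(sentence_b, {"bank": ["sloping land"]})

print(tagged_a)
print(tagged_b)
```

The disambiguation step itself, deciding which sense label belongs to each occurrence of "bank", is what a word sense disambiguation system has to supply; here the labels are simply given by hand.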
Word Sense Disambiguation relies on knowledge: it uses one or more knowledge sources to associate the most appropriate senses with the words in context. Word Sense Disambiguation is usually a means to an end rather than an end in itself. It enhances other tasks in different fields and in application development, such as parsing, semantic interpretation, machine translation, information retrieval and extraction, text mining, and lexical knowledge acquisition.
Approaches to word sense disambiguation may be knowledge-based (relying on a dictionary or lexicon), supervised (using machine learning techniques to train a system from labelled training sets) or unsupervised (based on …

Michael Lesk (1986) introduced this approach, known as gloss overlap or the Lesk algorithm. It is one of the first algorithms developed for the semantic disambiguation of all words in unrestricted text. The only resources the algorithm requires are a set of dictionary entries, one for each possible word sense, and knowledge of the immediate context in which the disambiguation is performed. The idea behind the Lesk algorithm is the seed from which today’s corpus-based algorithms grew. Almost every supervised WSD system relies in one way or another on some form of contextual overlap, typically measured between the context of an ambiguous word and the contexts specific to the various meanings of that word, as learned from previously annotated data.
The main idea behind the original definition of the algorithm is to disambiguate words by finding the overlap among their sense definitions. Namely, given two words, W1 and W2, with NW1 and NW2 senses respectively defined in a dictionary, for each possible sense pair W1^i and W2^j, i = 1..NW1, j = 1..NW2, we first determine the …
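The comparison over all sense pairs can be sketched in Python. This is a minimal sketch, not Lesk's original implementation: it assumes the overlap of two definitions is measured as the number of distinct content words they share, and that the pair with the highest count is selected. The glosses, the stop-word list, and the function names are hypothetical and written only for this illustration.

```python
from itertools import product

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "in", "that", "for", "is", "with"}


def content_words(gloss):
    """Lower-case a gloss, strip basic punctuation, and drop stop words."""
    words = (w.strip(".,;:()").lower() for w in gloss.split())
    return {w for w in words if w and w not in STOP_WORDS}


def gloss_overlap(gloss_a, gloss_b):
    """Count the distinct content words that two glosses have in common."""
    return len(content_words(gloss_a) & content_words(gloss_b))


def lesk_pair(senses_w1, senses_w2):
    """Return (i, j, overlap) for the sense pair with the largest overlap.

    Each argument is a list of glosses. Indices are 1-based to match
    i = 1..NW1 and j = 1..NW2; ties keep the first pair encountered.
    """
    best = (1, 1, -1)
    pairs = product(enumerate(senses_w1, 1), enumerate(senses_w2, 1))
    for (i, g1), (j, g2) in pairs:
        score = gloss_overlap(g1, g2)
        if score > best[2]:
            best = (i, j, score)
    return best


# Toy glosses invented for this example, not taken from a real dictionary.
BANK = [
    "a financial institution that accepts deposits and pays interest on money",
    "sloping land beside a river or other body of water",
]
RIVER = [
    "a large natural stream of water flowing over land to the sea",
]
INTEREST = [
    "a feeling of curiosity about something",
    "money paid regularly for the use of money lent or deposited",
]

bank_river = lesk_pair(BANK, RIVER)        # compares 2 x 1 sense pairs
bank_interest = lesk_pair(BANK, INTEREST)  # compares 2 x 2 sense pairs
print(bank_river, bank_interest)
```

With these toy glosses, "bank" paired with "river" selects the sloping-land sense, and "bank" paired with "interest" selects the financial-institution sense. Exact word matching is brittle ("deposits" and "deposited" do not match), so a practical version would likely add stemming or lemmatisation.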
