Although cyberbullying does not happen in person, its psychological and emotional effects are just as devastating, or even more so, in the long term.
Cyberbullying is a more intense form of traditional bullying: it goes well beyond the confined physical space of a workplace or a school, often leaving the victim with no escape. It gives the bully the means to embarrass or hurt the target in an online forum or post, or to incite violence in some way, especially on social networking sites. In the electronic world, bullies can also always remain anonymous, for example by using dummy mail IDs or fake names in IM, chat rooms, text messages and many other places on the internet, and hence stay completely unknown; this can release them from the social constraints that would otherwise apply to their behaviour. Also, more often than not, these ‘places’ on the internet are …
It matches words against a list of character patterns, i.e., regular expressions. Without looking at the actual intent of the communicated sentence, it removes or censors every pattern that matches. In this work, I propose adding two modules to the existing systems: a part-of-speech (POS) tagger and a module that exploits context using the Natural Language Toolkit (NLTK) in Python. This makes the relations between words understandable and thus increases the efficiency of the software used for detecting cyberbullying. The accuracy of the software is found to be …
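To illustrate why pure pattern matching ignores intent, here is a minimal sketch of such a filter using Python's standard `re` module. The pattern list and the `censor` helper are hypothetical and chosen only for illustration; they are not taken from any existing system.

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKED_PATTERNS = [r"\bidiot\w*", r"\bstupid\b"]
BLOCKED_RE = re.compile("|".join(BLOCKED_PATTERNS), re.IGNORECASE)

def censor(text):
    """Replace every match of a blocked pattern with asterisks of equal length."""
    return BLOCKED_RE.sub(lambda m: "*" * len(m.group()), text)

# An insult aimed at another person is censored ...
print(censor("You are stupid"))
# ... but so is a harmless remark about one's own mistake,
# because the filter never looks at who or what the word refers to.
print(censor("That was a stupid mistake on my part"))
```

Both sentences are treated identically, which is the limitation that the proposed POS tagging and context modules are meant to address.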
Sentence parsing uses the output of the POS tagger to link verbs and nouns, which provides the contextual meaning of a sentence. A classifier can be trained to work out which suffixes are most informative. The first step is to find the most common suffixes.
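The suffix count can be sketched as follows. This is a minimal version that assumes a plain list of words and uses `collections.Counter` from the standard library in place of an NLTK frequency distribution, so that it runs without downloading a corpus. The function name `most_common_suffixes` and the sample words are hypothetical.

```python
from collections import Counter

def most_common_suffixes(words, n=100):
    """Return the n most frequent 1-, 2- and 3-letter word endings."""
    suffix_counts = Counter()
    for word in words:
        word = word.lower()
        # word[-k:] is the last k letters (the whole word if it is shorter).
        suffix_counts[word[-1:]] += 1
        suffix_counts[word[-2:]] += 1
        suffix_counts[word[-3:]] += 1
    return [suffix for suffix, _ in suffix_counts.most_common(n)]

# Hypothetical sample; a real run would pass the words of a tagged corpus.
sample_words = ["talking", "walking", "jumped", "played", "running"]
print(most_common_suffixes(sample_words, n=3))
```

With a real corpus, `n=100` would give the list of candidate suffixes that the feature extractor checks.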
Next, a feature extractor function is defined that checks a given word for these suffixes.
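A feature extractor of this kind might look like the sketch below. The name `pos_features` follows the training code in this text; the short `COMMON_SUFFIXES` list is a hypothetical stand-in for the 100 most common suffixes found in the corpus.

```python
# Hypothetical suffix list for illustration; in practice this would be the
# 100 most common suffixes found in the corpus.
COMMON_SUFFIXES = ["ing", "ed", "s", "ly"]

def pos_features(word, suffixes=COMMON_SUFFIXES):
    """Map a word to boolean features, one per candidate suffix."""
    word = word.lower()
    return {f"endswith({suffix})": word.endswith(suffix) for suffix in suffixes}

print(pos_features("Running"))
```

Each word is reduced to a dictionary of true/false values, so the classifier never sees the word itself, only which of the candidate suffixes it ends with.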
Feature extraction functions behave like tinted glasses, highlighting some of the properties (colors) in our data and making it impossible to see other properties. The classifier will rely exclusively on these highlighted properties when determining how to label inputs.
In this case, the classifier will make its decisions based only on information about which of the common suffixes (if any) a given word has.
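Now, the feature extractor is used to train the classifier:

    import nltk
    from nltk.corpus import nps_chat  # the chat corpus that text5 in nltk.book wraps

    # text5 is an nltk.Text object and carries no tags, so the tagged words
    # are read from the corpus reader instead.
    tagged_words = nps_chat.tagged_words()
    featuresets = [(pos_features(n), g) for (n, g) in tagged_words]
    size = int(len(featuresets) * 0.1)  # hold out the first 10% for testing
    train_set, test_set = featuresets[size:], featuresets[:size]
    classifier = nltk.DecisionTreeClassifier.train(train_set)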
Here, a frequency distribution of suffixes is created by extracting the last one, two and three letters of each word. Then the first 100 keys of the