Suppose you are collecting feedback from your customers. Each response is free text followed by one objective question at the end: a yes/no answer, or whether the customer would recommend you or not.
Say you have thousands of such responses. Would it be easy for a human being to sort through them and work out how your customers feel?
This is where NLP (Natural Language Processing) comes in.
Python is used here to walk through the scenario.
Preprocessing the input data
Before loading the data, make sure it is in a TSV (tab-separated values) file rather than a CSV. Review text usually contains commas, and in a CSV these would be mistaken for column separators. Also make sure the reader ignores double quotes inside the text by passing "quoting=3".
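To see what the tab delimiter and "quoting=3" do, here is a minimal sketch using Python's standard csv module. The sample reviews and the column names Review and Liked are made up for illustration. The value 3 is the constant csv.QUOTE_NONE, and pandas.read_csv accepts the same quoting values if you prefer to load the file with pandas.

```python
import csv
import io

# Hypothetical sample data: free-text review, then a 0/1 answer, separated by a tab.
raw = (
    'Review\tLiked\n'
    'Wow... Loved this place, really!\t1\n'
    'The "crust" is not good.\t0\n'
)

# quoting=3 is the same as csv.QUOTE_NONE: double quotes are treated as ordinary text.
reader = csv.reader(io.StringIO(raw), delimiter='\t', quoting=csv.QUOTE_NONE)
header, *rows = list(reader)

for review, liked in rows:
    print(liked, review)
```

Because the columns are split on tabs, the comma in the first review stays inside the text, and the double quotes in the second review are kept as they are.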
Clean the text
- Choose the words that reflect positive sentiment, such as like, love and happy. Inflected forms such as liked and loved should be grouped with their base word, so that the number of distinct words (and the computation) is kept to a minimum.
- Use the regular expression library re, i.e. import re.
- The re.sub() function helps to remove the special characters.
- The lower() method converts all the characters to lowercase.
- So far: take a sentence, remove the special characters and convert it to lowercase.
- Split the sentences into words. The nltk package can do this.
- Use a stopword list to drop filler words and keep only the words that carry the sentiment.
- Separate the stem (the root word) from the rest of each word. For instance, liked should be taken as 'like'. A separate class, PorterStemmer, is available for this.
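The cleaning steps above can be sketched as one small function. This is only an illustration under a few assumptions: the function name clean_review and the tiny SAMPLE_STOPWORDS set are hypothetical (the full list in nltk.corpus.stopwords needs a one-time nltk.download('stopwords')), and plain str.split() is used instead of an nltk tokenizer to keep the example short.

```python
import re

from nltk.stem.porter import PorterStemmer

# Hypothetical, hand-picked stand-in for nltk's English stopword list.
SAMPLE_STOPWORDS = {'this', 'the', 'a', 'an', 'is', 'was', 'and', 'i', 'it', 'to', 'of'}

stemmer = PorterStemmer()


def clean_review(text):
    """Clean one review: strip symbols, lowercase, drop stopwords, stem."""
    # Replace everything that is not a letter with a space.
    letters_only = re.sub('[^a-zA-Z]', ' ', text)
    # Lowercase and split on whitespace to get the individual words.
    words = letters_only.lower().split()
    # Keep only the words that are not in the stopword set, reduced to their stem.
    stems = [stemmer.stem(word) for word in words if word not in SAMPLE_STOPWORDS]
    return ' '.join(stems)


print(clean_review('Wow... Loved this place!'))  # prints: wow love place
```

Here "Loved" ends up as "love", so liked/like and loved/love are counted as the same word in later steps.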