Sentiment analysis is the task of
analyzing a sentence to determine whether its sentiment is positive or
negative. Part of our team's project is to analyze the sentiment of
comments on four domestic mobile phones: Xiaomi M4, Smartisan T1, Huawei
Honor6, and Meizu MX4.
Here are the key steps of the algorithm:
1. Read the text and tokenize it.
2. In every sentence, find each sentiment word and record
its polarity (positive or negative), according to the sentiment dictionary,
together with its position.
3. Search for an adverb of degree before the sentiment word, and
stop searching at the first one found. Adverbs of different degrees are
assigned different weights, and the weight multiplies the sentiment value
(assume the initial sentiment value of every sentiment word is 1).
4. Find all the negation words before the sentiment word. If
the number of negation words is odd, multiply the sentiment value by -1.
If the number is even, multiply it by 1.
5. If the sentence contains ‘!’, every ‘!’ adds 2 to the
sentiment value of the corresponding polarity.
6. Print the positive and negative values and the corresponding
percentages for every sentence.
7. Add up all the sentiment values and print the positive
and negative values and the corresponding percentages for the whole text.
8. Calculate the mean and variance of the positive and
negative sentiment values for the text.
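The steps above could be sketched roughly as follows. This is only a minimal illustration, not the actual program: the English word lists, the adverb weights, and the names `score_sentence`, `percentages`, and `analyze_text` are all hypothetical stand-ins. It also makes three assumptions that the steps leave open: the backward search stops at the previous sentiment word, a negated word counts toward the opposite polarity, and each ‘!’ is credited to the polarity of the most recent sentiment word.

```python
from statistics import mean, pvariance

# Toy lexicons (assumed); the real project uses a Chinese sentiment dictionary.
POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "ugly"}
DEGREE = {"very": 2.0, "slightly": 0.5}  # assumed weights
NEGATION = {"not", "never"}


def score_sentence(tokens):
    """Return (positive, negative) totals for one tokenized sentence."""
    pos = neg = 0.0
    window_start = 0      # assumed: search only back to the previous sentiment word
    last_polarity = None  # polarity credited to the most recent sentiment word
    for i, tok in enumerate(tokens):
        if tok in POSITIVE or tok in NEGATIVE:
            value = 1.0  # initial sentiment value of every sentiment word
            window = tokens[window_start:i]
            # Step 3: walk backwards and stop at the first degree adverb.
            for prev in reversed(window):
                if prev in DEGREE:
                    value *= DEGREE[prev]
                    break
            # Step 4: an odd number of negation words flips the sign.
            if sum(1 for prev in window if prev in NEGATION) % 2 == 1:
                value = -value
            # Assumed: a flipped value counts toward the opposite polarity.
            is_positive = (tok in POSITIVE) == (value > 0)
            if is_positive:
                pos += abs(value)
            else:
                neg += abs(value)
            last_polarity = "pos" if is_positive else "neg"
            window_start = i + 1
        elif tok == "!" and last_polarity is not None:
            # Step 5: every '!' adds 2 to the corresponding polarity.
            if last_polarity == "pos":
                pos += 2
            else:
                neg += 2
    return pos, neg


def percentages(pos, neg):
    """Step 6/7: share of positive and negative value, in percent."""
    total = pos + neg
    if total == 0:
        return 0.0, 0.0
    return 100 * pos / total, 100 * neg / total


def analyze_text(sentences):
    """Steps 6-8 for a non-empty list of tokenized sentences."""
    scores = [score_sentence(s) for s in sentences]
    pos_values = [p for p, _ in scores]
    neg_values = [n for _, n in scores]
    return {
        "per_sentence": scores,
        "total": (sum(pos_values), sum(neg_values)),
        "mean": (mean(pos_values), mean(neg_values)),
        "variance": (pvariance(pos_values), pvariance(neg_values)),
    }
```

For example, `score_sentence(["not", "very", "good", "!"])` gives `(0.0, 4.0)` under these assumptions: the weight 2 is negated once, so it counts as negative, and the ‘!’ adds 2 more to the negative side. The variance here is the population variance; whether the sample variance would fit better depends on how the comments were collected.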
While programming, I
ran into some problems:
1. Python is a little troublesome when processing Chinese characters. The
text should be handled as Unicode.
2. Python sometimes fails to read all the data in a txt file.
3. Internet slang differs considerably from the vocabulary of the standard
sentiment dictionary. We should add and edit words in the sentiment dictionary
after studying the language habits of netizens, which should improve the
accuracy of the analysis.
4. The sentiment value obtained by analyzing the whole text differs somewhat
from the value obtained by analyzing every sentence and adding the results
together. I think some unnecessary values may arise at the boundary between
two sentences. For example, a sentiment word at the beginning of a sentence
will look for an adverb of degree at the end of the previous sentence.
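One possible way to approach problems 1 and 4 is sketched below, assuming Python 3 and a UTF-8 encoded input file (both are assumptions, as are the function names `read_text` and `split_sentences` and the choice of sentence-ending punctuation). Reading with an explicit encoding raises an error on bytes that cannot be decoded instead of guessing, and splitting the text into sentences before scoring keeps any backward search for adverbs or negation words inside a single sentence when each sentence is scored on its own.

```python
import re

# Assumed sentence terminators: Chinese and ASCII full stop-like marks.
# The terminator is kept with its sentence so that '!' can still be counted.
SENTENCE_PATTERN = re.compile(r"[^。！？!?\n]+[。！？!?]*")


def read_text(path):
    """Read the whole file as Unicode text, assuming it is UTF-8 encoded."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def split_sentences(text):
    """Split text into sentences, dropping empty pieces."""
    pieces = (match.strip() for match in SENTENCE_PATTERN.findall(text))
    return [piece for piece in pieces if piece]
```

Each returned sentence can then be tokenized and scored separately, so a sentiment word at the start of one sentence never sees the words at the end of the previous one. Whether this removes the whole difference between the two totals would still need to be checked against the real data.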
We are still working on improving the results and hope to achieve our goals.