Tracking How a Change in a Service Affects Telecom Customers' Feelings Using Naïve Bayes Sentiment Analysis
Tracking how a change in a service affects telecom customers' feelings is an important analysis for telecom companies. As a result of fast growth and severe competition, customer retention and managing a high churn rate are among the most important challenges telecom companies face today. Retention can be improved by identifying how customers feel after a service changes and by modifying the services that score low on customer satisfaction. This paper combines four stages (text preprocessing, sentiment analysis, personality analysis and reporting) and builds a chat bot system to carry out the task. It shows that using the personality traits agreeableness and emotional range together with sentiment analysis helps to reach a full description of customer feeling. The proposed solution achieved an accuracy of 95% in determining customer feeling. Combining Naïve Bayes sentiment analysis with a personality insights pre-learning stage, and adding feedback from the obtained results, achieves higher accuracy than traditional sentiment analysis techniques.
With the growth of telecom companies such as Etisalat, Orange and Vodafone in our country, the volume of telecom customer data is increasing. Computational linguists have taken advantage of these data, mostly addressing prediction tasks such as sentiment analysis, personality analysis and emotion detection. A few works have also been devoted to predicting how customers feel about a new service. Such prediction tasks have many useful applications, ranging from tracking opinions about a service to identifying the best and worst services and predicting telecom customer satisfaction.
What we need: Telecom companies get a bad rap when it comes to customer experience. All too often, clients feel that service falls short of their expectations and that complaints fall on deaf ears. Yet despite poor customer sentiment, few telecom companies have made customer centricity a priority. In this paper we show how customer textual data, collected through tools such as an AI chat bot, helps determine whether telecom customers are satisfied with a new service and, if not, how we can help them. What share of customers accept the new service? Which services do telecom customers not need? Answering these questions helps telecom companies follow their customers' behavior day by day. If you know the aspects and themes in each response, you can also answer questions like: for how long do people react negatively to a change in a service, or do they really love the newly added feature? With the number of telecom companies rising, how can we help a telecom company keep its customers from migrating to a competitor because of poor service?
By analyzing the sentiment more accurately, and in particular finding the services that telecom customers are really unhappy about, you can:
Focus more on what will make a difference.
Help users find what they need and increase telecom customer satisfaction.
Help the telecom company produce the best service for its customers.
Make the company's work easier and help it keep its customers.
Help the telecom company improve its customer care by keeping track of the acceptance rate that a new service generates.
Bearing in mind that the symptoms of this problem are serious, a telecom company that ignores them risks losing its customers to competitors. Detecting the problem automatically from telecom customer conversations, using sentiment analysis and personality insights, is therefore vital for giving early warnings before it becomes serious. The approach helps the telecom company understand its weak points and correct them, lets all kinds of users contact the company through a chat bot that understands slang, and helps prevent customers from leaving for another company.
The problem entities are how to understand and use the agreeableness and emotional range personality traits and their sub-traits together with the sentiment analysis and emotional values of the telecom customers, as shown in Figure 1, and how to use these traits to reach a detailed report about customer feeling.
Figure 1: Cooperation between customer traits (agreeableness and emotional range) and sentiment analysis in tracking telecom customer feeling.
conversation data after changing a service to determine the success of the service and the customers' satisfaction with it, and how the success of a change in a service can be seen in customer conversation text, since, as we know, “The pen is mightier than the sword”. What we are trying to do is automatically recognize the feeling of the telecom customer using a mix of personality analysis, ML sentiment analysis, NLP and textual AI algorithms; Figure 2 shows these processes. Automatic recognition of telecom customer sentiment is expected to reach an accuracy higher than 95%.
The assumption behind this methodology is that textual data, especially text expressing customers' concerns, frustrations and acceptance, is rich in knowledge that needs to be mined for insights. Based on (Pang and Lee, 2008), sentiment analysis relies on categorizing particular words as ‘positive’ or ‘negative’. Algorithms that classify conversations according to such emotional words have to be ‘trained’ on this data. For sentiment analysis in particular, there are many issues with training data, because the procedure depends on the assumption that words are most often associated with particular feelings. Sentiment analysis algorithms can have difficulty identifying when a word is used sarcastically, for example. Sentiment analysis, unlike classical text mining, which focuses on topical words, picks only sentiment signals for real-time analysis.
On the other hand, according to (Brent W. Roberts and Daniel Mroczek, 2008), a common assumption is that personality traits act like metabolic set points. People may stray briefly from their biological propensity, but they then tend to drift back to their genetically driven set point. Under these types of models, one would expect to find a negative or null association between time and mean-level change, because any change represents short-term fluctuations that disappear as people return to their set point. In this work we therefore use customer personality insights and compute the agreeableness and emotional range traits to help track customer feeling. The proposed system is built on a state-of-the-art machine learning algorithm used for the learning process of sentiment analysis. It combines four stages: (1) text preprocessing, (2) sentiment analysis, (3) personality analysis and (4) reporting, and a chat bot system is created to carry out the task, as shown in Figure 2.
The textual data of telecom customer conversations with the chat bot pass through a set of NLP algorithms that perform the required preprocessing, including computing the bag of words. The result is passed to a pre-trained Naïve Bayes classifier to identify the customer's sentiment. In parallel, we use the IBM Personality Insights (PI) API to compute the customer's big five traits. Feeding the two results to a pre-trained ML algorithm increases the accuracy to values higher than 95%; we discuss this in detail later. All of these results can then be used to help telecom customers get what they need.
In this article, we focus exclusively on the mean-level feeling and the individual feeling of telecom customers after any change, because these indices most directly reflect increases or decreases in the telecom customer population. We show how to follow customer feeling towards a new service and how to use the customer traits agreeableness and emotional range to track that feeling from textual data. To this end we develop an AI chat bot that communicates with telecom customers in slang. We also show how combining customer personality analysis with sentiment analysis improves the measurement of telecom customer feeling.
(M Dachyar, A Rusydina, 2015) used IBM SPSS AMOS 21 software to measure telecom customer satisfaction and its relationship to telecom services. Recent work by (Sujata Joshi 2014) used factor analysis; the results improved a little, but not enough to increase performance. (Balakumar Vijayaraman, Swarnalatha Chellappa, 2016) used two established recurrent neural network algorithms, the Elman Recurrent Neural Network (ERNN) and the Jordan Recurrent Neural Network (JRNN). Sentiment analysis has gained much attention in investigating customer care in recent years (Sampriti Sarkar, 2018; Stephen Nabareseh, Eric Afful-Dadzie, Petr Klímek, Zuzana Kominkova Oplatkova, 2014). To understand customer personality, one may use personality insights, which differ from one person to another (Brent W. Roberts and Daniel Mroczek, 2009). It has also been found that customer traits relate to the display of positive emotions by the service provider (Hwee Hoon Tan, Maw Der Foo, and Min Hui Kwek, 2017).
To compute the telecom customer's personality traits (Barbara Plank, Dirk Hovy, 2015) we use the IBM Watson Personality Insights service (IBM PI API), together with sentiment analysis (Mika V. Mäntylä, Daniel Graziotin, Miikka Kuutila, 2017), to identify the telecom customer's feeling after a service changes.
All the above, along with the work of (Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, Rebecca Passonneau, 2011), motivated the methodology used in this paper. What has been done here is an innovative four-stage technique that builds on the above work, taking the accuracy of (Tony Mullen and Nigel Collier, 2004) and enhancing the algorithm to be faster and more effective by adding the pre-learning step of (Gurvinder Singh and Rajinder Singh, 2017); the last stage is the post-identification process using a nearest neighbor machine learning technique. Its uniqueness comes from combining sentiment analysis, personality insights, learning and feedback processes.
Until now, related research on customer sentiment has used factor analysis (Sujata Joshi 2014) or neural networks (Balakumar Vijayaraman, Swarnalatha Chellappa, 2016). In contrast, we analyze telecom customer conversations using the IBM Watson Personality Insights service together with Naïve Bayes sentiment analysis and a learning process to identify the telecom customer's feeling when a service changes. We used the Weka tool to benchmark the technique and select the best algorithms.
Materials and methods
These assumptions were set in order to start the experiment:
Using the effective sub-traits of agreeableness (Agrsub) and emotional range (ERsub) gives a good description of customer feeling and needs.
Using personality insights increases the final accuracy.
Mixing sentiment analysis with personality analysis gives a higher result.
Using a chat bot that understands slang improves efficiency by providing the textual data we need about telecom customers.
In this paper we used a combination of four stages, as shown in Figure 1:
In this stage we prepare the text to feed to the Naïve Bayes classifier in a set of sub-stages, as shown in Figure 2: (1) tokenize the text into sentences; (2) tokenize each sentence into words with minimal processing; (3) eliminate stop words using the Python package NLTK; (4) compute the bag of words, where each unique word in a text is represented by one number; (5) compute TF-IDF. TF-IDF stands for term frequency-inverse document frequency, and the tf-idf weight is often used in information retrieval and text mining. This weight is a statistical measure of how important a word is to a document in a collection or corpus. The importance increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word in the corpus. Variations of the tf-idf weighting scheme are often used by search engines as a central tool in scoring and ranking a document's relevance given a user query. The data are then ready to feed to Naïve Bayes.
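The five preprocessing sub-stages can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the stop-word set is a tiny stand-in for the NLTK list, the function names and sample sentences are hypothetical, and the tf-idf variant shown (relative term frequency times log of inverse document frequency) is only one of several common weightings.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word set; the paper uses NLTK's list.
STOP_WORDS = {"the", "is", "a", "an", "and", "to", "of", "i", "it", "this", "my"}

def split_sentences(text):
    """Sub-stage 1: split text into sentences after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Sub-stage 2: lowercase and keep alphabetic tokens (minimal processing)."""
    return re.findall(r"[a-z']+", sentence.lower())

def preprocess(text):
    """Sub-stages 1-3: sentences, then words, then stop-word removal."""
    tokens = []
    for sentence in split_sentences(text):
        tokens.extend(w for w in tokenize(sentence) if w not in STOP_WORDS)
    return tokens

def build_vocabulary(docs_tokens):
    """Sub-stage 4: represent each unique word by one integer id."""
    vocab = {}
    for tokens in docs_tokens:
        for w in tokens:
            vocab.setdefault(w, len(vocab))
    return vocab

def tf_idf(docs_tokens, vocab):
    """Sub-stage 5: tf = count / document length, idf = log(N / document frequency)."""
    n_docs = len(docs_tokens)
    doc_freq = Counter(w for tokens in docs_tokens for w in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        vec = [0.0] * len(vocab)
        for w, count in Counter(tokens).items():
            vec[vocab[w]] = (count / len(tokens)) * math.log(n_docs / doc_freq[w])
        vectors.append(vec)
    return vectors

# Hypothetical customer messages, for illustration only.
docs = [
    "The new bundle is great. I love it!",
    "The new bundle is slow and the price is high.",
]
tokens = [preprocess(d) for d in docs]
vocab = build_vocabulary(tokens)
vectors = tf_idf(tokens, vocab)
```

Words that occur in every document, such as "new" and "bundle" here, receive a weight of zero, while words specific to one message keep a positive weight, which is the offsetting effect described above.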
In the sentiment analysis stage we used the Naïve Bayes classifier to classify between positive and negative emotions. We use this process to capture the sentiment of the customer's last message, not of the whole conversation. The process passes through a set of sub-stages: first, dataset collection, which is discussed in the dataset section; second, preprocessing; third and last, the train/test split. For the split we use k-fold cross-validation with 10 folds: in each of 10 iterations, 9 folds are used for training and one fold for testing. This process increases the accuracy to more than 92%. All of this work was developed in Python, and the processing is done on one machine.
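A minimal sketch of this stage is shown below, under stated assumptions: the paper does not specify the Naïve Bayes variant, so a multinomial model with add-one (Laplace) smoothing is assumed, the fold assignment is a simple interleaved split, and the training samples are hypothetical toy data rather than the sentiment dataset described later.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model.

    samples: list of (tokens, label) pairs, label 1 = positive, 0 = negative.
    """
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    for tokens, label in samples:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def k_fold_accuracy(samples, k=10):
    """Average accuracy over k folds: train on k-1 folds, test on the held-out fold."""
    accuracies = []
    for i in range(k):
        test = samples[i::k]  # every k-th sample, starting at index i
        train = [s for j, s in enumerate(samples) if j % k != i]
        model = train_naive_bayes(train)
        correct = sum(predict(model, tokens) == label for tokens, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Hypothetical toy data: 10 positive and 10 negative tokenized messages.
positive = [["great", "service"], ["thanks", "great", "offer"]] * 5
negative = [["slow", "service"], ["slow", "bad", "offer"]] * 5
samples = [(t, 1) for t in positive] + [(t, 0) for t in negative]
mean_accuracy = k_fold_accuracy(samples, k=10)
```

On real data the samples would normally be shuffled before splitting so that each fold has a similar class balance; the toy data here is ordered so that every fold already contains one positive and one negative message.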
We found that personality traits can be relied on for tracking customer sentiment after any change, based on (Yaou Hu, Hyun Jeong Kim, 2018) and (Ninette van Aarde, Deon Meiring, Brenton M. Wiernik, 2017). We therefore use a personality analysis process in our work, but which of the big five traits are useful? To answer this question we rely on (Panagiotis Adamopoulos, Anindya Ghose, Vilma Todri, 2018) and (Megan H. McCarthy, Joanne V. Wood, John G. Holmes, 2017).
Effective traits and their sub-traits
We find that agreeableness and emotional range help in following customer sentiment. To make our approach more robust, we do not use the two traits alone but use the effective sub-traits of each: (Straightforwardness, Cooperation, Trust) and (Fiery, Prone to Worry, Sensitivity to stress), respectively. To find the effective sub-traits, we ran an experiment on a set of conversations with known sentiment and noted the result for each sub-trait every time. After hundreds of runs we found that we can depend on (Straightforwardness, Cooperation, Trust, Fiery, Prone to Worry, Sensitivity to stress) as the effective sub-traits for tracking customer feeling, as shown in Table 1. Table 1 shows the large difference between the effective sub-traits (Cooperation, Trust, etc.) and the non-effective ones (Modesty, Altruism, etc.).
Sub-trait (%)              Pos   Neg   Neu   Pos   Neg   Neu
Modesty                     41    55    41    63    26    67
Straightforwardness *       34    35    77    56    10    71
Altruism                    26     6    21    20    25    65
Sympathy                    25     4    23    20    10    40
Cooperation *               15    50    98    94     3    85
Trust *                     13    37    95    86    28    80
Fiery *                     75    55    11    15    87    52
Prone to Worry *            77    51    26    20    78    80
Sensitivity to stress *     85    63    41    32    59    87
Impulsiveness               86    69    78    67    73    86
Self-consciousness          97    94    93    73    88    96
Melancholy                  97    93    65    86    93    96
Table 1: Results of testing positive, negative and neutral conversations to find the effective sub-traits (marked with *).
In the personality analysis process we used the IBM Watson Personality Insights service to compute the big five traits of the telecom customer. The text comes from the telecom conversations with the AI chat bot. Because telecom customers chat in slang, we first pass the customer conversation through a set of processes that convert it from slang to standard language using the slang dataset. We then send the last 10 conversations to IBM PI to compute the customer's big five traits, and extract the results for the effective traits (agreeableness, emotional range and their sub-traits) from the returned JSON file. Figure 4 shows three cases of positive, negative and neutral sentiment together with the customer's effective sub-traits. It supports our choice of sub-traits, which give different and useful values as the customer's sentiment differs. These values are used with the sentiment analysis in post-processing.
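The extraction step can be sketched as below. The response shape is an assumption: a simplified JSON document with a `personality` list whose entries carry `name`, and `children` with `name` and `percentile`. The real service's field names and facet labels may differ, no network call is made, and the numbers are illustrative values only. The helper names `select_last_messages` and `extract_effective_sub_traits` are hypothetical.

```python
import json

# Effective sub-traits named in the paper, grouped by parent trait.
EFFECTIVE_SUB_TRAITS = {
    "Agreeableness": ["Straightforwardness", "Cooperation", "Trust"],
    "Emotional range": ["Fiery", "Prone to worry", "Sensitivity to stress"],
}

def select_last_messages(conversation, n=10):
    """Join the last n messages into one text to send for personality analysis."""
    return " ".join(conversation[-n:])

def extract_effective_sub_traits(profile):
    """Return {trait: {sub_trait: percent}} keeping only the effective sub-traits."""
    result = {}
    for trait in profile.get("personality", []):
        wanted = EFFECTIVE_SUB_TRAITS.get(trait["name"])
        if wanted is None:
            continue  # skip the other big five traits
        result[trait["name"]] = {
            child["name"]: round(child["percentile"] * 100)
            for child in trait.get("children", [])
            if child["name"] in wanted
        }
    return result

# Assumed, simplified response body (hypothetical shape and values).
RESPONSE_JSON = """
{"personality": [
  {"name": "Agreeableness", "children": [
    {"name": "Modesty", "percentile": 0.41},
    {"name": "Straightforwardness", "percentile": 0.77},
    {"name": "Cooperation", "percentile": 0.98},
    {"name": "Trust", "percentile": 0.95}]},
  {"name": "Emotional range", "children": [
    {"name": "Fiery", "percentile": 0.11},
    {"name": "Prone to worry", "percentile": 0.26},
    {"name": "Sensitivity to stress", "percentile": 0.41},
    {"name": "Melancholy", "percentile": 0.65}]}
]}
"""

traits = extract_effective_sub_traits(json.loads(RESPONSE_JSON))
```

Keeping the list of effective sub-traits in one dictionary means the selection can be changed without touching the extraction logic.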
Reporting the result
In this last process we feed the results of the sentiment analysis and personality analysis processes to a pre-learned model that helps interpret them. First we group the negative sub-traits with the negative value of the sentiment analysis and the positive sub-traits with the positive value, then we calculate the mean for each set. We found that in the positive cases the mean of the agreeableness sub-traits (Agrsub) scores a low value (Mean=50), as shown in Table 2, which is calculated using the following equations.
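$$\mathrm{Agr} = \frac{\sum \mathrm{Agr}_{sub}}{\mathrm{count}(\mathrm{Agr}_{sub})} = \frac{\mathrm{Straightforwardness} + \mathrm{Cooperation} + \mathrm{Trust}}{3}$$
$$\mathrm{ER} = \frac{\sum \mathrm{ER}_{sub}}{\mathrm{count}(\mathrm{ER}_{sub})} = \frac{\mathrm{Fiery} + \mathrm{Prone\ to\ Worry} + \mathrm{Sensitivity\ to\ stress}}{3}$$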
                                Negative   Positive   Neutral
Mean Agreeableness (Agr) %      90         13.66667   78.66667
Mean Emotional Range (ER) %     26         74.66667   73
Table 2: Mean of the effective agreeableness and emotional range sub-trait results for customer conversations in the positive, negative and neutral cases.
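As a worked example of the two mean equations, the sketch below averages the effective sub-trait scores taken from one score column of Table 1 (Straightforwardness 77, Cooperation 98, Trust 95; Fiery 11, Prone to Worry 26, Sensitivity to stress 41). The function name `trait_mean` is hypothetical.

```python
def trait_mean(sub_trait_scores):
    """Mean of the effective sub-trait scores of one trait, in percent."""
    return sum(sub_trait_scores.values()) / len(sub_trait_scores)

# Scores copied from one column of Table 1.
agr_sub = {"Straightforwardness": 77, "Cooperation": 98, "Trust": 95}
er_sub = {"Fiery": 11, "Prone to Worry": 26, "Sensitivity to stress": 41}

agr = trait_mean(agr_sub)  # (77 + 98 + 95) / 3 = 90.0
er = trait_mean(er_sub)    # (11 + 26 + 41) / 3 = 26.0
```

These two values agree with the Agr = 90 and ER = 26 entries of Table 2.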
We then used the mean of each trait together with the result of the sentiment analysis to give the telecom company a detailed report about customer feeling, e.g. (this customer was at first happy and agreed with the service features, but after the last feature was added he/she was very angry and sad). Based on the results of the previous stages, we help telecom companies make a decision about the new service. This process helps the telecom company understand its customers' satisfaction, sentiment and emotions in order to predict how its customers will behave in the coming days.
In this article we used two datasets. To stay close to telecom customers, we need a chat system that understands the language and style most customers type in, so we used a slang chat bot. On the other hand, the IBM Personality Insights service used in the personality analysis process requires input in standard language, so we need a way to convert text from slang to its standard form. We therefore created the slang dataset, which works as a dictionary. The second dataset is the sentiment dataset used for pre-learning the ML sentiment analysis algorithms. We discuss the two in detail next.
Slang-to-standard dataset
The slang dataset works as a dictionary that converts slang to standard language, such as “r u” to “are you”. To collect this dictionary we used social media posts, blogs, tweets and comments related to telecom companies and customers. The dictionary contains 3000 slang words with their standard meanings. It is fed as live input to the algorithms.
Table 3 shows some samples of the slang-to-standard dataset. Each word has its meaning in standard language.
Standard meaning                                  Slang
Are you here                                      R u here
Describe something you like                       It's my cup of tea
A feeling of depression or sadness                Feeling blue
A proclamation of honesty                         For real
“and there it is” or “and there you have it”      Bob's your uncle
Table 3: Samples of the slang dataset.
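A dictionary-based slang-to-standard conversion of this kind can be sketched as follows. Only the “r u” entry comes from the paper; the other entries, the function names and the matching strategy (case-insensitive whole-phrase replacement, longest phrase first) are assumptions made for illustration.

```python
import re

# Tiny stand-in for the 3000-entry slang dictionary.
SLANG_TO_STANDARD = {
    "r u": "are you",  # example given in the paper
    "thx": "thanks",   # hypothetical entry
    "gr8": "great",    # hypothetical entry
}

def build_pattern(dictionary):
    """Compile one regex matching any slang phrase as a whole word or phrase."""
    # Longest phrases first, so multi-word entries win over shorter ones.
    phrases = sorted(dictionary, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in phrases)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

def normalize(text, dictionary=SLANG_TO_STANDARD):
    """Replace every slang phrase in text with its standard form."""
    pattern = build_pattern(dictionary)
    return pattern.sub(lambda m: dictionary[m.group(0).lower()], text)

standard = normalize("R u here? thx, the offer is gr8")
```

The word-boundary anchors keep the replacement from firing inside longer words, so “r u” is not rewritten inside a phrase such as “our usage”.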
Sentiment Analysis Dataset
For the sentiment analysis process we collected as large a sentiment dataset as possible for pre-learning the ML SA algorithms. We used a dataset of more than 15000 labeled conversations, comments and posts, of which about 55% have a positive meaning and 45% a negative meaning. The training set contains 12000 conversations with positive and negative sentiment. The dataset was collected from many sources: (Nabil et al., 2015), (Abdulla et al., 2013), (Mourad and Darwish, 2013), (Aly and Atiya, 2013), (Banea et al., 2010), (Hady ElSahar and Samhaa R. El-Beltagy, 2011). The data gathering process was not easy; we targeted telecom web pages, telecom pages on social media and telecom advertisements on the internet.
id Sentiment Text
1 0 The only real effects work is the presence of all the animals, and the integration of those into the scenes is some of the worst and most obvious blue/green-screen work I’ve ever seen.
2 0 For a service that costs as much as this one does, I expect it to work far better and with greater ease than this thing does.
3 0 Not sure who was more lost – the flat characters or the audience, nearly half of whom walked out.
4 1 oh thank you!
5 1 This review is long overdue, since I consider A Tale of Two Sisters to be the single greatest film ever made.
6 1 Thanks, I need all the help i can get.
Table 4: Samples of the sentiment analysis dataset (Sentiment 1 = positive, 0 = negative).
Tracking the effect of a service change on telecom customer feeling requires a way to communicate with customers, because, as we know, “The pen is mightier than the sword”. We therefore created an AI telecom chat bot that is approachable for all customer types by implementing it in slang. The telecom company can use it to offer a new service; we can then observe the customer conversations and analyze them from the psychological and emotional side using sentiment analysis and personality analysis, and in this way correctly track the effect of the service change.
In this paper we did not depend on sentiment analysis only but used personality traits as well. It has been found that customer traits relate to the display of positive emotions by the service provider (Hwee Hoon Tan, Maw Der Foo, and Min Hui Kwek, 2017). If we use the big five traits agreeableness and emotional range and their sub-traits, e.g. (Self-consciousness, Sensitivity to stress, Prone to worry; Trust, Modesty, Sympathy), mix their results with the sentiment analysis process, and feed them to a pre-learned ML algorithm, we help telecom companies follow their customers day by day. The reporting stage is based on the results of the first two processes (sentiment analysis and personality analysis). Together, these processes give a complete, accurate and detailed description of the telecom customer's feeling about the new service or the telecom company in general.
All previous works used standard English in chat bot conversations with customers. We use both standard and slang English in the chat bot, to be as close as possible to all types of customers, and we also make it a sensitive, analytical chat bot that understands telecom customers' emotions and personality in order to give the telecom company a full customer description.
Results and discussion
Telecom companies use their web pages to offer new services, but they do not know how well those services are actually accepted. We therefore proposed the chat bot system to communicate with customers and learn their opinions of these services. During the conversation between telecom customers and the AI chat bot, the conversation text goes through a set of AI and machine learning algorithms. We can then give the telecom company a full description of the effect of a service change on telecom customer feeling and of the agreeableness and sadness it creates, e.g. (this customer was at first happy and agreed with the service features, but after the last feature was added he/she was very angry and sad).