Before jumping into Deep Learning and NLP solutions for tackling the above problem, let's establish the problems inherent in conversation intelligence.
Compared to news articles and other free-flowing text available on the internet, conversations present a different set of challenges from an engineering, mathematical, and problem-solving standpoint.
Conversations are complex, and so is the data. Why?
There are a variety of reasons why conversational data is difficult to collect and to extract information from. Here are a few:
- Conversations are random and dependent on state and user context.
- Meaning in conversations can be hierarchical, communicated in bits and pieces at irregular intervals, and often left for the listener to infer.
- Conversations consist of topics that break at uneven points and must be disentangled. For example, I may speak about my trip to India for a while and then switch to another experience. This raises many problems in co-reference resolution and in structuring the topics.
- Conversational data is heavily subjective, tied to the participants in the meeting. A third party witnessing the conversation may need a long time and a lot of effort to comprehend it.
- Errors can propagate because of the heavy dependency on, and biases in, speech recognition.
- Context is an integral part of the conversation and is highly uncertain. Much like a non-zero-sum game, it cannot be modeled as a fixed vector at any point in time; the modeling always depends on uncertain conditions.
- The inference AI model has to learn from a small amount of initialization data.
Conversation intelligence stands apart from traditional NLP, where the hypothesis, situation, and inference communicated by the user are clearly stated and explained in a given flow. Conversational data, by contrast, is all over the place and clearly mismatched with the embeddings and pre-trained models used at scale by the NLP community. This makes conversation intelligence more than a curve-fitting problem; it can be reframed as a problem of making decisions under uncertainty and incomplete information.
While deep learning models trained with advanced mechanisms like attention and multi-head attention tend to perform well on many NLP tasks, in the conversational domain they fall prey to the train-test mismatch between conversations and articles or other corpora. Secondly, they are computation-intensive, and injecting context is costly.
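To make the attention mechanism mentioned above concrete, here is a minimal sketch of multi-head scaled dot-product attention in plain numpy. The dimensions, head count, and random projection matrices are illustrative assumptions, not any particular model's configuration:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product attention split across num_heads heads.

    x: (seq_len, d_model); each projection matrix is (d_model, d_model).
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # Project inputs to queries, keys, values and split into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head).
    q = (x @ w_q).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ w_k).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ w_v).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    # Each position attends over all positions: (num_heads, seq_len, seq_len).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    # Weighted sum of values, heads concatenated back, final projection.
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o

# Toy example with assumed sizes: 5 tokens, model width 8, 2 heads.
rng = np.random.default_rng(0)
d_model, seq_len, heads = 8, 5, 2
x = rng.standard_normal((seq_len, d_model))
params = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
y = multi_head_attention(x, *params, num_heads=heads)
print(y.shape)  # (5, 8)
```

Even this bare-bones version hints at the cost argument: the score matrix is quadratic in sequence length, so injecting long conversational context quickly becomes expensive.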
Context can be intuitively understood as “What led to this point in the conversation and where it may lead next?”
As we still look for answers, let's dig a little deeper into what both sides of the spectrum can offer, and into a few hybrid-learning approaches and aspects that might prove worthy.
More to follow…
Sign up here to start building!