Written by Scott Wilson
There’s no science fiction vision of intelligent machines that doesn’t come with the ability to converse. From HAL 9000 refusing to open the pod bay doors to C-3PO, the etiquette and protocol droid who never knew when to shut up, the human conception of artificial intelligence has always assumed that machines can communicate with us in our own language.
Yet the ability to understand and converse is a hard problem for AI to overcome. Human language is messy, full of difficulties like:
- Abstract concepts and assumptions
- Varied pronunciation, dialects, and slang
- Ambiguous meaning
- Inconsistent spelling and phonetic rules
- Nuance and secondary meanings
Machines, with a love of consistency and regularity, have a tough time with all of these. And that’s even before you get to the hard problem of imparting meaning and figuring out what to do with what they are told.
Solving those problems falls to the field of natural language processing (NLP). And it’s making fantastic advances toward a world where talking with machines is something we do every day.
What Is Natural Language Processing (NLP)?
Natural language processing is about teaching machines to 1) understand spoken or written language, and 2) convey coherent information in the same way. It involves taking naturalistic speech or writing patterns from humans and allowing machines to interpret the concepts, or allowing machines to conceptualize and provide information to people in plain language.
Some natural language processing examples include:
- Email spam filtering systems
- Autocomplete
- Voice assistants like Apple’s Siri or Amazon Alexa
- Automated answering and customer support systems
- Language translation software
And, of course, everyone’s latest favorite, chatbots.
Of course, just explaining what NLP is used for doesn’t say much about the extremely tough math and science behind its methods and techniques.
In fact, NLP is an interdisciplinary field that mashes up hardcore math and statistics with linguistics.
Talk as the Test of Intelligence in Machines
To give some perspective on just how important natural language processing is to AI, it’s useful to take a look at one of the first tools proposed to determine whether or not a machine had achieved the ability to think.
The Turing Test was developed by Professor Alan Turing in 1950 as an idea called the imitation game. Turing’s proposal was that a human evaluator would simultaneously carry on a written conversation with two unknown partners: one also a human, the other a machine.
If the evaluator was unable to tell from the conversation which was which, the machine could be said to have passed the test.
It’s notable that passing the Turing Test wouldn’t involve knowledge or computational skill. A computer could be wrong as much as a human could about items of fact or calculation. But the skill it showed in putting together concepts, words, and sentences in a natural fashion would allow it to pass.
In some senses, the Turing Test has become irrelevant today. There’s no consistency in human interrogators; chatbots have long since cleared the bar with some people, such as the Google engineer who became convinced the language model he was testing had come alive.
But despite its weakness as a scientific indicator, the Turing Test is our most innate and basic test of thinking machines. After all, by what notion other than our conversation do we determine that people can think?
NLP runs in both directions. It takes in the technologies involved in speech recognition and in language generation. Translation is another subset of NLP; after all, if a machine can speak one language, why not all of them?
NLP Training and Models: How Do Natural Language Processing Models Work?
For human beings, spoken language as a way to communicate is perfectly natural. Nothing seems simpler. But for machines, it’s a hard lift to get from data to dialogue.
There are two main computational paths to NLP: symbolic and statistical.
Symbolic NLP works the way most people would expect: a program is given a dictionary and a set of grammatical rules in code, and it applies those to the words and sentences it encounters.
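To make that concrete, here’s a minimal sketch of the symbolic approach in Python. The tiny lexicon and single sentence rule are invented for illustration; real symbolic systems used far larger dictionaries and rule sets, but they were brittle in the same way.

```python
# A toy symbolic NLP system: a hand-coded lexicon plus one rigid grammar rule.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "chased": "VERB", "saw": "VERB",
}

# The only sentence shape this system knows: DET NOUN VERB DET NOUN.
SENTENCE_RULE = ["DET", "NOUN", "VERB", "DET", "NOUN"]

def parse(sentence: str) -> bool:
    """Accept the sentence only if every word is in the lexicon and the
    part-of-speech sequence matches the hand-written rule exactly."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    return tags == SENTENCE_RULE

print(parse("The dog chased a cat"))      # True
print(parse("The dog chased a frisbee"))  # False: "frisbee" was never coded in
print(parse("A cat the dog saw"))         # False: word order breaks the rule
```

Every new word, idiom, and grammatical exception means another hand-coded entry.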
It turns out that language is a big conceptual space, and the rules aren’t as solid as your third-grade teacher told you.
Symbolic attempts found some early successes, but typically collapsed under the weight of complexity.
Enter statistical NLP.
Statistical NLP took advantage of machine learning models to analyze massive volumes of text and come up with statistical relationships between words in common use. Those relationships could be used to infer meaning or to create responses. With remarkably few coded rules, statistical models could be trained for far more flexible and accurate performance than symbolic NLP systems.
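Here’s a toy illustration of the statistical idea, assuming nothing more than a few sentences of made-up text: count which words follow which, then turn the counts into probabilities. Real statistical systems train on billions of words, but the principle is the same.

```python
# Estimate bigram probabilities from raw text: no grammar rules required.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

# Relative frequencies give an estimate of P(next word | "the").
counts = following["the"]
total = sum(counts.values())
for word, n in counts.most_common():
    print(f"P({word!r} | 'the') = {n / total:.2f}")
```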
Even with the magic of machine learning, though, training and developing NLP models took time and still didn’t achieve human-level conversational skill or meaning extraction. It took a breakthrough in, of all things, computer vision for the next big thing in NLP to happen.
Natural Language Processing Techniques Got a Boost from Computer Vision Systems
That big thing was the creation of convolutional neural networks (CNNs) using deep learning algorithms. Convolutional neural networks weren’t new; in fact, the idea of using them for NLP dated back to the ’80s and ’90s in research circles. But much of the AI community had dismissed the idea: the computational horsepower available at the time made the approach untenable.
But around 2010, a student of Geoffrey Hinton’s named George Dahl figured out how to adapt Graphics Processing Units (GPUs) to quickly train up deep neural nets. Hinton and his collaborators used this combination of techniques to train picture recognition software on large volumes of labeled image data. Their algorithm absolutely destroyed the competition at the 2012 ImageNet Large Scale Visual Recognition Challenge… and cemented convolutional neural nets as the next big thing in AI.
In NLP, machine learning of this type has led to the development of large language models (LLMs).
The Magic of NLP Training Is Turning Words into Numbers
NLP models work because they can turn words and phrases into representational vectors of meaning.
Computers, even the most advanced machine learning algorithms, work on numbers, not words. So the first step in bringing their horsepower to bear is turning word meanings into vectors that can be computed on, which is essential in modern NLP.
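Here’s a toy sketch of the idea. The three-dimensional vectors below are made up by hand for illustration (real embeddings have hundreds of dimensions learned from data), but the cosine similarity computation is the standard way of measuring how close two words sit in meaning space.

```python
# Words as vectors: similarity of meaning becomes geometry.
import numpy as np

# Hypothetical 3-D "embeddings"; real models learn these values from text.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.2]),
    "apple": np.array([0.1, 0.9, 0.8]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """1.0 means the vectors point the same way; near 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["king"], vectors["queen"]))  # high: related words
print(cosine_similarity(vectors["king"], vectors["apple"]))  # lower: unrelated
```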
LLMs today are trained primarily through self-supervised processes: the model predicts hidden or upcoming words in a text and corrects itself against the original, with minimal human labeling. On modern computing machinery, this happens far faster than any human could read. And it gives models the ability to learn from a vast corpus of work, drawing on a depth of word uses, styles, and examples.
Even an algorithm that starts off making many mistakes has plenty of time and scope to hone its language skills, given enough words and processing power.
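A minimal sketch of why self-supervision needs no human labelers: the training targets come straight from the text itself. The sentence here is arbitrary; in real training, every position in billions of documents yields a pair of context and correct next word.

```python
# Self-supervised labels for free: for next-word prediction, the "answer"
# at each position is just the word that actually comes next in the text.
text = "the cat sat on the mat".split()

# Build (context, target) training pairs directly from the raw sentence.
training_pairs = [(text[:i], text[i]) for i in range(1, len(text))]

for context, target in training_pairs:
    print(f"input: {' '.join(context):<18} -> predict: {target!r}")
```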
This training feeds vast amounts of written language through the algorithm, allowing it to develop feature relationships between words and phrases. While the machine doesn’t understand any of those things in a definitional sense, it can see how they relate to one another in common use. And that opens up capabilities such as the following (a short code sketch follows the list):
- Tagging parts of speech
- Analyzing sentiment and tone
- Evaluating multiple potential meanings
- Recognizing proper names
- Evaluating the semantic relationships between words and phrases
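Several of those capabilities are available off the shelf today. Here’s a short sketch using the open-source spaCy library; it assumes spaCy and its small English model are installed (pip install spacy, then python -m spacy download en_core_web_sm):

```python
# Part-of-speech tagging and named entity recognition with spaCy.
import spacy

nlp = spacy.load("en_core_web_sm")  # small pretrained English pipeline
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Tag each word with its part of speech.
for token in doc:
    print(token.text, token.pos_)

# Pick out named entities (organizations, places, money amounts).
for ent in doc.ents:
    print(ent.text, ent.label_)
```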
Put them all together, and you have a machine that can evaluate and create credible replies in any human language.
NLP Techniques Can Meet Many Common Demands
That’s such a clearly powerful ability that it’s almost tough to think of industries or activities where it wouldn’t be useful. But a few powerhouse applications for NLP have already emerged:
- Summarization - One of the most popular uses of NLP so far is to instantly review and summarize large amounts of text for users. Whether it’s legal case studies, film reviews, or definitions, large language models have instant access to more written material than any human could read in decades, let alone process. So AI NLP is used to do it for you (see the code sketch after this list). NLP SEO (Search Engine Optimization) is due to become as big a field as SEO itself as search engines adopt NLP AI in generating results. This field can also cover editing, where human-written text is fed through AI to help polish it.
The new Turing Test might work directly opposite to how Turing originally imagined it: since AI NLP models have become far better at grammar and spelling than most people, spotting the machine might now mean picking the participant in the conversation who never makes a mistake.
- Translation - Once a model has been developed to accurately represent concepts in one language, there’s no reason it can’t be tied to another that represents the same concepts in other languages. This makes machine translation a key use for NLP. Combined with computer vision, it can even take written texts and turn them into readable output for speakers of other languages.
- Conversation - Of course, the big splash that NLP has made today comes courtesy of the simplest form of interaction: conversation. Chatbots were one of the earliest examples of natural language processing, and they are one of the most accessible to the average user. The experience of typing back and forth with a machine that offers relatable answers can feel like magic. And it’s one with genuine utility in providing companionship, feedback, and interaction through the power of anthropomorphism.
- Generation - This article was not written by AI. But it could have been. Sophisticated NLP generative transformer models can produce lengthy works that are internally consistent and representative of the bulk of published information on almost any topic. They can be set to produce any sort of tone they have been trained with. Publishers have been taking advantage of generative AI to produce news articles; lawyers have used it to write briefs for court.
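As the summarization item above promised, here’s a brief sketch of summarization and translation using the open-source Hugging Face transformers library. It assumes transformers and a backend like PyTorch are installed; the default pretrained models download on first use, and the sample text is invented for the example.

```python
# Summarization and translation in a few lines with pretrained pipelines.
from transformers import pipeline

article = (
    "Natural language processing lets machines read, interpret, and generate "
    "human language. Modern systems are built on large language models "
    "trained over vast corpora of text from books and the web."
)

# Condense the passage into a short summary.
summarizer = pipeline("summarization")
print(summarizer(article, max_length=30, min_length=10)[0]["summary_text"])

# Translate English into French with a separate pretrained model.
translator = pipeline("translation_en_to_fr")
print(translator("Natural language processing is changing how we work.")[0]["translation_text"])
```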
Ghosts in the Machine Are a Major Challenge for Generative AI Programmers
As incredible as NLP is, and as fundamental as it will be in AI, it’s suffering from a major limitation today: it hallucinates like a goldfish swimming in LSD.
Hallucination is the word the AI community has settled on to describe what, in a human being, you would simply call lying. But LLMs can’t lie; they lack a critical component of consciousness: they have no intent.
Yet hallucination isn’t really a good analog either, because AI can’t perceive, either. A hallucination in people is a perception of something that doesn’t exist that still seems real. But of course, nothing “seems” to a large language model.
Instead, it turns out that misstatements of training material are a statistical inevitability in natural language processing with transformers as they are currently built. Researchers have found that as much as half of references created by some chatbots are fabricated.
Better training data and more in-depth supervised training can reduce the problem, but not eliminate it. Some new breakthroughs will be needed before LLM-based NLP can be relied on in the most critical applications.
Of course, these are just various techniques that NLP can be used for. The actual industries and areas of application where those uses will find a role are vast.
Natural Language Processing is Finding a Range of Roles in Modern Industry
Something that’s happening in the world of artificial intelligence in general is also happening in natural language processing specifically: three general paths are forming around natural language processing jobs and education options.
Pure research and development roles in computer science. At organizations like OpenAI, as well as many advanced university AI labs, these roles explore the cutting edge of NLP for its own sake. They spend more time focused on the analytical and theoretical approaches to natural language processing and push the state of the art forward in pure linguistic skills.
Useful application-oriented roles in business and government. These jobs aren’t trying for the next big scientific breakthrough. Instead, they are all about taking the latest and greatest NLP research and putting it to use in real-world use cases. These are the roles that are adapting chatbots to customer service roles and finding ways to summarize large volumes of documents in businesses.
NLP uses in highly specialized professional fields. Where language skills become a matter of key importance, like lawyers analyzing masses of case files or nursing professionals looking for signs of chronic disease in stacks of medical records, AI NLP positions require a command of both the latest linguistic deep learning skills and an understanding of the profession itself.
There’s plenty of play back and forth between these tracks, but chances are you’ll be able to look at most NLP jobs and see which way they lean. Natural language processing companies will need engineers and scientists in all three areas to push the technology to the next level.
What Will Natural Language Processing Be Used for in the Future?
While those are the top jobs you’ll find for NLP on anyone’s list, there’s a more critical and basic role that the technology will play. For AI, NLP is the critical key to unlocking general communication abilities with humans. For any artificial intelligence, the ability to receive instruction and communicate results rests squarely on natural language processing.
Language comes first. It's not that language grows out of consciousness, if you haven't got language, you can't be conscious.
~ Alan Moore
It’s hard to imagine any kind of advanced AI tool that we will use regularly without resorting to natural language processing. It’s already the primary way that ordinary people engage with AI tools. Whether you are talking to your cell phone, a home automation system, or just shooting the breeze with ChatGPT, the ability to put thoughts into words effortlessly makes language our default interface.
Natural Language Processing in AI is What Makes Us See Machines as Intelligent
NLP has an outsized importance in artificial intelligence not just because of the capability it offers. It’s also one of the subtle tricks on human perception that allow machines to seem alive.
Anthropomorphism is the human tendency to perceive human characteristics in non-human objects.
We tell ourselves stories about the thoughts of pet dogs and teddy bears, see happy trees in paintings, and believe that malfunctioning printers are out to get us. Scientists think this tendency comes from our evolution as social creatures. We learn early and in-depth about interpreting thoughts and moods in other people. So we tend to apply those lessons elsewhere as a default.
Many chatbot developers play on this tendency, consciously or not, in crafting a tone and style of human conversation as the default mode.
But a chatbot is no more human than a rock that looks like it has a face. There are no motives, no understanding, no connections. In the famous words of a group of researchers, they are stochastic parrots, repeating our words back to us in statistically likely combinations.
Computers that expect to do some of our thinking for us will have to engage the same way. Whether you are going to order a burger or ask for advanced analysis of a chest X-ray, the latest and greatest AI systems have to be able to know what you mean.
Degrees and Certificate Programs in Natural Language Processing: How NLP Techniques Are Being Taught to AI Professionals Today
The foundational role for NLP in both the present and future of artificial intelligence means that natural language processing courses are a staple in AI degree and certificate programs today.
In some cases, you will find programs that even specialize in natural language processing, like a Master of Science in Artificial Intelligence with a Concentration in NLP.
More common, however, are programs that come at the subject from the other direction. A Master of Science in Computational Linguistics, or even a bachelor’s degree in linguistics with a computational linguistics track, will have a full range of coursework that goes hard on the statistical, language, and computational angles of NLP.
AI-based NLP degrees are more likely to fall into the highly theoretical areas of deep learning and optimizing large language models.
Certificates are another educational option for anyone who has already put in their time learning and working with advanced computer science and machine learning methods. Alternatively, some certificate programs help linguists boost their AI credentials.
These educational certificates come with college-grade courses… often the same courses taught in degrees offered at the same levels. They only have a handful of classes, however, and focus on specific subjects. A Natural Language Processing with Python Certificate will drill down on NLP Python techniques; a Certificate in Natural Language Technology helps linguists get up to speed on NLP machine learning tech.
These may be offered at the graduate or postgraduate levels.
You’ll also notice that these degrees tend to fall into the same three general tracks of AI careers that we noted earlier. Some will be more purely theoretical, suited for research and new breakthroughs in language processing. Some will have a more liberal arts angle, giving students the economic, social, and historical tools to build NLP business applications. And others may be focused entirely on expert professional applications, where NLP will have a huge impact in specialized fields.
NLP Course Examples in AI Degrees and Certificate Programs
So what kind of natural language processing training do these educational programs have to offer? Courses fall into buckets of both theory and practice.
On the theoretical side, you’ll see courses like:
- Linguistic Models of Syntax & Semantics for Computer Scientists
- Computational Models of Discourse and Dialogue
- Advanced Statistical Methods for Natural Language Processing
- Sociolinguistics
They go hand-in-hand with other advanced AI theory, like neural networks and machine learning algorithms.
On the practical side, you’ll learn natural language processing with Python and other common tools in classes like:
- Deep Learning for NLP
- Information Extraction
- Automated Speech Recognition
- Language Processing Systems and Applications
There are plenty of practical projects where you will put these tools and your core knowledge to use. At the master’s and doctoral level, capstone and dissertation projects hammer in what you’ve learned by requiring in-depth research and demonstration of original thinking in NLP models.
Natural Language Processing Professional Certification
A professional certification is something a bit different from the educational certificate programs offered by colleges. The certificates described above offer a broad, college-grade education in specific elements of NLP. But a professional certificate is less about education and more about verification.
While professional certifications may include some required coursework, they are narrower than educational certificates. On the other hand, they may be more up-to-date and more focused on specific tools, like natural language processing in Python.
The point of a professional certification is to validate your skills in NLP to potential employers. Typically, you must pass a test to demonstrate your knowledge. Some certifications require a level of educational achievement, or a number of years of experience in the field as well.
Today, only IABAC, the International Association of Business Analytics Certification, a European professional organization, offers NLP certification. Their Certified Natural Language Processing Expert certification helps you stand out in AI NLP applications through training and examination offered by various American training partners.
NLP Meaning Is Still Being Sought by Researchers
While the uses and approaches that define NLP today are clear enough, there’s still plenty of mystery to be found in the field.
One of the central questions is to what extent NLP driven by large language models actually exhibits reasoning skills. “On the Dangers of Stochastic Parrots,” by acclaimed AI researchers Dr. Timnit Gebru and Dr. Margaret Mitchell, computational linguistics professor Dr. Emily Bender, and Bender’s PhD advisee Angelina McMillan-Major takes the strong position that LLMs merely manipulate linguistic form and have no understanding of the language they contain.
We say seemingly coherent because coherence is in fact in the eye of the beholder.
~ Gebru et al., “On the Dangers of Stochastic Parrots”
But as the paper points out, this all hinges on what exactly constitutes understanding. Certainly, LLMs don’t understand language in a human sense. But other researchers have found elements of NLP neural networks that seem to break down tasks into subtasks, to draw inferences and to build models of connections in conversation… certainly the building blocks of reasoning and comprehension.
Ultimately, the problem may be a philosophical one. As the Turing Test suggests, perhaps the only working natural language processing definition is in the words that flow back and forth. Any kind of coherence may be just an apparition.
And if that’s the case, NLP is on the doorstep of helping AI achieve something incredible.