Saturday 24 June 2023

Natural Language Processing Introduction: what is Natural Language Processing (NLP)?

You have heard of Natural Language Processing (NLP), but you don’t know precisely what it is or what it is used for? In this post, I will try to help you understand Natural Language Processing with some examples.

What is Natural Language Processing (NLP)?

Natural Language Processing is a subfield of linguistics, computer science, and artificial intelligence. It covers the processing of language, whether written words or speech, by a computer.

It is about developing interactions between computers and human language, and especially about how to program computers to process and analyze large amounts of natural language data.

Don’t make the mistake: Natural Language Processing is not only linguistics! Linguistics is the scientific study of language itself, whereas Natural Language Processing is about getting software to process and analyze that language.

Natural Language Processing is based on rules. But rules are not enough: context is also very important. When a friend tells you: « What a wonderful spring! », do they mean the season or a source of water? Here is another example: « I go to the bank. » Is it about walking along the bank of the river or about taking money to the bank?

So rule-based Natural Language Processing needs lots of rules and dictionaries to cover such cases.
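To make the « bank » and « spring » examples concrete, here is a minimal sketch of dictionary-based disambiguation: each sense of a word comes with a few clue words, and the sense sharing the most words with the sentence wins. The `SENSES` dictionary, its clue words, and the function names are all made up for this illustration; real systems rely on much larger resources and more refined scoring.

```python
import re

# Hypothetical mini-dictionary: each sense of a word lists clue words
# that tend to appear nearby. Real resources are far larger.
SENSES = {
    "bank": {
        "financial institution": {"money", "account", "loan", "cash", "deposit"},
        "river edge": {"river", "water", "walk", "fishing", "shore"},
    },
    "spring": {
        "season": {"weather", "flowers", "april", "sunny", "winter"},
        "water source": {"water", "drink", "mountain", "fresh", "flows"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def disambiguate(word, sentence):
    """Return the sense whose clue words overlap most with the sentence.

    Returns None when the word is unknown or no clue word appears,
    i.e. when the sentence gives no usable context.
    """
    senses = SENSES.get(word)
    if senses is None:
        return None
    context = set(tokenize(sentence)) - {word}
    best_sense, best_score = None, 0
    for sense, clues in senses.items():
        score = len(context & clues)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate("bank", "I go to the bank to deposit some money."))
print(disambiguate("bank", "I go to the bank of the river for a walk."))
print(disambiguate("bank", "I go to the bank."))
```

The last call returns `None`: without any surrounding clue, the sketch cannot decide, which is exactly the ambiguity described above.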

What is Natural Language Processing for?

Thanks to Natural Language Processing, a machine can "understand" the contents of documents, including the contextual nuances of the language within them. A machine can also extract information and insights contained in the documents as well as categorize and organize the documents themselves.

Challenges in natural language processing frequently involve speech recognition, natural language understanding (NLU), and natural language generation (NLG).

Why is Natural Language Processing interesting?

The world is full of unstructured data (i.e. data that is not formatted for machines): by commonly cited estimates, it amounts to 70-90% of digital data. Natural Language Processing is a great way to process these huge volumes of data.

« AI will power 95% of customer interactions by 2025. »

Gartner

For companies, Natural Language Processing is a way to get to know their customers automatically and to create new opportunities (better knowledge, better targeting, etc.).

Natural Language Processing Use Cases

Here are some typical Natural Language Processing use cases:

  • Sentiment analysis: as a CEO you want to automatically measure the happiness of your customers by analyzing their messages
  • Text classification: automatically knowing what a piece of text is talking about
  • Speech recognition: converting audio into text for further processing
  • Chatbots: providing 24/7/365 customer support
  • Machine translation: reaching new markets in other languages with minimal investment
  • Autocompleting text: similar to what Gmail does, to help increase your employees’ efficiency through a centralized knowledge base, or to improve how your customers interact with your product or website
  • Spell checking: making sure that documents don’t contain errors. Especially relevant in industries with high compliance requirements such as banking, insurance and finance.
  • Keyword search: locating relevant information faster across all data sources, leading to increased efficiency
  • Advertisement matching: showing ads that are relevant to the text a user is reading or searching for
  • Entity extraction: extracting information from large volumes of unstructured data
  • Spam detection: automatically filtering out unwanted messages
  • Text generation: automatically creating new text, e.g. client contracts, research documents, training materials, and so on
  • Automatic summarization: adding summaries to existing documents
  • Question answering: answering specific questions based on disparate information sources
  • Image captioning: annotating images
  • Video captioning: videos can take a long time to watch. Sometimes it’s faster to extract information from them in the form of text and then process it further using other Natural Language Processing techniques like summarization
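
To give a feel for the first use case in the list, sentiment analysis, here is a deliberately naive sketch that counts positive and negative words in customer messages. The word lists and function names are hypothetical, and real sentiment analysis usually relies on trained models rather than a handful of words; take it as an illustration only.

```python
import re

# Hypothetical word lists; a real lexicon would hold thousands of entries.
POSITIVE = {"great", "happy", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "broken", "angry", "terrible", "disappointed"}


def sentiment_score(message):
    """Return the number of positive words minus the number of negative words."""
    words = re.findall(r"[a-z']+", message.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(message):
    """Turn the numeric score into a coarse label."""
    score = sentiment_score(message)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


messages = [
    "Great product, I love the fast delivery!",
    "The app is slow and the update is broken.",
    "I ordered it on Monday.",
]
for message in messages:
    print(label(message), "|", message)
```

A message such as « not great » would be scored as positive by this sketch, which shows once more why word lists alone are not enough and context matters.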

Natural Language Processing is not new!

During World War II, Alan Turing helped design a machine, the Bombe, to decipher the Enigma-coded messages sent by the Nazis.


Later, the Georgetown–IBM experiment, performed on January 7, 1954, was an influential demonstration of machine translation. Developed jointly by Georgetown University and IBM, the experiment involved the completely automatic translation of more than sixty Russian sentences into English. It had only six grammar rules and 250 lexical items in its vocabulary.

Another interesting milestone was the ELIZA software, developed in 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Its most famous script, DOCTOR, simulated a psychotherapist and used rules, dictated in the script, to respond with non-directional questions to user inputs. As such, ELIZA was one of the first chatbots and one of the first programs capable of attempting the Turing test.
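
The rule-based idea behind such a script can be sketched in a few lines. The patterns and replies below are invented for illustration and are not Weizenbaum's original DOCTOR script, which was considerably richer (it also swapped pronouns, for instance); the sketch only shows the principle of matching a pattern and answering with a question.

```python
import re

# Hypothetical rules in the spirit of a DOCTOR-style script:
# a pattern to look for, and a reply template to answer with.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(user_input):
    """Answer with the first matching rule, or a neutral prompt otherwise."""
    # Drop surrounding spaces and final punctuation before matching.
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


print(respond("I feel tired today."))
print(respond("My mother calls me every day"))
print(respond("Hello"))
```

The program has no understanding of what is being said: it only reflects the user's own words back as a question, which is enough to keep a conversation going for a while.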

Conclusion

In this post, you discovered what Natural Language Processing is and how it can be used in real life. Lots of challenges still exist, but great progress has been made in recent years. Today, the maturity of the field encourages more and more companies to leverage Natural Language Processing in their products or in their internal organization.
