Friday 30 June 2023

Fine-tuning GPT-J, the GPT-3 open-source alternative

GPT-J may be the most powerful open-source Natural Language Processing model today (it is the only open-source alternative competing with GPT-3), but you might still find it too general and not perfectly suited to your use case. In that case, fine-tuning GPT-J with your own data is the key.

The Power of GPT-J

Since its release by EleutherAI in June 2021, GPT-J has attracted tons of Natural Language Processing users - data scientists and developers - who believe that this powerful model will help them take their AI application to the next level (see EleutherAI's website).

GPT-J is so powerful because it has 6 billion parameters. As a consequence, it is a very versatile model that you can use for almost any advanced Natural Language Processing use case (sentiment analysis, text classification, chatbots, translation, code generation, paraphrase generation, and much more). When properly tuned, GPT-J is so fluent that it is almost impossible to tell that the text was generated by a machine...

It is possible to adapt GPT-J to your use case on the fly by using the so-called few-shot learning technique (see how to use it here). However, if few-shot learning is not enough, you need to go for a more advanced technique: fine-tuning.

What is Fine-Tuning?

When it comes to creating your own model, the traditional technique is to train a new model from scratch with your own data. The problem is that modern models like GPT-J are so huge that it's almost impossible for anyone to train such a model from scratch. EleutherAI said it took them 5 weeks to train GPT-J on a TPU v3-256, which means it cost hundreds of thousands of dollars...

The good news is that re-training GPT-J is not necessary because we have fine-tuning! Fine-tuning is about taking the existing GPT-J model and slightly adapting it. In the past, training traditional Natural Language Processing models from scratch used to take tons of examples. With the new generation of Transformer-based models, it is different: fewer examples are necessary and can still lead to great results. If you have ever heard of "transfer learning", this is what it is about.

How to Fine-Tune GPT-J?

Even if fine-tuning GPT-J is much easier than training the model from scratch, it is still a challenge for several reasons:

  • It is a very compute-intensive operation that can be painfully long on a GPU. The best option is to use a TPU for that.
  • The fine-tuning process takes some practice: several parameters need to be tweaked, and you can easily end up with suboptimal accuracy.
  • Once you have your brand new fine-tuned model, it's not over: you still have to deploy it and reliably use it in production.

If you want to fine-tune GPT-J by yourself, here is how you could do it:

  • Follow the how-to from the Mesh Transformer Jax team here (a dataset preparation sketch follows this list).
  • Make sure to perform the fine-tuning on a TPU v3, as you will run out of memory on a TPU v2. You can ask for free TPU access for 1 month thanks to the TPU Research Cloud (TRC) program.
  • Don't forget to turn your result into a slim GPT-J version that is better suited for production inference.
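Before launching the fine-tuning itself, you need to turn your examples into a training dataset. Here is a minimal, hypothetical sketch of that step in Python: it simply concatenates raw text examples into a single training file with a GPT-style document separator. The exact file layout and tokenization expected by the Mesh Transformer Jax scripts may differ, so treat this as an illustration only and follow their how-to for the authoritative format.

# Hypothetical dataset preparation sketch: concatenate raw text examples into
# one training file. The separator and layout are assumptions; check the
# Mesh Transformer Jax how-to for the exact format its scripts expect.
examples = [
    "First training example...",
    "Second training example...",
]

with open("finetuning_dataset.txt", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(example.strip() + "\n<|endoftext|>\n")  # GPT-style document separator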

Fine-Tuning GPT-J on NLP Cloud

At NLP Cloud we worked hard on a fine-tuning platform for GPT-J. It is now possible to easily fine-tune GPT-J: simply upload your dataset containing your examples, and let us fine-tune and deploy the model for you. Once the process is finished, you can use your new model as a private model on our API.
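For illustration, here is a hypothetical sketch of what calling such a private fine-tuned model could look like with the NLP Cloud Python client. The model name below is a placeholder: use the name displayed in your NLP Cloud dashboard once the fine-tuning is finished.

import nlpcloud

# Placeholder model name: replace it with the private model name shown in your
# NLP Cloud dashboard once fine-tuning is finished.
client = nlpcloud.Client("your-private-model-name", "your_token", gpu=True)

generation = client.generation("Your prompt here", max_length=100)
print(generation["generated_text"])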


GPT-J Fine-Tuning on NLP Cloud

The fine-tuning process itself is free, and then you need to select a fine-tuning plan depending on the volume of requests you want to make on your newly deployed model.

If you do not want to spend too much time on the fine-tuning and deployment operations, it is an option you might want to consider.

Conclusion

GPT-J is an amazing Natural Language Processing model. Mix it with few-shot learning and fine-tuning, and you will get a state of the art AI application!

Thursday 29 June 2023

GPT-3 open-source alternatives: GPT-Neo and GPT-J

OpenAI's GPT-3 model gives great results, but it is not open-source and OpenAI's API is very expensive. GPT-Neo and GPT-J are two great open-source alternatives to GPT-3. How do they compare?

The GPT-3 Breakthrough

GPT-3 was released in May 2020 by the OpenAI research lab, based in San Francisco. As of this writing, it is the biggest Natural Language Processing model ever created, with 175 billion parameters! See GPT-3's website here.

It can be used for various Natural Language Processing use cases, but it especially excels at text generation (give a small piece of text to the model with an expected text length, and let the model generate the rest of the text for you). The stunning accuracy of GPT-3 opens tons of new AI possibilities, like automatic marketing content creation, advanced chat bots, medical question answering, and much more. For more examples about what you can do with GPT-3, please see their documentation.

Problem: the model can only be used through an expensive paid API as a black box, and only for selected customers (new customers have to join a waitlist). The source code of GPT-3 belongs to Microsoft.

GPT-3 Pricing

OpenAI's API offers 4 GPT-3 models with different numbers of parameters: Ada, Babbage, Curie, and Davinci. OpenAI don't say how many parameters each model contains, but some estimates have been made (see here): Ada seems to contain roughly 350 million parameters, Babbage 1.3 billion, Curie 6.7 billion, and Davinci 175 billion.

The more parameters, the better the accuracy, but also the slower the model and the higher the price. Pricing is per token: basically, you can consider that 100 tokens are roughly equivalent to 75 words. They count the tokens you send in the input request plus the tokens generated by the model. On Davinci, for example, the price for 1,000 tokens is $0.06.

Example: sending a piece of text made up of 1,000 tokens (roughly 750 words) and asking the model to generate 200 tokens will cost you $0.072. If you send 20 such requests per minute to their API, it will then cost you around $64,250 per month...
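As a sanity check, here is the same arithmetic written out in Python, assuming a 31-day month (44,640 minutes):

# GPT-3 Davinci pricing arithmetic (assuming a 31-day month).
price_per_1k_tokens = 0.06
tokens_per_request = 1000 + 200              # input + generated output
cost_per_request = tokens_per_request / 1000 * price_per_1k_tokens
print(cost_per_request)                      # 0.072

requests_per_minute = 20
minutes_per_month = 60 * 24 * 31             # 44,640 minutes
print(round(cost_per_request * requests_per_minute * minutes_per_month))
# roughly $64,000 per month, in line with the figure above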


GPT-3 Pricing

GPT-Neo and GPT-J

GPT-Neo was released in March 2021, and GPT-J in June 2021, as open-source models, both created by EleutherAI (a collective of researchers working to open-source AI).

GPT-Neo has 3 versions: 125 million parameters, 1.3 billion parameters (equivalent to GPT-3 Babbage), and 2.7 billion parameters. GPT-J has 6 billion parameters, which makes it the most advanced open-source Natural Language Processing model as of this writing. This is a direct equivalent of GPT-3 Curie.

The tests made on these models show great performance. When generating text with GPT-J, it is almost impossible to tell whether it has been written by a human or a machine...

How to use GPT-Neo and GPT-J?

GPT-Neo and GPT-J are open-source Natural Language Processing models, so everybody can download them and use them. Well, in theory...

GPT-J, for example, needs around 25GB of RAM plus many CPU cores to run. On CPUs, GPT-J is painfully slow though, so it is much better to perform inference on a GPU. As GPT-J needs around 25GB of GPU VRAM, it does not fit in most of the standard NVIDIA GPUs on the market today (which have 8GB or 16GB of VRAM at most).
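A quick back-of-the-envelope check of that 25GB figure, assuming the 6 billion weights are stored as 32-bit floats (4 bytes each):

# Rough memory estimate for GPT-J weights stored in float32.
parameters = 6_000_000_000
bytes_per_parameter = 4                             # float32
print(parameters * bytes_per_parameter / 1024**3)   # ~22.4 GiB for the weights alone,
                                                    # before activations and runtime overhead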

These hardware requirements make it very impractical to test GPT-J and GPT-Neo, let alone use them reliably for inference in production with high-availability and scalability in mind.

So if you want to simply try GPT-J and GPT-Neo or use them for real in your production application, we do recommend that you use an existing Natural Language Processing API like NLP Cloud. As far as we know, NLP Cloud is the only API proposing GPT-J as of this writing (see NLP Cloud's text generation API here)! And you can also fine-tune GPT-J, so the model is perfectly tailored to your use case.


GPT-J API Example

GPT-3 vs GPT-J on NLP Cloud: pricing comparison

The GPT-3 API is very expensive. On the contrary, NLP Cloud tried to make their GPT-J API as affordable as possible, despite the very high computation costs required on the server side. Let's do the math.

Imagine that you want to perform text generation with GPT-3 Curie. You want to pass an input of 1000 tokens and generate 200 tokens. You want to perform 3 requests per minute.

The price per month would be (1,200/1,000) x $0.006 x 133,920 = $964/month (133,920 being the number of requests in a month at 3 requests per minute, i.e. 3 x 44,640 minutes).

Now the same thing with GPT-J on NLP Cloud:

On NLP Cloud, the plan for 3 requests per minute on GPT-J costs $29/month on CPU or $99/month on GPU, no matter the number of tokens.
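Here is the same comparison written out in Python, using the numbers above (1,200 tokens per request, 3 requests per minute, 44,640 minutes in a 31-day month):

# GPT-3 Curie: $0.006 per 1,000 tokens, 1,200 tokens per request.
curie_monthly = (1200 / 1000) * 0.006 * 3 * 44640
print(round(curie_monthly))      # ~964 dollars per month

# NLP Cloud flat plans for 3 requests/minute on GPT-J, whatever the token count.
print(curie_monthly / 29)        # ~33x more expensive than the $29 CPU plan
print(curie_monthly / 99)        # ~10x more expensive than the $99 GPU plan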

As you can see the price difference is quite significant.

Conclusion

GPT-3 is an amazing model that really changed the Natural Language Processing game. For the first time in Natural Language Processing history, it's almost impossible to say whether the generated content is coming from a human or a machine, which leads many companies to integrate GPT-3 into their product or their internal workflows.

However, sadly, GPT-3 is far from easily accessible... EleutherAI did a great job designing open-source alternatives to GPT-3. GPT-J is the best of these alternatives as of this writing.

If you want to use GPT-J, don't hesitate to have a try on the NLP Cloud API (try it here)!

Monday 26 June 2023

How to use GPT-3, GPT-J and GPT-NeoX, with few-shot learning

GPT-3, GPT-J and GPT-NeoX are very powerful AI models. Here we show you how to use these models effectively thanks to few-shot learning. Few-shot learning is like training/fine-tuning an AI model, simply by giving a couple of examples in your prompt.

GPT-3

GPT-3, released by OpenAI, is the most powerful AI model ever released for text understanding and text generation.

It has 175 billion parameters, which makes it extremely versatile and able to understand pretty much anything!

You can do all sorts of things with GPT-3 like chatbots, content creation, entity extraction, classification, summarization, and much more. But it takes some practice and using this model correctly is not easy.

GPT-J and GPT-NeoX

GPT-NeoX and GPT-J are both open-source Natural Language Processing models, created by EleutherAI, a collective of researchers working to open-source AI (see EleutherAI's website).

GPT-J has 6 billion parameters and GPT-NeoX has 20 billion parameters, which makes them the most advanced open-source Natural Language Processing models as of this writing. They are direct alternatives to OpenAI's proprietary GPT-3 Curie.

These models are very versatile. They can be used for almost any Natural Language Processing use case: text generation, sentiment analysis, classification, machine translation, and much more (see below). However, using them effectively sometimes takes practice. Their response time (latency) might also be longer than that of more standard Natural Language Processing models.

GPT-J and GPT-NeoX are both available on the NLP Cloud API. Below, we're showing you examples obtained using the GPT-J endpoint of NLP Cloud on GPU, with the Python client. If you want to copy/paste the examples, please don't forget to add your own API token. To install the Python client, first run the following: pip install nlpcloud.

Few-Shot Learning

Few-shot learning is about helping a machine learning model make predictions thanks to only a couple of examples. No need to train a new model here: models like GPT-3, GPT-J and GPT-NeoX are so big that they can easily adapt to many contexts without being re-trained.

Giving only a few examples to the model does help it dramatically increase its accuracy.

In Natural Language Processing, the idea is to pass these examples along with your text input. See the examples below!

Also note that, if few-shot learning is not enough, you can also fine-tune GPT-3 on OpenAI's website and GPT-J on NLP Cloud so the model is perfectly tailored to your use case.

You can easily test few-shot learning on the NLP Cloud Playground, in the text generation section. Click here to try text generation on the Playground. Then simply use one of the examples shown below in this article and see for yourself.

If you use a model that understands natural human instructions like ChatGPT or ChatDolphin, you might not have to use few-shot learning though! Read our dedicated guide about how to use ChatGPT and ChatDolphin: see the article here.

Tweet generation example on the NLP Cloud Playground

Sentiment Analysis with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:""",
    min_length=1,
    max_length=1,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

Positive

As you can see, the fact that we first give 3 properly formatted examples leads GPT-J to understand that we want to perform sentiment analysis. And its result is good.

You can help GPT-J understand the different sections by using a custom delimiter like the following: ###. We could just as well use something else, like ---, or simply a new line. Then we set "end_sequence", an NLP Cloud parameter that tells GPT-J to stop generating content after a new line + ###: end_sequence="###".

HTML code generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""description: a red button that says stop
code: <button style=color:white; background-color:red;>Stop</button>
###
description: a blue box that contains yellow circles with red borders
code: <div style=background-color: blue; padding: 20px;><div style=background-color: yellow; border: 5px solid red; border-radius: 50%; padding: 20px; width: 100px; height: 100px;>
###
description: a Headline saying Welcome to AI
code:""",
    max_length=500,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

<h1 style=color: white;>Welcome to AI</h1>

Code generation with GPT-J really is amazing. This is partly thanks to the fact that GPT-J has been trained on huge code bases.

SQL code generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Question: Fetch the companies that have less than five people in it.
Answer: SELECT COMPANY, COUNT(EMPLOYEE_ID) FROM Employee GROUP BY COMPANY HAVING COUNT(EMPLOYEE_ID) < 5;
###
Question: Show all companies along with the number of employees in each department
Answer: SELECT COMPANY, COUNT(COMPANY) FROM Employee GROUP BY COMPANY;
###
Question: Show the last record of the Employee table
Answer: SELECT * FROM Employee ORDER BY LAST_NAME DESC LIMIT 1;
###
Question: Fetch three employees from the Employee table;
Answer:""",
    max_length=100,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

SELECT * FROM Employee ORDER BY ID DESC LIMIT 3;

Automatic SQL generation works very well with GPT-J, especially due to the declarative nature of SQL, and the fact that SQL is quite a limited language with relatively few possibilities (compared to most programming languages).

Advanced Entity Extraction (NER) with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Text]: Fred is a serial entrepreneur. Co-founder and CEO of Platform.sh, he previously co-founded Commerce Guys, a leading Drupal ecommerce provider. His mission is to guarantee that as we continue on an ambitious journey to profoundly transform how cloud computing is used and perceived, we keep our feet well on the ground continuing the rapid growth we have enjoyed up until now. 
[Name]: Fred
[Position]: Co-founder and CEO
[Company]: Platform.sh
###
[Text]: Microsoft (the word being a portmanteau of "microcomputer software") was founded by Bill Gates on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. Steve Ballmer replaced Gates as CEO in 2000, and later envisioned a "devices and services" strategy.
[Name]:  Steve Ballmer
[Position]: CEO
[Company]: Microsoft
###
[Text]: Franck Riboud was born on 7 November 1955 in Lyon. He is the son of Antoine Riboud, the previous CEO, who transformed the former European glassmaker BSN Group into a leading player in the food industry. He is the CEO at Danone.
[Name]:  Franck Riboud
[Position]: CEO
[Company]: Danone
###
[Text]: David Melvin is an investment and financial services professional at CITIC CLSA with over 30 years’ experience in investment banking and private equity. He is currently a Senior Adviser of CITIC CLSA.
""",
    top_p=0,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

[Name]: David Melvin
[Position]: Senior Adviser
[Company]: CITIC CLSA

As you can see, GPT-J is very good at extracting structured data from unstructured text. It is really impressive how GPT-J solves entity extraction without any re-training needed! Usually, extracting new types of entities (like name, position, country, etc.) takes a whole new process of annotation, training, and deployment... Here, it's completely seamless.

Question Answering With GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Context: NLP Cloud was founded in 2021 when the team realized there was no easy way to reliably leverage Natural Language Processing in production.
Question: When was NLP Cloud founded?
Answer: 2021
###
Context: NLP Cloud developed their API by mid-2020 and they added many pre-trained open-source models since then.
Question: What did NLP Cloud develop?
Answer: API
###
Context: All plans can be stopped anytime. You only pay for the time you used the service. In case of a downgrade, you will get a discount on your next invoice.
Question: When can plans be stopped?
Answer: Anytime
###
Context: The main challenge with GPT-J is memory consumption. Using a GPU plan is recommended.
Question: Which plan is recommended for GPT-J?
Answer:""",
    min_length=1,
    max_length=20,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

GPU plan

Question answering works very well. It can also be achieved with other dedicated Natural Language Processing models, but maybe not with the same level of accuracy.

Tutorial Creation With GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Here is a tutorial about how to make a cake.

1. Take some flour.
2. Take some sugar.""",
    max_length=500)
print(generation["generated_text"])

Output:

Here is a tutorial how to make a cake.
        
                    1. Take some flour.
                    2. Take some sugar.
                    3. Take some butter.
                    4. Take some eggs.
                    5. Take some water.
                    6. Take some baking powder.
                    7. Take some vanilla.
                    8. Mix all together.
                    9. Bake in a pan.
                    10. Enjoy.
                    
Well, that's it. You can make this for your birthday or a party or you can even make it for your kids. They will love this.

As you can see, it's quite impressive how GPT-J automatically follows your initial formatting, and the generated content is very good too. You might create a proper cake out of this (not tried yet though).

Grammar and Spelling Correction with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I love goin to the beach.
Correction: I love going to the beach.
###
Let me hav it!
Correction: Let me have it!
###
It have too many drawbacks.
Correction: It has too many drawbacks.
###
I do not wan to go
Correction:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

I do not want to go.

Spelling and grammar corrections work as expected. If you want to be more specific about the location of the mistake in the sentence, you might want to use a dedicated model though.

Machine Translation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Hugging Face a révolutionné le NLP.
Translation: Hugging Face revolutionized NLP.
###
Cela est incroyable!
Translation: This is unbelievable!
###
Désolé je ne peux pas.
Translation: Sorry but I cannot.
###
NLP Cloud permet de deployer le NLP en production facilement.
Translation:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

NLP Cloud makes it easy to deploy NLP to production.

Machine translation usually takes dedicated models (often one per language pair). Here all languages are handled out of the box by GPT-J, which is quite impressive.

Tweet Generation with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""keyword: markets
tweet: Take feedback from nature and markets, not from people
###
keyword: children
tweet: Maybe we die so we can come back as children.
###
keyword: startups
tweet: Startups should not worry about how to put out fires, they should worry about how to start them.
###
keyword: NLP
tweet:""",
    max_length=200,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

People want a way to get the benefits of NLP without paying for it.

Here is a funny and easy way to generate short tweets following a context.

Chatbot and Conversational AI with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""This is a discussion between a [human] and a [robot]. 
The [robot] is very nice and empathetic.

[human]: Hello nice to meet you.
[robot]: Nice to meet you too.
###
[human]: How is it going today?
[robot]: Not so bad, thank you! How about you?
###
[human]: I am ok, but I am a bit sad...
[robot]: Oh? Why that?
###
[human]: I broke up with my girlfriend...
[robot]:""",
    min_length=1,
    max_length=20,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

Oh? How did that happen?

As you can see, GPT-J properly understands that you are in a conversational mode. And the very powerful thing is that, if you change the tone in your context, the responses from the model will follow the same tone (sarcasm, anger, curiosity...).

We actually wrote a dedicated blog article about how to build a chatbot with GPT-3/GPT-J, feel free to read it!

Intent Classification with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""I want to start coding tomorrow because it seems to be so fun!
Intent: start coding
###
Show me the last pictures you have please.
Intent: show pictures
###
Search all these files as fast as possible.
Intent: search files
###
Can you please teach me Chinese next week?
Intent:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

learn chinese

It is quite impressive how GPT-J can detect the intent of your sentence. It works very well even for more complex sentences. You can even ask it to format the intent differently if you want. For example, you could automatically generate a JavaScript-style function name like "learnChinese".
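For instance, here is a hypothetical variation of the prompt above where the few-shot examples use camelCase labels, so that the model returns something directly usable as a function name (the examples are illustrative, not an actual API run):

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
# Same intent classification idea, but the examples use camelCase labels so the
# generated intent can be used as a function name (illustrative sketch only).
generation = client.generation("""I want to start coding tomorrow because it seems to be so fun!
Intent: startCoding
###
Show me the last pictures you have please.
Intent: showPictures
###
Search all these files as fast as possible.
Intent: searchFiles
###
Can you please teach me Chinese next week?
Intent:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])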

Paraphrasing with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: Algeria recalled its ambassador to Paris on Saturday and closed its airspace to French military planes a day later after the French president made comments about the northern Africa country. 
[Paraphrase]: Last Saturday, the Algerian government recalled its ambassador and stopped accepting French military airplanes in its airspace. It happened one day after the French president made comments about Algeria.
###
[Original]: President Macron was quoted as saying the former French colony was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France.
[Paraphrase]: Emmanuel Macron said that the former colony was lying and angry at France. He also said that the country was ruled by a "political-military system".
###
[Original]: The diplomatic spat came days after France cut the number of visas it issues for citizens of Algeria and other North African countries.
[Paraphrase]: Diplomatic issues started appearing when France decided to stop granting visas to Algerian people and other North African people.
###
[Original]: After a war lasting 20 years, following the decision taken first by President Trump and then by President Biden to withdraw American troops, Kabul, the capital of Afghanistan, fell within a few hours to the Taliban, without resistance.
[Paraphrase]:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True,
    min_length=0,
    max_length=50)
print(generation["generated_text"])

Output:

French President Emmanuel Macron hopes the diplomatic tension with Algeria will calm down. He wants to make progress in the relations with the former French colony. President Macron says he has a good relationship with his Algerian counterpart and he is glad that they have a cordial relationship. The president was quoted as saying that Algeria was ruled by a "political-military system" with an official history that was based not on truth, but on hatred of France. France cut the number of visas it issues for citizens of Algeria and other North African countries.

GPT-J made modifications to our paragraph while keeping the main meaning, which is what paraphrasing is about. You could encourage GPT-J to return more original paraphrases by passing different examples in the input and by playing with API parameters like temperature, top_p, repetition penalty...
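As an illustration, here is a hedged sketch of how such sampling parameters could be passed with the Python client. The parameter names follow the NLP Cloud generation endpoint, but check the API documentation for the exact list and defaults, and reuse the full paraphrasing prompt from the example above in place of the shortened placeholder:

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation(
    "...the few-shot paraphrasing prompt shown above...",  # shortened here for brevity
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True,
    max_length=60,
    do_sample=True,           # enable sampling instead of greedy decoding
    temperature=0.9,          # higher temperature = more creative rewrites
    top_p=0.85,
    repetition_penalty=1.1)   # discourage copying the input verbatim
print(generation["generated_text"])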

Summarization with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Original]: America has changed dramatically during recent years. Not only has the number of graduates in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering declined, but in most of the premier American universities engineering curricula now concentrate on and encourage largely the study of engineering science.  As a result, there are declining offerings in engineering subjects dealing with infrastructure, the environment, and related issues, and greater concentration on high technology subjects, largely supporting increasingly complex scientific developments. While the latter is important, it should not be at the expense of more traditional engineering.
Rapidly developing economies such as China and India, as well as other industrial countries in Europe and Asia, continue to encourage and advance the teaching of engineering. Both China and India, respectively, graduate six and eight times as many traditional engineers as does the United States. Other industrial countries at minimum maintain their output, while America suffers an increasingly serious decline in the number of engineering graduates and a lack of well-educated engineers. 
(Source:  Excerpted from Frankel, E.G. (2008, May/June) Change in education: The cost of sacrificing fundamentals. MIT Faculty 
[Summary]: MIT Professor Emeritus Ernst G. Frankel (2008) has called for a return to a course of study that emphasizes the traditional skills of engineering, noting that the number of American engineering graduates with these skills has fallen sharply when compared to the number coming from other countries. 
###
[Original]: So how do you go about identifying your strengths and weaknesses, and analyzing the opportunities and threats that flow from them? SWOT Analysis is a useful technique that helps you to do this.
What makes SWOT especially powerful is that, with a little thought, it can help you to uncover opportunities that you would not otherwise have spotted. And by understanding your weaknesses, you can manage and eliminate threats that might otherwise hurt your ability to move forward in your role.
If you look at yourself using the SWOT framework, you can start to separate yourself from your peers, and further develop the specialized talents and abilities that you need in order to advance your career and to help you achieve your personal goals.
[Summary]: SWOT Analysis is a technique that helps you identify strengths, weakness, opportunities, and threats. Understanding and managing these factors helps you to develop the abilities you need to achieve your goals and progress in your career.
###
[Original]: Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.
Jupiter is primarily composed of hydrogen with a quarter of its mass being helium, though helium comprises only about a tenth of the number of molecules. It may also have a rocky core of heavier elements,[21] but like the other giant planets, Jupiter lacks a well-defined solid surface. Because of its rapid rotation, the planet's shape is that of an oblate spheroid (it has a slight but noticeable bulge around the equator).
[Summary]: Jupiter is the largest planet in the solar system. It is a gas giant, and is the fifth planet from the sun.
###
[Original]: For all its whizz-bang caper-gone-wrong energy, and for all its subsequent emotional troughs, this week’s Succession finale might have been the most important in its entire run. Because, unless I am very much wrong, Succession – a show about people trying to forcefully mount a succession – just had its succession. And now everything has to change.
The episode ended with Logan Roy defying his children by selling Waystar Royco to idiosyncratic Swedish tech bro Lukas Matsson. It’s an unexpected twist, like if King Lear contained a weird new beat where Lear hands the British crown to Jack Dorsey for a laugh, but it sets up a bold new future for the show. What will happen in season four? Here are some theories.
Season three of Succession picked up seconds after season two ended. It was a smart move, showing the immediate swirl of confusion that followed Kendall Roy’s decision to undo his father, and something similar could happen here. This week’s episode ended with three of the Roy siblings heartbroken and angry at their father’s grand betrayal. Perhaps season four could pick up at that precise moment, and show their efforts to reorganise their rebellion against him. This is something that Succession undoubtedly does very well – for the most part, its greatest moments have been those heart-thumping scenes where Kendall scraps for support to unseat his dad – and Jesse Armstrong has more than enough dramatic clout to centre the entire season around the battle to stop the Matsson deal dead in its tracks.
[Summary]:""",
length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True,
    min_length=20,
    max_length=200)
print(generation["generated_text"])

Output:

Season 3 of Succession ended with Logan Roy trying to sell his company to Lukas Matsson.

Text summarization is a tricky task. GPT-J is very good at it, as long as you give it the right examples. The size and the tone of the summary depend very much on the examples you created. For example, you would not create the same type of examples for a simple summary for kids as for an advanced medical summary for doctors. If the input size of GPT-J is too small for your summarization examples, you might want to fine-tune GPT-J for your summarization task.

Zero-shot text classification with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Message: When the spaceship landed on Mars, the whole humanity was excited
Topic: space
###
Message: I love playing tennis and golf. I'm practicing twice a week.
Topic: sport
###
Message: Managing a team of sales people is a tough but rewarding job.
Topic: business
###
Message: I am trying to cook chicken with tomatoes.
Topic:""",
    min_length=1,
    max_length=5,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

food

Here is an easy and powerful way to categorize a piece of text thanks to the so-called "zero-shot learning" technique, without having to declare categories in advance.

Keyword and Keyphrase Extraction with GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
Keywords: information, search, resources
###
David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23. 
Keywords: searching, missing, desert
###
I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
Keywords: document, understand, keyphrases
###
Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
Keywords:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

paragraphs, transformer, input, errors

Keyword extraction is about getting the main ideas from a piece of text. This is an interesting Natural Language Processing subfield that GPT-J can handle very well. See below for keyphrase extraction (same thing but with multiple words).

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Information Retrieval (IR) is the process of obtaining resources relevant to the information need. For instance, a search query on a web search engine can be an information need. The search engine can return web pages that represent relevant resources.
Keywords: information retrieval, search query, relevant resources
###
David Robinson has been in Arizona for the last three months searching for his 24-year-old son, Daniel Robinson, who went missing after leaving a work site in the desert in his Jeep Renegade on June 23. 
Keywords: searching son, missing after work, desert
###
I believe that using a document about a topic that the readers know quite a bit about helps you understand if the resulting keyphrases are of quality.
Keywords: document, help understand, resulting keyphrases
###
Since transformer models have a token limit, you might run into some errors when inputting large documents. In that case, you could consider splitting up your document into paragraphs and mean pooling (taking the average of) the resulting vectors.
Keywords:""",
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

large documents, paragraph, mean pooling

Same example as above, except that this time we don't want to extract single words but multi-word phrases (called keyphrases).

Product Description and Ad Generation With GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""Generate a product description out of keywords.

Keywords: shoes, women, $59
Sentence: Beautiful shoes for women at the price of $59.
###
Keywords: trousers, men, $69
Sentence: Modern trousers for men, for $69 only.
###
Keywords: gloves, winter, $19
Sentence: Amazingly hot gloves for cold winters, at $19.
###
Keywords: t-shirt, men, $39
Sentence:""",
    min_length=5,
    max_length=30,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

Extraordinary t-shirt for men, for $39 only.

It is possible to ask GPT-J to generate a product description or an ad containing specific keywords. Here we're only generating a simple sentence, but we could easily generate a whole paragraph if needed.

Blog Post Generation With GPT-J

import nlpcloud
client = nlpcloud.Client("gpt-j", "your_token", gpu=True)
generation = client.generation("""[Title]: 3 Tips to Increase the Effectiveness of Online Learning
[Blog article]: <h1>3 Tips to Increase the Effectiveness of Online Learning</h1>
<p>The hurdles associated with online learning correlate with the teacher’s inability to build a personal relationship with their students and to monitor their productivity during class.</p>
<h2>1. Creative and Effective Approach</h2>
<p>Each aspect of online teaching, from curriculum, theory, and practice, to administration and technology, should be formulated in a way that promotes productivity and the effectiveness of online learning.</p>
<h2>2. Utilize Multimedia Tools in Lectures</h2>
<p>In the 21st century, networking is crucial in every sphere of life. In most cases, a simple and functional interface is preferred for eLearning to create ease for the students as well as the teacher.</p>
<h2>3. Respond to Regular Feedback</h2>
<p>Collecting student feedback can help identify which methods increase the effectiveness of online learning, and which ones need improvement. An effective learning environment is a continuous work in progress.</p>
###
[Title]: 4 Tips for Teachers Shifting to Teaching Online 
[Blog article]: <h1>4 Tips for Teachers Shifting to Teaching Online </h1>
<p>An educator with experience in distance learning shares what he’s learned: Keep it simple, and build in as much contact as possible.</p>
<h2>1. Simplicity Is Key</h2>
<p>Every teacher knows what it’s like to explain new instructions to their students. It usually starts with a whole group walk-through, followed by an endless stream of questions from students to clarify next steps.</p>
<h2>2. Establish a Digital Home Base</h2>
<p>In the spirit of simplicity, it’s vital to have a digital home base for your students. This can be a district-provided learning management system like Canvas or Google Classrooms, or it can be a self-created class website. I recommend Google Sites as a simple, easy-to-set-up platform.</p>
<h2>3. Prioritize Longer, Student-Driven Assignments</h2>
<p>Efficiency is key when designing distance learning experiences. Planning is going to take more time and require a high level of attention to detail. You will not be able to correct mistakes on the fly or suddenly pivot when kids are disengaged.</p>
<h2>4. Individual Touchpoints Are Game-Changers</h2>
<p>You can create these touchpoints through any medium you like: emails, video messages, phone calls, messages through your learning management system, comments on shared documents, etc.</p>
###
[Title]: 3 Best Free Image Converter Software Programs
[Blog article]: <h1>3 Best Free Image Converter Software Programs</h1>
<p>Best free image converters for JPG, BMP, PSD, TIF, GIF, RAW, and more</p>
<h2>1. XnConvert</h2>
<p>XnConvert is the Swiss Army knife of image converters. With XnView, you can convert any of around 500 image formats to your choice of around 80 others. If you have a rare image format that you can't open, XnView can probably convert it.</p>
<h2>2. CoolUtils Online Image Converter</h2>
<p>The CoolUtils Online Image Converter is just that—an image converter that exists completely online, no download required. Unlike some online image converters, this one does the image converting for you in real-time—no waiting on an email link.</p>
<h2>3. FileZigZag</h2>
<p>FileZigZag is another online image converter service that will convert most common graphics formats. Just upload the original image, choose the desired output, and then wait for the download link to appear on the page.</p>
###
[Title]: 4 Best Tips to Find a Girlfriend
[Blog article]: """,
    max_length=500,
    length_no_input=True,
    end_sequence="###",
    remove_end_sequence=True,
    remove_input=True)
print(generation["generated_text"])

Output:

<h1>Text Generation API</h1>
<h2>What is Text Generation?</h2>
<p>Text generation is the process of generating text from a block of text. It is not simply the inverse of summarization: a text generator will take a short piece of text as an input and generate a longer one.</p>
<p>For example you have the following text:</p>
Ahead of the opening of the meeting,
<p>In this case the goal of the API is to automatically generate the following:</p>
Ahead of the opening of the meeting, Prime Minister Justin Trudeau spoke to reporters at the Montreal airport where he was travelling to speak at the conference.
<p>So basically we want to leverage the output of the API as the input for the text generator, and we will then be able to generate some sentences based on the input we already have.</p>
<h2>Text generation with GPT-2</h2>
<p>We will leverage GPT-2 in order to have a starting point for the text generator. In this case, we will use the API as the input for the text generator, and we will use the output of the API as the input for GPT-2.
However, we want to fine-tune GPT-2 to generate more complex sentences and to have better results.</p>

Isn't it impressive? This generated blog article is small but you can generate much longer articles. The structure of the generated blog post really depends on the structure you used in your few-shot examples. In order to get more complex structures and more relevant content, fine-tuning GPT-J is the key.

Conclusion

As you can see, few-shot learning is a great technique that helps GPT-3, GPT-J and GPT-NeoX achieve amazing things! The key here is to pass a correct context before making your request.

Even for simple text generation, it is recommended to pass as much context as possible, in order to help the model.

Hope you found it useful! If you have some questions about how to make the most of these models, please don't hesitate to ask us.

Sunday 25 June 2023

Google Cloud Natural Language API: pricing and features comparison with the NLP Cloud API

Google Cloud Natural Language is a cloud-based Natural Language Processing API that proposes several advanced Natural Language Processing models. Even though Google is an important actor on the Natural Language Processing market, it is worth carefully reviewing Google Natural Language's offer in order to understand whether it is the best solution for you. How does the Google Natural Language API compare to the NLP Cloud API in terms of pricing and features?

Pricing: Google Natural Language VS NLP Cloud

Google Natural Language consider that a request contains at most 1,000 characters. If your request contains more than 1,000 characters, it is counted as several requests. For example, if you are trying to classify a piece of text made up of 3,500 characters, it is counted as 4 requests.

The price goes from $0.0005 to $0.002 per request, depending on the feature you are using.

The first 5,000 requests are free every month, and if you use their text classification model, you get more free requests (30,000 per month).

The more requests you send, the more you're charged.

Google Natural Language API Pricing

NLP Cloud adopts a totally different pricing strategy.

NLP Cloud's pricing is flat, which means that a number of requests is included in your plan. If you want to change the number of requests, you can upgrade or downgrade your plan anytime. The interesting thing with such pricing is that it is predictable: you always know in advance how much you will be charged at the end of the month.

NLP Cloud proposes several plans, depending on the number of requests you need, no matter the Natural Language Processing model you are going to use, and no matter the number of characters contained in your request. For example, 15 requests per minute cost $29 per month on CPU servers, and $99 per month on GPU servers. See the pricing page here.

Concrete price example: text classification

Imagine that you want to classify pieces of text made up of 10,000 characters, at a rate of 15 requests per minute.

Google would consider that each request is actually equivalent to 10 requests (because each billable request can contain at most 1,000 characters). So at the end of the month you would perform 10 x 15 x 44,640 (the number of minutes in a 31-day month) = 6,696,000 billable requests.

Taking into account their first 30,000 free requests and their tiered pricing, you would eventually pay around $3,140 per month.

For the same service you would pay $29 per month on NLP Cloud. The difference is quite impressive!
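To see where these numbers come from, here is the calculation in Python. Each Google request is billed per started block of 1,000 characters, and a 31-day month contains 44,640 minutes; the final dollar figure depends on Google's tiered rates, so it is only referenced here, not recomputed:

import math

# Billable Google units for 10,000-character texts at 15 requests per minute.
units_per_request = math.ceil(10_000 / 1000)          # 10
requests_per_month = 15 * 44_640                      # 669,600 actual requests
billable_units = units_per_request * requests_per_month
print(billable_units)                                 # 6,696,000 billable requests
# With the first 30,000 free and Google's tiered per-request pricing, this lands
# around the $3,140/month quoted above, versus a $29/month flat CPU plan on NLP Cloud.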

Google Natural Language is expensive and it is quite hard to predict how much you are going to be charged at the end of the month. Take time to do the math in advance in order to avoid surprises...

Features: Google Natural Language VS NLP Cloud

Google Natural Language develop their own in-house models, while NLP Cloud integrates the best open-source models available on the market. These are two different strategies, and both have pros and cons. Google have full control over their models, but in return their models are black boxes: we don't know exactly what is inside.

Interesting feature: customers can fine-tune their own models on the Google Natural Language platform, and this is also the case on NLP Cloud. It is an interesting option if you think that the base models are not accurate enough and should be tailored to your needs.

Also, Google have specific models dedicated to medical data analysis, that NLP Cloud don't have, so you might find it interesting if you're in the healthcare industry.

Now let's list all the proposed Natural Language Processing features.

Here are the Natural Language Processing features supported by Google Natural Language:

  • Sentiment analysis
  • NER (entity extraction)
  • Entity sentiment analysis
  • Text classification
  • Part-of-Speech (POS) tagging

And here are the Natural Language Processing features supported by NLP Cloud as of this writing:

  • Sentiment analysis
  • NER (entity extraction)
  • Text classification
  • Part-of-Speech (POS) tagging and dependency parsing
  • Question answering
  • Text summarization
  • Text generation (with GPT-J and GPT-Neo, the open-source equivalents of GPT-3)
  • Translation (English, French, Spanish, German, Dutch, Chinese, Russian, and Arabic)
  • Language detection
  • Tokenization
  • Lemmatization

As you can see, more Natural Language Processing features are supported on NLP Cloud, and more should come soon.

Conclusion

Google Natural Language is a major actor in the Natural Language Processing market. They propose interesting features like the ability to train your own models, or models dedicated to medical vocabulary.

However their API is very expensive. For the same price, you can get at least 100 times as many requests on the NLP Cloud API.

In terms of features, NLP Cloud propose many interesting Natural Language Processing models that Google don't propose, like text summarization, question answering, text generation, translation, language detection, tokenization, lemmatization...

Last of all, Google's pricing makes it extremely hard to predict in advance how much you are going to be charged at the end of the month, which is not the case with NLP Cloud's flat pricing.

I hope this article helped you properly compare Google Natural Language and NLP Cloud!

Saturday 24 June 2023

Natural Language Processing Introduction: what is Natural Language Processing (NLP)?

You have heard of Natural Language Processing (NLP) but you don't know precisely what it is and what it is used for? In this post, I will try to help you understand Natural Language Processing with some examples.

What is Natural Language Processing (NLP)?

Natural Language Processing is a subfield of linguistics, computer science, and artificial intelligence. It is the processing of language, words, and speech by a computer.

It is about developing interactions between computers and human language, and especially about how to program computers to process and analyze large amounts of natural language data.

Don't make the mistake: Natural Language Processing is not only linguistics! Linguistics aims at understanding languages, while Natural Language Processing is about processing them with software.

Natural Language Processing is based on rules. But rules are not enough: context is also very important. When a friend tells you "What a wonderful spring!", is it about the season or about a water source? Here is another example: "I go to the bank." Is it about walking along the bank of the river or about taking money to the bank?

So Natural Language Processing needs lots of rules and dictionaries.

 What is Natural Language Processing for?

Thanks to Natural Language Processing, a machine can "understand" the contents of documents, including the contextual nuances of the language within them. A machine can also extract information and insights contained in the documents as well as categorize and organize the documents themselves.

Challenges in natural language processing frequently involve speech recognition, natural language understanding (NLU), and natural-language generation (NLG).

Why is Natural Language Processing interesting?

The world is full of unstructured data (i.e. data that is not formatted for machines): it amounts to 70-90% of digital data. Natural Language Processing is a great way to process these huge volumes of data.

"AI will power 95% of customer interactions by 2025."

Gartner

For companies, Natural Language Processing is a way to know their customers in an automated way and to create new opportunities (better knowledge, better targeting,...).

Natural Language Processing Use Cases

Here are some typical Natural Language Processing use cases:

  • Sentiment analysis: as a CEO you want to automatically measure the happiness of your customers by analyzing their messages
  • Text classification: automatically knowing what a piece of text is talking about
  • Speech recognition: converting audio into text for further processing
  • Chatbots: providing 24/7/365 customer support
  • Machine translation: reaching new markets in other languages with minimal investment
  • Autocompleting text: similar to what Gmail does. It can help increase your employees' efficiency through a centralized knowledge base, or improve how your customers interact with your product or website
  • Spell checking: making sure that documents don’t contain errors. Especially relevant in industries with high compliance requirements such as banking, insurance and finance.
  • Keyword search: locating relevant information faster across all data facilities, leading to increased efficiency
  • Advertisement matching
  • Entity extraction: extracting information from large volumes of unstructured data
  • Spam detection: automatically filtering out unwanted messages
  • Text generation: automatically creating new text, e.g. client contracts, research documents, training materials, and so on
  • Automatic summarization: adding summaries to existing documents
  • Question answering: answering specific questions based on disparate information sources
  • Image captioning: annotating images
  • Video captioning: videos can take a long time to watch. Sometimes it's faster to extract information from them in the form of text and then process it further using other Natural Language Processing techniques like summarization

Natural Language Processing is not new!

During World War 2, Alan Turing created a machine, the Bombe, to decipher the coded messages sent by the Nazis.


Later, the Georgetown-IBM experiment was an influential demonstration of machine translation, performed on January 7, 1954. Developed jointly by Georgetown University and IBM, the experiment involved the fully automatic translation of more than sixty Russian sentences into English. The system had only six grammar rules and 250 lexical items in its vocabulary.

Another interesting milestone was the ELIZA software, developed in 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum. Its most famous script, DOCTOR, simulated a psychotherapist and used rules, dictated in the script, to respond to user inputs with non-directional questions. As such, ELIZA was one of the first chatbots and one of the first programs capable of attempting the Turing test.

Conclusion

In this post, you discovered what natural language processing is and how it can be used in real life. Lots of challenges still exist, but great progress has been made in the Natural Language Processing field over the last few years. Today, the maturity of Natural Language Processing encourages more and more companies to leverage it in their product or in their internal organization.

How hailstorms could be captured and converted into useful energy, electricity, drinkable water, and more

 Capturing hailstorms for energy and water is a challenging but potentially promising idea. Hail is made up of ice, which can be melted to produce water, and the kinetic energy of the hailstones can be used to generate electricity.

There are a few different ways that hailstorms could be captured. One approach would be to use a large net to catch the hailstones. The hailstones would then be melted to produce water, and the kinetic energy of the hailstones could be used to generate electricity.

Another approach would be to use a hailstorm harvester. A hailstorm harvester is a large, cylindrical structure that is designed to collect hailstones. The hailstones would then be melted to produce water, and the kinetic energy of the hailstones could be used to generate electricity.

The challenges of capturing hailstorms for energy and water include:

  • The size and intensity of hailstorms can vary greatly, making it difficult to predict how much hail will be collected.
  • Hailstorms are often localized, meaning that they only occur in certain areas.
  • The cost of capturing and processing hailstorms can be high.

Despite the challenges, there are some potential benefits to capturing hailstorms for energy and water. These benefits include:

  • A renewable source of energy.
  • A source of drinking water in areas that are prone to drought.
  • A way to reduce the damage caused by hailstorms.

The potential benefits of capturing hailstorms for energy and water make it an area of research that is worth pursuing. However, more research is needed to develop cost-effective and efficient methods for capturing and processing hailstorms.

Here are some additional ideas for how hailstorms could be captured and converted into useful energy and water:

  • Use hailstorms to generate hydroelectric power.
  • Use hailstorms to power desalination plants.
  • Use hailstorms to generate compressed air energy storage.

These are just a few ideas, and there are many other possibilities. As research into hailstorm capture and conversion continues, it is likely that new and innovative ideas will emerge.

Friday 23 June 2023

Contextual targeting for privacy-friendly advertising thanks to a text classification Natural Language Processing API

 

 Don’t track the user, track interests

In today’s online advertising reality, effective marketing tactics rely on a variety of user tracking techniques, such as third-party cookies (and alternative storages) and device fingerprinting. But in a world of data leaks, GDPR, CCPA, and the increased data protection legislation they inspired, this approach is becoming obsolete. Safari and Firefox already have built-in protections that reduce cross-site tracking, and Chrome is working on alternatives, so the end of third-party cookies is near. Apple’s Identifier for Advertisers (IDFA) will soon be accessible only to apps with explicit consent from the user. As cross-domain tracking disappears, advertisers are returning to contextual advertising.

In this article, I show you how to implement contextual targeting based on the Text Classification API provided by NLP Cloud. The approach described here can easily be adapted to any advertising technology (ad servers, OpenRTB, etc.).

Contextual targeting

Because advertisers won’t be able to target individual users with third-party cookies, it is easy to predict that contextual advertising campaigns will rise again. This could be the only way to target user interests on a large enough scale. Contextual ads are based on the content the user is looking at right now, instead of their browser history or behavioral profile.

Contextual advertising
(picture from What Is Contextual Advertising?)

It is supposed to be more interesting for users, as they’ll see ads that match the topic of the pages they are visiting.

Give me a tag

Most ad serving technologies and ad networks support passing keywords or tags in their ad serving codes. Text is the core of the web and can be an extremely rich source of information. However, extracting context, tags and keywords from it, e.g. for advertising or recommendation purposes, can be hard and time-consuming. Even if you own a medium-sized news site, it will be difficult to extract all the relevant topics beyond the few tags allocated by the editorial team.
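For orientation, here is a rough sketch of what passing such keywords could look like on the backend. The ad server endpoint and the kw parameter below are purely hypothetical; every ad server defines its own tag syntax for key-values:

from urllib.parse import urlencode

# Hypothetical example: append the article's topics as key-values to an ad request.
# The endpoint "https://adserver.example.com/ad" and the "kw" parameter are made up
# for illustration; real ad servers each define their own key-value syntax.
def build_ad_request(placement_id, topics):
    params = {"placement": placement_id, "kw": ",".join(topics)}
    return "https://adserver.example.com/ad?" + urlencode(params)

print(build_ad_request("banner-top", ["sport", "football"]))
# https://adserver.example.com/ad?placement=banner-top&kw=sport%2Cfootball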

First attempts to automate this process have resulted in more or less hilarious screw-ups in the past:

Incorrect contextual advertising
(picture from Bad Ad Placements Funny, If Not Yours)

Text classification of articles

Fortunately, advances in Natural Language Processing allow for much more accurate matches in less time. Text classification is the assignment of categories or labels that are consistent with the content of a text.

Let’s consider an example page with articles on a variety of topics:

Ad placement

Our goal is to have ad placements display banners thematically related to the article content.

Conditions that our solution must meet:

  • 1. We choose keywords and topics that are relevant to the ad campaign.
  • 2. The system analyzes the content of the displayed article and categorizes it.
  • 3. Relevant ads are chosen and placed.

Note that advertising systems and web development are outside the scope of this article, but the general concepts remain the same regardless of the tools and technologies used.

Text Classification API

My preferred solution in such cases is to move the logic that handles text classification into a dedicated API. We have two options: create it ourselves or use a ready-made solution.

Preparing a simple text classification engine using Python and Natural Language Processing libraries is an afternoon’s work. But problems arise in terms of accuracy and of serving increased traffic: we need to somehow handle a growing user base and its clickstream.
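To illustrate the do-it-yourself route, here is a minimal sketch built with scikit-learn (TF-IDF features plus logistic regression). The tiny training set is invented for the example, and it also hints at why accuracy quickly becomes the real problem with a home-made model:

# A minimal DIY text classifier: TF-IDF features + logistic regression.
# The tiny training set below is invented for illustration only; a real
# system would need many labeled articles per topic to be accurate.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "The striker scored twice in the second half of the match",
    "The goalkeeper saved a penalty in extra time",
    "Whisk the eggs and fold them gently into the flour",
    "Simmer the sauce until it thickens, then season to taste",
]
train_labels = ["sport", "sport", "cooking", "cooking"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

article = "The goalkeeper was injured during the second half of the match"
print(model.predict([article]))        # e.g. ['sport']
print(model.predict_proba([article]))  # per-label probabilities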

If you are a website owner, you are unlikely to want to spend time tuning and evaluating machine learning models, so we will delegate as much as we can to an external solution. Note that we do not plan to send any user data here, only data belonging to the website, which makes the use of external contextual targeting tools much simpler from a user privacy perspective.

NLP Cloud is a provider of multiple APIs for text processing using machine learning models. One of these is the text classifier, which looks promising in terms of simple implementation (see docs).

NLP Cloud models

With the NLP Cloud API, you can try out which algorithm might be useful for a particular business case.

Integrate text classification with the content of the website

As the backend of the website is Python-based (Flask), we start by writing a simple client for the Natural Language Processing API:

import requests


class TextClassification:
    """Small client for the NLP Cloud text classification endpoint."""

    def __init__(self, key, base='https://api.nlpcloud.io/v1/bart-large-mnli'):
        self.base = base
        self.headers = {
            "accept": "application/json",
            "content-type": "application/json",
            "Authorization": f"Token {key}"
        }

    def get_keywords(self, text, labels):
        # Send the article text and the candidate labels to the /classification endpoint.
        url = f"{self.base}/classification"
        payload = {
            "text": text,
            "labels": labels,
            "multi_class": True
        }

        response = requests.post(url, json=payload, headers=self.headers)
        result = {}
        try:
            # Return the raw {"labels": [...], "scores": [...]} structure so that
            # the front-end code below can consume it directly.
            data = response.json()
            result = {"labels": data["labels"], "scores": data["scores"]}
        except (KeyError, ValueError):
            pass
        return result
        
tc = TextClassification(key='APIKEY')

print(
    tc.get_keywords(
        "Football is a family of team sports that involve, to varying degrees, kicking a ball to score a goal. Unqualified, the word football normally means the form of football that is the most popular where the word is used. Sports commonly called football include association football (known as soccer in some countries); gridiron football (specifically American football or Canadian football); Australian rules football; rugby football (either rugby union or rugby league); and Gaelic football.[1][2] These various forms of football share to varying extent common origins and are known as football codes.",
        ["football", "sport", "cooking", "machine learning"]
    )
)

Results:

{
    'labels': [
        'sport', 
        'football', 
        'machine learning', 
        'cooking'
    ], 

    'scores': [
        0.9651273488998413, 
        0.938549280166626, 
        0.013061746023595333, 
        0.0016104158712550998
    ]
}

Pretty good. Each label is assigned a relevance score with no effort on our part.

The plan is for the selection of banners to be done by an ad serving system (the decision will be based on the scores of the individually assigned labels). Therefore, in order not to expose the API key and to have more control over the data, we will write a simple proxy:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/get-labels', methods=['POST'])
def get_labels():
    # Forward the article text and labels to the NLP Cloud client, so the
    # API key never reaches the browser.
    try:
        return jsonify(tc.get_keywords(request.json['text'], request.json['labels']))
    except (KeyError, TypeError):
        return jsonify({})

Campaigns

Let’s assume we have 3 ad campaigns to run:

Ad placement
Insurance company (keyword: insurance)

Ad placement
Renewable energy company (keyword: renewables)

Ad placement
Hairdresser (keyword: good look)

Let’s sketch a front-end mechanism that will manage the display of the appropriate creative.

function displayAd(keyword, placement_id) {

    // Each keyword maps to its banner creative (the HTML snippets are omitted here);
    // the "false" key holds the self-promotion fallback.
    var conditions = {
        false: ' ',
        "insurance": ' ',
        "renewables": ' ',
        "good look": ' '
    }

    var banner = document.querySelector(placement_id);
    banner.innerHTML = conditions[keyword];

}

This is our adserver 🤪

Now, using fetch, we will retrieve labels for the text of an article, which we grab using its selector:

var text = document.querySelector("#article").textContent;
var labels = ["insurance", "renewables", "good look"];

var myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

var raw = JSON.stringify({"text": text, "labels": labels});

var requestOptions = {
    method: 'POST',
    headers: myHeaders,
    body: raw,
};

fetch("http://127.0.0.1:5000/get-labels", requestOptions)
    .then(response => response.json())
    .then(result => {
        // If the proxy returned no labels, fall back to self-promotion.
        if (!result.labels || result.labels.length === 0) {
            console.log("self-promote");
            displayAd(false, "#banner");
        } else {
            var scores = result['scores'];
            var resultLabels = result['labels'];

            if (Math.max(...scores) >= 0.8) {
                console.log("Ad success");
                // Pick the label with the highest relevance score.
                var indexOfMaxScore = scores.reduce((iMax, x, i, arr) => x > arr[iMax] ? i : iMax, 0);
                displayAd(resultLabels[indexOfMaxScore], "#banner");
            } else {
                displayAd(false, "#banner");
            }
        }
    })
    .catch(error => console.log('error', error));

Note that we only display the client ad if the score is at least 0.8:

Math.max(...scores) >= 0.8

Otherwise, we display self-promotion.

This is of course an arbitrary value, which can be tightened and loosened as needed.

Ad placement
News about renewable energy sources fits PV cell ads.

Ad placement
News about dangers in the home can increase the intention to buy insurance.

Ad placement
Although an ad about insurance would have been suitable for the article, it was not displayed because the required level of relevance was not reached.

The careful reader will notice that the example of the hairdresser’s banner did not appear. This is because the subject matter of the articles is focused on serious world news, where fashion issues are not addressed. To be able to implement the campaign, you need to choose a different site or rethink your keyword strategy.

Performance

We can achieve fast page loads because fetch is asynchronous. However, the ad will only show once the labels have been downloaded. For this reason, and to reduce costs, it is best to implement some form of cache in a production environment.
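A minimal sketch of such a cache inside the Flask proxy could look like this; the in-memory dictionary, the hashing scheme and the one-hour TTL are arbitrary choices made for the example (a production setup would more likely rely on Redis or a database):

import hashlib
import time

# Naive in-memory cache for classification results, keyed by a hash of the
# article text and labels. Process-local and unbounded: fine as a sketch,
# but a real deployment would use Redis or a database. The TTL is arbitrary.
_label_cache = {}
CACHE_TTL_SECONDS = 3600

def get_labels_cached(text, labels):
    key = hashlib.sha256((text + "|" + "|".join(labels)).encode("utf-8")).hexdigest()
    entry = _label_cache.get(key)
    if entry and time.time() - entry["at"] < CACHE_TTL_SECONDS:
        return entry["result"]
    result = tc.get_keywords(text, labels)  # the client defined earlier
    _label_cache[key] = {"result": result, "at": time.time()}
    return result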

An additional modification could be simply storing labels directly in the database. For infrequently updated articles, this certainly makes sense.

However, a solution based on a separate API, to which we can feed any text and get back its labels, makes it possible to use the JS code on virtually any page in near real time, even without access to the backend!

Takeaways

The biggest challenge with contextual targeting is applying it to news websites. Many topics appear in the articles posted there, including ones that match the advertiser’s industry, but the sensational, often sad overtones of these stories do not make them a good place to advertise.

The text classification API by NLP Cloud, on the other hand, does a pretty good job of tagging texts, so we might as well repeat the whole process, this time making sure to exclude texts on certain topics from having banners served on them (see the text classification API page).
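A minimal sketch of that idea, reusing the same classification client, could look like this. The list of "sensitive" labels and the 0.5 threshold are assumptions made for this example, not values recommended by NLP Cloud:

# Brand safety filter: classify the article against "sensitive" topics first,
# and only ask for campaign keywords if none of them scores too high.
# The label list and the threshold are illustrative assumptions.
SENSITIVE_LABELS = ["war", "disaster", "crime"]
SENSITIVE_THRESHOLD = 0.5

def labels_for_ads(text, campaign_labels):
    sensitive = tc.get_keywords(text, SENSITIVE_LABELS)
    scores = sensitive.get("scores", [])
    if scores and max(scores) >= SENSITIVE_THRESHOLD:
        return {}  # do not serve client banners on this article
    return tc.get_keywords(text, campaign_labels)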


Thank you for reading. I hope you enjoyed reading this as much as I enjoyed writing it for you.


How To Compare Machine Learning Algorithms in Python with scikit-learn

 It is important to compare the performance of multiple different machine learning algorithms consistently. In this post you will discover...