ChatGPT is an advanced chatbot engine, based on the GPT-3 model from OpenAI. There are good open-source alternatives available today like GPT-J, GPT-NeoX, Bloom, and OPT. Let's investigate these alternatives.
ChatGPT: Features And Pricing
ChatGPT, released at the end of November 2022, is an advanced chatbot engine created by OpenAI. It is built on OpenAI's very popular GPT-3 family of models, first released in 2020. ChatGPT and GPT-3 were trained on huge amounts of data coming from the internet (Wikipedia, blogs, forums...).
What makes ChatGPT so powerful, compared to more traditional engines, is its size. Training such a huge AI model was very expensive, but the result is breathtaking: ChatGPT has deep knowledge of many topics. For example, ChatGPT can generate advanced scientific reports, code a whole application for a programmer, write news articles... and much more.
Fluency is also a great strength of ChatGPT. This AI model is able to talk very fluently, like a human, even in many non-English languages.
Finally, ChatGPT is very good at understanding emotions and adapting its own tone accordingly.
For the moment OpenAI is still beta testing ChatGPT and has not announced any pricing for this product. No SLAs have been announced either. So ChatGPT is definitely not a production-ready product for the moment.
Generative AI Models: How They Work
ChatGPT is derived from GPT-3, a modern generative AI model based on the Transformer architecture. The Transformer architecture is a specific type of neural network introduced by Google researchers in 2017. See more here.
Generative AI models generate text that continues a given input, often called a prompt. Depending on your input, you can tell your AI model to do various things for you. For example, you can ask your model to categorize a piece of text, extract specific entities from a piece of text, summarize large contents, paraphrase some content, answer questions... and of course act as a chatbot.
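To make this more concrete, here is a minimal Python sketch of how such an input can be assembled for a text categorization task. The function name `build_few_shot_prompt`, the `###` separator, and the sample reviews are our own assumptions for illustration; they are not part of any specific model's API. The resulting string is simply what you would send to a generative model as its input.

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Text", output_label="Category"):
    """Assemble a prompt: an instruction, a few solved examples, then the new input.

    examples is a list of (input_text, expected_output) pairs.
    """
    blocks = [instruction.strip()]
    for text, answer in examples:
        blocks.append(f"{input_label}: {text}\n{output_label}: {answer}")
    # The last block leaves the output empty so that the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    # "###" is an arbitrary separator chosen for this sketch.
    return "\n###\n".join(blocks)


prompt = build_few_shot_prompt(
    "Categorize each customer review as positive or negative.",
    [
        ("The delivery was fast and the product works great.", "positive"),
        ("It broke after two days and support never answered.", "negative"),
    ],
    "Nice design, and the battery lasts all week.",
)
print(prompt)
```

Changing the instruction, the labels, and the examples is enough to turn the same helper into an entity extraction or paraphrasing prompt.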
In order to understand how to leverage these generative AI models more deeply, we do recommend that you read our guide about how to use GPT-3 with few-shot learning: read it here.
ChatGPT is a GPT model that has been specifically fine-tuned to behave like a chatbot. In the rest of this article we are going to explore open-source alternatives to ChatGPT. In order to use them in conversational mode you will need to use either few-shot learning for conversational AI or fine-tuning. Learn more about few-shot learning for conversational AI here. Learn more about fine-tuning here.
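As an illustration of the few-shot approach in conversational mode, the sketch below rebuilds the whole dialogue as a prompt at every turn, since these models do not remember previous requests. The `Human:` and `AI:` labels, the function names, and the `max_turns` limit are assumptions made for this example, not a format required by GPT-J, GPT-NeoX, OPT, or Bloom.

```python
def build_chat_prompt(context, history, user_message, max_turns=5):
    """Turn a conversation into a single prompt for a generative model.

    history is a list of (human_message, ai_reply) pairs, oldest first.
    Only the last max_turns exchanges are kept so the prompt stays short.
    """
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [context.strip(), ""]
    for human, ai in recent:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    lines.append(f"Human: {user_message}")
    # End on an open "AI:" turn so the model writes the next reply.
    lines.append("AI:")
    return "\n".join(lines)


def extract_reply(generated_text):
    """Keep only the model's own turn.

    A generative model may keep writing and invent the next "Human:" line,
    so the text is cut at that point.
    """
    return generated_text.split("\nHuman:")[0].strip()


history = [("Hello, who are you?", "I am an AI assistant. How can I help?")]
print(build_chat_prompt(
    "This is a discussion between a human and a helpful AI.",
    history,
    "Can you recommend an open-source language model?",
))
```

After each exchange you would append the new (message, reply) pair to `history` and build the prompt again for the next turn.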
GPT-J And GPT-NeoX
GPT-J and GPT-NeoX are 2 open-source AI models created by a collective of researchers called EleutherAI, in 2021 and 2022.
GPT-J has 6 billion parameters, and GPT-NeoX 20B has 20 billion parameters. In comparison, GPT-3 from OpenAI has 175 billion parameters.
These were the first 2 open-source AI models considered to be "large language models", and both gave impressive results. They were very complex and expensive to train: the server costs needed to train these models were in the hundreds of thousands of dollars.
GPT-J in particular is a very good compromise between size and accuracy.
Use this GitHub repo to install GPT-J on your own server. To install GPT-J you will need a GPU with at least 16GB of VRAM. You can also easily use GPT-J and GPT-NeoX on the NLP Cloud API.
OPT
OPT was released by Facebook in 2022. See more here.
This is a 175 billion parameter AI model (same size as GPT-3) that also comes in smaller versions (66 billion, 30 billion, 13 billion, 6.7 billion, 2.7 billion, 1.3 billion, 350 million, and 125 million parameters). These smaller versions (trained separately rather than distilled from the largest one) try to keep decent levels of accuracy despite their smaller sizes. But of course the most impressive results can only be achieved with the 175 billion parameter version.
You can download the smaller versions and install them yourself. However, the 175 billion parameter version cannot be downloaded freely: access must be requested, and Facebook will only allow you to download and use the model for research purposes.
Bloom
Bloom is a 176 billion parameter AI model released by BigScience, a collective of researchers coordinated by the Hugging Face company. See more details here.
Bloom is interesting for 2 reasons.
First, it is a collective effort made by more than 1000 people (researchers, industry practitioners, ...) in order to prove that the open-source community is able to create large language models equivalent to GPT-3. It was trained on the Jean Zay supercomputer near Paris, and it cost around 3 million dollars to train.
Secondly, Bloom is the first truly multilingual large language model ever created. It was trained on 46 different languages, which makes it very good at generating text in multiple languages.
Bloom also comes in several smaller versions (7.1 billion, 3 billion, 1.7 billion, 1.1 billion, and 560 million parameters).
Installing the full version is a challenge though, as it requires around 350GB of GPU VRAM, which is not something one can easily afford.
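The 350GB figure can be checked with simple arithmetic. The sketch below assumes the weights are stored in half precision (2 bytes per parameter) and takes the full Bloom model as roughly 175 billion parameters. It only counts the weights and ignores activations and other runtime overhead, so real requirements are somewhat higher. The function name and the precision table are ours, for illustration.

```python
# Bytes needed to store one parameter, depending on numeric precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weights_memory_gb(num_params, dtype="fp16"):
    """Memory needed just to hold the model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Parameter counts are rounded figures, good enough for an estimate.
models = {
    "GPT-J": 6e9,
    "GPT-NeoX 20B": 20e9,
    "Bloom (full)": 175e9,
}

for name, num_params in models.items():
    print(f"{name}: about {weights_memory_gb(num_params):.0f} GB of weights in fp16")
```

The same arithmetic gives about 12 GB for GPT-J, which is consistent with the 16GB GPU recommended above once some headroom is added.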
Conclusion
ChatGPT is an amazing chatbot engine that is able to answer very advanced questions. In many fields, its answers are actually more relevant than those most humans would give.
However, ChatGPT is only a demo for the moment, and there is no information about future pricing or SLAs... It is interesting to compare ChatGPT to open-source alternatives: GPT-J, GPT-NeoX, OPT, and Bloom. New open-source AI models will no doubt be released soon, with even better accuracy.
If you want to use GPT-J and GPT-NeoX, don't hesitate to give them a try.