Evolution of large language models
Large language models, or LLMs, have been revolutionary in shaping how we integrate artificial intelligence into our daily lives. ChatGPT, for instance, was initially built on GPT-3.5, a model with 175 billion parameters trained on an extensive dataset. By 2023, this parameter count had reportedly soared to an astounding 1.7 trillion with the introduction of GPT-4. Both GPT-3.5 and GPT-4 have proven their efficacy in tackling intricate questions and tasks.
Today, users of ChatGPT typically choose between GPT-3.5 Turbo and GPT-4. GPT-3.5 Turbo is the least resource-intensive model and responds faster. GPT-4 is suited to more complex tasks, where you often get a better answer at the expense of response time and resource usage.
Challenges
The challenge with large language models lies in their demand for extensive computing power and specialized hardware. Typically, they run on clusters of graphics cards and specialized processors to meet the memory and processing requirements needed for satisfactory response times. Such hardware is usually exclusive to large data centers, which makes it both costly and resource-intensive. For consumers like us, using large language models currently means relying on internet connectivity to transmit our data to data centers for processing. For service providers, it means a continuous need to scale up or optimize the models to keep pace with consumer demand.
This is why the launch of Microsoft's Phi-3 Mini is incredibly exciting. It's a compact language model with immense potential. While large language models have traditionally been trained on massive datasets, Phi-3 Mini has been meticulously crafted with the assistance of these larger models, which were used to filter and optimize its training data. The outcome? A "small" model with just 3.8 billion parameters that outperforms models ten times its size on benchmarks, with results even approaching those of GPT-3.5. What's truly remarkable is that it's compact enough to run on local devices, including smartphones.
Acknowledging model limitations
It's important to acknowledge, however, that this model does have its limitations. It is trained primarily to understand and reason in English. Furthermore, because it was trained on a more specialized and narrower dataset, it lacks the breadth of knowledge found in larger models like GPT-4.
On the other hand, Microsoft has chosen to make this model openly available. This signifies that users have the opportunity to further train and customize it for specific tasks within their applications, even enabling local deployment. Another potential application is using this model as a proxy to alleviate traffic to larger models.
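The proxy idea can be sketched in a few lines of code. This is a hypothetical illustration, not a real API: the model names and the complexity heuristic are my own assumptions, and in practice you would replace the heuristic with something more robust (for example, letting the small model itself classify the request).

```python
# Hypothetical sketch: route requests between a small local model and a
# larger hosted one. Model names and the heuristic are illustrative only.

def estimate_complexity(prompt: str) -> float:
    """Crude heuristic: long prompts and analysis-style keywords score higher."""
    keywords = {"analyze", "compare", "derive", "prove", "refactor"}
    score = min(len(prompt) / 500, 1.0)
    if any(k in prompt.lower() for k in keywords):
        score += 0.5
    return min(score, 1.0)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Return the name of the model that should handle the prompt."""
    if estimate_complexity(prompt) >= threshold:
        return "large-model"   # expensive, hosted in a data center
    return "phi-3-mini"        # cheap, can run locally

print(route("What is the capital of France?"))  # simple question, small model
print(route("Please analyze and compare these two architectures " * 20))
```

The point of the design is that only the requests that actually need the large model reach it, so the expensive tier sees less traffic.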
With the resource requirements we see in the largest models today, I believe we'll see an increase in smaller and specialized language models. I also think the large language models will remain, but perhaps not all questions need to go through the most resource-intensive models.
Try it yourself
If you want to try Phi-3 yourself, you can test it in Azure AI Studio. If you'd rather download it and test it locally, you can find it on both Hugging Face and Ollama.
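If you go the Ollama route, a typical session looks something like the sketch below. This assumes you have Ollama installed and that the model is published under the name "phi3" in the Ollama library, which was the case at launch.

```shell
# Download the Phi-3 Mini weights (model name "phi3" assumed)
ollama pull phi3

# Start an interactive chat session with the model
ollama run phi3
```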
Don't hesitate to reach out to us at twoday for further information on utilizing AI safely and efficiently in your day-to-day tasks.