
Will 2025 be the year of the small AI model?

When AI broke out of science fiction and into our everyday lives, it was Large Language Models (LLMs) that led the way. While LLMs may have gotten all the attention, small AI models are emerging as the next “big” thing. Big tech is still pumping billions into training and running large models, while some startups are embracing the more affordable approach of small models. Venture capital firms are taking notice: Arcee.AI, which develops and applies these compact models, raised $24 million in Series A financing in July of this year.

LLMs are trained on large general data sets. They know a little about a lot of things. Small models are less expensive to build out and operate and are more easily adapted to specialized applications. Rather than try to do everything, small models are personalized to perform a more limited set of day-to-day tasks for a particular business need. For example, Arcee.AI trained a small model that can answer tax questions for Thomson Reuters and built a career coach chatbot for Guild, an upskilling company.

Why are small AI models gathering steam?

Small models have several advantages that may give them an edge over large models in the coming years. Their key advantage is in the name: their size. LLMs have billions of parameters and require massive amounts of data and computational power to train and run. Small models, on the other hand, can be trained effectively with less data and require much less computing power (and therefore energy) to run.
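The scale of that compute gap can be sketched with the common rule of thumb from the scaling-law literature that training a dense transformer takes roughly 6 × parameters × training tokens floating-point operations. The model sizes and token counts below are illustrative assumptions, not figures for any specific model:

```python
# Rough training-compute comparison using the "6 * N * D" rule of thumb
# (training FLOPs ~ 6 x parameter count x training tokens).
# All sizes below are illustrative assumptions.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

large = training_flops(params=70e9, tokens=2e12)  # a hypothetical 70B-parameter LLM
small = training_flops(params=1e9, tokens=50e9)   # a hypothetical 1B-parameter small model

print(f"large model: {large:.2e} FLOPs")
print(f"small model: {small:.2e} FLOPs")
print(f"ratio: {round(large / small)}x")  # thousands of times more compute
```

Even with generous assumptions for the small model, the gap is several orders of magnitude, which is why the cost, hardware, and energy arguments below follow almost directly from parameter count.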

Ease and Cost of Deployment

Their size makes small models more suitable for deployment on edge devices or in resource-constrained environments. For example, a small business could run its own custom-trained model locally on its own hardware. In some instances, that hardware could be portable devices, such as laptops, tablets, or even smart phones.

Small models are focused on a specific use case and are typically faster to train than LLMs, which can take weeks or even months to train even on large clusters of GPUs. Small models can be trained in hours or days, allowing for faster iteration and improvement of specialized applications. They also require less training data, making them more suitable for applications where high-quality training data is scarce. For example, a business may want to train a model to output documents in its preferred format, meaning its training data is limited to its own internal documents. With a small model, this may not be a problem.

Adaptability

Adapting an LLM may require significant changes to the model's architecture. Small models, by contrast, have far fewer parameters, so they can often be adapted to a different task by fine-tuning only a small subset of them. Small models can be developed using transfer learning techniques, which enable them to leverage pre-trained knowledge and adapt it to new tasks and applications. This flexibility makes small models more appealing in situations where requirements change rapidly or the problem domain is complex and evolving, for example, in fast-paced small business environments and startup operations.

Interpretability

One disadvantage of LLMs is that it can be difficult or impossible to determine how a model arrived at its answer. Small AI models, however, are often more interpretable than large LLMs, making it easier to understand how they reach their decisions. This can be a critical distinction in fields like law, where understanding how an answer was reached may be essential.

What are the risks?

As industries continue to explore the potential of small AI models, understanding how to protect intellectual property in this rapidly evolving space becomes critical. Small AI models offer unique advantages for businesses, but they also raise new challenges for IP protection. If a business provides training data, will it remain confidential to the business? Does incorporating it into a customized small model affect ownership of that data? Who owns the newly trained model, and can it be relicensed to others, either by the company that produced the model or the company that commissioned its production? Looking to small models may be a smart business move, but those who do so must be prepared to navigate the legal complexities involved to safeguard their businesses.

“You can be more narrow and focused with your model when it’s a smaller model and really zone in on the task and use case,” McQuade said, “as opposed to having a model that can do everything and anything you need to do.”

Tags

small models, ai