
Microsoft unveils a radically different… and surprisingly powerful AI model

Microsoft researchers claim to have built the most powerful 1-bit large language model (LLM), or "bitnet," to date. Called BitNet b1.58 2B4T, it is so efficient that it can apparently run on a single CPU.

All AI models are built around a set of weights and biases: numerical values that define the strength of the connections between the virtual neurons that make up the network. These values are typically stored as floating-point numbers (floats, in programming jargon), whose precision depends on the number of bits assigned to them. A weight encoded as a 16-bit float, for example, is significantly more precise than one stored in 8 bits.
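To make that tradeoff concrete, here is a toy illustration (our own, using NumPy, unrelated to the BitNet release) of how the same weight value degrades as the bit width shrinks:

```python
import numpy as np

w = np.float64(1.0494098344)        # a "full precision" weight value

print(np.float32(w))                # 32-bit float: ~1.0494099
print(np.float16(w))                # 16-bit float: ~1.0498
# Crude 8-bit quantization: map to an integer in [-127, 127] and back,
# assuming (for this toy example) that weights lie in [-1.1, 1.1].
scale = 127 / 1.1
print(np.round(w * scale) / scale)  # ~1.048
```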

This can make a considerable difference during inference, the process through which an already trained AI model makes predictions from new data. The more precise the weight values, the more consistent and high-quality the model's outputs can theoretically be. But there is a downside: the more precise these parameters are, the more computing power and memory it takes to process them.
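A quick back-of-the-envelope calculation (our own, not a figure from Microsoft) shows how steeply the memory cost falls with bit width for a 2-billion-parameter model:

```python
# Approximate memory needed just to store 2 billion weights
# at different precisions (ignoring activations and overhead).
params = 2_000_000_000
for label, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("ternary", 1.58)]:
    gib = params * bits / 8 / 2**30
    print(f"{label:>7}: {gib:.2f} GiB")
# fp32: 7.45 GiB, fp16: 3.73 GiB, int8: 1.86 GiB, ternary: 0.37 GiB
```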

This forces developers into a stark choice: prioritize raw performance or efficiency? State-of-the-art LLMs, like GPT, traditionally opt for 16- or 32-bit parameters, prioritizing performance over resource consumption. But there is another category of models, called bitnets, where the number of bits assigned to each parameter is cut to the bare minimum in order to prioritize efficiency. In essence, these are compressed LLMs: instead of working with nuanced values like 1.0494098344, each weight makes do with just three possible values, -1, 0, and 1 (about 1.58 bits of information each, hence the model's name).
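As a rough illustration, here is a minimal sketch of the "absmean" ternary quantization described in the BitNet b1.58 paper: weights are scaled by their mean absolute value, then rounded and clipped to -1, 0, or 1. The function name and standalone NumPy framing are ours; the real model applies this inside a full training pipeline.

```python
import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Map a full-precision weight tensor to {-1, 0, 1} using the
    'absmean' scheme: scale by the mean absolute value, round, clip."""
    gamma = np.abs(w).mean() + eps      # per-tensor scaling factor
    return np.clip(np.round(w / gamma), -1.0, 1.0)

w = np.random.randn(4, 4).astype(np.float32)
print(absmean_ternary(w))   # every entry is -1, 0, or 1
```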

Finally, a high-performance bitnet

Traditionally, the big AI players have largely ignored bitnets, which were considered too weak compared to industry-standard models. But Microsoft now claims to have changed the situation with its BitNet b1.58 2B4T, which has approximately 2 billion ternary parameters. It seems to offer quite respectable performance, and even very impressive performance when you consider the limitations inherent to this type of model.

Certainly, it is very far from competing with OpenAI's GPT, the latest versions of which are rumored to use on the order of 1.75 trillion 16-bit parameters. But Microsoft researchers claim that it outperforms Meta's Llama 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B on several common benchmarks.

And its advantages aren't limited to raw performance. It is also reportedly faster than other models of the same size, and requires significantly less memory. In fact, it is efficient enough to run on a single CPU. That may not sound like much, but it is a huge difference compared to GPT and its peers, which rely on large numbers of GPUs performing vast amounts of work in parallel.
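To see why CPU-only inference is plausible, consider what happens to the core operation of inference, matrix multiplication, when weights are restricted to -1, 0, and 1: every multiply becomes an add, a subtract, or a skip. The sketch below is our own toy illustration, not code from Microsoft's inference stack:

```python
import numpy as np

def ternary_matvec(wq: np.ndarray, x: np.ndarray, gamma: float) -> np.ndarray:
    """Multiply a ternary weight matrix (entries in {-1, 0, 1}) by a
    vector without any weight multiplications: each output element is
    a sum and a difference of input elements, rescaled by gamma."""
    out = np.empty(wq.shape[0], dtype=x.dtype)
    for i, row in enumerate(wq):
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return gamma * out

wq = np.sign(np.random.randn(3, 8)).astype(np.int8)   # toy ternary weights
x = np.random.randn(8).astype(np.float32)
print(np.allclose(ternary_matvec(wq, x, 1.0), wq @ x, atol=1e-5))  # True
```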

A product that is not yet mature, but promising

There is, however, one drawback: compatibility. On the Hugging Face page where the model is made available, Microsoft notes that achieving the promised efficiency requires its custom, highly optimized inference framework, which currently works only with certain hardware. In other words, there are still many obstacles to the democratization of these compressed LLMs.

But it is still substantial progress, and it will be interesting to see how far companies will be able to push the performance of these small models in the future.
