Reflection is a new AI model that can run on a good laptop and still beat GPT-4o in tests

Adobe Firefly AI image of a Llama looking in a mirror

(Image credit: Adobe Firefly AI image/Future)

There's a new heavyweight contender in the world of open-source AI models. Reflection 70B, developed by startup HyperWrite, is making waves with its innovative "reflection" mechanism that aims to fix a core problem with large language models — hallucination.

In early benchmarks, this souped-up version of Meta's Llama 3.1-70B Instruct architecture is already outperforming OpenAI's GPT-4o.

Reflection 70B introduces a novel approach to enhancing the reasoning capabilities and accuracy of language models. By assessing its own outputs before delivering a final response, Reflection 70B can detect errors in its reasoning and correct them on the fly. The result is a powerful open-source model that's pushing the boundaries of what's possible with AI today.

What is Reflection 70B?

Reflection 70B is a groundbreaking open-source language model developed by Matt Shumer and his team at HyperWrite. The full name is Reflection Llama 70B, as it is based on the Meta Llama architecture.

The model's name is derived from two things: its size of 70 billion parameters and its ability to "reflect" on its own outputs before providing a final answer. This reflection process is designed to enhance the model's reasoning capabilities and improve the overall accuracy of its responses.

I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI. Read on ⬇️: pic.twitter.com/kZPW1plJuo (September 5, 2024)

Reflection 70B stands out from other generative AI models due to its unique error identification and correction capabilities. The model's reflection mechanism allows it to assess the accuracy of its generated text before delivering outputs to the user. This is achieved through a technique called reflection tuning, which enables the model to detect errors in its own reasoning and correct them in real time.
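To make the draft, check, and correct idea concrete, here is a minimal Python sketch. It is an illustration only: Reflection 70B is described as learning this behavior through training, whereas the sketch wraps an external loop around any text generator. The names `answer_with_reflection`, `toy_draft`, `toy_critique` and `toy_revise` are hypothetical stand-ins, not part of HyperWrite's code.

```python
from typing import Callable, Optional


def answer_with_reflection(
    question: str,
    draft: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 2,
) -> str:
    """Draft an answer, then check and revise it before returning it."""
    answer = draft(question)
    for _ in range(max_rounds):
        problem = critique(question, answer)
        if problem is None:  # no error found, so stop reflecting
            break
        answer = revise(question, answer, problem)
    return answer


# Toy stand-ins for a real model (assumptions for illustration only).
def toy_draft(question: str) -> str:
    return "2 + 2 = 5"  # a deliberately wrong first attempt


def toy_critique(question: str, answer: str) -> Optional[str]:
    left, right = answer.split("=")
    a, b = (int(x) for x in left.split("+"))
    if a + b == int(right):
        return None
    return f"{a} + {b} is {a + b}, not {int(right)}"


def toy_revise(question: str, answer: str, problem: str) -> str:
    left, _ = answer.split("=")
    a, b = (int(x) for x in left.split("+"))
    return f"{a} + {b} = {a + b}"


result = answer_with_reflection("What is 2 + 2?", toy_draft, toy_critique, toy_revise)
print(result)  # 2 + 2 = 4
```

The loop stops as soon as the check finds no problem, so a correct first draft is returned unchanged.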

Moreover, Reflection 70B has demonstrated exceptional performance across various benchmarks, including MMLU and HumanEval, consistently outperforming models from Meta's Llama series and competing closely with top commercial models like GPT-4o. The model recorded an impressive 99.2% accuracy on the GSM8K benchmark, which evaluates math and logic skills.

About HyperWrite, the creators of Reflection

HyperWrite, the company behind Reflection 70B, is an AI writing start-up led by Matt Shumer. The company offers a Chrome extension that uses AI to automate and accelerate the writing process, providing services such as autocompletion, text generation, and sentence rephrasing.

HyperWrite has raised a total of $5.4 million in funding from 10 investors, including Madrona Venture Group and Active Capital. The company's focus on developing powerful AI writing tools has positioned it at the forefront of the AI industry.

as expected - the independent re-evaluation of Reflection 70B by @mattshumer_ and @csahil28 shows much better results then the tests on the accidentally broken weights. what a huge win to have a two people project, based on a 70B model, score 2nd in GPQA. 🔥🔥🔥 kudos guys! https://t.co/41TTvB2Ndl pic.twitter.com/kI3ShG51TD (September 8, 2024)

Going forward, the future looks bright for Reflection 70B and HyperWrite. Shumer has already revealed plans for an even larger model, Reflection 405B, set to launch in the near future. This more powerful model is expected to push the boundaries of open-source AI even further.

Additionally, HyperWrite is working on integrating the Reflection 70B model into its primary AI writing assistant product. This integration will provide users with access to the model's advanced capabilities, enhancing their writing experience and productivity.

Want to take Reflection 70B for a spin right now? Following an announcement on X, the AI model is available as a free online demo on Railway. If you have a decent gaming laptop, you can also download the model for offline use through Hugging Face.
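For a rough sense of what "offline use" demands, the back-of-the-envelope Python sketch below estimates how much memory the model's weights alone would occupy at a few common precisions. The 70 billion figure comes from the model's name; the precisions listed are assumptions, since the announcement does not say which format the download uses, and the estimate ignores the extra memory needed while the model is running.

```python
PARAMS = 70_000_000_000  # 70 billion parameters, per the model's name

# Bytes per parameter at common precisions (assumed, not stated in the article).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: int, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for precision, size in BYTES_PER_PARAM.items():
    print(f"{precision}: about {weight_memory_gb(PARAMS, size):.0f} GB")
# fp16: about 140 GB
# int8: about 70 GB
# int4: about 35 GB
```

Under these assumptions, even the most compressed option needs tens of gigabytes, so it is worth checking available memory before starting the download.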

Ritoban Mukherjee is a freelance journalist from West Bengal, India whose work on cloud storage, web hosting, and a range of other topics has been published on Tom's Guide, TechRadar, Creative Bloq, IT Pro, Gizmodo, Medium, and Mental Floss.