Mathematical Reasoning and Modeling

AI Models Still Struggle With Reasoning — And Here’s Why

Forbes contributors publish independent expert analyses and insights. I write about the economics of AI. What looks like intelligence in AI models may just be memorization. A closer look at benchmarks ...

Nous Research just released Nomos 1, an open-source AI that ranks second on the notoriously brutal Putnam math exam

Nous Research's open-source Nomos 1 AI model, which has just 30 billion parameters, scored 87/120 on the notoriously difficult Putnam math competition, ranking second among 4,000 human contestants.

Tech Xplore on MSN

AI agents debate their way to improved mathematical reasoning

Large language models (LLMs), artificial intelligence (AI) systems that can process and generate texts in various languages, ...

Geeky Gadgets

Google DeepMind AlphaProof AI solves advanced reasoning problems in mathematics

At the heart of this breakthrough lies AlphaProof, a sophisticated formal reasoning AI model developed by the brilliant minds at Google DeepMind. This innovative system has demonstrated an ...

How 2025 Recalibrated AI Models Race

In 2025, large language models moved beyond benchmarks to efficiency, reliability, and integration, reshaping how AI is ...

SiliconANGLE

Google DeepMind unveils AI models for solving advanced mathematical problems

Google DeepMind, Google LLC’s artificial intelligence research unit, today unveiled two new AI models that are capable of advanced mathematical reasoning for solving complex math problems, which ...

ExtremeTech

Microsoft Unveils Phi-4: New AI Model for Mathematical Reasoning

Phi-4 will compete with other small models such as GPT-4o mini, Gemini 2.0 Flash, and Claude 3.5 Haiku.

Mashable

Researchers created an AI reasoning model on par with OpenAI's o1 for less than $50

The floodgates have opened for building AI reasoning models on the cheap. Researchers at Stanford and the University of Washington have developed a model that performs comparably to OpenAI o1 and ...

Why complex reasoning models could make misbehaving AI easier to catch

In a new paper from OpenAI, the company proposes a framework for analyzing AI systems' chain-of-thought reasoning to understand how, when, and why they misbehave.

Geeky Gadgets

DeepSeek-R1 vs OpenAI o1: AI Reasoning Performance Comparison

DeepSeek, a Chinese company, has introduced its DeepSeek-R1 model, attracting attention for its potential to rival OpenAI’s latest offerings. Reportedly outperforming OpenAI’s o1 Preview in benchmarks ...
