🧮 AI takes on world's best math students

ALSO: OpenAI is building a GPT-powered search engine

Read time: under 4 minutes

Welcome back, Superhuman

Math prodigies, your reign of supremacy is under threat — and the challenger doesn't even need to sharpen its pencil. Find out why DeepMind’s new model is putting graphing calculators to shame.

Today’s Insights

  • AlphaProof earns silver medal at prestigious math competition

  • What makes Mistral’s Large 2 LLM unique

  • Everything else you should know today

  • 5 new AI tools to boost your productivity

  • AI-Generated Images: Inflatable Paris

NEXT IN AI

DeepMind’s new model solves Olympiad-level math problems

AI has already beaten top-rated chess, shogi, and Go players. Now, it’s coming for the world’s best mathletes. DeepMind announced that its AlphaProof model solved four of the six problems at this year’s International Mathematical Olympiad (IMO).

Each year, countries send a team of representatives to compete in the IMO — considered one of the world’s most prestigious mathematics competitions. If DeepMind’s model had been a real-life participant this year, it would have walked away with a silver medal.

Why it’s important: The IMO problems require not only a mastery of different theorems and principles but also a high level of creativity and unconventional thinking. AlphaProof’s accomplishment suggests that AI is getting better at not simply mimicking its training data but actively applying it to solve complicated logic puzzles.

Here’s how it works:

  • Most LLMs find patterns in their training data in order to generate new content

  • AlphaProof works differently: When presented with a problem, it generates multiple answers, then works backward to try to test their accuracy

  • Most of the potential solutions will fail, but the model keeps searching until one holds up (or, as with two of this year’s problems, until it runs out of options)

  • The model got a boost from AlphaGeometry 2, an LLM that specializes in shapes
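
For the curious, the generate-then-check idea in the bullets above can be sketched in a few lines of Python. This is only a toy illustration, not DeepMind’s method: the function name `generate_and_verify`, the checker `check`, and the small equation it solves are all made up for this example.

```python
from typing import Callable, Iterable, Optional


def generate_and_verify(
    candidates: Iterable[int],
    is_correct: Callable[[int], bool],
) -> Optional[int]:
    """Return the first candidate the checker accepts, or None if all fail."""
    for candidate in candidates:
        if is_correct(candidate):
            return candidate
    return None


# Hypothetical toy problem: find an integer x with x^2 - 5x + 6 = 0.
def check(x: int) -> bool:
    return x * x - 5 * x + 6 == 0


# Propose every integer from -10 to 10 and keep the first one that passes.
answer = generate_and_verify(range(-10, 11), check)
print(answer)  # 2
```

Most guesses fail the check, and the search can come back empty if no candidate works, which is one way to picture the problems the model could not solve.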

Room for Improvement: “It’s not perfect, we didn’t solve everything,” DeepMind VP of Research Pushmeet Kohli admitted in an interview with the New York Times. For instance, the model was stumped by a pair of questions related to combinatorics — the study of counting and arranging objects. 

Speed wasn’t exactly its strong suit either. While the students had two four-and-a-half-hour sessions to solve three problems each, it took AlphaProof three days to work through one particularly challenging problem.

Still, mathematicians are shocked: AlphaProof was able to solve this year’s hardest problem — something that only five out of 609 contestants achieved. The tool could one day be used as a lab assistant to help mathematicians try out unusual hypotheses. And just as important, the new capabilities take us one step closer to AGI, the theorized point when AI surpasses human intelligence.

AI AT WORK, PRESENTED BY GUIDDE

How to make training videos in 5 minutes with AI

Need to onboard new hires, provide job training, or teach customers how to use a product?

Use AI to create instant informational videos and share them anywhere. It saves a ton of time and requires no design skills:

  1. Download Guidde, an AI-powered tool that records you doing a task and automatically creates a video with instructions

  2. Click the Guidde browser icon, and complete the task you need a guide for. Then click again to stop recording

  3. Visit the website to find the recording, along with captions and instructions automatically added

  4. Make any tweaks, or choose from over 100 voices and languages

  5. Share or embed it instantly, everywhere your team or customers are!

(You can even track views, add brand logos, and update guides over time to stay relevant.)

AI & RIVALRIES

Mistral reignites this week’s LLM rivalry with Large 2

Source: Mistral

They say good things come in threes, and Mistral’s here to prove it. The Olympic Games apparently didn’t stop the Paris-based startup from releasing the next generation of its flagship model, Large 2. That announcement came just a day after Meta unveiled Llama 3.1 — and a week after OpenAI showed off GPT-4o Mini. 

So what sets Mistral’s model apart? Large 2’s unique selling point is its ratio between size and performance. At 123 billion parameters, it’s making the case that bigger isn’t always better. Llama 3.1 405B is more than three times larger, but both models perform similarly on coding and math benchmarks. When it comes to reasoning, meanwhile, Large 2 is said to be competitive with leading models like GPT-4o and Anthropic’s Claude 3.5.
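
As a quick sanity check on the size comparison above, here is the arithmetic in Python. The parameter counts are the ones quoted in this story; the variable names are just for illustration.

```python
# Parameter counts as quoted above, in billions.
llama_params_b = 405
large2_params_b = 123

# How many times larger Llama 3.1 405B is than Large 2.
ratio = llama_params_b / large2_params_b
print(f"Llama 3.1 405B is about {ratio:.1f}x the size of Large 2")
```

That works out to roughly 3.3, which matches the "more than three times larger" claim.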

What else? Europe’s best-funded AI startup is also leaning into its model’s language skills. Large 2 is natively fluent in at least five languages and just got support for dozens more, including Arabic, Hindi, and Japanese. 

And in the debate over whether models should be closed or open source, Mistral is carving out a distinctive niche: Researchers and scientific nonprofits can access Large 2 at no cost, while for-profit companies will have to pay for it. That might be a winning strategy, especially as AI startups struggle to strike a balance between serving the public and turning a profit. 

Users can now try Large 2 for themselves by logging in to Mistral’s ChatGPT rival, Le Chat.

PRESENTED BY PIPEDRIVE

This AI writes sales emails for you (and they work)

Pipedrive AI helps 100k+ sales teams save time and boost win rates by phasing out manual processes, identifying patterns, and recommending high-potential deals and priority actions.

  • Draft quick, compelling, and personalized emails from simple prompts in just a click.

  • Summarize threads with AI to save valuable time.

Get a 30-day free trial (no credit card required) + take 20% off your first year. Start here.

AI & TECH NEWS

Everything else you need to know today

Source: OpenAI

  • Search Showdown: After months of speculation, OpenAI announced it’s testing out an AI-powered search feature called SearchGPT, a move that likely has Alphabet and Perplexity sweating.

  • No Results Found: Search engines like Bing and DuckDuckGo will no longer be able to surface new Reddit posts due to the site’s exclusive AI partnership with Google.

  • Front and Center: Microsoft’s Bing is getting a redesign that makes its AI-generated answers more prominent, while placing traditional search results in the margins of the screen.

  • Silicon Tumble: The stock market had its worst day since 2022 on Wednesday as some traders lost confidence that prominent AI companies would be able to turn their investments into revenue.

  • Maps Migrates: After more than 10 years as an iPhone app, Apple Maps can finally be accessed via a website on computer browsers.

😄 One Fun Thing: In 2021, Dutch artist Ard Gelinck edited photos of different celebrities so that it looked like they were sitting next to their younger selves. Last week, he took it to the next level by using Kling AI to turn those images into videos. The results left some viewers bewildered. “AI is fun but also really, really crazy and scary and also moving,” Gelinck said of his creation.

🧠 Brain Food: A groundbreaking study from researchers at the University of Cambridge found that feeding LLMs too much AI-generated content can lead to “model collapse,” when a model starts spewing gibberish. The results show that synthetic data alone likely isn’t enough to train today’s LLMs.

PRODUCTIVITY

5 AI Tools to Supercharge Your Productivity

✅ Hemingway Editor Plus: Fix wordy sentences, grammar issues, and more with the help of AI.

✅ FlexClip: Use AI to create clips smarter and faster — now with a background noise removal feature.

✅ Lately*: The world’s first Deep Social platform powered by Neuroscience-Driven AI™. Elevate your content from meh to WOW - faster than you can grab a coffee ☕

✅ Tern: Input your travel preferences, then receive a custom itinerary that’s perfectly suited to your tastes.

✅ Airtable Cobuilder: Instantly create an app that connects and streamlines your most critical data.

PS: Want more? Check out our Top 100 AI Tools.

* indicates a promoted tool, if any

PROMPT OF THE DAY

Friday Funday - Pair it up

Prompt: What would be a good [beverage] to drink with [meal]?

Examples: 
- What would be a good bottle of wine to serve with a rotisserie chicken dinner?
- What would be a good mocktail to drink with a crab cake?

AI-GENERATED IMAGES

Inflatable Paris

Source: @gengzibo123 on Midjourney

Midjourney Prompt: drone shot, Transparent inflatable tubes form [insert monument here, like Arc de Triomphe or Eiffel Tower], The scene exudes a surreal, whimsical vibe, purple tone blue sky with no cloud, surrealistic aesthetic, An Unreal Engine rendering in a cinematic style, purple tones, enormous tower, minimalistic aesthetic, best quality
--ar 3:4 --iw 0.5

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals with 600,000+ readers working at the world’s leading startups and enterprises. Companies like Amazon, Hubspot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.

🧞Your wish is my command

What did you think of today's email?

Your feedback helps me create better emails for you!

Thanks for reading.

Until next time!

Zain & the Superhuman AI team

p.s. If you liked this newsletter, share it with your friends and colleagues here.