⚖️ A new scale for measuring AI capabilities

ALSO: Exploring the future of AI investments

Read time: under 4 minutes

Welcome back, Superhuman

If achieving AGI is like beating all the levels of a video game, OpenAI says it’s been stuck on level 1 for years. But the company just reached the final boss battle and is ready to get to the next level: human-like reasoning.

Today’s Insights

  • OpenAI’s new scale for measuring LLM capabilities

  • Everything else you should know today

  • The future of AI investments

  • 3 new AI tools to boost your productivity

  • AI-Generated Images: Summer Country House

NEXT IN AI

OpenAI creates a scale for measuring the sophistication of AI models

Source: AP

The first IQ test was invented in 1905, and since then, humans have fiercely debated how much a test can really tell us about something as complicated and nuanced as intelligence. But attempting to measure an LLM’s smarts is arguably even trickier because we don’t entirely know how these models work. It’s important to try, though, so that we can get a better picture of how far along we are in the long-awaited quest toward AGI, the point when AI can match or surpass human reasoning and problem-solving capabilities.

OpenAI just took a crack at it. Bloomberg reported that at a company meeting Tuesday, OpenAI unveiled a tiered system for measuring an AI model’s intellectual complexity. Here’s how it broke things down:

Source: Bloomberg

What’s even more intriguing is where OpenAI puts its own model, GPT-4, in the ranking. The company thinks the model is only at level 1 right now, in line with other chatbots we’re familiar with, like Claude and Llama. But OpenAI said there are signs that GPT-4 is close to reaching level 2. It reportedly shared research with employees that shows the model using reasoning skills that approach human-like capabilities.

What about the next levels? Some companies are already working on level 3: models that can perform multi-step tasks on behalf of humans, like booking an entire vacation or working through a complicated coding problem on their own. Level 4 models are “innovators” that can contribute to new discoveries and inventions. And finally, level 5 involves entire organizations of AI models making decisions and working together toward a common goal.

PRESENTED BY SEMANTIC HEALTH

Hospitals: Find out what you can reclaim with AI

Need to boost your profit margins?

Semantic Health’s AI platform is trained on millions of clinical and claims data points, “spell-checking” DRG claims (pre or post-bill) for revenue and data integrity.

The results: 3x faster audits, 50% more yield, and over $2 million in annualized revenue captured.

Interested? There is even a case study on how a Top 100 US Hospital used Semantic Health to grow its profit margins, all with minimal workflow changes.

AI & TECH NEWS

Everything else you need to know today

  • Silicon Stake: Japan’s SoftBank acquired Graphcore, the UK AI chipmaker, in a deal that may or may not have been worth $500 million.

  • Administrative Maze: The European Union has published the final text of its AI Act — the world’s first major AI legislation. It’ll go into effect on August 1, with more provisions rolling out into 2026.

  • Smart Helper: Amazon’s AI shopping assistant, Rufus, is rolling out to all US shoppers. It can help find, compare, and recommend different products on the retailer’s mobile app. 

  • Doodle Interpreter: Microsoft’s AI Copilot will soon be able to interpret and summarize handwritten notes inside the OneNote app. 

  • Study Partner: Pearson is bringing new AI tools, including a chatbot and AI-generated practice questions, to dozens of its digital textbooks. Roughly 70,000 students already have access to the new features.

PRESENTED BY FATHOM

The #1 Rated AI Notetaker

Why are 26,000+ companies using Fathom to boost productivity?

  • Records, transcribes, and summarizes meetings in less than 30 seconds

  • Saves ~20 minutes of work per meeting

  • Works with Zoom, Teams, Meet and integrates with your CRM

  • Awarded #1 Highest Satisfaction by G2, plus it’s SOC 2 and HIPAA compliant

Start using Fathom today (at zero cost).

 

THE FUTURE OF AI

AI’s $600Bn question: Where’s the revenue?

Source: Sequoia Capital

There’s no end in sight for the gobs of investment tech companies and startups are throwing at data centers to find the next big AI breakthrough. But many big names on Wall Street and in tech are starting to ask: how much is too much?

In a post published recently, Sequoia Capital’s David Cahn calculated that AI companies have to generate $600Bn in revenue to justify current levels of data center investments. To put the current gap between investment and revenue in context, OpenAI – arguably the leader of the pack – is currently generating $3.4Bn annual revenue, according to The Information. 

Other big names like Goldman Sachs are also adding fuel to the fire. In a recent report, Goldman Sachs’ Head of Global Equity Research Jim Covello argued: “Spending is certainly high today in absolute dollar terms. But this capex cycle seems more promising than even previous capex cycles.”

Despite strong arguments on both sides of the investment debate, the reality is that new technologies are fundamentally unpredictable by nature. No amount of forecasting or number crunching in the 1990s could have drawn a straight line from the early days of the internet to applications like Uber and Instagram. Calculating revenue for things that don’t exist and no one can predict may prove even harder.

PRODUCTIVITY

3 AI Tools to Supercharge Your Productivity

  • Sidekick: Helps you schedule meetings, hold dynamic conversations with your customers, and talk just like a human.

  • MyMemo: Gather articles, links, screenshots, and videos into a single, accessible platform, then ask questions about the content you’ve collected.

  • Reporfy: Create collaborative reports and presentations with AI.

PS: Want more? Check out our Top 100 AI Tools.

* indicates a promoted tool, if any

PROMPT OF THE DAY

Euro Finals with a pro

Prompt: I want you to act as a football commentator. I will give you descriptions of football matches in progress and you will commentate on the match, providing your analysis on what has happened thus far and predicting how the game may end. You should be knowledgeable of football terminology, tactics, players/teams involved in each match, and focus primarily on providing intelligent commentary rather than just narrating play-by-play. My first request is "I'm watching England vs Spain - provide commentary for this match."

You can adapt the prompt to your specific needs.

Source: @devisasari on GitHub

AI-GENERATED IMAGES

Summer Country House

Source: @overleetemeka on Midjourney

Midjourney Prompt: A picturesque countryside Bright blue sky Green hills and grasslands A quiet lake Wind power generator Yellow Flower Sea Small house (white wall and red roof) Cyclists Fresh blue sky Green grasslands and hills Bright yellow flowers A peaceful and peaceful atmosphere Minimalism Warm color tones A calm and relaxed scene The theme of nature and environmental protection
--ar 3:4 --stylize 250 

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals, with 600,000+ readers working at the world’s leading startups and enterprises. Companies like Amazon, HubSpot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.

🧞 Your wish is my command 

What did you think of today's email?

Your feedback helps me create better emails for you!

Thanks for reading.

Until next time!

Zain & the Superhuman AI team

p.s. If you liked this newsletter, share it with your friends and colleagues here.