A new model is said to best GPT-4o

ALSO: The cost of training new models

Read time: under 4 minutes

Welcome back, Superhuman

A three-year-old AI startup has allegedly just jumped ahead of its deep-pocketed competitors. In a year already packed with AI news, Anthropic claims its new Claude 3.5 Sonnet is raising the bar yet again.

Today’s Insights

  • Anthropic’s new model outpaces competitors

  • Chart: It’s getting harder to train new models

  • 5 new AI tools to boost your productivity

  • Everything else you should know today

  • AI-Generated Images: Klimt x Star Wars

NEXT IN AI

Anthropic releases a new model that beats GPT-4o

Source: Anthropic

Who can resist an underdog story? AI startup Anthropic has an estimated 375 employees. DeepMind has more than 2,500. And while OpenAI is now valued at more than $80 billion, Anthropic is reportedly worth about a quarter of that.

So, it’s all the more impressive that the three-year-old startup just released a new model, Claude 3.5 Sonnet, that it claims can outperform its competitors. Benchmarks are notoriously unreliable — there’s still no standardized way to measure models’ accuracy and efficiency. But by all indications, Anthropic’s latest release performs better than both Gemini 1.5 Pro and the much-hyped GPT-4o across several key text- and vision-based metrics.

Who’s it for? Claude has garnered a reputation as the most literary and intellectual model on the market — the go-to choice for those working on creative or writing-intensive projects. It also caters to firms by letting them fine-tune its models for specific purposes. 

What’s with the name? The company usually releases its models in sets of three: There’s the compact, efficiency-oriented Haiku, the balanced Sonnet, and the powerhouse Opus. In this case, it’s releasing only a new Sonnet model, with Haiku and Opus presumably coming soon.

Source: Anthropic

What’s different? Most of the changes are relatively subtle: Claude is now better at math, reasoning, and code generation, among other skills. It’s also said to be 80% cheaper and twice as fast as Claude 3 Opus, its previous state-of-the-art model.

One notable addition is the new Artifacts feature, which lets you tweak text and images directly within Claude. Most chatbots are relatively minimalistic, so this could be one of the first bridges between traditional chatbots, word processors, image editors, and other types of interfaces.

What it means: Anthropic can’t relax and bask in praise for too long. While impressive, its new model is only a couple of steps ahead of its rivals, not quite a leap. AI enthusiasts are still clamoring for a platform that can truly take things to the next level, like one that can multitask or solve high-level logic problems.

PRESENTED BY GALILEO

Finally: Instant, accurate, low-cost GenAI evaluations

Why are Fortune 500 companies everywhere switching to Galileo Luna for enterprise GenAI evaluations?

  • 97% cheaper, 11x faster, and 18% more accurate than GPT-3.5

  • No ground truth data set needed

  • Customizable for your specific evaluation requirements

PROMPT OF THE DAY

Friday Funday - Time Travel

Prompt: I want you to act as my time travel guide. I will provide you with the historical period or future time I want to visit and you will suggest the best events, sights, or people to experience. Do not write explanations, simply provide suggestions and any necessary information. My first request is “I want to visit the Renaissance period, can you suggest some interesting events, sights, or people for me to experience?”

You can adapt the prompt to your specific needs.

Source: Beebom

CHART

The skyrocketing cost of training AI models

The cost of training AI platforms is doubling every nine months, with little sign of leveling off, according to a recent analysis from the research organization Epoch AI. At this rate, training new models could soon cost more than $1 billion, especially when electricity and hardware expenses are factored in. And that’s without considering employee compensation, which can account for up to half of the total cost of training a new model.

Whether this trend will continue into the next decade remains an open question. Some analysts think AI companies will soon run out of data — and that both training costs and performance gains will start to plateau as a result. Others believe that as demand for AI chips goes through the roof, they’ll get even more expensive. That would likely make it more difficult for smaller startups to keep up with their higher-profile rivals.

For its part, Microsoft is working with OpenAI on a supercomputer that will reportedly cost upwards of $100 billion — and will be made up of millions of chips. That project would cost 100 times more than what it costs to run modern-day data centers, according to tech publication The Information.

PRODUCTIVITY

5 AI Tools to Supercharge Your Productivity

  • Site Forge: Generate sitemaps, wireframes, and content for websites in minutes.

  • Warp: Use plain English to accomplish multi-step workflows with AI that’s native to your Mac terminal.

  • Summit: An AI life coach that helps you organize and track your goals, holds you accountable, and is there for you 24/7.

  • AI Signature Generator: Generate personalized and professional eSignatures, then refine them with AI.

  • Accorata: Find and track global pre-seed and seed startups at unmatched speed with this AI-powered deal sourcing platform.

PS: Want more? Check out our Top 100 AI Tools.

* indicates a promoted tool, if any

AI & TECH NEWS

Everything else you need to know today

Source: Target

  • AI Bullseye: Target is rolling out a new generative AI tool across 2,000 stores that will let employees ask questions about policies and procedures, like how to restart the retailer’s cash registers.

  • Cloned Crooners: Universal Music Group is partnering with AI startup SoundLabs to give its musicians access to a tool that lets them create AI models of their own voices for use in future recordings.

  • Bending Reality: Snapchat is working on a new AI model that will let you switch out your clothing, change your background, or turn text prompts into unique filters, effects, or characters.

  • On a Roll: Following the unlikely success of its AI-powered Ray-Ban smart glasses, Meta is building out a new department dedicated to developing new wearables.

😄 One Fun Thing: You’d have to have a heart of stone not to feel sorry for Unitree’s robot dog. A recent test video shows the four-legged machine bouncing back after repeated kicks, hits, and throws. It managed to perform a triple backflip despite the brutal battering.

🤩 Even More Fun: In a new study, DeepMind asked 20 comedians to try using chatbots to write jokes. They said the models served as valuable brainstorming companions but that the jokes they generated aren’t ready for primetime. Take, for example: “I decided to switch careers and become a pickpocket after watching a magic show. Little did I know, the only thing disappearing would be my reputation!”

AI-GENERATED IMAGES

Gustav Klimt’s Star Wars

Source: Reddit user u/Liquid-glass with Midjourney

Prompt: Create a Gustav Klimt-inspired portrait of a [insert name of thing here], capturing the essence of Klimt's golden hues, intricate patterns, and symbolic elements, while integrating the unique features and charm of the [insert name of thing here again] from Star Wars

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals, with 600,000+ readers working at the world’s leading startups and enterprises. Companies like Amazon, HubSpot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.

🧞 Your wish is my command 

What did you think of today's email?

Your feedback helps me create better emails for you!

Thanks for reading.

Until next time!

Zain & the Superhuman AI team

p.s. If you liked this newsletter, share it with your friends and colleagues here.