Native image gen unlocks these five new capabilities

ALSO: How to do research on any topic using Grok 3

Read time: under 4 minutes

Welcome back, Superhuman. Many of us had yet another “feel the AGI” moment this week after trying out the new version of ChatGPT, which now features native image generation. Today, we’ll dive into five use cases that the update unlocks.

Today’s Insights

  • Ideogram 3.0, ChatGPT meets MCP, and Alibaba’s latest release

  • Native image generation unlocks these five new capabilities

  • Tutorial: How to do research on any topic using Grok 3

  • 5 new AI tools to boost your productivity

  • News, memes, what’s trending on socials, and more

TODAY IN AI

Ideogram 3.0 unlocks a new level of realism. Source: Ideogram

1. Ideogram 3.0 jumps to the top of major image benchmarks: A day after OpenAI announced native image generation capabilities, the company is already facing some stiff competition. Ideogram, a Toronto-based startup with backing from a16z, just released Ideogram 3.0, which it claims human testers prefer over rivals like OpenAI’s DALL-E 3 and Google’s Imagen 3. With over 4.3B style presets, it’s especially good at photorealism, text generation, and graphic design. You can try it here.

2. ChatGPT gets third-party support: OpenAI CEO Sam Altman announced the startup will now support the Model Context Protocol (MCP), meaning you’ll be able to create a custom server that lets ChatGPT interact with third-party apps. Plus, The Information reports that the company could soon invest billions in five exabytes (that’s 5 billion gigabytes) of data storage, giving its researchers eight times more compute than last year as it looks to triple its revenue to $12.7B.

3. Alibaba’s latest model can do it all: The Chinese e-commerce giant just unveiled Qwen2.5 Omni, which can work seamlessly across text, audio, video, and images. As a fully multimodal model, it can analyze uploaded videos, detect objects in a photo, or turn text documents into speech, features that are usually only available with pricey, closed-source LLMs. You can watch a demo of the new model here or try out its audio and video features here.

PRESENTED BY SPINACH AI

Record your conversations and let AI agents do the rest

Spinach AI is not only the most accurate AI notetaker in the market, it also runs agents to automate post-meeting tasks.

  • Records, transcribes, and summarizes virtual, hybrid, and in-person meetings

  • Creates tasks and tickets in your tools: Jira, Linear, Asana, Monday, Trello, or ClickUp

  • Updates notes and action items in your CRM: HubSpot, Salesforce, Zoho, or Attio

  • Writes recap emails and generates documents based on your preferences

Backed by Y Combinator, Zoom & Atlassian, trusted by over 200,000 professionals.

FROM THE FRONTIER

Native image generation unlocks these five new capabilities

GPT-4o now gives you a much deeper level of control over your creations. Source: OpenAI

GPT-4o now understands text and images in equal measure, giving you a much deeper level of control over your generations. That means common problems like glitched-out hands and typo-filled text might soon be a thing of the past. But more importantly, it also unlocks many new capabilities that just weren’t possible before.

Here are five examples of what’s possible: 

  • Home redecoration: Upload a photo of your room, then ask ChatGPT to restyle it. You can even combine elements from multiple photos — for example, bringing the hardwood floors from one photo into another.

  • Intuitive editing: OpenAI’s Patrik Goethe gave Da Vinci’s “The Last Supper” a Lego filter, with tech CEOs as its subjects. Another user transformed his headshot into the styles of Studio Ghibli, Wallace and Gromit, Rick and Morty, and Attack on Titan.

  • Vibe marketing: Upload a product photo, then create an entire ad campaign around it. Or, use already-existing ads as inspiration for the new ones.

  • Complex diagrams: Break down complicated concepts into visuals and charts, or even comic strips and memes.

  • UI mockups: Experiment with different user interface ideas for apps, webpages, and social media platforms.

THE AI ACADEMY

How to do research on any topic using Grok 3

  • Go to Grok and sign in with your account.

  • Turn on ‘DeepSearch’, enter your prompt, and press Enter.

Sample Prompt: Analyze the current trends of [enter your topic]. What niches are most profitable, and what trends are shaping their future?

  • Wait a few seconds and you’ll get a thorough research report on your topic, complete with references.

  • Use the findings to adjust your strategy, target profitable niches, and stay ahead of emerging trends in your industry.
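If you plan to research several topics, it can help to fill in the sample prompt programmatically before pasting it into Grok. The snippet below is a minimal sketch, not part of Grok itself: the function name build_research_prompt is hypothetical, and it only does plain text substitution on the [enter your topic] placeholder from the sample prompt above.

```python
# The sample prompt from the tutorial, with its placeholder left intact.
SAMPLE_PROMPT = (
    "Analyze the current trends of [enter your topic]. "
    "What niches are most profitable, and what trends are shaping their future?"
)

PLACEHOLDER = "[enter your topic]"


def build_research_prompt(topic: str) -> str:
    """Return the sample prompt with the placeholder replaced by `topic`.

    Hypothetical helper for illustration; it does not call any Grok API.
    """
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    return SAMPLE_PROMPT.replace(PLACEHOLDER, topic)


if __name__ == "__main__":
    # Print one ready-to-paste prompt per topic.
    for topic in ["AI note-taking apps", "home solar panels"]:
        print(build_research_prompt(topic))
```

Copy each printed prompt into Grok with the search mode from the steps above turned on.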

PRESENTED BY INNOVATING WITH AI

Want to become an AI Consultant?

Innovating with AI just welcomed 200 new students into The AI Consultancy Project, their new program that trains you to build a business as an AI consultant:

  • Tools, frameworks, and a 6-month plan to build a 6-figure AI consulting business

AI & TECH NEWS

Everything else you need to know today

Amazon is testing out a new AI shopping assistant. Source: Amazon

🛍️ Secret Shopper: Amazon is testing out a new AI shopping assistant that gives you custom product recommendations while taking things like your budget and style preferences into account.

👀 Server Scoop: According to The Information, Nvidia is in talks to acquire Lepton AI, a startup that leases servers equipped with Nvidia's AI chips, in a deal reportedly valued at several hundred million dollars.

🏃 Race Against Time: The Trump administration blacklisted dozens more Chinese companies as the US tries to cut off the country’s rapid AI progress. The founder of a top Shanghai-based AI startup thinks China now trails the US by only three months.

🧑‍💻 Talk it Out: Popular AI app builder Bolt unveiled a new Discussion Mode that lets you “brainstorm, plan, and debug your project” without any coding experience.

🤝 Agent Alliance: In a “landmark deal” worth $100M, Databricks will now let its 10,000 clients build custom AI agents powered by Anthropic’s Claude 3.7 Sonnet, as both companies race to bring in more revenue.

PRODUCTIVITY

5 AI Tools to Supercharge Your Productivity

  • Gemini 2.0 Flash: Create and edit images using simple conversations with new native image generation.

  • Buzz Clip: Create viral AI TikToks and UGC ads in less than a minute.

  • Datadog*: Start monitoring your entire stack in one real-time observability platform and gain comprehensive visibility in minutes. Try it for yourself (at no cost).

  • ClipZap: Clip, edit, and translate videos automatically with AI.

  • Quick Mock: Turn any job description into a mock interview instantly with AI.

* indicates a promoted tool, if any

PROMPT OF THE DAY

Creative Writing Hooks

Prompt: I'm writing a piece on [topic] and I want the introduction to instantly grab attention. Give me 5 different types of creative writing hooks I can use—such as a bold statement, a surprising fact, a relatable scenario, a rhetorical question, or a short story. Each hook should draw the reader in emotionally or intellectually, making them want to keep reading. Tailor the tone to be [e.g., witty, dramatic, inspiring, casual] and ensure the hook connects smoothly to the main idea of the piece.

SOCIAL SIGNALS

What’s trending on socials today

🖥️ So Meta: Scale AI’s Riley Goodside prompted GPT-4o to create a “self-referential” screenshot of a Wikipedia page called “The Screenshot,” which describes itself in an endless loop.

🐎 Dark Horse: AI educator Angry Tom shared 10 impressive examples of Reve AI’s new image generator, which quickly jumped to the top of multiple leaderboards this past weekend.

📱 Proof of Concept: Entrepreneur Matthew Berman compiled some of the best “one-shot” demos and simulations made using Gemini 2.5 Pro’s upgraded coding capabilities, including a Rubik’s Cube solver and a Lego-building sim.

AI-GENERATED IMAGES

Infographics

ChatGPT Prompt: Create a detailed infographic on [enter your topic]

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals, with 1M+ readers and 2M+ followers on socials working at the world’s leading startups and enterprises. Companies like Amazon, HubSpot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.

🧞 Your wish is my command

What did you think of today's email?

Your feedback helps me create better emails for you!

Got more feedback or just want to get in touch? Reply to this email and we’ll get back to you.

Thanks for reading.

Until next time!

Zain & the Superhuman AI team