🖼️ Gemini goes multimodal
ALSO: Learn how to use ChatGPT Deep Research from scratch

Read time: under 4 minutes
Welcome back, Superhuman. Alphabet is already back for more. Hot on the heels of Gemma 3, it just became the first company to hand out access to a fully multimodal image generator. Keep reading for some examples of what you can create with the powerful new feature.
Today's Insights
Efficiency-focused AI, realistic ads, and hybrid models
Gemini gets a major overhaul
Tutorial: Learn to use ChatGPT's Deep Research from scratch
5 new AI tools to boost your productivity
News, memes, what's trending on socials, and more
TODAY IN AI

Captions' new tool lets you generate realistic influencer ads. Source: Captions
1. Cohere drops ultra-efficient model for businesses: Google's Gemma 3 is only a few days old, and there's already a new LLM giving it a run for its money. Cohere just launched Command A, which can run on only two Nvidia chips while being about as powerful as DeepSeek and OpenAI's latest offerings. The Toronto-based startup says the model is great for smaller businesses that don't have dozens of pricey GPUs at their fingertips.
2. New platform generates ads that look eerily realistic: NYC's Captions is rolling out a new feature called Mirage that lets you create marketing campaigns with AI-generated influencers. You can either write a script or upload audio and have the avatar automatically sync up with it. Natural-looking body language and "micro-expressions" also make each AI influencer look far more convincing than what we've seen from similar tools. You can try it out here.
3. Boundary-pusher Nous Research drops two-in-one LLM: Known for its censorship-free models, Nous unveiled a preview version of DeepHermes, in 3B and 24B sizes. It lets you toggle between chain-of-thought and quick thinking, becoming "one of the first models in the world" to fuse both modes into a single interface. It'll also share its entire thought process, unlike most closed-source rivals. (Here's a link to try it out via the collective's Discord channel.)
PRESENTED BY GOOGLE
2025 will be "the most defining year" for startups. Learn why

Gain the upper hand on the year ahead with Google Cloud's Future of AI: Perspectives for Startups report.
It explains key AI predictions from 20+ industry experts from Google Cloud, Social Capital, and more:
Where startups should focus resources to beat the competition
Hidden opportunities to create immediate value
Steps to scale AI products efficiently
FROM THE FRONTIER
Gemini brings multimodal image generation to the masses

Alphabet is giving away more Gemini features at no cost. Source: Google
In an industry first, Alphabet just gave everyone access to Gemini's native image generation capabilities on AI Studio.
What makes it unique? Usually, your image prompts get fed through an LLM, and in the process, details can get lost in translation. Gemini 2.0 Flash, on the other hand, can move between different mediums on the fly, delivering much better speed and accuracy.
What can you do with it? You can generate a recipe and add photos for each step. Or embed entire words or sentences into your images. You can also quickly edit specific parts of an image without having to regenerate an entirely new one.
Here are some of our favorite examples:
Transforming the cover of a magazine, and swapping out its price tag
Quickly adding chocolate drizzle to a pile of croissants
Creating video game characters, then dropping them into a 3D world
Applying the style of one image to another
Gemini's also getting a major upgrade: Now, anyone can try out the platform's Deep Research feature, not just paid subscribers. Alphabet is also opening up access to Gems: customizable presets "like a translator, meal planner, or math coach." Finally, Gemini can now integrate into your search history and apps for a more personalized experience.
THE AI ACADEMY
Learn how to use ChatGPT Deep Research from scratch
ChatGPT's Deep Research is the best AI product I've used this year, and it might be the first one that surpasses humans in research ability. Watch the full tutorial here.
Go to ChatGPT and sign up. (Upgrade to Pro or Plus for Deep Research.)
Use the o3-mini-high or o3-mini reasoning models and select Deep Research.
Type a detailed, specific prompt with clear goals.
Answer any follow-up questions and wait 5-60 minutes for the generated report.
Review the report, including its sources and summary, and ask follow-up questions for further details.
Note: Deep Research was opened to ChatGPT Plus users shortly after I recorded this video, so you no longer need the Pro plan to use it.
PRESENTED BY VANTA
Is Compliance Holding You Back? Vanta Can Help

Navigating new compliance requirements can be a daunting task, but with the right automation tools, it doesn't have to be.
Whether you're a fast-growing startup or an established security team, Vanta can help you achieve continuous compliance (and more).
Join the live demo on April 3 to learn how Vanta can help you automate compliance for frameworks like SOC 2, ISO 27001, HIPAA, HITRUST, and ISO 42001 and build customer trust.
AI & TECH NEWS
Everything else you need to know today

Snap just introduced three new AI video lenses. Source: Snap
Open Sesame: Sesame just released the AI model powering its viral Maya assistant under an Apache 2.0 license, making it super easy to clone voices in under a minute, though some are worried about the lack of safeguards to prevent misuse.
Polyglot Pros: According to Bloomberg's Mark Gurman, Apple's working on adding a real-time live translation feature to AirPods, which could roll out later this year alongside iOS 19.
All in One: Alibaba announced its consumer chatbot will now be powered by its frontier Qwen models, featuring deep thinking, search, and other agentic capabilities, which can all be accessed from a single app.
Lens Leap: Social media platform Snap introduced new generative video "lenses" that let you do things like animate animals and add dynamic objects into your shot.
On the Case: The creator of "Law & Order" is launching an AI-generated murder mystery game, which will be updated with new mysteries on a daily basis.
PRODUCTIVITY
5 AI Tools to Supercharge Your Productivity
Duck AI: Anonymous access to popular AI models like GPT-4o mini and Claude 3.
Quadratic: Chat with your data, connect databases, and visualize results in a code-friendly all-in-one tool.
Innovating with AI*: Just welcomed 200 new students into The AI Consultancy Project, their new program that trains you to build a business as an AI consultant. Request early access now!
Greta: Ship full-stack applications within seconds without writing a single line of code.
Whisper V3: Transcribe long-form YouTube videos with the click of a button.
* indicates a promoted tool, if any
PROMPT OF THE DAY
Improve Focus
Prompt: I need suggestions for techniques that can help improve focus and productivity during work hours. The techniques should be practical, easy to implement, and effective in helping individuals stay focused and avoid distractions while working.
Your task is to provide a list of proven techniques that can help improve focus, concentration, and productivity during work hours.
Work environment: [work environment]
Typical distractions: [typical distractions]
Source: promptadvanceclub
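If you reuse this prompt often, the bracketed placeholders can be filled in programmatically. Here is a minimal Python sketch of one way to do that; the `fill_prompt` helper, the shortened template text, and the example values are our own illustrative assumptions, not part of the original prompt.

```python
import re

# Shortened version of the prompt above; the bracketed names are the
# placeholders that get replaced.
PROMPT_TEMPLATE = (
    "I need suggestions for techniques that can help improve focus and "
    "productivity during work hours.\n"
    "Work environment: [work environment]\n"
    "Typical distractions: [typical distractions]"
)


def fill_prompt(template: str, values: dict) -> str:
    """Replace every [placeholder] in the template with its value.

    Hypothetical helper for illustration. Raises KeyError if the template
    contains a placeholder that has no entry in `values`.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder [{key}]")
        return values[key]

    # Match any text enclosed in a single pair of square brackets.
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


if __name__ == "__main__":
    # Example values are made up for demonstration.
    print(fill_prompt(PROMPT_TEMPLATE, {
        "work environment": "open-plan office",
        "typical distractions": "chat notifications and hallway conversations",
    }))
```

The same helper would work for any prompt that marks its blanks with square brackets, such as the `[enter subject]` slot in the Midjourney prompt further down.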
SOCIAL SIGNALS
What's trending on socials today

Squawk Back: Watch what happens when a parrot tries to have a conversation with Bland's AI voice assistant. "Parrot-1B shows signs of promise in natural language understanding but has a very small context window," one commenter joked.
Thread the Needle: X user "s13k" vibe-coded a game that lets you try to land SpaceX's booster rocket into the now-famous "chopsticks." You can try to beat his record of 7.2 seconds here.
No Strings: Builder Catalin Pit shared a list of open-source alternatives to popular AI-powered apps like Bitly, Jira, and Docusign.
Fantasy Filter: AI video platform Pika just dropped a new batch of effects, including "museum me," "baby me," "hero me," and "princess me."
Bot Burnout: Anthropic CEO Dario Amodei explained why he thinks there should be an "I quit" button embedded in chatbots so they can let us know when they're getting too overwhelmed or stressed.
AI-GENERATED IMAGES
Springtime bliss

Source: ubudmdulc174 on Midjourney
Midjourney Prompt: Impressionist wood grain oil painting, parallel view, A photo of two cute [enter subject] wearing blue harnesses, lying on the grassy shore of Lake Lisa in Kitzbühel with a clear sky and mountains in the background. One [enter subject] is standing up while the other is sitting down. The scenery includes green meadows, lake water, and a distant mountain range under sunny skies. Bright sunlight highlights their fur colors and playful expressions, small strokes of oil painting, --stylize 250 --profile 6rtguen
Acquire new customers and drive revenue by partnering with us
Superhuman is the world's biggest AI newsletter for businesses and professionals with 1M+ readers and 2M+ followers on socials working at the world's leading startups and enterprises. Companies like Amazon, Hubspot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.
🧞 Your wish is my command
What did you think of today's email? Your feedback helps me create better emails for you!
Got more feedback or just want to get in touch? Reply to this email and we'll get back to you.
Thanks for reading.
Until next time!
Zain & the Superhuman AI team