🖼️ Gemini goes multimodal

ALSO: Learn how to use ChatGPT Deep Research from scratch

Read time: under 4 minutes

Welcome back, Superhuman. Alphabet is already back for more. Hot on the heels of Gemma 3, it just became the first company to hand out access to a fully multimodal image generator. Keep reading for some examples of what you can create with the powerful new feature.

Today’s Insights

  • Efficiency-focused AI, realistic ads, and hybrid models

  • Gemini gets a major overhaul

  • Tutorial: Learn to use ChatGPT’s Deep Research from scratch

  • 5 new AI tools to boost your productivity

  • News, memes, what’s trending on socials, and more

TODAY IN AI

Captions’ new tool lets you generate realistic influencer ads. Source: Captions

1. Cohere drops ultra-efficient model for businesses: Google’s Gemma 3 is only a few days old, and there’s already a new LLM giving it a run for its money. Cohere just launched Command A, which can run on only two Nvidia chips while being about as powerful as DeepSeek and OpenAI’s latest offerings. The Toronto-based startup says the model is great for smaller businesses that don’t have dozens of pricey GPUs at their fingertips.

2. New platform generates ads that look eerily realistic: NYC’s Captions is rolling out a new feature called Mirage that lets you create marketing campaigns with AI-generated influencers. You can either write a script or upload audio and have the avatar automatically sync up with it. Natural-looking body language and “micro-expressions” make each AI influencer look far more convincing than what we’ve seen from similar tools, too. You can try it out here.

3. Boundary-pusher Nous Research drops two-in-one LLM: Known for its censorship-free models, Nous unveiled a preview version of DeepHermes, in 3B and 24B sizes. It lets you toggle between chain-of-thought and quick thinking, becoming “one of the first models in the world” to fuse both modes into a single interface. It’ll also share its entire thought process, unlike most closed-source rivals. (Here’s a link to try it out via the collective’s Discord channel.)

PRESENTED BY GOOGLE

2025 will be “the most defining year” for startups. Learn why

It explains key AI predictions from 20+ industry experts from Google Cloud, Social Capital, and more:

FROM THE FRONTIER

Gemini brings multimodal image generation to the masses

Alphabet is giving away more Gemini features at no cost. Source: Google

In an industry first, Alphabet just gave everyone access to Gemini’s native image generation capabilities on AI Studio.

What makes it unique? Usually, your image prompts get fed through an LLM, and in the process, details can get lost in translation. Gemini 2.0 Flash, on the other hand, can move between different mediums on the fly, delivering much better speed and accuracy.

What can you do with it? You can generate a recipe and add photos for each step. Or embed entire words or sentences into your images. You can also quickly edit specific parts of an image without having to regenerate an entirely new one.

Here are some of our favorite examples:

  • Transforming the cover of a magazine, and swapping out its price tag

  • Quickly adding chocolate drizzle to a pile of croissants

  • Creating video game characters, then dropping them into a 3D world

  • Applying the style of one image to another

Gemini’s also getting a major upgrade: Now, anyone can try out the platform’s Deep Research feature, not just paid subscribers. Alphabet is also opening up access to Gems, customizable presets “like a translator, meal planner, or math coach.” Finally, Gemini can now integrate with your search history and apps for a more personalized experience.

THE AI ACADEMY

Learn how to use ChatGPT Deep Research from scratch

ChatGPT’s Deep Research is the best AI product I’ve used this year, and it might be the first one that surpasses humans in research ability. Watch the full tutorial here.

  • Go to ChatGPT and sign up. (Upgrade to Pro or Plus for Deep Research)

  • Use the o3-mini-high or o3-mini reasoning models and select Deep Research.

  • Type a detailed, specific prompt with clear goals.

  • Answer any follow-up questions and wait 5-60 minutes for the generated report.

  • Review the report, including its sources and summary, and ask follow-up questions for further details.

Note: Deep Research was opened to ChatGPT Plus users shortly after I recorded this video, so you no longer need the Pro plan to use it.

PRESENTED BY VANTA

Is Compliance Holding You Back? Vanta Can Help

Navigating new compliance requirements can be a daunting task, but with the right automation tools, it doesn't have to be.

Whether you’re a fast-growing startup or an established security team, Vanta can help you achieve continuous compliance (and more).

Join the live demo on April 3 to learn how Vanta can help you automate compliance for frameworks like SOC 2, ISO 27001, HIPAA, HITRUST, and ISO 42001 and build customer trust.

AI & TECH NEWS

Everything else you need to know today

Snap just introduced three new AI video lenses. Source: Snap

🌱 Open Sesame: Sesame just released the AI model powering their viral Maya assistant under an Apache 2.0 license, making it super easy to clone voices in under a minute, though some are worried about the lack of safeguards to prevent misuse.

🌎 Polyglot Pros: According to Bloomberg’s Mark Gurman, Apple's working on adding a real-time live translation feature to AirPods, which could roll out later this year alongside iOS 19.

✨ All in One: Alibaba announced its consumer chatbot will now be powered by its frontier Qwen models, featuring deep thinking, search, and other agentic capabilities, which can all be accessed from a single app.

🤳 Lens Leap: Social media platform Snap introduced new generative video “lenses” that let you do things like animate animals and add dynamic objects into your shot.

🕵️ On the Case: The creator of “Law & Order” is launching an AI-generated murder mystery game, which will be updated with new mysteries on a daily basis.

PRODUCTIVITY

5 AI Tools to Supercharge Your Productivity

✅ Duck AI: Anonymous access to popular AI models like GPT-4o mini and Claude 3.

✅ Quadratic: Chat with your data, connect databases, and visualize results in a code-friendly all-in-one tool.

✅ Innovating with AI*: Just welcomed 200 new students into The AI Consultancy Project, their new program that trains you to build a business as an AI consultant. Request early access now!

✅ Greta: Ship full-stack applications within seconds without writing a single line of code.

✅ Whisper V3: Transcribe long-form YouTube videos with the click of a button.

* indicates a promoted tool, if any

PROMPT OF THE DAY

Improve Focus

Prompt: I need suggestions for techniques that can help improve focus and productivity during work hours. The techniques should be practical, easy to implement, and effective in helping individuals stay focused and avoid distractions while working.

Your task is to provide a list of proven techniques that can help improve focus, concentration, and productivity during work hours.

Work environment: [work environment]
Typical distractions: [typical distractions]

Source: promptadvanceclub

SOCIAL SIGNALS

What’s trending on socials today

🦜 Squawk Back: Watch what happens when a parrot tries to have a conversation with Bland’s AI voice assistant. “Parrot-1B shows signs of promise in natural language understanding but has a very small context window,” one commenter joked.

🚀 Thread the Needle: X user “s13k” vibe-coded a game that lets you try to land SpaceX’s booster rocket into the now-famous “chopsticks.” You can try to beat his record of 7.2 seconds here.

🔓 No Strings: Builder Catalin Pit shared a list of open-source alternatives to popular AI-powered apps like Bitly, Jira, and Docusign.

👸 Fantasy Filter: AI video platform Pika just dropped a new batch of effects, including “museum me,” “baby me,” “hero me,” and “princess me.”

😩 Bot Burnout: Anthropic CEO Dario Amodei explained why he thinks there should be an “I quit” button embedded in chatbots so they can let us know when they’re getting too overwhelmed or stressed.

AI-GENERATED IMAGES

Springtime bliss

Source: ubudmdulc174 on Midjourney

Midjourney Prompt: Impressionist wood grain oil painting, parallel view, A photo of two cute [enter subject] wearing blue harnesses, lying on the grassy shore of Lake Lisa in Kitzbühel with a clear sky and mountains in the background. One [enter subject] is standing up while the other is sitting down. The scenery includes green meadows, lake water, and a distant mountain range under sunny skies. Bright sunlight highlights their fur colors and playful expressions, small strokes of oil painting, --stylize 250 --profile 6rtguen

Acquire new customers and drive revenue by partnering with us

Superhuman is the world’s biggest AI newsletter for businesses and professionals with 1M+ readers and 2M+ followers on socials working at the world’s leading startups and enterprises. Companies like Amazon, Hubspot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.

🧞 Your wish is my command

What did you think of today's email?

Your feedback helps me create better emails for you!

Got more feedback or just want to get in touch? Reply to this email and we’ll get back to you.

Thanks for reading.

Until next time!

Zain & the Superhuman AI team