Four-person team takes on Voice Mode
ALSO: How to create custom avatars
Read time: under 4 minutes
A young, four-person team is taking on the big guns with one of the first open-source voice-to-voice models. Find out how Standard Intelligence used 20 million hours of audio to build what it says is a more natural-sounding voice platform.
Today’s Insights
Today in AI: Data centers, robot brains, and MIT’s new model
Tutorial: How to create custom avatars that look like you
An open-source Voice Mode rival
Everything else you should know today
5 new AI tools to boost your productivity
AI-Generated Images: Stone Art
TODAY IN AI
Source: Getty Images
1. What do Jensen Huang, the king of Denmark, and Ozempic-maker Novo Nordisk have in common? They just came together to build one of the world’s largest supercomputers — one that’s bigger than a basketball court. Nvidia provided the AI chips, Novo Nordisk the funding, and Denmark’s Export and Investment Fund the political support. It’ll serve as a research hub across fields like healthcare and green energy.
2. A new AI-powered robot ‘brain’ could help humanoids do chores for you: With backing from Jeff Bezos and OpenAI, SF-based AI startup Physical Intelligence just raised $400M to develop a “generalist brain that can control any robot.” Despite launching only eight months ago, the company has already demonstrated how its software can help robots do tasks as varied as folding laundry and bagging groceries.
3. Teaching an old robot new tricks: Unlike LLMs, most robot-focused models are trained on only a small set of highly specialized data, making it tricky for them to pick up new abilities on the fly. MIT recently took the opposite approach, throwing everything but the kitchen sink at its custom model to help robots move seamlessly between different tasks. The technique was 20% more effective than traditional approaches.
SPONSORED BY MASTERWORKS
Billionaires wanted this ‘election-proof’ asset—but 67,229 everyday investors got it too.
When rare and valuable assets come to market, it's typically the wealthiest 1% who take home an amazing investment. But not always:
Over the last 7 elections (‘95-’23) contemporary art has outpaced the S&P 500 by 64% regardless of the victors. Now, Masterworks is taking on the billionaires at their own game, enabling everyday investors to join in on multimillion-dollar art investments (e.g. Banksy, Basquiat, and Picasso).
So far, those investors have gotten representative annualized net returns like +17.6%, +17.8% and +21.5% (among assets held 1+ year, not including unsold).
With $1B+ in capital raised across 450+ offerings, shares can sell out in minutes.
However, because Superhuman is a trusted partner, readers can click here to skip the waitlist.
Past performance is not indicative of future returns. Investment involves risk. See Important Disclosures.**
THE AI ACADEMY
How to create customized avatars that look like you with HeyGen
1. Go to HeyGen's website and sign up to get credits.
2. Go to your dashboard and click on Avatar.
3. Click on Photo Avatar, then click on Create Photo Avatar.
4. Upload your image (a clear, front-facing image, preferably full-body). You can also use pre-existing avatars.
5. Upload at least 10 photos of yourself for the best results and click on Train model. This will take a few minutes.
   Note: Uploading more images gives better results.
6. Give additional details like age, model, etc.
7. Once done, write any prompt to put your character into any scene, clothes, and poses you want.
8. Once created, download and share it.
With HeyGen's Photo Avatar feature, you can generate as many variations of your character as you want. You can use this avatar to create ads, learning resources, videos, and much more.
FROM THE FRONTIER
An open-source alternative to Voice Mode
Source: Vecteezy/Supachai Promrit
Think about how you became fluent in your native language — you don't mentally translate each word, you just understand. Most AI audio models take a clunkier route: They have to convert your speech to text, process it, then rebuild it as audio again.
That's why Standard Intelligence's latest breakthrough is turning heads. The four-person team unveiled Hertz-Dev, which it calls one of the first open-source models that directly transform voice to voice — no translations needed.
Trained on 20 million hours of audio, the 8.5B-parameter version sounds just as realistic and responds just as quickly as OpenAI’s acclaimed Voice Mode, at least according to early demos shared by the team. And because it’s open-source, it can be fine-tuned for anything from live translation to classification. Stay tuned for a larger 70B version coming soon, too.
PRESENTED BY ASSEMBLY AI
New Speech-To-Text model released (preferred by ~73%)
AssemblyAI just launched Universal-2, their most advanced Speech-To-Text model to date.
Why 72.9% choose Universal-2 over others:
21% improvement on alphanumerics like phone numbers, zip codes, and more
24% improvement in proper noun recognition (e.g., brand names and people)
15% improvement in formatting for emails, dates, currencies, etc.
Try Universal-2 for no cost with a $50 credit.
AI & TECH NEWS
Everything else you need to know today
Perplexity launched an AI-powered voter guide ahead of the US election. Source: Perplexity
🗳️ Playing Politics: While most AI platforms are avoiding the US election altogether, Perplexity is leaning into it with an AI-powered hub that features live vote tallies, candidate summaries, and ballot information.
✨ Less is More: Anthropic’s efficiency-focused model, Claude 3.5 Haiku, is coming to the startup’s API as well as popular third-party platforms like Amazon Bedrock.
🐝 Buzzkill: Meta was forced to abandon plans for a nuclear-powered AI data center in the US because a rare species of bee was discovered near the site.
🔍 A-I Spy: Spot AI raised $31M for software that can monitor objects within a scene — or help you automatically scrub to relevant clips instead of searching for them manually.
📖 Next Chapter: OpenAI hired Gabor Cselle, the co-founder of a short-lived Twitter alternative called Pebble. The move could hint at OpenAI’s interest in a future social media product.
PRODUCTIVITY
5 AI Tools to Supercharge Your Productivity
✅ WebFill: Use advanced AI to automatically fill forms, complete surveys, handle data entry, and more.
✅ Clarity: Enhance productivity in document-heavy workflows with an AI platform that lets you talk to your documents and get instant insights.
✅ Section School*: Section is hosting a free, virtual AI conference on 11/14. Scott Galloway headlines, with AI leaders from Moderna, Hugging Face, and ServiceNow. RSVP now.
✅ Heep AI: Take action on WhatsApp, Instagram, and Messenger, from booking reservations to managing orders.
✅ Fable: Engage prospects, close more deals, and simplify onboarding with AI-powered demos.
* indicates a promoted tool, if any
PROMPT OF THE DAY
Literature Review
Prompt: Act as a graduate student in a specific field. You have been tasked with writing a literature review for a research project. Your literature review should provide an overview of the existing research on a specific topic, and identify gaps or areas where further research is needed. Your literature review should include at least 10 peer-reviewed sources, published within the last 5 years, and you should critically evaluate and synthesize these sources to build a cohesive argument. Your literature review should be structured in a clear and logical way, with subheadings to help organize your ideas. Additionally, you should provide an explanation of the methodology used to search for and select sources. Finally, your literature review should adhere to the style guidelines set forth by your department or discipline.
You can adapt the prompt to your specific needs, or add context. For example:
Here's the context: You are a graduate student in the field of psychology, and your research project is on the effects of social media on adolescent mental health.
Source: @Das / Easypromptlibrary
AI-GENERATED IMAGES
Stone Art
Source: Inspired by @mazurkova on Midjourney
Midjourney Prompt: "human design" chart graph made in stone in shades of gray and ash and pastel yellow. simple but with visible structure --ar 9:16 --v 6.1 --stylize 250
Acquire new customers and drive revenue by partnering with us
Superhuman is the world’s biggest AI newsletter for businesses and professionals, with 800,000+ readers and 1.5 million followers on socials working at the world’s leading startups and enterprises. Companies like Amazon, HubSpot, and Salesforce feature their products in Superhuman. You can learn more about partnering with us here.
🧞Your wish is my command
What did you think of today's email? Your feedback helps me create better emails for you!
Got more feedback or just want to get in touch? Reply to this email and we’ll get back to you.
Thanks for reading.
Until next time!
Zain & the Superhuman AI team
** The content is not intended to provide legal, tax, or investment advice.
No money is being solicited or will be accepted until the offering statement for a particular offering has been qualified by the SEC. Offers may be revoked at any time. Contacting Masterworks involves no commitment or obligation.
“Annualized Net Return” or “IRR” refers to annualized internal rate of return net of all fees and expenses, calculated from the offering closing date to the date the sale is consummated. For additional information regarding the calculation of IRR for a particular investment in an artwork that has been sold, a reconciliation will be filed as an exhibit to Form 1-U and will be available on the SEC’s website.
Art vs S&P data based on repeat-sales index of historical Post-War & Contemporary Art market prices and S&P 500 annualized return (includes dividends reinvested) from 1995 to 2024, developed by Masterworks. There are significant limitations to comparison of assets that trade episodically. Indices are unmanaged and a Masterworks investor cannot invest directly in an index.