AI SAMOSA
OpenAI's new Text-to-Video Model Stuns the World! (and a lot more updates from the world of AI)

February 19, 2024

Howdy, Awesome People!

Welcome back to AI Samosa, your favorite AI Newsletter!

If this is your first time, having you here is great! Be sure to sign up to stay updated with the latest happenings in AI.

AI Bytes 📰

OpenAI's new Text-to-Video model stuns the world

Sora, an AI model from OpenAI, generates realistic videos from text prompts, maintaining visual quality and fidelity to the prompt for up to a minute. It excels at intricate scenes with multiple characters, varied motion, and detailed backgrounds, driven by its deep understanding of language. Despite these strengths, Sora struggles to simulate complex physics accurately, to understand specific instances of cause and effect, and to follow descriptions of events that unfold over time.

OpenAI is sharing Sora's progress early to gather feedback from diverse professionals. Sora is a diffusion model: it starts from noise and refines it into a video over multiple steps. It builds on past research, such as the recaptioning technique from DALL·E 3, to follow user instructions more faithfully. Before making Sora available in its products, OpenAI is implementing safety measures and working with red teamers to ensure responsible deployment. The company views Sora as a foundational step towards achieving artificial general intelligence (AGI). Read more…

Google launches its most potent AI model

Google has introduced Gemini 1.5, a next-generation AI model with substantial improvements in performance and capabilities. Gemini 1.5 Pro, a mid-size multimodal model, achieves quality comparable to Gemini 1.0 Ultra while using less compute. The model introduces a breakthrough in long-context understanding, allowing it to process up to 1 million tokens consistently, which is the longest context window of any large-scale foundation model to date.

This extended context window enables entirely new capabilities and helps developers build more useful models and applications. Gemini 1.5 Pro is currently available in a limited preview for developers and enterprise customers via AI Studio and Vertex AI. The model is built upon research and engineering innovations, including a Mixture-of-Experts (MoE) architecture, which enhances efficiency. Google is committed to responsible deployment and extensive ethics and safety testing for each new generation of Gemini models. Read more…
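For the curious, here is a minimal Python sketch of why a Mixture-of-Experts design can save compute: a router scores every expert, but only the top-scoring few actually run for a given input. This is a toy illustration, not Gemini's actual implementation. The function `moe_forward`, the four toy experts, and the hard-coded router scores are all hypothetical; in a real model the experts are neural sub-networks and the scores come from a learned gating layer.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_scores, experts, k=2):
    """Run only the top-k experts and blend their outputs by gate weight."""
    # Indices of the k highest-scoring experts (ties keep original order).
    top = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Normalize the selected scores so the blend weights sum to 1.
    weights = softmax([gate_scores[i] for i in top])
    # Only the selected experts are evaluated; the rest cost nothing.
    output = sum(w * experts[i](x) for w, i in zip(weights, top))
    return output, top

# Four toy "experts" (hypothetical stand-ins for neural sub-networks).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
# Hypothetical router scores for this input; a real router learns these.
gate_scores = [0.1, 2.0, -1.0, 2.0]

output, used = moe_forward(3.0, gate_scores, experts, k=2)
print(output, used)  # 7.5 [1, 3]
```

With k=2, only two of the four experts do any work for this input, which is the basic efficiency idea: the model can hold many experts while spending compute on just a few per input.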

ChatGPT gets a new memory feature

OpenAI is testing a memory feature for ChatGPT that lets the chatbot retain information across conversations so it can give more helpful and personalized responses. Users can control ChatGPT's memory by explicitly asking it to remember something, viewing and deleting specific memories, or turning off memory entirely. The memory system evolves with user interactions and is not linked to specific conversations. Examples of what memory enables include meeting notes, social media posts, and lesson plans tailored to user preferences. The feature is being rolled out to a small portion of free and Plus users and will be expanded to more users soon. GPTs will also support memory, and each GPT will have its own distinct memory. Read more…

Reddit to sell user content for AI training

Reddit has made a new licensing deal with an unnamed large AI company, giving the company access to the user-generated content on its platform for AI training. The deal is worth about $60 million on an annualized basis and could still change, as Reddit's plans to go public are still in the works. The move comes in response to the legal questions surrounding AI companies training their models on data from the open web without permission.

The specific AI company involved is not known, but the deal is significantly more valuable than those that other companies, such as OpenAI and Apple, have been offering. Reddit's decision follows its earlier threat to cut off Google's and Bing's search crawlers if it couldn't reach a training data deal with AI companies. The company's revenue was up by 20 percent by the end of 2023, but it was still $200 million shy of a $1 billion target it had set two years prior. Reddit is expected to open up for public investment in March, seeking a $5 billion valuation, which is half of what it might have achieved in 2021 before a market downturn held it back. Read more…

Tech Giants Join Hands to Fight AI Interference

A group of 20 tech companies, including OpenAI, Microsoft, Adobe, Meta Platforms (Facebook), TikTok, and X (formerly Twitter), has agreed to collaborate on preventing deceptive artificial intelligence (AI) content from interfering with elections around the world this year. The rapid growth of generative AI, which can create text, images, and video, has heightened concerns about its potential to sway major elections. The accord, announced at the Munich Security Conference, commits the companies to developing tools for detecting misleading AI-generated content, creating public awareness campaigns, and taking action on such content on their services. The companies may use watermarking or embedded metadata to identify AI-generated content or certify its origin. The agreement does not specify a timeline for meeting the commitments or how each company will implement them. Read more…

This week’s How to Guide 🤯

Gemini is an experimental conversational AI assistant created by Google. It lets you have natural conversations and ask questions across a wide range of topics. If you want to try out this powerful new AI tool, getting started is quick and easy. Just follow these steps:

1. Go to https://gemini.google.com/app in your web browser. This will take you to the Gemini web page.
2. Click on the text box that says "Ask Gemini". This will allow you to start entering prompts and questions for Gemini to respond to.
3. Type your prompt or question into the text box. Press enter or click the send icon to submit it to Gemini.

For example, you could type "What is the weather forecast for tomorrow?" and Gemini will respond with the weather for your location tomorrow.

Narrative Nook📔

The Music Within

The notes of Rohit's harmonica floated down the alley, weaving their way into the tapestry of sounds in the Mumbai streets. Though born to a family that struggled to survive in the city's packed slums, Rohit had talent that transcended his circumstances. Music was his refuge, a world unlocked with natural skill whenever he closed his eyes and played the battered instrument his father had given him.

He played as crowds rushed by, some tossing coins while others complained about the racket. But one day, his mournful melody made a renowned musician pause. Hearing the bounce of genius behind the boy's unrefined talent, he asked to take Rohit on as a student.

Under his tutelage, the ragged child was shaped into a professional musician. Through compassion and skill, his mentor cultivated the spark of talent, fanning it into a flame despite the adversity of Rohit's upbringing. At his acclaimed debut concert, Rohit's triumphant notes were a testament to music's power to transport any listener - even the most disadvantaged performer - to soaring heights.

Trivia Time! 🕵️

Test your understanding of AI history and state-of-the-art technologies with this 5-question quiz!

1. What year was the term "artificial intelligence" first coined?

a) 1952 b) 1955 c) 1960

2. In machine learning, what technique allows a model to continuously learn from new data without forgetting previous learnings?

a) Online learning b) Continual learning c) Ensemble modeling

3. What company developed the deep learning framework PyTorch?

a) Google b) Facebook c) OpenAI

4. What is the name of the architecture behind natural language processing models like BERT?

a) Recursive neural tensor network b) Transformer c) Seq2seq

5. What term refers to AI algorithms that can produce creative content like images, videos, or text?

a) Reinforcement learning b) Supervised learning c) Generative AI

Tool Spotlight 🛠️

Jaq & Jil is an AI writing assistant designed by copywriters to generate high-quality, engaging content quickly and professionally. It caters to various marketing needs, such as SEO-optimized blog articles, social media captions, essays, and marketing materials. The platform allows users to create content at scale, with features like a blog post generator, content templates, and editing automation. It simplifies the content creation process, enabling users to publish directly to WordPress and Google Docs, and offers customization options to maintain a unique brand voice and tone. Jaq & Jil is accessible to more than 4,000 marketing agencies and freelancers.

Editor's Note ✍️

This newsletter comes straight from the desk of John V Jayakumar, CEO of Superposition Technologies.

My life's journey is a testament to the transformative power of education and technology—my dad broke free from poverty through education, and I achieved millionaire status thanks to technology. My takeaway for you? Wholeheartedly embrace technology; it's a game-changer for you and everyone in your orbit.

Note: The views expressed are solely those of the editor.

Connect with John on LinkedIn.

Thanks to Beehiiv for hosting this newsletter. Click here to create your own newsletter on Beehiiv.

Check out the answers for the trivia!

1. b) 1955
2. b) Continual learning
3. b) Facebook
4. b) Transformer
5. c) Generative AI

We Want to Hear From You!

Subscribe and share your thoughts with us!

Welcome, and thanks for being part of our community.