Google has big plans for the end of 2024. The tech giant is getting ready to release a bunch of highly anticipated AI products, including Gemini 2.0 and a new AI assistant called Project Jarvis. These upcoming projects are Google’s answer to the competition from companies like Anthropic and OpenAI—and they’re aiming to reshape the way we use AI in our everyday lives.
Gemini 2.0
Gemini 2.0 is set to launch this December, building on the previous versions—Gemini 1.0 Pro from late 2023 and 1.5 Pro from early 2024. While it may seem like Gemini 2.0 will only offer incremental improvements, it’s still worth keeping an eye on. Google’s adding a lot under the hood, and these enhancements are expected to significantly improve the user experience.
One of the key changes in Gemini 2.0 is its mixture-of-experts (MoE) design. Instead of running the entire model on every request, a routing layer sends each input to the handful of specialized sub-networks ("experts") best suited to the task, making the AI not only faster but also more resource-efficient. This architecture allows the model to focus its computational power where it’s needed most, resulting in quicker responses and a more adaptive experience.
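To make the routing idea concrete, here is a minimal, illustrative sketch of mixture-of-experts routing in Python. It is not Google's implementation; the dimensions, the gating network, and the top-k choice are assumptions made purely for the example.

```python
# Toy illustration of mixture-of-experts routing (not Gemini's actual design):
# a gating network scores every expert for a given token, and only the top-k
# experts run, so most of the model stays idle on any single input.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, N_EXPERTS, TOP_K = 8, 4, 2  # illustrative sizes only

# Each "expert" here is just a small feed-forward weight matrix.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.1 for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.1


def moe_forward(token_vec: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and mix their outputs."""
    logits = token_vec @ gate_w                      # score every expert
    top = np.argsort(logits)[-TOP_K:]                # keep only the best k
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    # Only the selected experts do any work; the others are skipped entirely.
    return sum(w * (token_vec @ experts[i]) for w, i in zip(weights, top))


output = moe_forward(rng.standard_normal(D_MODEL))
print(output.shape)  # (8,) -- same shape as the input token vector
```

The efficiency claim follows directly from the routing step: with two of four experts active, roughly half the expert parameters are touched per token, which is the trade-off MoE architectures are built around.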
Expanded Context
The core architecture remains Transformer-based, but with additional tweaks to how it processes different kinds of input, from text to images and other data. These optimizations reduce the computational load, making the model more efficient and responsive, particularly when handling complex or multi-modal inputs.
The expanded context window, now up to 2 million tokens, is another standout feature. This larger window allows Gemini 2.0 to maintain the context of much longer conversations, making it easier for users to engage in more detailed and nuanced interactions without the model losing track of earlier details.
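As a rough illustration of why a bigger window matters, the sketch below shows how an application might decide how much chat history fits before anything has to be dropped. The 2-million-token budget and the characters-per-token heuristic are assumptions for the example, not Gemini's actual tokenizer or limits.

```python
# Hedged sketch: trimming chat history to a context budget. The constant and
# the token estimate are illustrative assumptions, not real Gemini values.
CONTEXT_BUDGET_TOKENS = 2_000_000  # rumored Gemini 2.0 window size


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


def fit_history(messages: list[str], budget: int = CONTEXT_BUDGET_TOKENS) -> list[str]:
    """Keep the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break                       # older messages would overflow the window
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order


history = ["Hi, plan my trip to Kyoto."] * 10
print(len(fit_history(history)))        # all 10 messages fit easily in 2M tokens
```

With a window this large, the trimming step above almost never triggers, which is exactly the point: long conversations can stay intact instead of being summarized or cut.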
Improved Reasoning
Additionally, Gemini 2.0 will see enhancements in spatial data handling and reasoning capabilities. Rumors suggest that Google aims for Gemini to have a better grasp of real-world physics, making it more practical for applications that require an understanding of physical interactions.
This could open up new possibilities in fields like robotics, augmented reality, and even complex simulation tasks.
Project Jarvis
Gemini 2.0 is only part of the story. Google’s also got something called Project Jarvis in the works. Jarvis is meant to be an AI assistant built right into Chrome—and it’s all about automating web tasks seamlessly. Whether it’s booking flights, doing product research, or managing online shopping, Jarvis aims to take care of it without you lifting a finger, potentially saving hours of manual browsing.
Jarvis leverages a feature called “screen awareness” to navigate your browser intelligently. It can visually interpret what’s on your screen—fields, buttons, links—and interact with them just like a human would. This capability allows it to handle complex actions, such as comparing flights across different tabs, managing multiple shopping carts, or even filling out long forms. Unlike traditional assistants that require manual inputs, Jarvis functions as a true browser automation tool that understands the broader context of what’s happening on your screen.
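Google hasn't published how Jarvis is built, but screen-aware agents are often described as an observe-decide-act loop. The sketch below is purely hypothetical: read_screen, decide_next_action, and act are stand-ins for a vision model, a planner, and a browser driver, not any real Jarvis or Chrome API.

```python
# Hypothetical observe-decide-act loop for a "screen aware" agent.
# Every function here is a stand-in; none of this is Google's actual API.
from dataclasses import dataclass


@dataclass
class Element:
    kind: str    # "button", "field", or "link"
    label: str


def read_screen() -> list[Element]:
    """Stand-in for a vision model that parses the rendered page."""
    return [Element("field", "Destination"), Element("button", "Search flights")]


def decide_next_action(goal: str, elements: list[Element]) -> Element | None:
    """Stand-in for the model choosing which element advances the goal."""
    for el in elements:
        if el.kind == "button" and "search" in el.label.lower():
            return el
    return None


def act(element: Element) -> None:
    """Stand-in for actually clicking or typing in the browser."""
    print(f"Interacting with {element.kind}: {element.label}")


def run_agent(goal: str, max_steps: int = 5) -> None:
    done: set[str] = set()
    for _ in range(max_steps):
        elements = [e for e in read_screen() if e.label not in done]
        action = decide_next_action(goal, elements)
        if action is None:
            break                       # nothing useful left to do
        act(action)
        done.add(action.label)


run_agent("Book the cheapest flight to Lisbon")
```

The key difference from a scripted macro is the middle step: the agent re-reads the screen each cycle and decides what to do next from what it sees, rather than replaying a fixed sequence of clicks.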
Autonomous Processes
Another key element of Jarvis is its ability to handle sequences of tasks across multiple browser tabs. This means you can initiate a process—like shopping for the best price on a product—and Jarvis will handle all the steps autonomously. From gathering product information to finalizing a purchase, it aims to be a comprehensive digital helper.
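To picture what such an autonomous sequence might look like, here is a deliberately simplified, hypothetical flow for the price-comparison scenario. The store names, prices, and functions are invented for illustration and don't reflect how Jarvis actually orchestrates tabs.

```python
# Illustrative multi-tab price-comparison flow (not Jarvis's real design):
# gather information from each tab, compare, then act on the best result.


def open_tab_and_get_price(store: str, product: str) -> float:
    """Stand-in for opening a tab, searching for the product, and reading the price."""
    fake_prices = {"StoreA": 199.0, "StoreB": 184.5, "StoreC": 204.9}
    return fake_prices[store]


def compare_and_buy(product: str, stores: list[str]) -> str:
    # Step 1: gather product information from each tab.
    prices = {store: open_tab_and_get_price(store, product) for store in stores}
    # Step 2: pick the best offer.
    best = min(prices, key=prices.get)
    # Step 3: finalize the purchase (here, just report the decision).
    return f"Buy {product} from {best} at ${prices[best]:.2f}"


print(compare_and_buy("noise-cancelling headphones", ["StoreA", "StoreB", "StoreC"]))
```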
Privacy & Security
However, given the amount of sensitive personal data Jarvis can access—such as credit card information and browsing history—Google is taking a cautious approach. Initially, Jarvis will be rolled out in a limited test phase with select users. This will allow Google to address any security or privacy concerns before a broader release, ensuring that the tool meets high standards for data protection and user safety.
More Google AI Projects
There are a few other things Google’s working on to push AI further into our daily tech. For example, they’re expanding AI into e-commerce with something called Transform Shopping. It’ll tailor search results based on user preferences—think recommending waterproof jackets when it knows you’re in a rainy city.
There’s also “AI Try-On,” which uses AI to show you how clothes might look on you. It’s limited to a few brands for now, but the plan is to make online shopping more interactive and visual.
And then there’s Project Astra—an AI system that combines visual, audio, and text inputs to create a multi-modal experience. It’s aiming for a deeper integration of AI into everyday tasks, giving it a better sense of what’s happening around you by understanding not only what you say, but also what you show it.
The Competition: Microsoft and Apple
It’s not just Google that’s working on this type of AI assistant. Microsoft and Apple are in the game too.
Microsoft’s Copilot aims to provide a similar solution, but it functions more like a guided assistant than a fully autonomous tool. Recent updates to Microsoft 365 Copilot include Copilot Pages, a shared canvas for real-time collaboration, and Python integration in Excel, which lets users run advanced analyses without writing the code themselves.
Developers can also build their own copilots using custom engine agents, bringing in models from providers such as Google Gemini and AWS. Copilot Vision, meanwhile, keeps the user in the loop rather than acting on its own, a degree of control that might appeal to those who aren’t ready to fully hand tasks over to AI.
Apple Intelligence focuses on automating tasks across Apple’s ecosystem of devices, integrating services like Apple Maps and Apple Music. This broad approach aims to create a seamless experience across all Apple products. There is also speculation that Apple will integrate Siri more deeply with its AI framework, giving the assistant more sophisticated capabilities over time.
Apple is also making privacy a centerpiece, processing as much as possible on-device, which points to further privacy-focused features in future updates. While Apple’s approach is more holistic, Google’s tight integration with Chrome gives it a potential edge in web-specific automation, which could matter most to users who spend their day in the browser.
Looking Ahead
Google’s efforts with Gemini 2.0 and Project Jarvis are about making AI more useful in our daily interactions with technology. Gemini’s expanded capabilities should make AI smarter and more responsive, while Project Jarvis could change how we use web browsers.
Instead of just talking to an AI, we’re getting closer to having true digital assistants—agents that can do tasks, understand context, and integrate across apps and devices. It’s a step toward making automation more natural, and Google’s clearly betting on it being the future of AI.