A deep dive into the eight modular AI capabilities reportedly built into Google COSMO — from autonomous web browsing to proactive calendar management.
Screenshots captured from the COSMO app showing how Skills are organized and toggled.
COSMO organizes its capabilities into discrete, modular "Skills" — each designed to handle a specific type of task. This approach reportedly allows Google to add, update, or remove individual capabilities independently, making COSMO a flexible and evolving AI agent platform.
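The modular idea above can be illustrated with a small registry in which each skill is registered, replaced, toggled, or removed without touching the others. This is a minimal sketch, not COSMO's actual design: `Skill`, `SkillRegistry`, and the skill names used here are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Skill:
    """A hypothetical skill: a name, a handler, and an on/off toggle."""
    name: str
    handler: Callable[[str], str]
    enabled: bool = True


class SkillRegistry:
    """Illustrative registry (an assumption, not COSMO's real architecture)."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        # Registering under an existing name replaces only that skill.
        self._skills[skill.name] = skill

    def remove(self, name: str) -> None:
        # Removing one skill leaves every other entry untouched.
        self._skills.pop(name, None)

    def set_enabled(self, name: str, enabled: bool) -> None:
        self._skills[name].enabled = enabled

    def enabled_names(self) -> List[str]:
        return sorted(n for n, s in self._skills.items() if s.enabled)

    def run(self, name: str, request: str) -> str:
        skill = self._skills.get(name)
        if skill is None or not skill.enabled:
            raise LookupError(f"skill unavailable: {name}")
        return skill.handler(request)


registry = SkillRegistry()
registry.register(Skill("list_manager", lambda req: f"added: {req}"))
registry.register(Skill("document_writer", lambda req: f"draft: {req}"))
registry.set_enabled("document_writer", False)  # toggled off by the user
```

Because callers only go through `run`, a disabled or removed skill fails in one predictable place instead of affecting the rest of the assistant.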
COSMO reportedly conducts comprehensive, multi-source web research on behalf of the user. Rather than returning a list of links, it appears to synthesize findings into structured, analyzed summaries — similar to having a research assistant.
This could leverage both Gemini Nano for on-device summarization and cloud-based models for deeper analysis. The skill appears designed for tasks like product comparison, topic exploration, and factual verification.
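One way to picture the split between on-device summarization and cloud-based analysis is a small routing rule. The sketch below is purely illustrative: the task labels, the `max_local_sources` threshold, and the `choose_backend` function are assumptions, and nothing here reflects how COSMO actually decides where a request runs.

```python
from dataclasses import dataclass

ON_DEVICE = "on_device"
CLOUD = "cloud"


@dataclass
class ResearchTask:
    kind: str           # hypothetical labels, e.g. "summarize" or "analyze"
    source_count: int   # number of sources the task has to process
    online: bool = True


def choose_backend(task: ResearchTask, max_local_sources: int = 3) -> str:
    """Pick a backend; the rule and threshold are illustrative assumptions."""
    if not task.online:
        # Without connectivity, only the on-device model is reachable.
        return ON_DEVICE
    if task.kind == "summarize" and task.source_count <= max_local_sources:
        # Small summarization jobs stay local for latency and privacy.
        return ON_DEVICE
    # Everything else goes to the larger cloud model for deeper analysis.
    return CLOUD
```

For example, `choose_backend(ResearchTask("summarize", 2))` stays on the device, while a ten-source comparison would be sent to the cloud.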
Based on reports, COSMO may include a Browser Agent powered by Project Mariner — Google DeepMind's browser automation initiative. This skill reportedly enables COSMO to autonomously navigate websites, interact with web elements, and complete multi-step online tasks.
Potential use cases include filling out forms, booking appointments, comparing prices across sites, and executing multi-step web workflows without user intervention.
COSMO appears capable of detecting dates, appointments, and events mentioned in conversations, messages, or emails. It then reportedly suggests adding them to Google Calendar, so the user does not need to create entries manually.
This skill exemplifies COSMO's proactive agent approach: rather than waiting for commands, it anticipates needs based on contextual analysis of the user's communications.
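The detect-then-suggest flow described above can be sketched with simple pattern matching. This is a deliberately small illustration that recognizes only two date formats; a real assistant would presumably use a language model rather than regular expressions, and `suggest_events` and `EventSuggestion` are hypothetical names.

```python
import re
from datetime import date
from typing import List, NamedTuple, Optional

_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"]
# Two illustrative formats only: "2025-04-01" and "March 5, 2025".
_ISO = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
_LONG = re.compile(r"\b(" + "|".join(_MONTHS) + r") (\d{1,2}), (\d{4})\b")


class EventSuggestion(NamedTuple):
    day: date
    snippet: str  # the text that triggered the suggestion


def _safe_date(year: int, month: int, day: int) -> Optional[date]:
    """Return a date, or None when the numbers do not form a real date."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def suggest_events(message: str) -> List[EventSuggestion]:
    """Return calendar suggestions in the order they appear in the message."""
    hits = []
    for m in _ISO.finditer(message):
        parsed = _safe_date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        hits.append((m.start(), parsed, m.group(0)))
    for m in _LONG.finditer(message):
        month = _MONTHS.index(m.group(1)) + 1
        parsed = _safe_date(int(m.group(3)), month, int(m.group(2)))
        hits.append((m.start(), parsed, m.group(0)))
    hits.sort(key=lambda hit: hit[0])
    return [EventSuggestion(d, text) for _, d, text in hits if d is not None]
```

The function only returns suggestions and never writes to a calendar, which mirrors the reported behavior of proposing entries for the user to confirm.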
The Document Writer skill can reportedly draft emails, documents, and structured written content based on context, user instructions, and past interactions. It may go beyond simple text generation by understanding the user's tone, preferences, and communication patterns.
This could include drafting replies to emails, creating meeting notes, generating reports, and producing formatted documents — all informed by contextual awareness.
Based on reporting, COSMO may retain contextual memory of past interactions, enabling increasingly personalized and context-aware assistance over time. Unlike session-based AI tools, Recall could allow COSMO to reference previous conversations and preferences.
This persistent memory capability raises important privacy considerations — see our Privacy & Permissions page for details on how COSMO reportedly handles user data.
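The difference between session-based tools and persistent memory can be shown with a small file-backed store, where a fresh instance stands in for a new session. This is a sketch under stated assumptions: `PreferenceMemory`, its methods, and the JSON file format are hypothetical and say nothing about how Recall actually stores data.

```python
import json
from pathlib import Path
from typing import Dict, Optional


class PreferenceMemory:
    """Hypothetical file-backed memory; each instance stands in for a session."""

    def __init__(self, path: Path) -> None:
        self._path = path
        # Load whatever an earlier session saved, or start empty.
        self._data: Dict[str, str] = (
            json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        )

    def remember(self, key: str, value: str) -> None:
        self._data[key] = value
        self._save()

    def recall(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def forget(self, key: str) -> None:
        # User-initiated deletion, written through to disk immediately.
        self._data.pop(key, None)
        self._save()

    def _save(self) -> None:
        self._path.write_text(json.dumps(self._data), encoding="utf-8")
```

A value stored by one `PreferenceMemory` instance is visible to a later instance opened on the same path, and `forget` gives the user a direct way to remove it again.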
This skill appears to enable rapid visual search and identification. Users may be able to point their camera at objects, documents, or scenes for instant information retrieval, identification, or related content discovery.
This could build on Google Lens technology while integrating COSMO's contextual awareness for more personalized and actionable visual search results.
The Conversation Summary skill reportedly generates concise, actionable summaries of long message threads across messaging apps. This helps users quickly catch up on conversations they may have missed by extracting key decisions, action items, and important details.
This skill may leverage COSMO's screen-reading capabilities via AccessibilityService to analyze conversations across different messaging platforms.
Based on reports, this skill helps create, maintain, and manage lists, including shopping lists, to-do items, and other organized collections. COSMO may proactively suggest additions to existing lists based on conversations and browsing activity.
This skill could integrate with Google Tasks, Keep, and Shopping to provide a unified list management experience across Google's ecosystem.
COSMO's Skills reportedly rely on a hybrid architecture combining two key technologies:
Gemini Nano: Google's on-device AI model enables local processing, privacy-preserving inference, and low-latency responses, even without internet connectivity.
Project Mariner: Google DeepMind's browser agent technology enables COSMO to autonomously navigate, interact with, and complete tasks across websites.
Several of COSMO's Skills — particularly Recall, Conversation Summary, and the Browser Agent — involve access to sensitive user data. COSMO reportedly uses Android's AccessibilityService for screen-reading capabilities, which raises important questions about data handling, storage, and user control. For a full breakdown, see our Privacy & Permissions guide.