AI Geekly
Posts
AI Geekly: Pushing the Envelope

AI Geekly: Pushing the Envelope

Meta's Connect, Gemini at Work, and OpenAI spin the wheel faster...

Brodie Woods
September 30, 2024

Welcome back to the AI Geekly, by Brodie Woods, brought to you by usurper.ai. This week we bring you yet another week of fast-paced AI developments packaged neatly in a 5 minute(ish) read.

TL;DR Llama Shades; Office 365 Copilot’s Twin; OpenAI’s Never Dull Moments

This week Meta and Google both showed-off some new capabilities and models at their respective Connect and Gemini at Work conferences. While Meta showcased its newest open-source model, Llama 3.2 (now with vision!), Google announced an updated version of Gemini 1.5 along with a host of new capabilities to make its Microsoft 365 (formerly Office) alternative, Google Workspace, more competitive with the Copilot features MSFT has been rolling out. Never one to stay out of the news cycle, OpenAI has been a hot topic of late with the announcement of the departure of three senior leaders (including its CTO), a massive $7 Bn equity raise (valuing the company at $150 Bn) and a proposed change in structure from world’s-strangest-non-profit to a B-Corp (Benefits Corporation) —a move that will remove the caps on returns for shareholders (a feature of the current structure) and bring it more in-line with how the company has been operating (also a necessity to get more capital in the door). Read on below!

New Model, New Specs
Meta announces Llama 3.2 and Orion AR glasses at Connect

What it is: Meta's annual Connect event showcased the company's ambitious vision for the future of computing, with a focus on augmented reality (AR), virtual reality (VR), and AI. CEO Mark Zuckerberg unveiled a prototype of Meta's first AR glasses, codenamed Orion, featuring a novel holographic projection system. Additionally, Meta introduced the Quest 3S, a more affordable ($299) VR headset aimed at broadening access to mixed-reality experiences. However, the real star of the show was Llama 3.2, Meta's latest open-source AI model, now boasting multimodal capabilities with support for both image and text understanding, bringing it more in line with the capabilities of popular closed-source models like Google’s Gemini, Anthropic’s Claude, and OpenAI’s GPT-4 series models.

What it means: Meta is doubling down on its commitment to a future where the digital and physical worlds are seamlessly intertwined (it did rebrand to Meta for this specific purpose after all). The Orion AR glasses, even in prototype form are part of Meta's ambition to create a more immersive and interactive AR experience, potentially surpassing existing offerings from Apple and others (granted the current Orion prototype takes about $10k to build). The Quest 3S demonstrates a strategic focus on accessibility, bringing mixed-reality technology to a wider audience especially compared to the Apple Vision Pro’s $3,500 price tag. The introduction of the multimodal Llama 3.2 reinforces Meta's commitment to its open-source strategy, empowering developers to create innovative AI applications across diverse domains, including corporate and personal use, while enabling the safety and control that comes with a local/on-prem solution.

Why it matters: Meta's announcements highlight the growing significance of multimodality in AI development and the increasing convergence of AI, AR, and VR technologies. Indeed, the lack of vision in the prior Llama family of models was a limiting factor in their application. The open-source nature of Llama 3.2, coupled with its ability to process both images and text expands the capabilities of the open-source community with a leading-edge open-source foundation model. These types of models are vital to ensuring the democratization of modern AI, while the decreasing price of advanced VR and AR products (ex-Orion for now) shows us a path of future convergence between spatial compute and AI, something we have been discussing for some time.

Google Plays Catch-Up: Gemini at Work, But Is It Enough?
GOOG swears people are using Gemini

What it is: Google is making an aggressive push to establish its generative AI capabilities in the enterprise market, highlighting a range of new products and customer success stories at its recent Gemini for Work event. The event was designed to convey a sense of growing adoption of Google Cloud's AI across various sectors (despite recent reports to the contrary), with a particular focus on the integration of Gemini into Google Workspace and the launch of a new AI-powered customer service solution. However, despite its vast resources and deep AI expertise, Google is still perceived as playing catch-up to Microsoft's Copilot platform build-into its Microsoft 365 suite and powered by OpenAI’s best-in-class models.

What it means: Google is attempting to leverage its strengths, including its robust cloud infrastructure, extensive data assets, and a vast network of partners, to position Gemini as a comprehensive and flexible AI solution for businesses. However, the company's decision to offer competing AI models alongside Gemini on its platform suggests tacit recognition that its own product is not the undisputed leader (or maybe it’s just trying to avoid another antitrust suit). Furthermore, while the integration of Gemini into Google Workspace aims to boost productivity, it has yet to achieve the same level of seamless integration and functionality as Microsoft's Copilot within Microsoft 365 applications which itself is still a work in progress (though the Wave 2 enhancement covered in last week’s Geekly is a serious step in the right direction).

Why it matters: While Google Cloud may be seeing significant growth in AI adoption as customers already using Google Cloud begin to experiment, the company needs to demonstrate that its own proprietary Gemini models can deliver tangible business value and a superior user experience to not only close the gap with Microsoft but grow the business materially. Both companies will need to generate orders of magnitude higher revenue from their GenAI offerings to backfill the amount of investment they’ve made into AI hardware and talent over the past several years. Things are picking up as use cases emerge, but it’s still early innings.

How we got here: The success of Google's strategy will be closely watched by investors as they continue to view the company as the curious underdog in the GenAI story. That is, the company is largely responsible for the modern GenAI boom thanks to the development of the underlying transformer architecture by its own researchers in 2017. However, the company’s overly conservative nature delayed R&D, allowing OpenAI to launch ChatGPT in somewhat of a vacuum, getting out ahead of Google and others and maintaining that lead to this day.

OpenAI’s Four Rs: Restructure, Raise, Release, Resignations
Another busy week for the ChatGPT maker

Restructure: OpenAI is undergoing a significant corporate restructuring, shifting from its current (overly-confusing) non-profit structure to a for-profit benefit corporation. The change aligns with how the company actually runs its business, and indeed may be necessary to attract incremental investment

Raise: Reportedly the restructuring is a pre-requisite as OAI seeks a $150 Bn valuation as part of its ongoing $7 Bn equity raise (led by Thrive Capital). Reportedly, CEO Sam Altman is poised to receive equity for the first time (though he has contested the widely reported 7% figure which would value his stake at $10.5 Bn).

Release: This week also saw the company finally follow through on its promise to release the much-hyped Advanced Voice model for ChatGPT Plus users (announced in May with the release in “coming weeks” reaching meme-like status on the internet). The capability is impressive, certainly better than Gemini Voice. It’s one of those “Aha!” moments with AI when experienced for the first time. We encourage readers to try it out if they have ChatGPT Plus, or check it out here.

Resignation: And of course, what OpenAI coverage would be complete without the announcement of a high-profile resignation (or three)? OAI was forced to announce three more high-profile resignations this week as CTO Mira Murati, Chief research officer Bob McGrew and head of post-training Barret Zoph, all left the company, raising questions about Altman's management style and the potential impact on OpenAI's competitive edge.

What it means: OpenAI's transformation into a for-profit entity reflects the growing commercialization of AI and the practical need for smaller players (yes, compared to Alphabet/Google, Meta, and Microsoft, OpenAI is small) to secure significant capital to continue to build more powerful AI models through investment in hardware and R&D. The restructuring gives OpenAI greater financial flexibility but calls into question the company's commitment to its original mission of developing safe and beneficial artificial general intelligence (AGI) for all of humanity, particularly in light of recent departures of key personnel involved in AI safety and research.

Trust me, I’m a Benefits Corp.: Realistically, it seems foolish to rely on a corporation (even a benefits corporation) to execute on such a goal. Corporations are designed to make money for shareholders, and that’s precisely what OpenAI will do, and has to do, under its fiduciary duty. While Benefit Corps owe a fiduciary duty to other so-called stakeholders as well, given OAI’s approach to date we can expect this to be a secondary priority. Once again, this development underscores the need for open-source models to democratize access to advanced AI. With monied interests lining-up to invest in closed AI, it stands to reason that these players will look to expand and consolidate their influence via new technologies —in a zero-sum game this inevitably comes at the cost of the middle class and below. The only check against this expansion of power is the widespread dissemination of said power to broader society via open source.

Why it matters: The ongoing drama at OpenAI highlights the challenges of balancing rapid growth, corporate governance, and the ethical considerations of developing powerful AI technologies. The company's ability to navigate these complexities will be critical to its long-term success and its ability to maintain its leading position in the AI market. Investors will be closely watching how the restructuring and leadership changes impact OpenAI's product development and its ability to attract and retain top talent in a highly competitive environment.

What if…?: We’ve seen many journalists and tech writers scratch their heads about why senior leaders of OpenAI would be leaving the company if it is just on the cusp of AGI: “If GPT-5 or AGI is just around the corner, why are all these senior people leaving?” they muse. We have a theory. It may be because these senior leaders have figured out that the upcoming models, or possibly even AGI, will so fundamentally disrupt the world as it exists today that they no longer see the need to be employed by OpenAI, even if they will be the one to effect that change. Put another way, they may foresee a world so dramatically different from the status quo that many of the considerations about where they work, how much they get paid, etc. may become trivial in a handful of years. They may be taking this time now to enjoy what remains of the current state before things forever change. We have more thoughts on this topic that we will expand upon in a separate Geekly focusing on the impacts of AGI.

Before you go… We have one quick question for you:

About the Author: Brodie Woods

As CEO of usurper.ai and with over 18 years of capital markets experience as a publishing equities analyst, an investment banker, a CTO, and an AI Strategist leading North American banks and boutiques, I bring a unique perspective to the AI Geekly. This viewpoint is informed by participation in two decades of capital market cycles from the front lines; publication of in-depth research for institutional audiences based on proprietary financial models; execution of hundreds of M&A and financing transactions; leadership roles in planning, implementing, and maintaining of the tech stack for a broker dealer; and, most recently, heading the AI strategy for the Capital Markets division of the eighth-largest commercial bank in North America.

AI Geekly: Pushing the Envelope

Meta's Connect, Gemini at Work, and OpenAI spin the wheel faster...

New Model, New SpecsMeta announces Llama 3.2 and Orion AR glasses at Connect

Google Plays Catch-Up: Gemini at Work, But Is It Enough?GOOG swears people are using Gemini

OpenAI’s Four Rs: Restructure, Raise, Release, ResignationsAnother busy week for the ChatGPT maker

About the Author: Brodie Woods

New Model, New Specs
Meta announces Llama 3.2 and Orion AR glasses at Connect

Google Plays Catch-Up: Gemini at Work, But Is It Enough?
GOOG swears people are using Gemini

OpenAI’s Four Rs: Restructure, Raise, Release, Resignations
Another busy week for the ChatGPT maker