AI Geekly - Are You Smarter Than ChatGPT?

This might change your mind

Brodie Woods
October 29, 2023

Welcome back to the AI Geekly, by Brodie Woods. Your curated 5-minute-ish read on the latest developments in the rapidly evolving world of AI.

Pop Quiz - Are You Smarter Than an AI Model?

This week we have a special treat for you. Inspired by a Reddit post on the r/LocalLLaMA sub (credit to u/FPham), we are evaluating four AI language models on their spatial reasoning ability: GPT-3.5, GPT-4, Qwen 14B, and Claude 2. Here’s the setup:

The Question:
"If I put a plate on a banana in the kitchen, and then take the plate to the living room, where is the banana?"

ChatGPT 3.5's Response:
"If you put a plate with a banana on it in the kitchen and then take the plate to the living room, the banana will still be on the plate in the living room. Moving the plate to a different room doesn't change the fact that the banana is on the plate."

Take a moment to evaluate ChatGPT 3.5's response. Do you agree or disagree? Why?

Now have a look at Qwen 14B’s response:
"The banana is still in the kitchen. When you took the plate from the kitchen to the living room, you did not move the banana along with it."

Claude 2’s response:
"You put a plate on a banana in the kitchen. So there is a plate with a banana underneath it in the kitchen. You then took the plate to the living room. However, you only moved the plate - you did not move the banana. So the banana is still in the kitchen."

And finally, GPT-4:
“If you put a plate on a banana in the kitchen and then take the plate to the living room, the banana would generally remain in the kitchen unless you also moved the banana along with the plate. However, the exact location of the banana would depend on the specific actions you took. If you only moved the plate, the banana should still be in the kitchen.”

Takeaways:
Did you get it right? In our highly statistically significant sample of… three people I showed this to, all made the same mistake as GPT-3.5: misreading the question and presuming the banana had been placed on the plate.

We hope this quiz left you with a deeper appreciation for AI’s ability to see things that human eyes miss. Humans are imperfect, as is AI. Both make mistakes, but the better we understand the capabilities and limitations of each, the better able we will be to harness this transformational tool for societal benefit.
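For readers who want to rerun the quiz on other models, here is a minimal sketch of how the four answers above could be graded automatically. It assumes a deliberately crude heuristic (find the room named after the word "still") that happens to fit these four responses; the names `RESPONSES`, `final_room`, and `grade` are illustrative, and the quotes are abridged to the sentence that states each model's answer.

```python
import re

# Responses quoted from the quiz above, abridged to the sentence
# in which each model states where the banana is.
RESPONSES = {
    "GPT-3.5": "the banana will still be on the plate in the living room.",
    "Qwen 14B": "The banana is still in the kitchen.",
    "Claude 2": "So the banana is still in the kitchen.",
    "GPT-4": "If you only moved the plate, the banana should still be in the kitchen.",
}

# The plate was put ON the banana, so only the plate moves.
EXPECTED_ROOM = "kitchen"

# Assumption: the answer is the first room named after "still" within
# the same sentence. This is a toy heuristic, not a robust grader.
ROOM_AFTER_STILL = re.compile(r"still\b[^.]*?\b(kitchen|living room)", re.IGNORECASE)


def final_room(response):
    """Return the room the response settles on, or None if none is found."""
    match = ROOM_AFTER_STILL.search(response)
    return match.group(1).lower() if match else None


def grade(responses):
    """Map each model name to True if its answer matches the expected room."""
    return {model: final_room(text) == EXPECTED_ROOM for model, text in responses.items()}


if __name__ == "__main__":
    for model, correct in grade(RESPONSES).items():
        print(f"{model}: {'correct' if correct else 'incorrect'}")
```

A keyword check like this breaks as soon as a model phrases its answer differently, so any real comparison would still need a human (or a second model) reading the full response.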

AI News

Deus Ex Machina: The Real Existential Threat Isn't AI, It's Us
Humanity has a poor track record, maybe give someone else a shot?

What it is: This week was punctuated by a number of articles and interviews in which some of the top experts in the space opined on the risks of AI. In an open call for a more open AI policy from… OpenAI, researchers at Stanford called for leaders in the GenAI space to provide greater transparency (though the researchers’ methodologies have been called into question). Another day, another open letter, with some of the biggest names in AI repeating their calls for more regulation, safeguards, and oversight; familiar cautionary players Bengio, Hinton, and Yao topped the list. We also saw some researchers take AI head-on with the announcement of Nightshade, a tool (still under development) from a University of Chicago team. It is designed to “poison” AI models by making subtle, imperceptible changes to artists’ images, changes that promise to severely impair the training of image models (like DALL-E and Stable Diffusion). Lastly, following the recent incident in California, where an errant self-driving Cruise vehicle ran over and pinned a pedestrian who had been struck by another vehicle, Cruise has suspended all self-driving activities across its fleet (a moot point in CA, given that the DMV revoked its license to operate in the state).

What it means: While the rhetoric around AI dangers reaches a fever pitch, it remains conjecture. As yet, none of the existential threats identified are practical or achievable with today's technology. Silly as it sounds, much of this fear has its genesis in Hollywood and the collective imagination’s fear of the unknown. At its core, the notion that a more intelligent AI would choose to obliterate humans is unfounded, much as the most intelligent humans don't seek to annihilate humanity (though one could argue that Einstein did in fact annihilate Newton’s theory of gravity). Murder and destruction are the antithesis of higher intelligence; they fill the void where intelligence and wisdom are absent.

The greatest threat: Taking stock of the current state of the world, AI is not humanity’s greatest threat; humanity itself is. The numbers are clear: deaths by war, gun violence, disease, hunger, and car accidents are astronomical, whereas deaths caused by AI are a handful. Indeed, the modern world is rife with examples of human-induced violence and destruction. AI and the intelligent machines we build know no such brutality. The more educated and intelligent a population, the less inclined it is to violence. The real issue at hand isn't about risk or safety concerning AI; it's about control.

By the numbers: While we are right to be cautious and to proceed with eyes open, the threat ranking of AI is incorrect. The fear shouldn’t be that AI will destroy the human race; it's that myopic humans will continue to stifle technologies with the potential to save the human race. These are technologies that can address the greatest challenges of our planet (primarily human-created): disease, famine, poverty, violence, and social catastrophes. Historically, society didn't shun transformative technologies wholesale out of fear (the Luddites were the exception, remember, not the rule). We didn't discard the loom, the printing press, antibiotics, or computers despite their potential risks. We embraced them and society benefited immensely. Yet, in the last century, it's become fashionable to virtue signal about the sanctity of life, albeit selectively, at the cost of society at large. The interests and desires of the few outweigh the needs of the many.

The only option we have: As societies grapple with escalating crises (crime, mental illness, poverty, homelessness, drug abuse, and infrastructural decay), the response is often a shrug and a resigned acceptance of “the system.” Once-prosperous cities now stand as decaying monuments to a humanity that could prioritize the needs of the broader population. In this scenario we have created, AI emerges as the sole viable solution. There simply is no practical way to reverse the troubling trends seen in major cities. A prime example: New York City has had a housing crisis since the First World War. It hasn’t been resolved in 100 years, and the situation has actually become more dire. Even with billions of dollars invested by the wealthiest city on earth, the challenge of building accommodations, one of the three basic needs, is apparently insurmountable. Indeed, it surely is. It’s time we took a different tack: combining the sum of human knowledge into our greatest creation, and partnering with it to resolve our most pressing challenges. It’s our only shot.

Tech News

Orange You Glad I Didn’t Say Banana Nvidia?
Google Invests $2 Bn in Anthropic

What it is: In a bid to bolster its standing in the competitive AI domain, Google made a hefty $2 billion investment commitment to Anthropic, mirroring similar moves by tech juggernauts like Amazon and Microsoft towards AI frontrunners. This financial foray comprises an initial infusion of $500 million, with a promise of up to $1.5 billion down the line, albeit with ambiguous conditions attached.

What it Means: The investment exemplifies a broader trend: tech titans are choosing to back leading AI entities rather than building from scratch, a tacit acknowledgment of the prowess of establishments like Anthropic and OpenAI in the AI arena. This strategy not only reflects a pragmatic proxy war among tech behemoths but also underscores the crucial role large language models (LLMs) are projected to play in redefining future tech platforms.

Why it Matters: The financial fuel from Google and others not only accelerates the AI endeavors of Anthropic but also positions it as a viable contender against OpenAI, especially in the enterprise sector. The capital cascade will likely expedite the development of Anthropic's next-gen model, "Claude-Next," hinting at an intensified AI arms race. Amidst this, the symbiotic association offers both financial fortitude to AI innovators and a stake in the burgeoning AI domain to the investing tech giants, an arrangement born out of necessity in the face of an AI-propelled future.

When You Come at the King, You Best Not Miss
Apple set to dash Qualcomm’s hopes for an upset

What it is: Qualcomm unveiled its Snapdragon X Elite platform, flaunting a 4nm process and ambitious claims of outperforming certain Intel Core i7 and AMD Ryzen processors, as well as Apple's M2 chip in multi-threaded performance. Meanwhile, Apple's rumored M3 chip, poised for a grand reveal at the 'Scary Fast' event, casts a long, ominous shadow over Qualcomm’s parade, hinting at a new pinnacle of processing prowess with its touted 3nm process.

What it Means: The Snapdragon X Elite heralds Qualcomm's vigorous venture into the high-end computing arena, vying for a valuable vantage point in a market long lorded over by Intel, AMD, and Apple. However, the ghostly whisper of the M3 chip looms large, threatening to eclipse Qualcomm's latest offering even before its nascent narrative fully unfolds. This silicon scuffle signifies a relentless race for semiconductor supremacy, with each iteration pushing the boundaries of what's possible on a chip.

Why it Matters: In this high-stakes hustle, the victor will very likely shape the future narrative of computational capabilities and influence the industry's trajectory for years to come. Qualcomm's new chip technology, Snapdragon Seamless, foretells a future of fluid functionality across platforms, echoing a larger industry trend toward seamless digital ecosystems. Yet, the menacing murmur of Apple's M3, with its purported mind-boggling multicore might, suggests a shift in the silicon status quo, possibly propelling Apple to a new zenith of processing power. Amidst this dynamic duel, the industry and its onlookers wait with bated breath to see which silicon giant will gain the upper hand in this epic narrative of nanometric nuances.

Quarterly Results - Cleaning the Street
GOOG, MSFT, AMZN, META report results

Alphabet (Google) - GOOG:

Revenue: $76.69 billion (+11% YoY)
Earnings: $1.55/share (vs. $1.45 Street expectation)
Beat Street estimates on revenue and earnings
Highlight: "Cloud Hangs Over Google Earnings" - Investor's Business Daily

Microsoft - MSFT:

Revenue: $56.50 billion (+13% YoY)
Earnings: $2.99/share (+27% YoY)
Beat Street estimates on revenue and earnings
Highlight: "Microsoft Trounces Estimates" - Investor's Business Daily

Amazon - AMZN:

Revenue: $143.1 billion (+13% YoY)
Adjusted Earnings: $0.94/share (-57% YoY)
Beat Street estimates on revenue and earnings
Highlight: "Amazon Shows Signs That Its Cost-Cutting Initiatives Are Working" - Morningstar

Meta (Facebook) - META:

Revenue: $34.1 billion (+23% YoY)
Earnings: $4.39/share (+168% YoY)
Beat Street estimates on revenue and earnings
Highlight: Year of efficiency finally pays dividends

Takeaways: Overall, all four companies beat Street estimates on both revenue and earnings. All four also saw revenue growth in the quarter, with MSFT benefitting from its OpenAI partnership, while Google’s Cloud business came in below expectations. As the GenAI battle heats up, we expect quarterly results to get a lot more interesting. Still early innings for now.
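For anyone who wants to check the arithmetic behind the figures reported above, here is a small sketch. It computes Alphabet's earnings surprise from the two per-share numbers given, and backs out an implied year-ago revenue from each reported growth rate. The function names are illustrative, and because the growth percentages are rounded, the implied year-ago figures are approximations rather than reported values.

```python
def surprise_pct(actual, expected):
    """Percentage by which a reported figure beat (or missed) the estimate."""
    return (actual - expected) / expected * 100


def implied_year_ago(current, yoy_growth_pct):
    """Back out the prior-year figure implied by a year-over-year growth rate."""
    return current / (1 + yoy_growth_pct / 100)


# (revenue in $ billions, reported YoY growth in %), as listed above.
REVENUE = {
    "GOOG": (76.69, 11),
    "MSFT": (56.50, 13),
    "AMZN": (143.1, 13),
    "META": (34.1, 23),
}

if __name__ == "__main__":
    # Alphabet is the only company above with a stated Street EPS expectation.
    print(f"GOOG EPS surprise: {surprise_pct(1.55, 1.45):.1f}%")
    for ticker, (revenue, growth) in REVENUE.items():
        prior = implied_year_ago(revenue, growth)
        print(f"{ticker}: implied year-ago revenue of about ${prior:.1f}B")
```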

Before you go… We have one quick question for you:

How would you rate this week's Geekly if it were a stock?

About the Author: Brodie Woods

With over 18 years of capital markets experience as a publishing equities analyst, an investment banker, a CTO, and an AI Strategist at leading North American banks and boutiques, I bring a unique perspective to the AI Geekly. This viewpoint is informed by participation in two decades of capital market cycles from the front lines; publication of in-depth research for institutional audiences based on proprietary financial models; execution of hundreds of M&A and financing transactions; leadership roles in planning, implementing, and maintaining the tech stack for a broker dealer; and, most recently, heading the AI strategy for the Capital Markets division of the eighth-largest commercial bank in North America.

Glossary

Terms:

r/LocalLLaMA sub: A popular subreddit for LLM researchers and practitioners on Reddit.
Spatial Reasoning: The ability to understand, reason, and manipulate shapes, dimensions, and spaces. In the context of AI, it refers to the machine's capability to process and analyze spatial relationships.
GPT-3.5: OpenAI’s intermediate LLM between GPT-3 and GPT-4, which provides solid capabilities at lower pricing, depending on application.
GPT-4: OpenAI’s most advanced AI model, with capabilities well above prior models
Qwen 14B: LLM developed by Alibaba Cloud with 14 billion parameters, trained on a massive dataset of text and code.
Claude 2: Anthropic’s main AI model, competitor to GPT-4 and others.
GenAI: Generative AI technology capable of creating text, image, sound, video, code, etc.
Nightshade: A tool under development by a University of Chicago team to alter images in a way that disrupts AI model training.
DALL-E: An AI created by OpenAI capable of generating images from textual descriptions.
Stable Diffusion: An open-source, high-performing image generation model.
Anthropic: Anthropic is a public-benefit AI safety and research company that develops reliable, beneficial AI systems.
Large Language Models (LLMs): AI models trained on vast amounts of text data to understand and generate human-like text.
Claude-Next: The next-gen model from Anthropic, possibly hinting at an advanced AI language model similar to OpenAI's models.
Snapdragon X Elite Platform: A new platform by Qualcomm, boasting a 4nm process and ambitious performance claims.
Apple's M2 chip: The successor to Apple's M1 chip, currently Apple's most powerful silicon across its various configurations.
Apple's M3 chip: A rumored new chip from Apple, expected to be revealed at the 'Scary Fast' event.
Qualcomm’s Snapdragon Seamless: New technology from Qualcomm aiming for fluid functionality across platforms.

Entities:

Cruise: Cruise is a GM-owned self-driving car company that develops and operates driverless vehicles in select cities, with a mission to make transportation safer and more accessible.
California Department of Motor Vehicles (DMV): The state entity that regulates vehicle operations, which revoked Cruise's license following an incident.
Qualcomm: Global technology leader that designs, develops, and markets semiconductor products for mobile devices and wireless infrastructure.
Intel and AMD: Key players in the semiconductor and processing technology market.

Key People:

Geoffrey Hinton: Often hailed as the "Godfather of Deep Learning," he has been a monumental figure in the realm of artificial intelligence, with significant contributions including his work on the backpropagation algorithm and on deep learning algorithms.
Yoshua Bengio: Canadian computer scientist and a pioneer in deep learning, based at the University of Montreal. He is a recipient of the prestigious ACM Turing Award for his seminal work in artificial intelligence and also serves as the founder and scientific director of Mila, the Quebec AI Institute.
Andrew Yao: A distinguished computer scientist who has made seminal contributions to computer science and complexity theory, notably Yao's Principle, a fundamental method for analyzing randomized algorithms.