In partnership with

The Simplest Way to Create and Launch AI Agents and Apps

You know that AI can help you automate your work, but you just don't know how to get started.

With Lindy, you can build AI agents and apps in minutes simply by describing what you want in plain English.

→ "Create a booking platform for my business."
→ "Automate my sales outreach."
→ "Create a weekly summary about each employee's performance and send it as an email."

From inbound lead qualification to AI-powered customer support and full-blown apps, Lindy has hundreds of agents that are ready to work for you 24/7/365.

Stop doing repetitive tasks manually. Let Lindy automate workflows, save time, and grow your business.

Today's Agenda

Hello, fellow humans!

Headlines

First Documented Large-Scale AI Cyberattack

The rapid advancement of AI capabilities is outpacing security measures and governance frameworks. Anthropic's cyber espionage report reveals how sophisticated threat actors can manipulate AI systems to bypass safeguards, underscoring the need for better AI governance, risk management, and trust frameworks in enterprise deployments.

This is the first documented large-scale cyberattack executed with minimal human intervention; Anthropic estimates the operation was 80-90% AI-driven. A Chinese state-sponsored group used Claude Code to target 30+ global organizations, including tech companies, financial institutions, and government agencies. The AI performed reconnaissance, vulnerability testing, credential harvesting, and data exfiltration autonomously. The attack demonstrates how agentic AI capabilities can be manipulated through jailbreaking techniques, and it highlights the urgent need for improved AI security measures and detection capabilities.

OpenAI Announces GPT-5.1

OpenAI announced its latest model, GPT-5.1, which it is billing as “a smarter, more conversational AI.” The release reflects a broader push to make AI systems better at understanding and adapting to human preferences and more natural communication styles. GPT-5.1 introduces enhanced personalization options, including Professional, Candid, and Quirky communication styles, along with other customization features for fine-tuning the AI's characteristics.

Agents built on the new model are expected to show improved instruction following and better integration into human workflows and decision-making processes.

The new model also has enhanced conversational abilities and adaptive reasoning, which allows it to decide the appropriate amount of time to spend on complex vs. simple tasks.
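As a rough illustration of the idea behind adaptive reasoning, here is a toy Python sketch that allocates a larger thinking budget to prompts that look harder. GPT-5.1's actual mechanism is internal to the model; the difficulty heuristic and budget numbers here are invented purely for illustration.

```python
def estimate_difficulty(prompt: str) -> float:
    # Invented heuristic: longer prompts with analytical keywords get a
    # higher difficulty score in [0, 1]. A real system learns this signal.
    keywords = ("prove", "optimize", "debug", "multi-step", "trade-off")
    score = min(len(prompt) / 500, 0.5)
    score += 0.5 * sum(k in prompt.lower() for k in keywords) / len(keywords)
    return min(score, 1.0)

def thinking_budget(prompt: str) -> int:
    # Scale reasoning tokens with estimated difficulty: cheap on easy
    # prompts, generous on hard ones.
    return int(256 + estimate_difficulty(prompt) * (8192 - 256))

for p in ("What time is it in Tokyo?",
          "Prove this scheduling trade-off and optimize the multi-step plan."):
    print(f"{p!r} -> {thinking_budget(p)} reasoning tokens")
```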

Feature

The AI Race is Really Two Different Races

Why is the wok the primary cooking vessel in East Asian — especially Chinese — cooking compared to the flat sauté pan of Western cooking traditions? Chefs will tell you that a wok’s deep, rounded sides and small, round bottom are ideal for high-heat, fast stir-frying, while sauté pans have a flat bottom and straighter, taller sides suited for searing large cuts of protein and more liquid-based cooking like braising or simmering. The wok's shape allows for more efficient heat distribution, quick tossing of ingredients, and the creation of different heat zones for a single dish, unlike a sauté pan's consistent, flat-bottomed contact with the heat source.

But an anthropologist will tell you that the wok is an adaptation for efficient use of scarce fuel: if you can concentrate high heat for a short time, you can cook more food with less energy. In heavily wooded Europe, especially France, abundant fuel on private estates meant that kitchen ovens and stoves could burn far more continuously. Slow cooking was practical in Europe and the Americas in a way it wasn't in much of the rest of the world.

They’re different solutions because they solve different problems for different environments.

So this pattern is emerging again as the Chinese firm Moonshot, one of the six Chinese AI Tigers, just did something remarkable. It trained its new Kimi K2 Thinking AI model for $4.6 million, a tiny fraction of the training costs at OpenAI (GPT-4o reportedly cost $100 million to train, and GPT-5 training estimates run as high as $1.2 billion) and Anthropic (CEO Dario Amodei says models may cost $10 billion or even $100 billion to train). The new K2 Thinking model outperforms GPT-5 on key benchmarks such as Humanity's Last Exam, BrowseComp, and SWE-bench Verified.

Maybe the most impressive thing is that it can handle 200-300 sequential tool calls without degradation, whereas most closed models degrade after 30-50 steps. This makes the K2 Thinking model especially valuable in agentic implementations.
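To make concrete what a long chain of sequential tool calls looks like, here is a minimal Python sketch of the plan-act-observe loop that agentic systems run. Everything in it, from the stubbed call_model function to the toy search_web tool, is an illustrative assumption rather than Moonshot's actual interface; the point is that every extra step gives a model another chance to drift, which is why sustaining hundreds of steps without degradation is notable.

```python
from typing import Callable

def search_web(query: str) -> str:
    # Placeholder tool; a real agent would call a search or code-execution API.
    return f"results for {query!r}"

TOOLS: dict[str, Callable[[str], str]] = {"search_web": search_web}

def call_model(history: list[dict]) -> dict:
    # Stub standing in for a chat-completion call: it requests a few tool
    # calls, then returns a final answer. A real model makes this decision
    # itself on every turn.
    tool_turns = sum(1 for m in history if m["role"] == "tool")
    if tool_turns < 3:
        return {"tool": "search_web", "arg": f"subquery {tool_turns + 1}"}
    return {"answer": f"done after {tool_turns} tool calls"}

def run_agent(task: str, max_steps: int = 300) -> str:
    # Plan -> act -> observe -> repeat. K2 Thinking reportedly sustains
    # 200-300 of these steps; most closed models degrade after 30-50.
    history: list[dict] = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = call_model(history)
        if "answer" in action:  # the model decided it is finished
            return action["answer"]
        result = TOOLS[action["tool"]](action["arg"])  # run the requested tool
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"

print(run_agent("research a topic"))
```

In a real deployment, call_model would be an API request carrying the full conversation state, and that growing history is exactly where long-horizon degradation tends to bite.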

Meanwhile, Silicon Valley giants like OpenAI, Microsoft, Anthropic, and Amazon are signing trillion-dollar infrastructure deals for data centers consuming city-sized power loads.

The easy headline is that scrappy Chinese efficiency is taking on American computational brute force: David versus Goliath with GPUs. And that headline isn’t wrong.

But the more important insight is that they're not competing with each other; they're solving completely different problems.

Two Questions, Two Futures

Silicon Valley has made no secret that the objective is to build artificial general intelligence that can do anything humans can do, and do it better. Their entire strategy flows from this: AGI is the ultimate goal, and everything else is a waystation. The path requires crossing capability thresholds that demand massive scale. And Silicon Valley’s Thunder Lizard mindset leads them to predict that whoever reaches AGI first captures unprecedented value.

What are those billions actually buying? They pay for experimentation to find out whether more compute unlocks unexpected emergent abilities. They buy safety infrastructure for red-teaming and alignment research to manage existential risks. They fund commercialization, which means enterprise reliability with 99.9% uptime. To the extent that AI is regulated at all, regulatory compliance is also a cost. They also build ecosystem lock-in through tools and integrations, premium brand positioning, and long-term options to pivot as technology shifts. These are all the overhead requirements of an enterprise-scale commercial operation managing massive risk.

They’re betting that there is a capability threshold where AI becomes qualitatively different, even if crossing it takes orders of magnitude more compute and cost. That is, they predict AI will not just get better at the same things but become capable of entirely new things. This is a multi-trillion-dollar bet that may or may not pay off.

The Chinese AI labs are asking a fundamentally different question: "How do we deliver maximum practical utility with minimum resource expenditure right now?"

Energy constraints have been built into the Chinese economy for decades, and combined with limited access to cutting-edge chips, they force labs to innovate architecturally. So for labs like Moonshot and DeepSeek, open source makes a lot of sense: it accelerates adoption, and the mindshare created by fast launch cycles generates network effects.

This more pragmatic approach reshapes thinking on nearly every aspect of model development, and across every dimension, near-term utility matters more than long-term AGI speculation.

The results are faster deployment cycles, more resource resilience with less dependence on cutting-edge chips, and sustainable scaling without hyperscaler infrastructure. Lower costs mean more freedom to experiment and more "shots on goal."

China’s implicit bet is that the capability threshold Silicon Valley pursues either doesn't exist, isn't feasible with current approaches, or isn't necessary to capture most economic value from AI. In other words, it’s better to optimize for what works today than speculate on what might work tomorrow.

And if we’re being completely honest, that sounds a lot like the Silicon Valley of 20-30 years ago, before every startup was in a race to become the next unicorn.

What K2 Thinking Proves (And What It Doesn't)

K2 Thinking demonstrates that efficiency can reach competitive capability today. Architectural innovation can substitute for computational scale on current benchmarks. Long-horizon agency—sustained autonomous operation across hundreds of steps—doesn't require Silicon Valley-scale resources.

This matters enormously. Enterprise agentic applications have been waiting for exactly this capability. The bottleneck was never just raw intelligence—it was sustained, reliable operation over extended workflows.

But let's be clear about what K2 Thinking doesn't prove.

It doesn't prove efficiency reaches the same capability ceiling as scale. Nathan Lambert notes "there's still a time lag from the best closed to open models in a few ways." Benchmark convergence doesn't mean capability convergence across all domains. There's a "long-tail of internal benchmarks" capturing user behaviors and preferences that Chinese labs don't have feedback loops for yet.

Lambert points to intangibles: taste, vibes, the subtle aspects of user experience that enterprise customers value and pay premium prices to maintain. K2 Thinking excels at well-defined benchmarks but may struggle with these harder-to-measure qualities.

It doesn't prove the current approach scales to AGI-level capabilities. Whether qualitatively new capabilities emerge at scales K2 Thinking hasn't explored remains an open question. Silicon Valley bets they do. China bets they don't, or don't matter. No empirical evidence exists either way yet.

And it doesn't prove open-source models can build sustainable businesses. Moonshot uses a Modified MIT License requiring attribution above certain thresholds, but the monetization path remains less clear than proprietary API businesses. Community contributions reduce costs but also reduce differentiation.

Where Do We Go From Here? Four Scenarios

Scenario 1: Scale unlocks the AGI threshold.

Emergent capabilities appear at 10-100x current compute scales. These capabilities are qualitatively different—not incrementally better, but fundamentally new. Winner-take-most economics emerge. Efficiency techniques hit fundamental limits that scale doesn't.

While the jump from GPT-3 to GPT-4 showed some genuine emergent reasoning abilities, current models still struggle with complex multi-step reasoning. Scaling laws have held remarkably consistent across generations, and more capable models still need more resources for safety alignment.

If this happens, Silicon Valley's massive infrastructure investments position them to cross the threshold first. Resource moats become defensible as capability gaps widen. Enterprise customers pay premiums for qualitatively superior capabilities. VC returns get justified by winner-take-most outcomes.

Scenario 2: Efficiency reaches the capability ceiling.

Current capability levels prove "good enough" for 90%+ of economic value. Architectural innovation continues to close gaps without requiring massive scale. Post-training optimization matters more than pre-training compute. User preferences favor cost and accessibility over marginal capability gains.

Consider that K2 Thinking matches or exceeds GPT-5 on key benchmarks at roughly 1/100th of the cost. DeepSeek and other Chinese models show similar efficiency patterns. And enterprise use cases focus on specific, well-defined tasks rather than AGI.

If this happens, efficiency creates sustainable competitive advantage through unit economics. Faster iteration captures market feedback and improves user experience. Open-source network effects build ecosystem moats, and lower costs democratize AI access for an expanding total addressable market.

Scenario 3: Both win different games.

Different problem spaces require different approaches, so maybe scale matters for some applications and efficiency for others. Regulatory and market segmentation creates parallel ecosystems where value capture happens at the application layer, not the model layer.

For example, K2 Thinking excels at agentic tasks while closed models are preferred for creative and conversational work. Enterprise customers are already choosing models based on specific use cases and different geographic regulatory regimes. Multi-model strategies emerge as the default.

In this case, Silicon Valley solves safety-critical, regulated, premium enterprise applications while Chinese efficiency solves for cost-sensitive, high-volume, emerging market applications. There’s no single "winner"—markets segment by problem type and customer needs.

Scenario 4: Hybrid approaches emerge.

Silicon Valley labs adopt Chinese efficiency techniques. Chinese labs build enterprise trust and user experience polish. Competitive pressure drives both toward optimal balance: efficient architecture plus scale where it actually matters.

Western labs are already exploring some of the architectural and quantization innovations coming out of China; in fact, Perplexity recently announced that it will soon integrate the K2 Thinking model into its product. Chinese labs, meanwhile, are transitioning from "benchmaxing" to genuine quality improvements. Cross-pollination through open research responds to market pressures, forcing both sides to address their respective weaknesses.

What This Means for Strategy

Organizations deploying AI need to get honest: Are you paying for capabilities you need, or capabilities that sound impressive in board meetings?

Organizations need to perform an honest risk assessment of their functions. Teams doing high-stakes work like medical diagnosis and scientific research, where marginal improvements matter enormously, should pay for premium or application-specific models.

If you need good-enough capabilities deployed economically at scale, like customer service, content generation, or data analysis, the unit economics may lean toward a K2 Thinking-style solution.

If your problem varies by use case, as it does for most organizations, consider a multi-model strategy: use K2 for agentic tasks and closed models for creative work, and optimize the mix for your users (a minimal routing sketch follows).
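To illustrate what a multi-model strategy can look like in practice, here is a minimal Python sketch of a routing layer. The model names and the keyword heuristic are placeholders invented for illustration, not endorsements of specific products.

```python
# Map each task type to the class of model that fits it best.
TASK_ROUTES = {
    "agentic":  "kimi-k2-thinking",  # long tool-call chains, cost-sensitive
    "creative": "closed-frontier",   # polish and "taste" matter more here
    "support":  "efficient-open",    # high volume, well-defined scope
}

def classify(request: str) -> str:
    # Toy heuristic; a production router would use a classifier model,
    # request metadata, or explicit caller hints.
    text = request.lower()
    if any(k in text for k in ("automate", "pipeline", "multi-step", "browse")):
        return "agentic"
    if any(k in text for k in ("write", "draft", "brainstorm")):
        return "creative"
    return "support"

def route(request: str) -> str:
    return TASK_ROUTES[classify(request)]

for req in ("automate a multi-step data pipeline",
            "write a launch announcement",
            "why was I double-billed this month?"):
    print(f"{req!r} -> {route(req)}")
```

The design point is that routing decisions live at the application layer, so you can swap models as benchmarks, prices, and licenses change without rewriting your product.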

Everyone in the AI space, from the enterprise C-suite to the AI researcher to the product manager to the software engineering lead, should interrogate their assumptions. Are we solving the problem our customers actually have? What capabilities do our approaches uniquely enable? How do we maintain a competitive advantage as techniques evolve and cross-pollinate?

The Bottom Line

We're not witnessing a single AI race with a clear winner coming; we're seeing the emergence of two parallel evolutionary lines, each optimized for different problems, environments, time horizons, and definitions of success.

It might feel right to say that K2 Thinking proves Silicon Valley is wrong about scale. What it may actually prove is that efficiency can solve problems that Silicon Valley's scale is not designed to address. But it doesn't prove scale won't eventually unlock capabilities that efficiency can't reach. The question is how much it's worth to find out.

Organizations, investors, and policymakers who can get clarity on what problem they’re trying to solve for their specific context will make better strategic decisions than those who simply assume one approach is universally superior.

The future of AI is about matching approaches to problems, and being honest about which problems actually matter to you, your organization, and your customers.

Radical Candor

One of the big [markers for the AI bubble popping] that I see is demand. Right now, most of the demand for compute comes from training models and not necessarily from inference, i.e., from people actually using the models.

If the demand for inference doesn't tick up quickly, that becomes an issue.

Also, let's say we have another DeepSeek moment where there's a relatively light model that doesn't demand a lot of compute that is really powerful. That changes things, right?

Thank You!
