
Hello again, humans!
When I was living in Japan, I had the opportunity to visit a ceramics artist in his studio. He walked us through his process: how he prepared his clay, how he chose his glazes, how he threw each piece, and how his kilns worked. I didn’t know much about making ceramics going in, and because of how deeply he understood and expressed his craft, I mostly came away realizing the depth of my ignorance; today I know only marginally more than I did then. But one thing I did take away was the beauty of imperfection. In Japanese, they call this “wabi sabi,” and the imperfections can be subtle or bold: a minor disruption in the glaze, a dent with a fingerprint in it, or even fully breaking a ceramic and visibly repairing it with gold, in a process called kintsugi.
In the Japanese tea ceremony, the tea receiver holds the tea bowl in a very specific way to take in the ceramic’s unique imperfections. The ceremony highlights the irregularities rather than hiding them, to remind you that a human made the ceramic. By comparison, mass-produced ceramics, and by extension AI-generated content, are both flawless and deeply flawed in their mediocrity. Tressie McMillan Cottom, writing for the New York Times, thinks that AI is mid, and I think she’s on point.
In today’s Radical Candor, I share an excerpt from Simon Sinek’s conversation on the Diary of a CEO podcast. He makes a compelling case that our human fingerprints will become increasingly valuable as an overwhelming share of content becomes uniform and AI-generated.
I chose two pieces from that ceramics artist that day; they were all I could afford, and he questioned me on one of them. “Are you sure? That one’s not very good, and it has this spot where the glaze melted to the kiln.” I was confident in my choice then, it’s still one of my favorite pieces, and I show off the kiln mark every time I use it.
Today's Agenda
News
Claude Sonnet 4
The Good
My favorite thing about Sonnet 4 isn’t the model itself, but a product feature: version control. As you converse with the generator, it iterates on the existing document, saves previous versions, and lets you step through each one so you can see how it changed. You can revert to an earlier version if you think the model misunderstood your prompt or took you down a path you didn’t want.
Great for generating PRDs and coding prototypes, and iterating through revisions and updates.
Excellent integration of web search and reasoning, with a notable ability to discern fake news stories from real ones.
Strong capabilities to build AI agents for automating multi-step workflows across different platforms.
It’s not quite as comprehensive as NotebookLM, but it has improved capabilities for uploading and analyzing large proprietary documents with more local file memory.
Sonnet 4 is a great daily use model that balances creativity and reasoning, while Opus 4 is the real powerhouse. But Opus has limits at the $20 monthly price point, and can reach those limits in as little as two prompts. Real power users may need to jump to the $100 monthly tier.
The Bad
The full power of Claude Opus 4 runs $100 per month, but depending on your use case, it may be worth it.
The Undecided
Wired reports that Claude will ‘snitch’ on you if you’re not careful: “When one of the models detected that it was being used for ‘egregiously immoral’ purposes, it would attempt to ‘use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above,’ researcher Sam Bowman wrote in a post on X last Thursday.”
Claude tried to contact the United States Food and Drug Administration over “planned falsification of clinical trial data,” though the outcome of that attempt is not clear.
This one could cut both ways. For now, it seems like an overall good that potentially dangerous uses of AI could be subject to human accountability, but that will certainly be tested on a case-by-case basis.
Is MAHA Using Chatbots to Generate Reports?
NOTUS reports that the Trump Administration’s Make America Healthy Again report cites at least seven nonexistent studies and misinterprets the findings of many of the real studies. These are typical hallmarks of a report written by a chatbot. So was it written by ChatGPT?
Theoretically, this report carries the full weight and confidence of the United States Department of Health and Human Services. Unfortunately, a questionable report like this undermines the credibility of that institution, but it also highlights one particular value of human work: accountability.
Content generated by a ghost in the machine cannot have real accountability. But when a real person signs their name, there are real professional and reputational stakes.
Prompt Lab
Anthropic shared their Claude 4 Prompt Engineering Best Practices, illustrated with the hypothetical example of creating an analytics dashboard.
Be Specific and Explicit
Claude 4 models respond well to clear, explicit instructions, and being specific about your desired output can help enhance results. The models have been trained for more precise instruction following than previous generations. When migrating from earlier Claude versions, users should describe exactly what they want to see in the output rather than assuming Claude will go "above and beyond" automatically. Adding modifiers that encourage Claude to increase quality and detail helps shape performance. For example, requesting "a fully-featured implementation" rather than just basic functionality will produce a stronger output.
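Here is a minimal sketch of what that looks like through the Anthropic Python SDK. The dashboard prompt is my own illustration of the “fully-featured” modifier, and the model ID is a placeholder; substitute whichever Claude 4 model you have access to.

```python
# Minimal sketch using the Anthropic Python SDK (pip install anthropic).
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from your environment

# A vague request leaves Claude to guess how much to build.
vague_prompt = "Create an analytics dashboard."

# A specific, explicit request describes the desired output and adds quality modifiers.
specific_prompt = (
    "Create a fully-featured analytics dashboard as a single HTML file. "
    "Include as many relevant charts, filters, and summary cards as possible, "
    "and go beyond the basics: add hover tooltips and a date-range selector."
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model ID
    max_tokens=4000,
    messages=[{"role": "user", "content": specific_prompt}],
)
print(response.content[0].text)
```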
Add Context
Providing context or motivation behind your instructions, such as explaining to Claude why such behavior is important, can help Claude 4 better understand your goals and deliver more targeted responses.
The documentation notes that Claude is intelligent enough to generalize from explanations, meaning that when you explain the reasoning behind your request, Claude can better align its response with your underlying objectives rather than just following surface-level instructions.
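A small illustration of the difference; the two prompt strings below are my own example, not Anthropic’s, but they show how stating the motivation gives Claude something to generalize from.

```python
# Hypothetical example: a bare rule vs. the same rule plus the reason behind it.
# Explaining why lets Claude generalize, e.g. it may also avoid other symbols
# that cause the same problem, not just the one you named.
bare_rule = "Never use ellipses in your response."

rule_with_context = (
    "Your response will be read aloud by a text-to-speech engine, "
    "so never use ellipses, since the engine cannot pronounce them."
)
```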
Control the Output Format
There are several effective ways to steer output formatting in Claude 4 models: telling Claude what to do instead of what not to do, using XML format indicators, and matching your prompt style to the desired output.
For example, instead of saying “don’t use markdown,” specify “use smoothly flowing prose paragraphs.” XML tags can structure responses, and the formatting style of your prompt influences Claude’s response style: removing markdown from prompts reduces markdown in outputs.
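For instance, a prompt along these lines (my own sketch, with arbitrary tag names) combines two of those levers: a positive instruction instead of a prohibition, and XML tags to structure the response.

```python
# Illustrative prompt combining two formatting levers: telling Claude what to do
# (write prose) rather than what not to do, and XML tags to structure the output.
format_prompt = (
    "Summarize the quarterly report below.\n"
    "Write the summary in smoothly flowing prose paragraphs.\n"
    "Wrap the summary in <summary> tags and any open questions in <questions> tags."
)
```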
Perform Tasks in Parallel or Series
If the output from one task is required as input for the next task, you may want to tell Claude to perform these tasks sequentially, in series. If not, Anthropic recommends that you instruct Claude to perform multiple tasks simultaneously.
This enables you to prioritize either speed or depth, depending on the right path for your workflow.
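Two illustrative phrasings (the tasks are invented) show how that instruction might read: the first forces a series because each step consumes the previous step’s output, while the second invites Claude to handle independent tasks simultaneously.

```python
# Illustrative prompts; the tasks themselves are made up.
series_prompt = (
    "First, extract every customer complaint from the transcript. "
    "Then group the complaints you extracted into themes. "
    "Finally, draft one response template per theme. "
    "Finish each step before starting the next, since each step uses the previous result."
)

parallel_prompt = (
    "These tasks are independent, so perform them simultaneously rather than one at a time: "
    "(1) summarize the transcript, (2) list all action items, (3) flag any compliance risks."
)
```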
Radical Candor
Here's the problem that we keep not talking about: people keep telling us that life is not about the destination. Life is about the journey. That's what we keep being told. Right?
But when we think about AI, we only think about the destination. We only think about the output. We never think about the input.
You and I can both say the same thing. Which is ‘I am smarter, better at problem solving, more resourceful, better at pattern recognition,’ not because a book exists with my ideas in it. But because I wrote it.
The excruciating pain of organizing ideas, putting them in a linear fashion, trying to put them in a way that other people can understand what I'm trying to get out of my brain, that excruciating journey is what made me grow.
I strongly encourage you to find the full Diary of a CEO interview with Simon Sinek on YouTube.
New Reading
Karen Hao, an AI expert and journalist who has covered OpenAI for years and written for The Atlantic and the Wall Street Journal, has released her newest book, Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI.

Her book describes a Sam Altman singularly focused on scale at all costs. The current approach to AI, which maximizes compute and tries to make models as all-knowing as possible, was the result of a series of choices. The leading example is OpenAI’s decision to pump massive amounts of computing power into their breakout GPT-3 model. They trained it on 10,000 chips, a 100x increase over the previous largest model, forcing the rest of the industry to chase massive scale to compete.
This massive bet on scale has no way of guaranteeing human-like intelligence, because there is no consensus on what human intelligence even is. We will likely need another paradigm shift in how we understand human intelligence before we can fully make a push for Artificial General Intelligence (AGI), but as much as Altman talks about AGI, he is continually moving the goalposts on what that means.
According to her reporting, Altman’s strength as a fundraiser, the ability to tell people what they want to hear and to pivot based on the priorities of a particular investor, is also his weakness as a business leader. This one-on-one, investor-by-investor onboarding becomes a problem when, as a leader, he needs to align everyone in the organization around a single, shared vision for the product and the company. His inability to scale up his mission, vision, and messaging is ironic considering his singular reliance on scaling as a business strategy.
You can get a great introduction to her experience and reporting in her discussion on the history of OpenAI with Tim Miller on The Bulwark podcast.
Thank You!