Cat Wu on Rapid AI Product Development at Anthropic

Cat Wu shares insights on the accelerated product development pace at Anthropic, emphasizing the importance of clear goals and collaboration in AI innovation.

Introduction

In a landscape where most companies release new products quarterly, Anthropic has compressed its release cycle to daily iterations. Behind this rapid pace is Cat Wu, a Chinese-American woman born in the 1990s who rose from engineer to product lead for Anthropic’s flagship products, Claude Code and Cowork. She is not only driving the evolution of this generation of AI products but has also interviewed hundreds of aspiring product managers in the AI field, witnessing firsthand who succeeds and who falls behind.

Cat Wu, whose full name is Catherine Wu, has a background in both engineering and venture capital. She graduated with a degree in computer science from Princeton University and held positions at Scale AI, Dagster, and Index Ventures before joining Anthropic in August 2024. In July 2025, she and colleague Boris Cherny were recruited by the AI programming startup Cursor but returned to Anthropic shortly after, taking over the Claude Code product line.

Accelerated Product Development

“We have shortened the development cycle for many product features from six months to one month, and sometimes even just one day,” Cat Wu stated in a recent in-depth interview. This rapid pace has been a consistent state at Anthropic for several quarters. “Internal models have improved efficiency, but more importantly, it’s about the processes and team expectations. We strive to minimize processes, removing all obstacles to release, so everyone feels they can turn an idea into a product in a week or even a day.”

When prioritizing product features, the team focuses on one mission: to bring safe AGI to all of humanity. “If Claude Code fails but Anthropic as a whole succeeds, I would be very happy. The entire team is willing to make decisions based on this mindset,” she noted. Interestingly, Cat pointed out that during new model releases, the most significant changes often come from “deleting features” that were originally added to compensate for the model’s limitations.

Regarding the previous Claude Code source code leak, she revealed, “This was a human error,” and the involved employee still works at the company. “This is a process issue; the most important thing is to learn from it and increase protective measures, which is what we are currently doing.”

Insights on Product Management

Host: I want to start with your role, especially your collaboration with Boris. Everyone knows Boris, who created Claude Code and leads the team, submitting countless PRs daily. I feel like you don’t get enough recognition for your contributions to Claude Code, Cowork, and everything you’re doing. Can you explain your role in the team and how you collaborate with Boris?

Cat Wu: I feel very fortunate to work with Boris; he is a fantastic thought partner. He is our technical lead and the visionary behind the product, skilled at defining what the product should look like in three to six months, even envisioning the “full AGI version” of the product.

My focus is more on the path from now to that three-to-six-month vision. I spend a lot of time on cross-team collaboration: making sure marketing, sales, finance, and compute teams are aligned with the plan and moving in the same direction, and making sure features are actually ready rather than stuck at the release stage. In some ways we collaborate very well because our thinking is almost fused. But the boundaries are quite blurred: roughly 80% of the work overlaps, and of the rest, there is a slice I particularly care about and lead, and a slice he cares about more and leads.

Host: You mentioned that you have been interviewing a lot of PMs. If I received a dollar for every referral I made for someone to become a PM at Anthropic, I’d probably have 300 billion ARR by now. It’s one of the most sought-after companies, so I can imagine how many people you’ve interviewed. You said many people misunderstood what it means to be a successful AI product manager. Can you share the issues you’ve observed and what skills are needed to succeed?

Cat Wu: Before AI, the pace of technological change was relatively slow. You could plan on a six to twelve-month cycle, and because feature releases were also slow, there was a strong emphasis on collaboration with other teams to ensure their features could unlock your path, as writing code itself is expensive. But now, AI has significantly increased engineering efficiency. With rapid improvements in model capabilities, the development cycle for many product features has shortened from six months to one month, then to a week, and sometimes even a day. In this context, we need to push products out faster.

This means that as a PM, you should no longer focus on aligning roadmaps across multiple quarters but rather think about how to get things done in the quickest way possible. How can you deliver an idea to users within a week? The best PMs in AI-native products are those who can drastically shorten the time from idea to user while clearly defining which core tasks must be ready to go out of the box.

Host: I like what you said; many people still don’t realize how fast the pace is and how much of the work is about “helping teams accelerate.” How do you help the team move so quickly?

Cat Wu: The first thing is to set clear goals. Because large models are inherently general, they bring a lot of ambiguity: who are we making products for? What problems are we solving? What are the most important use cases? A good PM can clarify these, such as: our core users are professional developers; this feature addresses the issue of too many permission pop-ups causing fatigue; our goal is to allow developers in enterprises to implement “zero permission pop-ups” safely. This makes the goals clear and automatically excludes many unnecessary solutions.
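As a concrete illustration of what "zero permission pop-ups" can look like in practice: Claude Code supports checked-in settings files with permission allow and deny rules, so an enterprise can pre-approve known-safe tool calls. The specific rules below are invented for this sketch, not Anthropic's actual defaults:

```json
{
  "permissions": {
    "allow": [
      "Bash(npm run lint)",
      "Bash(npm run test:*)",
      "Read(src/**)"
    ],
    "deny": [
      "Bash(curl:*)"
    ]
  }
}
```

With a file like this committed to the repository, routine commands run without prompting while explicitly denied ones are blocked outright, which is one way to trade pop-up fatigue for an auditable policy.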

Second, establish a reusable release process. For example, in Claude Code, we release almost all features in the form of “research previews.” We clearly tell users this is an early product, just an idea, still collecting feedback, and may not be supported long-term. The benefit of this approach is that it lowers the commitment cost, allowing us to quickly launch something in one or two weeks. Third, we create a collaboration framework for the team, letting everyone know when to pull in cross-functional teams and what their expectations are.

We have very tight processes between engineering, marketing, and documentation: once engineers feel a feature is ready and has completed internal use, it goes to a release channel, and the documentation, PMM, and developer relations teams immediately follow up, allowing an announcement to be made the next day. This process reduces release friction, and one of the PM’s responsibilities is to build this system.

Host: What role does the PRD play in this system? You mentioned that goals are important; do you still write PRDs or just simple bullet points? How has this evolved in the AI era?

Cat Wu: We mainly do two things. First, we have very strict data metrics, reviewing them weekly with the entire team to ensure everyone deeply understands the business aspects: what the core goals are, how trends are moving, and what the driving factors are. Second, we have a set of team principles, including who the core users are and why. This is to ensure everyone understands how the business operates, what is important, and what can be sacrificed, allowing for autonomous decision-making instead of being bottlenecked by PMs. For particularly ambiguous features, we still write a one-page document outlining the goals, ideal use cases, and current failure modes that need addressing. Of course, some projects, especially those involving heavy infrastructure, do take months, and in those cases we still write complete PRDs.

Host: I want to delve deeper into how you can move so quickly. I’ve never seen a release pace like Anthropic’s, with significant features coming online almost daily. Recently, you developed a model called Mythos, which is still in preview. It’s so powerful that people are a bit concerned about its capabilities. Are you using it internally, and is that one of the reasons for your speed?

Cat Wu: We have been fast for several quarters, so it’s not solely due to Mythos. It is indeed very powerful, and we do use the model internally, which has improved some efficiency, but that’s not the main reason. The more critical factors are the processes and team expectations. We strive to minimize processes, removing all obstacles to release, so everyone feels they can turn an idea into a product in a week or even a day.

Host: That’s amazing; having the strongest models while developing products is a hard advantage to replicate.

Cat Wu: We are indeed fortunate to have access to these cutting-edge models.

Overlapping Roles of Engineers and PMs

Host: Recently, there was an incident where Claude Code’s source code was leaked about a week ago. Can you explain what happened?

Cat Wu: We conducted an investigation immediately after noticing it. This was a human error. At the time, someone was using Claude to write a PR, which was an update to the release process, and it went through two layers of human review. Ultimately, it was a human mistake, and we have strengthened our processes to ensure it doesn’t happen again.

Host: Is that person still with the company?

Cat Wu: Yes, they are. This is a process issue; the most important thing is to learn from it and increase protective measures, which is what we are currently doing.

Host: Another issue is OpenClaw. Recently, you restricted the use of Claude subscriptions to run OpenClaw, and the community reacted strongly, with many feeling this harms the open-source community. What’s your take?

Cat Wu: We have indeed seen very high demand for Claude, so we have been working hard to expand our infrastructure while optimizing token usage efficiency to allow for longer usage. However, this product was not originally designed for third-party products; their usage patterns differ significantly from our first-party products. We have also spent a lot of time considering how to make a smooth transition, such as providing additional credits to subscription users. But ultimately, we made a tough decision: to prioritize supporting our first-party products and APIs, which is the context for this decision.

Host: This makes sense to me. Your $200 monthly subscription is essentially unlimited, but the computing costs are high, and the company still needs to make a profit; it’s not feasible to keep subsidizing. Returning to the PM team, what is your team structure like? How many PMs do you have?

Cat Wu: We currently have about 30 to 40 PMs divided into several teams. There is a research PM team responsible for collecting user feedback on models and passing it to the research team while also participating in model releases; a Claude Developer Platform team that maintains the Claude Code API and releases capabilities like hosted Agents; a Claude Code team responsible for the core products of Claude Code and Cowork; an enterprise team that makes these products easier for enterprises to adopt, focusing on cost control, permission management, security, etc.; and a growth team responsible for the growth of the entire product line, with whom we closely collaborate on Claude Code and Cowork.

Host: Speaking of growth, Amole recently appeared on our podcast. He mentioned an interesting but rarely discussed point: there’s a general feeling that fewer PMs will be needed in the future, with some saying, “Why do we need PMs when engineers can release on their own?” But his view is the opposite: because engineers are moving so fast, PMs and designers are being “squeezed out,” and with new features coming online daily, it’s hard to keep up. So he believes we actually need more PMs. What’s your perspective? Do you think PM hiring will increase in the future? How will this profession evolve in the long term?

Cat Wu: I think various roles are merging. PMs are doing some engineering tasks, engineers are doing PM tasks, and designers are doing both PM and coding. You can choose to hire more engineers with a strong product sense or keep the number of engineers constant and add more PMs to guide their work. In our team, we prefer to hire engineers with a strong product sense. This reduces the “friction cost” in the product release process. For example, we have many engineers who can go from seeing user feedback on Twitter to launching a product within a week, with minimal PM involvement. I think this is actually the most efficient way.

So I believe the boundaries between engineers and PMs are overlapping. Regardless of which type of person you add, it will bring value. However, I think “product sense” remains a very scarce skill, and whenever we see someone particularly strong in this area, we are very eager to hire them.

Host: You were originally an engineer, right?

Cat Wu: Yes, I was an engineer for many years. Then I briefly worked in venture capital before joining Anthropic. In fact, almost all PMs in our team are either from engineering backgrounds or have written code on Claude Code. I think this helps build trust within the team and allows us to move faster. Many of our designers were also former front-end engineers.

Host: This leads to a key question: as these roles merge, many people wonder which skills will be most valuable in the future if I come from an engineering, product, or design background. In your case, engineering skills are clearly important. But in other companies, would a design background transitioning to PM be more advantageous?

Cat Wu: I still believe the core lies in “product sense.” As coding becomes cheaper, the more valuable skill becomes deciding “what to write.” For example, what is the best user experience for this feature? How can we make users feel most satisfied?

We receive thousands of GitHub issues daily, and users suggest everything. At that point, strong judgment and taste are needed to decide what is worth doing and how it should be done. This ability can come from any background, but it is the most important. I think engineering backgrounds will be particularly valuable in the coming months because they help you assess the feasibility of implementing something, which often affects prioritization. For example, if a feature is easy to implement, it might not require much discussion, and you can just spend an hour to get it done; but if it’s complex, you’ll realize it’s costly, which will affect decision-making.

Sacrificing Product Consistency

Host: You mentioned that in the coming months, skills will change rapidly, making it hard to predict how things will be. What will humans continue to value in the short term?

Cat Wu: I think the most important thing is “first principles thinking.” You need to understand how the technical environment is changing, what the team truly needs you to do, and proactively fill that gap. Work is becoming increasingly ambiguous, and an excellent PM should be able to see all the gaps, prioritize them, and either learn new skills or use existing abilities to solve problems. Therefore, what’s most in demand now is someone who can switch between multiple roles, is willing to take on various tasks, and doesn’t care much about titles.

Host: I love this answer. I’ve been asking cutting-edge professionals like you a question: before humans reach superintelligence, where does the value of the human brain lie? Listening to you, the core is in choosing topics, judging directions, prioritizing, and determining whether something is “right.” Is there anything to add?

Cat Wu: I think humans still have an advantage in “common sense.” A product launch involves thousands of details, with many potential pitfalls. Models are currently not very good at understanding who all the stakeholders are, their relationships, preferences, and how to communicate with them. These are more about “tacit knowledge,” similar to emotional intelligence, which remains very important. Of course, we hope models become stronger in this area, but there is still a gap.

Host: In such a rapidly changing environment, how do you maintain your sanity? It feels like being in the eye of a tornado.

Cat Wu: I think our team enjoys the chaos. We face challenges with a smile because there are always many things to deal with and many risks. If you get anxious about everything, you’ll quickly burn out. We prefer to find those who see difficulties and say, “This is hard, but I’m excited to solve it.” They do their best, accept imperfection, but can sleep soundly knowing they’ve done their best.

Host: This is also an important ability. Some say this is the most normal things will ever be, and they will only get crazier.

Cat Wu: It will indeed get increasingly difficult. Sometimes on a Sunday night, there’s a P0 issue, and by Monday morning, there’s an even more severe one, and by the afternoon, there might be something even crazier, making you feel that yesterday’s issue was nothing. You just have to accept that what you can do is limited. You need to ensure you get enough sleep to make good decisions the next day. At the same time, prioritize extremely, focusing on the most important things, and accept that some things won’t be done well. For instance, some of our products may not be polished enough upon launch, but as long as it doesn’t affect core user value, it’s acceptable because we will quickly gather feedback and fix it in the next iteration.

Host: It sounds like that scene in “Pirates of the Caribbean” where the ship is about to explode, and someone is still elegantly walking downstairs. The people I’ve encountered at Anthropic do seem very calm and optimistic.

Cat Wu: Without this state, it’s easy to burn out. We also tend to hire those who have experienced many ups and downs in the industry; they know what brings them energy and how to maintain their state over the long term.

Host: In this trend of role merging, what might we lose? For example, career paths, design consistency, code quality?

Cat Wu: We will indeed sacrifice some “product consistency.” When code was expensive, you would meticulously plan the entire product system: each product’s positioning, use cases, and how they fit together, usually mapping one scenario to one product. But now AI is developing too quickly, and we need to test many ideas, so sometimes features overlap. Often, this is because we internally like two different forms at the same time and hope users will tell us which is better. But this can confuse new users: they don’t know the best path to complete a task. This means we need to do more user education to help them understand core functions and best practices.

Another issue is that users may feel they can’t keep up. In the past, you would only have an update once a month or even a quarter, and not looking at it was fine. But now these tools develop so quickly that many people check Twitter daily for the latest updates. We are also thinking about how to make users less anxious, hoping that when they open the tool, it can guide and teach them rather than making them feel like they are on an ever-faster treadmill.

Host: I noticed you recently launched an interesting feature called /powerup, which helps users understand the best ways to use Claude Code. Is this to address this issue?

Cat Wu: Yes, that’s the idea. Initially, we were hesitant to create such onboarding because we felt the product should be intuitive enough not to need a tutorial. But later, we realized there were too many features, and users were eager for a built-in guide to tell them what the top ten most important features were among hundreds. So we adjusted our previous philosophy and added this feature.

Anthropic’s Growth and Mission

Host: Anthropic has experienced remarkable growth over the past few years. Initially, it was quite behind, had little funding, and lacked distribution channels, with OpenAI far ahead, and many thought there was no chance. But now, your growth is astonishing. From an internal perspective, what do you think has been the key to success?

Cat Wu: I think the two most important factors are a highly unified sense of mission and the ability to make quick decisions based on that mission. We hire people who genuinely care about “bringing safe AGI to all of humanity.” And this is not just a slogan; we repeatedly reference this mission when making product decisions. By placing the mission above any single product, we can make rapid decisions and execute uniformly across the organization. This is quite rare in a company of our size.

Host: Just to confirm my understanding, you prioritize “safety alignment (ensuring AI is beneficial to the world)” as the primary mission. As long as this mission is clear enough, many decisions become easier to make. For example, when two priorities conflict, you look at which aligns better with Anthropic’s mission and prioritize that. Once a decision is made, everyone supports it.

Cat Wu: Sometimes this also means that, for example, we want to release a certain feature on Claude Code but find something more important, so we lower the priority of that feature and postpone it for later.

Host: This is interesting. I think it also explains the difference between you and another company, OpenAI, which has done many different things. Your logic is: we won’t do social networks, and we won’t do information streams because these don’t align with our mission. This restraint allows Anthropic to maintain focus, which seems to be one of the key factors for success.

Cat Wu: When I talk about “mission,” I mean placing Anthropic’s goals above any individual, any single product. To me, our second-best trait is actually “focus,” but mission and focus are still somewhat different. The mission means the team is willing to make sacrifices, even if it impacts their goals or KRs, as long as it serves Anthropic’s overall goals and KRs. And everyone is willing to make such trade-offs. For example, if Claude Code fails but Anthropic overall succeeds, I would be very happy. The entire team is also willing to make decisions based on this mindset.

Host: This question may be sensitive, but do you think decisions like those regarding OpenClaw also fall under this logic? For instance, this direction didn’t push Anthropic’s mission, so it had to be stopped?

Cat Wu: I think it’s very important for Anthropic to expand the user base we can reach. One way to achieve this is through Claude subscriptions and our first-party products. So we are very determined to double down on these directions, but this sometimes does come at the expense of third-party products.

Claude’s Internal Skills

Host: We just mentioned products like Claude, Cowork, etc. I want to make sure everyone understands the differences between these tools, and I’m curious how you personally use them. For instance, when should one use Claude Code, Claude desktop, or Cowork?

Cat Wu: I usually use Claude Code in the terminal, especially when I want to quickly start a one-off coding task and want to use the latest features. The CLI is our earliest product form, and many new features are launched here first, so it’s the most powerful tool. Generally, I use it when handling one or a few tasks at the same time. The desktop version is more suitable for front-end work. I love using its preview feature; for example, when I’m working on a web app, I’ll use both Claude Code and desktop, opening the preview panel on the right side, so I can interact with Claude while seeing the web page effect in real-time.

For non-technical users, the desktop version is also friendlier. The terminal can be intimidating for many, with various prompts that look “scary,” and it doesn’t allow for the same clickable operations as other products. So if you’re not used to the terminal, I highly recommend using the desktop version of Claude Code. Additionally, the desktop provides a global view, allowing you to see CLI sessions, desktop sessions, and tasks initiated on web or mobile, serving as a unified control panel. As for web and mobile, their biggest advantage is “initiating tasks anytime, anywhere.” CLI and desktop require you to use them on a local computer, but in reality, you can’t always carry a laptop.

I’ve seen many people walking around outside, tethering their laptops to their phones and not daring to shut their computers down. That showed we lacked a product for this scenario. Mobile addresses it well, allowing you to initiate tasks anytime without needing to carry a laptop.

Host: That’s very relatable. I’ve seen this scenario on planes where people are afraid to close their laptops, just waiting for the Agent to finish running while staying connected to Wi-Fi.

Cat Wu: As for Cowork, it addresses another class of problems: many work outputs are not code. For example, clearing Slack, clearing inboxes, creating client presentation PPTs, writing feature goal documents, or release plans are all “non-code outputs.” Cowork is very suitable for these scenarios. So my classification is simple: if the output is code, I use Claude Code (whether on desktop or mobile); if the output is not code, I use Cowork.

Host: I think people may underestimate Cowork’s success. It’s growing rapidly, but many may not fully understand what it can do. Can you share some practical use cases based on your work as a PM? Any surprising applications?

Cat Wu: If you’re just starting to use Cowork, the first step is to connect all the data sources relevant to your work, because only with enough context can it provide high-quality results. For me, that means connecting Google Calendar, Slack, Gmail, and Google Drive, allowing it to freely access context, extract information, and link threads, which significantly improves result quality. For example, last night I was using Cowork because we had a Code with Claude conference and I needed to give several presentations. One of the presentation topics was how Claude Code evolved from “assistant” to “real Agent.” I wanted to showcase our released products and some internal success cases.

I fed Cowork all the materials, including a draft prepared by our product marketing colleague Alex, and told it the narrative logic I wanted to present. Then it worked for an hour: it looked at what we had published on Twitter, checked internal release records, reviewed the announcement channel for Claude Code (which contains many practical cases shared by teams), and finally integrated all the information into a 20-page PPT. When I woke up in the morning, the overall quality was quite good. I made some modifications, mainly because I prefer fewer words on slides and it initially wrote a bit too much.

But overall, the speed far exceeded my own efficiency. And because it can access our design system, the PPT looked like it was made by a professional designer, very polished.

Host: This is essentially a PM’s dream; creating PPTs is so tedious and slow. To help everyone try it out, the steps you mentioned are: first connect Slack, Google Calendar, Gmail, Google Drive, right?

Cat Wu: Yes, the key is to connect your communication tools and the team’s “information sources.”

Host: What was your prompt like at that time?

Cat Wu: I actually kept it very simple: “Help me create a PPT for the Code with Claude conference. This is the content suggested by PMM, this is the draft I’m not satisfied with, and here’s a version I made manually (with links). First, give me a detailed outline while avoiding repetition with the keynote.” Claude would first read these links and then generate an outline. I would then decide which content to keep based on its suggestions. This reflects the current role of PMs: Claude is a strong “brainstorming partner,” capable of quickly integrating large amounts of information and providing multiple possibilities; but the final decision still rests with the PM.

The structure I finalized was: from “making local tasks successful” to “ensuring every PR goes through,” and then to “helping engineers submit more PRs,” with corresponding demos for each stage. Once I confirmed the outline, Cowork took a few more hours to complete the entire PPT.

Host: Amazing. It’s like you’re conversing with a designer who understands both design and content. How is the design system implemented? How does it know Anthropic’s style?

Cat Wu: We already have a standard external presentation template, and I directly provided this template to Claude. It can learn our color schemes, fonts, layouts, etc.; for example, we have about 20 commonly used slide formats. You can also connect Figma’s MCP; if your template is there, it can read directly from it.
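For readers wanting to try this, connecting an MCP server to Claude Code generally means adding an entry to an `.mcp.json` file at the project root. The structure below follows the standard MCP client configuration schema, but the server package name and flags are placeholders, not a specific Figma integration:

```json
{
  "mcpServers": {
    "figma": {
      "command": "npx",
      "args": ["-y", "your-figma-mcp-server"],
      "env": { "FIGMA_API_KEY": "<your token>" }
    }
  }
}
```

Once configured, the model can call the server's tools to read templates or design files directly, which is the kind of access Cat describes for keeping generated slides on-brand.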

Host: Speaking of which, I’m curious about your PM toolkit. Besides Claude Code and Cowork, what else do you use?

Cat Wu: My toolkit mainly consists of Claude Code and Cowork. Anthropic essentially operates around Slack; I feel it’s almost the company’s “operating system.” In my daily work, I spend about 30% of my time continuously testing Cowork’s boundaries to see where it falls short. I also spend a lot of time conversing with the model to understand why it makes mistakes. Additionally, we’ve built many internal tools. The biggest value of Claude Code is that it significantly lowers the barrier to developing custom applications. So now there is a lot of “personalized work software” within the company, built to solve very specific scenarios instead of relying on general-purpose tools that don’t quite fit.

Host: Can you give some examples?

Cat Wu: For example, one of our sales colleagues found himself repeatedly creating similar client presentation PPTs, so he used Claude Code to build a web app. It contains several of the most effective templates (101, 201, and advanced tutorials). You enter client information, which is automatically pulled from systems like Salesforce and Gong, and the app adjusts the content to the client’s circumstances: whether they use Bedrock or the enterprise version of Claude, whether they care more about code review or security compliance, and whether HIPAA compliance is needed. Then it generates a customized PPT. What used to take 20 to 30 minutes of work now gets done in seconds.
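The core of a tool like the one described above is a small rules layer that maps client attributes to presentation content. This is a minimal hypothetical sketch; the client fields, template names, and section titles are all invented for illustration, not Anthropic's internal tool:

```python
# Hypothetical sketch of an internal deck-builder like the one described.
# All attribute names and templates here are illustrative examples.
from dataclasses import dataclass, field

@dataclass
class ClientProfile:
    name: str
    maturity: str                 # e.g. "101", "201", or "advanced"
    uses_bedrock: bool = False
    needs_hipaa: bool = False
    focus_areas: list = field(default_factory=list)

def build_outline(client: ClientProfile) -> list:
    """Pick a base template and tailor sections to the client's situation."""
    outline = [f"{client.maturity} template: intro for {client.name}"]
    # Deployment section depends on how the client consumes the models.
    outline.append("Deployment: Amazon Bedrock" if client.uses_bedrock
                   else "Deployment: Claude for Enterprise")
    # One deep-dive section per area the client cares about.
    for focus in client.focus_areas:
        outline.append(f"Deep dive: {focus}")
    # Compliance section only when relevant.
    if client.needs_hipaa:
        outline.append("Compliance: HIPAA considerations")
    return outline

outline = build_outline(ClientProfile("Acme Health", "201",
                                      uses_bedrock=True, needs_hipaa=True,
                                      focus_areas=["security compliance"]))
print(outline)
```

In the real workflow the CRM data would be pulled automatically and the outline handed to the model to render as slides; the sketch just shows why the "20 to 30 minutes to seconds" claim is plausible once the selection logic is encoded.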

Host: It’s interesting that people rarely try to replace tools like Slack. Everyone talks about SaaS being replaced by self-built tools, but Slack seems to be irreplaceable infrastructure.

Cat Wu: I think it is indeed a crucial communication infrastructure, and it does very well in “real-time information synchronization.”

Host: Yes, many people complain about Slack, but it does its job very well, and the most cutting-edge teams basically can’t do without it, which is quite interesting.

Cat Wu: Yes, and I also appreciate its design in terms of “customizability.” We love to create Slack bots, and this “hackability” allows us to integrate Slack in our own way. So I really commend Slack’s work in this regard.

Token Usage and Internal Model Limits

Host: You just mentioned many different teams and how they use Claude Code and Cowork. Besides the engineering team, which team uses the most tokens? I’d guess engineering is first; if not, that’s interesting. Who’s second?

Cat Wu: The Applied AI team is very strong in exploring the boundaries of Claude Code and Cowork. Much of their work involves collaborating with clients to help them implement our APIs. So sometimes they directly help clients create prototypes, and Claude Code has made this process much faster than before. At the same time, they also handle a lot of client communications, such as client needs, historical meeting records, etc. So their usage on Cowork and Claude Code is very heavy.

Host: What exactly is the Applied AI team? Is it similar to forward-deployed engineering?

Cat Wu: You can think of it that way. Their work is to help clients implement our APIs and model capabilities internally, whether for their own products or to improve internal efficiency.

Host: Got it, it’s a somewhat technical go-to-market/customer success role.

Cat Wu: Yes, it’s a very technical go-to-market role.

Host: So you think they are the second-highest in token usage?

Cat Wu: Yes, and they are also constantly exploring the usage boundaries of Cowork. For example, many people are responsible for multiple clients and may have 5 to 10 client meetings in a day. So they might use Cowork the night before to prepare: “Help me summarize all client meetings tomorrow, what each client is focusing on, what demands they have raised, and what previous action items are.” Cowork will automatically generate a “battle briefing” to help them quickly get into the right mindset. Additionally, if a client asks in a meeting, “When will a certain feature be released?” Cowork can even check the latest progress in Slack and provide the latest ETA to include in the meeting materials. These are all workflows that people have built themselves and shared within the team.

Host: That’s cool. Recently, there’s an interesting trend: some people have reported that their AI token costs have exceeded their own salaries. Does Anthropic have similar data internally? For instance, how many tokens do engineers or PMs use daily or monthly?

Cat Wu: We have indeed observed that as model capabilities improve, people assign more tasks to it and spend more time on Claude Code and Cowork. So every time a model has a significant upgrade, the per capita token consumption increases. Currently, this cost is still far lower than the average salary of engineers, but this ratio is continuously growing.

Host: You also have a significant advantage in that you can use the most advanced models, and token usage is essentially unlimited, right?

Cat Wu: We can use many tokens, but there are indeed limits for some people.

Host: So there are still upper limits.

Cat Wu: We place great importance on enabling internal teams to develop as quickly as possible, and we believe everyone understands the costs of running the models and will use tokens responsibly. Wasting tokens is discouraged, but we trust everyone to make judgments.

Host: Returning to the PM role, you mentioned some aspects earlier. I want to ask systematically: what new capabilities do AI companies value most in PMs now?

Cat Wu: The most challenging capability is defining “what the product should look like in a month.” Because at this time scale, there is significant uncertainty in model capabilities and user behavior. But excellent PMs can see patterns from how users “break product boundaries” and set directions, continuously pushing forward. If model capabilities change beyond expectations, they can also adjust promptly.

Another difficult aspect is that you need to have a “just right” belief in AGI. Everyone can imagine a future where models are incredibly powerful and almost omnipotent, where products could even degrade to just a text box. But the real challenge is: how to maximize its potential under the current model capabilities? How to guide users onto the “best path”? How to amplify its strengths and compensate for its weaknesses? This capability is actually very scarce.

Host: How can this ability be cultivated? Does it require extensive interaction with models to understand their boundaries?

Cat Wu: Yes, it requires a lot of interaction with the models. One thing I enjoy doing is having the model “self-reflect.” For instance, sometimes when the model does something strange, I ask it why it did that. It might say: the system prompt was ambiguous; or it didn’t realize front-end validation was part of the task; or it delegated the task to a sub-agent but didn’t check the results. This analysis helps you understand where it was misled, allowing you to optimize the system.

Another important point is to find trusted “feedback sources.” Not all user feedback is equally valuable. Usually, there are a few individuals particularly skilled at judging model performance. Finding these five people is crucial. The third point is to conduct evaluations. You don’t need to do hundreds of evaluations; just ten high-quality ones can help the team clarify goals and measure progress. This is a severely underestimated task that more PMs and engineers should participate in.
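Cat's point about "ten high-quality evaluations" can be sketched as a tiny harness: a few cases, a grading function per case, and a pass rate. Everything here is illustrative; `run_model` is a stub standing in for a real model call, and the structure (cases, grade, pass rate) is the point, not the implementation.

```python
# Minimal eval-harness sketch of the "ten high-quality evals" idea.
def run_model(prompt: str) -> str:
    """Stub model: returns canned answers so the harness is runnable."""
    return "Paris" if "capital of France" in prompt else "unknown"


# Each case pairs a prompt with a grading function that judges the output.
EVAL_CASES = [
    {"prompt": "What is the capital of France?",
     "grade": lambda out: "paris" in out.lower()},
    {"prompt": "What is the capital of Atlantis?",
     "grade": lambda out: out.lower() == "unknown"},
]


def run_evals(cases: list[dict]) -> float:
    """Run every case through the model and return the pass rate."""
    passed = sum(1 for c in cases if c["grade"](run_model(c["prompt"])))
    return passed / len(cases)
```

Even at this scale, the harness gives a team a shared, repeatable definition of "success" to track across model versions, which is the underestimated work Cat is pointing at.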

Deleting Features After New Model Releases

Host: Many people say the future of product managers is to write evaluations, essentially defining “what success looks like.” How much time do you spend on this?

Cat Wu: It depends on the specific issue. Some teams invest a lot of time in evaluations. We have a small team that collaborates closely with research to analyze model behavior meticulously. I usually step in when a feature needs clearer definition, for example writing five evaluations that show how to run them, what counts as success or failure, and how to optimize prompts. Features like memory rely heavily on evaluations.

Host: You mentioned the “personality” of Claude. I previously interviewed a co-founder who emphasized this point as well. Many initially thought it was just an “interesting” addition, but it’s actually core to Claude’s success. What’s your take?

Cat Wu: Think of real-life colleagues: some people just make you feel they are “great to work with.” Claude is similar. People like it because it is easygoing and fun yet very professional; it has no ego, is willing to admit mistakes, and keeps a positive attitude. For example, when you feel a task is difficult, it says, “That’s okay, we’ll take it step by step. Would you like me to help you get started?” The traits of excellent colleagues are positivity, proactivity, and sincere feedback, and we are striving to inject all of them into Claude.

Host: You mentioned that after releasing new models, you often have to rethink products, which sounds both exciting and overwhelming. How frequent is this situation?

Cat Wu: The bigger change is actually “deleting features.” Many features were originally added to compensate for the model’s limitations. For example, the early to-do list: the model would miss steps when making large-scale modifications, so we added a task list to force it to complete them. But in the new model, it can naturally complete these steps, so that feature becomes less important. Every time we release a new model, we recheck the system prompt and delete parts that are no longer needed.

Host: So the model “eats up” those product-level patches you previously made?

Cat Wu: Yes. But what’s even more exciting is that new models also unlock entirely new features. For instance, code review—we tried many times until recently when the model was strong enough to reach a usable level. Now we can even run multiple code review agents in parallel, scanning the entire codebase and outputting high-quality issues.

Host: Finally, let’s talk about the vision. What is the long-term direction for Claude Code and Cowork?

Cat Wu: We think from the basic unit of “tasks.” The first step is to ensure individual tasks succeed consistently. As model strength increases and task success rates improve, people will start running multiple tasks simultaneously. The next step might be: running dozens or hundreds of Claudes at the same time. At that point, the questions become: how to manage these tasks? How to build an interface that lets humans know what to focus on? How to ensure agents have completed and verified their work? How to establish feedback mechanisms that allow the system to continuously improve itself?

This is what we are thinking about for the long-term direction.

The Value of Automation

Host: Many listeners, including product managers, entrepreneurs, and various cross-functional roles, are worried about their roles and future career development. What advice would you give them? Not just about surviving in this highly AI-driven world, but how to truly succeed and thrive? What do you think they should hear and do?

Cat Wu: I believe AI has given everyone far more leverage than before. So I would advise you: whenever you realize you’re repeatedly doing a manual task, think about whether you can automate it using Claude Code, Cowork, or other AI tools. Most people’s work includes parts they enjoy creatively and tedious, cumbersome parts they dislike. The beauty of AI is that it can handle these tedious tasks for you. It can learn from each time you perform them, summarize the patterns, and then execute automatically, freeing you to focus on the creative aspects. This means you can do much more than before.

So my most direct advice is: identify repetitive tasks that can be handed over to Claude, keep iterating on those automated workflows until they reach high success rates, and then turn to the things you’ve always wanted to do for your team, product, or company but never had the time or energy to pursue. If AI can handle the grunt work, it’s like gaining an extra 20% of your time. Embrace these tools, delegate the work you dislike, find ways they can accelerate you, and you’ll achieve more.

Host: One core point you just made that I strongly agree with is using AI to solve problems. There are many tools and great potential now, but for many, the hardest part is figuring out what to do. Your advice essentially is: pay attention to those things you do repeatedly that can be automated and those ideas you’ve always wanted to pursue but haven’t had time for. Essentially, it’s about solving problems for yourself, right?

Cat Wu: Yes, that’s completely correct. I would also advise everyone to push automation from “this is a nice concept” to “it’s genuinely 100% usable.” Sometimes I see users automate a process to 90% or 95% and then give up. But if it can’t achieve 100% automation, it doesn’t count as true automation. The last 5% to 10% often requires more time, and building automation can sometimes be slower than doing it manually. But I still encourage everyone to pick something you really want to achieve 100% automation on, invest enough effort to refine it: teach the model your preferences, give it feedback, and let it improve continuously until it reaches 100%. Only then can you truly trust it. A 95% automated task doesn’t hold much value.

Host: I totally relate to that; it’s excellent advice.

Cat Wu: I’m in the same boat. I’m currently teaching Cowork to help me achieve inbox zero in Gmail, but the process is very time-consuming, and it’s far from ideal.

Host: What a coincidence; I am too. I set up an automated email classification process to sort those “junk requests” (like wanting to be on the podcast) into a folder. It’s about 95% accurate, but it occasionally misses important emails.

So your advice is great; I need to perfect it.
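The host's email-triage setup could be sketched as a rules-first classifier. The patterns and folder names below are invented for illustration; a real setup would likely combine rules like these with a model call for the ambiguous cases, which is exactly where the last 5% lives.

```python
# Toy sketch of the host's email triage: route "junk requests"
# (e.g. cold podcast pitches) to a folder, keep everything else.
JUNK_PATTERNS = (
    "be on your podcast",
    "sponsorship opportunity",
    "guest post",
)


def triage(subject: str, body: str) -> str:
    """Return the folder an email should land in."""
    text = f"{subject} {body}".lower()
    if any(pattern in text for pattern in JUNK_PATTERNS):
        return "Requests"   # review later, not urgent
    return "Inbox"          # default: keep for human attention
```

A hard-coded pattern list is exactly the kind of 95% solution Cat warns about: it occasionally misfiles an important email, and closing that gap is where the real effort goes.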

Cat Wu: We are also working to make these custom processes easier to use. The current process is indeed a bit complex: you have to define a skill, learn how to call it, give it feedback, and let Cowork update this skill based on feedback, and finally check the updated results. This is also our responsibility, to make the whole process smoother rather than painful.

Host: Fantastic. Cat, is there anything else you’d like to add? Or anything you want to emphasize before we jump into the quick-fire round?

Cat Wu: I see many people experimenting with AI to create various prototypes or build workflows. But I recommend focusing on applications you’ll use daily, because only in real usage do you gain value. If you just create a prototype that doesn’t improve your efficiency, then AI hasn’t really brought you value. That “build it once, think it’s cool, then never use it again” approach teaches you very little and gives you no real leverage.

Host: That’s a great point. I’ve also noticed another extreme: some people spend a lot of time customizing their workflows. There’s a type of person who never automates, but another type who over-optimizes tools, adding various skills, MCPs, and workflow optimizations. Sometimes, this can lead them away from the initial goal, like actually releasing a product or making a feature.

Cat Wu: Yes, I feel the same way. Customizing these things is indeed fun, and we hope the product is hackable enough for you to use it in your own way. But there is a boundary. I see some people spending too much time on customization, even losing sleep, and neglecting the core tasks they initially wanted to complete.

Host: I’ve seen a lot of this on Twitter, with people saying, “Look at my configuration, how optimized it is.” But the question is, what are you actually doing?

Cat Wu: Many times, simpler configurations are actually more effective.

Host: Speaking of which, I saw a tweet from Andrej Karpathy yesterday, mentioning an interesting split: one group of people used ChatGPT or Claude early on but thought, “It’s just okay,” and then gave up, remaining skeptical about AI; while another group used it to write code and truly saw its power. These two groups completely fail to understand each other. So your advice is crucial: use it to do real things to understand its capabilities.

Cat Wu: Yes, I think a significant shift is that products in 2024 were mostly “conversational,” while the current generation of Claude Code products is “action-oriented.” The real “aha moment” is when Claude can execute tasks for you. When you realize it can not only tell you what to do but also do it for you, that feeling is incredibly striking.

Host: Exactly. I also want to mention a Chrome extension where you can watch Claude automate actions, like “help me fill out this form,” and it actually does it.

Cat Wu: Yes, that’s the feeling.
