Why Nvidia CEO Jensen Huang can't stop talking about tokens



Nvidia CEO Jensen Huang can't stop talking about tokens at the GTC conference.
Tokens determine how AI work is measured and priced.
AI agents will drive massive token usage, and companies should be happy to pay for them, Huang said.

Nvidia CEO Jensen Huang's remarks at the company's GTC conference had a recurring theme: AI tokens.

In a conversation with analysts on Tuesday, Huang framed future computers as "manufacturing equipment" that will produce tokens. He said that tokens will become a core line item in corporate budgets, like laptops or software subscriptions.

With all this talk about tokens, what exactly are they?

Tokens are units of text — a word or a part of a word — that determine how AI work is measured and priced. A short word could be a single token, while longer words can be split into several. A rule of thumb is that one token is about four characters.
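The four-characters rule of thumb can be turned into a rough estimator. The sketch below is illustrative only: the function name `estimate_tokens` and the constant `CHARS_PER_TOKEN` are made up for this example, and real tokenizers split text differently from model to model, so actual counts will vary.

```python
import math

# Rule of thumb from the text: one token is about four characters.
# This is an approximation, not how any specific tokenizer works.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count alone."""
    if not text:
        return 0
    # Round up so that even a one-character input counts as one token.
    return math.ceil(len(text) / CHARS_PER_TOKEN)


print(estimate_tokens("Tokens determine how AI work is measured and priced."))
```

An estimate like this is only good for ballpark budgeting; the provider's own token count is what appears on the bill.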

Large language models like OpenAI's ChatGPT or Anthropic's Claude track how many tokens are processed when a user inputs text and how many are generated in response, and AI giants bill companies accordingly. The more text involved, the more tokens a model has to process. Unlike existing software, which is billed as a subscription or a flat fee, AI is priced by usage.
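To make usage-based billing concrete, here is a minimal sketch of how a bill could be computed from input and output token counts. Everything specific in it is an assumption for illustration: the `TokenPrices` class, the `usage_cost` function, the idea of separate input and output rates, and the dollar figures, which are hypothetical and not any vendor's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class TokenPrices:
    # Hypothetical rates in dollars per million tokens (not real pricing).
    input_per_million: float
    output_per_million: float


def usage_cost(input_tokens: int, output_tokens: int, prices: TokenPrices) -> float:
    """Dollar cost of a workload: tokens processed plus tokens generated."""
    input_cost = input_tokens / 1_000_000 * prices.input_per_million
    output_cost = output_tokens / 1_000_000 * prices.output_per_million
    return input_cost + output_cost


# Made-up example rates, chosen only to keep the arithmetic easy to follow.
example_prices = TokenPrices(input_per_million=2.0, output_per_million=8.0)
print(f"${usage_cost(500_000, 250_000, example_prices):.2f}")
```

Unlike a flat subscription, the result scales with the token counts passed in, which is why heavier use means a larger bill.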

In the future, as the technology becomes even more central to work, Huang said engineers may receive "token budgets" to help them become more productive. During his keynote on Monday, he even floated the idea of offering Nvidia engineers tokens worth half their annual salary to attract talent.

Huang brought this idea up again on Tuesday. He said the costs will be worth it, especially for highly paid engineers, who can deliver big productivity gains with the help of agents: autonomous apps that can carry out a variety of tasks.

In a pitch for Nvidia, Huang added that more powerful and energy-efficient hardware can generate tokens more cheaply over time.

"If I added to them $100 a day of inference cost — token cost — I'd be more than happy to do it," Huang said of paying for tokens on top of engineering salaries — especially during intense periods. "If I added even $1,000 on crunch time, more than happy to do it."

Huang isn't alone. The concept is starting to percolate across the tech industry, with engineers asking about compute during interviews and executives considering it as part of compensation packages.

The rise of agentic AI could dramatically increase token usage, Huang said, because agents will run without human oversight.

"Right now, as we speak, all of our laptops are kind of sitting idle," he said Tuesday. "But in the future, the computer is going to be running 24/7 and creating tokens because your agents are off doing work."
