The brilliant computer scientist Fei-Fei Li is often called the Godmother of AI. She talks with host Reid Hoffman about why scientists and entrepreneurs need to be fearless in the face of an uncertain future.
About Fei-Fei
- Co-created ImageNet, catalyzing the deep learning revolution in computer vision.
- Co-founder and CEO of World Labs (founded 2025), advancing world models and spatial intelligence.
- Founding director of Stanford’s Human-Centered AI Institute; former Stanford AI Lab Director.
- Former VP at Google and Chief Scientist of AI/ML at Google Cloud (2017-2018).
- Co-founder of AI4ALL, broadening AI education for underrepresented youth.
Transcript:
How to be ‘fearless’ in the AI age
JEFF BERMAN: Hey, folks. This week on the show, we’re sharing Reid Hoffman’s conversation with Dr. Fei-Fei Li from our 2025 Masters of Scale Summit. Fei-Fei is a brilliant computer scientist and author who has been at the forefront of AI development for decades. She was a founding director of Stanford’s Human-Centered AI Institute and is now the co-founder and CEO of World Labs. She joined Reid on stage for candid reflections on where we are and where we’re headed as AI evolves. I think you’ll enjoy this.
REID HOFFMAN: I’ve lost count of the number of times we’ve done this. And it’s always awesome and an honor. So thank you for being here.
FEI-FEI LI: Thank you. I’ve lost count too.
The next phase of AI: spatial intelligence & world modeling
HOFFMAN: Yes. Obviously, anyone who’s following your career knows that you’re one of the OGs in the wave of AI. ImageNet, all of these things that are contributions that are fundamental to where we are today. So by the way, thank you for that. And now, you’re working on spatial intelligence and world building. Say a little bit about why you took a sabbatical from the job you love with Human-Centered AI and Stanford CS and started building this company and what you’re doing.
LI: So Reid, you are one of the original supporters and investors of OpenAI. So when OpenAI was founded, you and I talked about the dream of AGI, right? As an AI scientist, I’m torn between the word AI versus AGI because they more or less mean the same thing for me. But language aside, what is—
HOFFMAN: Actually, I use AGI for the AI we haven’t invented yet.
LI: Right. So what is AGI to me? AGI, to me, is the capability of intelligence of machines that are on par with humans and, in many cases, can be superseding humans. And I think about this as a door to the future. And on this door, there are multiple keyholes. And language is one of them, one of the major ones, because language is an essential part of intelligence to the point that – I know you are a quasi-philosophy major – that Wittgenstein – quasi is actually a positive word.
HOFFMAN: Okay.
LI: At Stanford, I have to say, because Reid is an alumnus, there’s a very special major called symbolic systems. It combines philosophy, cognitive science, and computer science.
HOFFMAN: By the way, I was the eighth person to declare that major.
LI: Oh my gosh. So many famous people come out of that major. Yeah. So quasi, it just means the proportion. So it’s a good word. Sorry.
HOFFMAN: She’ll correct me later, but that’s all right.
LI: I’m digging myself out of a hole. Wittgenstein says that language defines the limit of the world.
HOFFMAN: Yes.
LI: I’m actually disagreeing with that.
HOFFMAN: Excellent.
LI: Because I think language defines a certain level of boundary that the world can be described in symbolic forms, but beyond that, the world is actually limitless.
HOFFMAN: Yes.
LI: And what is that world? How do we define that? What does that have to do with intelligence? How do we use machines to express that? I lump that whole thing into world modeling. World modeling is very connected to language, but it’s about expressing, representing, and eventually participating in the changes of the states of the world. And that could be virtual. It could be physical. And what does world modeling entail? It does entail language because language is one form of interrogation with the world, but it also entails the visual, the lights, the semantics, the space, the physical actions. And all of that is still at the dawn, and it’s the next phase of AI. And that’s what World Labs is about. We are trying to do world modeling, and we’re trying to bring that level of spatial intelligence into the next chapter of AI.
HOFFMAN: So two questions with the spatial intelligence. One, because there are a few people who will be deep here, but a lot of people, their primary experience of LLMs is like ChatGPT, Gemini, et cetera. So what should they understand is different, not just in the cognitive capabilities, because our world is not just language? And then two, what does the road ahead look like? What are some of the challenges to overcome in getting there?
LI: Yeah. What would it get us when we have world modeling? Well, we’re already seeing budding signs of that. Lots of storytellers and creatives work with many media, whether it’s pixels, movies, sculptures, digital art. And that is a highly, highly creative interactive world that you cannot just use language to express. And world modeling, the ability to generate things, to generate worlds that you can immerse yourself in, you can interact with, is highly enticing and exciting for creators. And that’s one way world modeling could be applied. This is not just for entertainment and storytelling. This could be for design, could be for even industrial uses all the way to healthcare, medicine, education. Also, the distance between being passively entertained and actively participating in experiences right now is closing rapidly. And the ability to have machines create world models that would allow that kind of immersive experience is really powerful.
And that also segues into simulation. Simulation is really important both for human experiences, human learning, as well as for embodied AI. Robots need to learn from simulation as much as they need to learn from the real world. We can really learn about the history of robots, including self-driving cars, and the critical roles that simulation has played. So the application really is boundless. What are the challenges? I’ll just call out one challenge. There are actually many challenges. One challenge is data. Unlike language, where data is all over the internet, when it comes to world modeling, data is not as obvious and easily obtainable compared to language. Of course, there is video data. That’s one of the most critical forms of data for world modeling. But the world, like I said, is very multimodal. It’s very spatial. It has fundamental 3D information, geometry, physics, dynamics. And some of those are not easily obtainable.
HOFFMAN: And so there obviously has been a lot of discussion around robotics. And one of the things I think we should draw the line for everyone is to understand how critical world modeling will be to any kind of robotic elevation of work and the human conditions. Say a little bit about why this cognitive set is so important there.
LI: Yeah. I spent a lot of time thinking about that because, frankly, after ImageNet, after the first wave of computer vision achieving a level of fidelity and quality, I actually went into a little bit of a crisis myself and started to soul search. What is perception about? What is vision for? I thought it would take me a hundred years to work on the problem of object recognition, but it went a little faster than I thought. So I needed another North Star. And it took me back to evolution. And I started to read a lot about evolution. And I’m like literally this much of a philosophy student compared to Reid.
About 530 million years ago, there was an incredible evolutionary event called the Cambrian Explosion where the animal speciation just exploded, where it’s also the beginning of the nervous system, the beginning of photosensitive cells. And it really, after reading a lot of literature and thinking, it really dawned on me that the evolutionary reason animals have perception is actually for activity, for interactivity.
It’s active. And that means perception and perceptual intelligence is the foundation of movement. And the beginning of movement is very simple. You just kind of translate your body somewhere quickly. The movement becomes much more interactive. And fighting for food to mating to nesting to rearing offspring to a much deeper – look at mammals and humans. Our ability to move is very, very complex. The degree of freedom we have between our fingers, toes, torso, and body is very high. And all this requires a fundamental, perceptual, spatial intelligence of the world we’re in so that we know, we understand, and we can plan for all the movements. So really truly, in my opinion, that level of nuanced, complex, spatial world understanding is the brain of embodied intelligence, including robots.
HOFFMAN: And actually, while the robots give a particular sense where you need that embodied intelligence for them to all be embodied, there will also, of course, be a little bit like your opening comment on Wittgenstein, the question of actual cognitive reasoning capabilities that are not just purely linguistic.
BERMAN: Still ahead, more of Reid’s conversation with Fei-Fei about why we should all be more fearless in the age of AI.
[AD BREAK]
Welcome back to Masters of Scale. You can find this conversation and much more from our 2025 summit on our YouTube channel.
What spatial intelligence has done for humans
HOFFMAN: Part of what you get with spatial intelligence is other forms of intelligence that will even be important there. It isn’t just in a pure perception-action loop. The old Western perception as a camera and action as separate is clearly wrong. That was the thing that you were just referring to. But it will also increase our cognitive capabilities or how we imagine the world, how we model it in our heads. What are some of the reasoning characteristics you think might come out from when you add spatial intelligence, not just to robots, but to every AI system?
LI: Yeah. That’s wonderful, Reid. This is why I love talking to you.
So throughout human civilization, if you look at the milestones of humans building civilization, there’s a lot of milestones that cannot possibly be achieved with just language. The nuance of space and spatial reasoning, world modeling is very, very clear. For example, let’s just take early days. The building of the pyramids, the ability to start to abstract geometry, the sense of geometry, and also the construction of large bodies. There are a lot of things that go into that cognitive spatial reasoning that’s not this simplistic transactional behavior of I see something, I want to move it, right?
HOFFMAN: Yes.
LI: Another example is the deduction of the structure of DNA. If you know the history of how DNA was discovered, of course many scientists were getting the vibe, using today’s language, that there’s something that’s going on in this fundamental building block of our genetics. But it took Rosalind Franklin, the under-appreciated scientist, to take the X-ray imagery. But then Francis Crick and James Watson, of course, they were deeply thinking about this. But to go from this imagery to a 3D double helix intertwined structure is deeply spatial.
You cannot language your way into this deduction. I’m sure language participated. I’m not anti-language. I love Wittgenstein.
HOFFMAN: I speak. I am pro-language.
LI: Exactly. Exactly. But that is a beautiful example of humans using spatial reasoning and cognitive ability to do something or discover something we’ve never done. So I think that as we empower AI with this ability, this is not just for robots that can pick up glasses or a cube. This is for lifting all of humanity’s capability because we can collaborate with machines having this capability.
Is AI over-hyped?
HOFFMAN: Yeah. Awesome. And actually, I’ve never heard the DNA answer from you before. So I love asking you questions that I have never asked you before. There’s a lot of discourse around, is AI over-hyped, under-hyped? We’re obviously here in the Valley, so everyone in the Valley more or less thinks under-hyped. People want to say, “Are we going to go through another AI winter?” What’s your view about this discussion about what’s going on with AI? What would you say? Hey, these are the parts that are under-hyped. These are the parts that are maybe a little too soon. What’s the guide to the wise on the current discourse and sorting wheat from chaff a little bit?
LI: Oh, boy. I have to be careful how I answer this question, but I totally appreciate this. AI is a civilizational technology. I’m not the only one saying this, but I truly believe it because even if you’re inspired by humans and evolution, this is – the ability to intellectualize and to think, to do is fundamental to humans. And a piece of technology that can do that is phenomenal.
In my opinion, it’s more or less not over-hyped as an intellectual future of humanity because AI is the new computing. If you look at today’s world, just recognize where there are chips, because chips are the physical places where computing happens. From a light bulb to a self-driving car to an airplane, everywhere there are chips. Well, it’s very obvious at this point. Wherever there is a chip, there’s compute. Wherever there’s compute, there will be AI, if it’s not there already.
So from that point of view, both from a business as well as a use case point of view, AI is the future. Obviously, when it comes to hype, I do think we have to be a bit nuanced. For example, it took more than 20 years to go from Sebastian Thrun’s first self-driving car, which drove 130 miles in the Nevada desert, which has no traffic apparently, to Waymo running in San Francisco, right? Well, you might say, “Well, because part of it is software. It was the pre-deep learning age. And software development was slower.” You’re right. Deep learning definitely accelerated the brains of a self-driving car. But let’s also not forget that the car industry, the entire supply chain as well as the customer base, has been established for more than a hundred years. And it’s a very, very mature business model and very mature infrastructure and manufacturing and everything. So if it took 20 years to get just cars, which is the simplest form of robot on the street–
HOFFMAN: Or Roombas–
LI: I was thinking, I knew he’s going to say that. Yeah. Yeah. Well, Roombas are mini cars. So Roombas, actually, I think cars are – it’s literally a square-ish box that moves on a 2D surface. And the only thing you have to do is not to touch anything, right? Because if you touch, you’re screwed. And Roombas do touch. And by and large, it’s okay. But robots, the whole thing about a robot is it’s a three-dimensional machine that the whole goal is to touch things and touch it in the right way. This is huge. So I think there’s still going to be a journey for robotics, for sure.
How should leaders build societal trust in AI?
HOFFMAN: 100%. So one of the things to realize, this civilization technology, is to build trust, and whether it’s to technologists, companies, et cetera, et cetera. What do you think the things we should be doing as leaders, as companies, as entrepreneurs, to help build trust? Because it’s obviously – we only start realizing the real benefits once we get there.
LI: Great question. I know you and I both care about this. One thing I feel very strongly is in the AI age, trust cannot be outsourced to machines. Trust is fundamentally human. It’s at the individual level, community level, and societal level. And this is why Reid was part of our supporters for the Human-Centered AI Institute at Stanford. We established that in 2018, so way before this latest wave of AI blossoming. It’s because we recognize that as machines get more powerful in computing and reasoning and eventually even actionable capabilities, we need to establish a new norm that needs to be part of the fabric of the society, where within this norm, humans continue to have the agency to build trust with each other, with the newer tools like AI, with more powerful products like chatbots and other things.
And eventually, this trust has to be renewed or updated into our governance model. Not just the governance of community and companies, but governance of the society at large. So I do think trust is a very, very important element. This audience is very entrepreneur-heavy. I would just say care about this from the beginning, no matter what product or business you’re doing. Some of you might be in healthcare. You know how important it is. Some of you might be just in the infrastructure, SaaS, whatever application that feels like maybe that’s more removed. That’s not true because you’re serving people, you’re serving businesses. Trust is really important. Have that human agency as the source of the trust.
Why we need to be “fearless” with AI
HOFFMAN: Yeah. 100%.
Part of the work that you and Etch were leading with Human-Centered AI, which came from your New York Times column, is part of what led me to focus on the elevation of human agency as one of the key things we need to do.
So let’s go to the science side of it. You’ve said it’s important for scientists to be intellectually fearless. So what does that mean for how we should think about inventing the future? What does that mean for how science should progress? In terms of the next generation of innovators, where should fearlessness play into that?
LI: Well, if scientists need to be fearless, I think entrepreneurs need to be more fearless. Fearless, to me, I love this word. And this is part of the way I actually do hiring, is look for people, especially young people, with that fearlessness. Fearless is to be free, to get rid of the shackles that constrain your creativity, your courage, and your ability to just get shit done. And pardon my language. It’s actually –
HOFFMAN: No, it’s a technical term.
LI: Yes. It’s actually in our core culture of our company. So it’s that humans are not exactly the fastest, strongest animals on earth, right? If you look at many of the dimensions – I was just in Africa this summer with my kids – there’s so many animals that are just so much better than us. But I do feel that way.
HOFFMAN: Lots.
LI: Yes. But there’s something that is in our brain, in our mind, in our soul that can propel us to do incredible things for the world, for ourselves, for each other. And a lot of that comes from our fundamental uniqueness, our creativity and our sense of community and all this. And in order to unleash that, especially as technology is moving so fast, to me, the foundational emotional criterion is: be creative, be free. And that translates into, be fearless, run into uncertainties, run into bold ideas that no one has made happen yet, run into contrarian hypotheses, run into hard tasks.
Someone said, I forgot who said that, tasks that are so certain versus tasks that are uncertain are sometimes equally hard. Choose the one that’s more uncertain because your creativity will be working harder. And that’s where the magic happens. So I love the word fearless because that’s where boundaries are broken and creativities are unleashed and magical things happen.
HOFFMAN: And with that, you can see why we wanted to open the first fireside with Fei-Fei. Let’s give her a hand.
LI: Thank you. Thank you.
BERMAN: Thanks to Dr. Fei-Fei Li for joining us at the Masters of Scale Summit. This conversation was recorded on stage at the Presidio Theatre in San Francisco.
Episode Takeaways
- Fei-Fei Li discusses her journey from leading Stanford’s Human-Centered AI Institute to focusing on world modeling and spatial intelligence, arguing that intelligence goes far beyond language and is key for the next phase of AI.
- The conversation explores how spatial and world modeling intelligence power more immersive and interactive experiences, essential for robotics, creative industries, and human-machine collaboration, but also present fresh data and engineering challenges.
- Both Reid Hoffman and Fei-Fei Li emphasize that AI is a civilization-shaping technology and caution that while the hype is warranted on a broad level, practical deployment takes time, infrastructure, and continued innovation.
- Fei-Fei Li stresses that building trust in AI must remain a fundamentally human endeavor and encourages both scientists and entrepreneurs to be fearless.