SB 1047’s veto, OpenAI’s turnover, and a constant treadmill pushing AI startups to be all too similar to big technology name brands.
AI Safety, also known as AGI Safety, is a leading philosophical position at the top AI labs. Its core principle, when approaching large-scale AI systems and research projects, is that AI is different: proponents believe we cannot treat AI as we have previous technologies. They truly believe that we can build an artificial general intelligence that is creative and capable of acting beyond human constraints. This belief goes beyond marketing, though it does make it easy for them to tell extremely compelling stories about their companies. They argue that we must think longer term and adopt vastly different (and stronger) controls.
For the first phase of AI’s development, roughly until ChatGPT’s explosion, this culture could easily be seen in decision-making, organizational structures, products, and everything these companies did. The culture of AI safety is now at a crossroads common to value-informed design movements across technology, where the practical constraints of investment, markets, and regulation keep operating by their traditional methods.
There have been countless articles and reports in recent weeks about turmoil at OpenAI: key employees and co-founders disagreeing on direction, speaking out against decisions, leaving, and all the usual noise. While turnover at mature startups is normal given the long life cycle required to reach an IPO, the inflammatory nature of it is not. The Wall Street Journal and The Information both recently covered the high-profile exits, highlighting safety-versus-product debates, failed attempts to court people back, and other oddities.
Much of this feels like a mirror of the high-profile firings of AI Ethics researchers from Google in 2020. Google was already a mature company at that point, so firing may have been the only way to correct corporate goals and power dynamics. In startups, vocal researchers have their say until they just can’t anymore, and then they leave. For most of OpenAI’s existence, something like 20% of compute could have gone to safety work (as was promised to OpenAI’s superalignment team), but as capital investments grow and competition blooms, that promise can no longer be kept. One of OpenAI’s many controversies came when then-superalignment lead Jan Leike said “I resigned” after high-profile disagreements with leadership.
In Jan’s case, and many others along the way, capitalism and the drive to become a public company won. We’re seeing this again and again; OpenAI is just ahead of the curve. Anthropic’s founding was caused by one of the early splits, Elon Musk left early on, and there are surely more departures we haven’t seen due to OpenAI’s strict non-disparagement contracts.
Anthropic, at its founding, said repeatedly that it did not want to contribute to arms-race dynamics, yet today it has what many people believe is the strongest publicly available model. At the same time, its culture is still much more sharply attuned to safety concerns. That tension has yet to bubble up in the same manner as OpenAI’s, which may say more about OpenAI’s culture resembling Twitter’s, one of tech’s odd ducklings, but it seems inevitable that Anthropic’s day will come.
Where Google and Big Tech are largely constant in the tenor of their technological deployments, the safety testing and prioritization of the top labs continue to march toward them. This is because the organizations need to cover the large capital costs of next-generation models, not because the culture has changed. The only way to mobilize the roughly $100 billion that is needed is by appearing normal to traditional investors, e.g. through an IPO, or by already having the money around, e.g. being Big Tech. OpenAI has chosen the former, and it is likely other labs will choose the latter via acquisition.
In a world where OpenAI, the company that owns the website the world associates with AI, can barely raise enough money to meet its goals (hence another $7 billion round at a $150 billion valuation is in the works), painting optimistic futures for SSI or Anthropic as independent entities is challenging. Their thriving on their own is likely one of the better outcomes for general notions of AI being beneficial, but if it is low in probability, other options should be weighed.
These cultural changes are going to trickle down in important ways to the tools we use today. The ways we use AI are particularly salient within our digital lives, an influence I discussed in *Name, Image, and AI’s Likeness*:
AI technology, more than many technologies that came before it, mirrors the culture of its employees. A company’s employees mirror their leadership, even if they do not realize they are doing so. The culture of these companies can easily matter more than their stated goals when building technology, given how steerable AI systems are. When using AI, it’s easy to feel the difference in cultures, such as between Meta’s Llama 3 and Anthropic’s Claude. It’s not easy to feel the difference in cultures between Google and Bing (discounting the association of culture and product). When given a voice, these cultural mirrors will become stronger.
Technology has always taken on a reflection of its employees, but AI is doing it differently by attacking different pressure points, which are more culturally relevant, and adding more humanity to the technology, which makes it easier to absorb.
All this is to say: where does AI Safety go from here? Is it safe to keep forking your organization and going elsewhere to reset? The hope of OpenAI remaining a nonprofit was one of the last structural incentives that could prevent this march toward traditional for-profit normalcy, which worries plenty of people in AI (and not just those who self-identify as AI Safetyists). AI being governed by extremely large, for-profit entities is a trajectory with low variance, but one that is extremely likely to propagate most of the harms we are used to on today’s internet.
Is founding new organizations actually an effective way for these prominent researchers and engineers to build the technology they want? Every fork in the road takes increasingly long to recover from, given the path dependence of building systems like large language models. In this light, a more direct way to ensure safety at the frontier of AI is to stay, but this comes at the emotional cost of acknowledging a change in one’s internal relevance.
This is a recurring cycle in trying to build value-informed systems, and it is not restricted to AI. It is more impactful to make ChatGPT, Gemini, or one of the other extremely established AI systems safer than it is to make a 100% safe AI system that no one uses. These same challenges put a heavy strain on some decision-making at Hugging Face, which would be better classified as “ethical AI” rather than AI safety, but the path of highest impact is still to focus on where deployment will happen.
Unironically, joining Google could be the best way for these researchers to increase the probability of safe, beneficial AI, if they cannot accept the status quo.
Today, AI Safety is becoming a household name. In this transition, the clarity of what the AI Safety movement aims for is blurring, and it is becoming deeply associated with its historical peer fields, such as AI Ethics. This brings obvious boons and challenges, one of which is in regulation.
The controversial California bill SB 1047 served as a basic check on how traditional notions of power would or would not embrace the original AI Safety ideology. Had SB 1047 passed instead of receiving its recent veto, it would have been seen as a moderate embrace of AI Safety beliefs: that prerelease testing is crucial, that scaling is the central factor governing AI development, and that power should be concentrated by limiting access. The original bill was far closer to these principles than the one that reached Governor Newsom’s desk.
SB 1047 was an attempt at regulation starting from AI Safety principles, and many prominent AI Safety figures endorsed the final version of the bill. It is likely that the next substantial attempt will take a different flavor.