OpenAI makes humanity less safe
If there's anything we can do now about the risks of superintelligent AI, then OpenAI makes humanity less safe.
Once upon a time, some good people were worried about the possibility that humanity would figure out how to create a superintelligent AI before they figured out how to tell it what we wanted it to do. If this happened, it could lead to literally destroying humanity and nearly everything we care about. This would be very bad. So they tried to warn people about the problem, and to organize efforts to solve it.
Specifically, they called for work on aligning an AI’s goals with ours - sometimes called the value alignment problem, AI control, friendly AI, or simply AI safety - before rushing ahead to increase the power of AI.
Some other good people listened. They knew they had no relevant technical expertise, but what they did have was a lot of money. So they did the one thing they could do - throw money at the problem, giving it to trusted parties to try to solve the problem. Unfortunately, the money was used to make the problem worse. This is the story of OpenAI.
Before I go on, two qualifiers:
- This post will be much easier to follow if you have some familiarity with the AI safety problem. For a quick summary you can read Scott Alexander’s Superintelligence FAQ. For a more comprehensive account see Nick Bostrom’s book Superintelligence.
- AI is an area in which even most highly informed people should have lots of uncertainty. I wouldn't be surprised if my opinion changes a lot after publishing this post, as I learn relevant information. I'm publishing this because I think this process should go on in public.
The story of OpenAI
Before OpenAI, there was DeepMind, a for-profit venture working on "deep learning" techniques. It was widely regarded as *the* advanced AI research organization. If any current effort was going to produce superhuman intelligence, it was DeepMind.
Elsewhere, industrialist Elon Musk was working on more concrete (and largely successful) projects to benefit humanity, like commercially viable electric cars, solar panels cheaper than ordinary roofing, cheap spaceflight with reusable rockets, and a long-run plan for a Mars colony. When he heard the arguments people like Eliezer Yudkowsky and Nick Bostrom were making about AI risk, he was persuaded that there was something to worry about, though he initially thought a Mars colony might save us. But when DeepMind’s head, Demis Hassabis, pointed out that Mars wasn't far enough to escape the reach of a true superintelligence, he decided he had to do something about it:
Hassabis, a co-founder of the mysterious London laboratory DeepMind, had come to Musk’s SpaceX rocket factory, outside Los Angeles, a few years ago. […] Musk explained that his ultimate goal at SpaceX was the most important project in the world: interplanetary colonization.
Hassabis replied that, in fact, he was working on the most important project in the world: developing artificial super-intelligence. Musk countered that this was one reason we needed to colonize Mars—so that we’ll have a bolt-hole if A.I. goes rogue and turns on humanity. Amused, Hassabis said that A.I. would simply follow humans to Mars.
[…]
Musk is not going gently. He plans on fighting this with every fiber of his carbon-based being. Musk and Altman have founded OpenAI, a billion-dollar nonprofit company, to work for safer artificial intelligence.
OpenAI’s primary strategy is to hire top AI researchers to do cutting-edge AI capacity research and publish the results, in order to ensure widespread access. Some of this involves making sure AI does what you meant it to do, which is a form of the value alignment problem mentioned above.
Intelligence and superintelligence
No one knows exactly what research will result in the creation of a general intelligence that can do anything a human can, much less a superintelligence - otherwise we’d already know how to build one. Some AI research is clearly not on the path towards superintelligence - for instance, applying known techniques to new fields. Other AI research is more general, and might plausibly be making progress towards a superintelligence. It could be that the sort of research DeepMind and OpenAI are working on is directly relevant to building a superintelligence, or it could be that their methods will tap out long before then. These are different scenarios, and need to be evaluated separately.
What if OpenAI and DeepMind are working on problems relevant to superintelligence?
If OpenAI is working on things that are directly relevant to the creation of a superintelligence, then its very existence makes an arms race with DeepMind more likely. This is really bad! Moreover, sharing results openly makes it easier for other institutions or individuals, who may care less about safety, to make progress on building a superintelligence.
Arms races are dangerous
One thing nearly everyone thinking seriously about the AI problem agrees on is that an arms race towards superintelligence would be very bad news. The main problem occurs in what is called a “fast takeoff” scenario. If AI progress is smooth and gradual even past the point of human-level AI, then we may have plenty of time to correct any mistakes we make. But if there’s some threshold beyond which an AI would be able to improve itself faster than we could possibly keep up with, then we only get one chance to do it right.
AI value alignment is hard, and AI capacity is likely to be easier, so anything that causes an AI team to rush makes our chances substantially worse; if they get safety even slightly wrong but get capacity right enough, we may all end up dead. But if you’re worried that the other team will unleash a potentially dangerous superintelligence first, then you might be willing to skip some steps on safety to preempt them. And they, having more reason to trust themselves than you, might notice that you’re rushing ahead, get worried that your team will destroy the world, and rush their (probably safe, but they’re not sure) AI into existence.
OpenAI promotes competition
DeepMind used to be the standout AI research organization. With a comfortable lead on everyone else, they could afford to take their time to check their work if they thought they were on the verge of doing something really dangerous. But OpenAI is now widely regarded as a credible close competitor. However dangerous you think DeepMind might have been in the absence of an arms race dynamic, this makes them more dangerous, not less. Moreover, by sharing their results, they are making it easier to create *other* close competitors to DeepMind, some of whom may not be so committed to AI safety.
We at least know that DeepMind, like OpenAI, has put some resources into safety research. What about the unknown people or organizations who might leverage AI capacity research published by OpenAI?
For more on how openly sharing technology with extreme destructive potential might be extremely harmful, see Scott Alexander’s Should AI be Open?, and Nick Bostrom’s Strategic Implications of Openness in AI Development.
What if OpenAI and DeepMind are not working on problems relevant to superintelligence?
Suppose OpenAI and DeepMind are largely not working on problems highly relevant to superintelligence. (Personally I consider this the more likely scenario.) By portraying short-run AI capacity work as a way to get to safe superintelligence, OpenAI’s existence diverts attention and resources from things actually focused on the problem of superintelligence value alignment, such as MIRI or FHI.
I suspect that in the long run this will make it harder to get funding for long-run AI safety organizations. The Open Philanthropy Project just made its largest grant ever, to OpenAI, to buy a seat on OpenAI’s board for Open Philanthropy Project executive director Holden Karnofsky. This is larger than their recent grants to MIRI, FHI, FLI, and the Center for Human-Compatible AI all together.
But the problem is not just money - it’s time and attention. The Open Philanthropy Project doesn’t think that OpenAI is underfunded and could do more good with the extra money. Instead, it seems to think that Holden can be a good influence on OpenAI. This means that of the time he's allocating to AI safety, a fair amount has been diverted to OpenAI.
This may also make it harder for organizations specializing in the sort of long-run AI alignment problems that don't have immediate applications to attract top talent. People who hear about AI safety research and are persuaded to look into it will have a harder time finding direct efforts to solve key long-run problems, since an organization focused on increasing short-run AI capacity will dominate AI safety's public image.
Why do good inputs turn bad?
OpenAI was founded by people trying to do good, and has hired some very good and highly talented people. It seems to be doing genuinely good capacity research. To the extent that this is not dangerously close to superintelligence, it’s better to share this sort of thing than not – they could create a huge positive externality, a fantastic public good. Making the world richer in a way that widely distributes the gains is very, very good.
Separately, many people at OpenAI seem genuinely concerned about AI safety, want to prevent disaster, and have done real work to promote long-run AI safety research. For instance, my former housemate Paul Christiano, who is one of the most careful and insightful AI safety thinkers I know of, is currently employed at OpenAI. He is still doing AI safety work – for instance, he coauthored Concrete Problems in AI Safety with, among others, Dario Amodei and John Schulman, other OpenAI researchers.
Unfortunately, I don’t see how those two things make sense *jointly* in the same organization. I’ve talked with a lot of people in the AI risk community about this, and they’ve often attempted to steelman the case for OpenAI, but I haven’t found anyone willing to claim, as their own opinion, that OpenAI as conceived was a good idea. It doesn’t make sense to anyone who is worried at all about the long-run AI alignment problem.
Something very puzzling is going on here. Good people tried to spend money on addressing an important problem, but somehow the money got spent on the thing most likely to make that exact problem worse. Whatever is going on here, it seems important to understand if you want to use your money to better the world.
(Cross-posted at LessWrong)
Comments
(I also find it odd that you'd call Paul careful, as he is by far the most optimistic/panglossian and least careful of serious AI safety researchers working today, from what I can tell from his writing, frequently making large unstated assumptions in frameworks he's proposing would be safe if they could be achieved.)
This signalled to me that Elon's main concern still surrounds democratisation of AI, and I feel that he will still influence the thinking and mission of all employees at OpenAI. Which made me update that OpenAI is more likely to be damaging.
ETA: I actually believe that winks by insiders are of limited value compared with institutional incentives and public coordination. (See this piece by Matt Yglesias on why politicians keep most policy promises for some of the reasoning.)
Lots of people talk about democratizing AI outside OpenAI, and there is vastly more code/pseudocode released by other organizations, including a big fraction of DeepMind's intellectual work. It is fair to ask if this overall level of openness is too high, which may depend on one's assumptions about timelines. But I don't see a massive difference between DeepMind and OpenAI in terms of philosophy around openness, other than the name (personally, BeneficialAI or GoodAI seem better to me but the latter is taken :) ).
Lastly, I agree that arms races are very important but am skeptical of the OpenAI-->arms race theory. If there is any effect, it's a matter of degree, and I agree with someone else's point that it's not just about DeepMind and OpenAI. Other things cause arms races besides new organizations. To be fair in your analysis, you should probably also consider the fact that AlphaGo was directly cited as an accelerant of a Korean AI investment and brought tons of attention to AI in Asia more broadly.
It wasn't just the DeepMind show before OpenAI started, and DeepMind and OpenAI still aren't the only players in the field.
Consider FAIR, Google Brain, &c. Heck, OpenAI just lost one of their most prominent ML researchers (back) to Google Brain a few months ago.
It makes almost no sense to describe the state of the field as a two-party arms race between DeepMind and OpenAI. That's really just a factually inaccurate premise.
But compare FAIR, https://research.fb.com/category/facebook-ai-research-fair/: "Facebook Artificial Intelligence Researchers (FAIR) seek to understand and develop systems with human level intelligence by advancing the longer-term academic problems surrounding AI."
Or Google Brain, https://research.google.com/teams/brain/: "Make machines intelligent. Improve people’s lives."
Or Microsoft AI, https://www.microsoft.com/en-us/research/research-area/artificial-intelligence/: "At Microsoft, researchers in artificial intelligence are harnessing the explosion of digital data and computational power with advanced algorithms to enable collaborative and natural interactions between people and machines that extend the human ability to sense, learn and understand. The research infuses computers, materials and systems with the ability to reason, communicate and perform with humanlike skill and agility."
You'd need a painfully contorted definition of criterion (a) to end up with just DeepMind and OpenAI – basically by reading more into PR than into mission statements or actual research.
And I think Vicarious is generally regarded as somewhere between a joke and a scam.
Also, why focus on research groups rather than individual actors? After all, so much ML/AI research is publicly available on arXiv, blogs, etc. Probably, influencing research groups is a better strategy for people who want power over the future. Groups have more power than individuals since there are just more of them doing research (though maybe groupthink could create problems), and influencing each of a group of N people maybe doesn't take N times more effort than influencing a person the same amount.
First, some common ground: an arms race is brewing.
OpenAI's role there is massively dwarfed by AlphaGo.
When AlphaGo upset Lee Sedol 4-1, then proceeded to wipe the floor with the rest of the Go community pros (60-0), it hardly went unnoticed in Asia. The game is thousands of years old, a far deeper part of Korean, Japanese, and Chinese culture than chess is here. Their top scientists and government officials will not let DeepMind humiliate them again so easily.
China will have the world's most powerful supercomputer up this year - 70 petaflops. Japan is building a 120-petaflop supercomputer dedicated specifically to ML research. We all know how much China likes losing face to Japan; expect bigger supercomputers.
The AI Safety community skews hard Anglo-American. OpenAI and DeepMind have offices less than 50 miles apart. It's easy to forget that Asia has highly talented ML researchers. A dominant first place in the 2016 ImageNet competition went to CUImage, with second place to Hikvision. That isn't Carnegie Mellon and Harvard.
Culturally, Asia is much more amenable to AGI. There is no Cartesian "consciousness" or "soul" reservation like the one we have in the West. There has never been a Chinese AI Winter. Their national strategists can calculate the power of AGI just as lucidly as we can.
Don't expect the arms race to slow down. The world is becoming more nationalistic and less cooperative. I'd bet attempts to slow international progress for the sake of safety will be viewed as bad faith in China, especially since Andrew Ng has used his prestige there to mollify safety concerns on state TV with the "overpopulation on Mars" line.
As far as I can see, the arms race is ON. OpenAI and DeepMind are far from the only players. Humanity's hope doesn't lie in trying to sneak the AGI cat back in the bag, but rather progressing in AI Safety as rapidly as possible. If we can open-source a robust AGI Safety testing suite, it might not matter who gets there first. To that end, OpenAI is a massive boon.
Was this interview with Elon Musk also broadcast live on TV? https://futurism.com/videos/baidu-ceo-robin-li-interviews-bill-gates-and-elon-musk-at-the-boao-forum/
Are you sure there wasn't a Japanese AI winter after this project? https://en.wikipedia.org/wiki/Fifth_generation_computer
Actually, I think there is a large benefit here. Many people will take AI Safety much more seriously if it's being proposed by an organization that is doing great capacity research as well. MIRI has often had a lot of difficulty getting people to listen to them, while if Facebook or Google were proposing similar ideas, they would be taken more seriously.
1. OpenAI's work is probably a distraction from the main business of aligning AI. I argued [here](https://medium.com/ai-control/prosaic-ai-control-b959644d79c2) that we should work on alignment for ML, and you didn't really engage with that argument. I do agree that today OpenAI is not investing much in alignment.
2. OpenAI's existence makes AI development more competitive and less cooperative. I agree that in general it's harder to coordinate people if they are spread across N+1 groups than if they are spread across N groups (though I think this article significantly overstates the effect). To the extent that we are all in this to make the world better and make credible commitments to that effect, we are free to talk and coordinate arbitrarily closely. In general I think it's nearly as plausible that adding an (N+1)st sufficiently well-intentioned group would improve rather than harm coordination. So I suspect the real disagreement between you and the OpenAI founders is whether OpenAI will really have a stronger commitment to making AI go well.
Put more sharply: supposing that you were in Elon's position and thought that Google and DeepMind were likely to take destructive actions, would you then reason "but adding an (N+1)st player would make things worse all else equal, so I guess I'll leave it to them and hope for the best"? If not, then it seems like you are focusing on the wrong part of the disagreement here.
I do think that it's important that OpenAI get along well with all of the established players, especially conditioned on OpenAI being an important player and others also being willing to play ball regarding credible commitments to pro-social behavior.
3. OpenAI's openness makes AI development more competitive and less cooperative. I do agree that helping more people do AI research will make coordination harder, all else equal, and that openness makes it easier for more people to become involved in AI. (Though this is an ironic departure from the theme of your recent writings.) The point of openness is to do other good things, e.g. to improve welfare of existing people.
I think that current openness has a pretty small impact on alignment, and the effect on other concerns is larger. If you share my view, then this isn't a good place for someone interested in alignment to ask for a concession (compared to pushing for more investment in alignment or closer cooperation amongst competitors).
Some quick arguments against the effect being big: the prospect of a monopoly on AI development has always been extremely remote; limited access to 2017 AI results won't be an important driver of participation in cutting edge AI research in the future (as compared to access to computing hardware and researchers); and there is a compensating effect where openness amongst competitors would make the situation more cooperative and less competitive (if it were actually done).
An unconditional commitment to publishing everything could certainly become problematic. I think that OpenAI's strongest commitments are to broad access to the benefits of AI and broad input into decision-making about AI. Those aren't controversial positions, but I'm sure that Elon doesn't expect DeepMind to live up to them. I would certainly have preferred that OpenAI have communicated about this differently.
For what it's worth, I think that the discussion of this topic by EAs is often unhelpful: if everyone agrees that there is a desirable conception of openness, and an organization has "open" in its name, then it seems like you should be supporting efforts to adopt the desirable interpretation rather than trying to argue that the original interpretation was problematic / trying to make people feel bad about sloppy messaging in the past.
On (2), if OpenAI's not going to be a standout player with one to very few rivals, then its main effect* is eating up unjustified buzz. That seems like it would slow down both AI and AI safety, but slow down AI safety more because not all AI research institutions are safety-branded.
On (3) maybe OpenAI might try persuading Elon Musk first that its safety plan isn't just AI for everybody. If he's not persuaded of that, then I don't see why I should be, since I have far less control over and access to OpenAI than he does. Overall I am not very willing to assume that if I hear both X and Y and prefer Y, then Y is true.
I think our substantive disagreement on (3) depends on (1). It's imaginable to me that prosaic AI safety is enough, but in that world "AI Safety" doesn't really need to be a thing, because it's just part of capacity research. I put substantial probability (>50%) on MIRI being right because AGI is qualitatively different in ways that need qualitatively different safety work. In that scenario it's bad to conflate AI safety measures with weak AI capacity sharing, since then people will work on the easier problem and call it the harder one.
Separately, I think creating weak AI capacity and sharing it with the world is probably really good, and I'm glad people are doing it, and I'm glad people are working on making it not stupid. I just don't think that needs the term "AI Safety" or its various synonyms.
*The main effect of the organization itself. The researchers would presumably just be doing AI research somewhere else.
Why is this true?
> I just don't think that needs the term "AI Safety" or its various synonyms.
OpenAI describes its mission as "build safe AGI, and ensure AGI's benefits are as widely and evenly distributed as possible" (https://openai.com/about/#mission). Those are two different things with different benefits.
> On (3) maybe OpenAI might try persuading Elon Musk first that its safety plan isn't just AI for everybody.
I think Elon's view is that democratization of AI is important to avoiding some undesirable situations. I don't think he expects openness to resolve the alignment problem, which he recognizes as a problem. (I disagree with his overall view of alignment, but that's a separate discussion.) Those are just two different steps to obtaining a good outcome, both of which are necessary.
Eliezer Yudkowsky's original model of "AI Safety" entails gaining a fundamental understanding of how to ensure provable safety of even a superintelligent, rapidly self-improving artificial intelligence.
This is a hard problem -- it is hugely underspecified, for one thing -- so it is a very long-term project. Given that I think strong AI will not be here for a long time, I think this is fine.
Paul Christiano, the leading safety researcher at OpenAI, has a somewhat different model of "AI safety" that involves working on more tractable problems of bounding and aligning the actions of "prosaic AIs" like, for instance, a reinforcement learner that functions as a virtual corporation. Christiano's hypothetical "prosaic AIs" are weaker than Yudkowsky's notion of "strong AI" -- for example, they need not (and, I think, probably would not) be recursively self-improving. They would not even have to be "general intelligences" to fit Christiano's criterion of "can replicate human behavior".
The methods for dealing with "prosaic AIs" that I've heard about are qualitatively different from the thinking that was common in MIRI/SingInst in the old days. There, people thought largely about game theory and decision theory -- one assumes an agent *can* do whatever it wants to do, ignoring implementation details, and one thinks about aligning its incentives so that it chooses to do desirable things. In a machine-learning paradigm, by contrast, one makes assumptions about how the AI learns and responds to information, and imagines building it to have certain safeguards in its learning process. In other words, it's solving a much easier problem, about a *known* (if poorly interpretable) machine, rather than an arbitrarily advanced and self-improving machine. Most of the early debates on LessWrong were about asking whether one could "just" put safeguards into an AI (limiting the scope of its behavior, sometimes called "tool AI" or "oracle AI"), and Yudkowsky's answer was "no." Like Christiano, Holden Karnofsky believed (as of 2012) that tool AI was likely to be a feasible approach to safety.
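To make the contrast concrete, here is a toy sketch (my own illustration, not anyone's actual proposal) of what a "safeguard in the learning process" can look like: the learning rule itself penalizes divergence from a trusted reference policy, rather than reasoning about the incentives of an arbitrarily capable agent. The numbers and the choice of a KL penalty are assumptions made purely for illustration.

```python
# Toy illustration (not anyone's actual proposal): an ML-paradigm "safeguard"
# baked into the learning rule - the policy maximizes reward minus a KL penalty
# that keeps it close to a trusted reference policy.
import numpy as np

reward = np.array([1.0, 0.2, 0.0, 5.0])         # action 3 is high-reward but unvetted
reference = np.array([0.40, 0.30, 0.29, 0.01])  # trusted baseline almost never takes it
beta = 3.0                                      # strength of the KL safeguard

logits = np.zeros(4)
for _ in range(5000):
    policy = np.exp(logits) / np.exp(logits).sum()
    # Gradient of  E[reward] - beta * KL(policy || reference)  with respect to policy...
    dJ = reward - beta * (np.log(policy / reference) + 1.0)
    # ...pushed through the softmax so we can do plain gradient ascent on the logits.
    logits += 0.05 * policy * (dJ - policy @ dJ)

policy = np.exp(logits) / np.exp(logits).sum()
print("learned policy:                ", np.round(policy, 3))
print("probability of unvetted action:", round(float(policy[3]), 3))
# With beta = 0 the policy collapses onto the unvetted high-reward action; with
# the penalty, most probability mass should stay on actions the reference endorses.
```

The point is only that this style of work treats the learner as a known mechanism whose training procedure we get to shape, which is exactly the contrast with the older decision-theoretic framing.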
I think there are probably *many* qualitatively different classes of "strong AI" of differing "strength". Some AIs which already exist (e.g. AlphaGo and image-recognition deep learning algorithms) are "human-level" in that they solve challenging cognitive tasks better than the best humans. But these are "narrow AIs", trained on a single task. One could imagine a future AI that was "general", more like a human toddler (or even a dog), which could learn a variety of behaviors and adapt to a range of environments without requiring correspondingly more training data. One could imagine AIs that are "conceptual" (that develop robust abstractions/concepts invariant over irrelevant properties) or "logical" (capable of drawing inferences over properties of agglomerative processes like grammar or computable functions). And there are recursively self-improving AIs, which rewrite their own source code in order to better achieve goals.
It seems very likely that defenses against the risks of "weaker" AIs will not work against "stronger" AIs, and Christiano's "prosaic AI" is among the weaker types of "strong AI."
This is fine in itself -- there's nothing wrong with working on an easy problem before tackling a hard one.
However, I think Karnofsky and Christiano are incorrect in believing (or promoting the idea) that this easy problem is the *whole* AI safety problem, or the bulk of it.
And I think, given that OpenAI is the biggest and most visible institution working on "AI Safety", this grant will lead to the belief, within the (rather large) community of technical people interested in AI, that the "easy problem" of prosaic AI control is the whole of the problem of AI safety.
It also gives the impression that working on AI safety is easily contiguous with being a conventional machine learning researcher -- getting a PhD in ML, working at software companies with big research divisions, and so on. You can go from that world to AI safety, and return from AI safety to the world of ML, entirely painlessly and with no cost to career capital.
I think that the meat of the AI safety problem will involve creating entirely new fields of mathematics or computer science -- it's that hard a problem -- and thus will *not* be nicely contiguous with a career as an ML researcher/engineer. But people have strong incentives to prefer to believe in a world where solving the most important problems requires no sacrifice of professional success, so they're incentivized to believe that AI safety is relatively easy and tractable with the toolkit of already-existing ML.
"AI safety is basically like ML" is a *dangerously seductive meme*, and, I believe, untrue. "We already know how to model the mind; it works like a neural net" is also a dangerously seductive meme, for somewhat different reasons (it's flattering to people who know how to build deep learning networks if their existing toolkit explains all of human thought), and I also believe it's untrue. Promoting those memes among the very set of people who are best equipped to work on AI safety, or other fields such as cognitive science or pure machine learning research, is harmful to scientific progress as well as to safety.
As far as I can tell, the MIRI view is that my work is aimed at a problem which is *not possible*, not that it is aimed at a problem which is too easy. The MIRI view is not "If we just wanted to align a human-level consequentialist produced by evolution, that would be no problem. We're concerned about the challenge posed by *real* AI."
One part of this is the disagreement about whether the overall approach I'm taking could possibly work, with my position being "something like 50-50" and the MIRI position being "obviously not" (and normal ML researchers' positions being skepticism about our perspective on the problem).
There is a broader disagreement about whether any "easy" approach can work, with my position being "you should try the easy approaches extensively before trying to rally the community behind a crazy hard approach" and the MIRI position apparently being something like "we have basically ruled out the easy approaches, but the argument/evidence is really complicated and subtle."
This surprises me, and I think I haven't heard about this.
Are you saying that they believe that you *can't*, in principle, constrain a reinforcement learner with things like adversarial examples or human feedback?
I think that all the MIRI researchers believe this will be exceptionally hard, and most believe it won't be possible for humans to do, if you want a solution that will work for arbitrarily powerful RL systems. (Note that we more or less know that model-free RL can get you to human-level consequentialism, if you are willing to spend as much computation time as evolution did and use an appropriate multi-agent environment.) I'm not sure about their views on "in principle" and it may depend on how you read that phrase.
How do we know this?
How do we know this? Also how is this compatible with the Atari games not being solved yet?
> How do we know this?
I interpreted this to be a reference to the evolution of humans.
I am in the middle but much closer to MIRI, and think it is unlikely that a sufficiently strong reinforcement learner could be constrained, even in principle, by adversarial examples or human feedback, while allowing it to be anything approaching maximally useful (as per Paul's condition that safety not be too expensive, which in context seems right).
I don't even think we have shown that we can in principle contain HUMANS via adversarial examples or human feedback while still allowing them to be anything approaching maximally useful. I haven't even heard reasonably plausible ideas for doing so!
However, it should also probably be easier to get to safety on systems that are made of humans, most of whose actuators are humans, and where a human is in the loop on nearly every high-level decision they make, than on systems that can go much faster than humans in ways that quickly become entirely opaque to us.
If by "weird" we mean "a weird way to build a safe AI," and by "average game theorist" we mean "average algorithmic game theorist," then I don't think this is true right now. Moreover, I doubt that anyone's views will change if/when it becomes clear that this isn't true.
OpenPhil has encouraged Good Ventures *not* to rapidly spend down its endowment on charitable causes, despite the fact that Dustin Moskovitz has expressed the goal of giving away his fortune in his lifetime. Now, OPP recommends its largest ever grant -- to the organization that employs two of Holden Karnofsky's long-time friends and roommates. (Not, for instance, to Stuart Russell, the world's most prominent AI safety researcher.)
From the outside, this looks like nepotism.
It's especially unfavorable that OPP's reasoning (http://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/openai-general-support#Case_for_the_grant) doesn't involve an overview of organizations and individual researchers working on AI safety. It simply says that "technical advisors" judge OpenAI and DeepMind as the major players in the field (not mentioning Google Brain, FAIR, IARPA, Baidu, etc) and that OPP could influence AI safety for the better by starting a partnership with OpenAI. This suggests that it's not that they think OpenAI is the *most effective* AI safety org, but that it's the best candidate for a partnership with OPP. This is plausible, given the close existing connections between OPP and OpenAI researchers. But this policy is out of line with the GiveWell-style policy of evaluating the impact of charities and donating to the most effective ones. If, instead of giving to the *best* organizations, you give to the ones that you think you can get most value out of influencing for the better, it's much harder to give a public accounting of why your spending has good outcomes, since all of your positive influence is happening behind closed doors.
But usually, professional fairness involves the assumption that while personal relationships are usable for hypothesis generation, some sort of *impersonal* objective criteria should be used for evaluation. You give your friend a chance to interview at your company, you don't just give her a job.
It seems perfectly natural that a lot of the people working in the same field are going to get to know each other. It seems a little off that they're just going straight to the collaboration/influence stage without passing through any "fair" tests (whether a GiveWell-style review of impacts, or a market mechanism, or public discourse.)
I think this means that outsiders should think of OpenAI more or less as they think of the Santa Fe Institute or the Institute for New Economic Thinking -- like "Ok, some bright people with some ideas have been given some money to play with, let's see what they do with it, the results could be anywhere from great to nonsense." OpenAI should not be thought of the way people think of Harvard (as, like, correct by default) or the way LessWrongers think of MIRI (as "our team" or "our friends.")
> OpenAI should not be thought of the way people think of Harvard (as, like, correct by default)
Is there anywhere that should be thought of like that? Because definitely not Harvard.
> This is plausible, given the close existing connections between OPP and OpenAI researchers.
It's not just that - the other major players are for-profit or government and thus cannot receive donations. Their grant to MIRI was much smaller in absolute terms but a much larger percentage of MIRI's budget.
> It seems perfectly natural that a lot of the people working in the same field are going to get to know each other. It seems a little off that they're just going straight to the collaboration/influence stage without passing through any "fair" tests.
I think OPP has been moving away from the concept of fairness for a very long time, and that's a good thing, for reasons currently locked in 3 unfinished blog posts. Grants from OPP are not supposed to be prizes in fair competitions; they're supposed to effect as much change as possible. This is a problem if people treat them as fair competitions, and especially if OPP doesn't fight this perception, but they've always discouraged people from donating and publicized that they're not following the Find The Best philosophy (http://www.openphilanthropy.org/blog/hits-based-giving). A strike against them is the name "open": that's obviously incorrect and misleading.
OpenAI collaborates with DeepMind on safety: https://blog.openai.com/deep-reinforcement-learning-from-human-preferences/
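For readers who haven't seen that work: the linked collaboration fits a reward model to pairwise human preference judgments and then optimizes a policy against it. Below is a rough sketch of the reward-modelling step only, with a synthetic "human", a linear reward model, and arbitrary dimensions standing in for the neural networks and trajectory clips used in the actual work; all of those simplifications are my assumptions.

```python
# Rough sketch of fitting a reward model to pairwise preferences (Bradley-Terry
# style). The synthetic "human", the linear model, and every constant here are
# illustrative assumptions, not the published setup.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
true_w = rng.normal(size=dim)   # stands in for the human's hidden preferences

# Simulated comparisons: pairs of feature vectors, label = 1 if the "human"
# prefers the first element of each pair.
pairs = rng.normal(size=(500, 2, dim))
labels = (pairs[:, 0] @ true_w > pairs[:, 1] @ true_w).astype(float)

w = np.zeros(dim)               # learned reward model
for _ in range(2000):
    diff = pairs[:, 0] - pairs[:, 1]
    p = 1.0 / (1.0 + np.exp(-(diff @ w)))                    # modelled P(first preferred)
    w -= 0.5 * ((p - labels)[:, None] * diff).mean(axis=0)   # cross-entropy gradient step

print("correlation between true and learned reward weights:",
      round(float(np.corrcoef(true_w, w)[0, 1]), 3))
# The learned reward function would then be handed to an ordinary RL algorithm.
```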