I took a bunch of days off after my first day for various reasons, during which I came up with my plan to summarize each day’s work, so this is a retrospective with a longer lag than I hope will be usual.
I began by tracing some of my uncertainty on what to do back to uncertainty about how the world works. I decided to focus on the likely timing and speed of an intelligence explosion, because if I end up with a strong answer about this, it could narrow down my plausible options a lot.
I focused mostly on the timing of human-level artificial general intelligence, leaving the question of whether a takeoff is likely to be fast or slow for later. I also decided to leave aside the question of existential risk from AI that isn’t even reliably superhuman, although I suspect that this is a substantial risk as well.
I enumerated a few plausible paths to human-level intelligence, and began looking into how long each might take. I was not able to get a final estimate for any path, but got as far as determining that the cost and availability of computing hardware is not likely to be the primary constraining factor after about ten years, so I can’t just extrapolate using Moore’s law. Predicting these timelines is going to require a model of how long the relevant theoretical or non-computing technical insights will take to generate. This will be messy.
From decision uncertainty to fact uncertainty
A lot of my uncertainty around what to do in the world comes from not knowing how much time my actions have to bear fruit. It looks like I can constrain this uncertainty, to some extent, by having more specific beliefs about likely AI outcomes:
First, there will likely either be a “fast takeoff,” in which there is a runaway intelligence explosion leaving meat-humans in the dust almost immediately after crossing some threshold near human-level intelligence, or a “slow takeoff,” in which AI just gets smarter slowly and in a largely human-controlled way.
If we are likely to get an intelligence explosion very soon, like in a few years, then the only work that really matters is work directly on the problem of AI risk, or work that directly enables that work to go on (e.g. earning to give to AI risk organizations, or doing other things that make the lives of people working on AI risk easier and free up hours for them to work, or other resources).
If we are likely to get an intelligence explosion in a couple of decades, then direct work on the problem is still appealing, but so are short-term human capital building measures (e.g. persuading people outside the current AI risk community to work on the problem, especially those with skills and resources currently in short supply, and improving the communication or intellectual processes of people currently in the field).
If an intelligence explosion is likely to happen several decades in the future, then longer-term capital building to improve humanity’s collective problem-solving capacity and ability to steer the future deliberately becomes more appealing. Examples of this include:
- Getting education right, especially elite education.
- Human intelligence enhancement technologies
- Improving political and cultural institutions to nudge humanity further towards a high-trust high-cooperation equilibrium
If, on the other hand, there just isn’t a relevant threshold anytime in the next few centuries at which we get a runaway intelligence explosion, then AI recedes into the background as just another sort of ordinary technological progress, and I should be thinking in a more conventional way about how to help humanity direct its future well. This is similar to the “Fast takeoff, but generations later” scenario, though it seems to me like in slow takeoff scenarios technological unemployment is more likely to be a relevant concern.
Estimating time to human-level AGI
Setting aside the probability of a fast vs a slow takeoff, my best guess is that the relevant threshold for an intelligence explosion will be somewhere near human-level artificial general intelligence (AGI). I have not examined this assumption much yet.
I thought of a few plausible paths to human-level AGI:
- Whole brain emulation
- Human-inspired AI
- Pure general intelligence, with a generalized ability to construct “perceptive” systems
- “Rabbit out of a hat” form-agnostic methods, such as:
  - Building a highly intelligent “tool AI” and telling it to build an agent-like AGI
  - An evolution-type simulation
So far, this is just based on thinking on my own and reviewing Superintelligence. There are probably other places to check for plausible paths to human-level AGI.
Whole brain emulation
Whole brain emulation is an approach where we create human-level electronic intelligence by figuring out how the human brain works and emulating it on a computer.
Bostrom and Sandberg seem to think this will be substantially held back by computing power (“If WBE is pursued successfully, at present it looks like the need for raw computing power for real‐time simulation and funding for building large‐scale automated scanning/processing facilities are the factors most likely to hold back large‐scale simulations.”), which will be available sometime mid-century. However, AI Impacts estimates on the basis of TEPS that a human brain would cost about $4,700 – $170,000/hour, including energy costs, to compute today. Since hardware costs decrease by a factor of 10 roughly every 4 years, if current trends continue we should expect running an emulation of a human brain to be about as affordable as paying an upper middle class developed-world hourly rate in about 5-15 years. Superintelligence suggests that it will take at least 13 years from today (15 at time of publication) to achieve the other relevant precursor technologies: “We can also say, with greater confidence than for the AI path, that the emulation path will not succeed in the near future (within the next fifteen years, say) because we know that several challenging precursor technologies have not yet been developed.” However, I’m not sure where this estimate comes from.
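To check the 5-15 year figure, here is a minimal sketch of the extrapolation. The $100/hour target is my own stand-in for an "upper middle class developed-world hourly rate" (the text above does not give a number), and the function name is purely illustrative. The only inputs taken from the sources are the $4,700 – $170,000/hour range and the 10x-per-4-years cost decline.

```python
import math

YEARS_PER_OOM = 4.0          # from the text: costs fall ~10x roughly every 4 years
TARGET_HOURLY_RATE = 100.0   # ASSUMPTION: illustrative hourly wage, not from the source


def years_until_affordable(cost_now, target_cost, years_per_oom=YEARS_PER_OOM):
    """Years until cost_now falls to target_cost, losing one order of
    magnitude every years_per_oom years."""
    if cost_now <= target_cost:
        return 0.0
    return years_per_oom * math.log10(cost_now / target_cost)


# AI Impacts range quoted above: $4,700 to $170,000 per hour.
low_estimate = years_until_affordable(4_700, TARGET_HOURLY_RATE)
high_estimate = years_until_affordable(170_000, TARGET_HOURLY_RATE)

print(f"Cheap end of range:     {low_estimate:.1f} years")
print(f"Expensive end of range: {high_estimate:.1f} years")
```

With this target, both ends of the range land inside the 5-15 year window; a lower target wage pushes both numbers out by four years per factor of ten.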
It looks like, in order to have an opinion about timelines for WBE, I’ll need to look into the path towards WBE in more depth.
If we achieve WBE, it’s uncertain how easy a human-level emulation will be to scale up to something smarter. Plausibly we could just add neurons, but I don’t know whether that will actually work, and whether it will be easy to increase efficiency. Multipolar outcomes seem plausible here.
Human-inspired AI
Instead of emulating a human brain from the ground up, we could come to a theoretical understanding of why it works the way it does, and construct the relevant modules from scratch, using algorithms that are easier to understand and modify and better adapted to digital computation than the ones implemented by the brain.
It seems like I could learn about this by looking into cognitive and brain science (both human and animal), and the evolutionary history of intelligence. For now, here is my uninformed sense of the relevant pieces of human intelligence:
- Motor control and goal-seeking behavior: Nature does this everywhere. We seem to be mostly abstracting away from this one because it is simple and boring, but (I claim, very weakly) a large portion of most brains is devoted to it. It’s possible that AI can abstract from it, but far from certain; it’s conceivable that a superbrain might not sustain itself much more efficiently than we can, in which case we’d end up not terribly uncompetitive. I don’t expect this to be a timeline constraint, but it might affect fast vs slow takeoff.
- Perceptual pattern recognition and discernment: Nature doesn’t seem to have a hard time with this, though IDK how hard it was for evolution. We seem to have made a lot of progress here. Possibly we’re going to be at human levels for things like image recognition as soon as 5 years from now, though it could imaginably be a lot longer. Possibly we’re already as good as it gets without learning how to make minds that can explicitly build and manipulate mental models and use symbolic thinking to pass info between systems. Mainly we appear to be hardware-constrained on this one, which implies that this will fall into place quickly when we have figured out the higher architectural insights.
- Behavioral learning via reinforcement: Nature seems to have had an easy time with this too. It seems as though we have models for how to do this. It’s unclear how hard this is to combine with perception, but nature seemed to have no problem with this so a year seems plausible for getting it to work once we have good enough component perceptual systems. OTOH I can see this leading to super-sphexish behavior. Empirical question - do humans have some special capacity here beyond the reinforcement learning other animals can do?
- Building and manipulating explicit mental models: For some reason this seems expensive for nature. “Executive function” seems to have a lot to do with this, since it often has to overrule or replace our behavioral learning. We have toy versions of this, and it’s unclear whether our conceptual understanding is good enough to automate learning how to do this in a way that’s flexible enough to benefit from machine learning approaches to behavior and pattern recognition. Some cephalopods and the Portia spider appear to have this in some domains. Unclear whether this really is very expensive, or whether existing evolved neuron-based brains are just poorly constructed for doing it.
- Self-aware modeling (related to social modeling): This appears to be very expensive, though not unique. Dolphins?
- Language: language that’s used to represent thoughts internally (rather than just externally via mimesis). Maybe just follows straightforwardly from social cognition, self-aware modeling, pattern-recognizing?
Next actions would be around fleshing out a better version of this model using some actual empirical evidence if it’s available, figuring out which steps look likely to be hard, either theoretically or based on evolution.
Algorithmic improvement seems likely to be pretty easy for this kind of mind since it’s constructed to be modular and elegant.
Pure general intelligence
Figure out the right executive function / model-building / language and internal communication abstractions, and let it decide what to perceive. This has all the hurdles of abstracted human-style AI, plus we’d need to know how to build the general capacity to learn perception within a domain (vs the special-cases we’ve seen in things like image recognition and Go where deep learning teams presumably spend a bunch of time fine-tuning stuff - though I don’t actually know whether that story’s true), but it seems like enough to let the thing immediately be alien, incomprehensible, and plausibly arbitrarily powerful. This seems like a good candidate for a singleton. “Organic” scale-up seems basically irrelevant.
Tool AI builds agent AGI
I don’t really know anything about how close we are to this. I would need to read the literature. This doesn’t seem to constrain the form of the ultimate agent AI much. However, it’s conceivable that a sufficiently safe tool AI that understands enough about human preferences might automatically preferentially build a safe agent AI, just because it can tell that if it builds an unsafe one, we wouldn’t like it.
Evolution-type simulation
Superintelligence goes into an estimate of how hard it would be to generate human-level intelligence by computing some sort of evolutionary process simulating a bunch of animals with neurons. Per Superintelligence it seems like it would take ~10^34-44 FLOPS for a year to emulate a billion years of evolution of animals with neurons. (Superintelligence says 10^31-44 but I don’t see how he gets that lower bound.)
“The computational cost of simulating one neuron depends on the level of detail that one includes in the simulation. Extremely simple neuron models use about 1,000 floating-point operations per second (FLOPS) to simulate one neuron (in real-time). The electrophysiologically realistic Hodgkin–Huxley model uses 1,200,000 FLOPS. A more detailed multi-compartmental model would add another three to four orders of magnitude, while higher-level models that abstract systems of neurons could subtract two to three orders of magnitude from the simple models. If we were to simulate 10^25 neurons over a billion years of evolution (longer than the existence of nervous systems as we know them), and we allow our computers to run for one year, these figures would give us a requirement in the range of 10^31–10^44 FLOPS.”
On the lower bound, if the higher-level model takes 3 OOM off of the simple 1,000 FLOPS neuron models, then a neuron can be computed at 1 FLOPS, so 10^25 neurons would require 10^25 FLOPS. Running at a billion = 10^9 times normal speed, this would seem to require 10^34 FLOPS, not 10^31. Maybe I’m making some obvious error here, but for now I’ll use 10^34 as my lower bound.
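The bounds can be reproduced with a few lines of integer arithmetic. This is only a sketch of my reading of the quoted figures: the constant and function names are my own, and the upper bound assumes the "four orders of magnitude" end of the multi-compartmental range applied to the Hodgkin–Huxley figure.

```python
import math

NEURONS = 10**25   # neurons simulated, per the quoted passage
SPEEDUP = 10**9    # a billion years of evolution compressed into one year


def required_flops(flops_per_neuron):
    """Sustained FLOPS needed to run every neuron at SPEEDUP x real time."""
    return flops_per_neuron * NEURONS * SPEEDUP


# Lower bound: simple 1,000 FLOPS model minus three orders of magnitude.
lower = required_flops(1_000 // 10**3)

# Upper bound: Hodgkin-Huxley (1,200,000 FLOPS) plus four orders of magnitude.
upper = required_flops(1_200_000 * 10**4)

print(f"lower bound ~ 10^{round(math.log10(lower))} FLOPS")
print(f"upper bound ~ 10^{round(math.log10(upper))} FLOPS")
```

On these inputs the lower bound comes out at 10^34, and the upper bound matches the 10^44 quoted from Superintelligence, which suggests the speedup factor is being applied the same way at both ends.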
AI Impacts says there are about 10^21 FLOPS in computing power total in the world, at a cost of about $1 per 10^9 FLOPS per year. (Sanity check: the current global capacity, if scaled at max efficiency, would cost about $10^12 or $1 trillion per year, and naively that means the present-day computing cost of doing that experiment if we had the software ready would be around $10^25-35, or $10 septillion - $100 decillion.)
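The same sanity check as a short script, using only the AI Impacts figures quoted above and the 10^34 to 10^44 FLOPS range for the evolution simulation. The names are illustrative, and the calculation naively assumes the price per FLOPS stays flat at any scale.

```python
WORLD_FLOPS = 10**21             # total world computing power, per AI Impacts
FLOPS_PER_DOLLAR_YEAR = 10**9    # $1 buys about 10^9 FLOPS for a year


def annual_cost_dollars(flops):
    """Naive yearly cost of sustaining the given FLOPS at today's prices."""
    return flops // FLOPS_PER_DOLLAR_YEAR


world_cost = annual_cost_dollars(WORLD_FLOPS)
experiment_low = annual_cost_dollars(10**34)
experiment_high = annual_cost_dollars(10**44)

print(f"World capacity:      ${world_cost:.0e} per year")
print(f"Evolution sim, low:  ${experiment_low:.0e}")
print(f"Evolution sim, high: ${experiment_high:.0e}")
```

Even the low end is thirteen orders of magnitude above what all of today's computing capacity would cost for a year.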
Costs drop by about an OOM every four years, so the evolution program would cost about a trillion dollars ($10^12) in 50-100 years. The world economy also seems to be growing at about 5% per year, which means that in 50 years it will have grown roughly 10x and a $1 trillion investment will be like the world today investing $100 billion, clearly manageable. So most likely, if the Moore family of laws continues, this will be feasible with a massive effort in 50 years, and with a moderate to minor effort (maybe even by a wealthy individual) several decades after that.
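A rough sketch of this extrapolation, using the $10^25-35 present-day cost range, the one-OOM-per-four-years decline, and the 5% growth rate from the text. The function and variable names are mine, and the whole thing assumes both trends hold steady for the full period.

```python
YEARS_PER_OOM = 4       # costs fall ~10x every four years
TARGET_COST_EXP = 12    # $10^12, i.e. one trillion dollars


def years_until_target(cost_exp_now, target_exp=TARGET_COST_EXP):
    """Years for a cost of 10^cost_exp_now dollars to fall to 10^target_exp."""
    return YEARS_PER_OOM * max(0, cost_exp_now - target_exp)


soonest = years_until_target(25)   # starting from $10^25
latest = years_until_target(35)    # starting from $10^35

# World economy compounding at 5% per year for 50 years.
growth_50y = 1.05 ** 50
equivalent_today = 1e12 / growth_50y   # share-of-economy equivalent, in dollars

print(f"$1 trillion price tag reached in {soonest} to {latest} years")
print(f"Economy grows {growth_50y:.1f}x in 50 years")
print(f"$1 trillion then is like ${equivalent_today:.2e} today")
```

The growth factor comes out a little above 10x, so the "like $100 billion today" figure is, if anything, slightly conservative.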