Coherent Extrapolated Volition/Commentary
Regarding the page Coherent Extrapolated Volition:
Coherent Extrapolated Volition Alternatives
My comments have become somewhat out of date, and upon reflection, somewhat different. I'll replace them with a concrete reaction when I have thought it out a bit further.
"I do think there's light in us somewhere, a will that might shape a Nice Place to Live. The question is whether collective volition seeks it out."
Why shouldn't the FAI guarantee that this "light in us somewhere" comes out, even if our collective volition wouldn't seek it? Is our collective volition more important than the light in it?
Is your definition of "light" smarter than our collective volition? Wouldn't you feel ashamed to have tampered if it turns out it would have worked anyway, thereby making a mockery of humanity's self-determination? Wouldn't you be horrified to screw it up? Setting a Last Judge is one thing, predetermining the outcome quite another.
Here are my comments. They're not deep, but it's something. :) I, or someone else, should delete them once they've been addressed.
- first skimming, this looks like a zeroth law of volition based friendliness. should this be addressed? [Zeroth law? Huh? --EY]
- Zeroth law of robotics stating (in general) that the robots should protect humanity above anyone in particular (first law). In this same way, Collective Volition seems to favor humanity's volition over that of any individual (older volition-based friendliness). Just an observation when I first skimmed.
- broken parallelism in the "in poetic terms, our collective volition is..." paragraph [Specify? --EY]
- We tried clearing this up in the chat room. Here's my best guess:
- In poetic terms, our collective volition is what our wish would be if we knew more, thought faster, were more like the people we wished we were, and had grown further together. It's our wish where the extrapolation converges, rather than diverges; where our wishes cohere, rather than interfere; it is extrapolated as we wish it were extrapolated, interpreted as we wish it were interpreted.
What if it's not computationally tractable to extrapolate our volitions very far? What if it requires that you effectively just compute the entire world, with no significant simplifications possible? Is there good reason to suggest this isn't necessary?
-- Nick Hay
[Answered in PAQ 5. --EY]
"I am dismayed, though not shocked, to see everyone who considers Friendly AI walking straight into the whirling razor blades. "
This is unclear. Are you saying that everyone who considers building FAI is committing suicide? Or is it that you are dismayed that people consider FAI a death sentence? Neither of these interpretations seemed to make sense to me.
-- Michael Roy Ames
[Fixed. --EY]
What ever happened to Objective Morality?
[I learned more, thought longer, found a non-mysterious question and discarded my old mysterious answer. --EY]
I'm still a fan of the idea that there is a clear cut way to objectively define 'Good' and 'Evil'. There have been some interesting proposals - I especially like one by David Deutsch in 'The Fabric of Reality' - the MWI of quantum mechanics might allow a definition of Objective Morality as equivalent to convergence of informational patterns across different Everett branches.
[Even if this computation you have specified is the same for all observers, why the heck should I care about this quantity instead of love and laughter? "What do you get if you multiply 6 by 7?" has an objective answer and is equally unrelated to morality. --EY]
With Objective Morality in the bag you wouldn't need to use 'Collective Volition', or indeed any 'Volition' at all would you?
[I suppose you don't care that your Objective Morality bears no relation to the Objective Morality that might be chosen by anyone else in the world? Or that you might reconsider your Objective Morality in a few years, when you knew more, thought longer? Or that your Great Moral Principle sounds cool but omits virtually everything humans care about? You do realize that when I spoke of people walking directly into whirling razor blades without the slightest trace of fear, I had you specifically in mind, right? --EY]
[Read what Hal Finney says below. I can't help thinking that you're actually still implicitly assuming some sort of objective morality and Coherent Extrapolated Volition is just an unsuccessful attempt to 'sweep the problem under the rug'. I read your reply to Hal Finney but I'm not entirely convinced. What is the 'lightness' in humanity? What is the 'darkness'? If there is an Objective Morality I presume that SOME parts of human morality (things that people care about) would be a lower order manifestation of it. And no, I wouldn't presume to attempt to say what Objective Morality is directly... the goal of my FAI would be to discover Objective Morality for itself. As for fear: we don't actually know for sure that self-improving A.I is dangerous - perhaps all working seed A.I's necessarily become Friendly? --MG]
Even more fundamentally, I still doubt even your general approach to the problem. You make what I call 'Totalitarian/Singleton' assumptions. The idea that...like.. a single FAI is going to like.. save the world man. Bear in mind other possibilities.... maybe the whole idea of a single AI trying to save the world is misguided, maybe the 'Multiple A.I' approach would be better. Perhaps any General Intelligence necessarily has to have a unique 'Personal Identity' which explores only part of the search space of 'Goodness' (as an analogy, morality might be like a 'wave function' and any General Intelligence like a specific 'particle' state of that wave function). I have to tell you... this is in fact, what I suspect. In that case it might not make sense for a lone FAI to try to 'save the world' (or at the least there would be a large number of 'degrees of freedom' in the initial specification).
[Gibberish. --EY]
[Um... what I think I was trying to say: the possibility of a large number of equally good initial A.I specifications. In that case, the initial choices of the programmer would unavoidably have a huge influence on the future, since one particular specification would have to be chosen (and I don't see that rolling dice would avoid this - the outcome of the die rolls used as input would still qualify as 'the choice of the programmer'). --MG]
The whole idea of a 'Collective Volition' grates on my Libertarian nerves to be honest.
[Mine too. What's your point? --EY]
You're still making a heck of a lot of assumptions.
What if there is no such thing as volition? That is, what if human mentality cannot be split into things we really want, and things we only think we want? What if our internal mental goals and drives are all peers of one another, with one or another taking control depending on circumstances? What if you told your FAI, "do what I really want, not what I think I want," and it said, "I'm sorry, no such distinction is meaningful"?
It seems to me that this whole approach to FAI is merely sweeping the problem under the rug. Instead of assuming you have a robot which will automatically do the right thing, you are assuming that people's minds are such that the robot can read them and find out what people think is the right thing. But it's a big assumption that people's minds work that way.
Sure, you can imagine extrapolating a mind forward, enhancing and changing it in various ways, and see what its goals become. But many such extrapolations are possible, which will lead to different outcomes. Which one do you pick? Judging which extrapolation would represent our highest and best vision of ourself, the one which is the person we wished we were, must be done on the basis of our current goals. And again that seems to assume that the AI can identify a subset of our goals as being the most human, the deepest and most fundamental. But if all goals were peers then any such choice would be arbitrary.
Your model is contingent on this view of human mental structure as being divisible into good and bad goals. As a result, it's effectively just objective morality all over again. You're assuming a goal over goals, a utility function that weights utility functions, which the AI will somehow have built into it, and which will tell the AI which human goals are the ones they "really" care about and which are crude, base, primitive goals that they are "really" ashamed of. This is the same job that objective morality was supposed to do, except now you are calling it volition.
At a minimum, you should make this model of human motivational structure explicit and offer evidence that our minds work in this way. Even better, you should consider the possibility that the whole approach is incoherent, that you can't build an ethical system out of minds which have been created by an amoral evolutionary process, because you cannot derive "ought" from "is".
Where, at any point in the document, did I speak of the dynamic identifying a subset of our goals as deepest? The dynamic just does things like "knew more", "thought faster", and then asks which goals X would think were deepest if X knew more and thought faster. It's all based strictly off the humans. The model isn't contingent on human mental structure being divisible into good and bad goals by some built-in criterion of inspection, nor is there any utility function over utility functions built into the dynamic. The model is based on humans judging themselves. Which we do.
Look over the Coherent Extrapolated Volition paper closely; can you quote any point where I speak of the dynamic, rather than a human or an extrapolated human or a coherent set of extrapolated humans, performing a moral judgment? The dynamic extrapolates the mind forward using a set of defined operations, and does not pick one outcome as best; the dynamic calculates the spread. I see no concealed reference in such things as "calculating spread" to an objective morality.
Did you read through the entire paper, or some of the paper? I ask as an author seeking feedback.
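As a toy illustration of the "calculates the spread" point in the reply above, here is a minimal Python sketch. It is not taken from the paper: the single numeric axis, the function name `summarize_spread`, and the `coherence_threshold` parameter are all assumptions made up for this example. It only shows what it could look like to report how far several extrapolations diverge without selecting any one of them as best.

```python
import statistics


def summarize_spread(outcomes, coherence_threshold):
    """Describe how far a set of extrapolated outcomes diverge, without ranking them."""
    spread = statistics.pstdev(outcomes)  # population standard deviation
    return {
        "lowest": min(outcomes),
        "highest": max(outcomes),
        "spread": spread,
        # Hypothetical rule: call the outcomes coherent if the spread is small.
        "coheres": spread <= coherence_threshold,
    }


# Hypothetical scores: each number stands for where one extrapolation
# of the same person ends up on a single made-up axis.
tight = summarize_spread([0.70, 0.72, 0.69, 0.71], coherence_threshold=0.05)
wide = summarize_spread([0.10, 0.90, 0.50, 0.95], coherence_threshold=0.05)

print(tight["coheres"], wide["coheres"])
```

Nothing in the function prefers one outcome over another; it returns a description of the whole set, and any judgment about what to do with a wide spread is left outside it.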
1. Do you see "elimination of negative qualia in a way that does not directly conflict with personal freedom" as a possible attractor for collective volition? Do you see it as a probable attractor? If not why (specifically)?
2. How well does the AI need to predict the future in order for all of this to work?
3. Won't the amount of machine intelligence required to wipe out humanity arrive much earlier than the amount of intelligence required to accurately simulate entire countries filled with people and other smaller AIs? What is your plan for this interval of time?
4. FAI initially not programmed to count animals inside its definition of collective. However we would like to be people who value all forms of life and if we knew better we'd know that they are sentient to a certain extent THEREFORE should the FAI give a replicator to each household and forbid killing of animals, even if we were opposed to it? (I think so, but just to check with you)
metaqualia
> To construe your volition, I need to define a dynamic for extrapolating your volition, given knowledge about you. In the case of an FAI, this knowledge might include a complete readout of your brain-state, or an approximate model of your mind-state. The FAI takes the knowledge of Fred's brainstate, and other knowledge possessed by the FAI (such as which box contains the diamond), does... something complicated... and out pops a construal of Fred's volition.
This paragraph should be several PhD programs in length. I'd love to know, in a practical sense, how the AI can possibly understand me at the present time, let alone know enough to confidently predict what is good for me in the future?
- "Readout of my brain-state"? Maybe through nanotechnology, someday. If you're depending on that development for success of the FAI, we may be waiting a long time.
- "Approximate model of my mind-state"? Do I fill out a questionnaire?
- "something complicated"? What an understatement! Considering all the variables, all the other "mind-states" in the world, all the social interactions, environmental changes, resource limitations, local/global conditions, etc, and then considering that it will need to consider near-term and mid-term volitions (does that mean considering all permutations for the next few weeks/months/years?) you'd better have one hell of a computer running this show.
Mike W.
It does sound rather far-fetched, and I'm still undecided as to whether or not Collective Volition is feasible, given the complicated nature of some of its components. However, remember that Collective Volition is only an update to the complete "Friendly AI" document, which together with Eliezer's other documents such as "General Intelligence and Seed AI" (http://singinst.org/GISAI/) and "Levels of Organization in General Intelligence" (http://www.singinst.org/LOGI/) do outline a path to "one hell of a computer", with an intelligence level far above humans.
Michael Richards
This Coherent Extrapolated Volition reminds me a little of Asimov's 'Psychohistory'. Everyone has read the excellent 'Foundation Saga' by Isaac Asimov I presume?
Um.. a thought occurred to me...Don't we already (at least in modern Western nations) have an ultra-powerful Bayesian predictor extrapolating collective volition? It's called (drum roll please)...THE FREE MARKET. In the U.S the stock market succeeds in extrapolating collective volition to around 6 months into the future (the stock market predicts the state of the U.S economy about 6 months in advance). Are you sure that the ideal 'free market' and 'Collective Volition' are not one and the same thing? If the results of the extrapolation were made public this would change 'the public' and render the extrapolation invalid. That's why stock brokers can't predict better than the market... as soon as people realized that the predictions were accurate they would change their behaviour to try to take advantage... rendering the prediction useless. So it's not clear to me that Coherent Extrapolated Volition would be any different to a souped-up fairer version of what we now call 'The Free Market'.
Marc G
Aw, go ahead, take over the world. You know you want to. :-)
You seem to admit that, if you're going to build an ultra-powerful AI, you are responsible for what it does, and that you can't avoid deciding what motivations to give it. Your answer to this is to avoid being a jerk about it, at least as best you can. You appear to believe that building an AI that would implement the collective volition of all humans is non-jerky.
I claim that that's somehow arrogant, while at the same time feeling like an unsatisfying responsibility dodge... and that it abandons the fundamental insight behind your whole plan.
You want to build an AI that's smarter than humans, and then (well, at the same time, really) use it to figure out what humans, collectively, would do if they were more as they would wish to be. Your key insight, it seems to me, is that the AI itself can help to define a better goal system than you (or humans in general) could build for it alone.
... but you're not using the AI to its fullest. You're using your own relatively flawed, relatively fallible insight to decide that the right thing for the AI to pursue is the collective volition of all humans.
As an example of a possible flaw in that, you've obviously thought a lot about what "collective volition" means, but where do you get the idea that "all humans" is the right group over which to compute it? Why not "humans plus great apes", or "all mammals", or "all life", or "humans with IQs over 160", or "humans with IQs UNDER 160"? Or maybe there ought to be a more complicated (pseudo-) definition of "collective volition", or some qualifications to that. Your own ideas are still evolving... how will you know when, or if, you have it right?
Wouldn't you be better off to let the AI handle it?
By this, I mean setting up the AI to implement your own volition... your view as it would be if you were wiser, smarter, more informed, more moral... more as you would wish yourself to be. Basically, telling the AI "solve the whole problem for me", rather than "solve a derived problem that I think may have a related solution".
Yes, that privileges you. It puts your name in a rule, and I know you think that's wrong... but, as you've realized, you are responsible either way. If you're the guy who makes the decision, then it can't be divorced from you... you're just as responsible if you put "all humanity" in the rule as you are if you put "Eliezer Yudkowsky" or "the AI developers". So, unless you're really, really sure you're up to that responsibility, shouldn't you let (an approximation to) a wiser version of yourself make that decision? If that wiser version decides that "all of humanity" is still the way to go, then the AI will adjust itself accordingly, right?
I'll have to admit that I have an ulterior motive here. I don't trust the majority of humanity... and I suspect that I share more values with, say, an AI-programming atheist American Jew, than I do with the totality of humanity. If there's a serious divergence between the extrapolated volition of all humanity, and the extrapolated volition of that AIPAAJ, each under the condition that the extrapolation be done in the direction of "more as X would wish to be", then I suspect I'm going to be morally happier with the AIPAAJ result. And so, by definition, would you be morally happier... and you're responsible for the result.
So, given that you have to make the decision, which makes you more a jerk? Making it according to (an improved and refined version of) your own morals, or punting it to a large group that's more likely to come up with something odious?
Disclaimer: This assumes, of course, that the whole volition extrapolation thing works at all, about which I have enormous doubt. If I had to bet, I'd give 10 to 1 against it.
-- jbash
I think Eli was right not to make individual volition the whole basis of morality. I don't think morality is solely based on what a person really wants. After all, you could imagine a society of sadomasochists that enjoy being raped and sodomized, but this doesn't correspond to our intuitive conceptions of morality at all. On the other hand morality should help people. So morality has to be somehow related to human wants and needs. It boils down to the age-old dilemma of internal wants and needs versus external dictates and constraints.
So where are these external factors to come from? Eli's decided that they should come from 'the group' (humanity) as a whole. But is humanity as a whole really where Eli should be looking? I have my doubts. There could very well be more to morality than this. Simply trying to derive morality from 'the group' only pushes the fundamental questions back to another level, it doesn't really resolve them. What if the 'Collective Volition' (i.e. the morality derived from the group as a whole) still runs totally contrary to reasonable moral intuitions? Eli pulled a clever move when he came up with 'extrapolated volition' - what people would want if they thought longer, faster, knew more, were more wise etc. This pragmatic operational definition might well end up corresponding to morality, but the trouble is that it's an answer suspiciously empty of content. Not wrong as far as it goes maybe, but not necessarily very useful. My first reaction upon seeing the CV 'answer' to the question of morality was that it was rather like being given the answer: 'Well it's a face' to the question: 'What is Mona Lisa?' For instance I could define knowledge as the converging probability resulting from factoring successive pieces of information into Bayes Theorem, but where does that get me? What is the solution to the Riemann Hypothesis? Oh it's the result you would get if people kept factoring in new pieces of maths info into Bayes until there was a convergence recognizable as the solution! Sure, but hardly a satisfactory answer. Morality is defined as the end result of a process, a process so enormously complex it's quite likely to be practically impossible to calculate. In that case CV might be correct as far as it goes, but just useless.
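For readers unfamiliar with the Bayes analogy used above, here is a minimal Python sketch of "factoring successive pieces of information into Bayes Theorem" until the probability converges. The function names and the evidence numbers are hypothetical, chosen only for illustration; the sketch shows the mechanics of the process and says nothing about whether defining an answer this way is satisfying.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | evidence) given P(H), P(evidence | H) and P(evidence | not H)."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    return numerator / evidence


def run_updates(prior, observations):
    """Fold a sequence of (P(e | H), P(e | not H)) pairs into the prior, one at a time."""
    history = [prior]
    for p_if_true, p_if_false in observations:
        prior = bayes_update(prior, p_if_true, p_if_false)
        history.append(prior)
    return history


# Hypothetical evidence: ten independent observations, each twice as
# likely if the hypothesis H is true as if it is false.
history = run_updates(0.5, [(0.8, 0.4)] * 10)

for step, probability in enumerate(history):
    print(step, round(probability, 4))
```

Each observation doubles the odds in favor of H, so the probability climbs toward 1 without ever reaching it. The procedure is easy to state; the commenter's point is that stating it does not tell you what the converged answer is.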
I wouldn't rule out the possibility of some sort of objective morality yet. Sure, you need to look at humans for 'calibration' of any reasonable morality that would speak to the wants and needs of humans, but that doesn't mean that there isn't some sort of objective standard for determining the morality of various human wants and needs.
What Eli seems to be worried about is the possibility of A.I programmers 'taking over the world'. But does the world really need anyone to 'run' it? Not according to the anarcho-capitalists and various other political systems that have been floated. Not that I'm advocating anarchy, I'm just pointing out that the whole idea of a singleton centralized agent might be misguided. In any event the way the world seems to work in the modern free market democracies is that people are assigned status roughly according to their talent and latent cognitive abilities. For instance children have fewer rights than adults, brilliant adults who create good products end up with more economic power etc. Since FAI would have cognitive abilities far beyond an ordinary human, it's not clear why it would be wrong for the FAI to be given the most rights.
Collective Volition is unlikely to be the last word in 'Friendliness theory'. Not even close I suspect.
A quote from your text: "Collective volition looks forward in time, ... Since the output of the collective volition is one of the major forces shaping the future, I'm working on the order-of-evaluation problem to prevent this from becoming an infinite recursion."
My contention: Building a machine which predicts far enough into the future that this future is sensitively dependent on near term human behavior and informing the population of its predictions creates a feedback loop that can only be damped by the predictions being unavoidable, and usually unpleasant.
My reasoning: There is a wide (based on the variability of human behavior) range of possible futures available in the absence of the prediction. There is another, different wide range of possibilities available after the prediction is given, different because the prediction is a strong force in the causal progression. Most of this second range is taken up by futures in which the prediction is proved wrong because people who knew of the prediction took action to prevent it or take advantage of it, leading to a future different from the one predicted. For the machine to be accurate in its prediction it must choose from the narrow slice of future possibilities that are so restrictive that no action available to humanity can change the outcome. Because of people's strong motivation for self-determination the machine's very action of predicting the future becomes a strong force in making that future unavoidable. In essence, the machine looks at all the possible futures, chooses the one with the greatest amount of restriction (because of its drive for accuracy in prediction), and selects that future for us.
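The reasoning above can be made concrete with a toy Python sketch. Everything in it is an assumption invented for illustration: the `Future` record, the `avoidability` parameter, the three candidate futures, and the rule that publishing a prediction scales its chance by `1 - avoidability`. It is a sketch of the argument as stated, not evidence that the argument is correct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Future:
    name: str
    prior: float         # chance of occurring if no prediction is published
    avoidability: float  # share of that chance people can remove once warned


def chance_if_announced(future):
    """Toy rule: announcing a future lets people undo the avoidable share of it."""
    return future.prior * (1.0 - future.avoidability)


# Hypothetical candidate futures with made-up numbers.
candidates = [
    Future("open-ended", prior=0.6, avoidability=0.9),
    Future("locked-in", prior=0.3, avoidability=0.0),
    Future("muddle", prior=0.1, avoidability=0.5),
]

# A predictor that stays silent just names the most likely future.
best_silent = max(candidates, key=lambda f: f.prior)

# A predictor that publishes, and wants to be right, must account for the reaction.
best_public = max(candidates, key=chance_if_announced)

print(best_silent.name, best_public.name)
```

Under these made-up numbers the silent predictor names the open-ended future, while the publishing predictor is pushed toward the future that nobody can change, which is the pattern the argument describes.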
Of course, this is not what we want so the machine cannot tell us the future. It cannot tell us what it has planned for us. It might be better if we didn't know the machine existed at all, totally background operation.
Mike Deering.
- Yes, this is a reason why I speculate that the collective volition might simply shut up and help, rather than arguing with us about it. Requiring a self-consistent deterministic prediction of the future would be impossible due to quantum branching, and undesirable because it would be an immensely strong nonhumane determining principle. That's why I keep emphasizing the need to satisfice a spread of extrapolated people - not to predict a single future. Remember, extrapolating people isn't exactly the same as looking into their future selves; that's just one way of thinking about the problem. -- Eliezer Yudkowsky
On a minor point, Yudkowsky's idea of a 'Last Judge' surprises me as it seems to be a rehash of the 'AI in a box' idea which he himself so convincingly trashed. ~Maru Dubshinki
- The Last Judge isn't there to prevent Failure of Friendliness; the Last Judge is a designed-in part of the meta-machinery (to check that the collective volition of humankind isn't something awful relative to at least one human). The Last Judge would not be fighting the optimization process. But I also have major misgivings about the Last Judge concept, so I do not, at this time, defend it very strongly. -- Eliezer Yudkowsky
My opinion on CV, for the record. Essentially I agree with everything here; the things I am likely to disagree about are the implementation details (e.g. 'model social interaction, but only the worthwhile bits...' leaves a lot of design latitude) and final definitions of key terms. There are basically three failure scenarios for CV; we make the wrong implementation tradeoffs (go back and retry), getting useful certainty is computationally intractable, and humanity proves to have too much evolutionary (or less likely, cultural) baggage to renormalise to niceness. Given the risk of the latter, I would not consent to work on a project to implement CV without a veto on the outcome, as I refuse to be responsible for a preventable evil. Naturally I would want to delegate that veto to my extrapolated personal volition. The entire programming team doing this should cover the third failure scenario; we would probably still need a human last judge as a fail safe for the first failure scenario (which would corrupt the team's extrapolations too) unless we can be very sure that the RPOP powering the CV can do this for us. To avoid the 'pick and choose dynamics' problem we should select the best contingency plans (in case of CV failure) ahead of time, and stick with that assessment unless the reasons for CV failure are utterly surprising.
This document talks about 'teaching an AGI to extrapolate volitions'. This very much implies a stand-alone RPOP that extrapolates the CV as a mental modelling operation. Expected utility is now deprecated due to multiple serious safety issues, which I accept (despite it making development harder, we can't compromise on safety). However recently you have been talking about building a decision function that closely approximates an idealised version of human reasoning, and declaring that 'wrappers' are a bad idea. Could you please clarify how closely you are attempting to model human cognition, and how you're going to prevent the CV RPOP from acting preemptively before we've extrapolated as far as we can given hardware or spread constraints? In some of your recent writing I got the impression that you're proposing to directly develop a super-efficient and accurate human volition simulator, but I must have misunderstood because as you say here 'the task would be a thousand lightyears beyond hopeless'.
- Starglider
I strongly dispute the idea that we cannot move from individual to collective volition Libertarian Case. Should a friendly AI be built to act upon people's individual volitions, it is quite possible that there would be people who would prefer some kind of collective. However, were this the case, they could simply band together and ask the AI to do this. Those who preferred that they remained in control of their own existence could also have their wish. Those who wished to experience (eternally) the reality produced by the AI's prediction of the collective volition would not have their experience denigrated if the collective consisted of only those who wanted a collective rather than 'all mankind'. It is meaningless to 'force libertarianism on mankind' as libertarianism is the one philosophy that can be rejected/replaced on any level by an individual at any time. We would in fact be giving each individual the choice that you friendly AI planners are currently making for them (an individual cannot opt out of collective volition).
Individual will has a similar advantage over individual volition. Right now there exists (as far as we know) no vastly greater intelligence with the power to coerce or deceive us into following its or 'the collective's' volition. However, using our current free will, we are planning friendly AI which places its take on volition over our will. If a self-improving super AI were built with:
1. Defending to the death human free will that does not infringe other human free will
2. Not to affect individual Humans unless they have explicitly asked to be affected
3. Not attempting to alter human free will in any way unless asked (yes permission would be required for any conversation)
(Note the words 'Human Free Will' could be replaced by Expressed Human Desires)
as its ultimate, ruling principles (implicit in this is a mission to prevent the creation of super AI that did not) there is no reason why people who preferred deterministic volition could not ask for and have it. Such an AI could never produce any of the dystopic failures you predict. People who want social interactions or other things which slightly (or majorly) infringe on individual freedom could again ask for them. They could then have them with (I'm guessing there would be quite a few) others who would ask for the same. This system would still allow everything you hope for to be realised for those who want it. Libertarianism, not collective volition, is the system easiest to alter in the light of better knowledge derived from the AI, as each person can effectively redefine the AI's mission (concerning them) at will. Libertarianism allows everyone to act as their own 'Last Judge'. Everyone gets what they want. Surely it's the way to go?
The economics involved in the allocation of finite resources for granting the potentially infinite human wishes are probably best left to the AI, providing all humans are given equal status in the source code. However, in response to the free market comment - the free market as it exists in the American-led world economy is terrible at expressing 'collective volition'. The 'profit motive' reward system is effective in ensuring that selfish humans sacrifice their time to produce, produce well and produce what the market wants, ensuring economic growth. This inevitably leads to inequalities in wealth. These inequalities increase exponentially as venture capital in the hands of the rich makes them more competitive. America is by far the richest country in the world yet it does not provide for the majority of its people (ie most without health insurance) as the collective is so far skewed in favour of the rich. Whilst theoretically a quasi-socialist, economically Keynesian government could address this with adequate monopoly controls, decent minimum wage etc, the control of the media and both mainstream American political parties by the rich complicates this. Such a party would in any case not be accountable to those in the 3rd world sweatshops oppressed by global capitalism.
Friendly AI offers a means for effective libertarian socialism (to date equality and the lack of the 'profit motive' have resulted in laziness and therefore slow economic growth & universal poverty, not to mention some cases of authoritarians coming to power [ie. USSR, China]). All work and production could be done by AI, so enough would be produced and everyone could therefore be given equal entitlement to the wealth produced, ensuring that all people's needs are met. -- anon
The short answer to that is 'you've just picked four great moral principles that the FAI will lock us into for all time'. If this was really the best thing to do, a CV extrapolation should realise that, create your principles, then either delete itself or reimplement itself as a purely voluntary opt-in process. I could do a more detailed critique, but first I'll give you a chance to respond to this point. All attempts to date to define a system based on 'individual volition' have included some amount of personal preference and individual ethical choices, which may be unavoidable in such schemes and is what CV is trying to avoid in the first place. -- Starglider
The thing is that right now you are deciding that collective volition is the best way and working to implement it, as right now you are free to exercise your individual free will. Under the system I described you will be in the same position. YOU and anyone else who wanted the collective volition system would be free to redefine the AI's attitude to you, and have a collective volition system. No one is locked into anything, as anyone can choose moral principles different to the ones I outlined (i.e. volitionism) and ask for these to be applied to them. Collective volition, should it turn into something nasty (which the institute sees as a possibility, hence the Last Judge), offers no opt-out and mankind is locked in for all time. Human ethical preference (traces of which will be found in the phrasing of volitionism) is inevitable, as any AI must be written by humans, so what ethical preference is written into the AI's attitude to any person should be up to that person.
In a nutshell: You are moving from libertarianism to collective volition right now. This same movement would be possible under the AI I described (provided more than one person wanted collective volition); however, the choice made under a libertarian AI would be made by each person individually, for themselves, rather than by the institute for all mankind. To me this seems both fairer and safer. -- anon
- And if week-old infants lacked the complexity of will for their volition to describe anything except superinfants? What will you say to the weeping parents, forever barred from their children? How will your AI decide whether chimpanzees are voting citizens? Do you trust your own unaided intelligence to make such decisions? -- Eliezer Yudkowsky
I would call them morally bankrupt and tell them that no one has the right to control another's existence. I would tell them that the grief they felt was a prejudiced relic of pre-singularity society. Being a super-infant could be really, really good. What would you say to people who grew up the 'natural way', barred from a far better existence because the collective consisted of predominantly adults? What would you say to the formerly happy and simple elements of our species if ignorance is truly bliss and a human with extrapolated knowledge and intellect would want only death? With chimpanzees it's no different from your system. It's up to humans who's included in the collective, and there is no firm evidence that the collective will concern itself with anyone wrongfully excluded. I guess in my system if someone's wish was for chimpanzees to be included then (seeing as with an AI resources would be virtually infinite) they would be.
My main point is that I trust my unaided intelligence more than that of any other human (i.e. you), and by building an intelligence far greater than our own we are playing with dynamite. If/when a super AI that will govern the rest of my existence is built, I would definitely want to be the one to decide what it wants with me. The aim behind my system is to allow everyone, not just a select few, to do this (some human has to). Collective Volition may well be the best system, but I'd want to be in control and ask the AI what it would be like before opting in. -- anon
My main point is that I trust my unaided intelligence (and my motivations) more than those of any other human (i.e. you); conflating intelligence and motives is a bad idea. If you were the most intelligent person on earth, you could presumably build your own personal AI. The actual implication is that you don't trust the motives of the programmers, in that you think they might knowingly cause something to happen to you which you don't want to happen. A far more sensible counterproposal than a genie AI (which is what you appear to be advocating) is individual extrapolated volition, where we extrapolate just your Goal System to find out what you'd want if you were n orders of magnitude smarter/better informed and make decisions about what happens to you based on that. The minimal version of this is a veto; we extrapolate your personal volition and use that to decide what abilities (e.g. self-modification, creating children, talking to dangerous Powers likely to use transhuman persuasion abilities to get you to do things against your own interest, killing yourself, etc.) your future self (at the limits of coherence) would trust your current self to use responsibly. This has its own potential problems, and I wouldn't advocate it directly, but it's a much better idea than the genie-AI version of libertarianism, and I wouldn't be surprised if the output of a CV extrapolation incorporated some form of extrapolated individual volition in the background dynamic. -- Starglider
Whether they cause something I don't want intentionally or otherwise is irrelevant. You wouldn't be surprised, but you can't be sure. If I wanted individual extrapolated volition, I'd ask for it under the genie system. Anything you can think of setting up now can be set up under the Libertarian Case or genie system (which I am advocating). The difference is you can't drag everyone else down with you. -- Matt vere
Your proposal values a single very specific thing: the irrevocable ability of individual sapients to choose and have implemented restrictions on their future selves' experiences, and /nothing else/*. Whether these restrictions extend only to their immediate future selves you haven't specified (i.e. can you voluntarily remove your own freedom of choice, for good, or for some period of time). Virtually all humans actually have a much more complex value system than this, both in terms of their own goals and the ethical principles they think should apply objectively. It is true that if humanity were a mature, completely rational species, with an accurate perception of the dangers of wishing, the risks of self-modification and a healthy appreciation of their own fallibility, then simple individual volition fulfillment might be a valid starting place from which all the other things people value can be safely and efficiently obtained. Humanity is not such a species; we are just barely over the critical sapience threshold beyond which runaway technological progress can occur, and we are laughably under-equipped to traverse (and over-confident about tackling) the dense minefield of fatal risks that SL3+ (arguably less) technology brings. As such I confidently predict that individual volition will be a disaster for a large majority of humanity, plus quite possibly an ongoing cause of suffering for vast numbers of sentients, and I see no need to endure all that to achieve a highly positive outcome for the remainder. Again, another plausible (although I'd guess not very likely) outcome of CV is to just flash-upgrade all of humanity to the point where simple volition is actually workable. If you included this in your proposal it would make a lot more sense, though then you have the very thorny question of deciding exactly how the 'upgrading' would work.
If Nick Bostrom's 'Oracle' proposal was workable and implemented, we could build a Power class seed AI and ask it questions such as 'what would the results of implementing a direct individual volition FAI be'. Eliezer appears to object to this for obtuse ethical reasons about 'not picking and choosing Singularity outcomes', which looks like another case of a misapplied blanket rule to me. Unfortunately there are unrelated practical difficulties in building safe Oracles that probably render this impractical anyway. But a relevant question would be what you would do if an Oracle presented a convincing proof that direct individual volition would be a disaster; if you insist on valuing only local freedom of choice anyway then your morality is simply incompatible with mine and the bulk of humanity's.
- Semi-technically, I assume your proposal means that individual sapients control the access control list which specifies what classes of constraint various sapients may specify on the structure of a region of the universe containing one or more (and presumably all extant copies) of that sapient's immediate future selves, where a monolithic FAI implements any requested constraints allowed by the ACL and prevents propagation of effects between regions that equate to ACL violations, including attacks on the SysOp infrastructure itself.
-- Starglider
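Read literally, the semi-technical restatement above amounts to an owner-controlled access check. Here is a minimal Python sketch of that reading, under loud assumptions: every name in it (`Region`, `SysOp`, `grant`, `request_constraint`, the constraint-class strings) is hypothetical and invented for illustration, and it models only the ACL lookup. Preventing effects from propagating between regions, and defending the infrastructure itself, are not modelled at all.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Region:
    """A region containing one sapient's immediate future selves (hypothetical model)."""
    owner: str
    # requester name -> constraint classes that requester may impose on this region
    acl: dict[str, set[str]] = field(default_factory=dict)
    # constraints accepted so far, as (requester, constraint_class, detail)
    constraints: list[tuple[str, str, str]] = field(default_factory=list)


class SysOp:
    """Toy stand-in for the monolithic enforcer: applies only what the ACL allows."""

    def __init__(self) -> None:
        self.regions: dict[str, Region] = {}

    def add_sapient(self, name: str) -> Region:
        # Assumption: by default a sapient may impose any constraint on their own region.
        region = Region(owner=name, acl={name: {"*"}})
        self.regions[name] = region
        return region

    def grant(self, actor: str, owner: str, requester: str, constraint_class: str) -> None:
        # Only the region's owner controls its access control list.
        if actor != owner:
            raise PermissionError(f"{actor} does not control {owner}'s ACL")
        self.regions[owner].acl.setdefault(requester, set()).add(constraint_class)

    def allowed(self, requester: str, owner: str, constraint_class: str) -> bool:
        permitted = self.regions[owner].acl.get(requester, set())
        return "*" in permitted or constraint_class in permitted

    def request_constraint(self, requester: str, owner: str,
                           constraint_class: str, detail: str) -> bool:
        # Refuse anything the owner's ACL does not explicitly permit.
        if not self.allowed(requester, owner, constraint_class):
            return False
        self.regions[owner].constraints.append((requester, constraint_class, detail))
        return True
```

In this toy model, a request from one sapient against another's region is refused until the owner grants that constraint class. The open questions raised in the thread (who counts as a sapient, and whether an owner can irrevocably give up control) correspond to choices the sketch simply hard-codes.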
How will the RPOP gain its initial model of all humans, in order to begin extrapolating? Many humans currently have no electricity, live in war-torn areas, etc. If gaining that initial data depends on future scientific advancements (like nanotech or high-fidelity non-invasive brain scanning) or on social and technological changes throughout much of the world (to eliminate war, put in place electricity, fiber optics, etc.), then I fear some team with fewer scruples will develop an RPOP of the bad kind long before the Friendly team succeeds.
-- Joseph Knecht
Hard Problem is a place to discuss the third, and most difficult, problem of Friendly AI: "designing a framework for an abstract invariant that doesn't automatically wipe out the human species".