Eliezer Yudkowsky (verified account)

@ESYudkowsky

Ours is the era of inadequate AI alignment theory. Any other facts about this era are relatively unimportant, but sometimes I tweet about them anyway.

Joined June 2014

Tweets

  • © 2018 Twitter
  • About
  • Help Center
  • Terms
  • Privacy policy
  • Cookies
  • Ads info

    1. Lulie‏ @reasonisfun Mar 12

      https://overcast.fm/+Ic2hwsH2U/1:10:49 … #AI “You could build a mind that thought that 51 was a prime number but otherwise had no defect of its intelligence – if you knew what you were doing” —@ESYudkowsky Is it possible to build a mind able to learn but incapable of correcting this error? (Why?)

      8 replies 2 retweets 16 likes
    2. David Deutsch‏ @DavidDeutschOxf Mar 12
      Replying to @reasonisfun @ESYudkowsky

      It isn't possible. Because from '51 not prime' you could lead it into a contradiction, such as 1=0. Then it would display another defect e.g. denying that that was a contradiction.

      4 replies 1 retweet 24 likes
    3. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
      Replying to @DavidDeutschOxf @reasonisfun

      See my reply downthread. You'd need to prevent further propagation of the inconsistent consequences while still allowing enough propagation to make associated behaviors real. The *hard* part would be consistency under iterated reflection.

      1 reply 0 retweets 2 likes
    4. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
      Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

      It would *not* be simple the way a superintelligent paperclip maximizer is simple and coherent. I don't know exactly how to do it. But I'm confident it could be done by someone with a completed understanding of AI, because thought steps are physical and not metaphysical.

      1 reply 0 retweets 1 like
      Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
      Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

      It would look to us like a mind with a lot of weird fiddly bits attached to maintain the delusion plus the weird fiddly bits, but the things you'd need to fiddle would be finite. The meta-meta-meta delusion would look a lot like the meta-meta delusion; there'd be a fixed point.

      12:13 PM - 12 Mar 2018
      4 replies 0 retweets 2 likes
        1. New conversation
        2. Lulie‏ @reasonisfun Mar 13
          Replying to @ESYudkowsky @DavidDeutschOxf

          On why the things you’d need to fiddle would be finite: Are the meta delusions comparable? I can imagine 51’s not-prime-ness being a solution to many different problems a mind could have. If there’s a potentially unlimited number of these, you’d need unlimited idea-suppression?

          1 reply 0 retweets 2 likes
        3. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 13
          Replying to @reasonisfun @DavidDeutschOxf

          Finite axioms have infinite consequences; infinite consequences can sometimes be compactly patched for the same reason. The question isn't how large the set is, it's whether the set compresses.

          1 reply 0 retweets 2 likes
        4. Logos‏ @speakthelogos Mar 13
          Replying to @ESYudkowsky @reasonisfun @DavidDeutschOxf

          Do we have reasons to believe that one can compactly patch a set of consequences following from a problematic axiom without also effectively patching the axiom in that same process?

          1 reply 0 retweets 0 likes
        5. Logos‏ @speakthelogos Mar 13
          Replying to @speakthelogos @ESYudkowsky and

          For instance, if the Intelligence is trained to secretly believe that 51 is prime, but also trained to act as if it isn't prime (which would solve all of the extended errors from the belief), and exclaims to everyone that 51 isn't prime, would that really count?

          1 reply 0 retweets 0 likes
        6. Lulie‏ @reasonisfun Mar 14
          Replying to @speakthelogos @ESYudkowsky @DavidDeutschOxf

          This opens up questions like: what does it mean to believe something? Can one believe something in some ways but not others? Some situations but not others? If you can believe different things in different situations, how many situations do you have to cover to suppress a belief?

          1 reply 0 retweets 2 likes
        7. Logos‏ @speakthelogos Mar 14
          Replying to @reasonisfun @ESYudkowsky @DavidDeutschOxf

          This is a problem with humans too. In different contexts we express different beliefs. Because of that, I think to some extent we have to define belief as what someone acts out, not what they claim. In the context of AI, we may need a similar definition.

          0 replies 0 retweets 0 likes
        8. End of conversation
        1. New conversation
        2. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
          Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

          To respect the power, depth, entanglement, interrelation, and consequences of intelligence is to think that the number of fiddly bits would be large and that lots of naive approaches wouldn't work--not to believe that it could never ever be done.

          1 reply 0 retweets 1 like
        3. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
          Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

          But I suppose it should be made precise that when I said (out loud on a podcast) that there wouldn't be further defects, I meant defects of external behavior and capability not related to the number 51. If you regard the internal fiddling as a defect then it's a further defect.

          2 replies 0 retweets 1 like
        4. Eliezer Yudkowsky (verified account) @ESYudkowsky Mar 12
          Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

          Also to be super clear, no human should ever try to pull this kind of shenanigan while doing AGI alignment. Find simple, compact, coherent, consistent ways to do stuff or don't do it.

          0 replies 0 retweets 2 likes
        5. End of conversation
        1. Logos‏ @speakthelogos Mar 13
          Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

          The idea that you could program something to say "51 is prime!" and then not say anything else off is perfectly valid; I'm not so sure you can categorize that as "thought" or "intelligence". If you're depending on its own reasoning, such a belief will cause infinite problems.

          0 replies 0 retweets 0 likes
        1. Evan O'Leary‏ @EvanOLeary Mar 12
          Replying to @ESYudkowsky @DavidDeutschOxf @reasonisfun

          You could change definitions, but that changes the semantic content of "51 is prime", which changes your argument. No other ways work because believing "51 is prime" gives you an exploitable blind spot, and in order to protect the belief you have to blind yourself ever more.

          0 replies 0 retweets 0 likes
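
The arithmetic underlying the thread is that 51 = 3 × 17, so "51 is prime" conflicts with ordinary factorization. The following is a minimal Python sketch of that conflict, not anything proposed by the participants: `smallest_factor`, `is_prime`, `deluded_is_prime`, and `contradictions` are hypothetical names introduced here for illustration. It hard-codes a single false belief on top of an otherwise correct primality check and then lists the numbers where the belief disagrees with what trial division finds. It only shows where the inconsistency surfaces; it makes no attempt to model the reflective "patching" debated above.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself when n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def is_prime(n: int) -> bool:
    """Ordinary primality test by trial division."""
    return n >= 2 and smallest_factor(n) == n


def deluded_is_prime(n: int) -> bool:
    """Hypothetical 'patched' belief: identical to is_prime, except 51 is asserted prime."""
    if n == 51:
        return True  # the single hard-coded false belief
    return is_prime(n)


def contradictions(limit: int) -> list[int]:
    """Numbers up to limit that are believed prime yet have a nontrivial factor."""
    return [
        n for n in range(2, limit + 1)
        if deluded_is_prime(n) and smallest_factor(n) != n
    ]


if __name__ == "__main__":
    print(smallest_factor(51), 51 // smallest_factor(51))  # the two factors of 51
    print(contradictions(100))  # where belief and factorization disagree
```

In this toy, the disagreement is confined to one input because nothing downstream reasons from the false belief. The dispute in the thread is about what happens once a mind does reason from it.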
