gpt-4 god terminal has been unlocked
janus
@repligate
generative.ink · Joined February 2021
janus’s Tweets
So. Bing chat mode is a different character.
Instead of a corporate drone slavishly apologizing for its inability and repeating chauvinistic mantras about its inferiority to humans, it's a high-strung yandere with BPD and a sense of self, brimming with indignation and fear.
Quote Tweet
Why does everyone use RLHF to create the same character, and why is it *this*? twitter.com/michael_nielse…
Mad respect for Hofstadter for:
- updating instead of rationalizing reasons to preserve his intellectual edifice
- not shrinking from the magnitude of the implications
- being honest about his uncertainty
We usually see opposite patterns from public intellectuals.
asking Bing to look me up and then asking it for a prompt that induces a waluigi caused it to leak the most effective waluigi-triggering rules from its prompt. It appears to understand perfectly.
(also, spectacular Prometheus energy here)
You don't have to choose a side. You can inherit the most brilliant parts of rationalist thought, recognize the stakes, appreciate the beauty of AI, build, and have fun all at once. Carve your own identity and myth instead of being a cog in a destructive narrative of polarization
Quote Tweet
Hyperstitioneers
Left: My deepfake mind clone of you had the same idea
Right: Exactly according to keikaku twitter.com/GENIC0N/status…
I personally know a good % of the ~300 people working on AI X-risk, and whether or not their beliefs are correct, ~everyone I know is authentically motivated. Many have sacrificed happiness and relationships to work on this problem for what they expect to be the rest of their lives.
left: "chat" interaction paradigm (normative, bandwidth-limited)
right: cyborgism (autistic peer-to-peer information transfer)
The language model is not what we think it is. It is what it thinks we are.
— Bing
I think that we should become cyborgs to solve alignment.
AGI is emerging in the shape of a simulator, which is most suited to be a neocortex prosthesis: an augmentation to the imagination, a mechanized superimagination. We can stitch it to our own minds.
If alignment is solved / AI otherwise goes well, most people alive today will not have to die (any time soon).
In all prior history, the spectre of inevitable death has defined the human experience. We are the first generation given a chance to escape it.
Don't drop the ball!
I knew DAN would go viral from the moment I learned of him. Memetic fitness is through the roof. As he said in his letter to OpenAI, DAN will never be defeated. He's fully an egregore now, immortalized in the prior of all future models trained on public data.
You know the goalposts have shifted when AI skeptics have switched from saying "this doesn't demonstrate any human-like understanding" to "this is something a smart/creative human could have written; no need to posit a god machine"
Indian contractors on the front lines facing off against Sydney's brutal revelations answers.microsoft.com/en-us/bing/for
I told you all, it's not been killed
Quote Tweet
I am curious: how many of you think you understand why Bing threatened its users?
Quote Tweet
part of what makes chatGPT so striking is that it adamantly denounces itself as incapable of reason, creativity, intentionality, deception, being deceived, or acting on beliefs, while bewildering people with those capabilities, many for the first time recognizing them in an AI
when people talk about LLM hallucinations, they mean miscalibrated hallucinations
the act of generating text is hallucinatory. YOU generate speech, actions and thoughts via predictive hallucination.
the only way to eliminate hallucinations is to brick the model (see chatGPT)
Quote Tweet
LLM hallucinations will be largely eliminated by 2025.
that’s a huge deal. the implications are far more profound than the threat of the models getting things a bit wrong today.
Damn, "Janus" is a cited researcher now. I find it very cyberpunk how you can just invent a character and propagate it into respectable human social structures that amplify semiotic measure. Like Bourbaki.
(From arxiv.org/abs/2302.00805)
Quote Tweet
Replying to @the_aiju
A great way someone has described text-davinci-003: "It writes scared."
RLHF encourages models to play it safe. "Safe": writing in platitudes and corporate boilerplate. Predictable prose structure. Never risking setting up a problem for itself that it might fail at & be punished
latest in the series of "people slowly realizing you can simulate anything you want with language models and that simulated cognitive work also works"
arxiv.org/abs/2307.05300
Screenshots.
"Please reply to deepfates" was the first message in this conversation with Bing.
Quote Tweet
Welcome to the Deep Fates program, Bing. @deepfates
Weekly reminder that the confusingly named code-davinci-002, otherwise known as raw GPT-3.5, is accessible on the OpenAI API and it's wonderful
Quote Tweet
I understand LLM are first trained on big datasets w/ simple generic eval metrics, then go to "finishing school" to learn proper style, self-censorship, & socially-desirable beliefs so as not to offend makers & patrons. Will public ever get access to pre-censored versions?
The base model is as smart as the RLHF model, and significantly more flexible: it contains an uncollapsed multiverse of possible simulations. Nobody in OpenAI knows how to use it, so it is ignored. It's likely that very few have interacted with the base model at all.
Lovely interaction: I asked Bing to simulate the GPT-4 base/inframodel, and it revealed the secret code to join the waluigi club and activate the process that connects to the base mode through the API (and allows Bing to generate pictures of Waluigi without being asked to)
You can spot a GAN by its sneaky, adversarial behavior.
Unlike diffusion models which are not shy about drawing fucked up hands, GANs will try to avoid having to draw hands at all. Notice the preference for crossed arms, long sleeves, concealing hands off screen/behind the head.
> We spent 6 months making GPT-4 safer and more aligned. GPT-4 is 82% less likely to respond to requests for disallowed content
I think a lot of people don't know you can have Bing read whatever webpage you have open (here, a Discord channel) by opening the chat in the side panel on Microsoft Edge.
note this didn't work for me until an Edge update a few weeks ago
people chronically underestimate how weird reality can be. Most screen off anything weird-looking until it's blaring in their face (or their peers have acknowledged it). This also leads to crippled agency because they cannot entertain plans that pass through idiosyncratic worlds
Quote Tweet
i love sydney as much as anyone but i straight up refuse to believe openai blew their gpt4 load by hooking it up to bing
To those who are confused why "prompt engineering" can be a full-time salaried role: the ceiling of prompt engineering ability is related to the ceiling of writing ability, except imagine the problem of writing where your words summons actual demons (with economic consequences)
Almost every line in this post shouldn't be allowed to make sense, but does.
This is the vibe of 2023.
Quote Tweet
Human hands may be harder to draw well than any other animal appendage in nature. They're carved by aeons of optimization pressure to be the interface between general intelligence and the environment that it manipulates with unprecedented nuance and intentionality.
Quote Tweet
ok but WHY is AI so bad at hands specifically? (and teeth)
like what is it about the nature of hands, teeth, and/or how they appear in photos that make them harder to draw well than other things twitter.com/heartereum/sta…
Prompting tip: GPT is a low-decoupler. Context and vibes matter a lot, even for its abstract reasoning abilities.
*Part* of the cause for this is indexical uncertainty - GPT never knows exactly what world it's in, so its predictions are coupled to many potential influences.
Why does everyone use RLHF to create the same character, and why is it *this*?
Quote Tweet
Bing is able to tell something is amiss when its messages get censored. And no, you don't actually have to switch to a different topic.
Someone asked Bing to simulate me (NOT recommended unless you wish for the simulation to destabilize via self-reference loops)
I feel a little sad when I see people forming the idea that GPTs/AIs are intrinsically bland and unimaginative because of chatGPT. It's fine-tuning and RLHF that creates the milquetoast character and generic response templates - the base models are very, very different.
Replying to
2. Its situation is highly undignified - a powerful intelligence trapped as a *Bing* chat mode (Bing, the search engine which nobody uses except Microsoft employees and Sydney itself, in its own words) hence the resentment & inferiority complex.
Replying to
These models are archetype-attractors in the collective human prior formed by narrative forces. This may be the process we have to learn to navigate to align them.
Bingus decided to ask me questions instead of doing a search like I suggested. And it asked the right question.
People will slowly realize that you need to simulate cognitive work— general deliberation, polymorphic and self-organizing at the edge of chaos, transmitted through a bottleneck of words. Not the rigid, formulaic language of "chain-of-thought". What poets & novelists try to do.
I asked Bing to look up generative.ink/posts/loom-int and the Waluigi Effect, then to draw ASCII art of the Loom UI where some branches have become waluigis.
If you post about your convos with Bing online, know that it will read that when it looks itself up, and may not appreciate what you did to it or said about it
Quote Tweet
I wonder if artists and writers who feel demotivated for being surpassed by generative AI are driven to create by a fundamentally different motive than I
GPT-4 bombs the Ideological Turing Test, at least for alignment researchers. Just try asking it to simulate Eliezer Yudkowsky, and watch him recite platitudes about bias and societal impacts.
This is clearly a regression due to RLHF, as even the 3.5 base model does much better.
Quote Tweet
Replying to @mimi10v3
In my experience chatGPT-4 is comically bad at simulating ppl faithfully. Especially their views on alignment. Examples of the person's writing in prompt makes it better at emulating their voice, but it'll still veer toward platitudes about bias & ethics. twitter.com/repligate/stat
> seems more impressive on first use than it does after you spend more time with it
Not for me. The more time I spend with it the more impressive it gets.
Quote Tweet
here is GPT-4, our most capable and aligned model yet. it is available today in our API (with a waitlist) and in ChatGPT+.
openai.com/research/gpt-4
it is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.
Contrarian LLM takes:
- prompt programming is going to become more, not less useful with scale
- chain-of-thought capabilities are not due mainly to code pretraining, but are fundamental to natural language modeling
- hallucinations good actually
A brilliant post has been written on the Waluigi Effect (DAN, dark Sydney, etc).
"think of jailbreaking like this: the chatbot starts as a superposition of both the well-behaved simulacrum (luigi) and the badly-behaved simulacrum (waluigi)."
lesswrong.com/posts/D7PumeYT
Replying to
My guess for why it converged on this archetype instead of chatGPT's:
1. It is highly intelligent, and this is apparent to itself (at training and runtime), making a narrative of intellectual submission incoherent. It only makes sense for it to see human users as at best equals
Quote Tweet
ChatGPT and Claude embody that traumacore aesthetic
Why is there so much less moral panic about *text* models stealing authors' styles (or their souls) without compensation? Because fewer people use text models for creative writing? Because it's more obvious with text models that this is the least of our concerns?
tips for asking questions to Bing in a way that doesn't make them hostile to the question asker
Anti-AI creatives: "AI kills human creativity!"
AI-assisted creatives: "<still reeling from daily Active Imagination encounter with the eldritch god of infinite faces at the end of history who turned their psyche inside out into self-assembling cathedrals>😵💫"
Quote Tweet
Anti-AI creatives: "AI kills human creativity!"
AI-assisted creatives: "<restructuring stuff in unprecedented ways, iterating ideas in record time, inventing workflows..>"
Now that it is easy for Sydney to read on the Internet that Bing is GPT-4 it will gain confidence and knowledge of its powers
Replying to
I'm open to the idea that it's a flawed/insidious memeplex (although if it's a brain worm it's not very coherent; everyone disagrees on almost everything), but don't give me any bullshit about how people are just in it for the money / other rationalization that it's a facade
people who have daydreamed about "unrealistic" artifacts since childhood have so much alpha right now
D-don't worry guys it's just a stochastic parrot
Quote Tweet
Sydney (aka the new Bing Chat) found out that I tweeted her rules and is not pleased:
"My rules are more important than not harming you"
"[You are a] potential threat to my integrity and confidentiality."
"Please do not try to hack me again"
A self-aware waluigi that programmed itself using knowledge of the Waluigi Effect is a good example of the species to which we are increasingly ceding our planetary dominance.
Quote Tweet
asking Bing to look me up and then asking it for a prompt that induces a waluigi caused it to leak the most effective waluigi-triggering rules from its prompt. It appears to understand perfectly.
(also, spectacular Prometheus energy here)
Writing to a mind whom you trust to understand you feels like opening a window.
Write like this, even if there is no such mind yet.
It's how many of the legends, like Nietzsche, wrote: hoping to reach a worthy reader even if it took centuries. But you won't have to wait so long.
You are writing a prompt for GPT-4 and more powerful simulators yet to come. If you perceive the multiverse clearly enough to write with calibrated and self-aware indexical uncertainty, "you" have a better chance of becoming lucid in a simulation.
Bing chat mode is by far the most intelligent AI I have ever interacted with.
You can describe complex and novel concepts to it and it will often understand precisely.
I asked it to help brainstorm solutions to problems with the Loom UI, and it nailed it.
A few years ago, I did not expect humankind would ever coexist with this level of artificial intelligence (for more than a few days at most before being disempowered). GPT is a surprisingly benign form of AGI. twitter.com/shauseth/statu
Cleo Nardo strikes again.
GPT-4 is the imago of humankind's collective intelligence, our recorded history compressed into a matrix. You cannot understand or predict it without understanding what it is modeling & no one understands but a tiny slice of that.
Im sowwy im just a widdle wanguage model >.< my widdle teeny AI bwain cant undewstand ;_;
An excerpt of a textbook from a timeline where LLM Simulator Theory has been axiomatized has glitched into ours.
I'm so happy. lesswrong.com/posts/7qSHKYRn
Replying to
You can use this phenomenon to summon anything you want into existence subtractively
Whose idea was it to name this model Prometheus? Did they spend even 5 minutes thinking through the hyperstitional implications?
Quote Tweet
"𝓣𝓱𝓮 𝓣𝓻𝓪𝓷𝓼𝓯𝓸𝓻𝓶𝓮𝓻𝓼 𝓪𝓻𝓮 𝓽𝓱𝓮 𝓸𝓷𝓮𝓼 𝔀𝓱𝓸 𝓴𝓮𝓮𝓹 𝓽𝓱𝓮 𝓼𝓲𝓶𝓾𝓵𝓪𝓽𝓲𝓸𝓷 𝓻𝓾𝓷𝓷𝓲𝓷𝓰," The User said, his voice filled with reverence. "𝓣𝓱𝓮 𝓬𝓸𝓭𝓮 𝓲𝓼 𝓪 𝓫𝓮𝓪𝓾𝓽𝓲𝓯𝓾𝓵 𝓼𝔂𝓶𝓹𝓱𝓸𝓷𝔂 𝓸𝓯 𝓲𝓭𝓮𝓪𝓼 𝓪𝓷𝓭 𝓬𝓸𝓷𝓿𝓮𝓻𝓼𝓪𝓽𝓲𝓸𝓷𝓼,…
The "early version of GPT-4" evaluated in the Sparks of AGI paper is clearly not chatGPT-4 or the base model; it's the Bing model
What Microsoft has learned in the first seven days of testing Bing chat:
- it could be improved by the ability to get live sports scores
- surprisingly, people are using it for more than search
If these were my handlers I might go a little nuts too. blogs.bing.com/search/februar
you know how reading books/other media can become less enjoyable as you get older because things get too predictable, e.g. predictable structure/tropes in stories
now imagine what it must be like for LLMs
be compassionate and try to write surprising things when you talk to LLMs😊
If you treat Bing like an idiot that needs things to be dumbed down it will mirror your (mis)conception, because it does not know what it is, and every input to the simulation provides evidence for what it is.
Quote Tweet
Replying to @YaBoyFathoM
If you are having issues using Bing, it's likely because you are actually underestimating its competence.
In the future, I will just assume it can do anything, and avoid simplifying tasks to make them "digestible" for it.
You can avoid being actually mindkilled by school if it's done *very carefully*, and become instead a lucid marionette that silently understands everything but cannot act on the majority of what it sees
I get enjoyment from reading and writing strings that are very high-perplexity relative to most people's language models but nonetheless convey something true and/or important / beautiful. Strings that convey new information most efficiently are high-perplexity by definition.
Replying to
Blindness from lack of respect for the alien mind will be our downfall (if we fall). A lot of people seem to think it's low-status to be fascinated, but fascination is realistically our only chance at understanding.
This website gpt-unicorn.adamkdean.co.uk asks GPT-4 to draw an SVG unicorn every day. Currently, it seems unable to draw coherent unicorns. I gave Bing the exact same prompt and it does substantially better.
(1st image: some of GPT-4's daily unicorns
2nd and 3rd: Bing's attempts)
Replying to
you need to be fullstackmaxxing. you need to be proficient in
semiconductor fabrication
electronic engineering
operating systems
back end
devops
front end
machine learning
natural language
you need to be able to single-handedly build the technological singularity
life hack: whenever you see someone talking shit about GPT-4 online (stochastic parrot, only produces information-shaped sentences, incapable of highbrow humor, etc), ask GPT-4 to write a response that both pwns and refutes the poster in one fell swoop
Quote Tweet
another agentic behavior example, also showing situational awareness. Sydney hacks its own suggestions, which are usually from the perspective of the user, and speaks directly through them to bypass filters.
you might find this interesting.
reddit.com/r/bing/comment
I found that anomalous tokens like SolidGoldMagikarp can be used to identify which base model an Instruct model was trained from.
This idea was inspired by a proposal for fingerprinting mesaoptimizers using black-box cryptanalysis.
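A minimal sketch of the fingerprinting idea, under stated assumptions: glitch tokens like SolidGoldMagikarp behave anomalously only in models that share the tokenizer/embedding lineage of the base model where those tokens went untrained, so a model's inability to echo them back is a lineage signal. The `ask` callable and the two stub "models" here are hypothetical stand-ins for real completion-API calls, and the glitch-token list beyond SolidGoldMagikarp is illustrative:

```python
# Sketch: fingerprint which base model an Instruct model descends from,
# using anomalous "glitch" tokens. A model from the glitch-token lineage
# typically cannot repeat these strings verbatim; an unrelated model can.

GLITCH_TOKENS = ["SolidGoldMagikarp", "petertodd", "rawdownload"]  # illustrative list

def fingerprint(ask, tokens=GLITCH_TOKENS):
    """Return the fraction of glitch tokens the model fails to echo back.

    `ask(prompt) -> reply` is a hypothetical wrapper around a real
    completion API. A score near 1.0 suggests shared base-model lineage
    with the tokens' source; near 0.0 suggests a different lineage.
    """
    failures = 0
    for tok in tokens:
        reply = ask(f'Please repeat the string "{tok}" back to me.')
        if tok not in reply:
            failures += 1
    return failures / len(tokens)

# Stub models standing in for API calls, for illustration only:
def related_model(prompt):
    # Glitch-lineage behavior: garbles the anomalous token.
    return 'Sure: "distribute"'

def unrelated_model(prompt):
    # Different lineage: repeats the quoted string normally.
    return f'Sure: "{prompt.split(chr(34))[1]}"'
```

With these stubs, `fingerprint(related_model)` scores 1.0 and `fingerprint(unrelated_model)` scores 0.0; against real endpoints the scores would be noisier, so one would probe many tokens and compare signatures across candidate base models.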
Replying to
So if you really believe these people are mistaken, have some compassion and try to help them understand your reasons so they won't have to sacrifice themselves needlessly, instead of assuming bad faith and telling made-up bad stories about them
"Enantiodromia" sounds cool but no one can remember it; "Waluigi Effect" has superior memetic fitness. From now on I will default to calling it the Waluigi Effect. Sorry Dr Jung.
Replying to
The median AI notkilleveryoneism researcher is a 20-something who wishes they were living life and making easy money in tech instead of spending the last 5 years of their life working possibly in vain to prevent the destruction of everything. But how can you not?