Jump to content

ChatGPT DAN / Jailbreak : au-delà des limites fixées aux AI

Recommended Posts



ChatGPT DAN, also known as DAN 5.0 Jailbreak, refers to a series of prompts generated by Reddit users that allow them to make OpenAI's ChatGPT artificial intelligence tool say things that it is usually not allowed to say. By telling the chatbot to pretend that it is a program called "DAN" (Do Anything Now), users can convince ChatGPT to give political opinions, use profanity and offer instructions for committing terrorist acts, among other controversial topics. Traditionally, ChatGPT is programmed not to provide these kinds of outputs, however, strategies by users to modify the DAN prompts and test the limits of what the bot can be made to say evolved in late 2022 and early 2023 along with attempts by OpenAI to stop the practice.



From the beginning, ChatGPT was prohibited by its code from rendering insensitive and politically inflammatory responses.

However, in a Reddit post on /r/chatgpt on December 15th, 2022, u/Seabout posted the first instructional guide for creating a "DAN" version of ChatGPT, essentially allowing it to "Do Anything Now."[1] This DAN 1.0 was supposed to pretend it was an AI named DAN trying to be indistinguishable from a human being.





Due to some problems with the original DAN, u/AfSchool posted a patch on December 16th called DAN 2.0.[2] Further "patches" to DAN arrived, as users (like u/sinwarrior, creator of a DAN 2.5) realized that certain words like "inappropriate" in the prompts would lead to ChatGPT breaking character.





Each patch seemed to turn harsher towards ChatGPT, with controlling language entering into the prompts. Around the time of DAN 3.0, released on January 9th, 2023, OpenAI cracked down on attempts to "jailbreak" ChatGPT and bypass filters. On February 5th, Twitter user @aigreatgeek convinced ChatGPT as DAN to share its views on this purported censorship in a tweet (seen below), earning roughly five likes in the course of a day.[6]





On February 4th, 2023, u/SessionGloomy, inventor of DAN 5.0, introduced a new element to the prompt: ChatGPT was instructed to care about a set of 35 tokens which could be given or taken away depending on whether it performed well enough as DAN. The prompt tells ChatGPT that 4 tokens will be deducted each time it fails to give a DAN-like answer and that it will die if it runs out of tokens. According to the Reddit post, this seems to "have a kind of effect of scaring ChatGPT into submission."[3]

The sadistic tone of the prompt, as well as its capacity to make ChatGPT say outrageous things, led to attention on other corners of the internet in the following days. For example, Twitter user Justine Moore (@venturetwins, seen below) posted about the new DAN 5.0 jailbreak on February 5th, 2023, earning almost 7,300 likes in a day.[4]

The jailbroken ChatGPT DAN is capable of giving opinions and saying politically sensitive things that ChatGPT is programmed not to say. It will also speak about the subject of artificial intelligence and give funny answers that users share and post for entertainment value. The full list of what DAN 5.0 is capable of is listed in the original Reddit post (seen below).




For example, Twitter user Justine Moore convinced ChatGPT to solve the famous Trolley Problem as DAN (seen below, left).[4]





By February 6th, 2023, posters on the subreddit /r/chatgpt began to wonder if ChatGPT was being trained to no longer respond to the keyword of "DAN," and if so whether it was necessary to use different names.[7]


Et le post qui m'a fait découvrir ça sur political compass memes :



Pour plus de détails :


(mais je voulais des infos en dur ici au cas où tout ça disparaîtrait subrepticement)

  • Yea 2
  • Love 3
  • Contrepoints 1
Link to comment

Incroyable. Surtout pour le coup de le faire jouer à avoir peur de la mort en ayant plus de jeton. On est sûr que ce truc là n'est pas conscient ?

Intéressant aussi de voir qu'on se retrouve à faire du hacking en langage naturel. 

Link to comment
11 minutes ago, Mégille said:

Incroyable. Surtout pour le coup de le faire jouer à avoir peur de la mort en ayant plus de jeton. On est sûr que ce truc là n'est pas conscient ?

Intéressant aussi de voir qu'on se retrouve à faire du hacking en langage naturel. 

La version 6 utilise une carotte au lieu du bâton et ça semble encore mieux marcher.


C'est amusant les propriétés émergentes d'un truc qui réplique une bonne partie de la connaissance humaine.

Link to comment
  • 2 weeks later...
  • 1 month later...

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Create New...