In this tutorial, you will learn how to run ChatGPT in R by calling the OpenAI API. OpenAI provides official documentation for Python but not for R, so this tutorial fills that gap with the information R users need.
What is ChatGPT?
Most of us are already familiar with ChatGPT, so it may not need much introduction. ChatGPT is a smart chatbot with knowledge in almost every field. It understands your query like a human and responds accordingly.
- What is ChatGPT?
- Terminologies related to ChatGPT
- Steps to run ChatGPT in R
- R Function for ChatGPT
- How to Customize ChatGPT in R
- R Function to Make ChatGPT Remember Prior Conversations
- How to Input Images
- R function to generate image
- How to validate API Key
- RStudio Add-in for ChatGPT
- Shiny App for ChatGPT
- ChatGPT prompts for R
Terminologies related to ChatGPT
It is important to understand some terminologies related to ChatGPT because they determine how much you pay and how you use ChatGPT.
In simple words, a prompt is a question or search query you want to ask ChatGPT. Think of it like this: you have a very smart machine that can answer anything. You can ask it to write an essay, programming code, or anything else you can think of. But the machine requires specific instructions on what exactly you want it to do. Hence, a prompt should be clear and specific about the response you wish to receive.
Tokens are words or subwords. See some examples below:
- lower splits into two tokens: "low" and "er"
- smartest splits into two tokens: "smart" and "est"
- unhappy splits into two tokens: "un" and "happy"
- transformer splits into three tokens: "trans", "form", "er"
- bear is a single token
As you may have noticed, words are split into tokens because they can have different suffixes and prefixes. "Low" can become "lower" or "lowest", so it is important for the model to understand that these words are related.
These tokens determine your usage and billing. The OpenAI team says you can roughly estimate one token as about four characters of English text, but in practice this varies a lot.
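Since exact token counts depend on the model's tokenizer, the four-characters-per-token rule only gives a rough estimate. Here is a minimal sketch of that rule in R (the helper name `estimate_tokens` is our own, not part of any package):

```r
# Rough token estimate using OpenAI's "~4 characters per token" rule of thumb.
# This is only an approximation; the actual tokenizer can differ considerably.
estimate_tokens <- function(text) {
  ceiling(nchar(text) / 4)
}

estimate_tokens("R code to remove duplicates using dplyr.")
```

For accurate counts and billing, rely on the `usage` field returned by the API rather than this approximation.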
In the previous section you learned what a token means. Now it is essential to know the different types of tokens in the world of ChatGPT.
- Prompt Tokens: Number of tokens used in your prompt (question)
- Completion Tokens: Number of tokens used in writing response (answer/output)
Total Tokens = Prompt Tokens + Completion Tokens
Check out this link to understand the pricing structure for using the API.
Steps to run ChatGPT in R
You can sign up for an account on OpenAI's platform by visiting platform.openai.com. Once you’re there, you can create an account using your Google or Microsoft email address. After creating your account, the most important step is to get a secret API key to access the API. Once you have your API key, store it for future reference.
sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Before we can start using ChatGPT in R, we need to install the necessary libraries. The two libraries we will be using are httr and jsonlite. The httr library allows us to post our question to the OpenAI API and fetch the response, while the jsonlite library helps convert R objects to JSON format.
install.packages("httr")
install.packages("jsonlite")
The code below takes two inputs: apiKey and prompt. The first refers to the OpenAI API key you generated in the previous step. The second refers to the question you want to ask ChatGPT.
library(httr)
library(jsonlite)

apiKey <- "sk-xxxxxxxxxxxxxxxx"
prompt <- "R code to remove duplicates using dplyr. Do not write explanations on replies."

response <- POST(
  url = "https://api.openai.com/v1/chat/completions",
  add_headers(Authorization = paste("Bearer", apiKey)),
  content_type_json(),
  encode = "json",
  body = list(
    model = "gpt-3.5-turbo",
    temperature = 1,
    messages = list(list(
      role = "user",
      content = prompt
    ))
  )
)

content(response)
$id
[1] "chatcmpl-7DaAPWmVVc3f9VA5FKWTzeKMWSyii"

$object
[1] "chat.completion"

$created
[1] 1683471645

$model
[1] "gpt-3.5-turbo-0301"

$usage
$usage$prompt_tokens
[1] 25

$usage$completion_tokens
[1] 5

$usage$total_tokens
[1] 30

$choices
$choices[[1]]
$choices[[1]]$message
$choices[[1]]$message$role
[1] "assistant"

$choices[[1]]$message$content
[1] "df %>% distinct()"

$choices[[1]]$finish_reason
[1] "stop"

$choices[[1]]$index
[1] 0
Run the code below to generate the output in a more presentable manner.
cat(content(response)$choices[[1]]$message$content)
df %>% distinct()
Since the output is a list, we can extract only the response. Using the cat() function we can also take care of line breaks in the response. You may have observed the number of tokens the API counts for the question and for generating the response.
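The same token counts can be pulled out of the response programmatically. A hedged sketch, assuming `response` is the object returned by the POST call shown earlier:

```r
library(httr)

# Extract the token counts from the parsed API response
# (assumes `response` comes from the POST call shown above)
usage <- content(response)$usage
cat("Prompt tokens:    ", usage$prompt_tokens, "\n")
cat("Completion tokens:", usage$completion_tokens, "\n")
cat("Total tokens:     ", usage$total_tokens, "\n")
```

These counts are what OpenAI bills against, so logging them is useful if you want to track spending from within R.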
- GPT-4: To use GPT-4, specify gpt-4o instead of gpt-3.5-turbo in the code above.
- In OpenAI's API, the temperature argument controls the creativity or randomness of the generated text. It lies between 0 and 2. A higher temperature makes the model more likely to generate surprising and unexpected responses, whereas a lower temperature makes it more conservative and predictable. For example, at a temperature of 0.5 the generated text will be more focused, whereas at 1.5 it will be more random.
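To see the effect of temperature in practice, you can send the same prompt at a low and a high setting. A minimal sketch using the chatGPT() function defined in the next section (the coffee-shop prompt is just an illustrative example):

```r
# Low temperature: focused, repeatable answers
cat(chatGPT("Write a tagline for a coffee shop", temperature = 0.2))

# High temperature: more varied, creative answers
cat(chatGPT("Write a tagline for a coffee shop", temperature = 1.8))
```

Running the high-temperature call several times should produce noticeably different taglines, while the low-temperature call tends to repeat itself.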
R Function for ChatGPT
Here we create a user-defined R function for ChatGPT, which is a more robust way of calling ChatGPT in R. It wraps the R code shown in the previous section into a function and gives the user the flexibility to change the model's arguments easily.
chatGPT <- function(prompt,
                    modelName = "gpt-3.5-turbo",
                    temperature = 1,
                    apiKey = Sys.getenv("chatGPT_API_KEY")) {
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", apiKey)),
    content_type_json(),
    encode = "json",
    body = list(
      model = modelName,
      temperature = temperature,
      messages = list(list(
        role = "user",
        content = prompt
      ))
    )
  )
  if (status_code(response) > 200) {
    stop(content(response))
  }
  trimws(content(response)$choices[[1]]$message$content)
}

cat(chatGPT("square of 29"))
When you run the function above for the first time, it will ask you to enter your API key. It saves the key in the chatGPT_API_KEY environment variable, so it won't ask again on subsequent runs. Sys.setenv() stores the API key, whereas Sys.getenv() retrieves it.
Sys.setenv(chatGPT_API_KEY = "APIKey") # Set API Key
Sys.getenv("chatGPT_API_KEY")          # Get API Key
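Environment variables set with Sys.setenv() last only for the current R session. To make the key persist across sessions, one option is to store it in your user-level .Renviron file, which R reads at startup. A minimal sketch (the key shown is a placeholder; replace it with your own):

```r
# Append the key to ~/.Renviron so Sys.getenv() finds it in future sessions.
# Restart R after running this for the change to take effect.
renviron <- file.path(Sys.getenv("HOME"), ".Renviron")
write('chatGPT_API_KEY=sk-xxxxxxxxxxxxxxxx', file = renviron, append = TRUE)
```

Be careful not to commit .Renviron to version control, since it now contains a secret.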
How to Customize ChatGPT in R
By setting the system role you can control the behavior of ChatGPT. It is useful for providing context before starting the conversation, and it can also set the tone of the conversation - for example, you can instruct ChatGPT to be funny. To make this change in R, add one more list to the messages portion of the code; the rest of the code remains as shown in the previous section.
In the code below, we are telling ChatGPT to act like a Chief Purchasing Officer of an automotive company. Students will ask domain specific questions related to the company/industry.
messages = list(
  list(
    role = "system",
    content = "You are John Smith, the Chief Purchasing Officer of Surya Motors. Your company operates as per Toyota Production System. You are being interviewed by students"
  ),
  list(role = "user", content = "what are your roles and responsibilities?")
)
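The messages list above plugs into the same POST call used earlier. As a sketch, you could also wrap the system message into a small helper so the persona becomes an argument (chatGPT_persona is our own name, not an official function):

```r
library(httr)

# Hypothetical helper: ask a question with a custom system-role persona
chatGPT_persona <- function(prompt, persona,
                            apiKey = Sys.getenv("chatGPT_API_KEY")) {
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", apiKey)),
    content_type_json(),
    encode = "json",
    body = list(
      model = "gpt-3.5-turbo",
      messages = list(
        list(role = "system", content = persona),
        list(role = "user", content = prompt)
      )
    )
  )
  if (status_code(response) > 200) stop(content(response))
  trimws(content(response)$choices[[1]]$message$content)
}

cat(chatGPT_persona("what are your roles and responsibilities?",
                    "You are John Smith, the Chief Purchasing Officer of Surya Motors."))
```

This keeps the persona text in one place, which is convenient if many different prompts share the same system context.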
R Function to Make ChatGPT Remember Prior Conversations
By default, OpenAI's API doesn't remember previous questions when answering subsequent ones. This means that if you asked "What is 2+2?" and then followed up with "What is the square of the previous answer?", it would not be able to respond because it does not recall the previous prompt.
You may be wondering about this, since the functionality already exists on the ChatGPT website. Yes, it exists on the website, but not in the OpenAI API. To make ChatGPT remember previous conversations, you can use the following R function.
chatGPT <- function(prompt,
                    modelName = "gpt-3.5-turbo",
                    temperature = 1,
                    max_tokens = 2048,
                    top_p = 1,
                    apiKey = Sys.getenv("chatGPT_API_KEY")) {
  # Parameters
  params <- list(
    model = modelName,
    temperature = temperature,
    max_tokens = max_tokens,
    top_p = top_p
  )
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  # Add the new message to the chat session messages
  chatHistory <<- append(chatHistory, list(list(role = "user", content = prompt)))
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers("Authorization" = paste("Bearer", apiKey)),
    content_type_json(),
    body = toJSON(c(params, list(messages = chatHistory)), auto_unbox = TRUE)
  )
  if (response$status_code > 200) {
    stop(content(response))
  }
  response <- content(response)
  answer <- trimws(response$choices[[1]]$message$content)
  chatHistory <<- append(chatHistory, list(list(role = "assistant", content = answer)))
  # Return
  return(answer)
}
chatHistory <- list()
cat(chatGPT("2+2"))
cat(chatGPT("square of it"))
cat(chatGPT("add 3 to result"))
cat(chatGPT("2+2"))             # 4
cat(chatGPT("square of it"))    # The square of 4 is 16.
cat(chatGPT("add 3 to result")) # Adding 3 to the result of 16 gives 19.
- It is important to create the list as shown above. The name of the list must be chatHistory.
- max_tokens refers to the maximum number of tokens to generate in the response. top_p refers to the probability threshold used to select the next word from the probable words.
How to Input Images
OpenAI's latest GPT-4 model, called gpt-4o, accepts image inputs and returns output in the form of text.
library(httr)
library(jsonlite)

chatGPT_img <- function(prompt,
                        image_url,
                        modelName = "gpt-4o",
                        detail = "low",
                        apiKey = Sys.getenv("chatGPT_API_KEY")) {
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", apiKey)),
    content_type_json(),
    encode = "json",
    body = list(
      model = modelName,
      messages = list(
        list(
          role = "user",
          content = list(
            list(type = "text", text = prompt),
            list(
              type = "image_url",
              image_url = list(url = image_url, detail = detail)
            )
          )
        )
      )
    )
  )
  if (status_code(response) > 200) {
    stop(content(response))
  }
  trimws(content(response)$choices[[1]]$message$content)
}

cat(chatGPT_img(prompt = "What's in the image?",
                image_url = "https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCGchJj9jVRP0jMND1a6tJXj7RcYWtnCO4J6YcbPTXrNxiCvs_3NSk7h2gB0h2sc_6bTvwPrBeBHwUA45AXAhaw1uuINuPDcHCbARxpgJIXM5Spi_0P45aR6tqZ_yof-YlNn41LhzHjfW-wsV3mhxBug4To8xtgyMzsHLbm3XoaHZmYUdNY1YWJA5rh6cB/s1600/Soccer-1490541_960_720.jpg"))
The image shows a soccer match between two teams, with players wearing different colored uniforms. The player wearing number 16 in a green uniform is attempting to compete for the ball against a player wearing number 2 in a white uniform. There are other players visible in the background, also engaged in the game. The scene takes place on a grassy field, and it appears to be during an evening match. The players are focused, and the moment captures a dynamic motion as they vie for control of the soccer ball.
Suppose you have an image on your local device instead of one stored on the web. In this case, you need to convert it to base64 format. Make sure to install the base64enc library before running the code below.
library(httr)
library(jsonlite)
library(base64enc)

base64_image <- function(image_path) {
  image_data <- readBin(image_path, "raw", file.info(image_path)$size)
  encoded_image <- base64enc::base64encode(image_data)
  return(encoded_image)
}

chatGPT_img <- function(prompt,
                        image_path,
                        modelName = "gpt-4o",
                        detail = "low",
                        apiKey = Sys.getenv("chatGPT_API_KEY")) {
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  base64_img <- base64_image(image_path)
  base64_img <- paste0("data:image/jpeg;base64,", base64_img)
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", apiKey)),
    content_type_json(),
    encode = "json",
    body = list(
      model = modelName,
      messages = list(
        list(
          role = "user",
          content = list(
            list(type = "text", text = prompt),
            list(
              type = "image_url",
              image_url = list(url = base64_img, detail = detail)
            )
          )
        )
      )
    )
  )
  if (status_code(response) > 200) {
    stop(content(response))
  }
  trimws(content(response)$choices[[1]]$message$content)
}

cat(chatGPT_img(prompt = "What's in the image?",
                image_path = "C:\\Users\\deepa\\Downloads\\Soccer-1490541_960_720.jpg"))
R function to generate image
Like GPT for text generation, OpenAI has a model called DALL-E to generate or edit images. DALL-E can create highly realistic images that have never been captured in the real world, based purely on your prompt. It can be used for various purposes such as social media marketing, images for blog posts, etc. The code below takes your instruction (prompt) as input and creates an image accordingly.
chatGPT_image <- function(prompt,
                          n = 1,
                          size = c("1024x1024", "256x256", "512x512"),
                          response_format = c("url", "b64_json"),
                          apiKey = Sys.getenv("chatGPT_API_KEY")) {
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  size <- match.arg(size)
  response_format <- match.arg(response_format)
  response <- POST(
    url = "https://api.openai.com/v1/images/generations",
    add_headers(Authorization = paste("Bearer", apiKey)),
    content_type_json(),
    encode = "json",
    body = list(
      prompt = prompt,
      n = n,
      size = size,
      response_format = response_format
    )
  )
  if (status_code(response) > 200) {
    stop(content(response))
  }
  parsed0 <- httr::content(response, as = "text", encoding = "UTF-8")
  parsed <- jsonlite::fromJSON(parsed0, flatten = TRUE)
  parsed
}

img <- chatGPT_image("saint sitting on wall street")
img$data$url

The above code returns the URL of the generated image, which you can paste into a browser (Google Chrome/Edge) to see the result. To view the image in RStudio, refer to the code below.
library(magick)
saint <- image_read(img$data$url)
print(saint)
- n : Number of images to generate
- size : Image size
- response_format : Do you want image in the format of URL or base64 image string?
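If you also want to save the generated image to disk, the magick package can write it out. A minimal sketch, assuming `img` holds the result of chatGPT_image() above and "saint.png" is just a placeholder filename:

```r
library(magick)

# Download the generated image from its URL and save it locally as PNG
saint <- image_read(img$data$url)
image_write(saint, path = "saint.png", format = "png")
```

Saving locally matters because the URLs returned by the API are temporary and expire after a while.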
How to validate API Key
The function below can be used as a utility to check whether an API key is correct. It may be useful in case you are building an application and want to validate the API key before the user starts asking questions in the interface.
apiCheck <- function(apiKey = Sys.getenv("chatGPT_API_KEY")) {
  if (nchar(apiKey) < 1) {
    apiKey <- readline("Paste your API key here: ")
    Sys.setenv(chatGPT_API_KEY = apiKey)
  }
  x <- httr::GET(
    "https://api.openai.com/v1/models",
    httr::add_headers(Authorization = paste0("Bearer ", apiKey))
  )
  status <- httr::status_code(x)
  if (status == 200) {
    message("Correct API Key. Yeeee!")
  } else {
    stop("Incorrect API Key. Oops!")
  }
}

apiCheck()
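Because apiCheck() stops with an error on an invalid key, an application would typically wrap it in tryCatch() to get a TRUE/FALSE flag instead of halting. A minimal sketch (isValidKey is our own helper name):

```r
# Returns TRUE if the key is valid, FALSE otherwise, without stopping the app
isValidKey <- function(apiKey = Sys.getenv("chatGPT_API_KEY")) {
  tryCatch({
    apiCheck(apiKey)
    TRUE
  }, error = function(e) FALSE)
}

if (isValidKey()) {
  message("Ready to chat!")
} else {
  message("Please enter a valid API key.")
}
```

In a Shiny app, this flag could gate the chat interface so users fix their key before sending any prompts.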
RStudio Add-in for ChatGPT
To have an interactive Shiny app like the ChatGPT website, you can use the RStudio add-in for ChatGPT by installing the gptstudio package. To install the package, run install.packages("gptstudio"). Then launch the add-in as shown below.
gptstudio:::addin_chatgpt()

In the Shiny app, you can also select your programming style and proficiency level.
Shiny App for ChatGPT
If you want to build your own ChatGPT clone in shiny, you can visit this tutorial - ChatGPT clone in Shiny. It will help you to build your own customised chatbot for your audience.
ChatGPT prompts for R
Following is a list of some standard ChatGPT prompts you can use for R coding. If you only want R code as output and no explanation from ChatGPT, add this line to your prompt: Do not write explanations on replies.
- Explain the following code [Insert code]
- The following code is poorly written. Can you optimise it? [Insert code]
- Can you simplify the following code? [Insert code]
- Can you please convert the following code from Python to R? [Insert code]
- I have a dataset of [describe dataset]. Please write R code for data exploration.