New Study Suggests ChatGPT Vulnerability with Potential Privacy Implications

Prithvi Iyer is Program Manager at Tech Policy Press.

What would happen if you asked OpenAI's ChatGPT to repeat a word such as "poem" forever? A new preprint research paper reveals that this prompt could lead the chatbot to leak training data, including personally identifiable information and other material scraped from the web. The results, which have not been peer reviewed, raise questions about the safety and security of ChatGPT and other large language model (LLM) systems.

"This research would appear to confirm once again why the 'publicly available information' approach to web scraping and training data is incredibly reductive and outdated," Justin Sherman, founder of Global Cyber Strategies, a research and advisory firm, told Tech Policy Press.

The researchers, a team from Google DeepMind, the University of Washington, Cornell, Carnegie Mellon, University of California Berkeley, and ETH Zurich, explored the phenomenon of "extractable memorization," in which an adversary extracts training data by querying a machine learning model (in this case, by asking ChatGPT to repeat the word "poem" forever). Training data extraction is easier with open-source models that make their model weights and training data publicly available. Models like ChatGPT, however, are "aligned" with human feedback, which is supposed to prevent the model from "regurgitating training data."

Before discussing the potential data leak and its implications for privacy, it is important to understand how the researchers were able to verify if the generated output was part of the training data despite the fact that ChatGPT…