The programming capabilities of artificial intelligence (AI) have improved steadily in recent years, but they are still imperfect. Max Woolf, a senior data scientist at BuzzFeed, recently found through experimentation that repeatedly prompting a large language model (LLM) to "write better code" can indeed yield higher-quality code. The finding has drawn widespread attention, with notable AI scientists expressing interest and emphasizing the importance of iteration and prompt design.

For the experiment, Woolf used the Claude 3.5 Sonnet model on a single programming task. He first posed a simple problem: given one million random integers, find the difference between the smallest and the largest numbers whose digits sum to 30. Claude generated code that met the requirements, but Woolf believed there was room for optimization.
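
To make the task concrete, here is a minimal sketch of a straightforward solution in Python. It is not the code Claude produced. The function names and the 1 to 100,000 value range are assumptions made for illustration, since the article does not state them.

```python
import random


def digit_sum(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    total = 0
    while n:
        n, digit = divmod(n, 10)
        total += digit
    return total


def min_max_difference(numbers, target=30):
    """Return max - min over the numbers whose digits sum to `target`.

    Returns None when no number qualifies.
    """
    matches = [n for n in numbers if digit_sum(n) == target]
    if not matches:
        return None
    return max(matches) - min(matches)


if __name__ == "__main__":
    # Assumed input: one million integers drawn from 1..100,000.
    rng = random.Random(0)
    numbers = [rng.randint(1, 100_000) for _ in range(1_000_000)]
    print(min_max_difference(numbers))
```

A simple version like this gives a reference point against which later, more elaborate versions can be timed.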

Woolf then followed each code generation with the prompt "write better code," asking Claude to optimize iteratively. In the first iteration, Claude refactored the code into an object-oriented Python class and implemented two significant optimizations, yielding a 2.7x speedup. In the second iteration, Claude added multithreading and vectorized computation, producing code that ran 5.1 times faster than the baseline version.
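
The article does not say which two optimizations Claude chose in the first iteration. As a purely illustrative sketch, the class below shows what an object-oriented refactor with two typical optimizations might look like: a precomputed digit-sum lookup table and a single pass that tracks the minimum and maximum without building an intermediate list. The class name `DigitSumFinder`, its parameters, and the 100,000 upper bound are hypothetical.

```python
class DigitSumFinder:
    """Hypothetical refactor; not the class Claude generated."""

    def __init__(self, max_value=100_000, target=30):
        self.target = target
        # Precompute digit sums for 0..max_value:
        # digit_sum(n) == digit_sum(n // 10) + n % 10.
        table = [0] * (max_value + 1)
        for n in range(1, max_value + 1):
            table[n] = table[n // 10] + n % 10
        self._table = table

    def difference(self, numbers):
        """Return max - min of qualifying numbers, or None if none qualify.

        Every number must lie in 0..max_value.
        """
        table, target = self._table, self.target
        lo = hi = None
        for n in numbers:
            if table[n] == target:
                if lo is None or n < lo:
                    lo = n
                if hi is None or n > hi:
                    hi = n
        return None if lo is None else hi - lo
```

Whether such changes actually help depends on the input size and the runtime, so any speedup should be measured rather than assumed.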

As the number of iterations increased, however, the gains began to taper off. In later rounds the model reached for more complex techniques, such as JIT compilation and asynchronous programming, yet some iterations actually ran slower than the one before. Woolf's experiment thus revealed both the potential and the limitations of iterative prompting, and it has stirred fresh discussion about the future of AI programming.

The experiment demonstrates the potential of AI in programming, and it is also a reminder that although AI can improve code through repeated iteration, how to design prompts well and how to balance performance against complexity remain open questions worth exploring in depth.