Short texts have become central to online communication. Because they rarely share vocabulary or carry much context, they are difficult for artificial intelligence (AI) to analyze. In response, Justin Miller, a graduate student in English literature and data science at the University of Sydney, has proposed a method that uses large language models (LLMs) to analyze and interpret short texts in depth.

Miller's research focuses on how to classify large volumes of short texts, such as social media profiles, customer feedback, or online comments about disaster events. The AI tool he developed can cluster tens of thousands of Twitter user profiles into ten easily understandable categories. In the study, it was applied to nearly 40,000 Twitter user profiles related to U.S. President Donald Trump from two days in September 2020. The resulting categories help identify not only users' professional leanings and political stances but also the emojis they use.
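To make the idea of clustering short profiles and then naming each cluster concrete, here is a minimal Python sketch. It is not Miller's method or code: the article does not describe his implementation, so everything here is an assumption for illustration. A bag-of-words vector stands in for an LLM embedding, a small k-means loop with cosine similarity stands in for the clustering step, and the most frequent words stand in for an LLM-generated category name. The sample profiles and the helper names (`embed`, `cluster`, `name_cluster`) are hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = {"a", "and", "my", "of", "the"}  # tiny illustrative list, not exhaustive


def embed(text):
    """Stand-in embedding: word counts (a real pipeline would call an LLM embedding model)."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Component-wise average of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({w: c / len(vectors) for w, c in total.items()})


def cluster(texts, k, iterations=10):
    """Group texts into k clusters with k-means on cosine similarity."""
    vectors = [embed(t) for t in texts]
    # Deterministic farthest-point initialisation: start from the first text,
    # then repeatedly add the text least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each text to its most similar centroid, then recompute centroids.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors]
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = mean(members)
    return labels


def name_cluster(texts, top=2):
    """Stand-in for LLM naming: join the most frequent words in the cluster."""
    counts = Counter(w for t in texts for w in embed(t))
    return " / ".join(w for w, _ in counts.most_common(top))


# Hypothetical sample profiles, invented for this sketch.
profiles = [
    "Nurse and proud mom, love my dogs",
    "ICU nurse, mom of three",
    "Software engineer, python and coffee",
    "Engineer building software, coffee addict",
    "Football fan, season ticket holder",
    "Lifelong football fan and coach",
]

labels = cluster(profiles, k=3)
for j in sorted(set(labels)):
    members = [p for p, label in zip(profiles, labels) if label == j]
    print(j, name_cluster(members), members)
```

In a realistic setting, `embed` would presumably be replaced by an embedding model and `name_cluster` by a prompt asking an LLM to summarize the cluster's members; the overall shape of the pipeline (embed, cluster, name) would stay the same.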


"The highlight of this research lies in its human-centered design philosophy," Miller stated. The classifications generated by large language models are not only computationally efficient but also align well with human intuitive understanding. His research also indicates that generative AIs like ChatGPT provide clearer and more consistent classification names than human reviewers in certain cases, especially when distinguishing meaningful patterns from background noise.

Miller's tool has a range of potential applications. His research shows that very large datasets can be reduced to manageable, meaningful groups. For instance, in a project related to the Russia-Ukraine war, he clustered more than one million social media posts and identified ten distinct topics, including the Russian disinformation campaign and the symbolic use of animals in humanitarian aid. Organizations, governments, and businesses can draw actionable insights from such clusters and use them to make better-informed decisions.

Miller concluded, "This dual-use application of AI not only reduces reliance on costly and subjective human review but also provides us with a scalable way to understand large volumes of text data. From social media trend analysis to crisis monitoring and customer insights, this approach effectively combines machine efficiency with human understanding, offering new perspectives for organizing and interpreting data."