Toward Keyword Generation through Large Language Models

Wanhae Lee, Minki Chun, Hyeonhak Jeong, Hyunggu Jung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Scopus citations

Abstract

Understanding research trends is essential for researchers, decision-makers, and investors. One way to analyze these trends is to collect and analyze the author-defined keywords of scientific papers. Although author-defined keywords help researchers identify trends in their fields, 45% of the scientific papers in Microsoft Academic Graph did not contain author-defined keywords. Additionally, six of the top seven AI conferences neither collect nor disclose keywords. This paper proposes a method for generating keywords using Galactica, a pre-trained large language model published by Meta. We evaluate the method by comparing the generated keywords with the keywords provided by authors in CoRL'22, and we report characteristics of the generated keywords. Our study shows that the F1 score of the proposed method was ten times higher than that of previous studies, and that 42.7% of the generated keywords are relevant to author-defined keywords.
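
The F1-based comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the matching rule, so the case-insensitive exact match used here is an assumption, and the function names `normalize` and `keyword_f1` and the sample keyword lists are hypothetical, chosen only for illustration.

```python
def normalize(keyword: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(keyword.lower().split())


def keyword_f1(generated: list[str], reference: list[str]) -> dict[str, float]:
    """Score generated keywords against author-defined ones.

    Assumption: a generated keyword counts as correct only if it exactly
    matches an author-defined keyword after normalization. The paper may
    use a different matching rule.
    """
    gen = {normalize(k) for k in generated}
    ref = {normalize(k) for k in reference}
    overlap = len(gen & ref)

    # Guard against empty sets to avoid division by zero.
    precision = overlap / len(gen) if gen else 0.0
    recall = overlap / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example data, not taken from the paper.
scores = keyword_f1(
    generated=["Text Generation", "language model", "robotics"],
    reference=["keywords", "language model", "text generation", "text mining"],
)
print(scores)
```

With these sample lists, two of the three generated keywords match two of the four reference keywords, giving a precision of 2/3, a recall of 1/2, and an F1 of 4/7. Exact matching is strict; a relevance-based comparison such as the 42.7% figure reported in the abstract would require a looser criterion.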

Original language: English
Title of host publication: IUI 2023 - Companion Proceedings of the 28th International Conference on Intelligent User Interfaces
Publisher: Association for Computing Machinery
Pages: 37-40
Number of pages: 4
ISBN (Electronic): 9798400701078
State: Published - 27 Mar 2023
Event: 28th International Conference on Intelligent User Interfaces, IUI 2023 - Sydney, Australia
Duration: 27 Mar 2023 to 31 Mar 2023

Publication series

Name: International Conference on Intelligent User Interfaces, Proceedings IUI

Conference

Conference: 28th International Conference on Intelligent User Interfaces, IUI 2023
Country/Territory: Australia
City: Sydney
Period: 27/03/23 to 31/03/23

Keywords

  • keywords
  • language model
  • text generation
  • text mining
