TY - GEN
T1 - Toward Keyword Generation through Large Language Models
AU - Lee, Wanhae
AU - Chun, Minki
AU - Jeong, Hyeonhak
AU - Jung, Hyunggu
N1 - Publisher Copyright: © 2023 Owner/Author.
PY - 2023/3/27
Y1 - 2023/3/27
N2 - Understanding research trends is essential for researchers, decision-makers, and investors. One way to analyze research trends is to collect and analyze author-defined keywords in scientific papers. Although author-defined keywords help researchers track trends in their fields, 45% of scientific papers in Microsoft Academic Graph did not contain author-defined keywords. Additionally, six of the top seven AI conferences neither collect nor disclose keywords. This paper proposes a method for generating keywords using Galactica, a pre-trained large language model published by Meta. We evaluate the method by comparing the generated keywords with those provided by authors at CoRL'22 and report characteristics of the generated keywords. Our study shows that the F1 score of the proposed method was ten times higher than that of previous studies, and 42.7% of the generated keywords are relevant to author-defined keywords.
KW - keywords
KW - language model
KW - text generation
KW - text mining
UR - http://www.scopus.com/inward/record.url?scp=85151927916&partnerID=8YFLogxK
U2 - 10.1145/3581754.3584126
DO - 10.1145/3581754.3584126
M3 - Conference contribution
AN - SCOPUS:85151927916
T3 - International Conference on Intelligent User Interfaces, Proceedings IUI
SP - 37
EP - 40
BT - IUI 2023 - Companion Proceedings of the 28th International Conference on Intelligent User Interfaces
PB - Association for Computing Machinery
T2 - 28th International Conference on Intelligent User Interfaces, IUI 2023
Y2 - 27 March 2023 through 31 March 2023
ER -