Novel ideas and LLMs

I have been playing around with large language models (LLMs) lately. I prefer GPT-4, even compared to “open” models I can run locally. As everybody should know by now, it is impressive what these models can do. I still do not trust them to help me with my R code, but for the work I do in Python, ChatGPT is a lifesaver.

I have found an unintended use case for LLMs: to test whether an idea is novel or not. The relevant axiom is that LLMs, by definition, cannot provide novel ideas. I acknowledge this is somewhat of a negative definition of novelty, and people might rightfully talk about no true Scotsman and whatnot, but if an LLM can come up with an idea, I do not consider said idea novel.

Last year I wrote a post on why opinion polls might differ from election results. In brief, I provided a list of ten reasons why opinion polls can get it wrong. Was this a novel post? No. Why not? Because when I ask ChatGPT to provide me with a list of ten reasons why opinion polls might differ from election results, the ten reasons it returns are quite similar to the ones I provided in my post (albeit not identical). If the post had been novel, ChatGPT would not have been able to come up with something similar.

This is the simple heuristic I rely on to evaluate whether an idea is novel: if ChatGPT can produce it, it is not novel. That is, if I can replicate the basic point of an idea with the help of a simple prompt, there is nothing novel about it. Ironically, there is nothing novel about this idea either. ChatGPT, like other LLMs, is merely a summary of a lot of textual data, and thereby a good tool to test whether something has been done before in some shape or form.
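To make the heuristic concrete, here is a minimal sketch in Python (the language I actually use ChatGPT for). It assumes the OpenAI Python client and an API key in the environment; the function name `novelty_probe` is my own invention, and the comparison with your own write-up is still done by eye, so this is a sketch of the workflow rather than a real novelty metric.

```python
# A minimal sketch of the novelty heuristic. Assumes the OpenAI Python
# client (pip install openai) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment


def novelty_probe(idea_prompt: str, model: str = "gpt-4") -> str:
    """Ask the model to reproduce an idea from a simple prompt.

    If the reply covers the same ground as your own write-up,
    the idea fails the novelty test (by my definition).
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": idea_prompt}],
    )
    return response.choices[0].message.content


# Example: the opinion polls post from last year.
print(novelty_probe(
    "Provide a list of ten reasons why opinion polls might differ "
    "from election results."
))
# If the ten reasons look a lot like your own list, the post was not novel.
```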

There is another reason why LLMs are not good at generating novel ideas. The more novel an idea is, the greater the disagreement about its potential value (cf. Johnson and Proudfoot 2024). As LLMs are optimised towards inter-coder agreement, the purpose of an LLM is to be predictable (in both the statistical and the everyday sense of the word). There is no reason to believe that LLMs will get better at creating novel ideas at any point in the future.

Again, it is truly impressive what LLMs can do. Similarly, it is impressive how unoriginal people can be with AI. If we look at the average content created by LLMs, we are dealing with anything but novelty. AI can be used to create a lot of mediocre, if not blatantly incorrect, content. Matt Dray wrote a good post on this the other day in the context of R material online.

The good thing is that if you see any indication that AI helped create specific content, you can safely skip it. This is especially the case for people using text-to-image models. It is my stable disinterest heuristic: if there is an image in a blog post that looks like it came out of an AI model, the content is not interesting.

LLMs are great, but they are not useful for providing novel ideas. On the contrary, if an idea is straight out of an LLM, it is – by (my) definition – not novel.