Josephine Skylar

passionate contemporary romance writer


Why I’m Not Afraid of AI

A good many authors are thinking out loud about AI and LLMs (large language models) right now–KJ Charles, for example. So is a friend of mine, who was unhappily contemplating pulling their fanfic from AO3 the other day. My friend doesn’t write for profit, but for fun, and felt angry, taken advantage of, at the thought of companies hoovering up their words to teach LLMs to give a simulacrum of imagination.

I’m more sanguine about the hoovering; the very few fanfics I’ve written, ages ago, I’m leaving be. Which doesn’t mean my friend is wrong, necessarily. With respect especially to fanfic, Charles writes:

The people who wrote the billions of words on AO3 weren’t asked permission to have their words used as a training dataset. Their own creative, lovingly composed, often deeply personal work has been scraped and used without consent or payment to create profit-generating software. That is morally wrong if not legally questionable.

We can quibble over whether fanfic writers who start off with IP that wasn’t theirs to begin with (which is not all fanfic writers: AO3 has an “Original Work” tag) get to call foul on scrapers using their IP in turn. There was not a lot of sympathy for Anne Rice’s anti-fanfic stance back in the day, after all. But still: when you’ve uploaded your own creative work to the Internet under the tacit understanding that profit-generating entities will leave it alone, and then a whole new set of profit-generating entities start paying attention to use your words to unclear ends, you have the right to be miffed.

“Unclear ends” because at this point there are quite a few different ways LLMs can be used in the writing craft. (I’m using “LLMs” now instead of “AI” because, as Charles herself points out, the LLMs aren’t really intelligent: they’re basically large-scale prediction engines.) Joanna Penn, whose attitude towards LLMs is very different from Charles’s, wrote a rundown of many of them. At one end of the spectrum are programs such as Sudowrite*, which generates original text based on prompts; at the other is grammar-checking, or, as I will admit to having done myself, typing, “Hey, ChatGPT, can you give me some suggestions of popular first- and middle-name combinations for American parents to use for a girl baby born in 1993?” Somewhere in the middle is getting the LLM to roleplay as a character, like Tyler Cowen did with GPT Jonathan Swift. Not all of these uses are full-on writing, but all of them are enhanced by the kind of “training” that infuriated my friend.

And then there’s a separate advance of prediction engines in publishing: computer-generated audiobooks. You can get either Sparks Fly or The Way Through Disaster in audio form, read by a bot rather than a human. At the beginning of this year I had no plans for audiobooks: it seemed highly unlikely that my books would earn out the cost. Participating in Apple’s “AI audio” program reduced that cost to zero. I didn’t take away an opportunity for a human narrator to read my books; that opportunity didn’t exist to begin with.

But there’s the rub, isn’t it? The fear that the LLMs will take over our creative opportunities, bit by bit. There are writers out there using Sudowrite and similar programs to generate text and edit that text into books, and those books are selling. Part of Joanna Penn’s stance has always been that creative work is not a zero-sum game: even if computer-generated audiobooks spread, she’s argued, there will still be a market for “artisan” human-voiced audiobooks. But it sure is tempting to think: there are only so many hours in a day. Any given reader is only going to read so many books. And will those books be yours, or by a soulless program that never needs to eat or sleep or take care of others or feel rewarded?

The most pessimistic take I’ve seen so far was by a writer named Sean Thomas for the Spectator, earlier this year:

Writing is over. That’s it. It’s time to pack away your quill, your biro, and your shiny iPad: the computers will soon be here to do it better.

Computers are good at algorithms. It’s their thing. That means that, given enough data to train on (e.g. all the words ever written on the internet) computers can get really good at running the algos of language…. Of course, there are multiple, complex, layered, interlinked algorithms in most writing. Some have to follow the algorithms of story, some have to follow the algorithms of academe, or the haiku, or fanfic, Korean erotica, Python code, divorce documents, or verse drama. But they are all combos of algos, and therefore all, ultimately, prone to automation.

This brought me up short, because: he’s absolutely right. What I do, I do by algorithm. Veteran writers have told me to read Romancing the Beat by Gwen Hayes, and I have, and hopefully it’s improved my books, and by “improved” I mean made them hew closer to what romance readers expect from their books, which is to say follow the romance algorithm. We have to have our main characters meet, experience some kind of attraction to each other, have conflicts, and resolve those conflicts before making some kind of mutual commitment. It’s a formula. It’s a schematic. It’s an algorithm!

And it doesn’t surprise me that people are selling LLM-generated works, because sometimes–often–readers want the algorithm. They may not have a lot of time or headspace for reading; they may just want the mystery solved, the couple together without much conflict or complication, the bad guys to get their just deserts and the good guys to triumph. And, in case it doesn’t go without saying, there’s nothing wrong with that! Not every reader has to read for intellectual stimulation every time, and not every book has to be Ulysses. (Full disclosure: I have not read Ulysses.) But the closer a given book hews to the algorithm, the more easily replicable it is. The threat of the LLMs is not that they’re going to completely overtake human creativity any time soon; they’re not going to write Ulysses. The threat is that they’re going to be able to churn out algorithmically correct stories quickly enough and competently enough to meet reader demand, both for those readers who don’t need more and those readers who would like more but have trouble finding it in the masses and masses of supply.

For a romance writer whose books so far all feature a man and a woman meeting, talking, feeling a mutual romantic and sexual attraction, coming up with reasons why they can’t be together, and then deciding to clear out the necessary obstacles and declare their love for each other because it’s worth it (with the dude usually in glasses to boot), I like to think of myself as not a very formulaic writer. To give you an example: when I was working on the marketing for The Hand You’re Dealt, I got feedback along the lines of: “Wait, your hero’s not a billionaire? I mean, you’ve already got him in Vegas, and wealthy; why not just make him a billionaire?” The official answer was: because while professional poker players who work hard and get good at their game can make millions, as Pete does, they do not make billions; I would have had to either change his profession, and therefore change him, or force the reader to suspend disbelief more than I like to. But the real answer was: because I’m stubborn. I had Pete in my head, and I wanted him to be himself and find love, and he was not a billionaire. There are lots and lots of readers who enjoy billionaire romances, and I’m cutting myself off from that audience. But I’m also making a choice that an LLM would not make.

So I don’t plan to use the LLM story- or text-generation engines, not because I’m morally opposed to them, but because that’s not where my comparative advantage lies. I cannot write a structured romance plot better than LLMs, but I can be me. I suppose it’s possible that someday I’ll have published so many novels, and be so in demand, that it’ll be worth someone’s time to train an LLM on my work and produce writing that’s even more me than I am: a Josie-bot. But honestly, that’s less of a risk than my burning out long before then because I haven’t found an audience–and that risk exists whether LLMs are on the scene or not; that risk has always existed.

So that’s why I’m not afraid of AI putting me and my tiny little publishing empire out of business. But everyone’s going to make this calculation differently, and we’re all going to have to change our calculations as the LLMs change. And yes, you absolutely should assume that anything you put online for free going forward, from fanfic to blog posts to Tweets, is being scraped for LLM training (including by LLMs whose holding companies aren’t subject to US or EU copyright laws). That may leave you angry, or be only worth a shrug; either way it’s worth keeping in mind.

* It just occurred to me: Charles thinks the sudo in Sudowrite is supposed to echo “pseudo,” but sudo in Unix is also the command you use to run commands as “root,” the superuser, and so override the computer’s usual safeguards. It may thus be meant to imply more control by the user over the output, not less.