Monday, May 19, 2025

The Problem With Generative AI: Too Much Artificial and Almost No Intelligence

Just as an observation, this is how bad it's getting in the fiction publishing market as "generative AI" floods the zone with literal shit (via Neil Clarke, Editor at Clarkesworld magazine):

Within months of ChatGPT’s public release, the signal-to-noise ratio shifted. Plagiarism was a fringe case and easily handled by the old model, but the sheer volume of generated work threatened to make human-written works the minority. The old way of finding the works we wanted to publish was no longer sustainable for us, so we temporarily closed submissions in February 2023.

When we reopened in March 2023, we implemented a new process that looks more like this:

[Flowchart of Clarkesworld's new submission process, copied from Clarke's article]

The oval step is an in-house automated check. I haven’t spoken much about what we’re checking for there because I don’t want to make it easier for the spammers/sloppers to avoid being caught. Just like with malware and email spam, the patterns shift over time, so I’ve had to make regular changes within that oval over the last two years. (I am the developer of the submission software, so the responsibility for this falls to me.)

No process is perfect. Spam detection has existed for email for decades and still makes mistakes. I would never trust an algorithm to make a final assessment and fully accept that each “suspicious” story is a potential false positive. As such, I personally evaluate each suspicious submission. Our slush readers do not have access to this queue...

The intent of the oval is not to save time, but rather to act as a pressure valve. What broke our process in 2023 was the signal-to-noise ratio. By redirecting the flow of suspicious submissions to a separate queue, we’ve been able to maintain our team’s attention on the work that has to happen on a daily basis. Adopting this approach has given us the ability to weather storms significantly worse than the one that shut us down and more importantly, it has done so without creating an undue burden or deterrent for authors...

For those that would respond to our complaints with “why don’t you just judge it on its own merits”, keep dreaming. Despite the hype, even if we set aside our legal and ethical concerns with how these systems were developed, the output of these tools is nowhere near the standards we expect. Besides, we’ve said we don’t want it. We don’t publish mysteries or romance either, but those authors are at least respectful of our time and don’t insist that we evaluate their work “on its own merits” when it doesn’t meet our guidelines.
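
Clarke is deliberately cagey about what that oval actually checks for, so take this with a grain of salt: the signals below are placeholders I made up, not his real checks. But the shape of the process he describes - an automated pre-screen that redirects suspicious submissions to an editor-only queue rather than auto-rejecting anything - might look roughly like this Python sketch:

```python
from dataclasses import dataclass, field

SLUSH_QUEUE: list["Submission"] = []       # what the slush readers see
SUSPICIOUS_QUEUE: list["Submission"] = []  # editor-only review queue

@dataclass
class Submission:
    author: str
    text: str
    flags: list[str] = field(default_factory=list)

def style_pattern_score(text: str) -> float:
    """Toy stand-in for a stylometric check; the real signals are undisclosed."""
    cliches = ("tapestry", "delve into", "testament to")
    hits = sum(text.lower().count(c) for c in cliches)
    return hits / max(len(text.split()), 1)

def triage(sub: Submission, recent_submissions: int) -> None:
    """Route a submission; nothing is rejected by the machine, only redirected.

    Per Clarke, the point is a pressure valve: flagged work goes to a
    separate queue that a human editor still reads, so a false positive
    is never silently lost, and the slush readers keep a clean signal.
    """
    if style_pattern_score(sub.text) > 0.01:   # hypothetical threshold
        sub.flags.append("style-pattern")
    if recent_submissions > 3:                 # hypothetical rate limit
        sub.flags.append("rapid-fire-submitter")
    (SUSPICIOUS_QUEUE if sub.flags else SLUSH_QUEUE).append(sub)

triage(Submission("a.writer", "The wind carried salt off the harbor."), 1)
triage(Submission("slopper99", "A testament to the rich tapestry of fate."), 7)
print(len(SLUSH_QUEUE), len(SUSPICIOUS_QUEUE))  # -> 1 1
```

The design choice worth noticing is that the algorithm never gets the last word: per Clarke, a flag only changes which human reads the story.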


The problem with AI - especially as it's getting shoved down our throats by the tech lords who oversee our software and our Intertubes - is that it's not really "intelligent": AI can only process the oft-times bad data fed into it (via David Linthicum at InfoWorld):

Many are telling me they thought generative AI was supposed to provide the best chance of an informational and helpful response. It seems the technology is not living up to that expectation. What the hell is going on?

Generative AI has the same limitations as all AI systems: It depends on the data used to train the model. Crappy data creates crappy AI models. Worse, you get erroneous responses or responses that may get you into legal trouble. It’s important to acknowledge the limitations inherent in these systems and understand that, at times, they can exhibit what may reasonably be called “stupidity.” This stupidity can put you out of business or get you sued into the Stone Age.

Generative AI models, including models like GPT, operate based on patterns and associations learned from vast data sets. Although these models can generate coherent and contextually relevant responses, they lack proper understanding and consciousness, leading to outputs that may seem perplexing or nonsensical.

You may ask a public large language model to create a history paper and get one explaining that Napoleon fought in the United States Civil War. This error is easily spotted, but mistakes made in a new genAI-enabled supply chain optimization system may not be so easy to spot. And these errors may result in millions of dollars in lost revenue.

Shorter answer: Shit in, shit out.
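
To put Linthicum's point in concrete terms - and this is my own toy illustration, not anything from his article - a Napoleon-sized howler fails any sanity check, but a plausible-looking number flowing into an automated pipeline does not:

```python
def validate_reorder_qty(sku: str, model_qty: int, avg_weekly_demand: int) -> int:
    """Guardrail on a model-suggested order quantity (hypothetical system).

    Bounds checks catch the absurd outputs - a negative order, or two
    million units of one SKU - but a wrong-yet-plausible value, say 800
    units when 8,000 were actually needed, passes every check and
    quietly bleeds money downstream.
    """
    if model_qty < 0:
        raise ValueError(f"{sku}: negative quantity from model")
    if model_qty > 50 * avg_weekly_demand:          # hypothetical ceiling
        raise ValueError(f"{sku}: implausible quantity, route to a human")
    return model_qty

# The howler gets caught; the subtle error sails through.
validate_reorder_qty("SKU-123", 800, avg_weekly_demand=1000)  # wrong, but "plausible"
```

That asymmetry is the whole problem: the errors cheap guardrails can catch are mostly the ones a human would have caught anyway.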

My mom was a high school teacher, focused on the honors/college-track students taking IB and AP courses to get ahead on their college studies. In the last ten years of her career, it drove her crazy that so many of her students - her gifted, intelligent students - would get lazy enough to copy and paste entire Wikipedia articles and submit them as their own research essays (they didn't even edit out the obvious footnote tags proving the text came straight from the website). The whole point of studying and getting into college was gaining your own understanding of the topics, expressing them in your own words, and proving you had the comprehension and reasoning skills to become an expert in whatever field or profession you were headed for. And Wikipedia isn't that bad: for an open-source encyclopedia, it has a review and editing process meant to keep the articles factual and as free of opinions and bad takes as possible.
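
The darkly funny part is how trivially detectable that kind of copy-paste is. A hypothetical checker (my sketch - not anything my mom's school actually ran) only has to look for the wiki markup the students never bothered to strip out:

```python
import re

# Artifacts that survive a straight copy-paste from Wikipedia:
# bracketed footnote numbers, [citation needed], and [edit] links.
WIKI_RESIDUE = re.compile(r"\[(\d+|citation needed|edit)\]", re.IGNORECASE)

def flag_wikipedia_paste(essay: str) -> list[str]:
    """Return any leftover Wikipedia markup found in a submitted essay."""
    return WIKI_RESIDUE.findall(essay)

print(flag_wikipedia_paste(
    "The treaty ended the war.[14] Its legacy is debated.[citation needed]"
))  # -> ['14', 'citation needed']
```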

Just think how much lazier the later generations have become with these "AI" apps spewing out illiterate, unfocused, flat-out wrong essays/homework assignments in our schools. Nobody's thinking because they think - falsely - that their overgrown word processor can write entire chapters for them.

And that's in the schools. Linthicum is reporting on how bad it is in the professional world where the legal liabilities are far more severe.

This is a serious problem facing librarians and the overall reference/research profession. As a reference librarian, it is (or maybe was, now) my job to make sure the proper information got to the people asking for it - that the materials were well-vetted, fact-checked, and proven. I'd be in a lot of trouble if I gave people the wrong info.

Now here we are with AI as a research "tool," except that people expect it to be 100 percent accurate, when AI still has trouble seeing beyond the poor data fed into it, or recognizing that the algorithms producing those results might be in error. It doesn't help if a patron's existing bias blinds them to the factual information that does come up: they'll take the bad info if it fits their worldview (even if it kills them).

Generative AI has its own problems with copyright violations, not to mention that it has no sense of aesthetics or poetry to give a "soul" to the work of art it creates.

How the hell can I explain this to people who are already convinced AI is a good thing?

