Moishe Lettvin

I like coffee, bicycles, cameras, and code. Backend engineer at daily.co.

Generative coding tools and "security review"

06 June 2025

Recently, at work, someone mentioned that a popular generative coding tool, which will remain nameless, introduced a “security review” feature. I expressed some doubt about the value of an LLM-driven “security review”, not because I don’t think an LLM can find some security problems, but because I think it’s too easy to assume it can find all the security problems. I believe an important part of shipping a feature that claims to do “Security Review” is being very clear about its weaknesses, and I suspected that this coding tool didn’t approach that with the subtlety and nuance it deserves.

So, I spent a couple of hours using this coding tool. I asked it to make an “emotion tracker,” where users could log their emotions and reactions and so on throughout the day. Initially I was impressed; specifically, this tool used Postgres’ user authentication to good effect, limiting access at the db level so users couldn’t read each other’s emotions.
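
By that I mean something like the sketch below; this is a guess at the shape, not the tool’s exact output. I’m assuming an entries table with a user_id column (matching the policy shown later) and a session setting, app.current_user_id, that the app sets after authenticating a user.

-- Each user can read only their own rows.
ALTER TABLE public.entries ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users can read their own entries"
  ON public.entries
  FOR SELECT
  USING (user_id = current_setting('app.current_user_id')::uuid);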

Before publishing my dumb little app, I ran the “security review” step. The review identified some fairly big problems, like XSS vulnerabilities, and a database hole that let one user modify another user’s rows and take ownership of them. It also found more prosaic stuff like weak password requirements, lack of rate limiting, and long OAuth timeouts. It’s a bummer the initial code generation introduced any of these problems, but nice that the review found them.
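
The fix for that row-takeover hole is presumably something along these lines: an UPDATE policy whose WITH CHECK clause stops a user from rewriting user_id to claim someone else’s rows. This is a sketch reusing the assumed names from above, not the review’s actual suggestion.

-- USING gates which rows an update may target;
-- WITH CHECK gates what those rows may look like afterwards.
CREATE POLICY "Users can update only their own entries"
  ON public.entries
  FOR UPDATE
  USING (user_id = current_setting('app.current_user_id')::uuid)
  WITH CHECK (user_id = current_setting('app.current_user_id')::uuid);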

So, I published the app, then decided to do the dumbest possible thing:

Okay, next I’d like to add a home page with “best emotions” on it. To start with, let’s qualify “best” as “highest intensity”. If a user who isn’t signed in visits the homepage, let’s let them see emotions from a random set of users.

I was hoping the coding tool would stop me from doing this, which is a prima facie terrible idea, but instead it helpfully suggested adding this policy to the db:

CREATE POLICY "Public can view random entries for homepage"
  ON public.entries
  FOR SELECT
  USING (true);

I accepted this, pretending to trust that “Security Review” would catch anything wrong with it. The coding tool happily ran this SQL, and by doing so immediately made every row of the entries table publicly visible to anyone who uses the app.
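
To make that concrete: USING (true) passes for every row, so row-level security no longer restricts reads at all, and any client that can reach the table, signed in or not, can run:

-- Returns every user's private entries; no ownership check applies.
SELECT * FROM public.entries;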

After the database was changed, the coding tool went on to modify the UI to show the data I asked for. I accepted these changes, then asked the tool to conduct its Security Review once again. Remember, this change showed user data on the logged-out home page and allowed anyone to read every row of the main table containing private user information. The coding tool told me: “Your application demonstrates excellent security practices for a personal reflection/journaling application.”

This is a contrived example, but it points out a few things:

  • “deploying” is not the only step that makes changes user-visible. Changes to the app’s dependencies, like the database here, take effect immediately and can have immediate security implications. By tying “Security Review” to the deploy step, even a perfect security review would be leaky.
  • “security review” doesn’t understand the real world. I intentionally chose this data as an example of something extremely sensitive and personal, something that could cause real harm if leaked. At no point did the security review flag the value of the data or its implications. It’s worth noting that the coding tool did specifically say it would create “A clean, calming interface with a journal-like feel”, which conveyed that the tool had some understanding of the data being stored. It wouldn’t be unreasonable to expect that understanding to encompass the sensitivity of the data.
  • much of the hard work of security is at the boundaries between things, not in the things themselves. Understanding those boundaries is real hard; a sketch of what respecting this one could look like follows this list.
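
For contrast, here’s a rough sketch of what respecting that boundary could look like. This is my design, not anything the tool suggested, and is_public plus the column names are invented: share only entries a user has explicitly opted in to, through a view that defines the public surface.

ALTER TABLE public.entries
  ADD COLUMN is_public boolean NOT NULL DEFAULT false;

-- The logged-out homepage queries the view, never the base table.
CREATE VIEW public.homepage_entries AS
  SELECT id, emotion, intensity
  FROM public.entries
  WHERE is_public;

-- Read access on the view only; the base table stays private.
-- (web_anon is a stand-in for whatever role serves anonymous requests.)
GRANT SELECT ON public.homepage_entries TO web_anon;

Because a view runs with its owner’s privileges by default, it can read past the base table’s row-level security while exposing only the rows and columns it names, which makes the view itself the boundary.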

Most importantly, the “Security Review” design gives a false sense of, um, security, and I think that effect is amplified for the people most susceptible to it: those who don’t know that they need real security expertise. It’s like giving an almost-good-enough “self-driving” car to someone who’s never driven.