We benchmarked AI-generated code against an AI security reviewer and published the results, including where the reviewer made things worse

We built 50 features on two branches with the same model and prompts; one branch went through the AI reviewer and the other did not. The unreviewed branch shipped six CWE-502 sinks (native ObjectInputStream deserialization) and five sh -c command injection endpoints, several of them reachable by ordinary authenticated users. A trust-all X509TrustManager was also introduced on the reviewed branch, and we included it in the scoring rather than leaving it out. Methodology and per-feature data are in the blog post; the repo is public if you want to rerun it.

submitted by /u/VibeReview
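For readers less familiar with the first two bug classes named in the post, here is a minimal Java sketch of what such sinks tend to look like and one common way to narrow them. It is not taken from the benchmark repo: the class `SinkSketch`, its method names, the `ping` command, and the filter pattern are all hypothetical, chosen only for illustration. It assumes Java 9 or later for `ObjectInputFilter`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from the benchmark.
final class SinkSketch {

    // Helper for the demo: serialize any Serializable object to bytes.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // A bare ObjectInputStream.readObject() on untrusted bytes is the CWE-502
    // pattern. Here an allow-list filter is set before reading, so classes
    // that the pattern rejects fail with InvalidClassException.
    static Object readAllowListed(byte[] data, String pattern)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(ObjectInputFilter.Config.createFilter(pattern));
            return in.readObject();
        }
    }

    // Injection-prone shape: user input is spliced into a string that a shell
    // parses, so metacharacters such as ';' start a second command.
    static List<String> shellCommand(String host) {
        return List.of("sh", "-c", "ping -c 1 " + host);
    }

    // Narrower shape: no shell, input travels as a single argv element.
    // Input starting with '-' could still be read as an option, so this
    // does not replace validating the value itself.
    static List<String> argvCommand(String host) {
        return List.of("ping", "-c", "1", host);
    }

    public static void main(String[] args) throws Exception {
        String allowOnlyStrings = "java.lang.String;!*";
        System.out.println(readAllowListed(serialize("hello"), allowOnlyStrings));
        try {
            readAllowListed(serialize(new ArrayList<String>()), allowOnlyStrings);
        } catch (InvalidClassException e) {
            System.out.println("rejected by filter: " + e.getMessage());
        }
        // Build, but do not start, a process from the argv form.
        ProcessBuilder builder = new ProcessBuilder(argvCommand("example.org; id"));
        System.out.println(builder.command());
    }
}
```

The sketch only builds the command lists and never starts a process. An allow-list filter reduces the deserialization attack surface but does not make native serialization safe for arbitrary input; avoiding `ObjectInputStream` on untrusted data remains the more conservative choice.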