[HN Gopher] Show HN: A GitHub Action that quizzes you on a pull ...
___________________________________________________________________
Show HN: A GitHub Action that quizzes you on a pull request
A little idea I got from playing with AI SWE Agents. Can AI help
make sure we understand the code that our AIs write? PR Quiz uses
AI to generate a quiz from a pull request and blocks you from
merging until the quiz is passed. You can configure various options,
like the LLM model to use, the max number of attempts to pass the
quiz, or the min diff size to generate a quiz for. In my limited
testing, I found that the reasoning models, while more expensive,
generated better questions. Privacy: This GitHub Action runs a local
webserver and uses ngrok to serve the quiz through a temporary URL.
Your code is only sent to the model provider (OpenAI).
Author : dkamm
Score : 52 points
Date : 2025-07-29 18:20 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| frenchie4111 wrote:
| Next week on HN... Show HN: A GitHub Action that uses AI to
| answer PR quizzes
| dkamm wrote:
| Cluely 2.0
| sunrunner wrote:
| > AI Agents are starting to write more code. How do we make sure
| we understand what they're writing?
|
| This is a good question, but also how do we make sure that humans
| understand the code that _other humans_ have (supposedly)
| written? Effective code review is hard as it implies that the
| reviewer already has their own mental model about how a task
| could/would/should have been done, or is at the very least
| building their own mental model at reading-time and internally
| asking 'Does this make sense?'.
|
| Without that basis, code review is more like fuzzy standards
| compliance, which can still be useful, but it's not the same as a
| review process that works by comparing alternate or cooperatively
| competing models, and so I wonder how much of that is gained
| through a quiz-style interaction.
| dkamm wrote:
| I imagine the quizzer could ask better questions along those
| lines with better context engineering (taking entire repo
| contents, design docs, discussions, etc and compressing those
| into a mental model). I just took the PR code changes and
| comments, so there are a lot of improvements that could be made
| there.
| shortrounddev2 wrote:
| Code review, to me, is not about validating the output. It's
| about a 2nd set of eyes to check for foot guns, best practice,
| etc. Code review is one step above linting and one step below
| unit tests, for me.
|
| If someone were to submit this code for review:
|     getUser(id: number): UserDTO {
|       return this.mapToDTO(this.userModel.getById(id));
|     }
|
| and I knew that `userModel` throws an exception when it doesn't
| find a user (and this is typescript, not java, where exceptions
| are not declared in the method prototype) then I would tell
| them to wrap it in a try-catch. I would also probably tell them
| to change the return type to `UserDTO | null` or
| `Result<UserDTO>` depending on the pattern that we chose for
| the API. I don't need to know anything about the original
| ticket in order to point these things out, and linters most
| likely won't catch them. Another use for code review is
| catching potential security issues like SQL injection that the
| linter or framework can't figure out (e.g., using raw SQL
| queries in your ORM without prepared statements)
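|
| A minimal sketch of what that revised version might look like
| (assuming the same `userModel` and `mapToDTO` as in the snippet
| above; whether to return null or a Result type depends on the
| pattern chosen for the API):
|
|     getUser(id: number): UserDTO | null {
|       try {
|         // userModel.getById throws when it can't find a user
|         return this.mapToDTO(this.userModel.getById(id));
|       } catch {
|         // surface the not-found case as null instead of an
|         // undeclared exception; a Result<UserDTO> wrapper would
|         // work here too
|         return null;
|       }
|     }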
| donatj wrote:
| See, I think this is a good idea even for reviewing non-agentic
| human-written PRs!
|
| We've got a huge _LGTM problem_ where people approve PRs they
| clearly don't understand.
|
| Recently we had a bug in some code written by an employee who had
| been laid off. The people who reviewed it are both still with the
| company, but neither of them could explain what the code did.
|
| That triggered this angry tweet
|
| https://x.com/donatj/status/1945593385902846118
| dkamm wrote:
| Could definitely be used for human PRs too! Though I'm sure
| companies would love to track the reviewer scores.
| robotsquidward wrote:
| What a fun world we devs now live in.
| rmnclmnt wrote:
| That's a fun take on a real issue, but...
|
| > Your code is only sent to the model provider (OpenAI)
|
| When has this become an acceptable « privacy » statement?
|
| I feel we are reliving the era of free mobile apps at the expense
| of harvesting any user data for ads profiling before GDPR kicked
| in...
| stronglikedan wrote:
| That's not _the_ privacy statement though. I feel like we're
| reliving the era of RTF... oh wait, we never left.
| rmnclmnt wrote:
| Ok, I'll bite: putting « only » implies this is not a big deal
| and the lesser of two evils, between an AI model provider
| harvesting prompts for retraining and a 3rd party hosting
| provider most probably only storing logs for security and
| accountability...
|
| So yes, this is the second part of the privacy statement.
| throwaway889900 wrote:
| Just submit a PR that removes the action so it doesn't run on the
| branch before the merge! If devs aren't reviewing the code
| anyways, will they even catch that kind of change?
| xmprt wrote:
| You could set up some hardcoded rules so that the PR is never
| merged without human review if it touches the GitHub Actions
| workflows.
| LikesPwsh wrote:
| You could, but it would be mad to skip the code review
| because it "only" touches customer-facing code rather than
| GHA.
| Xss3 wrote:
| I would probably be putting devs on a PIP or firing them if they
| failed these quizzes often... understanding your own PRs is the
| bare fucking minimum, even without AI help.
| inetknght wrote:
| Won't be long before those people just get AI to answer the
| quiz instead.
| LtWorf wrote:
| What makes you think the AI can instead generate the correct
| answers to double check the developer's answers?
| ElijahLynn wrote:
| This could actually be quite useful.
| hk1337 wrote:
| Cute but I wouldn't actually use it.
| henriquegodoy wrote:
| can i automate the process of answering these pr questions too?
| waynesonfire wrote:
| Nice! A quiz to ensure you understand your vibe code.
___________________________________________________________________
(page generated 2025-07-29 23:00 UTC)