[HN Gopher] Show HN: Semgrep App
___________________________________________________________________
Show HN: Semgrep App
https://semgrep.dev/products/semgrep-app

Hi! I work on Semgrep, an open-source project (discussed on HN
previously [0][1]). We're one of those companies that maintain an
OSS tool and a web app, and then monetize by selling enterprise
features on said web app. Our free web app just went through a major
revamp (sort of like a v1.0 release) so this feels like the perfect
time to share and hear what the HN crowd thinks!

Let me start with some backstory on Semgrep. Our team, r2c, has been
experimenting with various ways to help organizations step up their
application security game. One of our earliest experiments was
Bento, a wrapper around multiple existing linters to help people
configure various tools like ESLint and Bandit in one go. The
bottleneck with a tool like this was, of course, interfacing with
more and more tools. I had previously worked on a similar project
called coala[2], which got all the way up to 78 analyzers covering
54 languages, until the project ground to a halt over the
maintenance burden of all that.

One of our team members at r2c came up with a novel approach to this
problem: he suggested reusing some of his old work on Coccinelle[3]
and later Sgrep[4], which were tools to search parsed syntax trees
of various languages. Conceptually this meant that while Bento and
coala could standardize the command-line interface, the
configuration syntax, and file targeting logic of linters, now we
could also standardize the core linting logic. Extending Bento with
linting rules using this pattern language proved to be so easy that
we simply reimplemented the existing linters with it. And thus,
Semgrep was born specifically to scan code with these pattern
definitions, and there was no longer a need for Bento. Our rule
registry[5] now contains over 1,500 rule definitions in this
standardized linter rule definition language, across 20 languages.

And this leads us to our web app. Early adopters of Semgrep
encountered problems rolling out the CLI tool across their
organization. Their key needs: scanning hundreds of repos, reviewing
all their scan results, deploying custom organization-internal rules
across them, and avoiding backlash from developers during all that.
We also made the unorthodox decision to start with a ground rule
that we never ever want to have access to the source code of our
customers. These needs and rules guided our web app's feature set,
which ended up being: provisioning CI jobs on repositories,
centrally configuring which rules should block builds or notify
people, sending notifications via PR comments/Slack/email, and
displaying the list of all findings, along with some analytics.

As for today, we just launched a major release of Semgrep App, which
cuts down on the complexity that built up in our original
implementation. We also tried to expand the problem space our app
tackles all the way through to remediating issues in the web UI. You
can read more about these recent changes at
https://r2c.dev/blog/2021/semgrep-app-fall-2021-updates/

And as for the future, two main areas of interest are 1)
intelligently selecting all the right Semgrep Registry rules for a
given project and 2) creating a smooth workflow for organizations to
collaboratively maintain their own set of internal Semgrep rules.

Please check out the app we built at
https://semgrep.dev/products/semgrep-app, and let us know what you
think! I'll be hanging out in the comments as one of the engineers
who built the app, but our CEO (ievans) is also ready to answer
questions, and the rest of the team will surely be lurking here as
well.

[0]: https://news.ycombinator.com/item?id=24931985
[1]: https://news.ycombinator.com/item?id=26904951
[2]: https://github.com/coala/coala/
[3]: https://en.wikipedia.org/wiki/Coccinelle_(software)
[4]: https://github.com/facebookarchive/pfff/wiki/Sgrep
[5]: https://semgrep.dev/r/
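
To illustrate what searching parsed syntax trees (rather than raw
text) buys a linter, here is a minimal sketch using Python's
standard ast module. It is not Semgrep's pattern language or its
implementation; the find_calls helper and the SAMPLE snippet are
hypothetical names made up for this example.

```python
import ast


def find_calls(source: str, func_name: str) -> list:
    """Return the line numbers of every call to a bare function name.

    Hypothetical helper for illustration only: it walks the parsed
    syntax tree, so text inside strings and comments is never matched.
    """
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        # A call such as eval(x) is an ast.Call whose func is an ast.Name.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        ):
            lines.append(node.lineno)
    return sorted(lines)


# Made-up input: two real calls, one string literal, one comment.
SAMPLE = '''\
x = eval(user_input)
note = "eval(this) is only a string"
# eval(commented_out)
y = eval(
    other_input
)
'''

print(find_calls(SAMPLE, "eval"))  # prints [1, 4]
```

A plain text search for "eval(" would also flag lines 2 and 3 of the
sample; the tree-based search reports only the real calls on lines 1
and 4, including the one that spans several lines.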
Author : underyx
Score : 51 points
Date : 2021-10-22 16:24 UTC (6 hours ago)
| smoldesu wrote:
| Is the "Enforce Security" dialogue on your website supposed to
| overlap with the infographic? I'm browsing on Chromium/Linux
| underyx wrote:
| Could you check again? We released a fix for this exactly as
| you commented :D
| Fervicus wrote:
| Congrats on the launch! Just a heads up that the website seems to
| have some issues on Firefox. The green check marks show up over
| the copy making it unreadable.
| pedalpete wrote:
| Not just Firefox, Brave (Chromium) too
| underyx wrote:
| Thank you! We're releasing a fix for this right now.
| T3RMINATED wrote:
| sonar qube
| frellus wrote:
| Most of it written in OCaml, cool! What made you pick OCaml as
| the primary language to use for the business logic?
| underyx wrote:
| Technically, OCaml only applies to Semgrep, as the app which is
| the subject of this post uses a more neo-traditional Python &
| TypeScript stack :)
|
| I don't have full context on the parser core, but I do know
| that a major thing we've got going for OCaml is a translation
| layer we wrote for getting OCaml code generated based on tree-
| sitter grammars: https://github.com/returntocorp/ocaml-tree-
| sitter-semgrep
| padator wrote:
| Why OCaml? It's a great language for writing programs that work
| on complex data structures, e.g. ASTs. This choice was actually
| not very original: people in academia at Stanford, Berkeley, and
| Microsoft Research used OCaml for program analysis (CCured,
| Saturn, CIL, SLAM). And now the industry is also using
| it (Facebook Infer, Facebook Hack/Flow/Pyre, MS Static Device
| Verifier, etc.)
| underyx wrote:
| To add some context, padator is on the Semgrep team; he's the
| person I referenced as
|
| > One of our team members at r2c came up with a novel
| approach to this problem: he suggested reusing some of his
| old work on Coccinelle[3] and later Sgrep[4]
| underyx wrote:
| Clickable links:
|
| https://semgrep.dev/products/semgrep-app
|
| [0]: https://news.ycombinator.com/item?id=24931985
|
| [1]: https://news.ycombinator.com/item?id=26904951
|
| [2]: https://github.com/coala/coala/
|
| [3]: https://en.wikipedia.org/wiki/Coccinelle_(software)
|
| [4]: https://github.com/facebookarchive/pfff/wiki/Sgrep
|
| [5]: https://semgrep.dev/r/
| losvedir wrote:
| This looks neat, but I'm still not sure I quite get it. Do I
| understand correctly that earlier tools helped you _use_, e.g.,
| ESLint, but now it _replaces_ ESLint and does the linting itself?
| Or is it still something of an orchestrator of different
| underlying linters?
| underyx wrote:
| Semgrep replaces ESLint's security rules. We have a ruleset[0]
| which shows you how we reimplemented the eslint security
| plugin's rules with our pattern matching language. I'm not sure
| why there's a mismatch in the number of rules between the
| original and our implementation; perhaps a more eye-catching
| example is GitLab's re-implementation of Bandit's rules[1].
| GitLab used to bundle Bandit in their SAST analyzer, but they
| recently switched over to generating the same results via
| Semgrep[2], as our tool is faster and they can replace many of
| their linter integrations with it.
|
| [0]: https://semgrep.dev/p/eslint-plugin-security
|
| [1]: https://semgrep.dev/p/gitlab-bandit
|
| [2]: https://r2c.dev/blog/2021/introducing-semgrep-for-
| gitlab/#se...
| underyx wrote:
| Oh, silly me. I totally forgot that the GitLab team also
| reimplemented a set of ESLint rules in Semgrep[0], just
| like I mentioned they did with Bandit. We published an in-depth
| comparison with ESLint[1] that might clear things up even more.
|
| [0]: https://semgrep.dev/p/gitlab-eslint
|
| [1]: https://r2c.dev/blog/2021/javascript-static-analysis-
| compari...
___________________________________________________________________
(page generated 2021-10-22 23:00 UTC)