[HN Gopher] Launch HN: DeploySentinel (YC S22) - End-to-end test...
       ___________________________________________________________________
        
       Launch HN: DeploySentinel (YC S22) - End-to-end tests that don't
       flake
        
       Hi HN, Michael and Warren here - cofounders of DeploySentinel
       (https://deploysentinel.com). We make end-to-end testing easier
       and more reliable.

       At my last job, it dawned on me how many production incidents and
       unhappy customers could have been avoided with more test
       automation - "an ounce of prevention is worth a pound of cure".
       However, it wasn't clear that you can get prevention for just an
       ounce. Our teams ramped up investment in testing, especially
       end-to-end tests (spinning up a headless browser in CI and
       testing as an end user), but it quickly became clear that these
       were incredibly expensive to build and maintain, especially as
       test suite and application complexity grew. When we asked
       engineering teams at other companies, we consistently heard how
       time-intensive test maintenance was.

       The worst part of end-to-end tests is when they fail occasionally
       in CI but never locally - a heisenbug in your test code, or
       what's usually referred to as a flaky test. The conventional way
       to debug such an issue is to replay a video of your CI's test
       browser, stepping between video frames to try to parse what could
       be happening under the hood. Otherwise, your CI is just a
       complete black box.

       Anyone who finds this story familiar can probably attest to days
       spent trying to debug an issue like this, possibly losing some
       hair in the process, and "resolving" it in the end by just
       deleting the test and regaining their sanity. Some teams even try
       to put front-end monitoring tools built for production into their
       CI process, only to realize those tools aren't able to handle
       recording hundreds of test actions executed by a machine over
       just a few seconds.

       After realizing how painful debugging these tests could be, we
       started putting together a debugger that helps developers
       pinpoint issues in CI much like they would locally. Teams have
       told us there's a night-and-day difference between trying to
       debug test failures with just video, and having a tool that can
       finally tell them what's happening in their CI browser, with the
       same information they're used to having in their browser's
       devtools.

       We give you the ability to inspect DOM snapshots, network events,
       and console logs for any step taken in a Cypress test running in
       CI, to give more insight into why a particular test might be
       failing. It's like FullStory/LogRocket, but for CI failures
       instead of production bugs. (We're starting with Cypress tests,
       with plans to extend further.)
       Our tool integrates with Cypress via their plugin API, so we're
       able to plug in and record tests in CI with just an NPM install
       and 2 lines of code. From there we're able to hook into
       Cypress/Mocha events to capture everything happening within the
       test runner (ex. when a test is starting, when a command is
       fired, when an element is found, etc.), as well as open a
       debugger protocol port with the browser to listen for network and
       console events. While a test suite is running, the debugger is
       continuously collecting what's happening during a test run, and
       uploads the information (minus user-configured censored events)
       after every test completes.
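
       As a rough sketch of what that wiring looks like - the package
       and install() helper below are illustrative placeholders rather
       than our exact API, and the CDP port is just an example - it
       boils down to a Cypress plugin plus a couple of runner event
       listeners:

         // cypress/plugins/index.js - register the recorder in CI
         // (package/helper names here are hypothetical)
         const { install } = require('ci-test-recorder/plugin');

         module.exports = (on, config) => {
           // Expose a Chrome DevTools Protocol port so network and
           // console events can be captured from outside the page.
           on('before:browser:launch', (browser, launchOptions) => {
             if (browser.family === 'chromium') {
               launchOptions.args.push('--remote-debugging-port=9222');
             }
             return launchOptions;
           });
           // Hypothetical helper: forwards run lifecycle events and
           // uploads the captured data after each test completes.
           return install(on, config);
         };

         // cypress/support/index.js - runner events from Cypress's
         // documented event catalog
         Cypress.on('command:start', (command) => {
           // note which command is running, so the replay can be
           // indexed by command rather than by timestamp
         });
         Cypress.on('test:after:run', (test) => {
           // flush the DOM/network/console data captured for this test
         });

         // On the plugin side, a CDP client (e.g. chrome-remote-
         // interface) can attach to the debugging port and subscribe
         // to events like Network.requestWillBeSent and
         // Runtime.consoleAPICalled.
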
       While this may sound similar to shoving a LogRocket/FullStory
       into your test suite, there are actually quite a few differences.
       The most practical one is that those tools typically have a low
       rate limit that works well for human traffic interacting with web
       apps at human speeds, but breaks when dealing with parallelized
       test runner traffic interacting with web apps at machine speeds.
       Other, smaller differences revolve around us associating replays
       with test metadata as opposed to user metadata, having full
       access to all network requests/console messages emitted within a
       test at the browser level, and indexing playback information
       based on test commands rather than timestamps (time is an
       unreliable concept in tests!).
       Once a test fails, a GitHub PR comment is created and an engineer
       can immediately access our web app to start debugging their test
       failure. Alternatively, they can check our web dashboard as well.
       Instead of playing a video of the failure in slow motion to
       understand the issue, an engineer can step through the test
       command-by-command, inspect the DOM with their browser's
       inspect-element tool at any point, view which elements the test
       interacted with, see whether any console messages were emitted
       during the action, or take a look at every network request made
       along with HTTP error codes or browser network error messages.

       Typically with this kind of information, engineers can quickly
       find out whether they have a network-based race condition, a
       console warning emitted in their frontend, a server-side bug, or
       a test failure from an edge case triggered by randomly generated
       test data.

       We dream of a world where applications have minimal bugs and
       happy customers, and are built by engineering teams that don't
       see testing as an expensive chore! Although the first pain we're
       addressing is tests that fail in CI, we're working on a bunch of
       things beyond that, including the second biggest issue in
       testing: test runtime length.

       We have a free trial available for you to try out with your own
       tests, along with a few live demos of what our debugger looks
       like on an example test. You can get started here:
       https://deploysentinel.com/

       We're looking forward to hearing everyone else's experiences with
       end-to-end tests, and what you think of what we're doing!
        
       Author : mikeshi42
       Score  : 27 points
       Date   : 2022-08-02 15:01 UTC (8 hours ago)
        
       | acemarke wrote:
       | Obligatory alternative tool plug / comparison:
       | 
       | I work for https://replay.io , which is a true time-traveling
       | debugger for JS. We have forks of Firefox, Chrome, and Node,
       | instrumented to capture syscalls at the OS level. Those
       | recordings are uploaded to the cloud, and devs can then use our
       | web client (effectively the Firefox DevTools as an app + a bunch
       | of new features) to debug the recording at _any_ point in time,
       | including adding print statements after the fact that show values
       | every time a line was hit, step debugging, network requests,
       | React + Redux DevTools integration, DOM inspection, and more.
       | 
       | Currently, our main usage is manually recorded replays, but we're
       | actually working on similar test integration features as well. We
       | can record Playwright and Cypress tests, upload recordings of
       | test runs, show results per test in our dashboard, and let you
       | debug the full recordings of each successful and failed test
       | (code, DOM, network, console messages, errors, etc). The test
       | suite feature is in early closed beta atm - we've been
       | dogfooding it ourselves and it's _really_ helpful!
       | 
       | Based on your description + a quick glance at your home page,
       | sounds like we're addressing the same use case in similar ways,
       | with some differences in the underlying recording technology and
       | implementations
        
         | mikeshi42 wrote:
         | Great to see you guys here, I've heard a few mentions of your
         | team's e2e integration in the works for a bit!
         | 
         | We do indeed approach it from a different direction. Having
         | been built around the e2e testing use case from the start,
         | we've focused on integrating with existing CI setups without
         | swapping browsers (some teams really love the bundled
         | Electron runner!), as well as on knowing exactly what actions
         | are running in your test runner relative to the replay, in a
         | likely familiar UI (answering questions like: what DOM
         | element did Cypress match for this command? What was the DOM
         | like exactly when Cypress started running that command? Which
         | command executed before this network request?).
         | 
         | Since you're from the team - one thing I've always wondered is
         | how the pricing will work out for CI test runs? From the
         | current public pricing I've found, it looks like it works great
         | for developers recording test runs manually, but is extremely
         | cost prohibitive at any scale if you're running it continually
         | in CI.
        
           | acemarke wrote:
           | Nice, yeah, I can see the focus on "what actions" being
           | useful.
           | 
           | I'm an engineer, not GTM, so I'm not sure how the test runs
           | feature plays into the listed "X recordings per month" at
           | https://www.replay.io/pricing . Agreed that there's a
           | distinct difference between engineers/QA making manual
           | recordings, and CI cranking them out - right now we've got 52
           | E2E tests that run on every PR, often multiple times due to
           | re-pushes, and each of those tests generates a recording per
           | run. So, obviously that burns through real fast :)
           | 
           | If I had to guess we'd probably distinguish between those two
           | use cases. I've tagged our main GTM person in case he wants
           | to respond here.
        
             | mikeshi42 wrote:
             | Hahaha yup! We get a ton of volume on our platform from
             | teams churning out test runs per-commit in their CI,
             | where debuggability is most important.
             | 
             | Glad to hear from others in the space! Hope to learn more
             | if your team's GTM person jumps on :)
        
           | jasonlaster11 wrote:
           | Congrats on launching! Using Session Replay in the CI space
           | makes a lot of sense. And agreed that there's a lot that can
           | be done by hooking into Cypress's events. By the way, we're
           | doing the same and hope to show a Cypress reporter soon as
           | well.
           | 
           | Re pricing: we're still refining it, but we assume that
           | most of the time you'll only want to debug the failing
           | tests. While the recordings are fairly small, the larger
           | piece is actually replaying the browser as it ran before,
           | so you can add print statements in your application and
           | play to the point when a network request was returned or an
           | error was thrown.
        
             | mikeshi42 wrote:
             | Thanks Jason, can't wait to see what your team's been
             | building on the Cypress end soon then!
        
       | ushakov wrote:
       | how does this compare to playwright?
       | 
       | https://playwright.dev/
        
         | mikeshi42 wrote:
         | Our product currently integrates on top of existing test
         | libraries (and we're starting with Cypress).
         | 
         | If you're asking how Cypress compares with Playwright - I'd
         | say the two largest differences are that Cypress has a
         | promise-like/chain-based syntax for authoring tests whereas
         | Playwright allows for async/await syntax, and that Cypress
         | provides a pretty awesome local developer experience (imo)
         | whereas Playwright has a leg up in flexibility (multi-tab
         | support, browser support, better simulated actions like mouse
         | hovering).
         | 
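         | As a rough flavor of that syntax difference (the selectors
         | and URLs below are just made-up examples):
         | 
         |   // Cypress: chained commands, built-in retries, no await
         |   cy.visit('/login');
         |   cy.get('input[name=email]').type('me@example.com');
         |   cy.get('button[type=submit]').click();
         |   cy.contains('Welcome').should('be.visible');
         | 
         |   // Playwright (@playwright/test): explicit async/await
         |   await page.goto('/login');
         |   await page.fill('input[name=email]', 'me@example.com');
         |   await page.click('button[type=submit]');
         |   await expect(page.locator('text=Welcome')).toBeVisible();
         | 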
         | If you're asking how our tool compares to what you might get
         | with Playwright out of the box: Playwright does offer a
         | really awesome trace viewer
         | (https://playwright.dev/docs/trace-viewer), and I actually
         | had a brief chat with the original PM on Playwright a few
         | weeks ago about it. It does capture a lot of similar
         | debugging information - where our debugger differs today is a
         | focus on making these historical runs easily accessible for
         | engineers (no need to go to your CI run, unzip an artifact,
         | load it up locally, or build your own analytics dashboard for
         | trends), as well as iterating on DX improvements like letting
         | you scroll through/search all your network requests at once
         | and then jump to the point in time where a request was made,
         | which isn't possible in the trace viewer.
         | 
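         | (For reference, the trace viewer workflow is roughly: turn on
         | tracing in the Playwright config, then open the downloaded
         | trace locally - the setting below is one of the documented
         | options:)
         | 
         |   // playwright.config.js
         |   module.exports = {
         |     use: { trace: 'on-first-retry' }, // record on retry
         |   };
         | 
         |   // after a CI run, download the trace artifact and open it:
         |   //   npx playwright show-trace trace.zip
         | 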
         | In the future we're looking into providing deeper information
         | beyond just browser-level telemetry, so a few ideas we've
         | kicked around with our users include capturing Redux/React
         | state (like Redux or React dev tools do locally today), or even
         | being able to sync up backend telemetry relative to what's
         | happening in the browser. (Ex. show me the logs that were
         | printed to stdout in the app server container when my test was
         | clicking on the checkout button).
        
       | aeontech wrote:
       | Do you have plans for a self-hosted version (ie, enterprise use
       | case where third party hosted tools cannot be used)?
        
         | mikeshi42 wrote:
         | We've talked about it a few times with some of our largest
         | teams - but so far they've been happy with staying on the SaaS
         | side of things.
         | 
         | If it's absolutely a requirement at your workplace, we're happy
         | to partner with you to make it work. We've discussed this
         | internally and there's nothing inherent in our infrastructure
         | that would make it impossible to deploy within an enterprise
         | (the primary needs are the ability to run containers and have
         | an S3-compatible object store).
         | 
         | If you want to dive into details, I'm happy to email you via
         | the email in your profile, or you can give me a ping at
         | mike [at] deploysentinel.com
        
       | rsstack wrote:
       | I love the concept and I will evaluate your tool soon!
       | 
       | First comment, not on essence: the difference between your
       | starter and pro plan is $5. If people have fewer than 85k runs /
       | month, you're asking them to commit to $500/mo spend to get
       | longer retention. That's... fine, but you can probably charge
       | more for that. For people with more than 85k runs / month, you're
       | only charging them $5/mo for that retention - and making the
       | pricing page more complicated.
       | 
       | One option: Increase your base price from $40 to $45, and say
       | longer retention is only available for customers paying >$500/mo.
       | You'll get what you have today but simpler.
       | 
       | Another option: Charge more for longer retention! $10/1k-runs
       | makes sense, or even $15. If retention is the differentiation
       | between the two plans, charge for it like it's worth being an
       | upsell.
        
         | mikeshi42 wrote:
         | Thank you for the feedback! Definitely agree with you that we
         | still need to iterate on our pricing to strike a good balance.
         | We're likely going to explore the latter soon to better
         | differentiate the business plan (ex. additionally gating
         | richer analytics comes to mind, as some analytics only make
         | sense once you hit a certain scale and frequency of tests).
         | 
         | Let us know when you get a chance to play around with it as
         | well! Would love to hear what you think when you get in app.
        
           | rsstack wrote:
           | I saw that your Recorder works with Playwright, but it seems
           | that the main product is only for Cypress? Am I missing
           | separate instructions for Playwright?
           | 
           | Also, minor typo in HTML title: Cypresss -> Cypress (3 s')
        
             | mikeshi42 wrote:
             | my spidey senses were tingling last night feeling like I
             | had a typo somewhere... Maybe it's time for me to build
             | that spell checker Cypress plugin I've been thinking about.
             | Fixed, thank you!
             | 
              | Our main product is indeed only for Cypress today. I'm
              | assuming you're using Playwright? If so, I'm wondering
              | if you've had a chance to try out their trace viewer
              | feature as well (https://playwright.dev/docs/trace-viewer).
              | We're itching to build on Playwright too, but want to do
              | it once we've built out a set of features that provide a
              | several-times improvement over the existing trace
              | viewer. So I'd be curious if you've used the trace
              | viewer, and if you already see our product as having a
              | few legs up on it :)
        
               | rsstack wrote:
               | I never tried the trace viewer and maybe I should. It
               | seems more limited than what you offer, no?
        
               | mikeshi42 wrote:
                | I would indeed say the trace viewer is more limited
                | (I'm obviously biased) - especially since we not only
                | collect the telemetry but also integrate it into your
                | workflow very easily (as opposed to setting up your
                | own workflow via artifacts/S3 uploads in your CI
                | pipeline), and, the biggest difference, we aggregate
                | these builds/statistics over time so that they're
                | easily retrievable :)
               | 
               | If you're open to it - I'd love to chat more on what your
               | experience has been debugging Playwright tests and seeing
               | how we could help there! I'm at mike [at]
               | deploysentinel.com
        
       | satyrnein wrote:
        | A few questions on dependencies: do you need to be using the
        | paid Cypress Dashboard? Do you need to be using GitHub for
        | your repo?
        
         | mikeshi42 wrote:
          | Ah, we really should have something on our site that
          | explains this - no, there's no dependency on the paid
          | Cypress Dashboard. A few teams use us alongside it; most
          | others don't. We've recently added load balancing for
          | parallelism, as well as some basic analytics based on user
          | feedback, to help fill in those gaps if you're using just
          | us.
          | 
          | As for GitHub - also no. We just have a GitHub app
          | integration that comments on PRs if you do use it. But we
          | have teams using GitLab/GitLab CI as well. We print our
          | debugger links directly in your CI stdout, and can also add
          | links into your JUnit report if that's enabled (ex. for
          | Jenkins). There are also plenty of teams that just check our
          | dashboard directly for test results! We just want to make it
          | easy to access your test failures when they occur.
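          | 
          | (For reference, the JUnit output mentioned above is just the
          | standard Mocha reporter config in Cypress - the file path
          | here is only an example:)
          | 
          |   // cypress.config.js (the same keys also work in
          |   // cypress.json on older Cypress versions)
          |   const { defineConfig } = require('cypress');
          |   module.exports = defineConfig({
          |     reporter: 'junit',
          |     reporterOptions: {
          |       mochaFile: 'results/test-output-[hash].xml',
          |     },
          |   });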
        
           | satyrnein wrote:
           | Thanks, so is this a drop-in replacement for paid Cypress,
           | but better? Pricing seems similar with $6/1000 tests.
        
             | mikeshi42 wrote:
             | Yup!
             | 
              | Typically our customers feel the pain around
              | debuggability - they currently waste dev & CI cycles
              | rerunning tests, and their CI ends up either largely
              | ignored or a productivity bottleneck to pushing code
              | into main/production quickly. On top of that, they
              | usually look for some table-stakes ability to load
              | balance tests across parallelized runners, and some
              | basic reporting to get a health check on their suite now
              | and again.
              | 
              | For teams whose primary concern is in-depth analytics
              | and graphing reports - we have those on the roadmap, but
              | I can't say we're better than the Cypress Dashboard yet
              | :) We think our focus on debugging overall gives teams a
              | better chance of fixing errors, as opposed to reporting
              | on them.
              | 
              | We have a free trial if you want to take it for a spin
              | yourself! A lot of teams start off introducing us in an
              | experimental PR to try out the product, and then merge
              | into their main branch when they see the benefits. I'm
              | also happy to chat in depth on specifically where we
              | might be able to help if you already have an existing
              | Cypress setup - feel free to ping me at
              | mike [at] deploysentinel.com
        
       ___________________________________________________________________
       (page generated 2022-08-02 23:01 UTC)