[HN Gopher] Launch HN: DeploySentinel (YC S22) - End-to-end tests that don't flake
___________________________________________________________________
Launch HN: DeploySentinel (YC S22) - End-to-end tests that don't
flake
Hi HN, Michael and Warren here - cofounders of DeploySentinel
(https://deploysentinel.com). We make end-to-end testing easier and
more reliable.

At my last job, it dawned on me how many production incidents and
unhappy customers could have been avoided with more test automation
- "an ounce of prevention is worth a pound of cure". However, it
wasn't clear that you can get prevention for just an ounce. Our
teams ramped up investment in testing, especially end-to-end tests
(spinning up a headless browser in CI and testing as an end user
would), but it quickly became clear that these were incredibly
expensive to build and maintain, especially as test suite and
application complexity grow. When we asked engineering teams at
other companies, we consistently heard how time-intensive test
maintenance was.

The worst part of end-to-end tests is when they fail occasionally
in CI but never locally--a heisenbug in your test code, or what's
usually referred to as a flaky test. The conventional way to debug
such an issue is to replay a video of your CI's test browser,
stepping between video frames to try to parse what could be
happening under the hood. Otherwise, your CI is just a complete
black box. Anyone who finds this story familiar can probably attest
to days spent trying to debug an issue like this, possibly losing
some hair in the process, and "resolving" it in the end by just
deleting the test and regaining their sanity.

Some teams even try to put front-end monitoring tools built for
production into their CI process, only to realize they can't handle
recording hundreds of test actions executed by a machine over just
a few seconds.

After realizing how painful debugging these tests could be, we
started putting together a debugger that helps developers pinpoint
issues, more like how you debug issues locally. Teams have told us
there's a night-and-day difference between trying to debug test
failures with just video and having a tool that can finally tell
them what's happening in their CI browser, with the same
information they're used to having in their browser's devtools.

We give you the ability to inspect DOM snapshots, network events,
and console logs for any step taken in a Cypress test running in
CI, to give more insight into why a particular test might be
failing. It's like Fullstory/LogRocket, but for CI failures instead
of production bugs. (We're starting with Cypress tests, with plans
to extend further.)

Our tool integrates with Cypress via their plugin API, so we're
able to plug in and record tests in CI with just an NPM install and
2 lines of code. From there we're able to hook into Cypress/Mocha
events to capture everything happening within the test runner (e.g.
when a test is starting, when a command is fired, when an element
is found), as well as open a debugger protocol port with the
browser to listen for network and console events. While a test
suite is running, the debugger is constantly collecting what's
happening during the run, and uploads the information (minus
user-configured censored events) after every test completes.
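
For a sense of what that browser-side capture can look like, here's
a minimal sketch using the Chrome DevTools Protocol via the
chrome-remote-interface library. This illustrates the general
technique only - it is not our actual plugin code, and the port
choice and record() helper are assumptions for the example:

    // Sketch: capture network + console events over the Chrome
    // DevTools Protocol (illustrative, not DeploySentinel's code).
    // Assumes the test browser was launched with
    // --remote-debugging-port=9222, e.g. via Cypress's
    // before:browser:launch hook.
    import CDP from 'chrome-remote-interface';

    async function captureBrowserEvents(port = 9222) {
      const client = await CDP({ port });
      const { Network, Runtime } = client;
      await Network.enable();
      await Runtime.enable();

      // Fires for every request the page makes, at machine speed
      Network.requestWillBeSent(({ requestId, request }) => {
        record({ kind: 'network', requestId, url: request.url });
      });

      // Fires for every console.log/warn/error in the app
      Runtime.consoleAPICalled(({ type, args }) => {
        record({ kind: 'console', level: type, args });
      });
    }

    // Hypothetical helper: buffer events, then upload them after
    // each test completes (minus any censored events).
    function record(event: unknown) { /* buffer for upload */ }
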
While this may sound similar to shoving LogRocket/FullStory into
your test suite, there are actually quite a few differences. The
most practical one is that those tools typically have low rate
limits that work well for human traffic interacting with web apps
at human speeds, but break when dealing with parallelized test
runner traffic interacting with web apps at machine speeds. Other,
more minor details: we associate replays with test metadata as
opposed to user metadata, we have full access to all network
requests/console messages emitted within a test at the browser
level, and we index playback information by test command rather
than by timestamp (time is an unreliable concept in tests!).
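
To illustrate that last point, here's a rough sketch of what a
command-indexed replay event might look like (the field names are
hypothetical - the point is keying on commands instead of
wall-clock time):

    // Hypothetical event shape: playback hangs off the command
    // that was executing, not off a timestamp.
    interface CommandIndexedEvent {
      testId: string;
      commandIndex: number;  // e.g. the 3rd cy.* command in the test
      commandName: string;   // e.g. "get", "click", "type"
      kind: 'dom-snapshot' | 'network' | 'console';
      payload: unknown;
    }
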
Once a test fails, a GitHub PR comment is created and an engineer
can immediately access our web app to start debugging their test
failure. Alternatively, they can check our web dashboard as well.
Instead of playing a video of the failure in slow motion to
understand the issue, an engineer can step through the test
command-by-command, inspect the DOM at any point with their
browser's inspect-element tool, view which elements the test
interacted with, see whether any console messages were emitted
during an action, or look at every network request made, along with
HTTP error codes or browser network error messages. With this kind
of information, engineers can typically find out quickly whether
they have a network-based race condition, a console warning emitted
in their frontend, a server-side bug, or a test failure from an
edge case triggered by randomly generated test data.

We dream of a world where applications have minimal bugs and happy
customers, built by engineering teams that don't see testing as an
expensive chore! Although the first pain we're addressing is tests
that fail in CI, we're working on a bunch of things beyond that,
including the second biggest issue in testing: test runtime length.

We have a free trial available for you to try out with your own
tests, along with a few live demos of what our debugger looks like
on an example test. You can get started here:
https://deploysentinel.com/ We're looking forward to hearing
everyone else's experiences with end-to-end tests, and what you
think of what we're doing!
Author : mikeshi42
Score : 27 points
Date : 2022-08-02 15:01 UTC (8 hours ago)
| acemarke wrote:
| Obligatory alternative tool plug / comparison:
|
| I work for https://replay.io , which is a true time-traveling
| debugger for JS. We have forks of Firefox, Chrome, and Node,
| instrumented to capture syscalls at the OS level. Those
| recordings are uploaded to the cloud, and devs can then use our
| web client (effectively the Firefox DevTools as an app + a bunch
| of new features) to debug the recording at _any_ point in time,
| including adding print statements after the fact that show values
| every time a line was hit, step debugging, network requests,
| React + Redux DevTools integration, DOM inspection, and more.
|
| Currently, our main usage is manually recorded replays, but we're
| actually working on similar test integration features as well. We
| can record Playwright and Cypress tests, upload recordings of
| test runs, show results per test in our dashboard, and let you
| debug the full recordings of each successful and failed test
| (code, DOM, network, console messages, errors, etc). The test
| suite feature is early closed beta atm - we've been dogfooding it
| ourselves and it's _really_ helpful!
|
| Based on your description + a quick glance at your home page, it
| sounds like we're addressing the same use case in similar ways,
| with some differences in the underlying recording technology and
| implementations.
| mikeshi42 wrote:
| Great to see you guys here, I've heard a few mentions of your
| team's e2e integration in the works for a bit!
|
| We do indeed approach this from a different direction. Having
| been built around the e2e testing use case from the start, we've
| focused on integrating with existing CI setups without swapping
| browsers (some teams really love the bundled Electron runner!),
| as well as knowing exactly which actions are running in your
| test runner relative to the replay, in a likely familiar UI
| (answering questions like: what DOM element did Cypress match
| for this command? What was the DOM like exactly when Cypress
| started running that command? Which command executed before
| this network request?).
|
| Since you're from the team - one thing I've always wondered is
| how the pricing will work out for CI test runs. From the
| current public pricing I've found, it looks like it works great
| for developers recording test runs manually, but is extremely
| cost-prohibitive at any scale if you're running it continually
| in CI.
| acemarke wrote:
| Nice, yeah, I can see the focus on "what actions" being
| useful.
|
| I'm an engineer, not GTM, so I'm not sure how the test runs
| feature plays into the listed "X recordings per month" at
| https://www.replay.io/pricing . Agreed that there's a
| distinct difference between engineers/QA making manual
| recordings, and CI cranking them out - right now we've got 52
| E2E tests that run on every PR, often multiple times due to
| re-pushes, and each of those tests generates a recording per
| run. So, obviously that burns through real fast :)
|
| If I had to guess we'd probably distinguish between those two
| use cases. I've tagged our main GTM person in case he wants
| to respond here.
| mikeshi42 wrote:
| Hahaha yup! We get a ton of volume on our platform from teams
| churning out test runs per-commit in their CI, where
| debuggability is most important.
|
| Glad to hear from others in the space! Hope to learn more
| if your team's GTM person jumps on :)
| jasonlaster11 wrote:
| Congrats on launching! Using Session Replay in the CI space
| makes a lot of sense. And agreed that there's a lot that can
| be done by hooking into Cypress's events. By the way, we're
| doing the same and hope to show a Cypress reporter soon as
| well.
|
| Re pricing: we're still refining it, but we assume that most of
| the time you'll only want to debug the failing tests. And while
| the recordings are fairly small, the larger piece is actually
| replaying the browser as it ran before, so you can add print
| statements in your application and play to when a network
| request was returned or an error was thrown.
| mikeshi42 wrote:
| Thanks Jason - can't wait to see what your team's been working
| on for the Cypress end then!
| ushakov wrote:
| how does this compare to playwright?
|
| https://playwright.dev/
| mikeshi42 wrote:
| Our product currently integrates on top of existing test
| libraries (and we're starting with Cypress).
|
| If you're asking how Cypress compares with Playwright - I'd say
| the largest difference is that Cypress has a promise-like,
| chain-based syntax for authoring tests, whereas Playwright
| allows async/await syntax. Outside of that, Cypress provides a
| pretty awesome local developer experience (imo) whereas
| Playwright has a leg up in flexibility (multi-tab support,
| browser support, better simulated actions like mouse hovering).
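|
| For instance, the same login step in each style (illustrative
| selectors, not from a real app):
|
|     // Cypress: chained commands with automatic waiting/retries
|     cy.get('[data-test=email]').type('user@example.com');
|     cy.get('[data-test=submit]').click();
|
|     // Playwright: explicit async/await
|     await page.fill('[data-test=email]', 'user@example.com');
|     await page.click('[data-test=submit]');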
|
| If you're asking how our tool compares to what you might get
| with Playwright out of the box - Playwright does offer a really
| awesome trace viewer
| (https://playwright.dev/docs/trace-viewer), and I actually had
| a brief chat with the original PM on Playwright a few weeks ago
| about it. It does capture a lot of similar debugging
| information. Where our debugger differs today is a focus on
| making these historical runs easily accessible for engineers
| (no need to go to your CI run, unzip an artifact, load it up
| locally, or build your own analytics dashboard for trends), as
| well as iterating on DX improvements like letting you scroll
| through/search all your network requests at once and then jump
| to the point in time where a request was made, which isn't
| possible in the trace viewer.
|
| In the future we're looking into providing deeper information
| beyond just browser-level telemetry. A few ideas we've kicked
| around with our users include capturing Redux/React state (like
| the Redux or React devtools do locally today), or even syncing
| up backend telemetry with what's happening in the browser (e.g.
| show me the logs that were printed to stdout in the app server
| container while my test was clicking the check-out button).
| aeontech wrote:
| Do you have plans for a self-hosted version (i.e. the enterprise
| use case where third-party hosted tools cannot be used)?
| mikeshi42 wrote:
| We've talked about it a few times with some of our largest
| teams - but so far they've been happy with staying on the SaaS
| side of things.
|
| If it's absolutely a requirement at your workplace, we're happy
| to partner with you to make it work. We've discussed this
| internally and there's nothing inherent in our infrastructure
| that would make it impossible to deploy within an enterprise
| (the primary needs are the ability to run containers and have
| an S3-compatible object store).
|
| If you want to dive into details, I'm happy to email you via
| the address in your profile, or you can give me a ping at
| mike [at] deploysentinel.com
| rsstack wrote:
| I love the concept and I will evaluate your tool soon!
|
| First comment, not on essence: the difference between your
| starter and pro plan is $5. If people have fewer than 85k runs /
| month, you're asking them to commit to $500/mo spend to get
| longer retention. That's... fine, but you can probably charge
| more for that. For people with more than 85k runs / month, you're
| only charging them $5/mo for that retention - and making the
| pricing page more complicated.
|
| One option: Increase your base price from $40 to $45, and say
| longer retention is only available for customers paying >$500/mo.
| You'll get what you have today but simpler.
|
| Another option: Charge more for longer retention! $10/1k-runs
| makes sense, or even $15. If retention is the differentiation
| between the two plans, charge for it like it's worth being an
| upsell.
| mikeshi42 wrote:
| Thank you for the feedback! Definitely agree with you that we
| still need to iterate on our pricing to strike a good balance.
| Right now we're likely to explore the latter soon, to better
| differentiate the business plan (e.g. additionally gating
| richer analytics comes to mind, as some analytics only make
| sense once you hit a certain scale and frequency of tests).
|
| Let us know when you get a chance to play around with it as
| well! Would love to hear what you think when you get in app.
| rsstack wrote:
| I saw that your Recorder works with Playwright, but it seems
| that the main product is only for Cypress? Am I missing
| separate instructions for Playwright?
|
| Also, minor typo in HTML title: Cypresss -> Cypress (3 s')
| mikeshi42 wrote:
| my spidey senses were tingling last night feeling like I
| had a typo somewhere... Maybe it's time for me to build
| that spell checker Cypress plugin I've been thinking about.
| Fixed, thank you!
|
| Our main product is indeed only for Cypress today. I'm assuming
| you're using Playwright? If so, I'm wondering whether you've
| had a chance to try out their trace viewer feature as well
| (https://playwright.dev/docs/trace-viewer). We're itching to
| build on Playwright too, but want to do it once we've built out
| a set of features that provide a few-times improvement over the
| existing trace viewer. So I'd be curious whether you've used
| the trace viewer, and whether you already see our product as
| having a few legs up on it :)
| rsstack wrote:
| I never tried the trace viewer and maybe I should. It
| seems more limited than what you offer, no?
| mikeshi42 wrote:
| I would indeed say the trace viewer is more limited (I'm
| obviously biased) - especially since we not only collect the
| telemetry but also integrate it into your workflow very easily
| (as opposed to setting up your own workflow via artifact/S3
| uploads in your CI pipeline), and, biggest of all, we aggregate
| these builds/statistics over time so that they're easily
| retrievable :)
|
| If you're open to it - I'd love to chat more on what your
| experience has been debugging Playwright tests and seeing
| how we could help there! I'm at mike [at]
| deploysentinel.com
| satyrnein wrote:
| A few questions on dependencies: do you need to be using the paid
| Cypress dashboard? Do you need to be using GitHub for your repo?
| mikeshi42 wrote:
| Ah, we really should have something on our site that explains
| this - no, there's no dependency on the paid Cypress dashboard.
| A few teams use us alongside it; most others don't. We've
| recently added load balancing for parallelism, as well as some
| basic analytics based on user feedback, to help fill in those
| gaps if you're using just us.
|
| As for GitHub - also no. We just have a GitHub app integration
| to comment on PRs if you use it, but we have teams using
| Gitlab/Gitlab CI as well. We print our debugger links directly
| to your CI stdout, or add links into your JUnit report if
| that's enabled (e.g. for Jenkins). There are also plenty of
| teams that just check our dashboard directly for test results!
| We just want to make it easy to access your test failures when
| they occur.
| satyrnein wrote:
| Thanks, so is this a drop-in replacement for paid Cypress,
| but better? Pricing seems similar with $6/1000 tests.
| mikeshi42 wrote:
| Yup!
|
| Typically our customers feel the pain around debuggability:
| they currently waste dev & CI cycles rerunning tests, and their
| CI ends up largely ignored or becomes a productivity bottleneck
| to pushing code into main/production quickly. On top of that,
| they usually look for some table-stakes ability to load balance
| tests across parallelized runners, and some basic reporting to
| get a health check on their suite now and again.
|
| For teams whose primary concern is in-depth analytics and
| graphed reports, we have those on the roadmap, but can't say
| we're better than Cypress Dashboard yet :) But we think our
| focus on debugging overall gives teams a better chance of
| fixing errors, as opposed to just reporting on them.
|
| We have a free trial if you want to take it for a spin
| yourself! A lot of teams start off introducing us in an
| experimental PR to try out the product, then merge into their
| main branch once they see the benefits. I'm also happy to chat
| in depth about where specifically we might be able to help if
| you already have an existing Cypress setup - feel free to ping
| me at mike [at] deploysentinel.com
___________________________________________________________________
(page generated 2022-08-02 23:01 UTC)