[HN Gopher] Ask HN: What is the cheapest, easiest way to host a ...
___________________________________________________________________
Ask HN: What is the cheapest, easiest way to host a cronjob in
2022?
I thought this could start a good debate on the subject. I have
had to build a short-running web-scraping job that sends a
notification email whenever the site changes. It runs once an
hour. It is 2022, so I had thought it would be tremendously easy
and cheap, but it seems no solution is easily implemented.
Author : heywhatupboys
Score : 96 points
Date : 2022-12-19 19:56 UTC (3 hours ago)
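The job described in the post boils down to "fetch, compare with the last run, notify on change". Here is a minimal sketch of that logic using only the Python standard library. The URL, state file, mail addresses and the local SMTP server are all hypothetical placeholders rather than details from the post, and hashing the whole page body is just one simple way to detect a change.

```python
import hashlib
import smtplib
import urllib.request
from email.message import EmailMessage
from pathlib import Path

# Hypothetical placeholders, not taken from the thread.
URL = "https://example.com/page-to-watch"
STATE_FILE = Path("last_hash.txt")


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of the page body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, state_file: Path) -> bool:
    """Compare the body's digest with the stored one, then store the new digest.

    The first run (no state file yet) only records a baseline and reports
    no change.
    """
    new = fingerprint(body)
    old = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(new)
    return old is not None and old != new


def send_email(subject: str, text: str) -> None:
    """Send a plain-text mail via an SMTP server assumed to run on localhost."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "cron@example.com"
    msg["To"] = "me@example.com"
    msg.set_content(text)
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)


def main() -> None:
    """One scheduled run: fetch the page and mail only if it changed."""
    with urllib.request.urlopen(URL, timeout=30) as resp:
        body = resp.read()
    if has_changed(body, STATE_FILE):
        send_email("Page changed", f"{URL} changed since the last check.")
```

Calling `main()` at the bottom of the file turns this into a script that any of the schedulers suggested in the comments could run once an hour. On platforms without a persistent disk, the state file would have to live somewhere else.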
| Tombar wrote:
| Github actions FTW! > https://docs.github.com/en/actions/using-
| workflows/events-th...
|
| I remember seeing a couple projects shared before, using this
| technique to scrape sites with GHA
| that_guy_iain wrote:
| I came to suggest the same thing.
|
| I use their cronjob functionality to ensure my docker images
| are built daily and therefore in theory secure.
| rpastuszak wrote:
| Just note that they're not guaranteed to be called precisely on
| time, e.g. my "every 15m" CRON job will be called every 15m _at
| best_, in practice... twice per hour.
|
| This works perfectly for my case (content syndication for
| https://potato.horse), and I'm pretty happy with GH actions for
| this kind of stuff, but if you need something more precise, you
| might want to look somewhere else.
| 1f60c wrote:
| AFAIK, scheduled GitHub workflows stop running after a while.
| But when that happens, GitHub will send you an email with a big
| green "Continue running workflow" button.
| tracker1 wrote:
| I've got a couple projects like this, that mostly just create
| bundles of other source projects that I'm not involved with.
| Creating a windows installer, or docker image for projects
| that don't have that integrated. It's kind of annoying that
| they will stop in several weeks when there's no project
| changes.
| moehm wrote:
| > It's kind of annoying that they will stop in several
| weeks when there's no project changes.
|
| Can't you configure a GHA which commits nonsense every
| month?
| genericacct wrote:
| Can you not trigger an action via a local cronjob and the
| API?
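Triggering a workflow from outside, as asked above, should be possible through GitHub's REST endpoint for workflow dispatch events, provided the workflow declares the `workflow_dispatch` trigger. The sketch below only builds and sends that one request. The owner, repository, workflow file name and token are hypothetical and would come from your own setup.

```python
import json
import urllib.request


def build_dispatch_request(owner: str, repo: str, workflow_file: str,
                           token: str, ref: str = "main") -> urllib.request.Request:
    """Build the REST call that asks GitHub to start a workflow run.

    The target workflow must declare `on: workflow_dispatch`.
    """
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    return urllib.request.Request(
        url,
        data=json.dumps({"ref": ref}).encode(),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


def trigger(req: urllib.request.Request) -> int:
    """Send the request and return the HTTP status (a 2xx code on success)."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status
```

A local crontab entry could then run a small script that calls `trigger(build_dispatch_request(...))` on whatever schedule you need. Whether this also avoids the inactivity pause discussed above is not something the thread confirms.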
| WirelessGigabit wrote:
| I wonder if I can make a Github repo with an action that
| commits to another repo on a cron. Or itself?
| sauruk wrote:
| sounds like time to make a "hit the big green button" github
| action
| akerl_ wrote:
| I use crons to keep my Docker containers fresh, and have
| never hit this. But the cron commits to the repo, so I wonder
| if they're flagging repos with crons but no commits recently?
| tracker1 wrote:
| Definitely... It seems to be in 4-6 week timeframe... I'd
| thought about making the cron update references in the
| repo, but hadn't gone through that as of yet.
| rozenmd wrote:
| Easiest? Probably https://repeat.dev
| cornel_io wrote:
| If you have a GCP, AWS, or Azure account already and know
| Javascript, you should be able to whip up a serverless cron task
| real quick. IMO the simplest setup is probably GCP via Firebase
| (which makes deployment super simple once you install the CLI):
| https://firebase.google.com/docs/functions/schedule-function...
|
| You won't spend much at all, it's fractions of a penny per call
| and I think there's a free tier.
| spiffytech wrote:
| Fly.io's Machines support cron jobs. You just hand them a
| Dockerfile and a schedule, and you'll be within their free tier.
|
| https://fly.io/docs/machines/working-with-machines/#create-a...
|
| See the `schedule` parameter.
| mullen wrote:
| If your job is small enough and not run too often, AWS is
| literally free.
| mhitza wrote:
| Just use a $2.50/month instance on Vultr for your cronjob, and
| plain old sendmail for your emails (they will go in your spam
| folder but at least you don't have to waste time setting up
| emails).
| cube2222 wrote:
| EventBridge + AWS Lambda
|
| It's cheap as in free, thanks to the generous free tier.
| ghufran_syed wrote:
| isn't the free tier only for one year?
| giaour wrote:
| No, you get some free usage in perpetuity
| drpixie wrote:
| Or until they change the terms & conditions...
| Yeri wrote:
| https://github.com/dgtlmoon/changedetection.io
|
| There's a hosted solution, or you can self-host it
| emadda wrote:
| GCP Cloud Run with a schedule set to call the HTTP endpoint once
| an hour.
|
| Cloud Run will run any docker image, and give it a HTTPS URL.
| Scales to zero, allows running in the background for up to an
| hour.
|
| The benefit over FAAS is that you can run anything you want in
| the container, including multiple processes.
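For the "scheduler calls an HTTP endpoint" pattern described above, the container only needs a tiny web server that runs the job when it is hit. This is a rough sketch using Python's built-in `http.server`. `run_job` is a hypothetical stand-in for the real work, and the only platform-specific assumption is that Cloud Run tells the container which port to listen on through the `PORT` environment variable.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_job() -> str:
    """Placeholder for the real work (scrape, compare, notify)."""
    return "job finished"


class JobHandler(BaseHTTPRequestHandler):
    def _handle(self) -> None:
        # Run the job synchronously, then report the result to the caller.
        body = run_job().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    # Accept either verb so the scheduler can be configured with GET or POST.
    do_GET = _handle
    do_POST = _handle

    def log_message(self, fmt, *args):
        """Silence the default per-request logging."""


def make_server(port: int) -> HTTPServer:
    """Create a server bound to all interfaces on the given port."""
    return HTTPServer(("0.0.0.0", port), JobHandler)


def serve() -> None:
    """Listen on the port passed in via PORT, falling back to 8080."""
    make_server(int(os.environ.get("PORT", "8080"))).serve_forever()
```

The container's entry point would call `serve()`. Because the response is only sent once `run_job` returns, the caller's timeout has to be longer than the job.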
| maayank wrote:
| What do you guys use that for?
| brycewray wrote:
| https://cron-job.org
| martythemaniak wrote:
| Free tier on Google Cloud. You can use the free VM instance they
| provide (1 micro per month) and do an actual cron or use App
| Engine's cron thing.
| naet wrote:
| I set (and forgot) a Twitter bot a while ago that generates a
| randomized image and posts it, via GCP with a cron trigger twice a
| day. It hasn't cost me anything.
| dotluis wrote:
| www.deta.sh
| anyfactor wrote:
| cron-job.org is pretty good.
|
| I built a cron job utility service with Pipedream workflows
| because I needed some additional features like sending email
| report and hooking up to a cronjob monitor like cronitor or
| healthchecks.
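Hooking a job up to a monitor like the ones mentioned above usually means pinging a URL after each run. The sketch below assumes Healthchecks-style ping URLs, where a plain GET signals success and appending `/fail` signals failure. The wrapper name and the injectable `send` parameter are illustrative choices, not part of any service's client library.

```python
import urllib.request
from typing import Callable


def ping(url: str) -> None:
    """Best-effort GET; a monitoring outage must never crash the job."""
    try:
        urllib.request.urlopen(url, timeout=10).close()
    except OSError:
        pass


def run_monitored(job: Callable[[], None], ping_url: str,
                  send: Callable[[str], None] = ping) -> bool:
    """Run the job, then report success or failure to the monitor."""
    try:
        job()
    except Exception:
        send(ping_url + "/fail")
        return False
    send(ping_url)
    return True
```

If the monitor stops receiving success pings on the expected schedule, it can alert you, which also covers the case where the scheduler itself silently stops.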
| savrajsingh wrote:
| Google App Engine, Google Cloud Tasks, free
| ectospheno wrote:
| Cheap vps running urlwatch sending alerts with pushover. Turn on
| unattended upgrades for the server. The one time pushover fee is
| well worth it.
| m463 wrote:
| I've never known about urlwatch, I always rolled my own in
| python (and then copy/pasted for the next task)
|
| And there it is, an apt-get away.
|
| Thanks!
| theNailz wrote:
| +1 for urlwatch. I run it on my pc with Docker (restart=unless-
| stopped), and Discord integration on a personal Discord channel
| with alerts enabled. Takes two minutes to set up and is online
| whenever I'm online.
| mtrunkat wrote:
| Check out https://apify.com (disclaimer: I work there :)) - a
| platform focused on web automation and web scraping. You can
| easily build any (but mainly NodeJS or Python) project there as a
| Docker container and schedule it in a minute. See my 2 min. video
| - https://share.cleanshot.com/osSygJ
|
| Priced: $0.25/GB-hour + data transfer and storage
| andrewmcwatters wrote:
| https://visualping.io
| newbieuser wrote:
| apache airflow and a $5 vps might be the cleanest solution
| jrib wrote:
| All free:
|
| * free-tier vps on GCP or Oracle Cloud
|
| * lambda job on AWS
|
| I have a cheap VPS I use for other things and just run my cron
| jobs there.
| Eleison23 wrote:
| I don't know about the others, but AWS is not free and it's not
| cheap. There are ways to set alerts on budget overruns, but
| there is no foolproof way to limit monthly spending. Eventually
| you will configure something that goes out of control. I was also
| able to conceive of a few scenarios where a malicious third
| party could rack up charges for, e.g., S3 egress. Good
| luck contesting the charges more than once. I shut everything
| down and pulled out of AWS. I believe that Azure's free trial
| featured a hard cap, but still requires a credit card to open
| your account.
| akerl_ wrote:
| The Lambda free tier on AWS is forever, and if you're kicking
| it off via CloudWatch rather than an API Gateway, it shouldn't
| be possible for it to overrun.
| JamesSwift wrote:
| Lambda on AWS is basically free for personal crons. I ran
| lots on there until I moved everything to a personal k8s
| cluster for some more granular control over scheduling.
| skelpmargyar wrote:
| The free tier on Oracle Cloud is nuts: 200GB block storage,
| 10GB object storage, a 4-core ARM VM with 24GB RAM, and two x86
| VMs with 1GB RAM each. From my experience, it's also really hard
| to get charged, since you have to upgrade your account from the
| free tier. I haven't been charged one cent in several years.
| moltar wrote:
| AWS
|
| Lambda
|
| EventBridge rule
| tonymet wrote:
| Can you elaborate on the scope? How many sites and unique pages?
| warrenm wrote:
| Sounds like a task for a cheap VPS
| oracle2025 wrote:
| Sometimes you get cron jobs with a cheap webhost, so you would
| not even need to maintain a cheap VPS
| goodpoint wrote:
| Use a systemd timer instead. On a SBC at home or a free VPS.
| sergiotapia wrote:
| Try https://render.com
|
| The UI is excellent for this. You can probably find cheaper on
| Fly but probably not easier.
|
| Even easier, maybe more expensive than Render:
|
| https://www.zeplo.io/docs/schedule
|
| You just hit a URL like so and it's done.
| zeplo.to/your_url.com?_cron=5|14|*|*|*
| disambiguation wrote:
| cheapest would have to be running it on the computer you made
| this post from .. assuming it's always on and the delta to your
| power bill is negligible. as others have suggested if you have an
| RPi or some other low power SBC laying around, that would be
| ideal.
|
| i'm sure the major cloud providers have cheap / free tiers for
| this kind of work, but quite frankly i've been burned by runaway
| pricing tiers too many times to ever consider using cloud again
| for a personal project .. so unless this is getting funded by a
| client try to host your own.
| iforgotpassword wrote:
| Raspberry Pi at home if you don't have a home server already. Or
| some low spec cheap vps.
| dharmab wrote:
| A use case I've needed cloud services for is web scraping.
| Sites which IP ban web scrapers will still allow scraping from
| major cloud provider IP space.
| Arainach wrote:
| How are sites detecting and banning once-an-hour users? Yes,
| if you want to spider an entire website continuously, you
| could get IP banned, but if you're making 24 (or even 96)
| requests a day to visit one site and check some data, your
| traffic is indistinguishable from baseline.
| iforgotpassword wrote:
| I've seen bigger sites block requests from ip ranges that
| belong to hosting providers. Don't know about cloud
| providers, but doing it from home with a residential
| address has the highest chances of success. You might still
| end up in the cloudflare verification page though
| sometimes.
| gwn7 wrote:
| Yes yes yes. I came to post the exact same thing then saw this
| =)
| MuffinFlavored wrote:
| Eh.
|
| I tried to do Chromium/Puppeteer based scraping this way.
|
| Building a Dockerfile took ages due to the low compute. (Rust
| was a non-starter).
|
| I also had (foolishly) only bought the Pi with 2GB instead of
| 8GB so RAM was an issue.
|
| Disk was super slow.
|
| I'm not sure how viable this is, especially with how hard it
| is currently to source a Pi, let alone its computation/memory
| constraints.
| marcosdumay wrote:
| That's really deviating from the nature of the "cheapest,
| easiest way to host a cronjob" question. If the OP has that
| kind of requirement, he won't get good answers.
| TheDong wrote:
| > Building a Dockerfile took ages due to the low compute
|
| Why compile anything on the raspberry pi? Cross-compile on
| a machine with more compute (like your laptop, desktop,
| phone, or ec2 instance) one time, and then transfer the
| compiled binaries or built docker image over.
|
| > Pi with 2GB instead of 8GB
|
| For headless chrome, that should be enough unless you're
| doing other stuff with it. Unless you mean for compiling
| stuff, which as before can be done elsewhere.
| marginalia_nu wrote:
| You really don't need docker for this.
| Arainach wrote:
| It's true that you don't, but I can see the advantages.
|
| I have a Raspberry Pi that is natively running a scraper
| using headless Chromium and cron. It works great,
| except....
|
| I ended up needing a virtual framebuffer. I got it
| working on the Raspberry Pi, but I got a new workstation
| and wanted to edit my script and test it there. I got
| cryptic errors that I needed to debug to understand they
| were framebuffer issues, then attempt to recreate the
| setup that's running on my Pi, then debug that.....
|
| My first mistake was not writing down what I did in my
| README, but a Docker image would have saved me a ton of
| time here.
| SeriousM wrote:
| You can use the "browserless" docker service which contains
| a headless chrome browser in a docker container. It also
| supports puppeteer and playwright connect api. Works
| flawlessly! I use it in combination with n8n. All on a
| raspberry pi4b (yes, I got one recently)
| disiplus wrote:
| I run it on my NAS at home. One of the reasons is that it
| does not have a public IP belonging to one of the providers
| that are blocked by my scraping target.
| curiousgal wrote:
| Since we are on the topic, what's the best way to schedule
| _production grade_ tasks on a Windows server? I joined a new
| company and they are using Task Scheduler and it is just awful!
| cerved wrote:
| step 1. install Linux
| asim wrote:
| https://m3o.com/cron
| johnmoberg wrote:
| You should check out Modal! Does much more than just cronjobs,
| but super easy and cheap: https://modal.com/docs/guide/cron. They
| have an example with a simple web scraper as well:
| https://modal.com/docs/guide/web-scraper
|
| From their pricing example:
|
| > You schedule a job to run once an hour for 15 seconds. This job
| uses 50% of one CPU core and 256 MiB of memory. This job will
| cost $0.016/day for the CPU, and $0.001/day for the memory,
| adding up to $0.017/day, or $0.51/month.
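The arithmetic in the quoted pricing example can be checked in a few lines. The per-day figures are copied from the quote. The 30-day month is an assumption, chosen because it reproduces the quoted monthly total.

```python
RUNS_PER_DAY = 24            # once an hour
SECONDS_PER_RUN = 15
CPU_COST_PER_DAY = 0.016     # USD, from the quoted example
MEMORY_COST_PER_DAY = 0.001  # USD, from the quoted example
DAYS_PER_MONTH = 30          # assumption

# Total time the job is actually running each day.
busy_seconds_per_day = RUNS_PER_DAY * SECONDS_PER_RUN
daily = CPU_COST_PER_DAY + MEMORY_COST_PER_DAY
monthly = daily * DAYS_PER_MONTH

print(f"{busy_seconds_per_day} s/day, ${daily:.3f}/day, ${monthly:.2f}/month")
# prints: 360 s/day, $0.017/day, $0.51/month
```

So the job is billed for about six minutes of compute per day, which is why an hourly 15-second task lands in the cents-per-month range.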
| yawnxyz wrote:
| Ooh and apparently everyone gets $30/mo of credits as a
| baseline, so this is within the free tier (if you don't want to
| pay 50 cents per month...)
| AndrewDucker wrote:
| I have a PowerShell script that collects some RSS and posts it to
| my journal.
|
| Easiest way I found to do that was Azure Functions. Costs me
| about 35p per month. Mostly for storage for logs as far as I can
| tell. I'd sort that, but it's literally not worth the time it
| would take
| tcmb wrote:
| Sounds like a job for serverless functions, e.g. in Azure [1],
| though I'm sure every hyperscaler has that. The Azure version
| even uses Cron syntax for the scheduling.
|
| [1] https://learn.microsoft.com/en-us/azure/azure-
| functions/func...
| bilal4hmed wrote:
| I came to post about this; it's super simple too.
| darkotic wrote:
| Maybe something like uptimerobot with a webhook.
| blueflow wrote:
| You could configure crond to invoke it for you. Debian already
| has a cron.hourly facility for that rhythm.
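A script dropped into `/etc/cron.hourly` could look roughly like the sketch below. The lock file is an addition of mine rather than something the comment mentions: it simply skips a run if the previous one is still going. The lock path is hypothetical, and `fcntl` is Unix-only. Note that Debian's `run-parts` expects such scripts to be executable and to have file names without dots, so the file would be installed without a `.py` suffix.

```python
#!/usr/bin/env python3
import fcntl
from typing import IO, Optional

LOCK_PATH = "/tmp/scrape-job.lock"  # hypothetical path


def acquire_lock(path: str) -> Optional[IO[str]]:
    """Return an open, exclusively locked file, or None if another run holds it."""
    handle = open(path, "w")
    try:
        # Non-blocking: fail immediately instead of waiting for the other run.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def main() -> int:
    """One hourly run; returns the process exit code."""
    lock = acquire_lock(LOCK_PATH)
    if lock is None:
        return 0  # the previous run is still active, so skip this hour
    try:
        pass  # the real job goes here
    finally:
        lock.close()  # closing the file releases the lock
    return 0
```

The script would end with `raise SystemExit(main())` so that cron sees the exit code.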
| ignoramous wrote:
| Schedule a cron job:
| https://developers.cloudflare.com/workers/platform/triggers/...
|
| Send an email (for free): https://blog.cloudflare.com/sending-
| email-from-workers-with-...
| giaour wrote:
| AWS Lambda and Azure Functions both support timer triggers and
| probably round down to $0/ month if your job only runs for a few
| seconds an hour
| shadeslayer_ wrote:
| I exclusively use Lambda with Cloudwatch Rules for all the
| crons I write these days. It just works out of the box.
| Thristle wrote:
| Any of the managed serverless options with a free tier.
| Cloudflare has a free tier too, but it looks very limited
| compared to AWS/GCP/Azure.
| holeyness wrote:
| AWS Lambda
| malfist wrote:
| AWS Lambdas could be used for this; you can trigger them on a
| time schedule using EventBridge.
| qbasic_forever wrote:
| Their Python library, Chalice, made it really easy to write
| cron-style Lambdas in Python, in my experience.
| https://github.com/aws/chalice
| dylan604 wrote:
| what happens if the processing takes longer than 30s?
| RegnisGnaw wrote:
| The limit is now 30x that :)
| leetrout wrote:
| Lambdas can run for 15 minutes
| belligeront wrote:
| Citation here https://docs.aws.amazon.com/lambda/latest/dg/
| gettingstarted-...
| haolez wrote:
| Any serverless FaaS offering should have a Timer trigger, which
| is essentially a cronjob.
| tekno45 wrote:
| How short?
|
| How much local compute do you have?
|
| I think people underestimate dynamic DNS or a static IP if you
| can get one for your home for cheap.
| arpanarpan wrote:
| Hey, this is still a WIP, but I'm building
| http://choreography.cloud/. We wrap an open-source workflow
| orchestrator Temporal that offers "durable execution" with very
| detailed execution tracing and a bunch of recovery mechanisms. We
| offer a fully-managed Temporal cluster, hosted runtime (so you
| don't need to worry about provisioning/scaling workers), and
| offer version control integrations, CI/CD and dynamic config out
| of the box - so you can go from code to running workflows with no
| infrastructure overhead. Think "lambda for workflows". Feel free
| to get on our waitlist!
|
| Here's how a cron job implementation would look with Temporal -
|
| `
| func Subscription(ctx choreography.Context, userUUID string) error {
|     free_trial_on := true
|     for {
|         if free_trial_on {
|             // Sleep until trial period passes
|             choreography.Sleep(ctx, days(15))
|             free_trial_on = false
|         } else {
|             // Charge subscription fee
|             choreography.ExecuteActivity(ctx,
|                 ChargeSubscriptionAndEmailReceipt, userUUID, frequency).Get(ctx, nil)
|             // Sleep until next payment is due
|             choreography.Sleep(ctx, days(30))
|         }
|     }
| }
| `
| ohadpr wrote:
| I recently utilized Netlify for this. My build was actually my
| scraper and I used their Scheduled Functions (cron) to trigger
| this build every so often.
|
| https://www.netlify.com/blog/how-to-schedule-deploys-with-ne...
| ElevenLathe wrote:
| If this question came up at work, I would recommend a Lambda
| function that sends email via SES and is triggered periodically
| by a CloudWatch event. It would be cheap and require basically no
| admin overhead once set up.
|
| That said, there are probably different forces at play if this is
| a personal infrastructure question: while cheap is good, if you
| aren't fluent in AWS it may be a slog to set up. If you're not,
| easiest thing is probably a real cron job run on a cheap VPS.
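The Lambda variant described above might look something like this sketch. `check_site` is a hypothetical placeholder for the scraping logic, the mail addresses are made up (SES requires verified identities), and the extra `check`/`notify` parameters exist only so the handler can be exercised without AWS. The schedule itself would be a rule with an expression such as `rate(1 hour)`.

```python
from typing import Callable, Optional


def send_via_ses(subject: str, body: str) -> None:
    """Send a plain-text mail through Amazon SES (placeholder addresses)."""
    import boto3  # bundled with the AWS Lambda Python runtime

    boto3.client("ses").send_email(
        Source="cron@example.com",
        Destination={"ToAddresses": ["me@example.com"]},
        Message={
            "Subject": {"Data": subject},
            "Body": {"Text": {"Data": body}},
        },
    )


def check_site() -> Optional[str]:
    """Placeholder: describe the change, or return None if nothing changed."""
    return None


def lambda_handler(event, context,
                   check: Callable[[], Optional[str]] = check_site,
                   notify: Callable[[str, str], None] = send_via_ses):
    """Entry point invoked by the scheduled rule."""
    change = check()
    if change is not None:
        notify("Site changed", change)
    # Scheduled events carry the trigger timestamp under the "time" key.
    return {"changed": change is not None, "triggered_at": event.get("time")}
```

Lambda only ever passes `event` and `context`, so the defaults are what run in production. Since each invocation starts without local state, remembering the previous page content would need external storage.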
| UI_at_80x24 wrote:
| Can't you host this on hardware you are running in your house?
| areichert wrote:
| Aren't firebase functions practically free?
| codegeek wrote:
| You can check out https://cronhub.io (I run it). Behind the
| scenes, it wraps a lambda function with EventBridge (so you can
| set schedule) and adds other goodies like notifications/alert
| (slack/email etc), metrics, notifications if job runs slow etc.
|
| Not free, but basically you wrap your job in an HTTP endpoint and
| then we take care of the rest.
| ThePhysicist wrote:
| I think OP doesn't want to keep a machine running permanently.
| If you do that, adding your own crontab entry would be trivial
| as well.
| codegeek wrote:
| Possibly. We do have paying customers who are using it so
| that they have better tracking, notifications and guarantee
| that the job is running (with metrics etc). But I hear you.
| For some devs, it may not be critical. I do have plans to
| allow adding your own code directly so that you don't have to
| come up with an HTTP endpoint.
| davnicwil wrote:
| I suppose with this solution you could also run one of those
| free tier machines that spins down when not in use, or a
| lambda function, etc
| latchkey wrote:
| Your own browser.
|
| https://chrome.google.com/webstore/detail/distill-web-monito...
| AtNightWeCode wrote:
| I would suggest anything serverless in most cases. Cheapest is
| not a great measure. Azure and GCP have good options. Cloudflare
| too if you just want to do something simple.
| s1k3s wrote:
| > It is 2022, so I had thought it would be tremendously easy and
| cheap, but it seems no solution is easily implemented.
|
| You assume someone purchased hardware (preferably available
| anywhere on the planet), powered it up, set it up, connected it
| to the internet, then built the software to handle your very
| specific task, then put it online to do specifically what you
| want and for a close-to-free price? And you're shocked this
| doesn't exist?
| bennyfreshness wrote:
| Google Apps Script
|
| https://developers.google.com/apps-script
|
| It's free, so pretty cheap.
|
| You can set up a schedule to run the scripts. Has easy access to
| Google APIs (Gmail).
|
| Very powerful and simple solution I've used for years.
___________________________________________________________________
(page generated 2022-12-19 23:00 UTC)