Subj : Hub 3 Update
To   : Al
From : deon
Date : Fri Apr 12 2024 10:45 pm

  Re: Hub 3 Update
  By: Al to deon on Fri Apr 12 2024 05:00 am

Hey Al,

 > I have noticed here in the last month or two that the net 3 hub gets put on
 > hold because it doesn't answer, or because there is some failure.
 >
 > I clear those holds periodically and things flow as expected until the next
 > hold comes along.
 >
 > I just cleared the holds on net 3 a hour ago or so.

OK, there are probably a couple of reasons for this:

* There is major construction going on nearby, and they are constantly taking
  my internet down for "maintenance" - and it's prolonged (usually around
  10hrs). (They are rebuilding the rail line near me, and it's an 18-24 mth
  project while they move it above ground.) So I imagine this long outage is
  probably a primary reason. (I have a hotspot, which gets traffic when my
  main cable goes down, so mail still flows, but only outbound from me.)

* I've taken the hub down for updates.

* My nightly backup "pauses" the container and backs up the hub, but that
  should only take a few mins. It might be happening while there is a
  session active, though.

* My IPv4 link goes down (IPv6 is much more reliable...)

Tonight I stopped the hub from accepting inbound calls while I cleared the
backlog - it made it easier for me to trace a problem in the logs - which is
when I noticed the kernel killing the db... ;)

How many failed attempts (and how much time) before your system puts me on
hold?

 > Sometimes when I watch mailer sessions with hub 3 the session is very slow.
 > This could also be the cause of failures. I don't know why the session
 > progresses slowly. A lack of memory perhaps?

Slow as in there is a delay before there are transfers? binkp by default has
a 5 min timeout, so hopefully it's not so slow that it times out?

Outbound mail bundles are built on the fly, and the DB has a lot of mail in
it (I've never deleted anything...), but it should be seconds before mail
packets are ready, not minutes...
I just looked in the logs for a session tonight, and it looks like 2-5s:

[2024-04-12 22:05:02] production.INFO: PB-:- We have authed these AKAs [21:4/106.0@fsxnet] {"pid":268}
[2024-04-12 22:05:04] production.INFO: MA-:= Got [1] echomails for [21:4/106.0@fsxnet] for sending {"pid":268}
[2024-04-12 22:05:07] production.INFO: IS-:- Sending item [0] (118c0100.pkt) {"pid":268}
[2024-04-12 22:05:07] production.INFO: PB-:= Packet/File [118c0100.pkt], type [4] sent. {"pid":268}

That said, I've noticed the website is slowing down, so I may need to think
about better DB indexes and/or deleting some mail.

To be honest, I'm surprised that memory is the issue - docker stats show it
using < 200MB of the 512MB that I had assigned to the DB, yet the kernel was
killing it (oom-killer). I've doubled it just in case, but I'll need to keep
an eye on it.

....лоеп

--- SBBSecho 3.20-Linux
 * Origin: I'm playing with ANSI+videotex - wanna play too? (21:2/116)
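
For anyone wanting to check the same delay on their own sessions, here is a minimal sketch that reads the bracketed timestamps from the four log lines quoted above and reports how many seconds after authentication each step happened. The function name seconds_since_first and the LOG constant are made up for this illustration; it assumes only that every line starts with a "[YYYY-MM-DD HH:MM:SS]" stamp, as in the excerpt.

```python
import re
from datetime import datetime

# The four log lines quoted in the message, unchanged.
LOG = """\
[2024-04-12 22:05:02] production.INFO: PB-:- We have authed these AKAs [21:4/106.0@fsxnet] {"pid":268}
[2024-04-12 22:05:04] production.INFO: MA-:= Got [1] echomails for [21:4/106.0@fsxnet] for sending {"pid":268}
[2024-04-12 22:05:07] production.INFO: IS-:- Sending item [0] (118c0100.pkt) {"pid":268}
[2024-04-12 22:05:07] production.INFO: PB-:= Packet/File [118c0100.pkt], type [4] sent. {"pid":268}
"""

# Matches the leading "[YYYY-MM-DD HH:MM:SS]" stamp on each line.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def seconds_since_first(log_text):
    """Return, for each stamped line, whole seconds elapsed since the first one."""
    times = []
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return [int((t - times[0]).total_seconds()) for t in times]


print(seconds_since_first(LOG))  # [0, 2, 5, 5]
```

On this excerpt the echomail lookup completes 2s after auth and the packet is sent at 5s, which matches the 2-5s estimate and is well inside the 5 min binkp timeout mentioned earlier.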