https://old.reddit.com/r/rust/comments/1dpvm0j/120ms_to_30ms_python_to_rust/

r/rust | this post was submitted on 27 Jun 2024 | 153 points (87% upvoted)
120ms to 30ms: Python to Rust (self.rust)
submitted 6 hours ago by stephenlblum

We love to see performance numbers; they are a core objective for us. We are excited to share another milestone in our ongoing effort: a 4x reduction in write latency for our data pipeline, bringing it down from 120ms to 30ms. This improvement is the result of transitioning from a C library accessed through a Python application to a fully Rust-based implementation. This is a light introduction to our architectural changes, the real-world results, and the impact on system performance and user experience. Chart A and Chart B are shown in the image above.

So Why Did We Switch from Python to Rust?

Our Data Pipeline Is Used by All Services

Our data pipeline is the backbone of our real-time communication platform.
Our team is responsible for copying event data from all our APIs to all our internal systems and services: data processing, event storage and indexing, connectivity status, and lots more. Our primary goal is to ensure up-to-the-moment accuracy and reliability for real-time communication.

Before our migration, the old pipeline used a C library accessed through a Python service, which buffered and bundled data. This buffering was the critical aspect causing our latency. We wanted to optimize, and we knew it was achievable. We explored a transition to Rust, as its performance, memory safety, and concurrency capabilities have benefited us before. It was time to do it again!

We Highly Value Rust's Advantages: Performance and Asynchronous IO

Rust excels in performance-intensive environments, especially when combined with asynchronous IO libraries like Tokio. Tokio provides a multithreaded, non-blocking runtime for writing asynchronous applications in Rust. The move to Rust allowed us to leverage these capabilities fully, enabling high throughput and low latency, all with compile-time memory and concurrency safety.

Memory and Concurrency Safety

Rust's ownership model provides compile-time guarantees for memory and concurrency safety, which preempts the most common issues such as data races, memory leaks, and invalid memory access. This is advantageous for us: going forward, we can confidently manage the lifecycle of the codebase, allowing ruthless refactoring if needed later. And there's always a "needed later" situation.

Technical Implementation: Architectural Changes and Service-to-Service Messaging with MPSC and Tokio

The previous architecture relied on a service-to-service message-passing system that introduced considerable overhead and latency. A Python service used a C library for buffering and bundling data, and when messages were exchanged among multiple services, delays occurred, escalating the system's complexity.
The buffering mechanism within the C library acted as a substantial bottleneck, resulting in an end-to-end latency of roughly 120 milliseconds. We thought this was optimal because our per-event latency average was 40 microseconds. While this looked good from the old Python service's perspective, downstream systems took a hit at unbundle time, which made overall latency higher.

Chart B above shows that after we deployed, the average per-event latency increased from the original 40 microseconds to 100. This seems non-optimal: Chart B should show reduced latency, not an increase! But when we step back and look at the reason, we can see how this happens. The good news is that downstream services can now consume events more quickly, one by one, without needing to unbundle, so the overall end-to-end latency could dramatically improve from 120ms to 30ms. The new Rust application can fire off events instantly and concurrently. This approach was not possible with our Python service, since it would also have required a rewrite to use a different concurrency model. We probably could have rewritten it in Python, but if it's going to be a rewrite, we might as well make the best rewrite we can, with Rust!

Resource Reduction: CPU and Memory

Our Python service would consume upwards of 60% of a core. The new Rust service consumes less than 5% across multiple cores. The memory reduction was dramatic as well, with Rust operating at about 200MB versus Python's gigabytes of RAM.

New Rust-based Architecture

The new architecture leverages Rust's powerful concurrency mechanisms and asynchronous IO capabilities. Service-to-service message passing was replaced with multiple instances of Multi-Producer, Single-Consumer (MPSC) channels. Tokio is built for efficient asynchronous operations, which reduces blocking and increases throughput. Our data processing was streamlined by eliminating the intermediary buffering stages and opting instead for concurrency and parallelism.
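As a rough, self-contained illustration of why firing events concurrently instead of sequentially cuts end-to-end latency, here is a sketch (not our production code). It uses OS threads from the standard library, and the 10ms sleep is a hypothetical stand-in for a network write; the production service uses Tokio tasks rather than threads:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical stand-in for an IO-bound operation such as a network write.
fn send_event(_id: u32) {
    thread::sleep(Duration::from_millis(10));
}

fn main() {
    let ids: Vec<u32> = (0..8).collect();

    // Sequential sends: total time is roughly 8 * 10ms.
    let start = Instant::now();
    for &id in &ids {
        send_event(id);
    }
    let sequential = start.elapsed();

    // Concurrent sends: the waits overlap, so total time is roughly 10ms.
    let start = Instant::now();
    let handles: Vec<_> = ids
        .iter()
        .map(|&id| thread::spawn(move || send_event(id)))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let concurrent = start.elapsed();

    println!("sequential: {sequential:?}, concurrent: {concurrent:?}");
    assert!(concurrent < sequential);
}
```

The same overlap is what an async runtime gives you, but with tasks that are far cheaper than OS threads, which matters at production event rates.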
These changes improved performance and efficiency.

Example Rust App

The code isn't a direct copy; it's a stand-in sample that mimics what our production code does. The code also shows only one MPSC channel, where our production system uses many.

1. Cargo.toml: We need to include dependencies for Tokio and any other crates we might be using (like async-channel for events).
2. Event definition: The Event type is used in the code, but we have many types not shown in this example.
3. Event stream: event_stream is referenced but not created the way we do it with many streams. That depends on your approach, so the example keeps things simple.

The following is a Rust example with code and a Cargo.toml file, plus event definitions and event stream initialization.

Cargo.toml

    [package]
    name = "tokio_mpsc_example"
    version = "0.1.0"
    edition = "2021"

    [dependencies]
    tokio = { version = "1", features = ["full"] }

main.rs

    use tokio::sync::mpsc;
    use tokio::task::spawn;
    use tokio::time::{sleep, Duration};

    // Define the Event type
    #[derive(Debug)]
    struct Event {
        id: u32,
        data: String,
    }

    // Function to handle each event
    async fn handle_event(event: Event) {
        println!("Processing event: {:?}", event);
        // Simulate processing time
        sleep(Duration::from_millis(200)).await;
    }

    // Function to process data received by the receiver
    async fn process_data(mut rx: mpsc::Receiver<Event>) {
        while let Some(event) = rx.recv().await {
            handle_event(event).await;
        }
    }

    #[tokio::main]
    async fn main() {
        // Create the channel with a buffer size of 100
        let (tx, rx) = mpsc::channel(100);

        // Spawn a task to process the received data
        spawn(process_data(rx));

        // Simulate an event stream with dummy data for demonstration
        let event_stream = vec![
            Event { id: 1, data: "Event 1".to_string() },
            Event { id: 2, data: "Event 2".to_string() },
            Event { id: 3, data: "Event 3".to_string() },
        ];

        // Send events through the channel
        for event in event_stream {
            if tx.send(event).await.is_err() {
                eprintln!("Receiver dropped");
            }
        }
    }

Rust Sample Files

1. Cargo.toml:
   + Specifies the package name, version, and edition.
   + Includes the necessary tokio dependency with the "full" feature set.
2. main.rs:
   + Defines an Event struct.
   + Implements the handle_event function to process each event.
   + Implements the process_data function to receive and process events from the channel.
   + Creates an event_stream with dummy data for demonstration purposes.
   + Uses the Tokio runtime to spawn a task for processing events and sends events through the channel in the main function.

Benchmark Tools Used for Testing

To validate our performance improvements, extensive benchmarks were conducted in development and staging environments. Tools such as hyperfine (https://github.com/sharkdp/hyperfine) and criterion.rs (https://crates.io/crates/criterion) were used to gather latency and throughput metrics. Various scenarios were simulated to emulate production-like loads, including peak traffic periods and edge cases.

Production Validation

To assess real-world performance in the production environment, continuous monitoring was implemented using Grafana and Prometheus. This setup allowed us to track key metrics such as write latency, throughput, and resource utilization. Alerts and dashboards were configured to promptly identify any deviations or bottlenecks in the system's performance, so potential issues could be addressed quickly. We of course deployed carefully, to a low percentage of traffic over several weeks. The charts you see are from the full deploy after our validation phase.

Benchmarks Are Not Enough

Load testing demonstrated improvements, though testing doesn't prove success so much as provide evidence. Write latency was consistently reduced from 120 milliseconds to 30 milliseconds. Response times improved, and end-to-end data availability was accelerated. These advancements significantly improved overall performance and efficiency.
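For quick local checks of per-event timing before reaching for a full benchmark harness, a plain timing loop is often enough. This is a minimal sketch only; process_event here is a hypothetical stand-in for real pipeline work, and for statistically rigorous numbers a tool like criterion.rs should be used instead:

```rust
use std::time::Instant;

// Hypothetical stand-in for the per-event work done by the pipeline.
fn process_event(payload: &str) -> usize {
    payload.len()
}

fn main() {
    let iterations: u32 = 10_000;
    let start = Instant::now();
    let mut total = 0usize;
    for i in 0..iterations {
        total += process_event(&format!("event-{i}"));
    }
    let elapsed = start.elapsed();
    println!("processed {iterations} events, {total} bytes");
    // Integer division of a Duration gives the mean per-event latency.
    println!("average per-event latency: {:?}", elapsed / iterations);
}
```

A loop like this is fine for spotting order-of-magnitude changes; it does not account for warm-up, outliers, or compiler optimizations eliding unused work, which is exactly what criterion handles.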
Before and After

In the legacy system, service-to-service messaging was done with C library buffering. This involved multiple services in the message-passing loop, and the C library added latency through event buffering. The Python service added an extra layer of latency due to Python's Global Interpreter Lock (GIL) and its inherent operational overhead. These factors resulted in high end-to-end latency, complicated error handling and debugging, and limited scalability due to the bottlenecks introduced by event buffering and the Python GIL.

After implementing Rust, message passing via direct channels eliminated intermediary services, while Tokio enabled non-blocking asynchronous IO, significantly boosting throughput. Rust's strict compile-time guarantees reduced runtime errors and gave us robust performance. Improvements observed included a reduction in end-to-end latency from 120ms to 30ms, enhanced scalability through efficient resource management, and improved error handling and debugging facilitated by Rust's strict typing and error-handling model. It's hard to argue for using anything other than Rust.

Deployment and Operations

Minimal Operational Changes

The deployment underwent minimal modifications to accommodate the migration from Python to Rust; we kept the same deployment process and CI/CD. Configuration management continued to leverage existing tools such as Ansible and Terraform, facilitating seamless integration. This gave us a smooth transition without disrupting the existing deployment process. This is a common approach: you want to change as little as possible during a migration. That way, if a problem occurs, you can isolate the footprint and find the problem sooner.

Monitoring and Maintenance

Our application is integrated with the existing monitoring stack, comprising Prometheus and Grafana, enabling real-time metrics monitoring.
Rust's memory safety features and reduced runtime errors have significantly decreased the maintenance overhead, resulting in a more stable and efficient application. It's great to watch our build system work, and even better to catch errors during development on our laptops, before we push commits that would cause builds to fail.

Practical Impact on User Experience

Improved Data Availability

Quicker write operations allow for near-instantaneous data readiness for reads and indexing, leading to user-experience enhancements: reduced latency in data retrieval enables more efficient and responsive applications. Real-time analytics and insights are better too, providing businesses with up-to-date information for informed decision-making. Furthermore, faster propagation of updates across all user interfaces ensures that users always have access to the most current data, enhancing collaboration and productivity for the teams who use the APIs we offer. The latency improvement is noticeable from an external perspective, and combining our APIs now ensures that data is available sooner.

Increased System Scalability and Reliability

Businesses get a serious advantage: they can analyze larger amounts of data without their systems slowing down. This means you can keep up with the user load. And let's not forget the added bonus of a more resilient system with less downtime. We're running a business with a billion connected devices, where disruptions are a no-no and continuous operation is a must.

Future Plans and Innovations

Rust has proven successful in improving performance and scalability, and we are committed to expanding its use throughout our platform. We plan to extend Rust implementations to other performance-critical components, ensuring that the platform as a whole benefits from its advantages.
As part of our ongoing commitment to innovation, we will continue to focus on performance tuning and architectural refinements in Rust, ensuring that it remains the optimal choice for mission-critical applications. We will also explore new asynchronous patterns and concurrency models in Rust, pushing the boundaries of what is possible with high-performance computing.

Technologies like Rust enhance our competitive edge and let us remain the leader in our space. Our critical infrastructure is Rusting in the best possible way, ensuring that our real-time communication services remain best in class. The transition to Rust has not only reduced latency significantly but also laid a strong foundation for future enhancements in performance, scalability, and reliability. We deliver the best possible experience for our users, and Rust, combined with our dedication to providing the best API service possible to billions of users, serves that goal.

---------------------------------------------------------------------

all 7 comments, sorted by: best

[-] epicar | 54 points | 5 hours ago

> We love to see performance numbers. It is a core objective for us. We are excited

who is we/us?

[-] stephenlblum[S] | 22 points | 5 hours ago

we/us = PubNub. We are a distributed team of engineers working at PubNub. Rust is our favorite language, and we are working to make sure we get to use as much Rust as possible. The outcomes are great each time we deploy a new Rust service.
Our repeated success allows us to keep taking advantage of what Rust offers.

[-] stephenlblum[S] | 34 points | 5 hours ago

Thank you to the Reddit r/rust community for requesting a rewrite of the original posted article. The original article was fluffy; it had no substance other than us saying, "look! we did a thing!". The new, updated post was improved with help from u/rtkay123, u/Buttleston, u/the-code-father, and u/RedEyed__. Thank you! The improvements they helped us with:

1. Proper graphs and charts with labels, legends, and details on the chart axes.
2. Removing logos and names to prevent any possible advertisement.
3. Posting directly to Reddit (vs. linking out).
4. Covering all the details and questions asked here and elsewhere.
5. Annotated images using the Reddit annotation feature.

[-] the-code-father | 1 point | 15 minutes ago

Fwiw I think people are fine (at least I am) with out-links that raise awareness of a product/company that uses Rust, as long as the post contributes something meaningful/educational. It's definitely a much better post this time around!

[-] danted002 | 8 points | 1 hour ago

Interesting. It would actually help to see what the original Python code did and what the original C lib was. From experience, a rewrite usually introduces optimisations to a pipeline based on existing usage, which by itself brings a minimum 2x performance gain. You also mentioned the GIL being a blocker; however, you also mentioned using a C lib, which is odd because all current established C libs release the GIL. So, without the code per se, I would say your 4x improvement comes from the fact that you don't buffer anymore and you stopped using a C lib that was blocking the GIL for no reason, not from the fact that you switched languages.
A Python-to-Rust rewrite of CPU-bound operations usually brings a minimum 10x performance boost. See uv, ruff, and pydantic, all of which boast improvements between 10x and 30x, so the fact that you only got a 4x boost seems like a "canary in the coal mine" type of thing. For comparison, on one of my projects that processed somewhere between 5k and 10k events per second, we achieved a 7x performance boost just by switching from a sync codebase to an async one. Wouldn't you know it, the IO was the actual bottleneck all along.

[-] stephenlblum[S] | 1 point | 1 hour ago

Hi u/danted002, excellent question! The performance gain really was 10x+ like you mentioned. From a CPU and scale perspective we did meet those gains: CPU utilization was reduced, and we can process more events per second. The end-to-end latency is now optimal from the transmission (send) side that the new Rust service is responsible for. This was the larger latency improvement we could achieve by rewriting the transmitter. The remaining 30ms of latency comes from the downstream systems.

During the life of the original Python service, about 10 years, we spent time optimizing our Python event service pipeline. We did our best to make it performant for what it could do for us. The bundling approach was a general improvement, though it required blocking the CPU, with the GIL in our way. We knew a rewrite would eventually be needed to get to the next level. We were considering a better concurrency model in Python, potentially multiprocessing, or an async approach similar to what we did with Rust. The buffering and event bundling happened in Python land, which was CPU-bound, and this is where the GIL came into play; threads couldn't help us. While the buffering approach was originally an optimization, it prevented us from pushing performance further. We really did want to move to the async approach like you were describing.
It was one of the options on our list. Our code was old and needed a rewrite anyway to achieve the async approach. We could have done this in Python; it really could have stayed in the Python world. We chose Rust since we are getting good at it, and each time we deploy a new Rust service it reduces our memory and CPU usage at the level of gains you mentioned. We are happy with the upgrade and looking to repeat this for our other services where it makes sense. Some of them are fine in Python today. If we can capture some noteworthy gains and a rewrite is on the table, then Rust is the #1 choice.

[-] danted002 | 2 points | 42 minutes ago

Thank you for the follow-up. It answered all my questions. Basically you reached the end of what could be optimised with the existing codebase and required a full overhaul of the system. Since you needed a full rewrite anyway, you went with Rust because... well, it's Rust. Who doesn't love a low-level language that can "talk" in high-level abstractions, and which incidentally is also being fully embraced by the Python ecosystem? Thank you once again for the explanation.