Tag: Streaming
-
Exploring an Apache Kafka to Pub/Sub migration: Major considerations
Posted on January 28, 2020, Level intermediate Resource Length medium
In many cases, Google's Pub/Sub messaging and event distribution service can successfully replace Apache Kafka, with lower maintenance and operational costs, and better integration with other Google Cloud services. By Leonid Yankulin.
Tags software-architecture apache streaming big-data machine-learning google
-
Why I recommend my clients NOT use KSQL and Kafka Streams
Posted on October 24, 2019, Level beginner Resource Length medium
An article by Jesse Anderson. He recommends his clients not use Kafka Streams because it lacks checkpointing. Kafka Streams also lacks a true shuffle sort and only approximates one. KSQL sits on top of Kafka Streams, so it inherits all of these problems and adds some of its own.
Tags streaming software-architecture apache distributed
-
Using graph processing for Kafka Stream visualizations
Posted on September 9, 2019, Level intermediate Resource Length long
Article by David Allen on graph processing for Kafka Stream visualizations. Apache Kafka® is great when you need to deal with streams, allowing you to conveniently look at streams as tables. Stream processing engines like KSQL furthermore give you the ability to manipulate all of this fluently.
Tags analytics apache streaming queues
-
Create your first AWS Lambda using Rust
Posted on December 6, 2018, Level intermediate Resource Length short
Blog post by Konstantin Kostov about how he created a serverless function in the Rust programming language and deployed it to AWS. It was an example AWS Lambda function tasked with checking whether a provided serial number is correct and unique (not already part of an existing dataset).
Tags programming functional-programming software serverless streaming
-
Parsing logs 230x faster with Rust
Posted on November 10, 2018, Level intermediate Resource Length medium
Andre Arko's blog post about dealing with logs for the very busy web application behind RubyGems.org. A single day of request logs was usually around 500 gigabytes on disk. The team tried a few hosted logging products, but at that volume those products could typically only offer retention measured in hours. The only thing the team could think of to do with the full log firehose was to run it through gzip -9 and then drop it in AWS S3.
Tags json software programming serverless streaming
-
JVM Profiler: open source tool for tracing distributed JVM applications at scale
Posted on October 14, 2018, Level advanced Resource Length long
Bo Yang, Nan Zhu, Felix Cheung, and Xu Ning from the Uber Engineering team published a blog post about JVM Profiler. Data is at the heart of the strategic decision-making process at Uber. Right-sizing the resources allocated to Spark applications and optimizing the operational efficiency of Uber's data infrastructure require fine-grained insights about these systems, namely their resource usage patterns.
Tags programming java distributed miscellaneous monitoring queues performance streaming
-
Apache Kafka is not for event sourcing
Posted on February 1, 2018, Level beginner Resource Length medium
An article by Jesper Hammarbäck in which he argues that Kafka is not the best tool for event sourcing. Kafka is a great tool for delivering messages between producers and consumers, and the optional topic durability allows you to store your messages permanently, forever if you'd like.
Tags software-architecture apache streaming big-data machine-learning
-
Sherlock: Near real time search indexing for commerce site
Posted on December 30, 2017, Level beginner Resource Length long
Prasanna Ranganathan from Flipkart published an article about building a world-class e-commerce discovery experience through search. The dynamic nature of e-commerce poses unique challenges. Stock units, availability, pricing, catalog data, etc. can all change at a very high rate, and the system needs to keep up with the latest data lest the customer be disappointed.
Tags nosql software-architecture apache streaming
-
Apache Kafka exactly-once processing explained
Posted on August 15, 2017, Level intermediate Resource Length medium
Adam Warski's blog post explaining real-time processing with Apache Kafka and what its new major feature, exactly-once semantics, really means.
Tags streaming queues apache