Vu Trinh – Medium

11.1K Followers

Pinned

Vu Trinh in The Deep Hub

All you need to know about the Google File System

How did Google build its large-scale file system?

May 12

Pinned

Vu Trinh in Data Engineer Things

How does Uber build real-time infrastructure to handle petabytes of data every day?

All insights from the paper: Real-time data infrastructure at Uber

Mar 23

Vu Trinh

Everything you need to know about MapReduce

All the key insights from the paper MapReduce: Simplified Data Processing on Large Clusters from Google

1d ago

Vu Trinh in Data Engineer Things

How Twitter processes 4 billion events in real-time daily

From Lambda to Kappa

May 25

Vu Trinh in Data Engineer Things

The Hadoop Distributed File System

Everything you need to know about the HDFS

May 25

Vu Trinh in Data Engineer Things

I spent 5 hours understanding more about the Delta Lake table format

All insights from the paper: Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores

May 4

Vu Trinh

GroupBy #33: Data Gateway — A Platform for Growing and Protecting the Data Tier at Netflix, The…

Plus: Solving RevenueCat’s data ingestion challenges into Snowflake, From ZooKeeper to KRaft: How the Kafka migration works

May 3

Vu Trinh

GroupBy #32: Canva — Scaling to Count Billions, Ensuring Precision and Integrity: A Deep Dive into…

Plus: LLM fine-tuning and evaluation in BigQuery, How We Built Slack AI To Be Secure and Private

Apr 28

Vu Trinh in Towards Data Science

The Stream Processing Model Behind Google Cloud Dataflow

Balancing correctness, latency, and cost in unbounded data processing

Apr 27

Vu Trinh in Data Engineer Things

Do We Need the Lakehouse Architecture?

When data lakes and data warehouses are not enough.

Apr 20

🚀 My newsletter vutr.substack.com 🚀 Subscribe for weekly writing, mainly about OLAP databases and other data engineering topics.
