Pinned · Vu Trinh in The Deep Hub · All you need to know about the Google File System · How did Google build its large-scale file system? · 16 min read · May 12, 2024
Pinned · Vu Trinh in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · 19 min read · Mar 23, 2024
Vu Trinh · Everything you need to know about MapReduce · All the key insights from the paper MapReduce: Simplified Data Processing on Large Clusters from Google · 10 min read · 1 day ago
Vu Trinh in Data Engineer Things · How Twitter processes 4 billion events in real-time daily · From Lambda to Kappa · 6 min read · May 25, 2024
Vu Trinh in Data Engineer Things · The Hadoop Distributed File System · Everything you need to know about the HDFS · 14 min read · May 25, 2024
Vu Trinh in Data Engineer Things · I spent 5 hours understanding more about the Delta Lake table format · All insights from the paper: Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores · 17 min read · May 4, 2024
Vu Trinh · GroupBy #33: Data Gateway — A Platform for Growing and Protecting the Data Tier at Netflix, The… · Plus: Solving RevenueCat’s data ingestion challenges into Snowflake, From ZooKeeper to KRaft: How the Kafka migration works · 6 min read · May 3, 2024
Vu Trinh · GroupBy #32: Canva — Scaling to Count Billions, Ensuring Precision and Integrity: A Deep Dive into… · Plus: LLM fine-tuning and evaluation in BigQuery, How We Built Slack AI To Be Secure and Private · 7 min read · Apr 28, 2024
Vu Trinh in Towards Data Science · The Stream Processing Model Behind Google Cloud Dataflow · Balancing correctness, latency, and cost in unbounded data processing · 14 min read · Apr 27, 2024
Vu Trinh in Data Engineer Things · Do We Need the Lakehouse Architecture? · When data lakes and data warehouses are not enough. · 10 min read · Apr 20, 2024