Pinned · Vu Trinh in The Deep Hub · All you need to know about the Google File System · How did Google build its large-scale file system? · May 126
Pinned · Vu Trinh in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · Mar 2315
Vu Trinh · Everything you need to know about MapReduce · All the key insights from the paper MapReduce: Simplified Data Processing on Large Clusters from Google · 1d ago
Vu Trinh in Data Engineer Things · How Twitter processes 4 billion events in real-time daily · From Lambda to Kappa · May 25 · 1
Vu Trinh in Data Engineer Things · The Hadoop Distributed File System · Everything you need to know about the HDFS · May 25
Vu Trinh in Data Engineer Things · I spent 5 hours understanding more about the Delta Lake table format · All insights from the paper: Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores · May 4 · 2
Vu Trinh · GroupBy #33: Data Gateway — A Platform for Growing and Protecting the Data Tier at Netflix, The… · Plus: Solving RevenueCat’s data ingestion challenges into Snowflake, From ZooKeeper to KRaft: How the Kafka migration works · May 3
Vu Trinh · GroupBy #32: Canva — Scaling to Count Billions, Ensuring Precision and Integrity: A Deep Dive into… · Plus: LLM fine-tuning and evaluation in BigQuery, How We Built Slack AI To Be Secure and Private · Apr 28
Vu Trinh in Towards Data Science · The Stream Processing Model Behind Google Cloud Dataflow · Balancing correctness, latency, and cost in unbounded data processing · Apr 27
Vu Trinh in Data Engineer Things · Do We Need the Lakehouse Architecture? · When data lakes and data warehouses are not enough. · Apr 20 · 13