DeltaStream (@DeltaStreamInc)

Did you know Databricks integrates with DeltaStream? Now you can process streaming data and write results directly to . Keep your Delta Tables always up-to-date!
deltastream.io/integrating-de…

Khuyen Tran (@KhuyenTran16)

Z Order in #DeltaLake organizes data in storage to improve query performance.

In this example, the query has to scan through 8 separate files to find rows where id = 5. However, with Z Order optimization, the query only needs to scan one file to locate the desired rows.
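
A toy sketch of why clustering cuts the number of files scanned. This is not Delta Lake's implementation: it assumes each file keeps min/max statistics for `id`, and it clusters on a single column, where Z Order reduces to a plain sort. The helper names `write_files` and `files_to_scan` are made up for illustration.

```python
def write_files(ids, rows_per_file):
    """Split a sequence of ids into fixed-size 'files' (plain lists)."""
    return [ids[i:i + rows_per_file] for i in range(0, len(ids), rows_per_file)]


def files_to_scan(files, target):
    """Return indexes of files whose [min, max] range could contain target."""
    return [idx for idx, rows in enumerate(files) if min(rows) <= target <= max(rows)]


# 64 rows, ids 0..7 repeating, so every unclustered file holds every id.
ids = [i % 8 for i in range(64)]

unclustered = write_files(ids, rows_per_file=8)
clustered = write_files(sorted(ids), rows_per_file=8)  # stand-in for Z Order on one column

print(len(files_to_scan(unclustered, 5)))  # 8: every file's range covers id = 5
print(len(files_to_scan(clustered, 5)))    # 1: only one file's range covers id = 5
```

With several columns, Z Order interleaves the column values so that rows close in all of them land in the same files; the skipping logic stays the same.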

Apache XTable (Incubating) (@apachextable)

Have you tried the #OneTable quickstart? Within minutes you can have a pipeline simultaneously using Hudi, Delta, and Iceberg. Check out the docs here: onetable.dev/docs/how-to/

#ApacheHudi #ApacheIceberg #DeltaLake #DataLakehouse

Liam Brannigan (@braaannigan)

If you are using Polars and DeltaLake to work with large datasets it's helpful to understand how pl.scan_delta works to get maximum value from it. Good news - it's a relatively small amount of code and we walk through it here...

Youssef Mrini (@YoussefMrini)

Support for liquid clustering is now generally available in Databricks Runtime 15.2 and above.

Getting started with Delta Lake Liquid clustering youtube.com/watch?v=6g685a…

#DeltaLake #Databricks

Youssef Mrini (@YoussefMrini)

You can now enable liquid clustering on existing tables without the need to rewrite the underlying data.

It requires DBR 14.3 LTS+

#DeltaLake #Databricks

Khuyen Tran (@KhuyenTran16)

#Polars is a DataFrame library written in Rust that has blazing-fast performance.

#DeltaLake has helpful features including ACID transactions, time travel, schema enforcement, and more.

Combining these two tools makes the code efficient for data processing and analysis.

Advancing Analytics (@AdvAnalyticsUK)

Want to learn how to time travel? 🚀
This is the second part in our YouTube Short series on the things you need to know about #DeltaLake, and Chris is talking time travel!
Catch the first part here: hubs.la/Q01ZLlky0

Khuyen Tran (@KhuyenTran16)

Partitioning data allows queries to target specific segments rather than scanning the entire table, which speeds up data retrieval.  

The following code uses #DeltaLake to select partitions from a #pandas DataFrame.
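
The code from the original post is not preserved here. As a stand-in, here is a minimal standard-library sketch of partition pruning. It does not use Delta Lake or pandas; it assumes Hive-style `key=value` directories, and the helpers `write_partitioned` and `read_partition` plus the sample `month`/`sales` data are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_partitioned(root, rows, key):
    """Append each row to root/<key>=<value>/part.csv, one directory per value."""
    for row in rows:
        part_dir = Path(root) / f"{key}={row[key]}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "part.csv"
        is_new = not path.exists()
        with path.open("a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(row))
            if is_new:
                writer.writeheader()
            writer.writerow(row)


def read_partition(root, key, value):
    """Read only the directory for one partition value; other files are never opened."""
    path = Path(root) / f"{key}={value}" / "part.csv"
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


rows = [
    {"month": 1, "sales": 100},
    {"month": 1, "sales": 150},
    {"month": 2, "sales": 200},
    {"month": 3, "sales": 50},
]

with tempfile.TemporaryDirectory() as root:
    write_partitioned(root, rows, key="month")
    partitions = sorted(p.name for p in Path(root).iterdir())
    january = read_partition(root, "month", 1)

print(partitions)  # one directory per distinct month
print(january)     # rows for month 1 only; CSV values come back as strings
```

With the `deltalake` Python package, the rough equivalent would be `write_deltalake(path, df, partition_by=["month"])` followed by `DeltaTable(path).to_pandas(partitions=[("month", "=", "1")])`; treat that as a pointer to check against the package docs rather than the post's exact code.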

Advancing Analytics (@AdvAnalyticsUK)

Things you need to know about Delta Lake: Part 3 - Optimisation 🎯

What can you do to make your #DeltaLake faster, more reliable, and more scalable? Chris has the answer you're looking for!

Kyle Weller (@KyleJWeller)

Saddle up for #DataCouncil 🤠. Let's corral a fireside chat on #lakehouse table formats #ApacheHudi, #DeltaLake, and #ApacheIceberg. You don't want to miss this 🌶 discussion 3/27 11:30am. I will also intro the brand new @apachextable (prev known as OneTable) 

#apachextable

Delta Lake (@DeltaLakeOSS)

Jupyter notebooks are a great tool for data analysis in . They are easy to use and give you an intuitive, interactive interface to process and visualize your data.

Learn how you can use Delta Lake from a Notebook ➡ delta.io/blog/delta-lak…

Delta Lake (@DeltaLakeOSS)

The Apache Druid community just added a Delta Lake connector via Delta Kernel Java.

Delta Kernel is an ambitious project to abstract all the core Delta logic into Java/Rust codebases, so each connector doesn't need to write all Delta processing logic from scratch.

#deltalake

Kyle Weller (@KyleJWeller)

🚨 Register to join me live 12/14 10am PST and I will answer all your Qs about @OnetableOSS. You no longer have to pick between #ApacheHudi, #DeltaLake, and #ApacheIceberg. Register to watch live, or get the recording: 👉 linkedin.com/events/onetabl…

Téo Calvo (@teomewhy)

It finally worked, damn it! This one deserves a standing ovation! hahahaha

Thousands of duplicate rows that would have kept us from inserting the data using upsert into Delta with Streaming.

#databricks #healthtech #datasus #deltaLake #s3 #streaming #TeoMeWhy
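
A toy sketch of the usual fix for this, in plain Python rather than the author's pipeline: deduplicate each incoming batch by key before the upsert. It assumes the target is a dict keyed by `id` and that the newest `ts` should win; `deduplicate`, `upsert`, and the sample rows are hypothetical. The `upsert` helper rejects duplicate keys to mimic how a Delta MERGE can fail when several source rows match the same target row.

```python
def deduplicate(batch, key, order_by):
    """Keep one row per key: the one with the highest order_by value."""
    latest = {}
    for row in batch:
        k = row[key]
        if k not in latest or row[order_by] > latest[k][order_by]:
            latest[k] = row
    return list(latest.values())


def upsert(target, batch, key):
    """Insert or overwrite rows by key; refuse ambiguous batches with repeated keys."""
    keys = [row[key] for row in batch]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate keys in source batch")
    for row in batch:
        target[row[key]] = row


target = {1: {"id": 1, "value": "old", "ts": 1}}
batch = [
    {"id": 1, "value": "a", "ts": 2},
    {"id": 1, "value": "b", "ts": 3},  # duplicate key, newer timestamp
    {"id": 2, "value": "c", "ts": 1},
]

clean = deduplicate(batch, key="id", order_by="ts")
upsert(target, clean, key="id")
print(target[1]["value"], target[2]["value"])  # b c
```

In a streaming job the same deduplication would typically run inside the per-batch function, right before the merge.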