Twitter #DeltaLake hashtag • TwiCopy

DeltaStream

@DeltaStreamInc

3 hours ago

Did you know Databricks integrates with DeltaStream? Now you can process streaming data and write results directly to #DeltaLake. Keep your Delta Tables always up-to-date!
deltastream.io/integrating-de…
#DataEngineering #StreamingData #DeltaLake

Likes: 1 · Replies: 0

Khuyen Tran

7 months ago

Z Order in #DeltaLake organizes data in storage to improve query performance.

In this example, without Z Order the query has to scan 8 separate files to find rows where id = 5. With Z Order optimization, it only needs to scan one file to locate those rows.
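
The file-skipping effect described here can be illustrated with a small standard-library sketch. It does not use the Delta Lake API: it only simulates "files" as Python lists and per-file min/max statistics, and the helper names `split_into_files` and `files_to_scan` are hypothetical. For a single column, Z-ordering behaves much like sorting by that column, which is the assumption made below.

```python
def split_into_files(ids, n_files):
    """Split a list of ids into n_files equally sized simulated 'files'."""
    size = len(ids) // n_files
    return [ids[i * size:(i + 1) * size] for i in range(n_files)]


def files_to_scan(files, target):
    """Return the indexes of files whose [min, max] range may contain target."""
    return [i for i, rows in enumerate(files) if min(rows) <= target <= max(rows)]


# 64 rows with ids 1..8 repeated, so every unclustered file holds every id.
unclustered_ids = list(range(1, 9)) * 8
clustered_ids = sorted(unclustered_ids)  # stand-in for Z-ordering on one column

before = files_to_scan(split_into_files(unclustered_ids, 8), target=5)
after = files_to_scan(split_into_files(clustered_ids, 8), target=5)

print(f"files scanned without clustering: {len(before)}")
print(f"files scanned with clustering:    {len(after)}")
```

Because each clustered file covers a narrow id range, the min/max check rules out every file except the one that holds id = 5.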

Likes: 45 · Replies: 0

Apache XTable (Incubating)

6 months ago

Have you tried the #OneTable quickstart? Within minutes you can have a pipeline simultaneously using Hudi, Delta, and Iceberg. Check out the docs here: onetable.dev/docs/how-to/

#ApacheHudi #ApacheIceberg #DeltaLake #DataLakehouse

Likes: 12 · Replies: 0

Apache Doris

2 days ago

If you use Apache Doris as a data lakehouse, this is what the data stacks look like:

Read more: doris.apache.org/docs/lakehouse…

#Hive #Iceberg #Hudi #DeltaLake #ApachePaimon

Likes: 3 · Replies: 0

Liam Brannigan

2 months ago

If you are using Polars and DeltaLake to work with large datasets it's helpful to understand how pl.scan_delta works to get maximum value from it. Good news - it's a relatively small amount of code and we walk through it here...

Likes: 43 · Replies: 0

Youssef Mrini

1 week ago

Support for liquid clustering is now generally available in Databricks Runtime 15.2 and above.

Getting started with Delta Lake Liquid clustering youtube.com/watch?v=6g685a…

#DeltaLake #Databricks

Likes: 2 · Replies: 0

Youssef Mrini

3 months ago

You can now enable liquid clustering on existing tables without the need to rewrite the underlying data.

It requires DBR 14.3 LTS+

#DeltaLake #Databricks

Likes: 5 · Replies: 0

Mim

6 days ago

#onelake ❤️❤️ deltalake and iceberg

Likes: 2 · Replies: 0

Khuyen Tran

11 months ago

#Polars is a DataFrame library written in Rust that has blazing-fast performance.

#DeltaLake has helpful features including ACID transactions, time travel, schema enforcement, and more.

Combining these two tools makes the code efficient for data processing and analysis.

Likes: 114 · Replies: 0

Advancing Analytics

@AdvAnalyticsUK

9 months ago

Want to learn how to time travel?? 🚀
This is the second part in our YouTube Short series on the things you need to know about #DeltaLake - and Chris is talking time travel!
Catch the first part here: hubs.la/Q01ZLlky0

Likes: 12 · Replies: 0

Mim

2 months ago

writing to #Fabric #onelake using a local Path is now supported for #Deltalake Python🎉🪅🥳❤️🎉🎉🎉🎉🎉🎉

Likes: 61 · Replies: 0

Khuyen Tran

9 months ago

Partitioning data allows queries to target specific segments rather than scanning the entire table, which speeds up data retrieval.

The following code uses #DeltaLake to select partitions from a #pandas DataFrame.

Likes: 58 · Replies: 0

Jacek Laskowski

@jaceklaskowski

4 months ago

Hey #DeltaLake fans 👋

3.1.0's on its way to your ETLs 🥳

Likes: 4 · Replies: 0

Advancing Analytics

@AdvAnalyticsUK

5 months ago

Things you need to know about Delta Lake: Part 3 - Optimisation 🎯

What can you do to make your #DeltaLake faster, more reliable, and more scalable? Chris has the answer you're looking for!

#DataEngineering #DataOptimisation

Likes: 9 · Replies: 0

Kyle Weller

2 months ago

Saddle up for #DataCouncil 🤠. Let's corral a fireside chat on #lakehouse table formats #ApacheHudi, #DeltaLake, and #ApacheIceberg. You don't want to miss this 🌶 discussion on 3/27 at 11:30am. I will also introduce the brand new Apache XTable (Incubating), previously known as OneTable.

#apachextable

Likes: 13 · Replies: 0

Jacek Laskowski

@jaceklaskowski

2 months ago

Ah, so that's the explanation why I didn't see TRUNCATE TABLE in #DeltaLake OSS (and on #Databricks).

It is simply not supported in OSS 🤷‍♂️

Likes: 7 · Replies: 0

Delta Lake

1 day ago

Jupyter notebooks are a great tool for data analysis in #Python. They are easy to use and give you an intuitive, interactive interface to process and visualize your data.

Learn how you can use Delta Lake from a #Jupyter Notebook ➡ delta.io/blog/delta-lak…

#deltalake #opensource

Likes: 8 · Replies: 0

Delta Lake

1 month ago

The Apache Druid community just added a Delta Lake connector via Delta Kernel Java.

Delta Kernel is an ambitious project to abstract all the core Delta logic into Java/Rust codebases, so each connector doesn't need to write all Delta processing logic from scratch.

#deltalake

Likes: 18 · Replies: 0

Kyle Weller

5 months ago

🚨 Register to join me live on 12/14 at 10am PST and I will answer all your questions about @OnetableOSS. You no longer have to pick between #ApacheHudi, #ApacheIceberg, and #DeltaLake. Register to watch live, or get the recording: 👉 linkedin.com/events/onetabl…

#datalakehouse #apachepaimon

Likes: 6 · Replies: 0

Téo Calvo

10 months ago

It damn well worked! This calls for a standing ovation! hahahaha

Thousands of duplicate rows that would have prevented us from inserting the data using upsert into Delta with Streaming.

#databricks #healthtech #datasus #deltaLake #s3 #streaming #TeoMeWhy

Likes: 74 · Replies: 0