Martin Görner (@martin_gorner)'s Twitter Profile
Martin Görner

@martin_gorner

Product Manager for Keras and Tensorflow high-level APIs. Previously worked on Cloud TPUs (Tensor Processing Units). Passionate about democratizing ML.

ID:734735999029383168

Joined: 23-05-2016 13:21:27

1.2K Tweets

12.5K Followers

6.3K Following

Martin Görner (@martin_gorner)

A nice blog post about Keras 3 from NSF Unidata
unidata.ucar.edu/blogs/news/ent…
The post's conclusion:

'For deep learning training I (Thomas) will be using a Keras 3 API exclusively. It more closely resembles the scikit-learn api and I find it to be easier to explain.'

Hassan Hayat 🔥 (@TheSeaMouse)

Why Google Deepmind's Mixture-of-Depths paper, and more generally dynamic compute methods, matter:

Most of the compute is WASTED because not all tokens are equally hard to predict
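
A minimal sketch of the dynamic-compute idea, not the paper's implementation: a router scores each token, only the top-scoring tokens go through the expensive block, and the rest skip it unchanged (as if carried by the residual path). The names `route_top_k` and `expensive_block`, the scores, and the capacity are all hypothetical, and the router weighting used in the real method is omitted.

```python
def route_top_k(tokens, scores, capacity, block):
    """Apply `block` only to the `capacity` highest-scoring tokens.

    Tokens that are not selected are returned unchanged, which mimics
    skipping the block through the residual connection.
    """
    # Rank token positions by router score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    selected = set(ranked[:capacity])
    # Walk the sequence in its original order so positions are preserved.
    return [block(t) if i in selected else t for i, t in enumerate(tokens)]


def expensive_block(x):
    # Stand-in for attention + MLP; a hypothetical toy computation.
    return x * 2.0


tokens = [1.0, 2.0, 3.0, 4.0]
scores = [0.1, 0.9, 0.3, 0.8]  # hypothetical router outputs
routed = route_top_k(tokens, scores, capacity=2, block=expensive_block)
```

With a capacity of 2 out of 4 tokens, the block runs on only half of the sequence, which is where the compute saving comes from.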

Martin Görner (@martin_gorner)

Gemma in Keras: how to build a chatbot and fine-tune it to speak like a pirate 🏴‍☠️🦜. This was a fun demo to make! It runs with Keras on JAX with the new keras.distribute.ModelParallel API.
Colab: bit.ly/gemma-pirate-d…
Video: youtu.be/AzQBFmPDtTI?si…

Martin Görner (@martin_gorner)

Not one but two Gemma competitions are currently live on Kaggle. And we have Keras starter notebooks for both:

kaggle.com/code/awsaf49/p…

kaggle.com/code/awsaf49/k…

Have fun with Gemma! (and check out the prizes: $250,000 in total!)

François Chollet (@fchollet)

Keras 3.0.5 and Keras-nlp 0.8.1 now come pre-installed in Kaggle notebooks -- so you can run Gemma without any extra install steps 🚀

Martin Görner (@martin_gorner)

This was a lot of fun to demo.
Colab here: bit.ly/gemma-pirate-d…
Fav quote from pirate Gemma: 'It's nice that ye like math!'⚔️💣⚔️

Boris Dayma 🖍️ (@borisdayma)

My notes reading the Gemma paper:
- arch similar to llama
- 6T tokens for the 7B model!!!
- huge vocab size
- GeGLU for FFN, I wish they ablated the dim used there, people tend to use 4x while I like to be closer to 2.5-3x
- surprised they use Sandwich-Norm, I think Normformer
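
For readers unfamiliar with the GeGLU point in these notes, here is a minimal pure-Python sketch of a GeGLU feed-forward layer, assuming the common formulation (GELU(x W_gate) * (x W_up)) W_down. It is not Gemma's actual code; the function names, the constant weights, and the 3x hidden-dimension multiplier are hypothetical and chosen only to show where the "4x vs 2.5-3x" expansion factor enters.

```python
import math


def gelu(x):
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(vec, mat):
    # Row vector times matrix; `mat` is a list of rows.
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def geglu_ffn(x, w_gate, w_up, w_down):
    # Gated branch goes through GELU, then multiplies the linear branch.
    gate = [gelu(g) for g in matvec(x, w_gate)]
    up = matvec(x, w_up)
    hidden = [g * u for g, u in zip(gate, up)]
    # Project back from the hidden dimension to the model dimension.
    return matvec(hidden, w_down)


d_model = 2
d_hidden = 3 * d_model  # hypothetical 3x expansion factor
w_gate = [[0.1] * d_hidden for _ in range(d_model)]
w_up = [[0.2] * d_hidden for _ in range(d_model)]
w_down = [[0.05] * d_model for _ in range(d_hidden)]
y = geglu_ffn([1.0, 2.0], w_gate, w_up, w_down)
```

Because GeGLU uses two input projections instead of one, its parameter count at a given hidden size is higher than a plain FFN's, which is one reason the choice of expansion factor matters.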

Martin Görner (@martin_gorner)

Keras has added a new distributed training API, supporting full model parallelism for Gemma, and large models in general. It is backed by the XLA compiler in JAX. Code sample here: kaggle.com/code/nilaychau…

clem 🤗 (@ClementDelangue)

Google is back baby! Taking the first spot for open models on the Hugging Face LLM leaderboard for its sizes (2B & 7B): huggingface.co/collections/go…

Martin Görner (@martin_gorner)

A great implementation of Neural Radiance Fields (NeRFs) with Keras, running on JAX. And a nice backstory 😃.
github.com/ariG23498/nerf…
