Tianle Cai (@tianle_cai) Twitter Tweets • TwiCopy

4 weeks ago

True..

12 likes

TGI 2.0 is out!

- Back to fully open source for good (Apache 2.0)
- Fastest inference server in existence (110 tok/s for Cohere R+, with Medusa speculation)
- FP8 support
- Mixtral 8x22B support! (also the fastest Medusa on the way)

And much more to come
github.com/huggingface/te…

Tianle Cai

1 month ago

Glad to see Medusa helps the 💯B Command R+ model from Cohere run at 💯 tok/sec 😍

36 likes, 2 reposts

Gavin Guo

@Zhen4good

1 month ago

JetMoE's technical report is out. Using only open-source data and code, we matched Llama 2's performance at a fraction of the cost. Every training detail is shared to advance open foundation model research. Meta OpenAI Google Mistral AI X MIT CSAIL
arxiv.org/abs/2404.07413

eezyCollab

@eezycollab

1 month ago

🚨 Attention all marketers and founders 🚨
We are thrilled to announce the launch of eezyCollab - the AI-powered influencer marketing platform that's set to revolutionize your campaigns!🎉
Tired of searching for influencers and sending countless emails? With eezyCollab, simply…

18 likes, 3 reposts

Tianle Cai

1 month ago

Probably big 👀

7 likes

Soumith Chintala

@soumithchintala

1 month ago

Meta announces 2nd-gen inference chip MTIAv2.
* 708TF/s Int8 / 353TF/s BF16
* 256MB SRAM, 128GB memory
* 90W TDP. 24 chips per node, 3 nodes per rack.
* standard PyTorch stack (Dynamo, Inductor, Triton) for flexibility

Fabbed on TSMC's 5nm process, it's fully programmable via the…

Yikang Shen

@Yikang_Shen

1 month ago

When you consider using MoE in your next LLM, you could ask yourself one question: Do you want a brutally large model that you can't train as a dense model, or do you just want something that is efficient for inference? If you choose the second goal, you may want to read our new…

37 likes, 6 reposts

MyShell

@myshell_ai

1 month ago

MyShell is making decentralized AI a reality: we train LLaMA2-level LLMs cheaper than Meta.

Introducing JetMoE, our open-source research with MIT, Princeton, and Lepton AI. MIT CSAIL Princeton University Lepton AI

No more mega budgets needed, JetMoE to achieve top LLMs with $0.1M ⏩…

Tianle Cai

1 month ago

8x22b👀👀

23 likes

Muyang Li

@lmxyy1999

1 month ago

⭐️DistriFusion has been selected as a HIGHLIGHT poster in #CVPR2024 !
Try it with pip install distrifuser!
Code: github.com/mit-han-lab/di…
Paper: arxiv.org/abs/2402.19481

36 likes, 7 reposts

Tianle Cai

1 month ago

An open-source Griffin model just quietly dropped 👀👀👀

12 likes, 2 reposts

Yikang Shen

@Yikang_Shen

1 month ago

Aran Komatsuzaki, I mostly agree with this post, except the part that ModuleFormer leads to no gain or performance degradation and instability. Could you point me to these results? Our version of MoA made two efforts to solve the stability issue: 1) use dropless MoE, 2) share the KV projection…

15 likes, 2 reposts

Yangqing Jia

@jiayq

1 month ago

Learnings from running elmo.chat.

Elmo is your AI Chrome extension designed to create summaries, insights, and extend knowledge for any website. We have been running it for a few weeks, and so far, feedback from friends and family has been universally positive.…

Tianle Cai

1 month ago

👀👀👀

8 likes