Tianle Cai (@tianle_cai)'s Twitter Profile
Tianle Cai

@tianle_cai

ML PhD @Princeton. Life-long learner, hacker, and builder. Tech consultant & angel investor. Prev @togethercompute @GoogleDeepMind @MSFTResearch @citsecurities.

ID:3022633752

Link: https://tianle.website · Joined: 16-02-2015 15:32:35

385 Tweets

5.2K Followers

3.8K Following

Nicolas Patry (@narsilou)

TGI 2.0 is out!

- Back to fully open source for good (Apache 2.0)
- Fastest inference server in existence (110 tok/s for Cohere Command R+, with Medusa speculation)
- FP8 support
- Mixtral 8x22B support! (also the fastest Medusa on the way)

And much more to come
github.com/huggingface/te…
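As a quick way to poke at a TGI deployment like the one described above, here is a minimal Python sketch of querying the server's /generate endpoint. The localhost URL and port are assumptions (they depend on how the server container is launched), and features such as Medusa speculation and FP8 are configured server-side at launch, not in this request.

```python
import requests

# Minimal sketch: query an already-running text-generation-inference (TGI) server.
# Assumes the server was launched separately (e.g. via the huggingface TGI container)
# and is listening on localhost:8080 -- a hypothetical local deployment.
TGI_URL = "http://localhost:8080/generate"

payload = {
    "inputs": "Explain speculative decoding in one sentence.",
    "parameters": {
        "max_new_tokens": 64,
        "temperature": 0.7,
    },
}

resp = requests.post(TGI_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])
```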

Gavin Guo (@Zhen4good)

JetMoE's technical report is out. Using only open-source data and code, we matched Llama 2's performance at a fraction of the cost. Every training detail is shared to advance open foundation model research. Meta OpenAI Google Mistral AI X MIT CSAIL
arxiv.org/abs/2404.07413

eezyCollab (@eezycollab)

🚨 Attention all marketers and founders 🚨
We are thrilled to announce the launch of eezyCollab - the AI-powered influencer marketing platform that's set to revolutionize your campaigns!🎉
Tired of searching for influencers and sending countless emails? With eezyCollab, simply…

Soumith Chintala (@soumithchintala)

Meta announces 2nd-gen inference chip MTIAv2.
* 708TF/s Int8 / 353TF/s BF16
* 256MB SRAM, 128GB memory
* 90W TDP. 24 chips per node, 3 nodes per rack.
* standard PyTorch stack (Dynamo, Inductor, Triton) for flexibility

Fabbed on TSMC's 5nm process, it's fully programmable via the…
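For a rough sense of scale, the per-rack totals follow directly from the per-chip figures quoted above; the snippet below just runs that arithmetic (chip, node, and rack counts are taken from the tweet, and the power figure counts chip TDP only).

```python
# Back-of-the-envelope totals from the MTIAv2 numbers quoted in the tweet.
int8_tflops_per_chip = 708      # TF/s, Int8
bf16_tflops_per_chip = 353      # TF/s, BF16
tdp_watts_per_chip = 90
chips_per_node = 24
nodes_per_rack = 3

chips_per_rack = chips_per_node * nodes_per_rack              # 72 chips
rack_int8_pflops = chips_per_rack * int8_tflops_per_chip / 1000
rack_bf16_pflops = chips_per_rack * bf16_tflops_per_chip / 1000
rack_power_kw = chips_per_rack * tdp_watts_per_chip / 1000    # excludes host/network power

print(f"{chips_per_rack} chips per rack")
print(f"~{rack_int8_pflops:.1f} PF/s Int8, ~{rack_bf16_pflops:.1f} PF/s BF16 per rack")
print(f"~{rack_power_kw:.1f} kW of chip TDP per rack")
```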

Yikang Shen (@Yikang_Shen)

When you consider using MoE in your next LLM, you could ask yourself one question: Do you want a brutally large model that you can't train as a dense model, or do you just want something that is efficient for inference? If you choose the second goal, you may want to read our new…

MyShell (@myshell_ai)

MyShell is making decentralized AI a reality — we train LLaMA2-level LLMs cheaper than Meta.

Introducing JetMoE, our open-source research with MIT, Princeton, and Lepton AI. MIT CSAIL Princeton University Lepton AI

No more mega budgets needed: JetMoE achieves top-LLM performance with $0.1M ⏩…

Muyang Li (@lmxyy1999)

⭐️DistriFusion has been selected as a HIGHLIGHT poster in !
Try it with pip install distrifuser!
Code: github.com/mit-han-lab/di…
Paper: arxiv.org/abs/2402.19481

Yikang Shen (@Yikang_Shen)

Aran Komatsuzaki I mostly agree with this post, except the part claiming that ModuleFormer leads to no gain or to performance degradation and instability. Could you point me to these results? Our version of MoA made two efforts to solve the stability issue: 1) use dropless MoE, 2) share the KV projection…
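To make the second point concrete, here is a minimal, self-contained sketch of attention "experts" that share a single key/value projection, combined with dropless top-k routing. It is not the ModuleFormer/MoA implementation; all class and variable names are invented for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKVMixtureOfAttention(nn.Module):
    """Toy single-head mixture-of-attention: query experts share one K/V projection."""

    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.dim = dim
        self.num_experts = num_experts
        self.top_k = top_k
        # Shared key/value projections: every expert attends over the same K and V.
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        # Per-expert query and output projections.
        self.q_projs = nn.ModuleList(nn.Linear(dim, dim, bias=False) for _ in range(num_experts))
        self.o_projs = nn.ModuleList(nn.Linear(dim, dim, bias=False) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim)
        k, v = self.k_proj(x), self.v_proj(x)              # computed once, reused by all experts
        gates = F.softmax(self.router(x), dim=-1)          # (batch, seq, num_experts)
        weights, experts = gates.topk(self.top_k, dim=-1)  # dropless: no capacity limit, no dropped tokens
        out = torch.zeros_like(x)
        for e in range(self.num_experts):
            # Gate is zero for tokens not routed to expert e, so they contribute nothing.
            gate = (weights * (experts == e)).sum(dim=-1, keepdim=True)  # (batch, seq, 1)
            q = self.q_projs[e](x)
            attn = torch.softmax(q @ k.transpose(-2, -1) / self.dim ** 0.5, dim=-1)
            out = out + gate * self.o_projs[e](attn @ v)
        return out

# Smoke test with random data.
layer = SharedKVMixtureOfAttention(dim=32)
print(layer(torch.randn(2, 8, 32)).shape)   # torch.Size([2, 8, 32])
```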

Yangqing Jia (@jiayq)

Learnings from running elmo.chat.

Elmo is your AI Chrome extension designed to create summaries and insights, and to extend knowledge for any website. We have been running it for a few weeks, and so far, feedback from friends and family has been universally positive.…

elvis (@omarsar0)

It will get super interesting once more people and companies can afford to train LLMs from scratch or even easily and cost-effectively fine-tune the large existing ones.

'JetMoE-8B is trained with less than $0.1 million cost but outperforms LLaMA2-7B from Meta AI, who has…

Zengyi Qin (@qinzytech)

Training LLMs can be much cheaper than previously thought.

0.1 million USD is sufficient for training LLaMA2-level LLMs🤯

While OpenAI and Meta use billions of dollars to train theirs, you can also train yours with much less money.

Introducing our open-source project JetMoE:…
