Julian Michael (@_julianmichael_)'s Twitter Profile
Julian Michael

@_julianmichael_

Researching stuff @NYUDataScience. he/him

ID:1019072664600637440

Link: http://julianmichael.org
Joined: 17-07-2018 04:13:51

304 Tweets

1.1K Followers

122 Following

Sigal Samuel (@SigalSamuel)

Want to know why OpenAI's safety team imploded?

Here's why.

Thank you to the company insiders who bravely spoke to me.

According to my sources, the answer to 'What did Ilya see?' is actually very simple...

vox.com/future-perfect…

david rein (@idavidrein)

Is GPQA garbage?

A couple weeks ago, typedfemale pointed out some mistakes in a GPQA question, so I figured this would be a good opportunity to discuss how we interpret benchmark scores, and what our goals should be when creating benchmarks.

AI Objectives Institute (@AIObjectives)

We are thrilled to announce Colleen McKenzie as our new Executive Director.

Read about it from Deger Turan: ai.objectives.institute/blog/colleen-m…

Sam Bowman (@sleepinyourhat)

🚨📄 Following up on 'LMs Don't Always Say What They Think', Miles Turpin et al. now have an intervention that dramatically reduces the problem! 📄🚨

It's not a perfect solution, but it's a simple method with few assumptions and it generalizes *much* better than I'd expected.

Julian Michael (@_julianmichael_)

Check out our latest work on reducing unfaithfulness in chain of thought! Turns out you can get a long way just by training the model to output consistent explanations even in the presence of spurious biasing features that ~tempt~ the model.

Miles Turpin (@milesaturpin)

🚀New paper!🚀

Chain-of-thought (CoT) prompting can give misleading explanations of an LLM's reasoning, due to the influence of unverbalized biases. We introduce a simple unsupervised consistency training method that dramatically reduces this, even on held-out forms of bias.
🧵

NYU Data Science (@NYUDataScience)

Two new preprints by CDS Jr Research Scientist david rein and CDS Research Scientist Julian Michael, working with CDS Assoc. Prof. Sam Bowman, aim to enhance the reliability of AI systems through innovative debate methodologies and new benchmarks.

nyudatascience.medium.com/pioneering-ai-…

akbir. (@akbirkhan)

How can we check LLM outputs in domains where we are not experts?

We find that non-expert humans answer questions better after reading debates between expert LLMs.
Moreover, human judges are more accurate as experts get more persuasive. 📈
github.com/ucl-dark/llm_d…

Sam Bowman (@sleepinyourhat)

I gave a talk! You can watch it!

Covering: Scalable oversight, AI-AI debate, hard QA datasets, and getting truthful answers out of AI systems in domains we don't know much about.

EleutherAI (@AiEleuther)

Looking for something to check out on the last day of #NeurIPS2023? Come hang out with EleutherAI at SoLaR @ NeurIPS2023 (@solarneurips).

Stella Biderman (@BlancheMinerva) is speaking on a panel, and Jacob Pfau, Alex Infanger, Abhay Sheshadri, Ayush Panda, Curtis Huebner, and Julian Michael have a poster.

Room R06-R09
Miles Turpin (@milesaturpin)

For anyone at NeurIPS, I'll be presenting this today! Come chat about (un)faithfulness in chain-of-thought and how we can improve it!
Find me in Great Hall & Hall B1+B2 (level 1) #1525 from 5-7pm tonight!
neurips.cc/virtual/2023/p…

Alisa Liu (@alisawuffles)

We'll be presenting this as a poster at 4pm today!!🕓 Come hear what ambiguity is all about, how well LMs handle it (hint: much room for improvement!), and how it relates to annotator disagreement.

KenOliveLab (@KenOliveLab)

We are excited to share our preprint on the effects of RAS-GTP inhibition in preclinical models of PDAC, using RMC-7977, a potent inhibitor of GTP-bound RAS proteins. This preclinical agent is related to RMC-6236, currently in clinical trials. biorxiv.org/content/10.110…

Julian Michael (@_julianmichael_)

This could change the game for pancreatic cancer.

KRAS mutates in >80% of pancreatic cancers. KRAS has long been 'undruggable.' Five-year survival is <10%.

They test a new KRAS inhibitor in human & mouse models. No chemo. Few side effects. Cancer cells die. Normal cells don't.
