Julian Michael (@_julianmichael_) Twitter Tweets • TwiCopy

david rein

@idavidrein

1 week ago

Is GPQA garbage?

A couple weeks ago, typedfemale pointed out some mistakes in a GPQA question, so I figured this would be a good opportunity to discuss how we interpret benchmark scores, and what our goals should be when creating benchmarks.

Cooper

@peakcooper

1 week ago

llama 3 is a snitch...

AI Objectives Institute

@AIObjectives

1 month ago

We are thrilled to announce Colleen McKenzie as our new Executive Director.

Read about it from Deger Turan: ai.objectives.institute/blog/colleen-m…

12 likes

Sam Bowman

@sleepinyourhat

2 months ago

🚨📄 Following up on 'LMs Don't Always Say What They Think', Miles Turpin et al. now have an intervention that dramatically reduces the problem! 📄🚨

It's not a perfect solution, but it's a simple method with few assumptions and it generalizes *much* better than I'd expected.

72 likes, 8 retweets

Julian Michael

@_julianmichael_

2 months ago

Check out our latest work on reducing unfaithfulness in chain of thought! Turns out you can get a long way just by training the model to output consistent explanations even in the presence of spurious biasing features that ~tempt~ the model.

28 likes

Miles Turpin

@milesaturpin

2 months ago

🚀New paper!🚀

Chain-of-thought (CoT) prompting can give misleading explanations of an LLM's reasoning, due to the influence of unverbalized biases. We introduce a simple unsupervised consistency training method that dramatically reduces this, even on held-out forms of bias.
🧵

david rein

@idavidrein

2 months ago

x.com/idavidrein/sta…

NYU Data Science

@NYUDataScience

3 months ago

Two new preprints by CDS Jr Research Scientist david rein and CDS Research Scientist Julian Michael, working with CDS Assoc. Prof. Sam Bowman, aim to enhance the reliability of AI systems through innovative debate methodologies and new benchmarks.

nyudatascience.medium.com/pioneering-ai-…

6 likes

akbir.

@akbirkhan

3 months ago

How can we check LLM outputs in domains where we are not experts?

We find that non-expert humans answer questions better after reading debates between expert LLMs.
Moreover, human judges are more accurate as experts get more persuasive. 📈
github.com/ucl-dark/llm_d…

Sam Bowman

@sleepinyourhat

5 months ago

I gave a talk! You can watch it!

Covering: Scalable oversight, AI-AI debate, hard QA datasets, and getting truthful answers out of AI systems in domains we don't know much about.

EleutherAI

@AiEleuther

5 months ago

Looking for something to check out on the last day of #NeurIPS2023? Come hang out with EleutherAI SoLaR @ NeurIPS2023

Stella Biderman is speaking on a panel, and Jacob Pfau, Alex Infanger, Abhay Sheshadri, Ayush Panda, Curtis Huebner, and Julian Michael have a poster

Room R06-R09

31 likes, 5 retweets

Sam Bowman

@sleepinyourhat

5 months ago

👇Come see our paper on ways chain-of-thought reasoning can get really sketchy!

28 likes, 1 retweet

Miles Turpin

@milesaturpin

5 months ago

For anyone at NeurIPS, I'll be presenting this today! Come chat about (un)faithfulness in chain-of-thought and how we can improve it!
Find me in Great Hall & Hall B1+B2 (level 1) #1525 from 5-7pm tonight!
neurips.cc/virtual/2023/p…

35 likes, 1 retweet

Alisa Liu

@alisawuffles

5 months ago

We'll be presenting this as a poster at 4pm today!!🕓 Come hear what ambiguity is all about, how well LMs handle it (hint: much room for improvement!), and how it relates to annotator disagreement #EMNLP2023

Julian Michael

@_julianmichael_

5 months ago

Woo! Thanks so much to the workshop committee 😄

42 likes

KenOliveLab

@KenOliveLab

5 months ago

We are excited to share our preprint on the effects of RAS-GTP inhibition in preclinical models of PDAC, using RMC-7977, a potent inhibitor of GTP-bound RAS proteins. This preclinical agent is related to RMC-6236, currently in clinical trials. biorxiv.org/content/10.110…

Julian Michael

@_julianmichael_

5 months ago

This could change the game for pancreatic cancer.

KRAS mutates in >80% of pancreatic cancers. KRAS has long been 'undruggable.' Five-year survival is <10%.

They test a new KRAS inhibitor in human & mouse models. No chemo. Few side effects. Cancer cells die. Normal cells don't.

6 likes, 0 retweets