Advisor Dossier: Prof. Arman Cohan — Yale University

Student: Weijia Zhang | M.S. CS, Yale University (Sep 2026 – May 2028, Thesis Track, Full Scholarship)
Assumed goals (inferred from CV): Industry-research 60% · Academia (PhD after M.S.) 40%
Report date: 2026-06-11

1. Executive Summary

Top critical risks/unknowns:

Research domain mismatch (high impact): Cohan’s lab focuses on scientific NLP, document understanding, and scholarly AI agents. Weijia’s background is in GUI/interactive agents, VLM agents, SFT pipelines, and RL. The overlap exists but requires an intentional pivot project.
M.S. advising bandwidth (unknown): Cohan has only 4–5 active PhD students. M.S. thesis advising may get deprioritized relative to PhD work. No confirmed evidence of M.S. students publishing first-author top-tier papers from this lab yet.
Frontier placement evidence thin (coverage gap): The lab is only 3 years old. Only 2 PhDs have graduated (both 2024), with one confirmed frontier full-time placement (Ansong Ni → Meta FAIR). That is not enough data to judge whether the lab has a consistent frontier-lab pipeline.
No confirmed NSF/NIH federal grants: Funding appears to rely on Yale institutional support, Google Research Scholar Award, and Roberts Innovation Fund. Grant runway beyond 2026 is not confirmed.

One-line verdict: Cohan is a high-quality, well-networked early-career advisor — but the research-domain fit for Weijia requires a deliberate intersection project, and the M.S. timeline (2 years) makes domain pivot riskier than it would be for a PhD student.

Strongest pros:

Lab produces top-tier publications (NeurIPS, ICLR, CVPR, ACL, EMNLP); current PhD students are exceptionally strong
Ongoing AI2 affiliation provides direct access to a major NLP research organization
Alumni and current-student network touches Meta FAIR, Anthropic, xAI, MSR, Google DeepMind
Cohan is accessible and early-career — likely high mentorship bandwidth for motivated M.S. students

Strongest cons:

Lab’s primary domain (scientific NLP, document understanding) is not Weijia’s primary strength (GUI agents, RL, VLM)
2-year M.S. timeline leaves little room to establish a new research direction
M.S. student placement track record from this lab is limited (Kejian Shi → Scale AI is the most documented M.S. placement so far)
PhD-level frontier-lab pipeline evidence is thin (1 full-time placement; undergrad-to-frontier placements are not precedent for PhD/M.S. trajectories)

Score snapshots:

Four-Dimension Fit Score: 71/100 → Proceed with caution
AI Industry Outcome (industry-research track): 67/100 → Proceed with caution

Coverage: Medium confidence (2 PhDs graduated, identity/role coverage ~80%, but M.S. coverage thin and attrition data unavailable)

Concrete next steps:

Request a meeting with Cohan before arriving at Yale to test whether there’s a viable intersection project (agent evaluation for science; RL for scientific instruction-following).
Talk to Yilun Zhao (current Yale NLP PhD, also from UIUC CS) and Yixin Liu (current PhD, has interned at Meta FAIR + Google DeepMind) to understand day-to-day advising, project ownership, and bandwidth.
Ask Cohan directly about his policy toward M.S. thesis students vs. PhD students and publication expectations.
Verify NSF/NIH grant status through Yale Research Administration records.
Parallel strategy: arrive at Yale with 2–3 potential advisors in mind; Cohan is worth pursuing if a viable project can be scoped in the first meeting.

2. Critical Problems First

#	Problem	Severity	Confidence	Evidence
1	Research domain mismatch	High	High	Cohan: scientific NLP, long-doc, scholarly AI. Weijia: GUI/VLM agents, RL, SFT pipelines.
2	M.S. advising bandwidth	Medium	Medium	No confirmed M.S. first-author top-tier paper from Cohan lab. Only 1 documented M.S. → industry outcome (Scale AI).
3	Thin PhD placement sample	Medium	High	Only 2 PhD graduates (both 2024). Meta FAIR + Zoom AI. No academia placements. No OpenAI/Anthropic PhD full-time.
4	Unconfirmed federal funding	Low-Medium	Low	No NSF/NIH grant numbers found in public records. Relies on internal Yale support + Google Research Scholar Award + AI2 affiliation.
5	Short M.S. timeline	Medium	High	2 years (2026–2028) is tight for domain pivot + thesis publication + internship.

3. Strong Pros and Strong Cons

Pros

Top-tier research output by current students: Yixin Liu (ICLR 2026, NeurIPS 2025, internships at Meta FAIR + Google DeepMind + MSR), Xiangru Tang (ICML 2025 Best Paper Runner-Up, WAIC Rising Star 2025, ICLR 2026, multiple Google/Microsoft connections). Working alongside these students provides real learning and collaboration opportunities.
AI2 affiliation (ongoing): Cohan retains a Faculty Research Scientist role at Allen Institute for AI. This is an active research relationship, not honorary. It opens access to AI2’s massive resources, data, and talent network (Iz Beltagy, Kyle Lo, Doug Downey, Noah Smith at UW, etc.).
Frontier lab alumni access: Ansong Ni (Meta FAIR RS), Hailey Schoelkopf (Anthropic MTS), 3 grads at xAI (Azerbayev, Tae, Deyuan Li). This is a warm-referral network that M.S. students could potentially access.
Google Research Scholar Award: External validation of lab quality; also signals Google connection.
Early-career = high accessibility: Cohan joined Yale in 2023; he has incentives to mentor strong students closely. He likely has more bandwidth than senior faculty.
Viable intersection projects exist: Scientific agent evaluation (combining Weijia’s agent taxonomy expertise from GUIAgentDebugger with Cohan’s SciVer/ReIFE evaluation methodology) is a natural fit.

Cons

Domain mismatch: Cohan’s core research is scientific NLP (SciBERT, SPECTER, SciRIFF, SciDQA). Weijia’s core expertise is VLM/GUI agents, RL-based training (VERL), SFT pipelines. Joining requires reframing research direction.
M.S. students may be second-priority: In most NLP labs, PhD students dominate the advisor’s attention. No evidence that Cohan’s M.S. students receive the same level of advising depth as his PhDs.
Lab is very young: Only 2 PhDs graduated. Distribution statistics are based on too small a sample. A single outlier (Ansong Ni, who had 4 industry internships before graduating) may over-represent the lab’s typical trajectory.
No confirmed Weijia-relevant M.S. placement: The only documented M.S. → industry outcome is Kejian Shi → Scale AI, which is a respectable but not frontier outcome. The other documented M.S. graduates (3) went on to PhD programs.
xAI concentration is unusual and hard to interpret: 3 BS grads going to xAI in sequence (2023–2025) may reflect a personal connection rather than a systematic pipeline — not guaranteed to continue.
Frontier goal requires two hops: Weijia is an M.S. student, and most frontier lab Research Scientist hiring targets PhD-level candidates. Realistically, M.S. → frontier requires either a PhD afterward (which Cohan can help with) or exceptional output during M.S. (possible but uncertain).

4. Alumni Outcomes and Graduation Windows

PhD Graduates

Name	Start yr	Grad yr	Confidence	First Role	Current Role	Outcome Type	Frontier?	Exit
Ansong Ni	2020	2024	High [resolved]	Research Scientist, Meta FAIR	Research Scientist, Meta FAIR	Industry-research	Yes (Meta FAIR)	Graduated
Linyong Nan	~2019	2024	High [resolved]	Sr. AI Scientist, Zoom AI	Sr. AI Scientist, Zoom AI	Industry-research	No	Graduated

Current PhD Students (Active, no graduation data yet)

Name	Start yr	Est. grad	Confidence	Research focus	Internship record
Yilun Zhao	2021	~2026	High	LLM evaluation, AI4Science, multimodal	Unknown
Yixin Liu	~2022	~2027	High	LLM eval, reward modeling, training	Meta FAIR, Google DeepMind, MSR
Xiangru Tang	~2021–22	~2026–27	High	LLM agents, AI scientists, bioinformatics	Google (3 teams), Microsoft, Eigen AI
Alan (Haoxin) Li	~2023–24	~2028	High	Efficiency, multimodality, retrieval	Unknown
Chengye Wang	~2023–24	~2028	Medium	Scientific NLP, multimodal	Unknown

M.S. Graduates (Known)

Name	Grad yr	Confidence	Placement	Outcome Type
Kejian Shi	2025	High	Scale AI	Industry-engineering
Chuhan Li	2025	High	PhD → UCSB (Xin Eric Wang)	Academia
Ziyao Shangguan	2025	High	PhD → Yale CS (robotics/NLP)	Academia
Heyuan Huang	2024	High	PhD → Johns Hopkins CLSP	Academia
Leyao Wang	~2025–26	Medium	Unknown	Unknown

B.S. Graduates / Undergrad Researchers

Name	Grad yr	Confidence	Placement	Outcome Type	Frontier?
Hailey Schoelkopf	2023	High	Anthropic (MTS)	Industry-research	Yes
Zhangir Azerbayev	2023	High	xAI (MTS)	Industry-research	Frontier-adjacent
Jake Tae	2024	High	xAI (MTS)	Industry-research	Frontier-adjacent
Stephen Yin	2024	High	PhD → UChicago (Stats)	Academia	No
Lj Flores	2024	High	MSc → McGill/Mila (Jackie Cheung)	Academia	No
Deyuan Li	2025	High	xAI (MTS)	Industry-research	Frontier-adjacent

AI2-era Mentees (Pre-Yale, collaborative)

Name	Context	Placement	Role
Wen Xiao	AI2 intern 2021 (UBC PhD)	Microsoft Azure AI	Researcher
Sean MacAvaney	Georgetown intern 2020	University of Glasgow	Senior Lecturer

5. Placement Distribution and Attrition Analysis

PhD cohort (n=2): Both graduates went to industry research; neither went to academia. The sample is too small for distribution analysis.

Upper tail: Ansong Ni → Meta FAIR RS (strong)
Lower tail: Linyong Nan → Zoom AI Sr. AI Scientist (respectable; not frontier)
Median: unknown (n=2)
Variance: high due to small sample

M.S. cohort (n=5 known): 3 went on to PhD programs, 1 to industry (Scale AI), and 1 is unknown.

No confirmed M.S. → frontier lab placements.
Mostly stepping-stone placements (PhD or respectable industry).

B.S. cohort (n=6 known): Strongly skewed toward xAI/Anthropic (frontier/frontier-adjacent) and PhD programs.

xAI pattern is consistent (3 grads in consecutive years, 2023–2025), though it may reflect personal connections rather than a systematic pipeline.
Anthropic: 1 placement.
Note: B.S. → frontier placements are not directly predictive of M.S./PhD → frontier placements, which are more competitive.

Attrition: No evidence of non-completions, forced exits, or bad quits. However, the lab is only 3 years old — there may not have been time for problematic exits to occur or become public. Coverage of attrition is low due to sample size.

Near-graduation unemployment/underemployment risk: No evidence of either, but the sample is too small to draw conclusions.

Frontier readiness estimate (for PhD-level): Limited but improving.

1 confirmed frontier full-time (Ni → Meta FAIR)
2 confirmed frontier internships from current students (Liu at Google DeepMind and Meta FAIR)
Undergrad-to-frontier pipeline (Anthropic, xAI) signals that Cohan’s network reaches these labs, but this does not automatically translate to PhD placements

6. Data Coverage Dashboard

Metric	Coverage	Confidence
Resolved alumni identity (PhD+MS)	~87% (13/15)	High
Verified first role after graduation (PhD+MS)	~80% (12/15)	High
Verified current role	~73% (11/15)	Medium
Role-family classification (high/medium confidence)	~87%	High
Frontier funnel evidence	1 full-time, 2 internships	Low-Medium
Founder/commercialization evidence	Roberts Innovation Fund (Cohan PI); no student founder evidence	Low
Verifiable attrition reason	~10% (mostly unknown; no confirmed bad quits)	Low
Near-graduation employment-status/latency	~40% (only Ni + Nan confirmed rapid placement)	Low

Overall coverage confidence: Medium

Critical metrics (resolved identity, first role) are high; but attrition, near-graduation status, and M.S.-specific coverage are low.
No automatic verdict downgrade from coverage gate (coverage is medium, not low).
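
The percentages in the dashboard rows that report a fraction can be rechecked from their counts. The snippet below is a minimal sketch; the names `COVERAGE_COUNTS` and `coverage_pct` are illustrative and not part of the report's methodology, and only the three rows that state an explicit numerator and denominator are included.

```python
# Counts copied from the dashboard rows that report a fraction.
# The denominator of 15 is the PhD+MS count used by the report.
COVERAGE_COUNTS = {
    "Resolved alumni identity": (13, 15),
    "Verified first role after graduation": (12, 15),
    "Verified current role": (11, 15),
}


def coverage_pct(found: int, total: int) -> int:
    """Coverage as a whole-number percentage."""
    return round(100 * found / total)


for metric, (found, total) in COVERAGE_COUNTS.items():
    print(f"{metric}: ~{coverage_pct(found, total)}% ({found}/{total})")
```

The rounded values match the ~87%, ~80%, and ~73% figures in the table.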

What missing data would most change the verdict:

M.S. student publication track record (first-author top-tier papers from M.S. advisees)
Current PhD students’ actual graduation placements (Yilun Zhao, Yixin Liu, Xiangru Tang are expected ~2026–27 — these will be the most informative data points)
NSF/NIH grant status and runway
Any off-the-record information about M.S. advising experience and whether Cohan truly invests in thesis students

7. Four-Dimension Risk and Fit Assessment

Goal weights (blended: 60% industry-research + 40% academia):

Survival: 27 · Academic: 27 · Industry: 31 · Happiness: 15

Dimension	Score	Evidence	Confidence
Survival	75	Full Scholarship (institutional, not advisor-dependent); Google Research Scholar; AI2 affiliation; Yale support. No confirmed federal grants but lab is active.	Medium
Academic outcome	68	Current PhDs publish at NeurIPS/ICLR/CVPR/ACL. Strong PhD-application support potential. But: 0 faculty placements; domain mismatch for Weijia; M.S. publication rate unknown.	Medium
Industry outcome	70	Ni → Meta FAIR (1 frontier full-time); xAI pipeline (3 grads); Anthropic (1 grad); current student internships at Google DeepMind + Meta FAIR. M.S. → frontier unclear.	Medium-Low
Happiness	72	No negative culture signals; active productive lab; accessible early-career advisor; small NLP lab community at Yale. Domain pivot required.	Low

Four-Dimension Fit Score: 75 × 0.27 + 68 × 0.27 + 70 × 0.31 + 72 × 0.15 = 20.25 + 18.36 + 21.70 + 10.80 = **71.1/100**

Base verdict: Proceed with caution (50–74 range)
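
The weighted sum above can be reproduced in a few lines of Python. This is a minimal sketch: the weights and scores are the ones listed in this section, while the names `WEIGHTS`, `SCORES`, and `fit_score` are illustrative assumptions rather than part of the report's tooling.

```python
# Goal weights (percent) and dimension scores copied from this section.
# Variable and function names are illustrative assumptions.
WEIGHTS = {"survival": 27, "academic": 27, "industry": 31, "happiness": 15}
SCORES = {"survival": 75, "academic": 68, "industry": 70, "happiness": 72}


def fit_score(scores: dict, weights: dict) -> float:
    """Weighted average of dimension scores; weights are percentages."""
    total_weight = sum(weights.values())
    if total_weight != 100:
        raise ValueError(f"weights must sum to 100, got {total_weight}")
    return sum(scores[k] * weights[k] for k in weights) / total_weight


score = fit_score(SCORES, WEIGHTS)
print(f"{score:.1f}/100")  # 71.1/100

# 50-74 is the "Proceed with caution" band stated in the report.
in_caution_band = 50 <= score < 75
print("Proceed with caution" if in_caution_band else "outside the 50-74 band")
```

Because the weights are kept as a separate table, the same function can be rerun with a different goal blend (for example, a heavier academia weighting) to see how sensitive the verdict is to the assumed 60/40 split.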

8. AI Industry Outcome Scorecard (Industry-Research Track)

Category	Weight	Score	Evidence
Frontier placement evidence	35	23	Ansong Ni → Meta FAIR RS (1 verified full-time frontier PhD); Hailey Schoelkopf → Anthropic (1 BS graduate, provisional signal); 3 xAI (BS-level, frontier-adjacent). Current PhD Yixin Liu interned at Meta FAIR + Google DeepMind — not yet graduated.
Internship-to-offer conversion	20	13	Ansong Ni: 4 internships → Meta FAIR full-time (strong evidence of systematic internship strategy). Liu has 3 industry-lab internships (2 at frontier labs: Meta FAIR and Google DeepMind, plus MSR), with conversion pending graduation. No confirmed conversion failure cases.
Network access to hiring teams	20	14	AI2 affiliation (Beltagy, Lo, Downey); Ansong Ni at Meta FAIR (warm referral channel); Google Research Scholar (Google connection); Schoelkopf at Anthropic; xAI grads. Strong but not dense.
Project relevance to target teams	15	9	Cohan’s scientific NLP and agents are relevant to research/applied science teams at labs. Less directly relevant to pretraining, post-training RL, or core alignment teams. Weijia’s SFT/RL background could fill this gap if positioned right.
Geography and visa feasibility	10	8	Yale in New Haven, CT. Reasonable US location. On-campus recruiting not as strong as Bay Area but manageable.

Industry-research track total: 67/100 → Proceed with caution (no cap at 75+)
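
The track total can be checked by summing the category scores. The sketch below assumes each score is the number of points earned out of that category's weight, which is an inference from the fact that the scores sum to the stated total; the name `SCORECARD` is illustrative.

```python
# (category, weight, score) rows copied from the scorecard above.
# ASSUMPTION: each score is points earned out of the category weight.
SCORECARD = [
    ("Frontier placement evidence", 35, 23),
    ("Internship-to-offer conversion", 20, 13),
    ("Network access to hiring teams", 20, 14),
    ("Project relevance to target teams", 15, 9),
    ("Geography and visa feasibility", 10, 8),
]

# Sanity checks: weights sum to 100 and no score exceeds its weight.
assert sum(w for _, w, _ in SCORECARD) == 100
assert all(0 <= s <= w for _, w, s in SCORECARD)

total = sum(s for _, _, s in SCORECARD)
print(f"Industry-research track total: {total}/100")

# Share of available points earned per category, to show where the gap is.
for name, weight, score in SCORECARD:
    print(f"{name}: {score}/{weight} ({score / weight:.0%})")
```

Under that reading, project relevance earns the smallest share of its available points (9 of 15), which is consistent with the domain mismatch flagged earlier.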

Verified Frontier Placement Table

Name	Lab Role	Dest.	Frontier Lab?	Confidence	Notes
Ansong Ni	PhD graduate (2024)	Meta FAIR	Yes	High [resolved]	Research Scientist; had 4 internships prior
Hailey Schoelkopf	B.S. graduate (2023)	Anthropic (MTS)	Yes	High [resolved]	B.S.-level; not predictive of PhD track
Yixin Liu	Current PhD	Meta FAIR intern + Google DeepMind intern + MSR intern	Frontier internships	High	Not yet graduated; conversion pending
Xiangru Tang	Current PhD	Google (3 teams), Microsoft	Frontier internships	High	Not yet graduated

Frontier gate result:

Verified frontier full-time PhDs: 1 (Ni → Meta FAIR)
Verified frontier internships: 2+ (Liu at GDM + Liu at Meta FAIR)
Gate: 1 full-time < 2 required for unrestricted Strong Fit; 2 internships < 3 required with documented conversion
Result: Frontier gate caps verdict at Proceed with caution ✓
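
The gate logic can be written down explicitly. This is a sketch under stated assumptions: the thresholds (2 full-time placements, 3 internships with documented conversion) come from the gate line above, but treating them as two alternative routes joined by "or" is an assumption, as are the names `FrontierEvidence` and `frontier_gate_passes` and the choice to count Ni's internship-to-Meta FAIR path as a documented conversion.

```python
from dataclasses import dataclass


@dataclass
class FrontierEvidence:
    full_time_phd: int           # verified frontier full-time PhD placements
    internships: int             # verified frontier internships
    conversion_documented: bool  # an internship-to-offer conversion is on record


# Thresholds quoted in the gate line above.
MIN_FULL_TIME = 2
MIN_INTERNSHIPS = 3


def frontier_gate_passes(ev: FrontierEvidence) -> bool:
    # ASSUMPTION: either route alone is enough to lift the cap.
    via_full_time = ev.full_time_phd >= MIN_FULL_TIME
    via_internships = ev.internships >= MIN_INTERNSHIPS and ev.conversion_documented
    return via_full_time or via_internships


# Counts from the gate result above; conversion flag is an assumption (Ni).
cohan_lab = FrontierEvidence(full_time_phd=1, internships=2, conversion_documented=True)
verdict_cap = None if frontier_gate_passes(cohan_lab) else "Proceed with caution"
print(verdict_cap)
```

With the current counts neither route is satisfied, so the cap applies. One additional verified full-time frontier placement from the 2026–27 graduating students would be enough to lift it under this reading.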

9. Verdict and Score Reconciliation

Gate	Result	Verdict cap
Four-Dimension Fit Score: 71.1	Proceed with caution range	Proceed with caution
Industry-research track: 67/100	50–74 range	Proceed with caution
Frontier gate: 1 full-time, 2 internships	Below Strong Fit thresholds	Proceed with caution
Coverage: Medium	No automatic downgrade	—

Final verdict: ⚠️ Proceed with caution (71/100)

10. Personalized Fit (Weijia Zhang × Arman Cohan)

Research Overlap

Most Promising Intersection Projects

Scientific AI agent evaluation framework: Weijia’s GUIAgentDebugger taxonomy (4 categories, 29 subtypes of agent failures) could be adapted to scientific agent tasks (hypothesis generation, literature search, experiment planning). Cohan’s SciVer + IRIS direction needs robust evaluation methodology. This is the strongest pitch.
RL for scientific instruction-following: Combine Weijia’s RL/SFT pipeline experience with Cohan’s SciRIFF dataset and ReIFE evaluation work. Produce a scientific agent that can follow complex multi-step instructions over literature.
Multimodal scientific RAG agents: Leverage Weijia’s VLM agent background + Cohan’s retrieval/embedding work to build agents that navigate multimodal scientific documents (tables, figures, text). Connects to SciVer (ACL 2025) direction.

Skill Match

Weijia’s strengths that fit Cohan’s lab: Python/PyTorch infrastructure, SFT data pipeline engineering, agent framework development (LangGraph, MCP), VLM/LLM engineering, RL training.
Gap to fill: Scientific NLP literature (SciDQA, QASPER, summarization), scholarly document processing. These are learnable in the first semester.
Weijia’s unique edge: MSRA internship (VLM agents for production systems), OpenManus-RL contributor (60K stars), GUIAgentDebugger (novel agent failure taxonomy). This is a strong candidate profile for Cohan’s lab.

Potential Friction Points

Scope narrowing: Cohan may ask Weijia to focus on scientific domains (biology, chemistry, medicine), which may feel limiting compared to general-purpose agent work.
M.S. timeline pressure: 2 years with thesis requirement. Starting a new research direction risks not having publishable results by year 2.
PhD students get priority: Yixin Liu and Xiangru Tang are near graduation and likely absorb significant advisor bandwidth. A new M.S. student may not get the same level of direct engagement.
Shared UIUC → Yale path with Yilun Zhao: This is a natural conversation starter with Yilun Zhao, but a single example does not establish a “UIUC → Yale NLP” pipeline, and Weijia should not assume one exists for M.S. students.

Network Complementarity

Cohan’s network (AI2, Meta FAIR, Anthropic, Google Research, xAI) is highly complementary to Weijia’s existing network (MSRA, OpenManus/ByteDance adjacent, Tencent WeChat).
Working with Cohan would add US-based frontier lab connections that Weijia’s current network doesn’t cover strongly.

One-line fit verdict

Weijia is a strong candidate for Cohan’s lab, provided the pitch is a scientific AI agent evaluation project; the main risks are the short M.S. timeline and the domain pivot cost.

11. Alumni Impact and Connection Mapping (Prioritized)

Name	Relation	Role	Why They Matter	Channel
Yilun Zhao	Current PhD (Yale NLP), UIUC alumnus	PhD candidate	Same institution, same lab, UIUC background — best person to get real inside perspective on Cohan’s advising style	In person at Yale / email
Yixin Liu	Current PhD (Yale NLP)	PhD candidate, former Meta FAIR/GDM/MSR intern	Can speak to internship pipeline, what it takes to land at frontier labs from Cohan’s lab	In person at Yale / email
Xiangru Tang	Current PhD (Yale NLP)	PhD candidate, multiple awards	Can speak to publication trajectory, advisor interaction, and startup/industry connections	In person at Yale / email
Ansong Ni	PhD alumnus (2024)	Research Scientist, Meta FAIR	Best source on: what Cohan’s advising is really like, how the internship pipeline works, realistic frontier lab placement probability	LinkedIn / email
Hailey Schoelkopf	B.S. alumna (2023)	Anthropic MTS	Can speak to Cohan’s network reach to Anthropic; but note B.S. → MTS is a different track	LinkedIn
Jake Tae	B.S. alumnus (2024)	xAI MTS	Can speak to xAI pipeline specifics	LinkedIn

Connection strength: Yilun Zhao, Yixin Liu, Xiangru Tang = direct (same lab, will be at Yale with Weijia). Ansong Ni = adjacent. Others = adjacent to weak.

12. Funding and Resources

Source	Amount	Status	Runway
Yale M.S. Full Scholarship (Weijia’s)	Institutional	Confirmed	2 years (not advisor-dependent)
Google Research Scholar Award	~$60–100K (typical range)	Confirmed	Unknown expiration
Roberts Innovation Fund (Yale)	Seed grant (2×)	Confirmed	Short-term seed
AI2 Faculty Research Scientist	Non-financial affiliation	Ongoing	Ongoing
Unnamed health startup collaboration	Unknown	Postdoc ad posted Jul 2024	Unknown
NSF / NIH grants	Unknown	Not confirmed in public records	Unknown

Key funding risk: If Cohan does not have active federal grants, M.S. thesis research infrastructure (compute, data) depends on Yale institutional resources and AI2 access. For a scientific NLP lab this is typically manageable (cloud APIs, open datasets), but compute-intensive RL training (Weijia’s specialty) may require confirmation.

Recommendation: Ask Cohan directly what compute resources are available for thesis students, especially if Weijia wants to continue RL-scale experiments.

13. Academic Profile

Position: Assistant Professor, Yale CS (Jan 2023). Faculty Research Scientist, AI2 (ongoing). Affiliated with Yale Wu Tsai Institute and Yale School of Medicine.

Education: PhD, Georgetown University, 2018. Advisor: Nazli Goharian. Dissertation award: Harold N. Glassman Distinguished Doctoral Dissertation Award in Science (2019).

Prior: Research Scientist at AI2 (2018–2022); Affiliate Assistant Professor at University of Washington (2021–2022).

Citation metrics (approx.): ~27,000+ Google Scholar citations [Tier B, provisional]. Semantic Scholar: ~104 papers, ~1,860 highly influential citations. H-index: not confirmed (estimated 35–45 range based on citation volume; verify directly at scholar.google.com/citations?user=baI7IY0AAAAJ).

Landmark papers:

Longformer (2020): ~4,700+ citations; foundational efficient transformer [7]
SciBERT (2019): thousands of citations; canonical scientific NLP model [9]
SPECTER (ACL 2020): widely used scientific paper embeddings [8]

Most high-citation work is from AI2 era (2018–2022). Yale-era publications (2023–2026) are more applied and evaluation-focused, consistent with building a new lab.

Awards: EMNLP Best Long Paper (2017); EACL Outstanding Paper; COLING Honorable Mention; Google Research Scholar Award; Roberts Innovation Award (Yale, 2×).

Service: Area Chair: ACL 2020, ICLR 2021, NAACL 2021. Organizer: SciNLP, SDP workshop series.

14. Research Gaps (What We Don’t Know)

H-index / i10-index: Must be read directly from Google Scholar; not found in search results.
Federal grant portfolio: No NSF/NIH award numbers confirmed. This is a meaningful unknown for lab longevity.
M.S. thesis publication rate: No documented first-author ACL/EMNLP/NeurIPS/ICLR papers from M.S. advisees. Absence of evidence ≠ evidence of absence.
Cohan’s advising style for M.S. vs. PhD students: Lab website doesn’t distinguish. Must ask directly or ask current students.
Health startup collaboration: Name not public; nature and duration of collaboration unknown.
Yilun Zhao, Xiangru Tang graduation placements: Expected ~2026 (Zhao) and ~2026–27 (Tang). These will be the most informative placement data points in the next 12 months.
Full postdoc roster: One postdoc position was posted July 2024; hired candidate and current status unknown.

15. Questions to Ask

Questions for Prof. Cohan

What major problems is the lab trying to solve, and why is it positioned to win (especially vs. larger groups such as AI2, Stanford NLP, and CMU)?
Do you advise M.S. thesis students with the same depth as PhD students? What does a typical week of M.S. thesis advising look like?
What would a realistic first project be for me, given my background in GUI agents and RL? Are compute and data already in place?
What is the funding model for M.S. thesis students? What compute resources can I access?
What is your authorship policy? Do M.S. students get first-author opportunities?
What is your internship policy during the M.S.? Can I do a summer internship while working on the thesis?
What is the typical time-to-completion for your M.S. thesis students? Do students typically publish before defending?
How does the AI2 affiliation work in practice — can thesis students access AI2 resources or collaborate with AI2 researchers?
What happens if the initial project direction isn’t working — how do you pivot?

Questions for Current/Former Students (Yilun Zhao, Yixin Liu, Ansong Ni)

What is day-to-day life like in the lab — how often does Arman engage directly with your work?
Are meeting commitments reliable, and how fast does he review drafts or give feedback?
How does he handle it when projects fail or get scooped?
What’s his bandwidth like now that the lab is 3 years old — is he more or less available than when you started?
Do M.S. students get as much attention as PhD students in your experience?
How strong is placement support — does he make calls for internships or jobs?
Is there anything you wish you knew before joining the lab?
Are there any students who left or had difficulties? What happened?

High-Uncertainty, High-Impact Verification Questions

Ask Yale SEAS registrar or current students: Are there any M.S. thesis students from Cohan’s lab who published first-author at ACL/EMNLP/NeurIPS?
Check NSF Award Search (nsf.gov/awardsearch) for Arman Cohan — do any results appear?
Ask Ansong Ni (LinkedIn): Did your four pre-graduation internships result from Arman’s introductions, or were they self-sourced?
Ask Yixin Liu: Did the Meta FAIR and Google DeepMind internships come through Arman’s network, and does he actively advocate for students?

16. 12–24 Month Career Plan (Contingency)

Path A: Work with Cohan (Recommended under conditions)

Condition: Secure a clear intersection project scope in a pre-arrival meeting.

Month 1–3: Join Yale NLP, read Cohan’s recent papers (SciRIFF, ReIFE, IRIS), build relationships with Yilun Zhao and Yixin Liu.
Month 3–6: Launch intersection project (e.g., scientific agent evaluation framework leveraging GUIAgentDebugger taxonomy). Target ACL 2027 or EMNLP 2027.
Month 6–12: Submit to top venue. Simultaneously begin internship search (target: AI2, Meta FAIR Applied Science, Google Research, Anthropic).
Month 12–18: Complete internship (Summer 2027). Return, write thesis draft.
Month 18–24: Defend thesis. Decide: PhD application (using Cohan’s strong recommendation + publication) or industry RS conversion from internship.
Contingency if project stalls: Pivot to a more core Cohan topic (scientific QA, RAG evaluation) at Month 6 to protect publication timeline.
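
To sanity-check the plan against the academic calendar, the month ranges above can be mapped to calendar dates. This is a rough sketch, assuming month 1 is September 2026 (the program start given in the report header); the phase labels are shortened paraphrases, and `month_to_date` is an illustrative helper.

```python
from datetime import date

# ASSUMPTION: month 1 is September 2026, the program start in the header.
PROGRAM_START = date(2026, 9, 1)

# (label, first month, last month) taken from the Path A bullets above.
PHASES = [
    ("Onboarding and reading", 1, 3),
    ("Launch intersection project", 3, 6),
    ("Submit paper, internship search", 6, 12),
    ("Internship, thesis draft", 12, 18),
    ("Defend thesis, decide next step", 18, 24),
]


def month_to_date(month_index: int, start: date = PROGRAM_START) -> date:
    """First day of the calendar month for a 1-based program month."""
    offset = start.month - 1 + (month_index - 1)
    return date(start.year + offset // 12, offset % 12 + 1, 1)


for label, first, last in PHASES:
    a, b = month_to_date(first), month_to_date(last)
    print(f"Month {first}-{last}: {a:%b %Y} to {b:%b %Y}  ({label})")
```

Under this assumption, Summer 2027 falls at roughly months 10–12 and the May 2028 program end is month 21, so the internship and defense windows in the plan are worth confirming against Yale's actual calendar.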

Path B: Different Advisor at Yale

If Cohan meeting reveals domain mismatch is too severe or M.S. bandwidth is limited:

Explore: Dragomir Radev’s former students who may now be faculty at Yale; or Rex Ying (GNN/agent work); or Tesca Fitzgerald (robotics/interactive agents, more aligned with GUI agents).
Cohan could still be a committee member without being primary advisor.

Path C: Parallel Outreach Before Arriving

Before arriving at Yale in August 2026, email Cohan with a research pitch document (1 page) describing the scientific agent evaluation project idea. If he responds positively and proposes a concrete project, confidence in the working relationship increases significantly.

Sources

#	Source	Tier	URL
1	Yale NLP Lab – Home & Team	A	https://nlp.cs.yale.edu/
2	Yale Engineering Faculty – Arman Cohan	A	https://engineering.yale.edu/research-and-faculty/faculty-directory/arman-cohan
3	Google Scholar – Arman Cohan	B	https://scholar.google.com/citations?user=baI7IY0AAAAJ
4	Semantic Scholar – Arman Cohan	B	https://www.semanticscholar.org/author/Arman-Cohan/2527954
5	Georgetown CS – Cohan joins Yale	A	https://cs.georgetown.edu/news-story/arman-cohan-joins-yale-as-assistant-professor/
6	Yale Ventures – Reimagining Research	A	https://ventures.yale.edu/news/reimagining-research-arman-cohan-ai-agents-mentorship-and-scientific-discovery
7	Longformer – arXiv	B	https://arxiv.org/abs/2004.05150
8	SPECTER – ACL Anthology	B	https://aclanthology.org/2020.acl-main.207/
9	SciBERT – arXiv	B	https://arxiv.org/abs/1903.10676
10	Ansong Ni – Personal site + LinkedIn	C	https://niansong1996.github.io/
11	Yixin Liu – Homepage	C	https://yixinl7.github.io/
12	Xiangru Tang – Homepage	C	https://xiangrutang.github.io/
13	Yilun Zhao – Homepage	C	https://yilunzhao.github.io/
14	Hailey Schoelkopf – LinkedIn	C	https://www.linkedin.com/in/hailey-schoelkopf-070361286/
15	Jake Tae – LinkedIn	C	https://www.linkedin.com/in/jaketae/
16	Zhangir Azerbayev – LinkedIn	C	https://www.linkedin.com/in/zhangir-azerbayev-314ab21b8/
17	Linyong Nan – LinkedIn	C	https://www.linkedin.com/in/linyong-nan-b0b573130/
18	ScholarNexus – Arman Cohan	B/C	https://scholarnexus.ai/supervisor/Arman_Cohan?id=92f2f3ea-0860-436d-941a-26a344fae943
19	Roberts Innovation Fund – Yale SEAS	A	https://seas.yale.edu/news-events/news/roberts-innovation-fund-support-10-bold-seas-faculty-inventions
20	Wen Xiao – Homepage	C	https://wendy-xiao.github.io/
21	Alan (Haoxin) Li – Homepage	C	https://lihaoxin2020.github.io/
22	SciVer – ACL 2025	B	https://aclanthology.org/2025.acl-long.420/
23	Wu Tsai Institute – Arman Cohan	A	https://wti.yale.edu/profile/arman-cohan

Weijia (Charlie) Zhang