Advisor Dossier: Prof. Arman Cohan — Yale University

Student: Weijia Zhang | M.S. CS, Yale University (Aug 2026 – May 2028, Thesis Track, Full Scholarship)
Assumed goals (inferred from CV): Industry-research 60% · Academia (PhD after M.S.) 40%
Report date: 2026-06-11


1. Executive Summary

Top critical risks/unknowns:

  1. Research domain mismatch (high impact): Cohan’s lab focuses on scientific NLP, document understanding, and scholarly AI agents. Weijia’s background is in GUI/interactive agents, VLM agents, SFT pipelines, and RL. The overlap exists but requires an intentional pivot project.
  2. M.S. advising bandwidth (unknown): Cohan has only 4–5 active PhD students. M.S. thesis advising may get deprioritized relative to PhD work. No confirmed evidence of M.S. students publishing first-author top-tier papers from this lab yet.
  3. Frontier placement evidence thin (coverage gap): The lab is only 3 years old. Only 2 PhDs have graduated (both 2024), with one confirmed frontier full-time placement (Ansong Ni → Meta FAIR). That is not enough data to judge whether a consistent frontier-lab pipeline exists.
  4. No confirmed NSF/NIH federal grants: Funding appears to rely on Yale institutional support, Google Research Scholar Award, and Roberts Innovation Fund. Grant runway beyond 2026 is not confirmed.

One-line verdict: Cohan is a high-quality, well-networked early-career advisor — but the research-domain fit for Weijia requires a deliberate intersection project, and the M.S. timeline (2 years) makes domain pivot riskier than it would be for a PhD student.

Strongest pros:

  • Lab produces top-tier publications (NeurIPS, ICLR, CVPR, ACL, EMNLP); current PhD students are exceptionally strong
  • Ongoing AI2 affiliation provides direct access to a major NLP research organization
  • Alumni and current-student network touches Meta FAIR, Anthropic, xAI, MSR, Google DeepMind
  • Cohan is accessible and early-career — likely high mentorship bandwidth for motivated M.S. students

Strongest cons:

  • Lab’s primary domain (scientific NLP, document understanding) is not Weijia’s primary strength (GUI agents, RL, VLM)
  • 2-year M.S. timeline leaves little room to establish a new research direction
  • M.S. student placement track record from this lab is limited (Kejian Shi → Scale AI is the only documented M.S. industry placement so far)
  • Frontier lab pipeline evidence for PhD-level is thin (1 full-time; undergrad-to-frontier placements are not precedent for PhD/M.S. trajectories)

Score snapshots:

  • Four-Dimension Fit Score: 71/100 → Proceed with caution
  • AI Industry Outcome (industry-research track): 67/100 → Proceed with caution

Coverage: Medium confidence (2 PhDs graduated, identity/role coverage ~80%, but M.S. coverage thin and attrition data unavailable)

Concrete next steps:

  1. Request a meeting with Cohan before arriving at Yale to test whether there’s a viable intersection project (agent evaluation for science; RL for scientific instruction-following).
  2. Talk to Yilun Zhao (current Yale NLP PhD, also from UIUC CS) and Yixin Liu (current PhD, has interned at Meta FAIR + Google DeepMind) to understand day-to-day advising, project ownership, and bandwidth.
  3. Ask Cohan directly about his policy toward M.S. thesis students vs. PhD students and publication expectations.
  4. Verify NSF/NIH grant status through Yale Research Administration records.
  5. Parallel strategy: arrive at Yale with 2–3 potential advisors in mind; Cohan is worth pursuing if a viable project can be scoped in the first meeting.

2. Critical Problems First

| # | Problem | Severity | Confidence | Evidence |
|---|---|---|---|---|
| 1 | Research domain mismatch | High | High | Cohan: scientific NLP, long-doc, scholarly AI. Weijia: GUI/VLM agents, RL, SFT pipelines. |
| 2 | M.S. advising bandwidth | Medium | Medium | No confirmed M.S. first-author top-tier paper from Cohan lab. Only 1 documented M.S. → industry outcome (Scale AI). |
| 3 | Thin PhD placement sample | Medium | High | Only 2 PhD graduates (both 2024). Meta FAIR + Zoom AI. No academia placements. No OpenAI/Anthropic PhD full-time. |
| 4 | Unconfirmed federal funding | Low-Medium | Low | No NSF/NIH grant numbers found in public records. Relies on internal Yale + Google Research Scholar Award + AI2 affiliation. |
| 5 | Short M.S. timeline | Medium | High | 2 years (2026–2028) is tight for domain pivot + thesis publication + internship. |

3. Strong Pros and Strong Cons

Pros

  • Top-tier research output by current students: Yixin Liu (ICLR 2026, NeurIPS 2025, internships at Meta FAIR + Google DeepMind + MSR), Xiangru Tang (ICML 2025 Best Paper Runner-Up, WAIC Rising Star 2025, ICLR 2026, multiple Google/MS connections). Working alongside these students provides real learning and collaboration opportunities.
  • AI2 affiliation (ongoing): Cohan retains a Faculty Research Scientist role at Allen Institute for AI. This is an active research relationship, not honorary. It opens access to AI2’s massive resources, data, and talent network (Iz Beltagy, Kyle Lo, Doug Downey, Noah Smith at UW, etc.).
  • Frontier lab alumni access: Ansong Ni (Meta FAIR RS), Hailey Schoelkopf (Anthropic MTS), 3 grads at xAI (Azerbayev, Tae, Deyuan Li). This is a warm-referral network that M.S. students could potentially access.
  • Google Research Scholar Award: External validation of lab quality; also signals Google connection.
  • Early-career = high accessibility: Cohan joined Yale in 2023; he has incentives to mentor strong students closely. He likely has more bandwidth than senior faculty.
  • Viable intersection projects exist: Scientific agent evaluation (combining Weijia’s agent taxonomy expertise from GUIAgentDebugger with Cohan’s SciVer/ReIFE evaluation methodology) is a natural fit.

Cons

  • Domain mismatch: Cohan’s core research is scientific NLP (SciBERT, SPECTER, SciRIFF, SciDQA). Weijia’s core expertise is VLM/GUI agents, RL-based training (VERL), SFT pipelines. Joining requires reframing research direction.
  • M.S. students may be second-priority: In most NLP labs, PhD students dominate the advisor’s attention. No evidence that Cohan’s M.S. students receive the same level of advising depth as his PhDs.
  • Lab is very young: Only 2 PhDs graduated. Distribution statistics are based on too small a sample. A single outlier (Ansong Ni, who had 4 industry internships before graduating) may over-represent the lab’s typical trajectory.
  • No confirmed Weijia-relevant M.S. placement: The only documented M.S. industry outcome is Kejian Shi → Scale AI, which is respectable but not a frontier outcome. The other known M.S. graduates went on to PhD programs.
  • xAI concentration is unusual and hard to interpret: 3 BS grads going to xAI in sequence (2023–2025) may reflect a personal connection rather than a systematic pipeline — not guaranteed to continue.
  • Frontier goal requires two hops: Weijia is an M.S. student, and most frontier lab Research Scientist hiring targets PhD-level candidates. Realistically, M.S. → frontier requires either a PhD afterward (which Cohan can help with) or exceptional output during M.S. (possible but uncertain).

4. Alumni Outcomes and Graduation Windows

PhD Graduates

| Name | Start yr | Grad yr | Confidence | First Role | Current Role | Outcome Type | Frontier? | Exit |
|---|---|---|---|---|---|---|---|---|
| Ansong Ni | 2020 | 2024 | High [resolved] | Research Scientist, Meta FAIR | Research Scientist, Meta FAIR | Industry-research | Yes (Meta FAIR) | Graduated |
| Linyong Nan | ~2019 | 2024 | High [resolved] | Sr. AI Scientist, Zoom AI | Sr. AI Scientist, Zoom AI | Industry-research | No | Graduated |

Current PhD Students (Active, no graduation data yet)

| Name | Start yr | Est. grad | Confidence | Research focus | Internship record |
|---|---|---|---|---|---|
| Yilun Zhao | 2021 | ~2026 | High | LLM evaluation, AI4Science, multimodal | Unknown |
| Yixin Liu | ~2022 | ~2027 | High | LLM eval, reward modeling, training | Meta FAIR, Google DeepMind, MSR |
| Xiangru Tang | ~2021–22 | ~2026–27 | High | LLM agents, AI scientists, bioinformatics | Google (3 teams), MS, Eigen AI |
| Alan (Haoxin) Li | ~2023–24 | ~2028 | High | Efficiency, multimodality, retrieval | Unknown |
| Chengye Wang | ~2023–24 | ~2028 | Medium | Scientific NLP, multimodal | Unknown |

M.S. Graduates (Known)

| Name | Grad yr | Confidence | Placement | Outcome Type |
|---|---|---|---|---|
| Kejian Shi | 2025 | High | Scale AI | Industry-engineering |
| Chuhan Li | 2025 | High | PhD → UCSB (Xin Eric Wang) | Academia |
| Ziyao Shangguan | 2025 | High | PhD → Yale CS (robotics/NLP) | Academia |
| Heyuan Huang | 2024 | High | PhD → Johns Hopkins CLSP | Academia |
| Leyao Wang | ~2025–26 | Medium | Unknown | Unknown |

B.S. Graduates / Undergrad Researchers

| Name | Grad yr | Confidence | Placement | Outcome Type | Frontier? |
|---|---|---|---|---|---|
| Hailey Schoelkopf | 2023 | High | Anthropic (MTS) | Industry-research | Yes |
| Zhangir Azerbayev | 2023 | High | xAI (MTS) | Industry-research | Frontier-adjacent |
| Jake Tae | 2024 | High | xAI (MTS) | Industry-research | Frontier-adjacent |
| Stephen Yin | 2024 | High | PhD → UChicago (Stats) | Academia | No |
| Lj Flores | 2024 | High | MSc → McGill/Mila (Jackie Cheung) | Academia | No |
| Deyuan Li | 2025 | High | xAI (MTS) | Industry-research | Frontier-adjacent |

AI2-era Mentees (Pre-Yale, collaborative)

| Name | Context | Placement | Role |
|---|---|---|---|
| Wen Xiao | AI2 intern 2021 (UBC PhD) | Microsoft Azure AI | Researcher |
| Sean MacAvaney | Georgetown intern 2020 | University of Glasgow | Senior Lecturer |

5. Placement Distribution and Attrition Analysis

PhD cohort (n=2): Both graduates went to industry research; neither went to academia. A sample this small is insufficient for distribution analysis.

  • Upper tail: Ansong Ni → Meta FAIR RS (strong)
  • Lower tail: Linyong Nan → Zoom AI Sr. AI Scientist (respectable; not frontier)
  • Median: unknown (n=2)
  • Variance: high due to small sample

M.S. cohort (n=5 known): A mix of PhD programs (3), industry (1, Scale AI), and unknown (1).

  • No confirmed M.S. → frontier lab placements.
  • Mostly stepping-stone placements (PhD or respectable industry).

B.S. cohort (n=6 known): Strongly skewed toward xAI/Anthropic (frontier/frontier-adjacent) and PhD programs.

  • xAI pipeline appears real (3 consecutive grads 2023–2025).
  • Anthropic: 1 placement.
  • Note: B.S. → frontier placements are not directly predictive of M.S./PhD → frontier placements, which are more competitive.
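
The cohort tallies above can be cross-checked against the Section 4 tables. The following is a minimal Python sketch, assuming the outcome-type labels exactly as tabulated there; the names `outcomes` and `tally` are illustrative and not part of the original analysis.

```python
from collections import Counter

# Outcome types copied from the Section 4 tables (one entry per person).
outcomes = {
    "PhD": ["Industry-research", "Industry-research"],
    "M.S.": ["Industry-engineering", "Academia", "Academia", "Academia", "Unknown"],
    "B.S.": ["Industry-research", "Industry-research", "Industry-research",
             "Academia", "Academia", "Industry-research"],
}

def tally(cohort):
    """Count how many people in a cohort fall into each outcome type."""
    return Counter(outcomes[cohort])

for cohort in outcomes:
    print(cohort, len(outcomes[cohort]), dict(tally(cohort)))
```

Every cohort has six or fewer entries, so these counts describe individuals rather than a distribution.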

Attrition: No evidence of non-completions, forced exits, or bad quits. However, the lab is only 3 years old — there may not have been time for problematic exits to occur or become public. Coverage of attrition is low due to sample size.

Near-graduation unemployment/underemployment risk: No evidence of either, but the sample is too small to draw conclusions.

Frontier readiness estimate (for PhD-level): Limited but improving.

  • 1 confirmed frontier full-time (Ni → Meta FAIR)
  • 2 confirmed frontier internships from current students (Liu at Google DeepMind and Meta FAIR)
  • Undergrad-to-frontier pipeline (Anthropic, xAI) signals that Cohan’s network reaches these labs, but this does not automatically translate to PhD placements

6. Data Coverage Dashboard

| Metric | Coverage | Confidence |
|---|---|---|
| Resolved alumni identity (PhD+MS) | ~87% (13/15) | High |
| Verified first role after graduation (PhD+MS) | ~80% (12/15) | High |
| Verified current role | ~73% (11/15) | Medium |
| Role-family classification (high/medium confidence) | ~87% | High |
| Frontier funnel evidence | 1 full-time, 2 internships | Low-Medium |
| Founder/commercialization evidence | Roberts Innovation Fund (Cohan PI); no student founder evidence | Low |
| Verifiable attrition reason | ~10% (mostly unknown; no confirmed bad quits) | Low |
| Near-graduation employment-status/latency | ~40% (only Ni + Nan confirmed rapid placement) | Low |

Overall coverage confidence: Medium

  • Critical metrics (resolved identity, first role) are high; but attrition, near-graduation status, and M.S.-specific coverage are low.
  • No automatic verdict downgrade from coverage gate (coverage is medium, not low).

What missing data would most change the verdict:

  1. M.S. student publication track record (first-author top-tier papers from M.S. advisees)
  2. Current PhD students’ actual graduation placements (Yilun Zhao, Yixin Liu, Xiangru Tang are expected ~2026–27 — these will be the most informative data points)
  3. NSF/NIH grant status and runway
  4. Any off-the-record information about M.S. advising experience and whether Cohan truly invests in thesis students

7. Four-Dimension Risk and Fit Assessment

Goal weights (blended: 60% industry-research + 40% academia):

  • Survival: 27 · Academic: 27 · Industry: 31 · Happiness: 15

| Dimension | Score | Evidence | Confidence |
|---|---|---|---|
| Survival | 75 | Full Scholarship (institutional, not advisor-dependent); Google Research Scholar; AI2 affiliation; Yale support. No confirmed federal grants but lab is active. | Medium |
| Academic outcome | 68 | Current PhDs publish at NeurIPS/ICLR/CVPR/ACL. Strong PhD-application support potential. But: 0 faculty placements; domain mismatch for Weijia; M.S. publication rate unknown. | Medium |
| Industry outcome | 70 | Ni → Meta FAIR (1 frontier full-time); xAI pipeline (3 grads); Anthropic (1 grad); current student internships at Google DeepMind + Meta FAIR. M.S. → frontier unclear. | Medium-Low |
| Happiness | 72 | No negative culture signals; active productive lab; accessible early-career advisor; small NLP lab community at Yale. Domain pivot required. | Low |

Four-Dimension Fit Score: 75 × 0.27 + 68 × 0.27 + 70 × 0.31 + 72 × 0.15 = 20.25 + 18.36 + 21.70 + 10.80 = **71.1/100**

Base verdict: Proceed with caution (50–74 range)
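
The weighted sum above can be reproduced mechanically. The sketch below is a minimal Python illustration using the weights and scores stated in this section. Only the 50–74 "Proceed with caution" band is given in the report; the other two band labels in `verdict_band` are assumptions added for illustration.

```python
# Goal weights (percent) and dimension scores as stated in Section 7.
WEIGHTS = {"Survival": 27, "Academic": 27, "Industry": 31, "Happiness": 15}
SCORES = {"Survival": 75, "Academic": 68, "Industry": 70, "Happiness": 72}

def fit_score(scores, weights):
    """Weighted average of dimension scores; weights are percentages summing to 100."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    return sum(scores[d] * weights[d] for d in weights) / 100

def verdict_band(score):
    """Map a 0-100 score to a band. Only the 50-74 band is stated in the
    report; the other two labels are assumptions for illustration."""
    if score >= 75:
        return "Strong Fit (assumed label)"
    if score >= 50:
        return "Proceed with caution"
    return "Below caution range (assumed label)"

total = fit_score(SCORES, WEIGHTS)
print(f"{total:.1f}", verdict_band(total))
```

Keeping the weights as integer percentages makes it easy to confirm that they sum to 100 before the score is trusted.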


8. AI Industry Outcome Scorecard (Industry-Research Track)

| Category | Weight | Score | Evidence |
|---|---|---|---|
| Frontier placement evidence | 35 | 23 | Ansong Ni → Meta FAIR RS (1 verified full-time frontier PhD); Hailey Schoelkopf → Anthropic (1 BS graduate, provisional signal); 3 xAI (BS-level, frontier-adjacent). Current PhD Yixin Liu interned at Meta FAIR + Google DeepMind and has not yet graduated. |
| Internship-to-offer conversion | 20 | 13 | Ansong Ni: 4 internships → Meta FAIR full-time (strong evidence of systematic internship strategy). Liu has 3 frontier lab internships pending graduation. No confirmed conversion failure cases. |
| Network access to hiring teams | 20 | 14 | AI2 affiliation (Beltagy, Lo, Downey); Ansong Ni at Meta FAIR (warm referral channel); Google Research Scholar (Google connection); Schoelkopf at Anthropic; xAI grads. Strong but not dense. |
| Project relevance to target teams | 15 | 9 | Cohan’s scientific NLP and agents are relevant to research/applied science teams at labs. Less directly relevant to pretraining, post-training RL, or core alignment teams. Weijia’s SFT/RL background could fill this gap if positioned right. |
| Geography and visa feasibility | 10 | 8 | Yale in New Haven, CT. Reasonable US location. On-campus recruiting not as strong as Bay Area but manageable. |

Industry-research track total: 67/100 → Proceed with caution (no cap at 75+)
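
The track total is the plain sum of the category scores, and dividing each score by its weight shows which category costs the most points. A short Python sketch using the scorecard figures above; the `earned` ratio is an added illustration, not a metric used elsewhere in this report.

```python
# (weight, score) per category, copied from the Section 8 scorecard.
SCORECARD = {
    "Frontier placement evidence": (35, 23),
    "Internship-to-offer conversion": (20, 13),
    "Network access to hiring teams": (20, 14),
    "Project relevance to target teams": (15, 9),
    "Geography and visa feasibility": (10, 8),
}

max_total = sum(weight for weight, _ in SCORECARD.values())
track_total = sum(score for _, score in SCORECARD.values())

# Fraction of available points earned per category (illustrative ratio).
earned = {name: score / weight for name, (weight, score) in SCORECARD.items()}
weakest = min(earned, key=earned.get)

print(track_total, max_total, weakest)
```

On these numbers the lowest ratio falls on project relevance, which is consistent with the domain-mismatch risk raised in Section 2.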

Verified Frontier Placement Table

| Name | Lab Role | Dest. | Frontier Lab? | Confidence | Notes |
|---|---|---|---|---|---|
| Ansong Ni | PhD graduate (2024) | Meta FAIR | Yes | High [resolved] | Research Scientist; had 4 internships prior |
| Hailey Schoelkopf | B.S. graduate (2023) | Anthropic (MTS) | Yes | High [resolved] | B.S.-level; not predictive of PhD track |
| Yixin Liu | Current PhD | Meta FAIR intern + Google DeepMind intern + MSR intern | Frontier internships | High | Not yet graduated; conversion pending |
| Xiangru Tang | Current PhD | Google (3 teams), Microsoft | Frontier internships | High | Not yet graduated |

Frontier gate result:

  • Verified frontier full-time PhDs: 1 (Ni → Meta FAIR)
  • Verified frontier internships: 2+ (Liu at GDM + Liu at Meta FAIR)
  • Gate: 1 full-time < 2 required for unrestricted Strong Fit; 2 internships < 3 required with documented conversion
  • Result: Frontier gate caps verdict at Proceed with caution
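
The gate rule above can be written as a small predicate. This is a sketch under stated assumptions: the thresholds (2 full-time placements, or 3 internships with documented conversion) come from the bullets above, but the report does not say how the two conditions combine. Treating either one as sufficient is an assumption, and `frontier_gate` is a hypothetical helper name.

```python
def frontier_gate(full_time, internships, conversion_documented=False,
                  min_full_time=2, min_internships=3):
    """Return the verdict cap imposed by the frontier gate, or None if no cap.

    Assumption: meeting either threshold alone lifts the cap; the report
    does not state explicitly how the two conditions combine.
    """
    enough_full_time = full_time >= min_full_time
    enough_internships = internships >= min_internships and conversion_documented
    if enough_full_time or enough_internships:
        return None
    return "Proceed with caution"

# Counts from this section: 1 full-time placement, 2 internships.
print(frontier_gate(full_time=1, internships=2))
```

One more verified full-time frontier placement among the PhD students expected to graduate in 2026–27 would lift this cap under the stated thresholds.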

9. Verdict and Score Reconciliation

| Gate | Result | Verdict cap |
|---|---|---|
| Four-Dimension Fit Score: 71.1 | Proceed with caution range | Proceed with caution |
| Industry-research track: 67/100 | 50–74 range | Proceed with caution |
| Frontier gate: 1 full-time, 2 internships | Below Strong Fit thresholds | Proceed with caution |
| Coverage: Medium | No automatic downgrade | None |

Final verdict: ⚠️ Proceed with caution (71/100)
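
The reconciliation amounts to taking the most restrictive cap across the gates. The Python sketch below makes that explicit. It assumes "most restrictive cap wins", which matches the table but is not stated as a rule in the report, and the two-level `RANK` scale covers only the verdict labels that actually appear here.

```python
# Verdicts ordered from most to least restrictive. Only these two labels
# appear in the report; a fuller scale would be an assumption.
RANK = {"Proceed with caution": 0, "Strong Fit": 1}

# Verdict cap from each gate in the reconciliation table; None means no cap.
caps = {
    "Four-Dimension Fit Score": "Proceed with caution",
    "Industry-research track": "Proceed with caution",
    "Frontier gate": "Proceed with caution",
    "Coverage": None,
}

def reconcile(caps, default="Strong Fit"):
    """Return the most restrictive cap, or the default if no gate caps the verdict."""
    active = [cap for cap in caps.values() if cap is not None]
    return min(active, key=RANK.get, default=default)

final_verdict = reconcile(caps)
print(final_verdict)
```

Because three independent gates impose the same cap, lifting any one of them alone would not change the final verdict.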


10. Personalized Fit (Weijia Zhang × Arman Cohan)

Research Overlap

| Weijia’s Background | Cohan’s Research | Overlap Level |
|---|---|---|
| GUI/VLM agents, agent debugging (GUIAgentDebugger) | AI agents for scientific discovery (IRIS, SciMentor) | Moderate: same paradigm, different domain |
| RAG, vector DBs, retrieval | SPECTER, RouterRetriever, RAG | Strong: directly aligned |
| SFT data pipeline, post-training (MSRA Excel Copilot) | SciRIFF, instruction-following (ReIFE) | Moderate: different application domain |
| RL for agents (OpenManus-RL, VERL) | LLM evaluation, claim verification | Weak: different focus |
| LLM evaluation (GUIAgentDebugger error taxonomy) | ReIFE, SciVer, LLM-as-judge evaluation | Moderate-Strong |

Most Promising Intersection Projects

  1. Scientific AI agent evaluation framework: Weijia’s GUIAgentDebugger taxonomy (4 categories, 29 subtypes of agent failures) could be adapted to scientific agent tasks (hypothesis generation, literature search, experiment planning). Cohan’s SciVer + IRIS direction needs robust evaluation methodology. This is the strongest pitch.
  2. RL for scientific instruction-following: Combine Weijia’s RL/SFT pipeline experience with Cohan’s SciRIFF dataset and ReIFE evaluation work. Produce a scientific agent that can follow complex multi-step instructions over literature.
  3. Multimodal scientific RAG agents: Leverage Weijia’s VLM agent background + Cohan’s retrieval/embedding work to build agents that navigate multimodal scientific documents (tables, figures, text). Connects to SciVer (ACL 2025) direction.

Skill Match

  • Weijia’s strengths that fit Cohan’s lab: Python/PyTorch infrastructure, SFT data pipeline engineering, agent framework development (LangGraph, MCP), VLM/LLM engineering, RL training.
  • Gap to fill: Scientific NLP literature (SciDQA, QASPER, summarization), scholarly document processing. These are learnable in the first semester.
  • Weijia’s unique edge: MSRA internship (VLM agents for production systems), OpenManus-RL contributor (60K stars), GUIAgentDebugger (novel agent failure taxonomy). This is a strong candidate profile for Cohan’s lab.

Potential Friction Points

  1. Scope narrowing: Cohan may ask Weijia to focus on scientific domains (biology, chemistry, medicine), which may feel limiting compared to general-purpose agent work.
  2. M.S. timeline pressure: 2 years with thesis requirement. Starting a new research direction risks not having publishable results by year 2.
  3. PhD students get priority: Yixin Liu and Xiangru Tang are near graduation and likely absorb significant advisor bandwidth. A new M.S. student may not get the same level of direct engagement.
  4. Weijia is at UIUC → Yale (same path as Yilun Zhao): This could be a conversation starter with Yilun Zhao, but also means Weijia shouldn’t assume Cohan has a “UIUC → Yale NLP PhD” pipeline for M.S. students.

Network Complementarity

  • Cohan’s network (AI2, Meta FAIR, Anthropic, Google Research, xAI) is highly complementary to Weijia’s existing network (MSRA, OpenManus/ByteDance adjacent, Tencent WeChat).
  • Working with Cohan would add US-based frontier lab connections that Weijia’s current network doesn’t cover strongly.

One-line fit verdict

Weijia is a strong candidate for Cohan’s lab if she can pitch a scientific AI agent evaluation project; the main risks are the short M.S. timeline and the domain pivot cost.


11. Alumni Impact and Connection Mapping (Prioritized)

| Name | Relation | Role | Why They Matter | Channel |
|---|---|---|---|---|
| Yilun Zhao | Current PhD (Yale NLP), UIUC alumnus | PhD candidate | Same institution, same lab, UIUC background; best person to get real inside perspective on Cohan’s advising style | In person at Yale / email |
| Yixin Liu | Current PhD (Yale NLP) | PhD candidate, former Meta FAIR/GDM/MSR intern | Can speak to internship pipeline, what it takes to land at frontier labs from Cohan’s lab | In person at Yale / email |
| Xiangru Tang | Current PhD (Yale NLP) | PhD candidate, multiple awards | Can speak to publication trajectory, advisor interaction, and startup/industry connections | In person at Yale / email |
| Ansong Ni | PhD alumnus (2024) | Research Scientist, Meta FAIR | Best source on: what Cohan’s advising is really like, how the internship pipeline works, realistic frontier lab placement probability | LinkedIn / email |
| Hailey Schoelkopf | B.S. alumna (2023) | Anthropic MTS | Can speak to Cohan’s network reach to Anthropic; but note B.S. → MTS is a different track | LinkedIn |
| Jake Tae | B.S. alumnus (2024) | xAI MTS | Can speak to xAI pipeline specifics | LinkedIn |

Connection strength: Yilun Zhao, Yixin Liu, Xiangru Tang = direct (same lab, will be at Yale with Weijia). Ansong Ni = adjacent. Others = adjacent to weak.


12. Funding and Resources

| Source | Amount | Status | Runway |
|---|---|---|---|
| Yale M.S. Full Scholarship (Weijia’s) | Institutional | Confirmed | 2 years (not advisor-dependent) |
| Google Research Scholar Award | ~$60–100K (typical range) | Confirmed | Unknown expiration |
| Roberts Innovation Fund (Yale) | Seed grant (2×) | Confirmed | Short-term seed |
| AI2 Faculty Research Scientist | Non-financial affiliation | Ongoing | Ongoing |
| Unnamed health startup collaboration | Unknown | Postdoc ad posted Jul 2024 | Unknown |
| NSF / NIH grants | Unknown | Not confirmed in public records | Unknown |

Key funding risk: If Cohan does not have active federal grants, M.S. thesis research infrastructure (compute, data) depends on Yale institutional resources and AI2 access. For a scientific NLP lab this is typically manageable (cloud APIs, open datasets), but compute-intensive RL training (Weijia’s specialty) may require confirmation.

Recommendation: Ask Cohan directly what compute resources are available for thesis students, especially if Weijia wants to continue RL-scale experiments.


13. Academic Profile

Position: Assistant Professor, Yale CS (Jan 2023). Faculty Research Scientist, AI2 (ongoing). Affiliated with Yale Wu Tsai Institute and Yale School of Medicine.

Education: PhD, Georgetown University, 2018. Advisor: Nazli Goharian. Dissertation received the Harold N. Glassman Distinguished Doctoral Dissertation Award in Science (2019).

Prior: Research Scientist at AI2 (2018–2022); Affiliate Assistant Professor at University of Washington (2021–2022).

Citation metrics (approx.): ~27,000+ Google Scholar citations [Tier B, provisional]. Semantic Scholar: ~104 papers, ~1,860 highly influential citations. H-index: not confirmed (estimated 35–45 range based on citation volume; verify directly at scholar.google.com/citations?user=baI7IY0AAAAJ).

Landmark papers:

  • Longformer (2020): ~4,700+ citations; foundational efficient transformer [7]
  • SciBERT (2019): thousands of citations; canonical scientific NLP model [9]
  • SPECTER (ACL 2020): widely used scientific paper embeddings [8]

Most high-citation work is from AI2 era (2018–2022). Yale-era publications (2023–2026) are more applied and evaluation-focused, consistent with building a new lab.

Awards: EMNLP Best Long Paper (2017); EACL Outstanding Paper; COLING Honorable Mention; Google Research Scholar Award; Roberts Innovation Award (Yale, 2×).

Service: Area Chair: ACL 2020, ICLR 2021, NAACL 2021. Organizer: SciNLP, SDP workshop series.


14. Research Gaps (What We Don’t Know)

  1. H-index / i10-index: Must be read directly from Google Scholar; not found in search results.
  2. Federal grant portfolio: No NSF/NIH award numbers confirmed. This is a meaningful unknown for lab longevity.
  3. M.S. thesis publication rate: No documented first-author ACL/EMNLP/NeurIPS/ICLR papers from M.S. advisees. Absence of evidence ≠ evidence of absence.
  4. Cohan’s advising style for M.S. vs. PhD students: Lab website doesn’t distinguish. Must ask directly or ask current students.
  5. Health startup collaboration: Name not public; nature and duration of collaboration unknown.
  6. Yilun Zhao, Xiangru Tang graduation placements: Both expected ~2026–27. These will be the most informative placement data points in the next 12 months.
  7. Full postdoc roster: One postdoc position was posted July 2024; hired candidate and current status unknown.

15. Questions to Ask

Questions for Prof. Cohan

  1. What major problems is the lab trying to solve, and why is it positioned to win (especially vs. larger AI2, Stanford NLP, CMU)?
  2. Do you advise M.S. thesis students with the same depth as PhD students? What does a typical week of M.S. thesis advising look like?
  3. What would a realistic first project be for me, given my background in GUI agents and RL? Are compute and data already in place?
  4. What is the funding model for M.S. thesis students? What compute resources can I access?
  5. What is your authorship policy? Do M.S. students get first-author opportunities?
  6. What is your internship policy during the M.S.? Can I do a summer internship while working on the thesis?
  7. What is the typical time-to-completion for your M.S. thesis students? Do students typically publish before defending?
  8. How does the AI2 affiliation work in practice — can thesis students access AI2 resources or collaborate with AI2 researchers?
  9. What happens if the initial project direction isn’t working — how do you pivot?

Questions for Current/Former Students (Yilun Zhao, Yixin Liu, Ansong Ni)

  1. What is day-to-day life like in the lab — how often does Arman engage directly with your work?
  2. Are meeting commitments reliable, and how fast does he review drafts or give feedback?
  3. How does he handle it when projects fail or get scooped?
  4. What’s his bandwidth like now that the lab is 3 years old — is he more or less available than when you started?
  5. Do M.S. students get as much attention as PhD students in your experience?
  6. How strong is placement support — does he make calls for internships or jobs?
  7. Is there anything you wish you knew before joining the lab?
  8. Are there any students who left or had difficulties? What happened?

High-Uncertainty, High-Impact Verification Questions

  1. Ask Yale SEAS registrar or current students: Are there any M.S. thesis students from Cohan’s lab who published first-author at ACL/EMNLP/NeurIPS?
  2. Check NSF Award Search (nsf.gov/awardsearch) for Arman Cohan — do any results appear?
  3. Ask Ansong Ni (LinkedIn): Did the Google DeepMind and Meta internships result from Arman’s introductions, or were they self-sourced?
  4. Ask Yixin Liu: Did the Meta FAIR and Google DeepMind internships come through Arman’s network, and does he actively advocate for students?

16. 12–24 Month Career Plan (Contingency)

Path A: Cohan as Primary Advisor

Condition: Secure a clear intersection project scope in a pre-arrival meeting.

  • Month 1–3: Join Yale NLP, read Cohan’s recent papers (SciRIFF, ReIFE, IRIS), build relationships with Yilun Zhao and Yixin Liu.
  • Month 3–6: Launch intersection project (e.g., scientific agent evaluation framework leveraging GUIAgentDebugger taxonomy). Target ACL 2027 or EMNLP 2027.
  • Month 6–12: Submit to top venue. Simultaneously begin internship search (target: AI2, Meta FAIR Applied Science, Google Research, Anthropic).
  • Month 12–18: Complete internship (Summer 2027). Return, write thesis draft.
  • Month 18–24: Defend thesis. Decide: PhD application (using Cohan’s strong recommendation + publication) or industry RS conversion from internship.
  • Contingency if project stalls: Pivot to a more core Cohan topic (scientific QA, RAG evaluation) at Month 6 to protect publication timeline.

Path B: Different Advisor at Yale

If the meeting with Cohan reveals that the domain mismatch is too severe or that M.S. bandwidth is limited:

  • Explore: Dragomir Radev’s former students who may now be faculty at Yale; Rex Ying (GNN/agent work); or Tesca Fitzgerald (robotics/interactive agents, more closely aligned with GUI agents).
  • Cohan could still be a committee member without being primary advisor.

Path C: Parallel Outreach Before Arriving

Before arriving at Yale in August 2026, email Cohan with a research pitch document (1 page) describing the scientific agent evaluation project idea. If he responds positively and proposes a concrete project, confidence in the working relationship increases significantly.


Sources

| # | Source | Tier | URL |
|---|---|---|---|
| 1 | Yale NLP Lab – Home & Team | A | https://nlp.cs.yale.edu/ |
| 2 | Yale Engineering Faculty – Arman Cohan | A | https://engineering.yale.edu/research-and-faculty/faculty-directory/arman-cohan |
| 3 | Google Scholar – Arman Cohan | B | https://scholar.google.com/citations?user=baI7IY0AAAAJ |
| 4 | Semantic Scholar – Arman Cohan | B | https://www.semanticscholar.org/author/Arman-Cohan/2527954 |
| 5 | Georgetown CS – Cohan joins Yale | A | https://cs.georgetown.edu/news-story/arman-cohan-joins-yale-as-assistant-professor/ |
| 6 | Yale Ventures – Reimagining Research | A | https://ventures.yale.edu/news/reimagining-research-arman-cohan-ai-agents-mentorship-and-scientific-discovery |
| 7 | Longformer – arXiv | B | https://arxiv.org/abs/2004.05150 |
| 8 | SPECTER – ACL Anthology | B | https://aclanthology.org/2020.acl-main.207/ |
| 9 | SciBERT – arXiv | B | https://arxiv.org/abs/1903.10676 |
| 10 | Ansong Ni – Personal site + LinkedIn | C | https://niansong1996.github.io/ |
| 11 | Yixin Liu – Homepage | C | https://yixinl7.github.io/ |
| 12 | Xiangru Tang – Homepage | C | https://xiangrutang.github.io/ |
| 13 | Yilun Zhao – Homepage | C | https://yilunzhao.github.io/ |
| 14 | Hailey Schoelkopf – LinkedIn | C | https://www.linkedin.com/in/hailey-schoelkopf-070361286/ |
| 15 | Jake Tae – LinkedIn | C | https://www.linkedin.com/in/jaketae/ |
| 16 | Zhangir Azerbayev – LinkedIn | C | https://www.linkedin.com/in/zhangir-azerbayev-314ab21b8/ |
| 17 | Linyong Nan – LinkedIn | C | https://www.linkedin.com/in/linyong-nan-b0b573130/ |
| 18 | ScholarNexus – Arman Cohan | B/C | https://scholarnexus.ai/supervisor/Arman_Cohan?id=92f2f3ea-0860-436d-941a-26a344fae943 |
| 19 | Roberts Innovation Fund – Yale SEAS | A | https://seas.yale.edu/news-events/news/roberts-innovation-fund-support-10-bold-seas-faculty-inventions |
| 20 | Wen Xiao – Homepage | C | https://wendy-xiao.github.io/ |
| 21 | Alan (Haoxin) Li – Homepage | C | https://lihaoxin2020.github.io/ |
| 22 | SciVer – ACL 2025 | B | https://aclanthology.org/2025.acl-long.420/ |
| 23 | Wu Tsai Institute – Arman Cohan | A | https://wti.yale.edu/profile/arman-cohan |