How to Apply to DeepSeek


Key Takeaways

  • DeepSeek was founded in May 2023 by Liang Wenfeng, who funded the lab through profits from High-Flyer Quant, his Hangzhou-based quantitative hedge fund, eliminating the need for venture capital and external investor pressure.
  • The company is headquartered in Hangzhou, Zhejiang Province, and employs roughly 150 researchers, the majority of whom are recent graduates and PhD students from top Chinese universities.
  • DeepSeek-V3 reportedly cost approximately 5.6 million US dollars in training compute, and the January 2025 release of DeepSeek-R1 triggered massive US tech stock losses including Nvidia's largest single-day market cap decline in history.
  • All DeepSeek frontier models are released as open-weight under permissive MIT-style licenses, positioning the company as a global research contributor rather than a closed commercial vendor.
  • Interviews are technical, research-focused, and conducted in Mandarin, emphasizing first-principles reasoning, mathematical rigor, and the ability to engage as a peer with active researchers rather than performing leetcode puzzles.
  • Foreign-passport hiring is extremely limited; the working language is Chinese, the office is in Hangzhou, and the company prefers candidates already embedded in the Chinese AI research ecosystem.
  • Compensation for senior research engineers and PhD-level researchers is reportedly competitive within the Chinese AI market, with total packages frequently in the 500,000 to 1,500,000 RMB range plus performance bonuses tied to research and shipped-model impact.
  • The company operates under US export controls restricting top-tier Nvidia GPUs, so candidates are expected to bring creative ideas for extracting more capability per FLOP through architectural and systems-level innovation.
  • DeepSeek does not have traditional product managers, marketing, or sales functions; researchers self-organize and the bar for joining is intentionally narrow rather than scaling headcount aggressively.

About DeepSeek

DeepSeek (深度求索, Shēndù Qiúsuǒ, literally 'Deep Seeking') is a Chinese artificial intelligence research company headquartered in Hangzhou, Zhejiang Province, founded in May 2023 by Liang Wenfeng (梁文锋). Liang is a Zhejiang University-trained AI engineer who built his fortune through High-Flyer Quant (幻方量化), a Chinese quantitative hedge fund he co-founded in 2015 that grew into one of China's largest quant shops, with reportedly more than 100 billion RMB in assets under management. High-Flyer accumulated tens of thousands of Nvidia A100 GPUs for trading research before US export controls tightened, and Liang spun out DeepSeek as a dedicated AGI research lab funded entirely by High-Flyer's profits. This unusual structure, with no venture capital and no need for short-term revenue, has shaped the company's research-first identity and its willingness to publish open-weight models that competitors guard closely.

The company employs roughly 150 people, the vast majority of whom are recent graduates and PhD students from elite Chinese universities including Tsinghua University, Peking University, Zhejiang University, and the Chinese Academy of Sciences. Liang has spoken publicly about deliberately hiring young researchers without prior industry experience, arguing that fresh perspectives matter more than credentialed seniority for fundamental AI research. The team is famously flat, with no traditional product managers, no marketing function, and minimal management hierarchy. Researchers self-organize into small teams that pursue specific architectural or training questions, with compute allocated based on the strength of the research proposal rather than seniority or political weight.

DeepSeek's model trajectory has redefined expectations for what a small, focused team can accomplish. The DeepSeek-V2 release in May 2024 introduced the Multi-Head Latent Attention (MLA) architecture and a sparse Mixture-of-Experts (MoE) design that drove inference costs dramatically below Western competitors. DeepSeek-V3, released in December 2024, scaled the approach to 671 billion total parameters with 37 billion active per token and reportedly cost approximately 5.6 million US dollars in training compute, a figure that stunned the industry. Then in January 2025 the company released DeepSeek-R1, a reasoning model trained primarily through reinforcement learning that matched OpenAI's o1 on several benchmarks at a fraction of the inference cost. The R1 release triggered one of the largest single-day market capitalization losses in US technology history, with Nvidia alone shedding nearly 600 billion dollars in market value as investors rapidly reassessed the moat assumptions underpinning the American AI capital expenditure boom.

DeepSeek publishes its models under permissive MIT-style licenses with full open weights, technical reports, and frequently the training code. This open-weight strategy positions the company as a global research contributor rather than a closed commercial vendor, and it has built deep credibility with academic and independent researchers worldwide.

The company operates in a complex geopolitical environment shaped by US chip export controls that restrict access to the most advanced Nvidia accelerators. DeepSeek has navigated these constraints by pre-stocking GPUs through High-Flyer's earlier purchases, by aggressive software and architectural optimization, and by working with H800 and H20 variants that fall under export thresholds. Hangzhou itself, home to Alibaba and a vibrant Hangzhou Bay tech ecosystem, has emerged as a credible alternative to the Beijing-centric AI cluster, and DeepSeek is now the city's most visible AI export.
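
The '671 billion total parameters, 37 billion active' distinction comes from top-k expert routing: each token is sent to only a few of the experts in every MoE layer, so most parameters sit idle for any given token. The toy sketch below illustrates the mechanism with made-up dimensions; it is a minimal illustration of generic top-k routing, not DeepSeek's actual architecture, which adds shared experts, load balancing, and other refinements.

```python
# Toy top-k Mixture-of-Experts layer: illustrates why "total" and
# "active" parameter counts diverge. Dimensions are illustrative
# assumptions, not the DeepSeek-V3 configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                       # x: [tokens, d_model]
        scores = self.router(x)                 # [tokens, n_experts]
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):              # only k experts run per token
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE()
total = sum(p.numel() for p in moe.parameters())
expert_params = sum(p.numel() for p in moe.experts[0].parameters())
active = moe.router.weight.numel() + moe.k * expert_params
print(moe(torch.randn(4, 64)).shape)
print(f"total params: {total:,}  ~active per token: {active:,}")
```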

Application Process

  1. Monitor the official careers page at careers.deepseek.com (and the Chinese-language portal deepseek.com/careers) for posted positions; the company also recruits aggressively through Chinese tech job boards including Maimai (脉脉), Boss Zhipin (Boss直聘), and Liepin (猎聘).

  2. Submit a Chinese-language resume (jianli, 简历) along with a personal research statement explaining your interests in foundation models, reinforcement learning, systems, or training infrastructure; English resumes are accepted for research roles but Chinese remains strongly preferred.

  3. Pass an initial technical screen consisting of a take-home coding or research problem, often involving CUDA kernel optimization, distributed training, or a reproduction of a recent paper result.

  4. Complete two to four rounds of technical interviews conducted in Mandarin, covering deep learning fundamentals, transformer internals, MoE and MLA architectures, optimization theory, and recent DeepSeek paper specifics including the V3 and R1 technical reports (for a feel of the MLA memory argument, see the sketch after this list).

  5. Engage in a research discussion round where you defend a recent paper of your own choosing, propose an experiment you would run with unlimited compute, and answer probing questions about why you would expect your approach to work.

  6. Meet with a senior researcher or directly with Liang Wenfeng for a final culture and vision conversation focused on long-term AGI ambition, willingness to work in a flat structure, and ability to operate without traditional product or management scaffolding.

  7. Receive an offer that typically combines a base salary, a research bonus tied to publication or shipped-model contribution, and equity-equivalent participation through the High-Flyer parent structure, with onboarding at the Hangzhou headquarters.
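
The MLA question in step 4 usually reduces to cache arithmetic: standard multi-head attention stores a full key and value vector per head per layer for every token of context, while MLA stores one compressed latent per layer and re-projects keys and values at attention time. A rough sketch of that comparison follows; the dimensions are illustrative assumptions rather than DeepSeek-V3's published configuration, and the real MLA design also caches a small decoupled positional key.

```python
# Back-of-envelope KV-cache comparison: standard multi-head attention
# versus an MLA-style compressed latent cache. All dimensions are
# illustrative assumptions, not DeepSeek's published configuration.

def mha_kv_bytes_per_token(n_layers, n_heads, head_dim, bytes_per_el=2):
    # MHA caches one key and one value vector per head per layer (fp16).
    return n_layers * n_heads * head_dim * 2 * bytes_per_el

def mla_kv_bytes_per_token(n_layers, latent_dim, bytes_per_el=2):
    # MLA caches a single compressed latent per layer; keys and values
    # are re-projected from it during attention.
    return n_layers * latent_dim * bytes_per_el

mha = mha_kv_bytes_per_token(n_layers=60, n_heads=128, head_dim=128)
mla = mla_kv_bytes_per_token(n_layers=60, latent_dim=512)
print(f"MHA: {mha/1e6:.2f} MB per context token")
print(f"MLA: {mla/1e6:.2f} MB per context token (~{mha/mla:.0f}x smaller)")
```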


Resume Tips for DeepSeek

  • Submit your resume in Simplified Chinese for all China-based roles; the recruiting team and engineering panels work in Mandarin, and a polished Chinese jianli (简历) signals serious intent and basic cultural fit.
  • Lead with specific technical contributions to large model training, reinforcement learning, distributed systems, or CUDA-level kernel work; vague claims of 'AI experience' or generic ML coursework are immediately filtered out.
  • Cite first-author or strong co-author publications at top venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, or top AI workshops; arXiv preprints with significant citation traction also carry weight.
  • Highlight quantitative achievements that demonstrate mathematical and algorithmic rigor, including math olympiad results (CMO, IMO), competitive programming finishes (ACM-ICPC regionals or finals, NOI), or graduate-level coursework in numerical optimization.
  • Show direct experience with PyTorch internals, Triton, CUDA, or large-scale distributed training frameworks such as Megatron-LM, DeepSpeed, or Colossal-AI; surface-level Hugging Face usage alone is insufficient.
  • Reference concrete reproductions or extensions of frontier work, such as reproducing LLaMA pretraining at small scale, fine-tuning a 7B-plus model end to end, or implementing a published RL-from-feedback pipeline.
  • List academic affiliations clearly, prioritizing recognizable Chinese institutions including Tsinghua University (清华大学), Peking University (北京大学), Zhejiang University (浙江大学), Shanghai Jiao Tong University (上海交通大学), Fudan University (复旦大学), USTC (中国科学技术大学), and the Chinese Academy of Sciences (中国科学院); top international PhDs are also welcomed but rare.
  • Avoid resume padding with unrelated internships, generic full-stack web work, or non-technical leadership roles; DeepSeek values depth in one or two narrow research areas over broad surface coverage.



Interview Culture

DeepSeek interviews reflect the company's identity as a research-first organization built around small, intensely curious teams rather than a conventional Chinese tech company.

Where employers like Baidu, Alibaba's DAMO Academy, or ByteDance often grade candidates on leetcode-style algorithm puzzles and behavioral STAR-format questions, DeepSeek interviews lean heavily on open-ended technical discussion. Candidates report being asked to walk through the architecture of DeepSeek-V3 and explain why Multi-Head Latent Attention reduces memory pressure compared to standard multi-head attention, to derive the scaling law for compute-optimal training from first principles, or to design a novel reinforcement learning reward-shaping scheme for a specific reasoning task. Interviewers, themselves typically active researchers, expect candidates to engage as intellectual peers, to push back on flawed reasoning, and to admit uncertainty rather than bluff.

The cultural backdrop has shifted dramatically since the January 2025 R1 release. Before R1, DeepSeek was a relatively obscure research lab known primarily within Chinese AI circles, and hiring was a quiet word-of-mouth pipeline through Tsinghua, PKU, Zhejiang, and Chinese Academy of Sciences advisor networks. After R1's global impact, application volume reportedly increased by orders of magnitude, with thousands of resumes arriving from Chinese and international candidates, including high-profile defections from competing labs. The company has responded by tightening its bar rather than scaling headcount, with Liang Wenfeng publicly stating that he prefers to remain small and focused rather than become a 'big company' competing on hiring volume. Compensation has reportedly been raised to attract top talent, but the cultural emphasis on intrinsic research motivation over career signaling persists.

The High-Flyer financial backing is a defining cultural feature that interviewers frequently reference. Because DeepSeek does not need to raise venture capital, does not have a product roadmap dictated by go-to-market timelines, and does not need to ship features for revenue milestones, researchers are expected to think on multi-year horizons. Interview conversations often touch on a candidate's view of AGI timelines, what fundamental problems they would prioritize, and whether they can sustain motivation on long research bets that may not pay off for years. Candidates who frame themselves as future managers or product leaders are typically rejected; the company explicitly wants individual contributors who plan to stay technical for their entire career.

US export-control compliance is also a recurring theme: candidates are expected to understand the constraints DeepSeek operates under and to bring creative ideas for getting more out of less compute, whether through quantization, sparsity, distillation, or novel architectures. Onsite interviews take place at the Hangzhou headquarters, and candidates from outside Hangzhou are expected to relocate; remote work is rare and reserved for established collaborators.
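
For the 'derive compute-optimal training from first principles' line of questioning, it helps to have the standard public approximations at your fingertips: training FLOPs C ≈ 6ND for N parameters and D tokens, and a Chinchilla-style ratio of roughly 20 tokens per parameter. The sketch below works through the arithmetic; both constants are rules of thumb assumed for illustration, not DeepSeek's internal scaling fits.

```python
# Compute-optimal allocation under the common approximations
# C ~= 6*N*D and D ~= r*N with r ~= 20 (Chinchilla-style rule of
# thumb). Substituting gives C = 6*r*N^2, so N = sqrt(C / (6*r)).
# Both constants are public rules of thumb, assumed for illustration.

def compute_optimal(budget_flops, tokens_per_param=20.0):
    n_params = (budget_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

for c in (1e22, 1e23, 1e24):
    n, d = compute_optimal(c)
    print(f"C = {c:.0e} FLOPs -> ~{n/1e9:5.1f}B params, ~{d/1e12:4.2f}T tokens")
```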

What DeepSeek Looks For

  • Deep mathematical and algorithmic foundations, including comfort with linear algebra, probability theory, optimization, and information theory at the level required to read and reproduce frontier ML papers without hand-holding.
  • Proven ability to ship large-scale distributed training or inference work, with concrete experience at the CUDA, Triton, or systems level rather than only at the high-level framework layer.
  • First-principles reasoning about architectural and training-procedure tradeoffs, demonstrated through original research output rather than incremental engineering.
  • Native or near-native Mandarin proficiency for working in the Hangzhou office, since all internal discussion, documentation, and code review happens in Chinese.
  • Long-term research orientation with patience for multi-year bets, and a clear distaste for product-management-style career framing or short-term metric chasing.
  • Cultural alignment with a flat, research-driven structure where compute and influence flow to the strongest ideas rather than to seniority or political maneuvering.
  • Genuine interest in open-source and open-weight research, including a willingness to publish technical reports, release model weights, and engage with the global research community.
  • Resourcefulness under hardware constraints, particularly the ability to extract maximum performance from the H800 and H20 GPU classes available under current US export rules.

Frequently Asked Questions

What is the typical compensation range for a research engineer at DeepSeek?
Reported compensation for research engineers and applied researchers ranges roughly from 500,000 RMB at the entry PhD level to 1,500,000 RMB or higher for senior researchers with strong publication records, before performance bonuses. Total packages reportedly include base salary, an annual performance bonus tied to research output and shipped-model contribution, and in some cases participation in High-Flyer-linked equity-equivalent arrangements. After the R1 release the company reportedly raised offers significantly to compete with Alibaba, Tencent, ByteDance, and Moonshot AI for top talent. Living costs in Hangzhou are notably lower than Beijing or Shanghai, which makes the take-home value meaningful even at the lower end of the range.
Does DeepSeek hire foreign nationals or non-Chinese-speaking candidates?
Hiring of non-Chinese candidates is very limited in practice. The internal working language is Mandarin, code review and design documents are written in Chinese, and the office is in Hangzhou with no announced international satellite locations. Exceptional foreign researchers with publication records at top venues have occasionally been hired, but they are expected to relocate to Hangzhou and to develop functional Chinese fluency quickly. The company is not currently a realistic target for candidates who do not speak Mandarin or who require visa sponsorship in a Western location.
How does the Hangzhou AI ecosystem compare to Beijing for AI careers?
Beijing has historically been the dominant Chinese AI cluster, hosting Baidu, ByteDance, Moonshot AI (Kimi), Zhipu AI, Baichuan AI, and a heavy concentration of academic AI labs at Tsinghua and Peking University. Hangzhou is a smaller but rapidly growing alternative anchored by Alibaba and its DAMO Academy, NetEase, Hikvision, and now DeepSeek. Hangzhou offers a lower cost of living, less commute friction, and a quieter research-oriented atmosphere, but fewer companies overall mean less mobility if you decide to leave. Many candidates view DeepSeek specifically, rather than Hangzhou broadly, as the draw, and Hangzhou's AI gravity is increasing in the post-R1 era.
Why do candidates sometimes turn down DeepSeek offers for Alibaba DAMO, Baidu, or Moonshot AI?
Common reasons include personal preference for Beijing's broader AI ecosystem, larger stable compensation packages or vested equity at established players, more conventional career-ladder visibility, family or schooling reasons in other cities, or concerns about working in such a flat structure with no traditional management track. DeepSeek offers intellectual prestige and direct contribution to frontier model research, but it asks candidates to accept a research-only path with limited management mobility, intense focus expectations, and the geopolitical risk profile of being a high-visibility Chinese AI lab under active US scrutiny.
What is the interview process timeline?
Reported timelines range from three to eight weeks from initial application to offer. The process typically includes a resume screen within one to two weeks, a take-home or live technical screen, two to four onsite or video interview rounds with research engineers, a research discussion round, and a final cultural conversation that may include Liang Wenfeng directly for senior or strategic hires. Post-R1 application volume has stretched some timelines as the recruiting team works through significantly elevated inbound interest.
What does DeepSeek look for that other Chinese AI labs do not?
DeepSeek explicitly prioritizes research-first individual contributors who plan to stay technical for the long term, who can operate in a flat structure without traditional product or management scaffolding, and who bring deep architectural or systems-level intuition rather than incremental engineering polish. The company places less weight on prestigious internships, leadership titles, or product-shipping resumes compared to Alibaba, Tencent, or ByteDance, and more weight on first-principles reasoning, demonstrated reproductions of frontier work, and willingness to publish openly. Candidates who frame themselves primarily as future managers, product owners, or business leads are routinely filtered out at the first technical screen.
How do US chip export controls affect day-to-day work at DeepSeek?
Export controls restrict DeepSeek's access to Nvidia's most advanced accelerators, including the H100 and B200 series, leaving the company to work primarily with H800, H20, and other export-compliant GPU variants alongside the pre-export-control inventory of A100 and earlier chips inherited from High-Flyer. This constraint has shaped a culture of aggressive software optimization, including the architectural choices behind Multi-Head Latent Attention, the Mixture-of-Experts sparsity in V3, and detailed CUDA- and PTX-level kernel work. Candidates are expected to view the constraint as a creative forcing function rather than a barrier, and interview conversations often probe how candidates would optimize a specific training or inference workload under tight memory and bandwidth budgets.
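
To make 'tight memory and bandwidth budgets' concrete, the usual first step is a roofline check: compare a kernel's arithmetic intensity (FLOPs per byte moved) against the hardware's ridge point (peak FLOPs divided by peak memory bandwidth). The sketch below uses rounded, publicly quoted H800-class numbers that should be treated as assumptions rather than datasheet values.

```python
# Roofline-style check: is a matmul compute-bound or bandwidth-bound?
# Peak numbers are rounded public figures for an H800-class GPU and
# are assumptions; substitute real datasheet values for your hardware.

def roofline(flops, bytes_moved, peak_tflops, peak_bw_gbs):
    intensity = flops / bytes_moved                     # FLOPs per byte
    ridge = (peak_tflops * 1e12) / (peak_bw_gbs * 1e9)  # FLOPs per byte
    verdict = "compute-bound" if intensity >= ridge else "bandwidth-bound"
    print(f"intensity {intensity:8.1f} FLOP/B vs ridge {ridge:5.1f} -> {verdict}")

k = n = 7168                                   # illustrative hidden size
for m in (1, 4096):                            # decode vs large training batch
    flops = 2 * m * k * n                      # multiply-accumulate count
    bytes_moved = 2 * (m * k + k * n + m * n)  # fp16 operands and output
    roofline(flops, bytes_moved, peak_tflops=990, peak_bw_gbs=3350)
```

The single-token decode case lands far below the ridge, which is exactly why KV-cache compression and weight quantization matter so much for inference economics.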
Does DeepSeek have an internship program for students?
DeepSeek runs a competitive internship pipeline targeted primarily at PhD students and outstanding master's candidates from top Chinese universities. Internships typically run three to six months in Hangzhou and involve direct research contribution under the mentorship of a senior researcher. Strong intern performance is one of the most reliable paths to a full-time offer, and many of the company's permanent researchers were converted from internships. Applications are reviewed continuously rather than in fixed cohorts, and the bar is high enough that even Tsinghua and PKU PhD students are routinely turned down.
What languages and tools should I expect to work with at DeepSeek?
Day-to-day research and engineering work centers on Python with PyTorch as the primary framework, extensive use of Triton and raw CUDA for performance-critical kernels, and custom internal training and inference infrastructure. Familiarity with Megatron-LM, DeepSpeed, or comparable distributed training frameworks is expected, along with experience profiling GPU workloads, debugging numerical stability issues at scale, and reasoning about communication patterns across multi-node training jobs. C++ comfort is valuable for systems and inference work, and shell and Linux fluency is assumed.
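
For a sense of the baseline, the block below is a minimal Triton kernel, essentially the canonical vector-add from the Triton tutorials, of the sort a systems candidate is expected to write cold. It is a generic illustration, not DeepSeek internal code, and it needs a CUDA-capable GPU to run.

```python
# Minimal Triton vector-add kernel (the canonical tutorial example).
# Generic illustration only; requires a CUDA-capable GPU.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements               # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)            # one program per 1024-wide block
    add_kernel[grid](x, y, out, n, BLOCK=1024)
    return out

x = torch.randn(10_000, device="cuda")
print(torch.allclose(add(x, x), x + x))       # True if the kernel is correct
```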
What is the work culture and schedule like?
Reports describe an intense but research-oriented schedule that does not match the worst 996 stereotypes of Chinese big-tech but is also not a Western nine-to-five. Researchers self-organize around projects, with deadlines driven by training-run timelines and paper or model release windows rather than fixed product launches. Collaboration is high-bandwidth, with internal discussion happening in Chinese across chat tools and in person at the Hangzhou office. The company is small enough that everyone knows everyone, and research direction is set by bottom-up proposal and debate rather than top-down roadmap. Burnout risk exists during intense training-run periods, but the absence of product, sales, and marketing functions removes a significant source of cross-functional friction common in larger AI labs.
