Senior Applied Research Scientist, Multi-modal Foundation Models in Santa Clara, California, United States

Job Information

NVIDIA Senior Applied Research Scientist, Multi-modal Foundation Models in Santa Clara, California

We, the NVIDIA Metropolis TAO (Train-Adapt-Optimize) Foundation Models team, are looking for a Senior Applied Research Scientist/Engineer to join us and develop our Multi-modal Foundation Models and GenAI solutions. We are developing a host of solutions, including multi-modal, vision-language, and vision-centric foundation models for image, video, and 3D-world understanding.

Come join us in these exciting times and make a sizable difference in the exploding world of Deep Learning! Doing what’s never been done before takes vision, innovation, and the world’s best talent.

As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their life's work. This role requires someone who deeply understands computer vision and multi-modal foundation models, can architect and apply algorithms built on them, and can bring them to applications in markets such as smart city, retail, and manufacturing AI. As part of our team, you will contribute to some of NVIDIA's groundbreaking multi-modal research and product solutions, including but not limited to: TAO (https://developer.nvidia.com/tao-toolkit), NEMO (https://www.nvidia.com/en-us/ai-data-science/products/nemo/), VILA (https://github.com/Efficient-Large-Model/VILA), LITA (https://github.com/NVlabs/LITA), VIA (https://www.nvidia.com/en-us/solutions/robotics-and-edge-computing/vision-ai/visual-insight-agent/), and more.

What you’ll be doing:

Conduct applied research and design innovative algorithms in computer vision; image, video, vision-language, and vision-centric foundation models; diffusion models; and 3D-VLMs.
Stay abreast of the latest research papers and breakthroughs in multi-modal and foundation model research, and implement them to improve NVIDIA models.
Develop AI infrastructure for large-scale training and evaluation pipelines for foundation models.
Drive data gathering and build auto-labeling pipelines for annotating the datasets used to train domain-specific SOTA VLMs and FMs.
Develop, train, fine-tune, and deploy foundation models for varied use cases, including smart cities, industrial manufacturing, and gaming.
Apply alignment techniques such as instruction tuning and reinforcement learning from human feedback (RLHF), along with parameter-efficient fine-tuning methods such as p-tuning, adapters, and LoRA, to improve models for these use cases.
Collaborate closely with Research teams to develop and bring SOTA models to product.
Design new algorithms through product-steered research, publish at conferences, and submit to leaderboards.
Mentor and guide junior team members, encouraging a collaborative and innovative team culture.

What we need to see:

MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field, with a focus on Deep Learning, Machine Learning, and Computer Vision, or equivalent experience.
5+ years of relevant algorithm development or research experience in one or more of the following areas: vision-language models, foundation models, 3D-LLMs, video generative models and diffusion algorithms, or action-based transformers.
Experience training vision foundation models, including ViT, CLIP, LLaVA, diffusion models, VLMs, and 3D-VLMs.
Hands-on experience with deep learning frameworks (e.g., TensorFlow, PyTorch) and proficiency in modern software development practices (version control, testing, CI/CD).
Publications in top-tier AI conferences or contributions to open-source projects are a plus.
Excellent communication skills and the ability to collaborate efficiently in a cross-functional, distributed team environment.

With a competitive salary package and benefits, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you a creative and autonomous GenAI Engineer who loves challenges? Do you have a genuine passion for advancing the state of AI and machine learning across a variety of industries? If so, we want to hear from you.

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/). NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
