Job Description
Are you ready to shape the future of artificial intelligence? Nebula AI Labs is at the forefront of the 2026 AI revolution, developing next-generation generative models that redefine human-computer interaction. We are looking for a visionary Senior Generative AI Engineer to lead the architecture and deployment of Large Language Models (LLMs) that power our enterprise solutions.
In this role, you won't just maintain existing models; you will pioneer new methodologies in fine-tuning, reinforcement learning from human feedback (RLHF), and scalable inference pipelines. Join us in building the intelligent systems of tomorrow, today.
Responsibilities
- Architect Scalable LLM Pipelines: Design, develop, and optimize end-to-end Generative AI pipelines, focusing on Retrieval-Augmented Generation (RAG) and autoregressive models.
- Model Optimization: Implement model quantization, pruning, and distillation strategies to reduce latency and lower inference costs in production environments.
- Prompt Engineering & Evaluation: Establish rigorous evaluation frameworks using LLM-as-a-judge to continuously improve model performance and safety alignment.
- Cross-Functional Collaboration: Partner with product managers and data scientists to translate business requirements into technical AI solutions.
- Research & Innovation: Stay ahead of the curve by integrating the latest advancements in Transformer architectures and multimodal AI into our core stack.
- Production Deployment: Manage the deployment of models on Kubernetes and cloud-native infrastructure (AWS/GCP), ensuring high availability and fault tolerance.
Qualifications
- Education: MS or PhD in Computer Science, Machine Learning, or a related quantitative field.
- Experience: 5+ years of experience in software engineering with a strong focus on Machine Learning and Deep Learning.
- Technical Skills: Proficiency in Python and at least one of PyTorch, TensorFlow, or JAX; strong experience with Hugging Face Transformers and LangChain.
- LLM Expertise: Deep understanding of Large Language Models, attention mechanisms, and fine-tuning techniques (LoRA, P-Tuning, QLoRA).
- Tools: Experience with MLOps tools (Docker, Kubernetes, MLflow, Weights & Biases) and vector databases (Pinecone, Milvus, ChromaDB).
- Soft Skills: Exceptional problem-solving abilities and excellent communication skills for technical leadership.