Overview:
We are seeking a highly skilled Generative Computer Vision Engineer with a focus on creative art synthesis. The ideal candidate will have deep expertise in computer vision and generative models, along with a strong grasp of the underlying algorithms and techniques. You will be responsible for developing and optimizing AI systems that transform text inputs into high-quality video and image content, applying state-of-the-art generative methods to real-world applications.
Key Responsibilities:
• Design and develop advanced generative AI models for video generation and image synthesis from textual descriptions.
• Implement and optimize text-to-image and text-to-video generation pipelines, ensuring high-quality, efficient output.
• Leverage generative techniques such as GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), diffusion models, and other state-of-the-art methods.
• Apply the mathematical and algorithmic concepts underlying generative models to build efficient, scalable, and innovative solutions.
• Collaborate with research, product, and creative teams to fine-tune models for specific visual and creative goals.
• Develop and enhance multimodal learning systems that integrate natural language processing (NLP) and computer vision for text-to-visual generation.
• Track cutting-edge advancements in computer vision and generative AI, incorporating them into new product features and enhancements.
• Optimize AI pipelines for performance and scalability, using hardware acceleration (e.g., GPU, TPU) where appropriate.
• Contribute to building robust infrastructure for deploying generative models in production environments.
Required Skills and Qualifications:
• Master’s or Ph.D. in Computer Science, AI, Machine Learning, or a related field, with a focus on computer vision or generative models.
• Deep understanding of the theoretical and practical foundations of generative models, including GANs, VAEs, diffusion models, and transformer-based architectures.
• Strong programming proficiency in Python, with experience using machine learning libraries such as PyTorch, TensorFlow, or JAX.
• Experience building and deploying text-to-image and text-to-video models such as DALL·E or Stable Diffusion, and working with vision-language models such as CLIP.
• Familiarity with AI-based motion synthesis and video generation techniques.
• Proficiency in computer vision tools such as OpenCV, scikit-image, and MediaPipe.
• Experience integrating natural language processing (NLP) with visual models for text-driven visual generation.
• Fluency in the mathematics behind optimization techniques, loss functions, regularization methods, and neural network architectures.
• Hands-on experience with cloud platforms such as AWS, Google Cloud, or Azure for large-scale AI model deployment.
• Knowledge of multimodal learning and the integration of text, image, and video streams into cohesive systems.
• Proficiency in tools for 3D rendering, video editing, or animation (e.g., Blender, Unity, Unreal Engine).
• Familiarity with GPU/TPU optimization for training and deploying deep learning models.
Bonus Qualifications:
• Experience with AI-driven creative content production tools (e.g., Midjourney, RunwayML).
• Contributions to open-source generative AI projects or research publications in the field.
• Understanding of interactive media, gaming, or virtual reality (VR) environments.
• Proficiency in style transfer, image super-resolution, or other image enhancement techniques.
Soft Skills:
• Strong communication and collaboration skills to work effectively in cross-functional teams.
• Problem-solving mindset with a creative approach to tackling challenges in AI-driven content generation.
• Ability to adapt to new research and quickly integrate cutting-edge techniques into real-world applications.
Benefits:
• Competitive salary with equity options.
• Opportunity to work on innovative AI-generated content products.
• Flexible working hours and remote work options.
• Access to ongoing learning opportunities in the field of AI and generative models.