Data Engineering (Immediate)
UsefulBI Corporation
Job Description
Location: Bangalore / Lucknow / Pune
Work Model: Hybrid
Role Type: Full-time, Senior/Mid Level
Company website: UBI

About UsefulBI:
UsefulBI is a leading AI-driven data solutions provider specialising in data engineering, cloud transformations, and AI-powered analytics for Fortune 500 companies. We help businesses turn complex data into actionable insights through our innovative products and services.

Role Overview:
We are looking for a highly skilled Data Engineer with strong expertise in modern data platforms, scalable data pipelines, and cloud ecosystems, along with exposure to Generative AI technologies.
The ideal candidate should have hands-on experience in building large-scale ETL/ELT pipelines, distributed data processing, and cloud-native architectures while also understanding GenAI/LLM-based systems such as RAG pipelines, vector databases, embeddings, and AI-powered analytics workflows. You will work closely with Data Engineers, AI/ML teams, Product teams, and Architects to build intelligent, scalable, and high-performance data solutions.

Key Responsibilities:
• Design, develop, and maintain scalable batch and real-time data pipelines
• Build and optimize ETL/ELT workflows using PySpark, SQL, and cloud-native services
• Work with structured and semi-structured datasets across lakehouse and warehouse environments
• Develop data ingestion frameworks for multiple enterprise data sources
• Optimize large-scale data processing and pipeline performance
• Build reusable and scalable data engineering components and frameworks
• Collaborate with Analytics, AI/ML, and Product teams for data-driven solutions
• Ensure data quality, governance, monitoring, and observability standards
• Work with cloud ecosystems such as AWS, Azure, Databricks, or GCP
• Support deployment and orchestration workflows using CI/CD and automation tools
• Contribute to AI-powered applications and GenAI-enabled analytics systems
• Work with vector databases, embeddings, and document-processing pipelines for AI use cases

Required Skills:
• Strong hands-on experience in Python and SQL
• Strong experience with PySpark and Databricks
• Experience building scalable ETL/ELT and data pipelines
• Good understanding of Data Warehousing and Lakehouse concepts
• Experience with AWS / Azure / GCP cloud platforms
• Hands-on experience with distributed data processing systems
• Experience with orchestration tools such as Airflow or similar schedulers
• Strong understanding of data modeling, performance tuning, and optimization
• Experience working with APIs, ingestion frameworks, and large datasets
• Strong problem-solving and debugging skills
• Good understanding of CI/CD, Git, and deployment workflows