Senior Data Engineer - Platform and Frameworks (m/f/d)

The mission

The web was created by scientists and for scientists, to foster scientific collaboration and drive progress for a better world. Join our team to take the web back to its roots and achieve that original mission.
We’re a passionate team of pragmatic optimists from around the world and from many different backgrounds. Together, we focus on building great products that change the way scientists communicate for the better.

We love what we do. We connect the world of science and make research open to all.

The position

As part of ResearchGate’s data engineering team, you will work at the core of our data pipelines. These pipelines not only help our Analytics and Business departments make the right decisions but also enable product teams to craft the data-driven product features that make science more effective and faster on ResearchGate. Join us and help shape our data infrastructure to be reliable, robust and fast!


Responsibilities

  • Become an essential member of our Machine Learning Infrastructure Architecture Team and shape the long-term vision of ML at ResearchGate
  • Develop a system that enables data teams to quickly iterate on ML-based workloads and easily deploy their models to our production systems
  • Ensure that the data pipelines we use at ResearchGate are ready for future challenges
  • Provide technical leadership and partner with fellow engineers to architect, design and build infrastructure that scales and stays highly available while reducing operational overhead
  • Engineer efficient, adaptable and scalable data architectures to make building and maintaining big data applications easy and enjoyable for others
  • Build fault-tolerant, self-healing, adaptive, and highly accurate data computation pipelines
  • Work with data scientists, data analysts, backend engineers, and product managers to solve problems, identify trends and leverage the data we produce
  • Build workflows involving large datasets and/or machine learning models in production using distributed computing and big data processing concepts and technologies


Requirements

  • Experience in designing and implementing data pipelines and ML applications
  • Experience working with data at the petabyte scale
  • Experience designing and operating robust distributed systems
  • Experience in Python is a must; experience in Java is a plus
  • Working knowledge of relational databases and query authoring (SQL)
  • Experience using technologies like Kafka, Hadoop, Hive, and Flink
  • Experience with machine learning tools, frameworks and libraries such as Python, R, Jupyter Notebook, scikit-learn, PyTorch, or TensorFlow is a plus

You'll be working in a team-based environment where code is written, tested and shipped continuously. Our engineering team is passionate about building maintainable, scalable web applications that are constantly optimized to meet the needs of our users: more than 15 million researchers worldwide.
Our hiring process is uncomplicated. You'll be interviewed by the people you'll be working with, so you can quickly find the role that suits you best and start making an impact.
We’re located at the heart of Berlin, one of the most exciting cities in the world and a place where people from all walks of life feel welcome. Work to change the world of science and have a good time while you’re at it: we offer free, healthy lunches and many fun events.