As a Data Engineer you will be part of a team responsible for designing, configuring and maintaining some of the most complex data systems we use, including relational, NoSQL, in-memory, persistent and distributed databases, as well as parallel processing systems.
Other tasks and areas of responsibility:
Participate in the design and implementation of a new architecture for the ingestion, processing and storage of large amounts of data
Build data pipelines to automate the ETL processes of many of the sources we combine at MyTraffic
Work closely with the rest of the team, including Data Scientists and Software Engineers, to understand their needs and find optimal solutions
Constantly review and update existing systems to find better solutions or technologies to make them more flexible, scalable or performant
Get involved in DevOps and infrastructure tasks, including infrastructure configuration and data systems administration
We are looking for someone who is passionate about databases, Big Data and ETLs.
You would be one of the team’s points of reference for how data and databases are architected, structured and integrated with the rest of MyTraffic’s systems.
About Your Team
You will join a motivated team of French and Spanish professionals with extensive experience in Big Data environments.
You’ll have the opportunity to discuss any new technology with them, as they are always curious and strive to get the best out of it.
Some key members in your day-to-day at MyTraffic will be:
Guillermo Sánchez-Valdepeñas (Data Engineering Manager): you will report directly to him. He previously worked as a data engineer at two of Spain’s largest banks, Santander and BBVA. Seeking a shift from the 'banking life' to a more agile environment, he joined Geoblink (acquired by MyTraffic).
Sébastien Diemer (Staff Engineer): began his career in 2008 as a research intern at UC Berkeley, where he contributed to the floating sensor network project. In 2013, Sébastien became a PhD student at MINES ParisTech, specializing in algorithms for coordinating automated vehicles while minimizing travel time and traffic congestion. He was our CTO for 5 years but decided to return to an individual contributor role, where he can bring the most value.
Mario Fernandez (Senior Data Engineer): has a strong background in big data engineering. His career spans roles including Big Data Architect at StratioBD and BI Developer at ABOUT YOU. He has experience with data technologies such as Kubernetes, Apache Airflow, and Apache Spark. He is also our infrastructure reference: with a hybrid profile between data engineering and infrastructure engineering, he is always looking to automate as much as possible.
Thibault Karrer (Senior Data Engineer): builds large-scale data pipelines, infrastructure, and a BI platform. He has experience as a Senior Data Engineer and Data Scientist at Hivency, where he implemented machine learning models and data architecture on Google Cloud Platform. He has expertise in Terraform, Apache Spark, and Python.
Guilhem de Viry (Data Engineer): builds and maintains the company’s data platform. His responsibilities include designing and optimizing Spark jobs, managing infrastructure, creating custom data analysis tools, and automating CI processes. Guilhem brings strong expertise in Python, AWS, and Apache Spark.
Arthur Rebouillet-Petiot (Data Engineer): brings data science and engineering expertise. With previous roles as a Data Science Consultant at Sia Partners and a Research Assistant at Imperial College London, he has worked on advanced machine learning projects, including turbulence modeling and production control algorithms.
About You
Previous experience in a Data Engineer role and a Computer Science or related degree.
You are familiar with a wide variety of databases and have hands-on experience with some of them.
Hands-on experience with infrastructure in the cloud (we use AWS)
Hands-on experience working with architectures that take into account databases, data transfer and data processing systems.
Experience with data pipelines and ETL processes.
Experience with orchestrators (we use Airflow)
Software development fundamentals such as data structures, algorithms and problem solving
Ability to craft simple and elegant solutions to complex problems.
Good communication skills, you know how to explain in a simple way how a complex system works.
Comfortable working in a startup environment.
You are a curious person who loves solving challenges.
Any published open source code is a plus.
Extra kudos if you have experience working with spatial data or GIS systems
If you’re the type of person who is constantly reading about new trends to see what’s going on out there and how you can incorporate new technologies into your current project when there are good reasons for it, then we want to hear from you.
If you don't meet all the criteria, don't worry. At MyTraffic, we believe in continuous learning. If this role interests you, please apply and explain your motivations. We’d love to hear from you!
What We Offer
Dynamic Work Environment: Our department is a hub of innovation, testing new technologies and exploring new verticals.
High Impact: The quality of our analyses is recognized by major market players, and our solutions have a significant impact on our clients.
Cross-functional Collaboration: You will be the link between various MyTraffic teams throughout project timelines, ensuring comprehensive and effective solutions.