What is a Data Engineer?

Pollen is a data-driven company that relies heavily on its team of Data Engineers, but who are they, what do they do and why would they choose to work at Pollen?

Pollen’s Data Engineering Lead, Paul Monk, sat down with us to answer these questions and more. 

Pollen: You’ve been at Pollen about two and a half years now. What was your journey to becoming a data engineer and joining us?

Paul: I found my way into technology through my love of problem solving and started my career as a support technician. That encouraged me to learn programming to automate the manual tasks I was doing every day. I moved departments and took a role in the web development team, which gave me good exposure to front-end and back-end web applications, with JavaScript & React and Python & Django respectively. I stayed there for about two years before joining Pollen as a mid-level software engineer.

So why data?

The first few projects I worked on turned out to be data related, so it became clear that we needed a dedicated data function within Pollen. I could see how important data was going to be as we scaled, and I could see a need there: a place where I could make an impact and add value. I’m part of the quantified-self movement: wearing fitness trackers, tracking what I eat, tracking my exercise and optimising myself, and for that you need data, right? Data is one of my keen interests, so it’s a natural fit!

So what is a Data Engineer?

Our mission is to collect, protect and scale our data usage across the entire organisation. We want all our internal users to have a great experience, whether they’re analytics engineers, data analysts, data scientists or other users generating their own insights. We’re like an engineering platform team, but instead of making sure the web servers are on and scaled up, we’re doing that for our data warehouse, CI/CD and infrastructure tooling, making sure our users can work at high velocity, safely.

We also ingest data from our various cloud services into our data warehouse, Snowflake, making sure it’s secure and performant. Where possible we use off-the-shelf tooling like Stitch or Fivetran. Not every source is supported, so for the rest we write our own ingestion in Python and use Airflow to orchestrate it.
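To give a flavour of what a hand-written ingestion step can look like, here is a minimal sketch in plain Python. It is not Pollen's code: `ingest`, `batched`, `fake_fetch` and `InMemoryWarehouse` are hypothetical names, the in-memory class stands in for a real warehouse such as Snowflake, and the `updated_at` field is an assumed cursor column. The idea shown is incremental loading, where each run only picks up records newer than the last one it saw.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class InMemoryWarehouse:
    """Hypothetical stand-in for a warehouse table (a real loader would write to Snowflake)."""
    rows: list = field(default_factory=list)

    def load(self, batch: list) -> None:
        self.rows.extend(batch)


def batched(records: Iterable[dict], size: int) -> Iterator[list]:
    """Group records into lists of at most `size` items."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def ingest(fetch: Callable[[int], Iterable[dict]],
           warehouse: InMemoryWarehouse,
           since: int,
           batch_size: int = 2) -> int:
    """Load records newer than `since` and return the new high-water mark."""
    high_water = since
    for batch in batched(fetch(since), batch_size):
        warehouse.load(batch)
        high_water = max(high_water, max(r["updated_at"] for r in batch))
    return high_water


def fake_fetch(since: int) -> Iterator[dict]:
    """Hypothetical source API: yields records updated after `since`."""
    data = [
        {"id": 1, "updated_at": 1},
        {"id": 2, "updated_at": 2},
        {"id": 3, "updated_at": 3},
    ]
    return (r for r in data if r["updated_at"] > since)


warehouse = InMemoryWarehouse()
mark = ingest(fake_fetch, warehouse, since=1)
```

An orchestrator such as Airflow would call a function like `ingest` on a schedule, storing the returned mark so the next run starts where the previous one finished.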

We have a DevOps-type culture, so we’re building monitoring and tooling so that if an analytics engineer writes some code and it breaks a test, they can go in and own the fix themselves. This goes back to Pollen’s core drivers of Mastery and Freedom & Ownership. We give them the freedom and the tools to develop a mastery in SQL and data modelling, and empower them to own the creation of friendly datasets.

What’s the best thing about being a Data Engineer?

I’m massively inspired by having impact and adding value, bringing disparate data sources together to create a picture that you can’t see in isolation. Even if I’m not directly analysing the data or building the data model, empowering teams to do this has a massive impact and adds a tremendous amount of value to the company.

What mix of experience and skills makes a great data engineer?

From a technical perspective, you should start with a language, probably Python. It’s a really friendly language to learn if you’re starting out, so nail that down first. SQL is the lingua franca of data, and Scala’s useful for big data. As for key traits, I’d say curiosity: not leaving questions unanswered, like ‘Is the answer in the data?’ or ‘Could this system work better?’

Being passionate and loving what you’re doing is key, as is constantly searching for the best solutions, but it’s important to balance that with pragmatism – you need to know the difference between the perfect solution and a working solution. A technically perfect solution isn’t necessarily the best solution for the business. 

Lastly, I think it’s about thinking of the team rather than the individual, and having the humility to know when you’re wrong. I love seeing a data engineer at Pollen with a junior engineer or analyst, improving their mastery together and working to push things forward.

Paul Monk

Data Engineering Lead
