• Learn Data Engineering in 2023 January 6, 2023
  • In 2023 the demand for data engineers is showing no sign of slowing down. Here we’ll be going over the core skills a data engineer should have and the concepts that have survived the years, with some excellent resources to look at. First let’s take a step back and understand the state of data engineering today - what has changed and what has remained - from the early days of data practices to the modern data stack.

    Read more »
  • Best Practices for Python Projects in 2022 November 1, 2022
  • For most Python projects, the same foundational tools can speed up development of your project and remove inefficiencies. This article goes over some of my favourite tools for creating the perfect project.

    Read more »
  • Installing Apache Airflow on AWS EC2 September 29, 2022
  • Apache Airflow is a widely used open source tool in organisations with large amounts of data processing. Created at Airbnb and open-sourced in 2015, Airflow is highly extensible, supporting many use cases in data engineering. It is designed to orchestrate your data pipelines, which are defined as directed acyclic graphs (DAGs).

    Read more »
  • Setting up Python Projects with Pyenv & Poetry September 27, 2022
  • There are several ways to install Python on your system, and each comes with its advantages and disadvantages. For example, a data scientist will benefit from the pre-installed packages like SciPy and NumPy from a quick Anaconda installation. For developers, however, it’s better to use tools that make it easy to switch between different versions of Python, as your projects may require specific versions.

    Read more »