- Git Explained - Unstage, Unmodify & Undo May 1, 2023
Version control systems are the backbone of collaborative development, ensuring that everyone is working on the same codebase and keeping track of changes. But when things go wrong, Git can be a daunting tool to navigate.
- Data Engineering Design Principles You Should Follow April 22, 2023
Software engineering is far more mature than the current state of data engineering, particularly when it comes to principles. For example, in software engineering, SOLID principles are a set of five design principles that help in writing maintainable and scalable code.
- Why ChatGPT Won't Replace Data Engineers April 16, 2023
It seems that we are living in a time that will one day be recognised as the beginning of the AI age. AI tools are becoming more intelligent and more accessible. With the release of ChatGPT, concerns have been raised again about AI outperforming humans in our jobs, but I believe we have many years ahead of us before AI can fully design and develop our data products from start to finish.
- Bare-bones Pandas March 26, 2023
Pandas is a popular data manipulation library in Python, but its syntax is often criticized for being confusing. There are multiple ways to achieve the same output, and these methods may have subtle differences. In this blog post, we will examine the most common Pandas methods and compare them, so you can choose the best approach for your data manipulation needs.
- A Guide To Creating Your First Data Engineering Project January 28, 2023
Getting your first data engineering role can be difficult, depending on the experience and skill level a company is looking for. Transferable skills from software engineering and data science make the transition easier than starting from scratch, so more often than not data engineering roles require some years of experience with programming languages like Python and SQL, data warehouse knowledge, or similar experience from other data-related roles.
- Learn Data Engineering in 2023 January 6, 2023
In 2023 the demand for data engineers is showing no sign of slowing down. Here we’ll be going over the core skills a data engineer should have and the concepts that have survived the years, with some excellent resources to look at. First let’s take a step back and understand the state of data engineering today - what has changed and what has remained - from the early days of data practices to the modern data stack.
- Best Practices for Python Projects in 2022 November 1, 2022
For most Python projects, the same foundational tools can speed up development of your project and remove inefficiencies. This article goes over some of my favourite tools for creating the perfect project.
- Installing Apache Airflow on AWS EC2 September 29, 2022
Apache Airflow is a widely used open source tool in organisations with large amounts of data processing. Created by Airbnb in 2015, Airflow is highly extensible, supporting many use cases in data engineering. It is designed to orchestrate your data pipelines, which are defined as directed acyclic graphs (DAGs).
- Setting up Python Projects with Pyenv & Poetry September 27, 2022
There are several ways to install Python on your system, and each comes with its own advantages and disadvantages. For example, a data scientist will benefit from the pre-installed packages like SciPy and NumPy from a quick Anaconda installation. For developers, however, it's better to use tools that make it easy to switch between different versions of Python, as your projects may require specific versions.