GitHub is a powerful code hosting platform with millions of developers from all over the world. It is a great way to collaborate with others on code and share it with the world.
I was recently working on an open source project in my free time to create a Python package that could be used to create architecture diagrams with a focus on Azure. You can visit the Architectures repository if you are interested.
As part of this project, I wanted to create a GitHub workflow to publish my package to the Python Package Index, also known as PyPI. GitHub already has a feature called GitHub Actions that will let you do this! A workflow in GitHub Actions is basically a pipeline that helps you automate some code activity such as compiling code, running tests, or publishing artifacts. Workflow files are written in a data-serialization language called YAML.
To create the pipeline, I added a .github folder at the root of my repository, a workflows folder under that, and the python-publish.yml file below inside it.
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries
name: Upload Python Package
on:
  release:
    types: [released]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install requests
          pip install setuptools wheel twine
      - name: Build and publish
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          python setup.py sdist bdist_wheel
          twine upload dist/*
There is a lot going on in a small amount of code here, but I will break it down so we can get to the automated release versioning aspect of this deployment.
The “name” section states that the name of the pipeline will be “Upload Python Package”.
The “on release” section tells GitHub to run this workflow every time a release is published. The “jobs deploy” section defines a job named deploy that runs on a GitHub-hosted Ubuntu machine using the latest available image.
The “steps” section outlines all of the activities that will take place as part of the workflow.
- The first step runs version two of the checkout action. This action is responsible for gathering the code from the repository.
- The “Set Up Python” step uses version two of the setup-python action and states that the version of Python used should be the latest 3.x version.
- The “Install Dependencies” step runs a few commands on the hosted agent. Those commands are making sure that pip (the Python Package Manager command line utility) is up to date and that the requests, setuptools, wheel, and twine libraries are installed.
- The “Build and Publish” step reads encrypted secrets that have been added to GitHub and uses them to authenticate with PyPI and publish the package based on a setup.py file at the root of the repository.
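To make the last step more concrete, here is a rough Python sketch of what the “Build and Publish” step does on the hosted machine. The function names plan_publish and publish are hypothetical and only for illustration; the one thing taken from the workflow itself is that the credentials reach twine through the TWINE_USERNAME and TWINE_PASSWORD environment variables.

```python
import glob
import os
import subprocess
import sys


def plan_publish(username, password):
    """Return the build command, upload command, and environment for publishing.

    Hypothetical helper: it only describes the work, it does not run anything.
    """
    # Twine picks up its credentials from these two environment variables.
    env = dict(os.environ, TWINE_USERNAME=username, TWINE_PASSWORD=password)
    build = [sys.executable, "setup.py", "sdist", "bdist_wheel"]
    # The files to upload are appended later, once the build has produced them.
    upload = [sys.executable, "-m", "twine", "upload"]
    return build, upload, env


def publish(username, password, dist_dir="dist"):
    """Build the distributions, then upload everything found in dist_dir."""
    build, upload, env = plan_publish(username, password)
    subprocess.run(build, check=True, env=env)
    files = sorted(glob.glob(os.path.join(dist_dir, "*")))
    subprocess.run(upload + files, check=True, env=env)
```

Calling plan_publish alone is a safe way to inspect what would run without building or uploading anything.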
The setup.py file describes how to build and install a Python package, usually one that you created yourself. If you want to share your package with the world, you first have to build the source and wheel distributions locally:
python setup.py sdist bdist_wheel
Then publish to PyPI using twine:
twine upload dist/*
If you want to know more about the structure of the setup.py file and running these steps locally, I would recommend reading through the Python.org documentation on Packaging Python Projects. I will also share my setup file at the end of this article.
One of the attributes specified in the setup file is the version of the package to build and publish. The issue I had was that I didn’t want to have to update this file every time I released a new version of my package on GitHub. I wanted all of this to happen automatically. I was able to achieve this by writing a simple function and adding it to my setup.py file to populate the version attribute.
import json
import re

import requests


def get_release_data(user, repo, field=None, regex_pattern=None, group_number=0):
    """
    Extract data from the latest release.
    """
    # Ask the GitHub REST API for the most recent release of the repository.
    response = requests.get(f"https://api.github.com/repos/{user}/{repo}/releases/latest")
    release_data_dict = json.loads(response.text)
    if field:
        field_value = release_data_dict[field]
        if regex_pattern is None:
            output = field_value
        else:
            # Keep only the part of the field that matches the pattern.
            output = re.search(regex_pattern, field_value).group(group_number)
    else:
        # No field requested: return the whole release payload.
        output = release_data_dict
    return output


# Raw string so the backslashes reach the regex engine unchanged.
version = get_release_data(user="jsoconno", repo="architectures", field="tag_name", regex_pattern=r"[0-9]*\.[0-9]*\.[0-9]*")
Although I originally intended for this function to just get the latest release number from GitHub for a given user and repository, I realized looking at the API that I could make this much more flexible and allow the user to pull any data about a release based on a regex pattern.
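To see what the regex part does without calling the GitHub API, here is a small sketch that runs the same kind of extraction on a few made-up tag names. The extract_version helper and the sample tags are hypothetical, and the pattern uses + instead of * so that every part of the version needs at least one digit, which is a slightly stricter assumption than the pattern in my function.

```python
import re

# Assumed pattern: three dot-separated groups of one or more digits.
VERSION_PATTERN = r"[0-9]+\.[0-9]+\.[0-9]+"


def extract_version(tag_name, pattern=VERSION_PATTERN, default=None):
    """Pull a version string out of a release tag (hypothetical helper)."""
    match = re.search(pattern, tag_name)
    if match is None:
        # re.search returns None when nothing matches, so handle it explicitly.
        if default is None:
            raise ValueError(f"No version found in tag {tag_name!r}")
        return default
    return match.group(0)


print(extract_version("v1.2.3"))                    # 1.2.3
print(extract_version("release-0.10.4-beta"))       # 0.10.4
print(extract_version("nightly", default="0.0.0"))  # 0.0.0
```

Handling the no-match case explicitly avoids an AttributeError from calling group on None when a tag does not contain a version number.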
At this point, it all came together. Let’s walk through the flow. I write some code and push it to GitHub. I do this until I’m happy with the current major or minor release. Once all code coverage thresholds and tests pass, I create a release using the GitHub GUI. Because my YAML pipeline is set to trigger on release, it starts to run. When it goes to build the distribution using the setup file, it runs the above function to get the version tag of the latest release on GitHub (that is, the release I just created). That distribution is then pushed to PyPI using the twine library and the encrypted secret credentials stored on GitHub for my PyPI account.
Voila! Every time I create a new release in GitHub, the version I set there is what is used in the build and publish process for my package! CI/CD for the win!
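One variation worth sketching: when the build happens inside a release-triggered workflow, the tag may also be available locally, so the version could be read without a network call. The sketch below assumes that GitHub Actions exposes the release tag in the GITHUB_REF_NAME environment variable for this kind of run, which you should verify against the GitHub documentation before relying on it. The version_from_environment function and its "0.0.0" fallback are hypothetical.

```python
import os
import re


def version_from_environment(environ=None, default="0.0.0"):
    """Read a version from the tag name in the environment (hypothetical helper).

    Assumes GITHUB_REF_NAME holds the release tag, for example "v1.2.3".
    """
    environ = os.environ if environ is None else environ
    ref_name = environ.get("GITHUB_REF_NAME", "")
    match = re.search(r"[0-9]+\.[0-9]+\.[0-9]+", ref_name)
    # Fall back to the default when the variable is missing or has no version.
    return match.group(0) if match else default


print(version_from_environment({"GITHUB_REF_NAME": "v2.0.1"}))  # 2.0.1
print(version_from_environment({}))                             # 0.0.0
```

Passing the environment in as a dictionary keeps the function easy to try out locally, outside of a workflow run.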
For full context, here is an example of my setup.py file.
import json
import re

import requests
import setuptools


def get_release_data(user, repo, field=None, regex_pattern=None, group_number=0):
    """
    Extract data from the latest release.
    """
    # Ask the GitHub REST API for the most recent release of the repository.
    response = requests.get(f"https://api.github.com/repos/{user}/{repo}/releases/latest")
    release_data_dict = json.loads(response.text)
    if field:
        field_value = release_data_dict[field]
        if regex_pattern is None:
            output = field_value
        else:
            # Keep only the part of the field that matches the pattern.
            output = re.search(regex_pattern, field_value).group(group_number)
    else:
        # No field requested: return the whole release payload.
        output = release_data_dict
    return output


# Raw string so the backslashes reach the regex engine unchanged.
version = get_release_data(user="jsoconno", repo="architectures", field="tag_name", regex_pattern=r"[0-9]*\.[0-9]*\.[0-9]*")

# Use the README as the long description shown on the PyPI project page.
with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="architectures",
    version=version,
    author="Justin O'Connor",
    author_email="jsoconno@gmail.com",
    description="Tools for creating architecture as code using Python.",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/jsoconno/architectures",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    python_requires='>=3.7',
)