About a year ago, Github launched a feature that allows adding a README to a user profile. To add a README to your profile, you have to:

  • create a public repository with a name matching your Github username
  • place the README.md in the root of the repository

You can learn more about it in the Github documentation.
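For example, the whole setup boils down to a repository named after the user with a README.md in its root (the username octocat below is a hypothetical stand-in; replace it with your own):

```shell
#!/bin/sh
# Sketch, assuming the hypothetical username "octocat": the repository name
# must match the username and README.md must sit in its root.
USERNAME=octocat
mkdir -p "$USERNAME"
printf '## Hi, I am %s\n' "$USERNAME" > "$USERNAME/README.md"
ls "$USERNAME"
# prints: README.md
```

After creating the matching public repository on Github, push this directory to it and the README shows up on your profile page.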

What is a Dynamic Github Profile?

A dynamic Github profile is updated automatically on some external event or on a schedule. This is possible with Github Actions, another recently released Github feature. Github Actions is essentially a CI/CD system that lets you create and run custom workflows.

I first learned about the profile README from this article on Hackernoon. The author used PHP to fetch and update a list of the latest posts from his blog. Although I am a PHP expert myself, I wanted something more challenging. I realized that parsing XML and replacing text in a file is achievable with native Bash tools alone.

Parsing RSS feed

An RSS feed is a plain XML file with a simple schema. Here's a sample from my freshly launched blog:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Posts on Aleksandr Tabakov&#39;s technical blog | atabakoff</title>
    <link>https://atabakoff.com/posts/</link>
    <description>Recent content in Posts on Aleksandr Tabakov&#39;s technical blog | atabakoff</description>
    <generator>Hugo -- gohugo.io</generator>
    <lastBuildDate>Fri, 27 May 2022 21:33:51 +0200</lastBuildDate><atom:link href="https://atabakoff.com/posts/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>How to run Vaultwarden in Docker/Podman as a systemd service</title>
      <link>https://atabakoff.com/how-to-run-vaultwarden-in-podman-as-a-systemd-service/</link>
      <pubDate>Fri, 27 May 2022 21:33:51 +0200</pubDate>
      <description>Running Vaultwarden in a container as a systemd service using Podman. How to install Podman, run Vaultwarden in a container, create a systemd config for Vaultwarden service and manage it using systemctl.</description>
    </item>
  </channel>
</rss>
Each post is represented by an item element, from which we need title, link, and pubDate.

Parsing RSS feed with grep

The naive approach is to use grep and then build the markdown in a Bash loop. Let's first try to grep:

> wget --quiet -O rss.xml https://atabakoff.com/posts/index.xml
> cat rss.xml | grep -Po '<(title|link|pubDate)>[^<]+'
<title>Posts on Aleksandr Tabakov&#39;s technical blog | atabakoff
<title>How to run Vaultwarden in Docker/Podman as a systemd service
<pubDate>Fri, 27 May 2022 21:33:51 +0200

Not bad, but we need to get rid of the first three lines and the opening tag:

> cat rss.xml | grep -Po '<(title|link|pubDate)>[^<]+' | tail -n +4 \
     | grep -oE '>([^>]+)' | grep -oE '([^>]+)'
How to run Vaultwarden in Docker/Podman as a systemd service
Fri, 27 May 2022 21:33:51 +0200
Just to test grep
Fri, 31 May 2022 18:33:51 +0200
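As an aside, the second and third grep calls in the pipeline above simply peel off the opening tag in two passes; a minimal illustration on a single matched line (GNU grep assumed):

```shell
#!/bin/sh
# The first -oE pass keeps '>' plus the text after it; the second pass keeps
# only the run of non-'>' characters, i.e. the bare element value.
line='<title>How to run Vaultwarden in Docker/Podman as a systemd service'
printf '%s\n' "$line" | grep -oE '>([^>]+)' | grep -oE '([^>]+)'
# prints: How to run Vaultwarden in Docker/Podman as a systemd service
```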

I added one extra item to test that my expression works with multiple posts. At this point, I started to suspect that grep might not be the best option, but before researching other options I quickly wrote a converter from RSS to markdown:


#!/bin/bash
IFS=$'\n'
count=0
items=$( cat rss.xml | grep -Po '<(title|link|pubDate)>[^<]+' | tail -n +4 \
    | grep -oE '>([^>]+)' | grep -oE '([^>]+)' )

for item in $items; do
    case $(expr $count % 3) in
        0) title=$item ;;
        1) link=$item ;;
        2) pubDate=$( date -d "$item" +'%d/%m/%Y' )
           echo "* $pubDate [$title]($link)" ;;
    esac
    count=$(($count + 1))
done

Run it to validate:

> ./test.sh
* 27/05/2022 [How to run Vaultwarden in Docker/Podman as a systemd service](https://atabakoff.com/how-to-run-vaultwarden-in-podman-as-a-systemd-service/)
* 31/05/2022 [Just to test grep](https://atabakoff.com/testing-grep/)
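The date reformatting in the loop relies on GNU date parsing the RFC-822 timestamps used by RSS directly (TZ is pinned to UTC in this sketch so the calendar day is deterministic):

```shell
#!/bin/sh
# GNU date understands the RFC-822 timestamps found in RSS pubDate fields.
TZ=UTC date -d 'Fri, 27 May 2022 21:33:51 +0200' +'%d/%m/%Y'
# prints: 27/05/2022
```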

Testing performance

My RSS feed is tiny, so to measure performance we need to run the parser many times. I created a test.sh file:



#!/bin/bash
x=100
while [ "$x" -gt "0" ] ; do
   $(/bin/bash $1) 2>/dev/null
   x=$(($x - 1))
done
It accepts a script file as a parameter and runs it 100 times in a loop. Let's run it under time to see how long parsing the feed takes:

> time ./test.sh grep-rss.sh 
./test.sh grep-rss.sh  1,87s user 0,72s system 137% cpu 1,883 total

Not very impressive but expected due to the use of regular expressions.

Parsing RSS feed in Bash

I started googling whether there is a way to parse XML in Bash and found this awesome solution. It describes the same problem of parsing an RSS feed. I modified the code for my needs and stored it in the parse-rss.sh file:


#!/bin/bash
xmlgetnext () {
   local IFS='>'
   read -d '<' TAG VALUE
}

cat $1 | while xmlgetnext ; do
   case $TAG in
      'title')   title=$VALUE ;;
      'link')    link=$VALUE ;;
      'pubDate')
         pubDate=$( date -d "$VALUE" +'%d/%m/%Y' )
         echo "* $pubDate [$title]($link)" ;;
   esac
done
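The trick here is that read with IFS='>' and the delimiter '<' chops the XML stream into tag/value pairs. A standalone sketch of the technique (the sample XML below is invented):

```shell
#!/bin/bash
# 'read -d "<"' consumes input up to the next '<'; IFS='>' then splits that
# chunk into the tag name (TAG) and its text content (VALUE).
xmlgetnext () {
   local IFS='>'
   read -d '<' TAG VALUE
}

printf '<title>Hello</title><link>https://example.com/</link>' |
while xmlgetnext ; do
   case $TAG in
      title|link) echo "$TAG = $VALUE" ;;
   esac
done
# prints:
# title = Hello
# link = https://example.com/
```

Closing tags arrive as TAG=/title and TAG=/link, so the case statement silently skips them.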

I ran the same test to compare performance:

> time ./test.sh parse-rss.sh
./test.sh parse-rss.sh  0,81s user 0,33s system 109% cpu 1,042 total

Almost twice as fast: 1,042 vs 1,883 seconds. This is the approach I chose for processing the RSS feed.

Updating README.md

Updating the list of posts is simply a text replacement. Since markdown allows HTML, we can use HTML comments in README.md to mark a placeholder for the posts:

<!--blog:start-->
<!--blog:end-->

The standard tool for replacing text in Bash scripts is sed, but it has one limitation: it is a stream editor that processes input line by line, while in our case both the placeholder and the posts list span multiple lines. Here's how I solved it:



#!/bin/bash
NUM=$(($2*3))

POSTS=$( cat $1 | head -n $NUM | tr '\n' '\t' )
cat README.md | tr '\n' '\t' \
    | sed -E "s#(<\!--blog:start-->).*(<\!--blog:end-->)#\1\t${POSTS}\2#g" \
    | tr '\t' '\n' > README.tmp
mv README.tmp README.md
rm -f rss.xml posts.md

Some things worth explaining:

  • NUM=$(($2*3)) is the number of lines for the requested number of posts; in my case I show five posts, each taking three lines (title, link, date)
  • tr '\n' '\t' converts the text to a single line so sed can process it
  • tr '\t' '\n' brings the newlines back
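The round-trip can be checked in isolation. A minimal sketch with an invented one-post list and the same placeholder markers (GNU sed assumed, since \t in the replacement is a GNU extension):

```shell
#!/bin/sh
# Flatten newlines to tabs so sed's pattern can span the whole placeholder
# region, substitute the post list, then restore the newlines.
POSTS=$( printf '%s\n' '* 27/05/2022 [Post one](https://example.com/one/)' | tr '\n' '\t' )
printf '%s\n' 'Intro' '<!--blog:start-->' 'old list' '<!--blog:end-->' 'Outro' \
    | tr '\n' '\t' \
    | sed -E "s#(<\!--blog:start-->).*(<\!--blog:end-->)#\1\t${POSTS}\2#g" \
    | tr '\t' '\n'
```

The output keeps Intro and Outro untouched and replaces everything between the markers with the new list, each entry back on its own line.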

Github Actions Pipeline

Now that we have our scripts, we need to put them into a pipeline. Github Actions looks into the special .github/workflows directory and processes every YAML file there. I created a posts.yml file there with the following content:

name: Update blog posts

on:
  push:
  schedule:
    - cron:  '0 0 * * *'

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
    - name: Clone repository
      uses: actions/checkout@v2
      with:
        fetch-depth: 1
    - name: Fetch RSS feed
      run: wget --quiet -O rss.xml https://atabakoff.com/posts/index.xml
    - name: Parse RSS feed
      run: |
        cd ${GITHUB_WORKSPACE}
        ./src/parse-rss.sh rss.xml > posts.md
    - name: Update README.md
      run: |
        cd ${GITHUB_WORKSPACE}
        ./src/update-readme.sh posts.md 5
    - name: Push changes
      run: |
        git config --global user.name "${GITHUB_ACTOR}"
        git config --global user.email "${GITHUB_ACTOR}@users.noreply.github.com"
        git commit -am "Updated blog posts" || exit 0
        git push

Here’s what needs to be explained:

  • push triggers the workflow on every push to the repository
  • cron: '0 0 * * *' is a schedule; in my case, every day at midnight
  • uses: actions/checkout@v2 clones the repository

Then I split fetching, parsing, and updating into separate steps. This allows me to quickly localize a problem if something goes wrong. Worth noting:

  • cd ${GITHUB_WORKSPACE} moves into the working directory, which is the freshly cloned repository
  • ${GITHUB_ACTOR} is your username
  • ${GITHUB_ACTOR}@users.noreply.github.com is a special Github email one can use to push changes to the repository
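One subtlety in the push step: git commit exits with a non-zero status when there is nothing to commit, which would fail the whole job. That is why the commit is guarded with || exit 0 (note the double bar; a single pipe | would silently discard the commit's exit status instead). The behaviour in isolation:

```shell
#!/bin/sh
# "command || exit 0": if the command fails, the script exits successfully
# and skips the remaining steps (echo here stands in for git push).
sh -c 'false || exit 0; echo "pushing..."'
echo "exit status: $?"
# prints: exit status: 0
```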


You can find the full solution in my profile repository. It's been a lot of fun solving this problem with pure Bash.

That being said, there are lots of community-made Github Actions that let you create a dynamic profile without writing any code; all you need is some YAML. But there's little challenge in that. It is not the warrior's way.