Syncing Specific Folders Between Different Repositories Using GitHub Actions

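Just how absurd can a person get to avoid doing real work? Yes, I'm talking about myself. Here is a quick record of what I've been tinkering with over the last two days.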

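I wrote about my blog publishing workflow before. My Hugo site and my Obsidian Vault do not share a root directory, and I write all my Markdown in Obsidian day to day. To avoid manually moving files into Hugo's content/posts every time, I had been using FreeFileSync to sync the two local folders: a bit bloated, but it worked.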

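Why the urge to tinker again? My laptop has been in service for nearly six years and its performance is increasingly stretched. I have uninstalled the software I rarely use, and keeping FreeFileSync around just to sync one small folder is a waste. Besides syncing my Obsidian notes through Google Drive, I also wanted regular backups on GitHub as a second layer of insurance.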

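My plan:

  • Link the Obsidian Vault to a private GitHub repo (Ob for short)
  • If a push includes changes in the blog-post folder, trigger a GitHub Action
  • From the commit information, generate a changed_files.txt for the blog-post folder of the Ob repo, listing the names of all changed files
  • Following changed_files.txt, sync the corresponding files from Ob's blog-post folder to the content/posts folder of the Hugo repo, then commit and push in the Hugo repo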

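When the Hugo repo receives the push, it triggers another GitHub Action that builds the Hugo site and publishes it to my GitHub Pages. With a Git plugin installed in Obsidian, both the vault backup and the blog publishing can be done with one click without leaving Obsidian.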

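It sounds long-winded, but it is really just syncing folder A1 of repo A to folder B1 of repo B. After struggling through it I found there is already a ready-made wheel for this, repo-file-sync-action, but tinkering on my own is always extra motivating. I had ChatGPT write everything along the way, and at the end I asked it whether XX would be a fitting title to describe the process. ChatGPT promptly generated a whole article for me! Fine, I won't say no. I'm copying and pasting it here as a backup. Also, because I asked in passing how two-way sync would work, it added the reverse sync as well.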


In this article, we’ll explore how to synchronize specific folders between different GitHub repositories using GitHub Actions. This is particularly useful for scenarios where you need to keep a subset of your files in sync across multiple projects.

Problem Statement

We have two repositories:

  1. Repository A: github.com/username/repo-a
    • Contains a folder Folder-A1
  2. Repository B: github.com/username/repo-b
    • Contains a folder Folder-B1

The goal is to ensure that changes in Folder-A1 in Repository A are automatically synced to Folder-B1 in Repository B, and vice versa.

Prerequisites

  1. Personal Access Token (PAT): You’ll need a PAT with repository permissions to enable GitHub Actions to push changes to the repositories.
  2. GitHub Actions: GitHub’s CI/CD service that allows you to automate tasks directly in your repositories.

Steps to Set Up Syncing

1. Create a Personal Access Token

  1. Go to GitHub’s Personal Access Token settings.
  2. Generate a new token with the repo scope.

2. Add the PAT to Repository Secrets

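  1. Navigate to each repository (repo-a and repo-b) on GitHub; both workflows read secrets.PAT, so both need the secret.
  2. Go to Settings > Secrets and variables > Actions.
  3. Add a new repository secret named PAT with the value of your generated token.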
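
The same secret can also be added from a terminal. The following is a minimal sketch that assumes the GitHub CLI (gh) is installed and authenticated, and that the token is held in an environment variable called PAT_VALUE (a name chosen here for illustration); set_pat_secret is a hypothetical helper, and username/repo-a and username/repo-b are the placeholder names used in this article.

```shell
# Hypothetical helper: store the token as a secret named PAT in one repository.
# gh reads the secret value from standard input when no --body flag is given.
set_pat_secret() {
  printf '%s' "$PAT_VALUE" | gh secret set PAT --repo "$1"
}

# Only touch anything when gh is available and a token was provided.
if command -v gh >/dev/null 2>&1 && [ -n "${PAT_VALUE:-}" ]; then
  for repo in username/repo-a username/repo-b; do
    set_pat_secret "$repo"
  done
else
  echo "gh not found or PAT_VALUE unset; nothing changed" >&2
fi
```

Piping the value through standard input keeps the token out of the command line itself.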

3. Configure GitHub Actions Workflows

We will create two workflows, one for each repository, to handle the synchronization.

Workflow for Repository A (Sync Folder-A1 to Folder-B1)

Create the workflow file in the repo-a repository:

# .github/workflows/sync_folder_a1_to_b1.yml
name: Sync Folder-A1 to Folder-B1

on:
  push:
    paths:
      - 'Folder-A1/**'
    branches:
      - main  # Change this to the default branch of your repository if it's different

jobs:
  sync:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout Repo A (repo-a)
      uses: actions/checkout@v3
      with:
        token: ${{ secrets.PAT }}
        fetch-depth: 0  # Ensure full history is fetched

    - name: Install rsync
      run: sudo apt-get install -y rsync

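    - name: Checkout Repo B (repo-b)
      run: |
        # Clone with the PAT so a private repo-b can be cloned and the later push is authenticated
        git clone --branch master "https://x-access-token:${PAT}@github.com/username/repo-b.git"
        cd repo-b
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
      env:
        PAT: ${{ secrets.PAT }}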

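    - name: Compare and Sync Folder-A1 to Folder-B1
      run: |
        cd $GITHUB_WORKSPACE
        # List files changed by the latest commit; core.quotepath=false keeps non-ASCII names unescaped
        if git rev-parse HEAD~1 >/dev/null 2>&1; then
          git -c core.quotepath=false diff --name-only HEAD~1 -- Folder-A1/ > changed_files.txt
        else
          find Folder-A1 -type f > changed_files.txt
        fi
        cd repo-b
        while IFS= read -r file; do
          target="Folder-B1/${file#Folder-A1/}"
          if [ -f "../$file" ]; then
            # Create missing subfolders first, otherwise rsync fails on new nested paths
            mkdir -p "$(dirname "$target")"
            rsync -av "../$file" "$target"
          else
            # The file was deleted from Folder-A1, so remove its copy
            rm -f "$target"
          fi
        done < ../changed_files.txt
        git add Folder-B1
        # Commit and push only if something changed; a bare git commit fails when nothing is staged
        if ! git diff --cached --quiet; then
          git commit -m "Sync changes from Folder-A1 to Folder-B1"
          git push origin master
        fi
      env:
        PAT: ${{ secrets.PAT }}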
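
The copy-or-delete loop in the last step can be tried locally before committing the workflow. This is a minimal sketch under a few assumptions: sync_changed is a hypothetical helper name, the directories are throwaway ones created with mktemp, and cp stands in for rsync so the sketch runs where rsync is not installed. It shows how the ${file#prefix/} expansion maps a path under Folder-A1 to the same relative path under Folder-B1.

```shell
# Hypothetical helper: apply a list of changed paths from a source tree to a target tree.
# $1 = source root, $2 = target root, $3 = source folder, $4 = target folder, $5 = list file
sync_changed() {
  while IFS= read -r file; do
    rel=${file#"$3"/}                 # strip the source folder prefix
    if [ -f "$1/$file" ]; then
      mkdir -p "$(dirname "$2/$4/$rel")"
      cp -p "$1/$file" "$2/$4/$rel"   # file exists in the source: copy it over
    else
      rm -f "$2/$4/$rel"              # file is gone from the source: delete the copy
    fi
  done < "$5"
}

# Throwaway demo: one new nested file and one file that was deleted from the source.
work=$(mktemp -d)
mkdir -p "$work/a/Folder-A1/notes" "$work/b/Folder-B1"
echo "hello" > "$work/a/Folder-A1/notes/post.md"
echo "stale" > "$work/b/Folder-B1/old.md"
printf '%s\n' "Folder-A1/notes/post.md" "Folder-A1/old.md" > "$work/changed_files.txt"

sync_changed "$work/a" "$work/b" Folder-A1 Folder-B1 "$work/changed_files.txt"
```

After the call, Folder-B1/notes/post.md exists in the target tree and Folder-B1/old.md has been removed.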

Workflow for Repository B (Sync Folder-B1 to Folder-A1)

Create the workflow file in the repo-b repository:

# .github/workflows/sync_folder_b1_to_a1.yml
name: Sync Folder-B1 to Folder-A1

on:
  push:
    paths:
      - 'Folder-B1/**'
    branches:
      - master  # Change this to the default branch of your repository if it's different

jobs:
  sync:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout Repo B (repo-b)
      uses: actions/checkout@v3
      with:
        token: ${{ secrets.PAT }}
        fetch-depth: 0  # Ensure full history is fetched

    - name: Install rsync
      run: sudo apt-get install -y rsync

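    - name: Checkout Repo A (repo-a)
      run: |
        # Clone with the PAT so a private repo-a can be cloned and the later push is authenticated
        git clone --branch main "https://x-access-token:${PAT}@github.com/username/repo-a.git"
        cd repo-a
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
      env:
        PAT: ${{ secrets.PAT }}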

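    - name: Compare and Sync Folder-B1 to Folder-A1
      run: |
        cd $GITHUB_WORKSPACE
        # List files changed by the latest commit; core.quotepath=false keeps non-ASCII names unescaped
        if git rev-parse HEAD~1 >/dev/null 2>&1; then
          git -c core.quotepath=false diff --name-only HEAD~1 -- Folder-B1/ > changed_files.txt
        else
          find Folder-B1 -type f > changed_files.txt
        fi
        cd repo-a
        while IFS= read -r file; do
          target="Folder-A1/${file#Folder-B1/}"
          if [ -f "../$file" ]; then
            # Create missing subfolders first, otherwise rsync fails on new nested paths
            mkdir -p "$(dirname "$target")"
            rsync -av "../$file" "$target"
          else
            # The file was deleted from Folder-B1, so remove its copy
            rm -f "$target"
          fi
        done < ../changed_files.txt
        git add Folder-A1
        # Commit and push only if something changed; a bare git commit fails when nothing is staged
        if ! git diff --cached --quiet; then
          git commit -m "Sync changes from Folder-B1 to Folder-A1"
          git push origin main
        fi
      env:
        PAT: ${{ secrets.PAT }}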

Explanation of the Workflows

  1. Trigger: The workflows are triggered by a push event to specific paths and branches.
  2. Checkout Repositories: The actions/checkout@v3 action checks out the source repository’s code with its full history.
  3. Install rsync: rsync is used for efficient file synchronization.
  4. Clone the Target Repository: The target repository is cloned and configured for Git operations.
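  5. File Comparison and Sync: git diff --name-only HEAD~1 lists the files changed by the latest commit (find lists every file when there is no parent commit). Each listed file is copied with rsync, and files that no longer exist in the source are removed from the target. Because only the latest commit is compared, a push containing several commits syncs only the changes of the last one.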
  6. Commit and Push: The changes are committed and pushed back to the target repository.
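
The file comparison in step 5 can be reproduced in a throwaway repository to see what ends up in changed_files.txt. This is a minimal sketch assuming git is installed locally; list_changed is a hypothetical helper name, and the -c core.quotepath=false option is an addition here so that non-ASCII filenames are printed verbatim instead of being escaped.

```shell
# Hypothetical helper: list files under a folder that changed in the latest commit,
# falling back to every file when there is no parent commit to compare against.
list_changed() {
  if git rev-parse HEAD~1 >/dev/null 2>&1; then
    git -c core.quotepath=false diff --name-only HEAD~1 -- "$1/"
  else
    find "$1" -type f
  fi
}

# Throwaway repository with two commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"

mkdir Folder-A1
echo "one" > Folder-A1/first.md
echo "x" > README.md
git add . && git commit -q -m "initial"
first_run=$(list_changed Folder-A1)    # no parent commit yet: the find fallback is used

echo "two" > Folder-A1/second.md
echo "y" >> README.md
git add . && git commit -q -m "second"
second_run=$(list_changed Folder-A1)   # only paths under Folder-A1 from the last commit
```

The README.md change in the second commit is left out of the result because the diff is limited to the folder path.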

This setup keeps the specified folders in both repositories in sync whenever a change is pushed. Because each workflow pushes with a PAT, its push triggers the other repository’s workflow in turn; that second run finds nothing new to copy back, so the cycle stops after one round trip.
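
A run that finds nothing to copy has nothing to commit, and a plain git commit exits with a non-zero status in that case. Committing only when something is staged lets such a run finish cleanly. The following is a minimal local sketch of that check, assuming git is installed; commit_if_changed is a hypothetical helper name and the repository is a throwaway one.

```shell
# Hypothetical helper: stage a folder and commit only when the index actually differs from HEAD.
commit_if_changed() {
  git add "$1"
  if git diff --cached --quiet; then
    echo "skipped"        # nothing staged, so no commit is attempted
  else
    git commit -q -m "$2"
    echo "committed"
  fi
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"

mkdir Folder-B1
echo "post" > Folder-B1/post.md
first=$(commit_if_changed Folder-B1 "Sync changes")    # new file: a commit is created
second=$(commit_if_changed Folder-B1 "Sync changes")   # no further changes: nothing to commit
```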

Conclusion

Keeping a specific folder identical across repositories is a common need, for example when notes live in one repository and the site that publishes them lives in another. With GitHub Actions the copy happens on every push without manual intervention. The steps outlined in this article are a working starting point for keeping such folders consistent.
