Keeping the sauce secret – Introducing GitLab Watchman and GitHub Watchman

If your organisation does any development, infrastructure management, or anything with code, you will know that some of your most important data and intellectual property sit in Git repositories.

Whether it is down to bad coding practice, mistakes, or oversights, all sorts of confidential data is often stored in repositories by developers (think about the hardcoded credentials accidentally committed early on in the development process).

To help detect this data being exposed in your environment, I’ve created two new additions to the Watchman family that work with two of the biggest Git repository vendors: GitHub Watchman and GitLab Watchman.

Both are available now on GitHub, and detailed instructions on how to use the applications are in the respective GitHub repositories.

How it works

GitLab and GitHub Watchman follow the same formula as Slack Watchman, and make use of all the developments Slack Watchman has undergone since the initial release. They search publicly available repositories in your environment for sensitive data and credentials using YAML rule definitions. The sorts of things they look for are:

  • AWS, GCP, Azure credentials
  • Exposed tokens (bearer, secret etc.)
  • Credentials for cloud storage (S3 etc.)
  • Private keys (SSH, PGP etc.)
  • Plaintext passwords
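
To make the rule idea concrete, here is a minimal Python sketch of how a regex-based rule could flag a credential in a piece of text. The RULE dictionary, its field names and the find_matches helper are hypothetical and are not the rule format GitLab Watchman or GitHub Watchman actually use. The pattern is the widely documented shape of an AWS access key ID (AKIA followed by 16 uppercase letters or digits), and the sample value is AWS's own documentation placeholder.

```python
import re

# Hypothetical rule, loosely modelled on the idea of a YAML rule definition.
RULE = {
    "name": "AWS access key ID",
    "pattern": r"\bAKIA[0-9A-Z]{16}\b",
}


def find_matches(rule, text):
    """Return every substring of text that matches the rule's pattern."""
    return re.findall(rule["pattern"], text)


sample = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = eu-west-1"
matches = find_matches(RULE, sample)
print(matches)
```

A real rule would normally pair a search query (to find candidate files) with a stricter pattern like this one (to cut down false positives).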

The results can be output via your choice of logging option:

  • Stdout
  • File logging
  • TCP stream
  • CSV file

Stdout, File and TCP stream logging output logs in JSON format, perfect for ingesting into a SIEM or any other log analysis platform.
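
As a rough illustration of what ingesting these logs could look like, the sketch below parses JSON log output in Python. It assumes one JSON object per line, which is an assumption rather than something confirmed here, and the field names in the sample ("rule", "url") are made up for the example.

```python
import json


def parse_json_lines(raw):
    """Parse newline-delimited JSON, skipping blank or malformed lines."""
    events = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # Ignore anything that is not valid JSON (for example a partial write).
            continue
    return events


# Made-up sample; the real field names may differ.
sample_log = '{"rule": "AWS keys", "url": "https://gitlab.example.com/a/b"}\nnot json\n'
events = parse_json_lines(sample_log)
print(len(events), events[0]["rule"])
```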

You can run GitLab Watchman to look for results going back as far as:

  • 24 hours
  • 7 days
  • 30 days
  • All time

This means after one deep scan, you can schedule GitLab Watchman to run regularly and only return results from your chosen timeframe.
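
The timeframe idea can be sketched in Python as a cutoff timestamp: anything created before the cutoff is ignored. The labels ("24h", "7d", "30d", "all") and both helper functions are hypothetical and only illustrate the concept, not how GitLab Watchman implements it.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical labels for the four timeframes listed above.
TIMEFRAMES = {
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
    "all": None,  # no lower bound
}


def cutoff_for(timeframe, now=None):
    """Return the earliest timestamp to include, or None for all time."""
    window = TIMEFRAMES[timeframe]
    if window is None:
        return None
    now = now or datetime.now(timezone.utc)
    return now - window


def is_in_window(created_at, timeframe, now=None):
    """True if created_at falls inside the chosen timeframe."""
    cutoff = cutoff_for(timeframe, now)
    return cutoff is None or created_at >= cutoff
```

With a daily schedule, for example, the 24 hour window would return only what has appeared since the previous run.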

.conf file

Configuration options can be passed in a file named watchman.conf, which must be stored in your home directory. The file should be in YAML format, with either github_watchman or gitlab_watchman at the top level, and should look like this:
github_watchman:
  token: abc123
  url: https://github.example.com
  logging:
    file_logging:
      path:
    json_tcp:
      host:
      port:

GitHub Watchman and GitLab Watchman will look for this file at runtime, and use the configuration options from here. If you are not using the advanced logging features, leave them blank.

If you are having issues with your .conf file, run it through a YAML linter.
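
A linter only checks syntax. If you also want to sanity-check the structure, a small Python sketch like the one below may help. It works on the file after it has been parsed into a dictionary (for example by a YAML library), and it assumes you rely on the file for the token and URL rather than environment variables. The check_conf helper is hypothetical and not part of either application.

```python
REQUIRED_KEYS = ("token", "url")
KNOWN_SECTIONS = ("github_watchman", "gitlab_watchman")


def check_conf(conf):
    """Return a list of human-readable problems found in a parsed watchman.conf."""
    problems = []
    sections = [name for name in KNOWN_SECTIONS if name in conf]
    if not sections:
        problems.append("no github_watchman or gitlab_watchman section found")
    for name in sections:
        section = conf[name] or {}  # an empty YAML section parses as None
        for key in REQUIRED_KEYS:
            if not section.get(key):
                problems.append(f"{name}: '{key}' is missing or empty")
    return problems


conf = {
    "github_watchman": {"token": "abc123", "url": "https://github.example.com"},
    "gitlab_watchman": {"token": None, "url": "https://gitlab.example.com"},
}
print(check_conf(conf))
```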

Note: If you use any other Watchman applications and already have a watchman.conf file, just append the conf data for GitHub Watchman or GitLab Watchman to the existing file:

github_watchman:
  token: abc123
  url: https://github.example.com
  logging:
    file_logging:
      path:
    json_tcp:
      host:
      port:
gitlab_watchman:
  token: abc123
  url: https://gitlab.example.com
  logging:
    file_logging:
      path:
    json_tcp:
      host:
      port:

GitLab Watchman

GitLab Watchman is downloadable from GitHub: https://github.com/PaperMtn/gitlab-watchman

Or can be installed via PyPI: pip install gitlab-watchman

Features

It searches GitLab for internally shared projects and looks at:

  • Code
  • Commits
  • Wiki pages
  • Issues
  • Merge requests
  • Milestones

For the following data:

  • GCP keys and service account files
  • AWS keys
  • Azure keys and service account files
  • Google API keys
  • Slack API tokens & webhooks
  • Private keys (SSH, PGP, any other misc private key)
  • Exposed tokens (Bearer tokens, access tokens, client_secret etc.)
  • S3 config files
  • Passwords in plaintext
  • CICD variables exposed publicly
  • and more

Requirements

GitLab versions

GitLab Watchman uses the v4 API, and works with GitLab Enterprise Edition versions:

  • 13.0 and above – Yes
  • GitLab.com – Yes
  • 12.0 – 12.10 – Maybe, untested but if using v4 of the API then it could work

GitLab Licence & Elasticsearch

To search the scopes:

  • blobs
  • wiki_blobs
  • commits

The GitLab instance must have Elasticsearch configured, and be running Enterprise Edition with a minimum GitLab Starter or Bronze Licence.
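
For context, these scopes are values of the scope parameter on the GitLab v4 search endpoint (/api/v4/search). The Python sketch below builds such a request without sending it; the helper names are hypothetical, and this is an illustration of the API call rather than GitLab Watchman's own code.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scopes that, per the text above, need Elasticsearch on the instance.
ADVANCED_SCOPES = {"blobs", "wiki_blobs", "commits"}


def build_search_request(base_url, token, scope, query):
    """Build (but do not send) a GitLab v4 global search request."""
    params = urlencode({"scope": scope, "search": query})
    url = f"{base_url.rstrip('/')}/api/v4/search?{params}"
    # GitLab accepts personal access tokens in the PRIVATE-TOKEN header.
    return Request(url, headers={"PRIVATE-TOKEN": token})


def needs_elasticsearch(scope):
    """True for the scopes that require Elasticsearch to be configured."""
    return scope in ADVANCED_SCOPES


req = build_search_request("https://gitlab.example.com", "abc123", "blobs", "AKIA")
print(req.full_url)
```

Scopes such as issues, merge_requests and milestones are not in the list above, so the sketch treats them as not needing Elasticsearch.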

GitLab personal access token

To run GitLab Watchman, you will need a GitLab personal access token.

You can create a personal access token in the GitLab GUI via Settings -> Access Tokens -> Add a personal access token

The token needs permission for the following scopes:

api

Note: Personal access tokens act on behalf of the user who creates them, so I would suggest you create a token using a service account, otherwise the app will have access to your private repositories.

GitLab URL

You also need to provide the URL of your GitLab instance.

Providing token & URL

GitLab Watchman will first try to get the GitLab token and URL from the environment variables GITLAB_WATCHMAN_TOKEN and GITLAB_WATCHMAN_URL. If this fails, they will be taken from the .conf file.
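
The lookup order described above (environment variable first, .conf file second) can be sketched in Python as follows. The resolve_setting helper is hypothetical, and conf stands for the already parsed contents of watchman.conf. The same pattern applies to GitHub Watchman with the GITHUB_WATCHMAN_ prefix.

```python
import os


def resolve_setting(app, name, conf, environ=None):
    """Look up a setting in the environment first, then in the parsed conf."""
    environ = os.environ if environ is None else environ
    env_key = f"{app.upper()}_WATCHMAN_{name.upper()}"
    value = environ.get(env_key)
    if value:
        return value
    # Fall back to the matching section of the .conf file.
    section = conf.get(f"{app}_watchman") or {}
    value = section.get(name)
    if not value:
        raise KeyError(f"{name} not found in {env_key} or the .conf file")
    return value


conf = {"gitlab_watchman": {"token": "from-conf", "url": "https://gitlab.example.com"}}
env = {"GITLAB_WATCHMAN_TOKEN": "from-env"}
token = resolve_setting("gitlab", "token", conf, env)  # taken from the environment
url = resolve_setting("gitlab", "url", conf, env)      # falls back to the conf
print(token, url)
```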

GitHub Watchman

GitHub Watchman is downloadable from GitHub: https://github.com/PaperMtn/github-watchman

Or can be installed via PyPI: pip install github-watchman

Limitations

The GitHub Search API limits results to 1000, so that is the maximum that will be returned for each search. If you have a small environment you are not likely to hit this limit. If you do have a larger environment, you will want to make sure any custom YAML rules you create have very specific search queries.
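
To show what the cap means in practice, here is a small Python sketch that works out how many pages of results are reachable for a search. It assumes the GitHub REST API's maximum page size of 100; the function names are hypothetical and this is not GitHub Watchman's actual paging code.

```python
import math

MAX_SEARCH_RESULTS = 1000  # cap described above
PER_PAGE = 100  # largest page size the GitHub REST API accepts


def pages_to_fetch(total_count, per_page=PER_PAGE):
    """Return the page numbers worth requesting for a search."""
    reachable = min(total_count, MAX_SEARCH_RESULTS)
    return list(range(1, math.ceil(reachable / per_page) + 1))


def is_truncated(total_count):
    """True when the search matched more results than can be retrieved."""
    return total_count > MAX_SEARCH_RESULTS


print(pages_to_fetch(250), is_truncated(250))
print(len(pages_to_fetch(5000)), is_truncated(5000))
```

If a rule's query comes back truncated, that is a sign the query is too broad and should be made more specific.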

What it looks for

It searches GitHub for internally shared projects and looks at:

  • Code
  • Commits
  • Issues
  • Repositories

For the following data:

  • GCP keys and service account files
  • AWS keys
  • Azure keys and service account files
  • Google API keys
  • Slack API tokens & webhooks
  • Private keys (SSH, PGP, any other misc private key)
  • Exposed tokens (Bearer tokens, access tokens, client_secret etc.)
  • S3 config files
  • Passwords in plaintext
  • and more

Requirements

GitHub versions

GitHub Watchman uses the v3 API, and works with GitHub Enterprise Server versions that support the v3 API.

GitHub Watchman also works with GitHub.com (Free, Pro and Team) using the API.

GitHub personal access token

To run GitHub Watchman, you will need a GitHub personal access token.

You can create a personal access token in the GitHub GUI via Settings -> Developer settings -> Personal access tokens

The token needs no specific scopes assigned, as it searches public repositories in the GitHub instance.

Note: Personal access tokens act on behalf of the user who creates them, so I would suggest you create a token using a service account, otherwise the app will have access to your private repositories.

GitHub URL

You also need to provide the URL of your GitHub instance.

Providing token & URL

GitHub Watchman will first try to get the GitHub token and URL from the environment variables GITHUB_WATCHMAN_TOKEN and GITHUB_WATCHMAN_URL. If this fails, they will be taken from the .conf file.

Future plans

The value of GitLab Watchman and GitHub Watchman lies in the rules that are fed to them, as these dictate what is searched for. The current rules are not exhaustive; there is a lot more data that could be found in Git repositories.

I plan on adding more rules to both applications, and I encourage you to create your own to fulfil your needs. If you do create your own rules, please create a pull request to get them added to their respective repositories, so the community can benefit from them as well.

As always, feel free to raise any issues via GitHub, or ping me directly on Twitter: https://twitter.com/_PaperMtn
