I was tasked with finding out who owned a pile of GitHub repos. Working that out across many repositories wasn’t straightforward. To get there, I queried which people belonged to which team, then which people contributed to which repos. With both pieces in hand, we can work out which teams “own” which repos.
Getting repos to contributors
# Assumes $org is set to the GitHub organization name.
REPOS=$(gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  "/orgs/$org/repos" --paginate | jq -r '.[] | .name' | sort)
for repo in $REPOS; do
  echo "$repo"
  for user in $(gh api \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "/repos/$org/$repo/contributors" --paginate | jq -r '.[] | .login' | sort); do
    # printf emits a real tab portably; echo "\t" is shell-dependent
    printf '\t%s\n' "$user"
  done
  # Brief pause between repos to go easy on the API
  sleep 1
done
Getting team membership
# The members endpoint takes a team slug, so list slugs rather than display names.
TEAMS=$(gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  --paginate \
  "/orgs/$org/teams" | jq -r '.[] | .slug')
for team in $TEAMS; do
  echo "$team"
  for member in $(gh api \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    --paginate \
    "/orgs/$org/teams/$team/members" | jq -r '.[] | .login'); do
    printf '\t%s\n' "$member"
  done
done
Getting the mapping
Save the repo-to-contributor output in repo-to-contributor.txt.
Save the team-to-member output in teams.txt.
List the repositories you want to look at in unmapped-repos.txt (one repo name per line).
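Both outline files share one shape: an unindented line names a repo or team, and the indented lines under it name the people. As a minimal sketch of that format, using invented names (`repo-a`, `alice`, and `parse_outline` are hypothetical, not from any real org or from the script itself), it can be checked like this:

```python
# Hypothetical sample of the outline format; every name here is invented.
sample = "repo-a\n\talice\n\tbob\nrepo-b\n\talice\n"


def parse_outline(text):
    """Group indented lines under the most recent unindented line."""
    groups = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace():
            # Indented line: a person under the current header
            if current is not None:
                groups[current].append(line.strip())
        else:
            # Unindented line: a new repo or team header
            current = line.strip()
            groups[current] = []
    return groups


parsed = parse_outline(sample)
print(parsed)
```

If a file does not parse into this shape, the likeliest culprit is the indentation character, since the shell's `echo` may or may not turn `\t` into a real tab.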
from collections import Counter, defaultdict
from pprint import pprint

with open('unmapped-repos.txt') as f:
    unmapped = set(f.read().split())

with open('repo-to-contributor.txt') as f:
    raw = f.read()


def map_from_outline(s):
    """Parse an outline: unindented lines are keys, indented lines are members."""
    _mapping = defaultdict(list)
    current = None
    for line in s.split('\n'):
        if line.strip() == '':
            continue
        if line[0].isspace():
            # person
            _mapping[current].append(line.strip())
        else:
            current = line.strip()
    return _mapping


mapping = map_from_outline(raw)

# How many repos each person has contributed to
all_users = Counter()
for repo, people in mapping.items():
    all_users.update(people)

with open('teams.txt') as f:
    team_setup = map_from_outline(f.read())

# Invert team -> members into person -> teams
people_teams = defaultdict(list)
for team, members in team_setup.items():
    for member in members:
        people_teams[member].append(team)

# People who contributed to more than one repo, most active first.
# Not used below, but handy for inspection.
frequent_contributors = sorted(
    ((person, n) for person, n in all_users.items() if n > 1),
    key=lambda x: -x[1],
)

# For each unmapped repo, count how many contributors belong to each team
guess = {}
for repo, people in mapping.items():
    if repo not in unmapped:
        continue
    cnt = Counter()
    for p in people:
        cnt.update(people_teams.get(p, []))
    guess[repo] = dict(cnt.most_common(2))

pprint(guess)
print("\n\n")
print("No guesses for ownership:")
for k, v in guess.items():
    if len(v) == 0:
        print(f"\t{k}")
print("\n\n")
print("Multiple guesses for ownership:")
for k, v in guess.items():
    if len(v) > 1:
        print(f"\t{k}: {v}")
print("Folks who should own it:")
for k, v in guess.items():
    if len(v) == 1:
        print(f"\t{k}: {v}")
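To make the voting step concrete, here is a small self-contained sketch of the same idea on made-up data. The function name `guess_owners`, the `top_n` parameter, and all people and team names are hypothetical and only for illustration; the counting itself mirrors what the script does with `Counter.most_common`.

```python
from collections import Counter


def guess_owners(contributors, people_teams, top_n=2):
    """Rank teams by how many of a repo's contributors belong to them."""
    votes = Counter()
    for person in contributors:
        # Each team a contributor belongs to gets one vote
        votes.update(people_teams.get(person, []))
    return dict(votes.most_common(top_n))


# Hypothetical person -> teams mapping
people_teams = {
    "alice": ["platform"],
    "bob": ["platform", "data"],
    "carol": ["data"],
}

result = guess_owners(["alice", "bob"], people_teams)
no_match = guess_owners(["dave"], people_teams)
print(result)
print(no_match)
```

Because the top two teams are kept regardless of margin, a repo where one team clearly leads (two votes to one here) still lands under "Multiple guesses". Comparing the counts by hand is the final step.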