I’d like to get the output of git blame <file> for every file in a repository, recursively.
I want to do this without cloning the repository first, by using GitHub’s GraphQL v4 API.
Is this possible?
I’ve managed to get a list of files via this query:
query {
  repository(owner: "some owner", name: "some repository") {
    object(expression: "HEAD:") {
      ... on Tree {
        entries {
          name
          type
          mode
          object {
            ... on Blob {
              byteSize
              text
              isBinary
            }
          }
        }
      }
    }
  }
}
as well as a single file’s git blame via this query:
query {
  repositoryOwner(login: "some owner") {
    repository(name: "some repo") {
      object(expression: "some branch") {
        ... on Commit {
          blame(path: "some/file/path.js") {
            ranges {
              startingLine
              endingLine
              age
              commit {
                oid
                author {
                  name
                }
              }
            }
          }
        }
      }
    }
  }
}
Is it possible to combine these queries into one?
If not, it probably makes sense to clone the repo first and run git blame recursively locally, right?
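For comparison, the local fallback mentioned above could be sketched roughly as follows. It assumes the repository has already been cloned to a directory (`repo_dir` is a hypothetical parameter) and that `git` is on the PATH; `parse_line_porcelain` and `blame_all` are illustrative names, not part of any library. It relies on `git ls-files -z` to list tracked files and `git blame --line-porcelain` to get a full header for every line.

```python
import re
import subprocess

# A porcelain header line looks like: "<sha> <orig-line> <final-line> [<count>]"
HEADER = re.compile(r"^([0-9a-f]{40,64}) \d+ (\d+)")


def parse_line_porcelain(text):
    """Yield (final_line_number, commit_sha, author) for each blamed line."""
    sha = lineno = author = None
    for line in text.split("\n"):
        if line.startswith("\t"):
            # A tab-prefixed line is the file content and closes one record.
            yield lineno, sha, author
        elif (match := HEADER.match(line)):
            sha, lineno = match.group(1), int(match.group(2))
        elif line.startswith("author "):
            author = line[len("author "):]


def blame_all(repo_dir):
    """Yield (path, blame_records) for every tracked file in a local clone."""
    listing = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    ).stdout
    for path in filter(None, listing.split("\0")):
        blame = subprocess.run(
            ["git", "-C", repo_dir, "blame", "--line-porcelain", "--", path],
            capture_output=True, encoding="utf-8", errors="replace", check=True,
        ).stdout
        yield path, list(parse_line_porcelain(blame))
```

Calling `dict(blame_all("path/to/clone"))` would then give one list of per-line records for each tracked file.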
1 Answer
I managed to do it like this in my own project:
https://github.com/ricardobranco777/bugme/blob/master/gitblame.py
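import requests
access_token = "ghp_xxx"  # Replace with your personal access token
owner = "ricardobranco777"  # Replace with your desired repository owner
repo = "bugme"  # Replace with your desired repository name
branch = "master"  # Replace with your desired branch name
file = "README.md"  # Replace with your desired file path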
query = """
query($owner: String!, $repositoryName: String!, $branchName: String!, $filePath: String!) {
  repositoryOwner(login: $owner) {
    repository(name: $repositoryName) {
      object(expression: $branchName) {
        ... on Commit {
          blame(path: $filePath) {
            ranges {
              startingLine
              endingLine
              commit {
                # id
                # commitUrl
                # commitResourcePath
                committedDate
                # message
                # messageBody
                # url
                # authoredDate
                oid
                author {
                  name
                  email
                }
              }
            }
          }
        }
      }
    }
  }
}
"""
# Define query variables, reusing the settings defined at the top of the script
variables = {
    "owner": owner,
    "repositoryName": repo,
    "branchName": branch,
    "filePath": file,
}
# Set up the headers with your access token
headers = {
    "Authorization": f"Bearer {access_token}",
}
# Define the API endpoint
url = "https://api.github.com/graphql"
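# Make the GraphQL request
response = requests.post(url, headers=headers, json={"query": query, "variables": variables}, timeout=30)
# Fail early on HTTP-level errors such as a bad token
response.raise_for_status()
# Parse the JSON response (GraphQL errors are reported under an "errors" key, even with HTTP 200)
data = response.json()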
from pprint import pprint
pprint(data)
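To cover more than one file per request, the same blame field can be repeated under GraphQL aliases. The following is only a sketch under some assumptions: the file paths are already known (for example, collected from the tree query in the question), the alias names `f0`, `f1`, ... and the variable names `$p0`, `$p1`, ... are made up for illustration, and the helpers `build_blame_query` and `blame_by_path` are hypothetical. Large batches may run into GitHub's rate or timeout limits, so the batch size is something to tune.

```python
def build_blame_query(paths):
    """Build one query that blames every path, using one alias per file.

    Returns (query, path_variables); the caller adds owner, name and expr.
    """
    var_defs = ["$owner: String!", "$name: String!", "$expr: String!"]
    fields = []
    path_vars = {}
    for i, path in enumerate(paths):
        # Each path is passed as its own variable, so no manual escaping is needed.
        var_defs.append(f"$p{i}: String!")
        fields.append(
            f"f{i}: blame(path: $p{i}) "
            "{ ranges { startingLine endingLine "
            "commit { oid committedDate author { name email } } } }"
        )
        path_vars[f"p{i}"] = path
    signature = ", ".join(var_defs)
    body = "".join(f"        {field}\n" for field in fields)
    query = (
        f"query({signature}) {{\n"
        "  repository(owner: $owner, name: $name) {\n"
        "    object(expression: $expr) {\n"
        "      ... on Commit {\n"
        f"{body}"
        "      }\n"
        "    }\n"
        "  }\n"
        "}\n"
    )
    return query, path_vars


def blame_by_path(data, paths):
    """Map each path back to its blame ranges in the parsed JSON response."""
    commit = data["data"]["repository"]["object"]
    return {path: commit[f"f{i}"]["ranges"] for i, path in enumerate(paths)}
```

The query and variables can then be posted the same way as above, for example with `variables = {"owner": owner, "name": repo, "expr": branch, **path_vars}`, and the response passed to `blame_by_path(data, paths)`.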