Remove reference to parentHashes

Turns out they're not actually needed, because the implementation of
"finding all commits" just iterates over the lines of the `git log`
output anyway - no need to actually traverse the commit tree.

I am _reasonably_ sure this means that linear history in the source
repo is no longer a prerequisite, either. If I discover while adding
tests that it actually still is, I'll re-add that note.
Commit Report Sync Bot 2025-03-02 21:49:11 -08:00
parent 29e8f9a2a9
commit 5f6018c6cd
10 changed files with 6 additions and 15 deletions

@@ -10,9 +10,6 @@ This is the code for a [Gitea Action](https://docs.gitea.com/usage/actions/overv
 * **Repos must use `main` as the default branch.** Come on, it's been like 5 years.
 * ...ok _fine_ I will consider adding customization for this. But not a priority to begin with.
-* **Source and Target repos' history on `main` must be linear** - no merge commits
-* Personally, I believe that they should always be like that, but I do respect the fact that other people disagree. Since I predict that the number of people using this can be counted on one hand, though, I don't respect that fact _enough_ to implement it right-off-the-bat - YAGNI (and if you _DO_ "NI", please feel free to tell me and I'll take a look at whether I can actually do it)
-* To be clear - the action should still be able to _operate_ on histories with merge commits (though, if you're still reading that as "_should be able to_" and not "_can_, I can't promise that I've actually implemented and tested it yet), it just won't sync any commits that aren't in the transitive-first-parent chain[^difference-with-parents].
 * **No two source repos can have commits that happened at the exact same time.** It would be technically possible to implement this[^can-you-implement-it], but it would be an _arse_: finding "_the first commit in RepoB earlier than or equal to commit-X in RepoA_" is a query that, _under this constraint only_, can then be used to immediately determine "_does this match the resultant description required in RepoB? If not, we write-in; if so, we **assume that this source commit has already been represented**, and skip it in_", whereas if two commits can have identically-timed commits, it can be the case that in fact we find a target commit ("_latest before the given source commit_") that doesn't match the source commit, even though a matching target commit could exist (at the same time, but earlier by `git log` sub-ordering), resulting in a loop where two repos would keep inserting themselves after the others' equivalent-time representation.
 * ...I think a diagram would help here if I had to explain it to someone :
@@ -56,5 +53,4 @@ I gotta assume that that's enough for GitHub history rewrite to be legal, if [th
 [^could-create]: I guess another feature request could be - if the target repo doesn't exist, create it! I'm going to rephrase my common refrain here to IAGNI - **I* ain't gonna need it. But if you want it for some reason (idk why you would be programmatically creating target repos unless some serious scale is going on...and I kinda wanna calculate that now...Hmm, TODO), ask for it and it shouldn't be hard for me to add it.
 [^no-network]: Note that no network resources or API calls are used in this quadratic period, so you don't run a risk of overloading anything (including your wallet). These potentially-costly operations are linear _in number_ in repo size - two pulls, one push - though note of course that pulling a larger repo takes more literal throughput. But I'm not aware of any provider that has significant limitations on that? Because, y'know, monorepos exist at large companies and are (one assumes?) regularly pulled (at least once every time a developer onboards), and you are not gonna get near their sizes unless you're a company (and if you are, pay me to implement that feature :P )
-[^difference-with-parents]: Difference in implementation would be to make sure to split `parentHashes` on `,` and pick the first, I guess. But I CBA to create a source repo to test it on. If I go to _deliberately_ implement that, rather than it maybe happening by-chace, I'll write a test for it.
 [^can-you-implement-it]: Actually I don't know if it _would_ be possible to implement this - I don't know if `git` would even allow two commits at the same time in a target repo. Gotta assume it does given how many folks love to trumpet the fact that they have monorepos - at the scale of possible-committers they're at (especially considering automated tools making updates) there's _gotta_ be potential for a time-collision - just not _earlier than_ parent seems like a reasonable constraint. Might be fun to write a test for "child-younger-than-parent", but I think any such repo is f$^*-ed up enough to justify not supporting it. But, ok then, due to that potential, I guess _technically_ another prerequisite is actually **No source repo may have a parent that happened later than a child**. If you had to read this far into the small print to figure this out when you encountered an error, then, frankly, you brought this on yourself and I hope you are feeling suitably embarassed 😜
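
The note removed in this hunk talked about syncing only the "transitive-first-parent chain". For anyone who does end up needing that, here is a rough sketch of what such a walk could look like. Everything in it (`CommitWithParents`, `firstParent`, `firstParentChain`, the toy hashes) is hypothetical and not part of this action. It also assumes that the `%p` placeholder separates parent hashes with spaces rather than commas, which is how `git log` prints them as far as I know.

```typescript
// Hypothetical shape: a commit hash plus the raw `%p` string from `git log`.
type CommitWithParents = {
  hash: string;
  // Assumed format: "abc1234 def5678" for a merge commit, "" for a root commit.
  parentHashes: string;
};

// Pick the first parent from a `%p`-style string, or undefined for a root commit.
function firstParent(parentHashes: string): string | undefined {
  const parents = parentHashes.split(' ').filter((p) => p.length > 0);
  return parents[0];
}

// Walk from `head` along first parents only, returning hashes newest-first.
function firstParentChain(commits: CommitWithParents[], head: string): string[] {
  const byHash = new Map<string, CommitWithParents>();
  for (const c of commits) {
    byHash.set(c.hash, c);
  }
  const chain: string[] = [];
  let current: string | undefined = head;
  while (current !== undefined) {
    const commit = byHash.get(current);
    if (commit === undefined) {
      break; // the parent lies outside the window of commits we loaded
    }
    chain.push(commit.hash);
    current = firstParent(commit.parentHashes);
  }
  return chain;
}

// Toy history: `m` merges `a` (first parent) with side-branch commit `x`;
// both `a` and `x` descend from the root commit `r`.
const toyHistory: CommitWithParents[] = [
  { hash: 'm', parentHashes: 'a x' },
  { hash: 'x', parentHashes: 'r' },
  { hash: 'a', parentHashes: 'r' },
  { hash: 'r', parentHashes: '' },
];
const chain = firstParentChain(toyHistory, 'm');
console.log(chain.join(' -> '));
```

In practice, passing `--first-parent` to `git log` would probably make a hand-rolled walk like this unnecessary, since git then follows only the first parent of each merge commit itself.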

@@ -1,6 +1,5 @@
 - [ ] Make example of templatized workflow, that uses `github` context variables to fetch source repository owner and
 name, and that fetches secret from Vault
-- [ ] Remove `parentHashes`, never ended up being needed
 - [ ] Add a link to the original commit from the body of the file that's created in the target repo, and/or in the
 commit body.
 - [ ] Allow passing a `limit` variable to control how many source commits to read

dist/index.js (vendored, 7 changes)

@@ -25861,8 +25861,8 @@ function buildSourceCommitHistory(path, numCommits) {
 const logOutput = (0, child_process_1.execSync)(
 // If you want to copy this formatting for debugging, it's:
 //
-// --pretty=format:'{"hash":"%h","author_name":"%an","author_email":"%ae","date":"%ai","message":"%s","parentHashes":"%p"}'
-`git log --max-count=${numCommits} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\",\"parentHashes\":\"%p\"}'`, { cwd: path });
+// --pretty=format:'{"hash":"%h","author_name":"%an","author_email":"%ae","date":"%ai","message":"%s"}'
+`git log --max-count=${numCommits} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\"}'`, { cwd: path });
 const logLines = logOutput.toString().split('\n');
 for (const line of logLines) {
 const commit = parseCommit(path, line);
@@ -25877,7 +25877,7 @@ function buildTargetCommitHistory(path, oldestDateInSourceCommitHistory) {
 const countingLogOutput = (0, child_process_1.execSync)(`git log --since=${oldestDateInSourceCommitHistory.toISOString()} --pretty=oneline`, { cwd: path });
 const countedNumber = countingLogOutput.toString().split('\n').length;
 console.log(`DEBUG - countedNumber (how many commits in target repo since oldest source commit) is: ${countedNumber}`);
-const logOutput = (0, child_process_1.execSync)(`git log --max-count=${countedNumber + 1} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\",\"parentHashes\":\"%p\"}'`, { cwd: path });
+const logOutput = (0, child_process_1.execSync)(`git log --max-count=${countedNumber + 1} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\"}'`, { cwd: path });
 const logLines = logOutput.toString().split('\n');
 for (const line of logLines) {
 const commit = parseCommit(path, line);
@@ -25914,7 +25914,6 @@ function parseCommit(repo_path, line) {
 repo_path: repo_path,
 date: new Date(parsed['date']),
 message: parsed['message'],
-parentHashes: parsed['parentHashes'],
 };
 }
 function insertRepresentativeCommit(sourceRepo, sourceCommit, targetCommit, followOnTargetCommit) {

Binary files not shown (3 files).

@@ -136,8 +136,8 @@ export function buildSourceCommitHistory(path: string, numCommits: number): Comm
 const logOutput = execSync(
 // If you want to copy this formatting for debugging, it's:
 //
-// --pretty=format:'{"hash":"%h","author_name":"%an","author_email":"%ae","date":"%ai","message":"%s","parentHashes":"%p"}'
-`git log --max-count=${numCommits} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\",\"parentHashes\":\"%p\"}'`,
+// --pretty=format:'{"hash":"%h","author_name":"%an","author_email":"%ae","date":"%ai","message":"%s"}'
+`git log --max-count=${numCommits} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\"}'`,
 { cwd: path }
 );
 const logLines = logOutput.toString().split('\n');
@@ -161,7 +161,7 @@ export function buildTargetCommitHistory(path: string, oldestDateInSourceCommitH
 const countedNumber = countingLogOutput.toString().split('\n').length;
 console.log(`DEBUG - countedNumber (how many commits in target repo since oldest source commit) is: ${countedNumber}`);
 const logOutput = execSync(
-`git log --max-count=${countedNumber+1} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\",\"parentHashes\":\"%p\"}'`,
+`git log --max-count=${countedNumber+1} --pretty=format:'{\"hash\":\"%h\",\"author_name\":\"%an\",\"author_email\":\"%ae\",\"date\":\"%ai\",\"message\":\"%s\"}'`,
 { cwd: path }
 );
 const logLines = logOutput.toString().split('\n');
@@ -200,7 +200,6 @@ function parseCommit(repo_path: string, line: string): Commit {
 repo_path: repo_path,
 date: new Date(parsed['date']),
 message: parsed['message'],
-parentHashes: parsed['parentHashes'],
 }
 }

@@ -16,6 +16,4 @@ export type Commit = {
 repo_path: string;
 date: Date;
 message: string;
-// TODO - turns out we don't actually need parentHashes anyway
-parentHashes: string[];
 }
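
For readers skimming the diff, here is a minimal sketch of the pattern the commit message describes: the history is built by iterating over the lines of `git log` output and parsing each one, with no parent traversal involved. The `parseLogOutput` helper, the trimmed-down `SketchCommit` type, and the sample hashes and addresses are assumptions made up for illustration, not the action's actual code. The sample string stands in for what `execSync` would return, so the snippet does not need a repository to run.

```typescript
// Hypothetical, simplified stand-in for the action's Commit type (no parentHashes).
type SketchCommit = {
  hash: string;
  author_name: string;
  author_email: string;
  repo_path: string;
  date: Date;
  message: string;
};

// Parse output shaped like
//   git log --pretty=format:'{"hash":"%h","author_name":"%an",...,"message":"%s"}'
// where every non-empty line is one JSON object describing one commit.
function parseLogOutput(repoPath: string, logOutput: string): SketchCommit[] {
  const commits: SketchCommit[] = [];
  for (const line of logOutput.split('\n')) {
    if (line.trim() === '') {
      continue; // skip blank lines, e.g. from empty output
    }
    const parsed = JSON.parse(line);
    commits.push({
      hash: parsed['hash'],
      author_name: parsed['author_name'],
      author_email: parsed['author_email'],
      repo_path: repoPath,
      date: new Date(parsed['date']),
      message: parsed['message'],
    });
  }
  return commits;
}

// Made-up sample output, newest commit first, as `git log` would print it.
const sampleLog = [
  '{"hash":"aaa1111","author_name":"Example Author","author_email":"author@example.com","date":"2025-03-02 21:49:11 -0800","message":"Second commit"}',
  '{"hash":"bbb2222","author_name":"Example Author","author_email":"author@example.com","date":"2025-03-01 10:00:00 -0800","message":"First commit"}',
].join('\n');

const sketchCommits = parseLogOutput('/tmp/example-repo', sampleLog);
console.log(sketchCommits.map((c) => `${c.hash} ${c.message}`).join('\n'));
```

The order of the resulting array is simply the order of the log lines, which is why nothing in this shape of code needs to know about parents.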