Discussion:
[edk2] Proposal of Git Repo Layout for EDKII project
Gao, Liming
2015-06-03 09:50:27 UTC
Hi, all
Now, the EDKII project Git mirror is ready on GitHub (https://github.com/tianocore). There is an EDKII project repo and a repo for each package. After EDKII migrates from SVN to GitHub, the EDKII Git repo will be writable and the EDKII SVN project will become a mirror. I expect to keep write access in the centralized Git repo. But the EDKII project repo (edk2) and the MdePkg repo (edk2-MdePkg) include the same source code, so only one of them can be writable. My proposal is to make the package repos read-write, and to update the EDKII project to link each package as a submodule. The benefits of this approach are:

1. The EDKII project is too big. After separating the packages, developers can pull just the packages they use instead of the full tree.

2. Different packages have different owners. After separating them, each package owner can grant write access to different developers.

3. Closed source projects can refer to EDKII packages. Those projects can be easily set up with git submodules.

Compared to the current EDKII project repo, the submodule-based EDKII project repo includes just edksetup.bat and edksetup.sh. Some BKMs for submodules are shared here.

1. Every Git operation is performed on a package repo. Pull, Branch, Commit, Create Patch, Fork, and Pull Request all apply to the package repo. If your patch changes multiple packages, you need to commit and create a patch per package.

2. git submodule foreach "command" can be used to run a command on every package, for example git submodule foreach "git pull"
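
The foreach usage described above can be tried out in a throwaway directory. What follows is only a minimal sketch: the two package repos are local stand-ins created on the spot (not the real tianocore repos), and the `protocol.file.allow` setting is an assumption needed only because recent Git releases restrict local-path submodules by default. Note that a submodule is normally left on a detached HEAD after checkout, so a plain "git pull" inside foreach may first need a branch to be checked out in each package.

```shell
#!/bin/sh
# Sketch: run one command in every package submodule of a superproject.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two stand-in package repos (hypothetical content).
for pkg in MdePkg OvmfPkg; do
    git init -q "$pkg"
    echo "$pkg" > "$pkg/README"
    git -C "$pkg" add README
    git -C "$pkg" commit -q -m "$pkg: initial commit"
done

# Superproject that links each package as a submodule.
git init -q edk2
cd edk2
for pkg in MdePkg OvmfPkg; do
    git -c protocol.file.allow=always submodule --quiet add "$work/$pkg" "$pkg"
done
git commit -q -m "Link packages as submodules"

# git sets $name for each submodule; the command runs inside its directory.
visited=$(git submodule --quiet foreach 'echo "$name"')
echo "$visited"
```

Any command string can replace the `echo`, for example 'git status --short' or 'git log -1 --oneline'.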

Thanks
Liming
Laszlo Ersek
2015-06-03 11:14:16 UTC
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
1. EDKII project is too big. After separate them, the developers
can just pull their used packages instead of full.
The Linux kernel is arguably bigger, and pulling it all poses no problem.

I'm not very familiar with git submodules, but from a quick skim of
git-submodule(1), I have to ask the following questions:

- Assuming I have a longer patchset that modifies, let's say, 4 top
level packages, can I commit all those patches in my clone? And can I
push the full series to my fork on github? Because, the manual page
says, "you cannot modify the contents of the submodule from within the
main project".

- Assume I'd like to advise a user to build OVMF at a "known good state"
of the edk2 tree. Right now I can do this by naming a single SVN
revision (or git commit hash, preferably); that implies the *full* state
of the edk2 tree, including all non-OvmfPkg modules that OvmfPkg pulls
into the build. Wouldn't submodules prevent this? I'd like to avoid a
mixture of submodule versions after a checkout.

The git-submodule(1) manual mentions the "subtree merge strategy", but I
don't know what that is.
Post by Gao, Liming
2. The different packages have the different owners. After
separate them, the package owner can give write access for the different
developers.
What matters is patch review / acceptance from the package owner. I have
committed patches to non-OvmfPkg modules after getting R-b tags from the
respective modules' owners. I would not have been able to do this if I
had needed separate write access to each top level directory.

On the other hand, there have been a few cases when people committed to
OvmfPkg without our review. :(

So, for me this is a question of track record and trust, not enforcement
by technical means. I think controlling write access on the package
level would do more harm than good.
Post by Gao, Liming
3. Close source project can refer to EDKII packages. Those project
can be easily setup by git submodule.
Okay, *now* I understand the motivation for this.

For me this point is neither negative nor positive; I'm neutral. But
points 1 and 2, probably in the service of point 3, *are* negative for me.
Post by Gao, Liming
Compared to EDKII project Repo, submodule EDKII project Repo just
includes edksetup.bat, and edksetup.sh. Some BKM of submodule is shared
here.
What is BKM?
Post by Gao, Liming
1. Every Git operation is took for Package Repo. Pull, Branch,
Commit, Create Patch, Fork, and Pull Request are all for Package Repo.
If your patch changes multiple packages, you need to commit and create
patch per Package.
We do that already, just for review's sake. However, there have been a
few (very few) patches that had to straddle packages. What happens for
example if you move a type definition from IntelFrameworkPkg to MdePkg,
due to advances in the UEFI specification? I'm not saying this is
impossible to solve with careful patches, but I'm very concerned that
this (especially in combination with the submodules) will break
bisectability.
Post by Gao, Liming
2. git submodule foreach “command” can be used to run command on
every package, for example git submodule foreach "git pull"
In my (limited) experience, git submodule is there for independent
(independently developed) packages. For example, a low level library can
be a git submodule in a main git repo for an application (that is the
client of the library).

I think there are many more inter-dependencies in edk2 than that. Edk2
development does not occur *only* along protocol, PPI, and library class
boundaries. Theoretically that might be possible, but it would require
*extreme* discipline in development (very focused patches, all
developers building all series at all stages before submitting, and so on).

So, as long as my "vote" counts, I vote against this proposal. I can't
see any benefits, and I can see a whole bunch of risks. (Obviously I'm
open to being educated about git-submodule.)

Jordan, what is your opinion? (Note I'm not asking you to agree with me.)

Thanks
Laszlo

------------------------------------------------------------------------------
Paolo Bonzini
2015-06-03 11:34:35 UTC
Post by Laszlo Ersek
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
1. EDKII project is too big. After separate them, the developers
can just pull their used packages instead of full.
The Linux kernel is arguably bigger, and pulling it all poses no problem.
I'm not very familiar with git submodules, but from a quick skim of
- Assuming I have a longer patchset that modifies, let's say, 4 top
level packages, can I commit all those patches in my clone? And can I
push the full series to my fork on github? Because, the manual page
says, "you cannot modify the contents of the submodule from within the
main project".
You would have N clones, one per subpackage. Each patchset would be
split into multiple series, one per subpackage, each fully bisectable.
You'd commit and push each patchset separately. Once you're done, one
final patch would atomically update all the subpackages at once.
Post by Laszlo Ersek
- Assume I'd like to advise a user to build OVMF at a "known good state"
of the edk2 tree. Right now I can do this by naming a single SVN
revision (or git commit hash, preferably); that implies the *full* state
of the edk2 tree, including all non-OvmfPkg modules that OvmfPkg pulls
into the build. Wouldn't submodules prevent this? I'd like to avoid a
mixture of submodule versions after a checkout.
No, submodules can do this. A "master commit hash" includes a list of
subpackage commit hashes.

You'd have to remind the contributor to run "git submodule update" after
checking out the master commit hash.
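
A small experiment can illustrate the point that one superproject commit pins every submodule. This is a sketch under assumptions: the package repo is a local stand-in, the names `good`, `pinned`, `before` and `after` are purely illustrative, and `protocol.file.allow` is set only so that a local path works as a submodule URL on recent Git releases.

```shell
#!/bin/sh
# Sketch: a superproject commit records one exact commit per submodule.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in package repo with a first version.
git init -q MdePkg
echo v1 > MdePkg/file
git -C MdePkg add file
git -C MdePkg commit -q -m "MdePkg: v1"

# Superproject pins the package at v1.
git init -q edk2
cd edk2
git -c protocol.file.allow=always submodule --quiet add "$work/MdePkg" MdePkg
git commit -q -m "Pin MdePkg at v1"
good=$(git rev-parse HEAD)                           # "known good" commit
pinned=$(git ls-tree HEAD MdePkg | awk '{print $3}') # package hash it records

# The package moves on, and the superproject records the new hash.
echo v2 > MdePkg/file
git -C MdePkg commit -q -am "MdePkg: v2"
git add MdePkg
git commit -q -m "Update MdePkg to v2"

# Checking out the known good commit does not move the submodule by itself.
git checkout -q "$good"
before=$(git -C MdePkg rev-parse HEAD)

# "git submodule update" brings the submodule to the recorded hash.
git submodule --quiet update
after=$(git -C MdePkg rev-parse HEAD)

echo "pinned=$pinned"
echo "before=$before"
echo "after=$after"
```

In this sketch `before` still points at the v2 commit, which is the "mixture of versions" risk if the update step is forgotten.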
Post by Laszlo Ersek
Post by Gao, Liming
1. Every Git operation is took for Package Repo. Pull, Branch,
Commit, Create Patch, Fork, and Pull Request are all for Package Repo.
If your patch changes multiple packages, you need to commit and create
patch per Package.
We do that already, just for review's sake. However, there have been a
few (very few) patches that had to straddle packages. What happens for
example if you move a type definition from IntelFrameworkPkg to MdePkg,
due to advances in the UEFI specification?
You would remove it from IntelFrameworkPkg, add it to MdePkg, and commit
a single atomic change for both to the master repository.
Post by Laszlo Ersek
I'm not saying this is
impossible to solve with careful patches, but I'm very concerned that
this (especially in combination with the submodules) will break
bisectability.
Bisectability would be extremely painful, because bisection on the
master repository would leave you at the single huge commit where you
atomically update all subpackages. You would have no clue of how to
bisect _within_ that atomic update; in fact, in some cases you cannot.
Post by Laszlo Ersek
Post by Gao, Liming
2. git submodule foreach “command” can be used to run command on
every package, for example git submodule foreach "git pull"
This is not enough. A diff between two consecutive commits in the
master repository would just say something like

--- a/IntelFrameworkPkg
+++ b/IntelFrameworkPkg
-Submodule hash 0123456789
+Submodule hash abcdef0123
--- a/OvmfPkg
+++ b/OvmfPkg
-Submodule hash 456789abcd
+Submodule hash ef01234567

and so on. It wouldn't give a clue of how the source changed in the
packages. You can change it to log the added commits ("git diff
--submodule=log" or "git config diff.submodule log"), but not to show
the diffs.
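
The difference between the default output and the log format can be reproduced with a stand-in repo. This is a minimal sketch, with a local package repo invented for the purpose; the exact wording Git prints ("Subproject commit ...") differs slightly from the illustration above. Git releases newer than this thread also accept `--submodule=diff`, which is not exercised here.

```shell
#!/bin/sh
# Sketch: compare the short and log formats for a submodule update.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in package repo.
git init -q OvmfPkg
echo v1 > OvmfPkg/file
git -C OvmfPkg add file
git -C OvmfPkg commit -q -m "OvmfPkg: v1"

# Superproject linking the package.
git init -q edk2
cd edk2
git -c protocol.file.allow=always submodule --quiet add "$work/OvmfPkg" OvmfPkg
git commit -q -m "Link OvmfPkg"

# One new package commit, then record it in the superproject.
echo v2 > OvmfPkg/file
git -C OvmfPkg commit -q -am "OvmfPkg: v2"
git add OvmfPkg
git commit -q -m "Update OvmfPkg"

# Short format: only the old and new hashes.
plain=$(git diff --submodule=short HEAD~1 HEAD)
# Log format: the subjects of the commits that were added.
log=$(git diff --submodule=log HEAD~1 HEAD)

echo "$plain"
echo "----"
echo "$log"
```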

In short, it would be extremely painful.

Paolo

------------------------------------------------------------------------------
Laszlo Ersek
2015-06-03 11:57:52 UTC
Post by Paolo Bonzini
Post by Laszlo Ersek
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
1. EDKII project is too big. After separate them, the developers
can just pull their used packages instead of full.
The Linux kernel is arguably bigger, and pulling it all poses no problem.
I'm not very familiar with git submodules, but from a quick skim of
- Assuming I have a longer patchset that modifies, let's say, 4 top
level packages, can I commit all those patches in my clone? And can I
push the full series to my fork on github? Because, the manual page
says, "you cannot modify the contents of the submodule from within the
main project".
You would have N clones, one per subpackage. Each patchset would be
split into multiple series, one per subpackage, each fully bisectable.
You'd commit and push each patchset separately. Once you're done, one
final patch would atomically update all the subpackages at once.
Post by Laszlo Ersek
- Assume I'd like to advise a user to build OVMF at a "known good state"
of the edk2 tree. Right now I can do this by naming a single SVN
revision (or git commit hash, preferably); that implies the *full* state
of the edk2 tree, including all non-OvmfPkg modules that OvmfPkg pulls
into the build. Wouldn't submodules prevent this? I'd like to avoid a
mixture of submodule versions after a checkout.
No, submodules can do this. A "master commit hash" includes a list of
subpackage commit hashes.
You'd have to remind the contributor to run "git submodule update" after
checking out the master commit hash.
Post by Laszlo Ersek
Post by Gao, Liming
1. Every Git operation is took for Package Repo. Pull, Branch,
Commit, Create Patch, Fork, and Pull Request are all for Package Repo.
If your patch changes multiple packages, you need to commit and create
patch per Package.
We do that already, just for review's sake. However, there have been a
few (very few) patches that had to straddle packages. What happens for
example if you move a type definition from IntelFrameworkPkg to MdePkg,
due to advances in the UEFI specification?
You would remove it from IntelFrameworkPkg, add it to MdePkg, and commit
a single atomic change for both to the master repository.
Post by Laszlo Ersek
I'm not saying this is
impossible to solve with careful patches, but I'm very concerned that
this (especially in combination with the submodules) will break
bisectability.
Bisectability would be extremely painful, because bisection on the
master repository would leave you at the single huge commit where you
atomically update all subpackages. You would have no clue of how to
bisect _within_ that atomic update, in fact in some case you cannot.
Thanks for the education.

Although bisectability has not been treated as a primary goal across all
of edk2 (unfortunately!), in OvmfPkg we always consider it a first class
goal -- that's how one finds bugs and supports users --, and the project
in general should move towards bisectability (and more focused, fine
grained patches), not away from them.

In fact the scenario you described is the original BaseTools situation
all over. Frequently when a BaseTools sync happened, stuff would break,
and users would be left with an unbisectable, multi-KLOC BaseTools patch
to eyeball.
Post by Paolo Bonzini
Post by Laszlo Ersek
Post by Gao, Liming
2. git submodule foreach “command” can be used to run command on
every package, for example git submodule foreach "git pull"
This is not enough. A diff between two consecutive commits in the
master repository would just say something like
--- a/IntelFrameworkPkg
+++ b/IntelFrameworkPkg
-Submodule hash 0123456789
+Submodule hash abcdef0123
--- a/OvmfPkg
+++ b/OvmfPkg
-Submodule hash 456789abcd
+Submodule hash ef01234567
and so on. It wouldn't give a clue of how the source changed in the
packages. You can change it to log the added commits ("git diff
--submodule=log" or "git config diff.submodule log"), but not to show
the diffs.
In short, it would be extremely painful.
Thanks!
Laszlo


------------------------------------------------------------------------------
Paolo Bonzini
2015-06-03 12:00:58 UTC
Post by Laszlo Ersek
Post by Paolo Bonzini
Bisectability would be extremely painful, because bisection on the
master repository would leave you at the single huge commit where you
atomically update all subpackages. You would have no clue of how to
bisect _within_ that atomic update, in fact in some case you cannot.
Thanks for the education.
Although bisectability has not been treated as a primary goal across all
of edk2 (unfortunately!), in OvmfPkg we always consider it a first class
goal -- that's how one find bugs and supports users --, and the project
in general should move towards bisectability (and more focused, fine
grained patches), not away from them.
In fact the scenario you described is the original BaseTools situation
all over. Frequently when a BaseTools sync happened, stuff would break,
and users would be left with an unbisectable, multi-KLOC BaseTools patch
to eyeball.
On one hand it would be better because master repo updates would be more
frequent.

On the other hand it would be much, much worse because BaseTools updates
only updated one repo, while here one update could plausibly touch all
of IntelFrameworkPkg, OvmfPkg, PcAtChipsetPkg and UefiCpuPkg or
something like that.

Paolo

------------------------------------------------------------------------------
Laszlo Ersek
2015-06-03 11:37:15 UTC
Post by Laszlo Ersek
I think there are many more inter-dependencies in edk2 than that. Edk2
development does not occur *only* along protocol, PPI, and library class
boundaries. Theoretically that might be possible, but it would require
*extreme* discipline in development (very focused patches, all
developers building all series at all stages before submitting, and so on).
I apologize for responding separately, but an example just occurred to
me: BaseTools.

BaseTools used to exist as a separate repository, and it kept causing
problems (for example, .nasm* assembly sources could not have been
introduced without coordination with BaseTools). Ultimately the
BaseTools suite was unified with the main edk2 repository, and it was a
very welcome development.

Another example would be the PCDs that package X declares in its DEC
file, and package Y sets (statically or dynamically). Coordination is
required; the set of PCDs declared by a package is ultimately a
cross-package interface.

If these can be safely handled with the submodule approach, please
educate me as to how.

Thanks
Laszlo

------------------------------------------------------------------------------
Leif Lindholm
2015-06-03 12:04:15 UTC
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There
are EDKII project Repo and each package Repo. After migrate EDKII
from SVN to GitHub, EDKII Git Repo will be writable, EDKII SVN
project will become mirror. I expect to keep write access in the
centralized Git repo. But, EDKII project Repo (edk2) and MdePkg
Repo (edk2-MdePkg) includes the same source code. So, only one of
them will be writable. My proposal is to make Package Repo be Read
& Write, and update EDKII project to link each package by
submodule way.
This is obviously not my call, but I would strongly vote against this
solution.
Post by Gao, Liming
1. EDKII project is too big. After separate them, the
developers can just pull their used packages instead of
full.
Like Laszlo says, this has never been a problem in the substantially
larger Linux kernel project.
Post by Gao, Liming
2. The different packages have the different owners. After
separate them, the package owner can give write access for
the different developers.
I agree this functionality would be useful, but I think the
disadvantages of the submodule approach far outweigh the value of
this. We are already restricting who has write access to the
repository. If someone abuses this to commit things into a package
where they should not, without that maintainer's agreement, they
should be stripped of their commit privileges.

A good way of dealing with this would be by requiring an Acked-by: (or
Reviewed-by:) from one of the package maintainers, if committing into
a package for which you are not a maintainer.
I don't know enough about the github infrastructure, but git-wise,
this should be possible to enforce with a server-side commit hook.
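
The core of such a check might look like the sketch below. It is deliberately simplified: the function name `has_maintainer_tag` is hypothetical, it only tests that some Acked-by: or Reviewed-by: line is present, and it makes no attempt to verify that the tag comes from a maintainer of the package being touched. Whether a given hosting service permits custom server-side hooks is a separate question.

```shell
#!/bin/sh
# Sketch: test whether a commit message carries a review tag.

# has_maintainer_tag (hypothetical helper): reads a commit message on stdin
# and succeeds if it contains an "Acked-by:" or "Reviewed-by:" line of the
# form "Tag: Name <address>".
has_maintainer_tag() {
    grep -Eq '^(Acked-by|Reviewed-by): .+ <[^>]+>$'
}

# In a real server-side hook, each pushed commit message would be piped in,
# for instance:  git log -1 --format=%B "$rev" | has_maintainer_tag
# Here two hand-written messages stand in for pushed commits.
if printf 'MdePkg: fix typo\n\nReviewed-by: A Maintainer <a@example.com>\n' |
    has_maintainer_tag; then
    echo "tagged message: accepted"
fi

if ! printf 'MdePkg: fix typo\n\nNo review tag here.\n' | has_maintainer_tag; then
    echo "untagged message: rejected"
fi
```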
Post by Gao, Liming
3. Close source project can refer to EDKII packages. Those
project can be easily setup by git submodule.
This can be done anyway. This was how I set up OpenPlatformPkg.
Git contains all of the functionality for doing this - there is no
specific upstream repository layout required.
Just clone edk2, add the submodules you want to import, and commit.
Rebase to newer versions of edk2 and pull newer versions of submodules
as needed.

For OpenPlatformPkg, one of the reasons I decided to use the submodule
functionality was because it makes it conceptually and practically
harder to modify core code for (for example) new platform support
code, thereby enforcing good upstreaming practice.

I think introducing similar compartmentalisation between the packages
that make up the core code itself would introduce reluctance to do any
substantial patch submissions - regardless of whether they are for
features, bugfixes or security.

Regards,

Leif

------------------------------------------------------------------------------
Ard Biesheuvel
2015-06-03 12:47:36 UTC
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo will be
writable, EDKII SVN project will become mirror. I expect to keep write
access in the centralized Git repo. But, EDKII project Repo (edk2) and
MdePkg Repo (edk2-MdePkg) includes the same source code. So, only one of
them will be writable. My proposal is to make Package Repo be Read & Write,
and update EDKII project to link each package by submodule way. The benefit
1. EDKII project is too big. After separate them, the developers can
just pull their used packages instead of full.
2. The different packages have the different owners. After separate
them, the package owner can give write access for the different developers.
3. Close source project can refer to EDKII packages. Those project can
be easily setup by git submodule.
Please no. Git submodules add a layer of complexity that we can really
do without.
As others have pointed out, this will nullify some of the benefits of
using Git, like light-weight local branches and bisectability.
In fact, I would personally prefer staying with SVN over switching to
this particular implementation of Git.

Could you elaborate on what are the perceived issues with the existing
tianocore/edk2.git?
IMO it's working perfectly fine, but if there are any issues, let's
discuss them and work towards a solution.

Regards,
Ard.
Post by Gao, Liming
Compared to EDKII project Repo, submodule EDKII project Repo just includes
edksetup.bat, and edksetup.sh. Some BKM of submodule is shared here.
1. Every Git operation is took for Package Repo. Pull, Branch, Commit,
Create Patch, Fork, and Pull Request are all for Package Repo. If your patch
changes multiple packages, you need to commit and create patch per Package.
2. git submodule foreach “command” can be used to run command on every
package, for example git submodule foreach "git pull"
Thanks
Liming
------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
https://lists.sourceforge.net/lists/listinfo/edk2-devel
------------------------------------------------------------------------------
Brian J. Johnson
2015-06-03 14:50:34 UTC
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
1.EDKII project is too big. After separate them, the developers can just
pull their used packages instead of full.
2.The different packages have the different owners. After separate them,
the package owner can give write access for the different developers.
3.Close source project can refer to EDKII packages. Those project can be
easily setup by git submodule.
Compared to EDKII project Repo, submodule EDKII project Repo just
includes edksetup.bat, and edksetup.sh. Some BKM of submodule is shared
here.
1.Every Git operation is took for Package Repo. Pull, Branch, Commit,
Create Patch, Fork, and Pull Request are all for Package Repo. If your
patch changes multiple packages, you need to commit and create patch per
Package.
2.git submodule foreach “command” can be used to run command on every
package, for example git submodule foreach "git pull"
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.

When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.

I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.

Please, no submodules....

Thanks,
--
Brian J. Johnson

--------------------------------------------------------------------

My statements are my own, are not authorized by SGI, and do not
necessarily represent SGI’s positions.

------------------------------------------------------------------------------
Andrew Fish
2015-06-03 19:01:58 UTC
Post by Brian J. Johnson
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub
(https://github.com/tianocore). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
1.EDKII project is too big. After separate them, the developers can just
pull their used packages instead of full.
2.The different packages have the different owners. After separate them,
the package owner can give write access for the different developers.
3.Close source project can refer to EDKII packages. Those project can be
easily setup by git submodule.
Compared to EDKII project Repo, submodule EDKII project Repo just
includes edksetup.bat, and edksetup.sh. Some BKM of submodule is shared
here.
1.Every Git operation is took for Package Repo. Pull, Branch, Commit,
Create Patch, Fork, and Pull Request are all for Package Repo. If your
patch changes multiple packages, you need to commit and create patch per
Package.
2.git submodule foreach “command” can be used to run command on every
package, for example git submodule foreach "git pull"
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.
When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.
I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.
Please, no submodules….
I agree that submodules add complexity, and make things harder. Maybe for hardware projects they are OK, but the core of edk2 should be one project.

I’ll also point out that `git grep` only works in the submodule. Actually git in general only works from inside the submodule.
For example, if you have a bunch of submodules and you run
    git status
all you will see is:
    modified: MdePkg (modified content, untracked content)
    modified: MdeModulePkg (modified content, untracked content)

To see the change lists you need to cd into the directory of the submodule and run git status.
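
The behaviour described here can be reproduced with a stand-in package, and `git submodule foreach` saves the manual `cd`. A minimal sketch follows, assuming a local repo with an invented file name (`BaseLib.c` is just a placeholder) and the `protocol.file.allow` override needed for local submodule paths on recent Git releases.

```shell
#!/bin/sh
# Sketch: superproject status hides per-file changes inside a submodule.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in package repo with one source file.
git init -q MdePkg
echo 'int x;' > MdePkg/BaseLib.c
git -C MdePkg add BaseLib.c
git -C MdePkg commit -q -m "MdePkg: add BaseLib.c"

# Superproject linking the package.
git init -q edk2
cd edk2
git -c protocol.file.allow=always submodule --quiet add "$work/MdePkg" MdePkg
git commit -q -m "Link MdePkg as a submodule"

# Edit a file inside the submodule working tree.
echo 'int y;' >> MdePkg/BaseLib.c

# From the top level, only the submodule directory is reported.
top=$(git status --short)
# Running status inside each submodule lists the changed files.
inner=$(git submodule --quiet foreach 'git status --short')

echo "superproject: $top"
echo "submodules:   $inner"
```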

Thanks,

Andrew Fish
------------------------------------------------------------------------------
Jordan Justen
2015-06-03 20:00:30 UTC
Post by Andrew Fish
Post by Brian J. Johnson
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.
When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.
I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.
Please, no submodules….
I agree that submodules add complexity, and make things harder.
Maybe for hardware project they are OK, but the core of edk2 should
be one project.
I also would prefer if EDK II upstream could be a single repo, but I
understand why there is also a desire to consider submodules.

First and foremost, inside Intel, the svn:externals feature is used
extensively to compose platform trees together. And, submodules map
very closely to that usage model.

But, even if you try to consider alternatives to submodules for
composing platform trees, things get complicated.

One idea is to fork the EDK II master tree, and add submodules for
your platform specific modules. To me this ends up with the worst of
both worlds. 1. All git commands are difficult to use tree-wide, as
expressed in this thread, and 2. You don't have the power to select
only the EDK II modules that you need for your platform.

Another idea is to fork the EDK II master tree and add your platform
specific modules directly into the fork. In this case, you can still
use all the git commands, but you once again can't select only the EDK
II modules that you need for your platform. Other difficulties arise,
such as, what if you have a chipset package that you want to share for
multiple platforms? Unless all the platforms for that chipset live in
the same branch, how do you easily share common code for those chipset
packages? (Maybe a separate 'upstream' for the chipset code that the
platforms merge in as needed?)

I think Android might share some of the same concerns, and their
solution was to invent a submodules-like alternative called 'repo'
that layers on git.

So, can we add these concerns into the discussion, and maybe document
an alternative way to address these concerns if submodules aren't
used?

Thanks,

-Jordan

------------------------------------------------------------------------------
Kirkendall, Garrett
2015-06-03 20:35:47 UTC
First, sorry for the long post; maybe this will get me some help discovering what I can do with git, or help somebody else by describing my pain.

As a user of EDKII, we are currently trying to determine how we can transition to git for our projects. I am also still relatively new to git so I still have a lot of learning to do. Also, our primary development environment for x86 UEFI images is Windows.

Here are my struggles with converting to git.

Subversion allows us to pull any subfolder of a repository as an external. Therefore, we are able to pull each package folder of EDKII into the root directory of our working copy and keep the expected EDKII directory structure of all packages in the root (That only leaves special handling of the relatively few EDKII root folder files). The other huge advantage is that I can have a development branch which automatically tracks the HEAD of each external. Or, I can just as easily specify a revision of an external for a production branch.

We thought about submodules and, after much reading, figured out they really weren't a good idea, and not nearly as useful as svn externals. The main reason is that you can't easily track the HEAD of submodules. While it looks like the commands have become somewhat easier to use to update to the HEAD revision, as far as I can tell, you are still required to commit to the parent project for it to recognize what to pull from the submodule. Submodules also require extra commands/parameters to clone the submodules. Svn updates externals by default, and you have to specify that you don’t want them updated.
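
A minimal sketch of that two-step update, for anyone who wants to try it. The repository names (pkg, platform) and file contents are hypothetical, and the protocol.file.allow override is only there because the demo clones from a local path; a real setup would point at a remote URL.

```shell
#!/bin/sh
# Sketch: a submodule follows its upstream branch only after an explicit
# update AND a commit in the parent. All names here are hypothetical.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)

# A stand-in "package" repository with one commit.
git init -q "$work/pkg"
cd "$work/pkg"
echo v1 > lib.c
git add lib.c
git commit -q -m "pkg v1"
branch=$(git symbolic-ref --short HEAD)

# A parent "platform" repository that pulls the package in as a submodule.
git init -q "$work/platform"
cd "$work/platform"
git -c protocol.file.allow=always submodule --quiet add -b "$branch" "$work/pkg" pkg
git commit -q -m "add pkg submodule"

# Upstream moves on.
cd "$work/pkg"
echo v2 > lib.c
git commit -q -am "pkg v2"

# Step 1: move the submodule checkout to the tip of the tracked branch.
cd "$work/platform"
git -c protocol.file.allow=always submodule --quiet update --remote pkg
pending=$(git status --porcelain)   # the parent now reports pkg as modified

# Step 2: record the new submodule commit in the parent.
git commit -q -am "bump pkg to upstream tip"
```

Until step 2 is committed, a fresh clone of the parent would still check out the old package revision, which is the extra bookkeeping described above.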

The fact that git only works on a complete repository makes it impossible to pull a complete EDKII tree, as some sort of external, such that the package folders go into the root directory as expected by the EDKII build infrastructure. This is why I assume the splitting up of packages into separate repos was suggested. In Linux, it would be easy to pull the EDKII tree into a subdirectory and make links from folders of EDKII into the root directory. Git even seems to be happy following these links and showing status of the complete subproject, etc.

Then there's Google repo, which, under Linux at least, seems to fit the bill relatively nicely. It allows you to specify a sub-repository branch at its HEAD, or at a revision. It gives you some nice functionality to see what has changed in subprojects, but you have to use the git tools to commit to each individual subproject. The downside for Windows: Google repo heavily depends on the OS file management capabilities of Linux and is therefore a non-portable Python tool that must be run in a Linux-like command-line environment. Cygwin provides a relatively Linux-capable command prompt environment in Windows and can run Google repo, but with some less than desirable side effects. If you've played with Google repo, you know it does a lot of things with symbolic links. Cygwin can be configured to create Windows symbolic links when the target file/directory exists. Google repo creates symbolic links even when the target doesn't exist, so you can't force strict native symbolic links. Since the .git subdirectory in subprojects is full of symbolic links back to the .repo directory, you are then forced to do every bit of the git interaction within a Cygwin prompt. That's OK with me because I know enough about the Windows and Linux command lines to be pretty dangerous to myself. Our other developers are not as Linux knowledgeable, so now they will have to learn the Linux command line and the git command line to do the same thing they did with a nice pretty GUI in Windows and Subversion. (I know about esrlabs/git-repo, but it doesn't look like it gets much love.)

So even with those problems and Cygwin, Google repo seems to have less of a downside than submodules, so we will probably use that. Now I'm down to the problem of pulling the EDKII tree into my project. We can't get Cygwin/repo to consistently create Windows native symbolic links. It looks like we are going to pull EDKII as a subdirectory and then, in a batch file run from a "cmd.exe" prompt, create Windows symbolic links to the proper subdirectories. We use a batch file under "cmd.exe" to run our platform build anyway, so no big deal. This will allow us to at least build in a native Windows command environment.

One more interesting thing about Cygwin/Linux is that it can do particularly nasty things to file access permissions that Windows users don't usually comprehend. It can make an executable file "*.exe" not execute because it got stored in the git repo without the executable bit set, for example when Linux was used to convert from an svn repo to a git repo.

One last thought, I assume there would be some strategy where we could start with the EDKII tree as our base project and then put our platform code in. Then use merging to pick up EDKII modifications, but as Jordan said, that has its own concerns.

Thanks to anyone that took the time to read this whole thing. Any thoughts would be greatly appreciated.

GARRETT KIRKENDALL  
SMTS Firmware Engineer | CTE
7171 Southwest Parkway, Austin, TX 78735 USA


Brian J. Johnson
2015-06-03 22:42:56 UTC
Post by Kirkendall, Garrett
The fact that git only works on a complete repository makes it impossible to pull a complete EDKII tree, as some sort of external, such that the package folders go into the root directory as expected by the EDKII build infrastructure. This is why I assume the splitting up of packages into separate repos was suggested. In Linux, it would be easy to pull the EDKII tree into a subdirectory and make links from folders of EDKII into the root directory. Git even seems to be happy following these links and showing status of the complete subproject, etc.
Yes, that's definitely an issue.
Post by Kirkendall, Garrett
Then there's Google repo. Which, under Linux at least, seems to fit the bill relatively nicely. It allows you to specify a sub repository branch at its HEAD, or at a revision. It gives you some nice functionality to see what has changed in subprojects, but you have to use the git tools to commit to each individual subproject. The downside for Windows, Google repo heavily depends on the OS file management capabilities of Linux and therefore is a non-portable python tool that must be run in a Linux like command-line environment. Cygwin provides a relatively Linux capable command prompt environment in Windows and can run google repo, but with some less than desirable side effects. If you've played with google repo, it does a lot of things with symbolic links. Cygwin can be configured to create windows symbolic links when the target file/directory exists. Google repo creates symbolic links even when the target doesn't exist so, you can't force strict native symbolic links. Since the .git subdirectory in subprojects is full of symbolic links back to the .repo directory, you are then forced to do every bit of the git interaction within a Cygwin prompt. That's OK with me because I know enough about Windows command-line and Linux command-line to be pretty dangerous to myself. Our other developers are not as Linux knowledgeable, so now they will have to learn about Linux command-line and git command-line to do the same thing they had a nice pretty GUI in Windows and subversion. (I know about esrlabs/git-repo, but it doesn't look like it gets much love.)
You could also take a look at gitslave ("gits",
http://gitslave.sourceforge.net/). It works a bit like submodules, in
that you can add a repository as a subdirectory, and manage it without
too much trouble. Essentially it automates running git commands across
a super-repo and sub-repos, with some sugar for aggregating the results
into a readable whole. But at the end of the day, you're still trying
to coordinate multiple git repositories.

Gitslave is just a Perl script. I've never tried using it on Windows,
but I'd imagine it would work.
Post by Kirkendall, Garrett
...
One last thought, I assume there would be some strategy where we could start with the EDKII tree as our base project and then put our platform code in. Then use merging to pick up EDKII modifications, but as Jordan said, that has its own concerns.
Thanks to anyone that took the time to read this whole thing. Any thoughts would be greatly appreciated.
If you want to track upstream TianoCore, but don't want to have packages
you're not interested in cluttering up your work areas, you could just do
your development in a branch in which you've "git rm"ed the unneeded
code. Merges should recognize that you've deleted the files. The
unwanted code will still be in the .git data, but disk is cheap....
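
A rough sketch of that flow, using a throwaway repository. The package names (MdePkg, UnwantedPkg) and the branch name "platform" are placeholders, and the local default branch stands in for upstream TianoCore.

```shell
#!/bin/sh
# Sketch: work on a branch where unneeded packages have been "git rm"ed,
# and keep merging upstream into it. Names are placeholders.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)

git init -q "$work/edk2-demo"
cd "$work/edk2-demo"
mkdir MdePkg UnwantedPkg
echo core > MdePkg/a.c
echo extra > UnwantedPkg/b.c
git add .
git commit -q -m "upstream: initial import"
upstream=$(git symbolic-ref --short HEAD)

# Development branch without the unwanted package.
git checkout -q -b platform
git rm -q -r UnwantedPkg
git commit -q -m "platform: drop UnwantedPkg"

# Upstream changes a package we do keep.
git checkout -q "$upstream"
echo "core v2" > MdePkg/a.c
git commit -q -am "upstream: update MdePkg"

# The merge brings in the MdePkg change and leaves UnwantedPkg deleted.
git checkout -q platform
git merge -q --no-edit "$upstream"
```

One caveat: if upstream later modifies a file under a directory you removed, git reports a modify/delete conflict for that file, and you have to resolve it (typically with "git rm" again) before completing the merge.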

It's possible to do all sorts of things with git filter-branch. For
instance, you could completely remove unnecessary packages and their
history from the upstream TianoCore code for your in-house tree. Or you
could grab a package, along with its history, out of one project to use
in another project. Of course, the git commit IDs change when you do
that, so you can't just git-pull from an original to fetch updates...
you'd need to write some additional scripts to grab and filter new
upstream commits, or just wrangle patches manually. But you have the
full power of git to use within the project tree, with no extra tools or
wrappers for daily development.
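
As an illustration of the "grab a package along with its history" case, here is a small sketch using the --subdirectory-filter option of git filter-branch. The repository layout and the branch name mdepkg-only are made up, and FILTER_BRANCH_SQUELCH_WARNING is set only to silence the notice that newer git versions print.

```shell
#!/bin/sh
# Sketch: extract one package and its history with filter-branch.
# Layout and names are hypothetical.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=dev@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1
work=$(mktemp -d)

git init -q "$work/project"
cd "$work/project"
mkdir MdePkg OtherPkg
echo one > MdePkg/a.c
echo two > OtherPkg/b.c
git add .
git commit -q -m "initial import"
echo three >> OtherPkg/b.c
git commit -q -am "OtherPkg: change"
echo four >> MdePkg/a.c
git commit -q -am "MdePkg: change"
original=$(git rev-parse HEAD)

# Rewrite a separate branch so that MdePkg becomes the project root.
git checkout -q -b mdepkg-only
git filter-branch --subdirectory-filter MdePkg HEAD >/dev/null 2>&1
rewritten=$(git rev-parse HEAD)
commits=$(git rev-list --count HEAD)   # only commits that touched MdePkg remain
```

The rewritten branch has new commit IDs (original and rewritten differ), which is why a plain git pull from the original no longer lines up afterwards.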

That's essentially the flow I've used. At the beginning of a project we
use a filter-branch script to copy relevant parts of the previous
project (sometimes with significant pathname rewriting) into the new
project. Then we manually apply patches between them (using git's patch
generation and application facilities, of course, or a helper like
stgit) as necessary. We don't have many such patches to merge... if we
did, we'd probably have to find a different solution.

So things don't need to be pretty... they just have to work well in the
common, daily cases, and not be utterly broken in the uncommon ones
(like starting new projects.)
--
Brian J. Johnson

--------------------------------------------------------------------

My statements are my own, are not authorized by SGI, and do not
necessarily represent SGI’s positions.


Laszlo Ersek
2015-06-04 00:13:05 UTC
Post by Jordan Justen
Post by Andrew Fish
Post by Brian J. Johnson
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.
When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.
I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.
Please, no submodules….
I agree that submodules add complexity, and make things harder.
Maybe for hardware project they are OK, but the core of edk2 should
be one project.
I also would prefer if EDK II upstream could be a single repo, but I
understand why there is also a desire to consider submodules.
First and foremost, inside Intel, the svn:externals feature is used
extensively to compose platform trees together. And, submodules map
very closely to that usage model.
But, even if you try to consider alternatives to submodules for
composing platform trees, things get complicated.
One idea, is to fork the EDK II master tree, and add submodules for
your platform specific modules. To me this ends up with the worst of
both worlds. 1. All git commands are difficult to use tree-wide, as
expressed in this thread, and 2. You don't have the power to select
only the EDK II modules that you need for your platform.
True.

I think git's history / background can help us find an explanation for
questions like the one above. (Note, I'm not saying "find satisfying
answers / solutions", just "an explanation".) Git was invented for
facilitating the Linux kernel's development. (Hence, when in doubt, look
to Linux.) Indeed, when you clone the Linux repo, you get everything.
Post by Jordan Justen
Another idea is to fork the EDK II master tree and add your platform
specific modules directly into the fork. In this case, you can still
use all the git commands, but you once again can't select only the EDK
II modules that you need for your platform.
Agreed. (Similarly, when you clone or pull the kernel, you get driver
directories you might never enter or build.)
Post by Jordan Justen
Other difficulties arise,
such as, what if you have a chipset package that you want to share for
multiple platforms? Unless all the platforms for that chipset live in
the same branch, how do you easily share common code for those chipset
packages? (Maybe a separate 'upstream' for the chipset code that the
platforms merge in as needed?)
Exactly. This is what would follow the Linux development model. The
central chipset package would have a maintainer, a team, a mailing list
(perhaps :)). The chipset package maintainer would apply patches in his
own tree, and sometimes send pull requests to the "central" repository's
"chief maintainer". After a pull (when the chipset update would appear
at once in the central master branch, but without losing history),
platform teams could fetch that branch, and either rebase their
development branches on top, or merge the updated master into their
development branches.
Post by Jordan Justen
I think Android might share some of the same concerns, and their
solution was to invent a submodules-like alternative called 'repo'
that layers on git.
I thought of symlinks (and I was surprised to see in Garrett's email
that Google's "repo" operates with symlinks!), but, as Garrett
describes, symlinks don't really work on Windows. (Again, git was
invented for Linux, on Linux...)
Post by Jordan Justen
So, can we add these concerns into the discussion, and maybe document
an alternative way to address these concerns if submodules aren't
used?
Would it be possible to maintain the submodules in-house only?

If the "upstream first" principle is followed -- that is, everything
that gets open sourced gets *developed* for upstream in the first place
-- then it should be possible to port the upstream patches to the
internal submodules. Is that correct? Package owners can keep an eye on
upstream patches (in reviews) and intervene if a patch were to introduce
too tight coupling between packages. (This is being watched already, and
Intel's in-house model can apparently deal with the currently existing
interdependencies.)

Once the upstream patches / history have been dispersed (backported, or
branch-filtered) to the in-house submodules, the current usage /
platform build model could continue.

Red Hat follows the "upstream first" model, and we do a lot of
backports. (No submodules though, as far as kernel & virt are concerned.)

... It's hard to find a common workflow; the requirements and the means
of coordination are different.

Thanks
Laszlo

Roy Franz
2015-06-04 00:59:46 UTC
Post by Jordan Justen
Post by Andrew Fish
Post by Brian J. Johnson
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.
When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.
I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.
Please, no submodules….
I agree that submodules add complexity, and make things harder.
Maybe for hardware project they are OK, but the core of edk2 should
be one project.
I also would prefer if EDK II upstream could be a single repo, but I
understand why there is also a desire to consider submodules.
First and foremost, inside Intel, the svn:externals feature is used
extensively to compose platform trees together. And, submodules map
very closely to that usage model.
But, even if you try to consider alternatives to submodules for
composing platform trees, things get complicated.
One idea, is to fork the EDK II master tree, and add submodules for
your platform specific modules. To me this ends up with the worst of
both worlds. 1. All git commands are difficult to use tree-wide, as
expressed in this thread, and 2. You don't have the power to select
only the EDK II modules that you need for your platform.
Another idea is to fork the EDK II master tree and add your platform
specific modules directly into the fork. In this case, you can still
use all the git commands, but you once again can't select only the EDK
II modules that you need for your platform. Other difficulties arise,
such as, what if you have a chipset package that you want to share for
multiple platforms? Unless all the platforms for that chipset live in
the same branch, how do you easily share common code for those chipset
packages? (Maybe a separate 'upstream' for the chipset code that the
platforms merge in as needed?)
I think Android might share some of the same concerns, and their
solution was to invent a submodules-like alternative called 'repo'
that layers on git.
So, can we add these concerns into the discussion, and maybe document
an alternative way to address these concerns if submodules aren't
used?
Thanks,
-Jordan
I'm wondering how this all works with the single upstream SVN repository
we have now? Are people routinely checking out subdirectories of the
repository, and including these in their internal repositories? If this
is a key usage model it is going to be difficult to replicate this with
git.

Honestly if the choice comes down to an EDK2 upstream being git with 40
submodules, or a single SVN repository (with a git mirror), I'd rather
have the single SVN repository. I'd never use SVN, but just use the git
mirror. The maintainers would still need to use SVN, but that's not my
problem :) The people that want to use SVN could use SVN, and those that
want to use git (except for maintainers/committers :) could use git.

Roy

Andrew Fish
2015-06-04 01:17:26 UTC
Post by Roy Franz
Honestly if the choice comes down to an EDK2 upstream being git with
40 submodules, or
a single SVN repository (with a git mirror), I'd rather have the
single SVN repository.
I'd never use SVN, but just use the git mirror. The maintainers would would
still need to use SVN, but that's not my problem :):) The people that
want to use SVN
could use SVN, and those that want to use git (except for
maintainers/committers :) could
use git.
You can commit to svn from git.

git svn rebase
git svn dcommit
git svn info

Thanks,

Andrew Fish
Jordan Justen
2015-06-04 02:03:41 UTC
Post by Roy Franz
Honestly if the choice comes down to an EDK2 upstream being git with
40 submodules, or
a single SVN repository (with a git mirror), I'd rather have the
single SVN repository.
I'd never use SVN, but just use the git mirror. The maintainers would would
still need to use SVN, but that's not my problem :):) The people that
want to use SVN
could use SVN, and those that want to use git (except for
maintainers/committers :) could
use git.
You can commit to svn from git.
git svn rebase
git svn dcommit
git svn info
git svn makes svn based development sane, but it is inferior.

It doesn't really support all git features.

It also has a nasty gotcha where equivalent branches get artificially
split.

For example, my 'git-svn' at top-of-tree is never considered the same
as origin/master.

This prevents things like 'git merge' from being usable. Of course,
'git merge' can't be used with git svn anyhow...

It also causes the source control history to be needlessly duplicated
for the two branches.

An example of how this wastes time is that I do my development based
on the git origin/master branch. But, when it comes time to commit to
svn, I check out the git-svn branch, run git svn rebase, cherry-pick
all the changes to the git-svn branch, and finally use git svn
dcommit. Contrast this to just running 'git push'.

-Jordan

Andrew Fish
2015-06-04 02:41:21 UTC
Post by Jordan Justen
Post by Andrew Fish
You can commit to svn from git.
git svn rebase
git svn dcommit
git svn info
git svn makes svn based development sane, but it is inferior.
It doesn't really support all git features.
It also has a natsy gotcha where equivalent branch get artificially
split.
For example, my 'git-svn' at top-of-tree is never considered the same
as origin/master.
This prevents things like 'git merge' from being usable. Of course,
'git merge' can't be used with git svn anyhow...
It also causes the source control history to be needlessly duplicated
for the two branches.
An example of how this wastes time is that I do my development based
on the git origin/master branch. But, when it comes time to commit to
svn, I checkout the git-svn branch, run git svn rebase, cherry-pick
all the changes to the git-svn branch, and finally use git svn
dcommit. Contrast this to just running 'git push'.
I was just throwing it out as a possible solution; I did not mean to imply it was “good”.

I’m wondering if you could use a pre-commit hook to merge with the git mirror and do the git-svn for you?

It also looks like `git update-index --assume-unchanged` could be used to prune unwanted code out of the remote view? So maybe one big git repo could be made to work? It would be awesome if the build could generate a list of all the files used in the build, so that pruning would be easy.

Thanks,

Andrew Fish
Ard Biesheuvel
2015-06-04 06:33:17 UTC
Post by Andrew Fish
You can commit to svn from git.
git svn rebase
git svn dcommit
git svn info
git svn makes svn based development sane, but it is inferior.
It doesn't really support all git features.
It also has a natsy gotcha where equivalent branch get artificially
split.
For example, my 'git-svn' at top-of-tree is never considered the same
as origin/master.
This prevents things like 'git merge' from being usable. Of course,
'git merge' can't be used with git svn anyhow...
It also causes the source control history to be needlessly duplicated
for the two branches.
An example of how this wastes time is that I do my development based
on the git origin/master branch. But, when it comes time to commit to
svn, I checkout the git-svn branch, run git svn rebase, cherry-pick
all the changes to the git-svn branch, and finally use git svn
dcommit. Contrast this to just running 'git push'.
Actually, running 'git rebase origin/master --onto remotes/git-svn'
from your topic branch works just fine (after a git svn fetch) so
there is no need to cherry-pick a patch at a time.
There is not even really a need to have a local branch that tracks the
svn remote.
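
A self-contained sketch of that rebase, with local branches standing in for the real refs: "mirror" plays the role of origin/master and "svn" plays the role of remotes/git-svn. Both names and the file contents are invented for the demo; the two branches hold the same tree under different commit IDs, which is the situation described above.

```shell
#!/bin/sh
# Sketch: move a topic branch from one view of history onto an
# equivalent one with a single "git rebase --onto". Names are stand-ins.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)

git init -q "$work/demo"
cd "$work/demo"
echo base > file.txt
git add file.txt
git commit -q -m "base, as seen through the git mirror"
git branch -m mirror

# Same tree, different commit id: stands in for the git-svn view.
git checkout -q --orphan svn
git commit -q -m "base, as imported by git-svn"

# Topic branch developed against the mirror.
git checkout -q -b topic mirror
echo one > one.txt
git add one.txt
git commit -q -m "topic: one"
echo two > two.txt
git add two.txt
git commit -q -m "topic: two"

# Replay everything in mirror..topic onto the svn-side branch in one step.
git rebase -q --onto svn mirror topic
replayed=$(git rev-list --count svn..topic)
```

After the rebase, topic sits directly on top of the svn-side branch with both commits replayed, so a dcommit from there would not need a patch-by-patch cherry-pick.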

As I said in my earlier reply, I prefer Git over SVN any day of the
week, but I don't have /that/ many issues with the current situation.

So am I correct in understanding that the split view into separate
submodules is primarily for the benefit of the downstream, and most
development of EDK2 will occur in the EDK2 project? In that case, it
is really just a matter of providing a read-only split view, and using
git subtree does sound like a promising approach. But otherwise,
retaining the SVN repo but making it the slave rather than the master,
with some automation in place to keep it in sync sounds reasonable as
well, since it will allow the continued use of svn:externals as
before. Or perhaps we should just do both?

In any case, I think the consensus is that the EDK2 upstream should
remain a single project if it moves to Git. If anyone disagrees with
this observation but hasn't chimed in yet, please speak up and
motivate your position.
--
Ard.

Andrew Fish
2015-06-04 06:46:33 UTC
Post by Ard Biesheuvel
In any case, I think the consensus is that the EDK2 upstream should
remain a single project if it moves to Git. If anyone disagrees with
this observation but hasn't chimed in yet, please speak up and
motivate your position.
Ard,

I agree with you, but I have to ask the question… How many platform ports in the single git repo is too much?

Maybe the answer is it will never be too much? In that case we should have some tools to help prune the developers' repositories.

If we are going to go with a single git, it seems like the edk2 build system could “help out”:
1) Enable more than one package root. Don’t require all the packages to be in the Root. Allow multiple WORKSPACE paths as starting points.
   a) This gives some flexibility, even if it is just adding vendor directories, or even an edk2 directory for a larger project.
2) Have the log file for the build list all the Packages, or maybe even the unused packages in the tree for the build.
   a) This could enable a tool that pruned out the extra stuff from git that you don’t need.
   b) We could even just emit the script that runs the git commands to adjust the local repository.
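
To make 2b a bit more concrete, here is one possible shape for such a script. It is only a sketch under assumptions: the build does not emit a package list today, so used-packages.txt and its one-directory-per-line format are hypothetical, and the pruning is done with git's sparse checkout mechanism (a swapped-in technique, not something the build tools provide).

```shell
#!/bin/sh
# Sketch: limit the working tree to the packages a build actually used.
# used-packages.txt is a hypothetical build output, one directory per line.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=dev@example.com
work=$(mktemp -d)

git init -q "$work/edk2-demo"
cd "$work/edk2-demo"
mkdir MdePkg OvmfPkg UnusedPkg
echo a > MdePkg/a.c
echo b > OvmfPkg/b.c
echo c > UnusedPkg/c.c
echo setup > edksetup.sh
git add .
git commit -q -m "initial import"

printf 'MdePkg\nOvmfPkg\n' > "$work/used-packages.txt"

# Sparse-checkout patterns: keep top-level files plus the listed packages.
git config core.sparseCheckout true
mkdir -p .git/info
{
  echo '/*'      # everything at the top level ...
  echo '!/*/'    # ... except directories
  while read -r pkg; do
    echo "/$pkg/"
  done < "$work/used-packages.txt"
} > .git/info/sparse-checkout

# Re-read HEAD so the working tree matches the patterns.
git read-tree -m -u HEAD
```

The history and objects for the unused packages stay in .git; only the working tree shrinks, so pulls from the single upstream repo keep working.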

Thanks,

Andrew Fish
Ard Biesheuvel
2015-06-04 08:40:34 UTC
Post by Ard Biesheuvel
In any case, I think the consensus is that the EDK2 upstream should
remain a single project if it moves to Git. If anyone disagrees with
this observation but hasn't chimed in yet, please speak up and
motivate your position.
Ard,
I agree with you, but I have to ask the question… How many platform ports in
the single git repro is too much?
Maybe the answer is it will never be too much? In that case we should have
some tools to help prune the developers repositories.
To what end exactly? I agree that it may not scale beyond some point
in the future, but let's get clear what exactly we want to accomplish
by removing source files and folders from the tree that the developer
is going to ignore anyway. Disk space? Clutter in the EDK2/ root
folder?
Post by Ard Biesheuvel
If we are going to go with a single git, it seems like the edk2 build system
1) Enable more than one package root. Don’t require all the packages to be
in the Root. Allow multiple WORKSPACE paths as starting points.
a) This gives some flexibility, even if it is just adding vendor
directories, or even an edk2 directory for a larger project.
Yes, we should make it as easy as possible to build packages and
platforms out-of-tree.
Post by Ard Biesheuvel
2) Have the log file for the build list all the Packages, or maybe even the
unused packages in the tree for the build.
a) This could enable a tool that pruned out the extra stuff from git that
you don’t need.
b) We could even just emit the script that runs the git commands to adjust
the local repository.
Removing folders containing files that subsequent patches may modify
is going to cause conflicts when pulling from the repo, so I guess
this may be more cumbersome than you think.

So perhaps it is really a matter of adding support to the BuildTools
to allow building packages and platforms that exist outside of the
EDK2 code and refer to it at build time via a git submodule under
EDK2/. Consequently, we should not add any new platforms to the core,
and perhaps retire some and move them to a separate platforms tree
such as the one Leif has been working on.
--
Ard.

Jordan Justen
2015-06-04 01:53:06 UTC
Post by Jordan Justen
But, even if you try to consider alternatives to submodules for
composing platform trees, things get complicated.
One idea, is to fork the EDK II master tree, and add submodules for
your platform specific modules. To me this ends up with the worst of
both worlds. 1. All git commands are difficult to use tree-wide, as
expressed in this thread, and 2. You don't have the power to select
only the EDK II modules that you need for your platform.
Another idea is to fork the EDK II master tree and add your platform
specific modules directly into the fork. In this case, you can still
use all the git commands, but you once again can't select only the EDK
II modules that you need for your platform. Other difficulties arise,
such as, what if you have a chipset package that you want to share for
multiple platforms? Unless all the platforms for that chipset live in
the same branch, how do you easily share common code for those chipset
packages? (Maybe a separate 'upstream' for the chipset code that the
platforms merge in as needed?)
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.

I would set up an automated process to split each package off using git
subtree, and push the separate repos.

So, people who like the idea of git submodules can use them. The main
disadvantage to them would be to get things upstream, they'd have to
convert their commits to the merged unified tree. (git subtree might
be able to help here as well, but there is no doubt that it would be
more steps.)

I never got the time to investigate if git subtree could work as
required, but this text from the help page seems promising:

"
split

Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.

After splitting successfully, a single commit id is printed to
stdout. This corresponds to the HEAD of the newly created tree,
which you can manipulate however you want.

Repeated splits of exactly the same history are guaranteed to be
identical (ie. to produce the same commit ids). Because of this,
if you add new commits and then re-split, the new commits will
be attached as commits on top of the history you generated last
time, so 'git merge' and friends will work as expected.

Note that if you use '--squash' when you merge, you should
usually not just '--rejoin' when you split.
"

Note the "Repeated splits" part...
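
A minimal sketch of the "Repeated splits" property, assuming git subtree is
installed (it ships in git's contrib area, so some installations lack it).
The throwaway repository and the file names are made up for illustration;
only the ArmPkg and MdePkg directory names come from EDK II.

```shell
set -e
# Fixed identity so the sketch runs without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway stand-in for the unified tree (hypothetical contents).
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir ArmPkg MdePkg
echo one > ArmPkg/a.txt
echo one > MdePkg/m.txt
git add .
git commit -q -m "Add ArmPkg and MdePkg"
echo two >> MdePkg/m.txt
git commit -q -am "MdePkg: change that does not touch ArmPkg"

# Split twice from the same history; the commit id is printed on stdout.
first=$(git subtree split --prefix=ArmPkg 2>/dev/null)
again=$(git subtree split --prefix=ArmPkg 2>/dev/null)

# Add a commit under ArmPkg and split once more.
echo two >> ArmPkg/a.txt
git commit -q -am "ArmPkg: second change"
second=$(git subtree split --prefix=ArmPkg 2>/dev/null)
parent=$(git rev-parse "$second^")

echo "first split:  $first"
echo "repeat split: $again"
echo "parent of the re-split head: $parent"
```

If the help text holds, the first two ids match and the re-split head sits
directly on top of the earlier split.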

-Jordan

------------------------------------------------------------------------------
Laszlo Ersek
2015-06-04 13:15:58 UTC
Permalink
Post by Jordan Justen
Post by Jordan Justen
But, even if you try to consider alternatives to submodules for
composing platform trees, things get complicated.
One idea, is to fork the EDK II master tree, and add submodules for
your platform specific modules. To me this ends up with the worst of
both worlds. 1. All git commands are difficult to use tree-wide, as
expressed in this thread, and 2. You don't have the power to select
only the EDK II modules that you need for your platform.
Another idea is to fork the EDK II master tree and add your platform
specific modules directly into the fork. In this case, you can still
use all the git commands, but you once again can't select only the EDK
II modules that you need for your platform. Other difficulties arise,
such as, what if you have a chipset package that you want to share for
multiple platforms? Unless all the platforms for that chipset live in
the same branch, how do you easily share common code for those chipset
packages? (Maybe a separate 'upstream' for the chipset code that the
platforms merge in as needed?)
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using git
subtree, and push the separate repos.
So, people who like the idea of git submodules can use them.
Sounds promising.
Post by Jordan Justen
The main
disadvantage to them would be to get things upstream, they'd have to
convert their commits to the merged unified tree.
I think the conversion (= porting) should go in the other direction.
Develop for upstream first, and once it passes public review (clearly
including Intel's own reviewers), and is applied, *then* split your own
upstream commits as well, with the same subtree method.

If people develop for the internal repos first (and pass internal
reviews), then it is very tempting (and easy) to leave the public repo
behind.

(Obviously this only covers modules that are open source.)
Post by Jordan Justen
(git subtree might
be able to help here as well, but there is no doubt that it would be
more steps.)
Yes. Certainly not double the work, but more than developing for just
one repo.

Thanks
Laszlo
------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
https://lists.sourceforge.net/lists/listinfo/edk2-devel
------------------------------------------------------------------------------
Brian J. Johnson
2015-06-04 15:58:54 UTC
Permalink
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using git
subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember what
I thought about it.... It works, but it tends to produce a confusing
git log, IIRC. And if you're going to push to the subtrees, you should
be careful to limit each commit to files in a single (sub)tree. That
requires developer discipline, or a good pre-commit hook.
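
A rough sketch of such a pre-commit hook, assuming each package is a
top-level directory of the repository. The hook body and the demo
repository are hypothetical, not something the EDK II project ships.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository with two packages (hypothetical contents).
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir ArmPkg MdePkg
echo one > ArmPkg/a.txt
echo one > MdePkg/m.txt
git add .
git commit -q -m "Add ArmPkg and MdePkg"

# Install a hook that rejects commits spanning several top-level directories.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# First path component of every staged file, de-duplicated.
tops=$(git diff --cached --name-only | cut -d/ -f1 | sort -u)
count=$(printf '%s\n' "$tops" | grep -c .)
if [ "$count" -gt 1 ]; then
    echo "commit touches more than one package:" >&2
    printf '  %s\n' $tops >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# A commit that touches two packages should be refused by the hook ...
echo two >> ArmPkg/a.txt
echo two >> MdePkg/m.txt
git add ArmPkg MdePkg
if git commit -q -m "Touch two packages" 2>/dev/null; then
    cross=accepted
else
    cross=rejected
fi

# ... while a commit limited to one package goes through.
git reset -q
git add ArmPkg
if git commit -q -m "ArmPkg: single-package change" 2>/dev/null; then
    single=accepted
else
    single=rejected
fi
echo "cross-package commit: $cross, single-package commit: $single"
```

Hooks are local to each clone, so this only helps developers who install
it; a server-side check would be needed to enforce the rule for everyone.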

But for extracting packages into separate read-only repos, it should be
perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
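
A small sketch of the filter-branch variant, run on a disposable clone
because it rewrites history in place. The repository layout is made up;
the check at the end only shows that the rewritten root tree equals the
ArmPkg subdirectory of the original, not that the commit ids match what
git subtree would produce.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the notice newer git prints

# Throwaway stand-in for the unified tree (hypothetical contents).
work=$(mktemp -d)
cd "$work"
git init -q edk2
cd edk2
mkdir ArmPkg MdePkg
echo one > ArmPkg/a.txt
echo one > MdePkg/m.txt
git add .
git commit -q -m "Add ArmPkg and MdePkg"
echo two >> MdePkg/m.txt
git commit -q -am "MdePkg: change"
echo two >> ArmPkg/a.txt
git commit -q -am "ArmPkg: change"
cd ..

# Rewrite a clone so that ArmPkg becomes the root of the project.
git clone -q edk2 ArmPkg-only
cd ArmPkg-only
git filter-branch --subdirectory-filter ArmPkg HEAD >/dev/null 2>&1

commits=$(git rev-list --count HEAD)          # commits that touched ArmPkg
root_tree=$(git rev-parse 'HEAD^{tree}')      # root tree after the rewrite
pkg_tree=$(git -C ../edk2 rev-parse HEAD:ArmPkg)
echo "commits kept: $commits"
echo "root tree $root_tree, original ArmPkg tree $pkg_tree"
```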
--
Brian J. Johnson

--------------------------------------------------------------------

My statements are my own, are not authorized by SGI, and do not
necessarily represent SGI’s positions.

------------------------------------------------------------------------------
Roy Franz
2015-06-04 16:34:17 UTC
Permalink
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using git
subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember what
I thought about it.... It works, but it tends to produce a confusing
git log, IIRC. And if you're going to push to the subtrees, you should
be careful to limit each commit to files in a single (sub)tree. That
requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should be
perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
--
Brian J. Johnson
If git subtree can be used to create and maintain read-only "modules" based
on the single master git repo, this sounds like a good solution. This keeps
the single git repo as the master, and avoids the complications of commits
that cross module boundaries in the sub-module case. If git subtree can
support this usage model, would read-only git subtrees for the modules meet
the requirements for those who want to use "modules" individually? Note that
only the upstream subtree repos are read-only, various other groups could
still have internal read/write repos cloned from those, it's just that
changes pushed upstream would need to go through the master git tree.

If this is a generally acceptable plan, then I guess a next step is to
verify that git subtree works as desired.

Roy
------------------------------------------------------------------------------
Roy Franz
2015-06-04 19:29:23 UTC
Permalink
Post by Roy Franz
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using git
subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember what
I thought about it.... It works, but it tends to produce a confusing
git log, IIRC. And if you're going to push to the subtrees, you should
be careful to limit each commit to files in a single (sub)tree. That
requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should be
perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
--
Brian J. Johnson
If git subtree can be used to create and maintain read-only "modules" based
on the single master git repo, this sounds like a good solution. This keeps
the single git repo as the master, and avoids the complications of commits
that cross module boundaries in the sub-module case. If git subtree can
support this usage model, would read-only git subtrees for the modules meet
the requirements for those who want to use "modules" individually? Note that
only the upstream subtree repos are read-only, various other groups could
still have internal read/write repos cloned from those, it's just that
changes pushed upstream would need to go through the master git tree.
If this is a generally acceptable plan, then I guess a next step is to
verify that git subtree works as desired.
Roy
(replying to myself)

I did some quick experiments with git subtree split, and the results of
repeated "subtree split" commands are suitable for pushing to a single
branch on a remote repository. This can be used to create read-only
"module" repositories that only contain a single directory. Each "subtree
split" command starts from the beginning of the repository history (it is
not incremental), but the results of later iterations consist of new
commits being appended to the existing, stable history. One downside is
that the subtrees have different commit ids than the master repo (but they
are stable within the subtree, even when regenerated).

I think this is a viable strategy. There is also no reason that the
splitting of the main repository into subtrees needs to be done publicly.
The subtrees of interest can be made by the organizations that are
interested in them. Everybody will get the same results from
"git subtree split --prefix=ArmPkg -b subtree-ArmPkg". I think this makes
more sense than submodules, since here we are really trying to break up a
single project into tiny pieces, rather than pull together a collection of
separate projects.
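
A sketch of that experiment, assuming git subtree is installed. The bare
repository name ArmPkg-mirror.git and the file contents are made up; the
split command and the subtree-ArmPkg branch name are the ones quoted above.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q --bare ArmPkg-mirror.git     # hypothetical read-only mirror
git init -q edk2                         # stand-in for the master repo
cd edk2
mkdir ArmPkg MdePkg
echo one > ArmPkg/a.txt
echo one > MdePkg/m.txt
git add .
git commit -q -m "Add ArmPkg and MdePkg"

# First split: publish the synthetic history on the mirror.
git subtree split --prefix=ArmPkg -b subtree-ArmPkg >/dev/null 2>&1
git push -q ../ArmPkg-mirror.git subtree-ArmPkg:master

# Later: new work lands in the master repo, then the split is regenerated.
echo two >> ArmPkg/a.txt
git commit -q -am "ArmPkg: second change"
git subtree split --prefix=ArmPkg -b subtree-ArmPkg >/dev/null 2>&1

# No --force: the regenerated history extends what was pushed before.
git push -q ../ArmPkg-mirror.git subtree-ArmPkg:master

mirror_head=$(git --git-dir=../ArmPkg-mirror.git rev-parse master)
mirror_count=$(git --git-dir=../ArmPkg-mirror.git rev-list --count master)
echo "mirror head:   $mirror_head ($mirror_count commits)"
echo "master head:   $(git rev-parse HEAD)"
```

The second push succeeding without --force is the point: the mirror only
ever moves forward, even though every split starts from the beginning of
the history.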

Comments???

Thanks,
Roy
------------------------------------------------------------------------------
Gao, Liming
2015-06-05 16:59:16 UTC
Permalink
Thanks for all your comments. They give me a better understanding of Git.

I want to clarify my usage model. My daily work is based on an internal project, where I develop features for both external and internal packages. So I expect to have one git repo (one Git URL) from which I get all required packages (internal and external), and then use it to pull updates, create patches, review patches, and push my changes for both internal and external packages. I also hope I can take advantage of Git's features for both EDKII and my internal project. But I find no obvious way to support this usage. Submodules are a little complex. The Repo tool is not easy to use on Windows. And even if we use the Repo tool, we still need to separate each internal package into its own repo, then combine them one by one into EDKII as my internal project. git filter-branch can split a package from the EDKII repo, but this way needs a script to update the code, and I am not sure whether I can push my changes on top of a filter-branch result. After comparing them, I think submodules are the simplest way to support my usage model. So I propose to separate each package into its own repo, keep the package repos as the upstream repos, and make the EDKII repo read-only. If the package repos are read-only and the EDKII repo is the upstream repo, it will bring a little burden for me, but it is also an acceptable solution.

Last, I don't understand why Git does not smoothly support my usage model. Is it because it was designed just for the Linux project?

Thanks
Liming
-----Original Message-----
From: Roy Franz [mailto:***@linaro.org]
Sent: Friday, June 5, 2015 12:34 AM
To: edk2-***@lists.sourceforge.net
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using
git subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember
what I thought about it.... It works, but it tends to produce a
confusing git log, IIRC. And if you're going to push to the subtrees,
you should be careful to limit each commit to files in a single
(sub)tree. That requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should
be perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
--
Brian J. Johnson
If git subtree can be used to create and maintain read-only "modules" based on the single master git repo, this sounds like a good solution. This keeps the single git repo as the master, and avoids the complications of commits that cross module boundaries in the sub-module case. If git subtree can support this usage model, would read-only git subtrees for the modules meet the requirements for those who want to use "modules" individually? Note that only the upstream subtree repos are read-only, various other groups could still have internal read/write repos cloned from those, it's just that changes pushed upstream would need to go through the master git tree.
If this is a generally acceptable plan, then I guess a next step is to verify that git subtree works as desired.

Roy
------------------------------------------------------------------------------
Roy Franz
2015-06-05 17:47:00 UTC
Permalink
Post by Gao, Liming
Thanks for your all comments. Those gives me more concept of Git.
I want to clarify my usage model. My daily work bases on internal project to develop features for external and internal packages both. So, I expect I have one git repo (one GIT URL) to get all required packages (internal and external), then base on it to pull the update, create patch, review patch, and push my changes for internal and external. I also hope I can use Git advantage usage for EDKII and my internal project. But, I find no obvious way to support this usage. Submodule is a little complex. Repo tool is not easy to be used in Windows. And, even if we use Repo tool, we still need to separate internal Package as single Repo, then combine them one by one into EDKII as my internal project. git filter-branch can split Package from EDKII Repo. But, this ways need script to update code. And, I am not sure whether I can base on filter-branch to push my changes. After compare them, I think submodule is the simplest way to support my usage model. So, I propose to separate Package as Repo, keep Package Repo as upstream Repo and EDKII Repo as read only. If Package Repo is read only, EDKII Repo is upstream Repo, it will bring a little burden for me. But, it is also an acceptable solution.
Hi Liming,
I don't understand what you are proposing - can you please explain
it in more detail?
Post by Gao, Liming
Last, I don't understand why GIT not smoothly supports my usage model, because it is just designed for Linux project?
Yup - that is the only use case the original design considered. It
has grown some more features as other projects adopted it, but for
some use cases subversion is much more usable.

Roy
------------------------------------------------------------------------------
Ard Biesheuvel
2015-06-08 09:08:35 UTC
Permalink
Post by Gao, Liming
Thanks for your all comments. Those gives me more concept of Git.
I want to clarify my usage model. My daily work bases on internal project to develop features for external and internal packages both. So, I expect I have one git repo (one GIT URL) to get all required packages (internal and external), then base on it to pull the update, create patch, review patch, and push my changes for internal and external. I also hope I can use Git advantage usage for EDKII and my internal project. But, I find no obvious way to support this usage. Submodule is a little complex. Repo tool is not easy to be used in Windows. And, even if we use Repo tool, we still need to separate internal Package as single Repo, then combine them one by one into EDKII as my internal project. git filter-branch can split Package from EDKII Repo. But, this ways need script to update code. And, I am not sure whether I can base on filter-branch to push my changes. After compare them, I think submodule is the simplest way to support my usage model. So, I propose to separate Package as Repo, keep Package Repo as upstream Repo and EDKII Repo as read only. If Package Repo is read only, EDKII Repo is upstream Repo, it will bring a little burden for me. But, it is also an acceptable solution.
Git submodules may be the simplest way to support /your/ particular
use case, but I think the pushback in this thread against it comes
mostly from people who have actually used git submodules, so I suggest
we take those comments seriously.

It seems that you are already in the situation where you need to push
your changes to multiple repositories at the same time, and those
repositories are not tightly coupled. I don't think this applies to
many of us, so it doesn't seem fair to me to force others to adopt
that mode of development while not strictly necessary.

I think the subtree approach is reasonable, where we have a single
read-write core EDK2 repo (probably just the existing one at GitHub)
and use some automation to keep a collection of subtree mirrors in
sync. As Roy has confirmed, git subtree is repeatable, i.e., it
produces the exact same commit IDs when invoked several times, so the
subtree repositories would be stable as well.

For submitting your patches, note that git format-patch supports
src-prefix and dst-prefix options, which perhaps we may use to convert
your subtree patches to core patches? I haven't tried it myself but it
looks promising.
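
A sketch of that idea, under the assumption that the subtree repository
holds the ArmPkg contents at its root. Both repositories here are
hand-made stand-ins with made-up file names, and the patch is applied with
plain "git am", whose default -p1 strips only the leading "a/" or "b/".

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the core repo: the package lives under ArmPkg/.
git init -q edk2
cd edk2
mkdir ArmPkg
echo one > ArmPkg/a.txt
git add .
git commit -q -m "Add ArmPkg"
cd ..

# Stand-in for the subtree repo: the same file sits at the root.
git init -q ArmPkg-subtree
cd ArmPkg-subtree
echo one > a.txt
git add .
git commit -q -m "Add ArmPkg"
echo two >> a.txt
git commit -q -am "ArmPkg: append a line"

# Emit the patch with the package directory folded into the path prefixes,
# so the diff headers read a/ArmPkg/a.txt and b/ArmPkg/a.txt.
git format-patch -q -1 --src-prefix=a/ArmPkg/ --dst-prefix=b/ArmPkg/ \
    -o "$work/patches"

# Apply it to the core repo as an ordinary patch.
cd ../edk2
git am -q "$work"/patches/*.patch
git log -1 --format='%s'
```

"git am --directory=ArmPkg" on an unmodified patch should reach the same
result from the receiving side.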
Post by Gao, Liming
Last, I don't understand why GIT not smoothly supports my usage model, because it is just designed for Linux project?
Git was obviously not designed for maintaining a mix of open and
closed source software. Mind you, I don't think there is anything
wrong with that, it just wasn't on any of the Git developers' radar.
--
Ard.
------------------------------------------------------------------------------
Olivier Martin
2015-06-16 17:39:49 UTC
Permalink
Most EDK2 users use an MS Windows host machine and GUI tools. I am not sure that 'git subtree', which is not part of the default git tool, fits the EDK2 community's requirements.

Having a main/unique EDK2 repository is not incompatible with the inclusion of third-party/private components in your development environment.
In my working tree, I have:
- EDK2 as a git repository
- SctPkg as a git repository
- Some private platforms as separate git repositories
If you really want to get a set of EDK2 and third-party/private components, nothing prevents you from creating a branch based on 'master' that adds your external components as git submodules.
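
A sketch of such a branch, with made-up names: PrivatePlatformPkg stands
in for a private component and my-platform for the local branch. The
protocol.file.allow setting is only there because recent git versions
refuse local-path submodule clones by default; a real URL would not need it.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical private component kept in its own repository.
git init -q PrivatePlatformPkg
cd PrivatePlatformPkg
echo dsc > Platform.dsc
git add .
git commit -q -m "Add platform description"
cd ..

# Stand-in for the main EDK2 repository.
git init -q edk2
cd edk2
git symbolic-ref HEAD refs/heads/master
mkdir MdePkg
echo one > MdePkg/m.txt
git add .
git commit -q -m "Add MdePkg"

# Local branch on top of master that pulls in the private component.
git checkout -q -b my-platform master
git -c protocol.file.allow=always submodule --quiet add \
    "$work/PrivatePlatformPkg" PrivatePlatformPkg
git commit -q -m "Add PrivatePlatformPkg as a submodule"

git submodule status
```

master itself stays free of submodules, so updates from upstream can still
be merged or rebased into my-platform in the usual way.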

-----Original Message-----
From: Ard Biesheuvel [mailto:***@linaro.org]
Sent: 08 June 2015 10:09
To: edk2-***@lists.sourceforge.net
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Gao, Liming
Thanks for your all comments. Those gives me more concept of Git.
I want to clarify my usage model. My daily work bases on internal project to develop features for external and internal packages both. So, I expect I have one git repo (one GIT URL) to get all required packages (internal and external), then base on it to pull the update, create patch, review patch, and push my changes for internal and external. I also hope I can use Git advantage usage for EDKII and my internal project. But, I find no obvious way to support this usage. Submodule is a little complex. Repo tool is not easy to be used in Windows. And, even if we use Repo tool, we still need to separate internal Package as single Repo, then combine them one by one into EDKII as my internal project. git filter-branch can split Package from EDKII Repo. But, this ways need script to update code. And, I am not sure whether I can base on filter-branch to push my changes. After compare them, I think submodule is the simplest way to support my usage model. So, I propose to separate Package as Repo, keep Package Repo as upstream Repo and EDKII Repo as read only. If Package Repo is read only, EDKII Repo is upstream Repo, it will bring a little burden for me. But, it is also an acceptable solution.
Git submodules may be the simplest way to support /your/ particular use case, but I think the pushback in this thread against it comes mostly from people who have actually used git submodules, so I suggest we take those comments seriously.

It seems that you are already in the situation where you need to push your changes to multiple repositories at the same time, and those repositories are not tightly coupled. I don't think this applies to many of us, so it doesn't seem fair to me to force others to adopt that mode of development while not strictly necessary.

I think the subtree approach is reasonable, where we have a single read-write core EDK2 repo (probably just the existing one at GitHub) and use some automation to keep a collection of subtree mirrors in sync. As Roy has confirmed, git subtree is repeatable, i.e., it produces the exact same commit IDs when invoked several times, so the subtree repositories would be stable as well.

For submitting your patches, note that git format-patch supports src-prefix and dst-prefix options, which perhaps we may use to convert your subtree patches to core patches? I haven't tried it myself but it looks promising.
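
The conversion hinted at here amounts to adding the package directory in front of every path in the patch. As a rough, untested-against-EDK2 sketch in Python: the function name and the sample patch are hypothetical, and it only handles plain text diffs with the default a/ and b/ prefixes (no renames, no paths containing spaces). With git itself, passing --src-prefix=a/MdePkg/ and --dst-prefix=b/MdePkg/ to git format-patch, or applying with git am --directory=MdePkg, may achieve the same effect.

```python
def reprefix_patch(patch_text: str, package: str) -> str:
    """Prepend `package/` to the file paths of a unified diff."""
    out = []
    in_header = False  # True between a "diff --git" line and the first hunk
    for line in patch_text.splitlines():
        if line.startswith("diff --git a/"):
            in_header = True
            # "diff --git a/<path> b/<path>"; paths with spaces are not handled.
            _, _, old, new = line.split(" ", 3)
            line = f"diff --git a/{package}/{old[2:]} b/{package}/{new[2:]}"
        elif line.startswith("@@"):
            in_header = False  # hunk content follows; leave it untouched
        elif in_header and line.startswith("--- a/"):
            line = f"--- a/{package}/{line[6:]}"
        elif in_header and line.startswith("+++ b/"):
            line = f"+++ b/{package}/{line[6:]}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical patch made against a package mirror (paths relative to MdePkg).
SAMPLE = """\
diff --git a/Include/Base.h b/Include/Base.h
--- a/Include/Base.h
+++ b/Include/Base.h
@@ -1,3 +1,3 @@
-old line
+new line
 context
--- a/looks-like-a-header
"""

converted = reprefix_patch(SAMPLE, "MdePkg")
print(converted)
```

The header/hunk distinction matters because a removed line whose text starts with "-- a/" looks exactly like a file header inside a hunk; the sample's last line exercises that case.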
Post by Gao, Liming
Lastly, I don't understand why Git does not smoothly support my usage model. Is it because it was designed just for the Linux project?
Git was obviously not designed for maintaining a mix of open and closed source software. Mind you, I don't think there is anything wrong with that, it just wasn't on any of the Git developers' radar.
--
Ard.
Post by Gao, Liming
-----Original Message-----
Sent: Friday, June 5, 2015 12:34 AM
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using
git subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
"
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember
what I thought about it.... It works, but it tends to produce a
confusing git log, IIRC. And if you're going to push to the subtrees,
you should be careful to limit each commit to files in a single
(sub)tree. That requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should
be perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
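
The "limit each commit to a single (sub)tree" rule could be enforced by a pre-commit hook along these lines. This is only a sketch in Python, assuming packages are the top-level directories of the repo; the function names are hypothetical, and in a real hook the path list would presumably come from `git diff --cached --name-only` rather than being passed in by hand.

```python
def packages_touched(paths):
    """Return the set of top-level directories (packages) the paths fall in.

    Files in the repository root (no "/" in the path) are ignored here.
    """
    return {p.split("/", 1)[0] for p in paths if "/" in p}


def check_single_package(paths):
    """Return an error string if the paths span several packages, else None."""
    pkgs = sorted(packages_touched(paths))
    if len(pkgs) > 1:
        return "commit touches multiple packages: " + ", ".join(pkgs)
    return None


# Hypothetical staged file lists.
ok = check_single_package(["MdePkg/Include/Base.h", "MdePkg/MdePkg.dec"])
bad = check_single_package(["MdePkg/Include/Base.h", "MdeModulePkg/Core/Dxe/DxeMain.c"])
print(ok)   # None
print(bad)  # commit touches multiple packages: MdeModulePkg, MdePkg
```

A hook built on this would exit non-zero whenever an error string is returned, which makes git refuse the commit.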
--
Brian J. Johnson
If git subtree can be used to create and maintain read-only "modules" based on the single master git repo, this sounds like a good solution. This keeps the single git repo as the master, and avoids the complications of commits that cross module boundaries in the sub-module case. If git subtree can support this usage model, would read-only git subtrees for the modules meet the requirements for those who want to use "modules" individually? Note that only the upstream subtree repos are read-only, various other groups could still have internal read/write repos cloned from those, it's just that changes pushed upstream would need to go through the master git tree.
If this is a generally acceptable plan, then I guess a next step is to verify that git subtree works as desired.
Roy
Post by Brian J. Johnson
--------------------------------------------------------------------
My statements are my own, are not authorized by SGI, and do not
necessarily represent SGI’s positions.
------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
https://lists.sourceforge.net/lists/listinfo/edk2-devel

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
------------------------------------------------------------------------------
Ard Biesheuvel
2015-06-16 17:53:04 UTC
Permalink
Post by Olivier Martin
Most EDK2 users use an MS Windows host machine and GUI tools. I am not sure that 'git subtree', which is not part of the default git tool, fits the EDK2 community's requirements.
That may be true, but the suggestion is not for every user to use git
subtree individually, but to maintain a number of subtree mirrors that
are kept in sync by automation. That way, you can compose your own
workspace with various EDK2 packages as before. The only unsolved
issue is how to convert your patches against those subtrees into
patches that can be applied to the core upstream version.
Post by Olivier Martin
Having a main/unique EDK2 repository is not incompatible with the inclusion of third-party/private components in your development environment.
- EDK2 as a git repository
- SctPkg as a git repository
- Some private platforms as separate git repository
If you really want a single set of EDK2 and third-party/private components, nothing prevents you from creating a branch based on 'master' that adds your external components as git submodules.
Perhaps Liming can comment on whether or not his use case is comparable?
------------------------------------------------------------------------------
Peterson, Joe
2015-06-17 01:10:11 UTC
Permalink
Hello all,

There are a lot of conflicting ideas on how we should lay out our EDK2 repo using Git: submodules, subtrees, soft links, etc. Some of the suggestions won't work for various reasons; for example, repo isn't supported on Windows, which is very common among EDK2 users.

What are everyone's thoughts on just staying with SVN, with our mirror on GitHub for people who want to use git locally (they can check in using git svn)? I know there was a lot of support and several requests to move to git, so I would like to understand a little more about why people feel Subversion isn't working out for us.

Thanks,
-JEEP
Joe Peterson
------------------------------------------------------------------------------
Laszlo Ersek
2015-06-17 09:38:39 UTC
Permalink
Post by Peterson, Joe
Hello all,
There are a lot of conflicting ideas on how we should lay out our
EDK2 repo using Git. We have Submodules, Subtrees, using soft links,
etc. Some of the suggestions won't work for various reasons- ex. repo
isn't supported on Windows, which is very common amongst EDK2 users.
What are everyone's thoughts on just staying with SVN with our mirror
on github for people who want to use git locally (and they can check
in using git svn)
I prefer to stay with the current setup (ie. SVN is the primary, the git
mirror is r/o) over switching to a split-up git environment. (I believe
the same has been voiced by others -- Ard maybe?)

Nonetheless, patches should be formatted, documented (ie. in commit
messages), and posted to edk2-devel with git tools, in the git way. That
is to say, I don't mind the "SVN backend as primary" going forward, but
user interaction & public workflow should be centered on git 100%. No
more patches formatted with "svn" (the utility), no more patches in
attachments, and no more direct commits made with "svn" (the utility),
please.

This means that the

people who use git locally (including committing with
"git svn dcommit")

must be the *exact* same set as the

people who post patches to edk2-devel

How stuff gets pulled into Intel's internal builds via SVN externals
should not affect the edk2-devel workflow; the latter should migrate to
git entirely.
Post by Peterson, Joe
I know there was a lot of support and several
requests to move to git, so I would like to understand a little more
about why people feel subversion isn't working out for us.
There are two (groups of) issues with Subversion.

- The first is that its centralized model / server is not suitable for
distributed development between independent parties.

This limitation can be mostly worked around by using git-svn for
interfacing with the server, and doing the rest of the development with
git exclusively (ie. structuring / rebasing patch series locally,
formatting them, sending them, applying or fetching them for testing and
review, and so on).

- The second problem is that the svn toolset (what developers use) is
much inferior to git's. Since an SVN clone does not have full local
history (independently of every other SVN clone in the world), tools
that are crucial for development, like "git rebase", simply don't exist
with "svn". This has a very bad effect on the patches that get posted to
edk2-devel by direct users of "svn".

In order to lay out a feature as a patchset, in small steps that are
hopefully easy to understand for reviewers, I usually rebase a patchset,
locally, several tens of times, before posting it the very first time.
This entire *mindset* is nonexistent with "svn" users (because it is
impossible to implement with "svn"). This has extremely negative
consequences for the patches that get posted and committed.
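
For readers who have not used git-svn, the workflow described above can be sketched as a short command plan. This is a minimal illustration in Python that only builds and prints the commands; it does not run them. The upstream ref name `git-svn`, the output directory, and the list address are assumptions made for the example, not project settings.

```python
import shlex


def patch_series_commands(upstream="git-svn", outdir="outgoing", to="list@example.org"):
    """Return the git commands for one round of the git-svn based workflow.

    All three parameters are hypothetical placeholders.
    """
    return [
        # Fetch new SVN revisions and replay local commits on top of them.
        ["git", "svn", "rebase"],
        # Restructure the local series (reorder, squash, split, reword).
        ["git", "rebase", "-i", upstream],
        # Format one patch per commit, plus a cover letter, as plain text.
        ["git", "format-patch", "--cover-letter", "-o", outdir, f"{upstream}..HEAD"],
        # Post the series inline to the list (no attachments).
        ["git", "send-email", f"--to={to}", outdir],
        # After review, a maintainer with commit access pushes to SVN.
        ["git", "svn", "dcommit"],
    ]


commands = patch_series_commands()
for argv in commands:
    print(shlex.join(argv))
```

The second and third steps are typically repeated many times before anything is posted; only the last step touches the SVN server.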

One cannot really appreciate the difference that git (the toolset) makes
until one learns to use it. (It was the same for me, before I learned
git, there's no doubt about it.)

Summary:

- the svn toolset dictates horrible development practices; we need to
get rid of it universally, for all aspects of public development

- the SVN repository as "primary commit store" works acceptably (with
the git-svn utility)

Thanks
Laszlo

------------------------------------------------------------------------------
Ard Biesheuvel
2015-06-17 10:23:54 UTC
Permalink
Post by Laszlo Ersek
Post by Peterson, Joe
Hello all,
There are a lot of conflicting ideas on how we should lay out our
EDK2 repo using Git. We have Submodules, Subtrees, using soft links,
etc. Some of the suggestions won't work for various reasons- ex. repo
isn't supported on Windows, which is very common amongst EDK2 users.
What are everyone's thoughts on just staying with SVN with our mirror
on github for people who want to use git locally (and they can check
in using git svn)
I prefer to stay with the current setup (ie. SVN is the primary, the git
mirror is r/o) over switching to a split-up git environment. (I believe
the same has been voiced by others -- Ard maybe?)
Yes, I agree with that, although a monolithic Git repo is still
strongly my #0 preference.

[...]
Post by Laszlo Ersek
- the svn toolset dictates horrible development practices; we need to
get rid of it universally, for all aspects of public development
- the SVN repository as "primary commit store" works acceptably (with
the git-svn utility)
Agreed. And I think moving to Git and expecting those bad practices to
vanish spontaneously is a bit naive. The workflows and mailing list
interactions need to improve first, possibly supported by more
developers moving to git-svn locally. Once we achieve that, migrating
the upstream repo is a piece of cake.
--
Ard.

------------------------------------------------------------------------------
Roy Franz
2015-06-17 18:23:29 UTC
Permalink
Post by Laszlo Ersek
Post by Peterson, Joe
Hello all,
There are a lot of conflicting ideas on how we should lay out our
EDK2 repo using Git. We have Submodules, Subtrees, using soft links,
etc. Some of the suggestions won't work for various reasons- ex. repo
isn't supported on Windows, which is very common amongst EDK2 users.
What are everyone's thoughts on just staying with SVN with our mirror
on github for people who want to use git locally (and they can check
in using git svn)
I prefer to stay with the current setup (ie. SVN is the primary, the git
mirror is r/o) over switching to a split-up git environment. (I believe
the same has been voiced by others -- Ard maybe?)
Yes, this would be my strong preference as well. Either we move to a single
git repo as the primary repository, or we keep svn. Moving to git with anything
other than a single git repo as the primary repository is a big step backwards.
Post by Laszlo Ersek
Nonetheless, patches should be formatted, documented (ie. in commit
messages), and posted to edk2-devel with git tools, in the git way. That
is to say, I don't mind the "SVN backend as primary" going forward, but
user interaction & public workflow should be centered on git 100%. No
more patches formatted with "svn" (the utility), no more patches in
attachments, and no more direct commits made with "svn" (the utility),
please.
This means that the
people who use git locally (including committing with
"git svn dcommit")
must be the *exact* same set as the
people who post patches to edk2-devel
How stuff gets pulled into Intel's internal builds via SVN externals
should not affect the edk2-devel workflow; the latter should migrate to
git entirely.
Agreed.

Now that I understand how the SVN edk2 repository is used internally at
Intel (and maybe other places), I actually think that the SVN repository
as master, with an officially supported git mirror, is the best technical
solution for the diverse requirement set that the various users of the
repository have. Git just doesn't support the company internal use cases
as well as Subversion does.

The only people who need to care that SVN is the master are those with
commit access - the maintainers. I would expect that all community
development will happen based on the git repository, and I think this
should be recommended by the project. The usage of SVN as the primary
repository becomes an implementation detail to the community at large.
Post by Laszlo Ersek
Post by Peterson, Joe
I know there was a lot of support and several
requests to move to git, so I would like to understand a little more
about why people feel subversion isn't working out for us.
There are two (groups of) issues with Subversion.
- The first is that its centralized model / server is not suitable for
distributed development between independent parties.
This limitation can be mostly worked around by using git-svn for
interfacing with the server, and doing the rest of the development with
git exclusively (ie. structuring / rebasing patch series locally,
formatting them, sending them, applying or fetching them for testing and
review, and so on).
- The second problem is that the svn toolset (what developers use) is
much inferior to git's. Since an SVN clone does not have full local
history (independently of every other SVN clone in the world), tools
that are crucial for development, like "git rebase", simply don't exist
with "svn". This has a very bad effect on the patches that get posted to
edk2-devel by direct users of "svn".
In order to lay out a feature as a patchset, in small steps that are
hopefully easy to understand for reviewers, I usually rebase a patchset,
locally, several tens of times, before posting it the very first time.
This entire *mindset* is nonexistent with "svn" users (because it is
impossible to implement with "svn"). This has extremely negative
consequences for the patches that get posted and committed.
I also rebase patches many times before they are suitable
for submission, and more times based on feedback after submission. I can't
imagine doing this with Subversion.
------------------------------------------------------------------------------
Bruce Cran
2015-06-17 17:13:16 UTC
Permalink
Post by Peterson, Joe
I know there was a lot of support and several requests to move to git, so I would like to understand a little more about why people feel subversion isn't working out for us.
If we do stay with subversion, I think we should consider moving away
from sourceforge entirely, given recent news such as
http://seclists.org/nmap-dev/2015/q2/194 ("Sourceforge hijacks the
Nmap sourceforge account").
--
Bruce

------------------------------------------------------------------------------
Gao, Liming
2015-06-17 12:21:12 UTC
Permalink
Ard:
The subtree model requires the mirrors to be kept in sync by automation. And a subtree repo is still not the upstream repo: the developer has to move his change to the upstream repo and push it there.

Yes, we can have multiple repos as you do: EDKII, SctPkg, InternalPkgs. But the developer needs to combine them manually. For example, to build SctPkg, you need the EDKII packages. If you use submodules to link EDKII, you require a single-package repo. I am thinking about another solution: could I merge the EDKII repo and the SctPkg repo locally? If so, I would have the full code base.

Thanks
Liming
-----Original Message-----
From: Ard Biesheuvel [mailto:***@linaro.org]
Sent: Wednesday, June 17, 2015 1:53 AM
To: edk2-***@lists.sourceforge.net; Gao, Liming
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Olivier Martin
Most EDK2 users uses MS Windows host machine & GUI tool. I am not sure 'git subtree' that is not part of the default git tool fits with the EDK2 community requirements.
That may be true, but the suggestion is not for every user to use git subtree individually, but to maintain a number of subtree mirrors that are kept in sync by automation. That way, you can compose your own workspace with various EDK2 packages as before. The only unsolved issue is how to convert your patches against those subtrees into patches that can be applied to the core upstream version.
Post by Olivier Martin
Having a main/unique EDK2 repository is not incompatible with the inclusion of third-party/private components in your development environment.
- EDK2 as a git repository
- SctPkg as a git repository
- Some private platforms as separate git repository If really you want
to get a set of EDK2 and third-party/private components nothing prevent you to create a branch based on 'master' that would add your external components as git submodules.
Perhaps Liming can comment on whether or not his use case is comparable?
Post by Olivier Martin
-----Original Message-----
Sent: 08 June 2015 10:09
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Gao, Liming
Thanks for your all comments. Those gives me more concept of Git.
I want to clarify my usage model. My daily work bases on internal project to develop features for external and internal packages both. So, I expect I have one git repo (one GIT URL) to get all required packages (internal and external), then base on it to pull the update, create patch, review patch, and push my changes for internal and external. I also hope I can use Git advantage usage for EDKII and my internal project. But, I find no obvious way to support this usage. Submodule is a little complex. Repo tool is not easy to be used in Windows. And, even if we use Repo tool, we still need to separate internal Package as single Repo, then combine them one by one into EDKII as my internal project. git filter-branch can split Package from EDKII Repo. But, this ways need script to update code. And, I am not sure whether I can base on filter-branch to push my changes. After compare them, I think submodule is the simplest way to support my usage model. So, I propose to separate Package as Repo, keep Package Repo as upstream Repo and EDKII Repo as read only. If Package Repo is read only, EDKII Repo is upstream Repo, it will bring a little burden for me. But, it is also an acceptable solution.
Git submodules may be the simplest way to support /your/ particular use case, but I think the pushback in this thread against it comes mostly from people who have actually used git submodules, so I suggest we take those comments seriously.
It seems that you are already in the situation where you need to push your changes to multiple repositories at the same time, and those repositories are not tightly coupled. I don't think this applies to many of us, so it doesn't seem fair to me to force others to adopt that mode of development while not strictly necessary.
I think the subtree approach is reasonable, where we have a single read-write core EDK2 repo (probably just the existing one at GitHub) and use some automation to keep a collection of subtree mirrors in sync. As Roy has confirmed, git subtree is repeatable, i.e., it produces the exact same commit IDs when invoked several times, so the subtree repositories would be stable as well.
For submitting your patches, note that git format-patch supports src-prefix and dst-prefix options, which perhaps we may use to convert your subtree patches to core patches? I haven't tried it myself but it looks promising.
Post by Gao, Liming
Last, I don't understand why GIT not smoothly supports my usage model, because it is just designed for Linux project?
Git was obviously not designed for maintaining a mix of open and closed source software. Mind you, I don't think there is anything wrong with that, it just wasn't on any of the Git developers' radar.
--
Ard.
Post by Gao, Liming
-----Original Message-----
Sent: Friday, June 5, 2015 12:34 AM
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using
git subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember
what I thought about it.... It works, but it tends to produce a
confusing git log, IIRC. And if you're going to push to the
subtrees, you should be careful to limit each commit to files in a
single (sub)tree. That requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should
be perfect. Note that in that mode, it's very similar (or
completely
equivalent?) to "git filter-branch --subdirectory-filter".
--
Brian J. Johnson
If git subtree can be used to create and maintain read-only "modules" based on the single master git repo, this sounds like a good solution. This keeps the single git repo as the master, and avoids the complications of commits that cross module boundaries in the sub-module case. If git subtree can support this usage model, would read-only git subtrees for the modules meet the requirements for those who want to use "modules" individually? Note that only the upstream subtree repos are read-only, various other groups could still have internal read/write repos cloned from those, it's just that changes pushed upstream would need to go through the master git tree.
If this is a generally acceptable plan, then I guess a next step is to verify that git subtree works as desired.
Roy
Post by Brian J. Johnson
--------------------------------------------------------------------
My statements are my own, are not authorized by SGI, and do not
necessarily represent SGI’s positions.
--------------------------------------------------------------------
--
-------- _______________________________________________
edk2-devel mailing list
https://lists.sourceforge.net/lists/listinfo/edk2-devel
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
Registered in England & Wales, Company No: 2557590 ARM Holdings plc,
Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in
England & Wales, Company No: 2548782
------------------------------------------------------------------------------
Ard Biesheuvel
2015-06-17 12:23:37 UTC
Permalink
Post by Gao, Liming
The subtree model requires automation to keep the mirrors in sync. And the subtree repo is still not the upstream repo: a developer has to move his change to the upstream repo and push it there.
Post by Olivier Martin
For submitting your patches, note that git format-patch supports src-prefix and dst-prefix options, which perhaps we may use to convert your subtree patches to core patches? I haven't tried it myself but it looks promising.
Yes. We can have multiple repos as you do: EDKII, SctPkg, InternalPkgs. The developer needs to combine them manually. For example, to build SctPkg, you need the EDKII packages. If you use a submodule to link EDKII, you require a single-package repo. I am thinking about another solution: could I merge the EDKII repo and the SctPkg repo locally? If so, I would have the full code base.
-----Original Message-----
Sent: Wednesday, June 17, 2015 1:53 AM
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Olivier Martin
Most EDK2 users use an MS Windows host machine and GUI tools. I am not sure that 'git subtree', which is not part of the default git tool, fits the EDK2 community's requirements.
That may be true, but the suggestion is not for every user to use git subtree individually, but to maintain a number of subtree mirrors that are kept in sync by automation. That way, you can compose your own workspace with various EDK2 packages as before. The only unsolved issue is how to convert your patches against those subtrees into patches that can be applied to the core upstream version.
Post by Olivier Martin
Having a main/unique EDK2 repository is not incompatible with the inclusion of third-party/private components in your development environment.
- EDK2 as a git repository
- SctPkg as a git repository
- Some private platforms as separate git repositories
If you really want to get a set of EDK2 and third-party/private components, nothing prevents you from creating a branch based on 'master' that adds your external components as git submodules.
Perhaps Liming can comment on whether or not his use case is comparable?
Post by Olivier Martin
-----Original Message-----
Sent: 08 June 2015 10:09
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Gao, Liming
Thanks for all your comments. They give me a better understanding of Git.
I want to clarify my usage model. My daily work is based on an internal project, in which I develop features for both external and internal packages. So I expect to have one git repo (one Git URL) from which I get all required packages (internal and external), and then work from it to pull updates, create patches, review patches, and push my changes for both internal and external packages. I also hope I can take advantage of Git's features for EDKII and my internal project. But I find no obvious way to support this usage. Submodules are a little complex. The Repo tool is not easy to use on Windows. And even if we use the Repo tool, we still need to separate each internal package into its own repo, then combine them one by one with EDKII to form my internal project. git filter-branch can split a package from the EDKII repo, but this way needs a script to update the code, and I am not sure whether I can push my changes based on filter-branch. After comparing them, I think submodules are the simplest way to support my usage model. So I propose to separate each package into its own repo, keeping the package repos as the upstream repos and the EDKII repo as read-only. If instead the package repos are read-only and the EDKII repo is the upstream repo, it will bring a little burden for me, but it is also an acceptable solution.
Git submodules may be the simplest way to support /your/ particular use case, but I think the pushback in this thread against it comes mostly from people who have actually used git submodules, so I suggest we take those comments seriously.
It seems that you are already in the situation where you need to push your changes to multiple repositories at the same time, and those repositories are not tightly coupled. I don't think this applies to many of us, so it doesn't seem fair to me to force others to adopt that mode of development while not strictly necessary.
I think the subtree approach is reasonable, where we have a single read-write core EDK2 repo (probably just the existing one at GitHub) and use some automation to keep a collection of subtree mirrors in sync. As Roy has confirmed, git subtree is repeatable, i.e., it produces the exact same commit IDs when invoked several times, so the subtree repositories would be stable as well.
For submitting your patches, note that git format-patch supports src-prefix and dst-prefix options, which perhaps we may use to convert your subtree patches to core patches? I haven't tried it myself but it looks promising.
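As a rough illustration of the prefix idea above (as untested against the real repos as the suggestion itself), the sketch below only builds the git format-patch command line for a series made in a package mirror. The package name, revision range and output directory are placeholders chosen for the example; --src-prefix, --dst-prefix and -o are existing format-patch options.

```python
def format_patch_command(package, rev_range, out_dir="patches"):
    """Build a 'git format-patch' command whose diff paths carry the package prefix.

    package   -- directory name of the package in the core repo (placeholder)
    rev_range -- commits to export from the package mirror (placeholder)
    out_dir   -- directory that receives the generated patch files
    """
    return [
        "git", "format-patch",
        f"--src-prefix=a/{package}/",  # old-file side of each diff header
        f"--dst-prefix=b/{package}/",  # new-file side of each diff header
        "-o", out_dir,
        rev_range,
    ]


# Example with assumed values: export local commits made in an MdePkg mirror.
print(" ".join(format_patch_command("MdePkg", "origin/master..HEAD")))
```

With these prefixes the diff headers should read a/MdePkg/... and b/MdePkg/..., which matches the layout of the core repo, so the series could then be tried there with git am.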
Post by Gao, Liming
Last, I don't understand why Git does not smoothly support my usage model. Is it because it was designed just for the Linux project?
Git was obviously not designed for maintaining a mix of open and closed source software. Mind you, I don't think there is anything wrong with that, it just wasn't on any of the Git developers' radar.
--
Ard.
Post by Gao, Liming
-----Original Message-----
Sent: Friday, June 5, 2015 12:34 AM
Subject: Re: [edk2] Proposal of Git Repo Layout for EDKII project
Post by Brian J. Johnson
Post by Jordan Justen
Yet another idea that I've considered is trying to leverage git
subtree. My idea was that the unified EDK II would remain the main
upstream.
I would setup an automated process to split each package off using
git subtree, and push the separate repos.
...
I never got the time to investigate if git subtree could work as
"
split
Extract a new, synthetic project history from the history of the
<prefix> subtree. The new history includes only the commits
(including merges) that affected <prefix>, and each of those
commits now has the contents of <prefix> at the root of the
project instead of in a subdirectory. Thus, the newly created
history is suitable for export as a separate git repository.
I experimented with git subtree a couple years ago for managing a
project composed of multiple sub-projects. I'm trying to remember
what I thought about it.... It works, but it tends to produce a
confusing git log, IIRC. And if you're going to push to the
subtrees, you should be careful to limit each commit to files in a
single (sub)tree. That requires developer discipline, or a good pre-commit hook.
But for extracting packages into separate read-only repos, it should
be perfect. Note that in that mode, it's very similar (or completely
equivalent?) to "git filter-branch --subdirectory-filter".
--
Brian J. Johnson
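The "one commit, one (sub)tree" rule mentioned above could be checked mechanically. The following is only a sketch of the core test such a pre-commit hook might run, assuming that each package is a top-level directory of the repo; the function names and file paths are illustrative. In a real hook the path list would come from "git diff --cached --name-only".

```python
from pathlib import PurePosixPath


def packages_touched(paths):
    """Return the set of top-level directories touched by repo-relative paths."""
    touched = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        # Files that live directly in the repository root are grouped together.
        touched.add(parts[0] if len(parts) > 1 else "<root>")
    return touched


def commit_is_single_package(paths):
    """True when every path belongs to the same top-level directory."""
    return len(packages_touched(paths)) <= 1


# Illustrative path lists, standing in for the staged files of a commit.
print(commit_is_single_package(["MdePkg/Include/Base.h", "MdePkg/MdePkg.dec"]))
print(commit_is_single_package(["MdePkg/Include/Base.h", "MdeModulePkg/MdeModulePkg.dec"]))
```

A hook built on this would reject the commit (exit non-zero) whenever the check returns False.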
If git subtree can be used to create and maintain read-only "modules" based on the single master git repo, this sounds like a good solution. This keeps the single git repo as the master, and avoids the complications of commits that cross module boundaries in the submodule case. If git subtree can support this usage model, would read-only git subtrees for the modules meet the requirements of those who want to use "modules" individually? Note that only the upstream subtree repos are read-only; various other groups could still have internal read/write repos cloned from those. It's just that changes pushed upstream would need to go through the master git tree.
If this is a generally acceptable plan, then I guess a next step is to verify that git subtree works as desired.
Roy
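For anyone who wants to try this, the automation could be as small as one "git subtree push" per package, run from a clone of the master repo. The sketch below only builds those command lines. The package list and the mirror URL pattern are assumptions (the pattern is modelled on the edk2-MdePkg repo name mentioned at the start of the thread), and nothing here has been run against the real repos.

```python
def subtree_push_commands(packages, url_template, branch="master"):
    """Build one 'git subtree push' command per package directory.

    'git subtree push' splits the history below --prefix and pushes the
    result to the given repository and branch.
    """
    commands = []
    for package in packages:
        commands.append([
            "git", "subtree", "push",
            f"--prefix={package}",
            url_template.format(package=package),
            branch,
        ])
    return commands


# Hypothetical mirror naming scheme; adjust to wherever the mirrors live.
MIRROR_URL = "https://github.com/tianocore/edk2-{package}.git"

for cmd in subtree_push_commands(["MdePkg", "MdeModulePkg"], MIRROR_URL):
    print(" ".join(cmd))
```

Each command list could be handed to subprocess.run(cmd, check=True) from a scheduled job or a post-receive hook on the master repo. Running the split twice and comparing the resulting commit IDs would be one way to verify that it works as desired.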
------------------------------------------------------------------------------
Gao, Liming
2015-06-17 12:32:26 UTC
Permalink
Ard:
Sorry, I have not tried this approach. I assume it works and solves the patch generation problem. Besides, we found that with subtree the history is shown for the root directory only, not for each sub-package, whereas with submodules the history is shown for each package, but there is no history in the root.

Thanks
Liming
------------------------------------------------------------------------------
Roy Franz
2015-06-03 19:21:06 UTC
Permalink
Post by Brian J. Johnson
Post by Gao, Liming
Hi, all
Now, EDKII project Git mirror is ready in GitHub (https://
<https://github.com/tianocore>github.com/tianocore
<https://github.com/tianocore>). There are EDKII project Repo and each
package Repo. After migrate EDKII from SVN to GitHub, EDKII Git Repo
will be writable, EDKII SVN project will become mirror. I expect to keep
write access in the centralized Git repo. But, EDKII project Repo (edk2)
and MdePkg Repo (edk2-MdePkg) includes the same source code. So, only
one of them will be writable. My proposal is to make Package Repo be
Read & Write, and update EDKII project to link each package by submodule
way. The benefit of this way is:
1. EDKII project is too big. After separate them, the developers can just
pull their used packages instead of full.
2. The different packages have the different owners. After separate them,
the package owner can give write access for the different developers.
3. Close source project can refer to EDKII packages. Those project can be
easily setup by git submodule.
Compared to EDKII project Repo, submodule EDKII project Repo just
includes edksetup.bat, and edksetup.sh. Some BKM of submodule is shared
here.
1. Every Git operation is took for Package Repo. Pull, Branch, Commit,
Create Patch, Fork, and Pull Request are all for Package Repo. If your
patch changes multiple packages, you need to commit and create patch per
Package.
2. git submodule foreach "command" can be used to run command on every
package, for example git submodule foreach "git pull"
I fully agree with others' reluctance to use git submodules, and the
reasons they have expressed: git submodules are a major pain for
developers, and the concerns Liming listed above can be addressed in
other ways.
When my internal team first transitioned to git, we set up a complex
submodule-like system to (theoretically) allow easily updating common
code among different projects. That only lasted a month or two: having
to manage multiple repositories for day-to-day work, and the lack of a
single commit history spanning the entire tree doomed that scheme.
I collapsed everything together into a single repo using some git
filter-branch magic, and we've been happy ever since.
Please, no submodules....
Agreed, please no submodules. I eagerly went to the GitHub page, and
was shocked to see 3 pages of submodules. I have to work with some
projects that use submodules, and they are a huge pain. They technically
"work", but should be a last resort (i.e. composing a larger project from
existing, separately developed projects that are already managed in git).

EDK2 should be a single git repository. I very strongly think the
submodule plan is a very bad idea.

Thanks,
Roy
Post by Brian J. Johnson
Thanks,
--
Brian J. Johnson