Migrate from gitweb to cgit with URL rewrite rules
Harald started migrating the gitweb installation on openezx.org over to cgit, because gitweb was eating up too many resources when stressed by web crawlers. There were still some details to take care of, and I put some time into that.
We wanted to:
- Keep the links created by the git-notify mails working;
- Keep most of the URLs indexed by search engines working.
URL rewriting was the most obvious short-term solution.
I found these rules from ClearChain for Apache, which were used for a similar migration at freedesktop.org, but they were still not perfect for us:
- there were some typos most likely due to copy and paste of similar rules;
- the regular expression used for project names, ([^.]+)(\.git)*, was not working for us because we have project names containing dots (like motorola-2.4);
- the rules were not handling the case where clean URLs were enabled in the gitweb installation.
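To make the dotted-name problem concrete, here is a small Python sketch that tries the project-name pattern quoted above against a name containing a dot. It only illustrates the regex behaviour, not the Apache rules themselves; the DOTTED pattern and the name openezx.git are assumptions made up for the example.

```python
import re

# Project-name pattern quoted in the post.
ORIGINAL = re.compile(r"([^.]+)(\.git)*")
# Hypothetical alternative: any name that ends in ".git" (dots allowed).
DOTTED = re.compile(r"(.+)\.git")

def project_name(pattern, text):
    """Return the captured project name if the whole text matches, else None."""
    m = pattern.fullmatch(text)
    return m.group(1) if m else None

# "openezx.git" is an illustrative name without dots in the project part.
print(project_name(ORIGINAL, "openezx.git"))       # openezx
# A dotted name cannot be matched in full by the original pattern.
print(project_name(ORIGINAL, "motorola-2.4.git"))  # None
print(project_name(DOTTED, "motorola-2.4.git"))    # motorola-2.4
```

The original pattern stops at the first dot, so a name like motorola-2.4 can never be captured whole.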
So I tweaked the rules a little bit and I am now putting them into a gitweb_cgit_migration git repository for others to play with. They can be improved in several ways:
- the project name regex is now assuming that a project name ends in .git, which was OK in our case but it might be too strict for other scenarios;
- rules with the same destination address can be merged;
- regex for project names and for commit hashes can be made stricter;
- some non-essential pattern groups in the source URL patterns can be removed, but the parameters in the destination URLs would need to be adjusted as well;
- ordering of rules can be improved, right now there might still be some addresses which we are not handling correctly;
- redirects can be made permanent.
However, I am not putting more effort into that for now, as the result looks acceptable for OpenEZX; anyhow, patches are greatly welcome, as always. :)
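For anyone who wants to experiment with the improvements listed above, here is a rough Python sketch of what a stricter mapping could look like. It is not taken from the gitweb_cgit_migration repository: the gitweb query layout (?p=...;a=...;h=...), the cgit paths (/project/commit/?id=... and /project/diff/?id=...), and the redirect() helper are all assumptions for illustration, so check them against your own installation before turning them into rewrite rules.

```python
import re

# Assumed gitweb query form: p=<project>.git;a=<action>;h=<hash>
GITWEB_QUERY = re.compile(
    r"p=(?P<project>[A-Za-z0-9._-]+\.git)"  # project name, dots allowed
    r";a=(?P<action>commit|commitdiff)"     # only two actions handled here
    r";h=(?P<hash>[0-9a-f]{40})$"           # stricter: full SHA-1 only
)

# Assumed mapping from gitweb actions to cgit path components.
CGIT_ACTION = {"commit": "commit", "commitdiff": "diff"}

def redirect(query, permanent=True):
    """Return (status, location) for a gitweb query string, or None."""
    m = GITWEB_QUERY.match(query)
    if m is None:
        return None
    status = 301 if permanent else 302  # 301 makes the redirect permanent
    location = "/{}/{}/?id={}".format(
        m.group("project"), CGIT_ACTION[m.group("action")], m.group("hash")
    )
    return status, location

commit_id = "0123456789abcdef0123456789abcdef01234567"  # made-up hash
print(redirect("p=motorola-2.4.git;a=commit;h=" + commit_id))
```

Two actions share one pattern here, which is the same idea as merging rules with the same destination; the other gitweb actions would need their own entries.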