Migrate from gitweb to cgit with URL rewrite rules

Harald started migrating the gitweb installation on openezx.org over to cgit because gitweb was consuming too many resources when stressed by web crawlers. There were still some details to take care of, and I put some time into that.

We wanted to:

URL rewriting was the most obvious short-term solution.

I found these rules from ClearChain for Apache, which were used for a similar migration at freedesktop.org, but they were still not perfect for us:

  • there were some typos most likely due to copy and paste of similar rules;
  • the regular expression used for project names ([^.]+)(\.git)* was not working for us because we have project names containing dots (like motorola-2.4);
  • the rules were not handling the case where clean URLs were enabled in the gitweb installation.
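
To make the project name problem concrete, here is a small Python sketch that runs the quoted pattern against a dotted name. The name openezx.git and the alternative pattern are only assumptions for illustration, not the rules that were actually published:

```python
import re

# Pattern quoted above: a run of non-dot characters, then optional ".git".
OLD_PROJECT = re.compile(r"([^.]+)(\.git)*")
# Illustrative alternative (an assumption): any name that ends in ".git".
NEW_PROJECT = re.compile(r"(.+)\.git")


def project_name(pattern, text):
    """Return the captured project name if the whole text matches, else None."""
    m = pattern.fullmatch(text)
    return m.group(1) if m else None


print(project_name(OLD_PROJECT, "openezx.git"))       # openezx
print(project_name(OLD_PROJECT, "motorola-2.4.git"))  # None
print(project_name(NEW_PROJECT, "motorola-2.4.git"))  # motorola-2.4
```

The old pattern stops at the first dot, so it can never consume the ".4" in motorola-2.4.git, and the only thing allowed after the name is ".git".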

So I tweaked the rules a little and I am now putting them into a gitweb_cgit_migration git repository for others to play with. They can still be improved in several ways:

  • the project name regex now assumes that a project name ends in .git, which was OK in our case but might be too strict for other scenarios;
  • rules with the same destination address can be merged;
  • regex for project names and for commit hashes can be made stricter;
  • some non-essential pattern groups in the source URL patterns can be removed, but the parameters in the destination URLs would need to be adjusted as well;
  • ordering of rules can be improved, right now there might still be some addresses which we are not handling correctly;
  • redirects can be made permanent.
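
A few of these ideas can be tried outside of Apache first. The following is a rough Python sketch, not the rules from the repository: it assumes gitweb query strings of the form p=<project>;a=<action>;h=<hash>, a cgit instance mounted under a hypothetical /cgit prefix, and only a handful of actions. It merges actions that share a destination and uses stricter patterns for project names and commit hashes:

```python
import re

# Assumption: project names end in ".git" and use only these characters.
PROJECT_RE = re.compile(r"[A-Za-z0-9._+-]+\.git")
# Stricter commit pattern: a full 40-character lowercase SHA-1.
HASH_RE = re.compile(r"[0-9a-f]{40}")

# Hypothetical action-to-page table; "shortlog" and "log" share one
# destination, so they are merged into the same cgit page here.
ACTION_TO_PAGE = {"summary": "", "shortlog": "log/", "log": "log/", "commit": "commit/"}


def gitweb_to_cgit(query, prefix="/cgit"):
    """Map a gitweb query string to a cgit path, or None if it is not handled."""
    params = dict(p.split("=", 1) for p in query.split(";") if "=" in p)
    project = params.get("p", "")
    action = params.get("a", "summary")
    if not PROJECT_RE.fullmatch(project) or action not in ACTION_TO_PAGE:
        return None
    path = f"{prefix}/{project}/{ACTION_TO_PAGE[action]}"
    commit = params.get("h")
    if action == "commit" and commit is not None:
        if not HASH_RE.fullmatch(commit):
            return None  # reject anything that is not a full hash
        path += f"?id={commit}"
    return path


print(gitweb_to_cgit("p=motorola-2.4.git;a=shortlog"))
```

A table like this makes it easy to see which source patterns end up at the same destination before writing the corresponding rewrite rules by hand.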

However, I am not putting more effort into that for now, as the result looks acceptable for OpenEZX. Patches are very welcome, as always. :)

