Migrate from gitweb to cgit with URL rewrite rules

Harald started migrating the gitweb installation on openezx.org over to cgit, because gitweb was eating up too many resources when stressed by web crawlers. There were still some details to take care of, and I put some time into that.

We wanted to:

URL rewriting was the most obvious short-term solution.

I found these rules from ClearChain for Apache, which were used for a similar migration at freedesktop.org, but they were still not perfect for us:

  • there were some typos most likely due to copy and paste of similar rules;
  • the regular expression used for project names ([^.]+)(\.git)* was not working for us because we have project names containing dots (like motorola-2.4);
  • the rules were not handling the case where clean URLs were enabled in the gitweb installation.
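The project-name problem can be reproduced outside Apache. The following is a minimal Python sketch, not the actual rewrite rules: OLD_NAME is the pattern quoted above, while NEW_NAME is an assumed replacement that simply requires a trailing .git, and the helper names are made up for illustration.

```python
import re

# Project-name pattern quoted in the post (from the freedesktop.org rules).
OLD_NAME = re.compile(r"([^.]+)(\.git)*")

# Assumed replacement for illustration: any name that ends in ".git".
NEW_NAME = re.compile(r"(.+)\.git")


def project_old(name):
    """Return the captured project name under the old pattern, or None."""
    m = OLD_NAME.fullmatch(name)
    return m.group(1) if m else None


def project_new(name):
    """Return the captured project name under the assumed pattern, or None."""
    m = NEW_NAME.fullmatch(name)
    return m.group(1) if m else None


if __name__ == "__main__":
    # "openezx.git" is a hypothetical dot-free name; "motorola-2.4" is from the post.
    for name in ("openezx.git", "motorola-2.4.git"):
        print(name, project_old(name), project_new(name))
```

Because [^.]+ cannot cross a dot, the old pattern stops at "motorola-2" and the rest of the name can never match (\.git)*, so the whole rule fails for that project.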

So I tweaked the rules a little bit and I am now putting them into a gitweb_cgit_migration git repository for others to play with. They can be improved in several ways:

  • the project name regex now assumes that a project name ends in .git, which was OK in our case but might be too strict for other scenarios;
  • rules with the same destination address can be merged;
  • regex for project names and for commit hashes can be made stricter;
  • some non-essential pattern groups in the source URL patterns can be removed, but the parameters in the destination URLs would need to be adjusted as well;
  • the ordering of the rules can be improved; right now there might still be some addresses which we are not handling correctly;
  • redirects can be made permanent.
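A few of these ideas (merging rules that share a destination, stricter project and hash patterns, a switch for permanent redirects) can be sketched in Python. This is only an illustration of the mapping, not the rules in the repository: the /cgit/ prefix, the action table and the rewrite() helper are all assumptions, and only the common gitweb form ?p=PROJECT;a=ACTION;h=HASH is handled.

```python
import re
from urllib.parse import parse_qsl

# Assumed mapping from gitweb actions to cgit pages. Actions that lead to the
# same destination share one table entry instead of needing one rule each.
ACTION_TO_PAGE = {
    "summary": "",
    "shortlog": "log/",
    "log": "log/",
    "commit": "commit/",
    "commitdiff": "diff/",
    "tree": "tree/",
}

PROJECT = re.compile(r"[A-Za-z0-9._-]+\.git")  # dots allowed, must end in .git
HASH = re.compile(r"[0-9a-f]{7,40}")           # abbreviated or full SHA-1 only


def rewrite(query, prefix="/cgit/", permanent=False):
    """Map a gitweb query string to (status, location), or None if unhandled."""
    # gitweb separates parameters with ';', so normalise before parsing.
    params = dict(parse_qsl(query.replace(";", "&")))
    project = params.get("p", "")
    action = params.get("a", "summary")
    if not PROJECT.fullmatch(project) or action not in ACTION_TO_PAGE:
        return None
    location = prefix + project + "/" + ACTION_TO_PAGE[action]
    commit = params.get("h")
    if commit is not None:
        if not HASH.fullmatch(commit):
            return None  # branch names and other refs are left out of this sketch
        location += "?id=" + commit
    return (301 if permanent else 302, location)


if __name__ == "__main__":
    print(rewrite("p=motorola-2.4.git;a=commit;h=abc1234"))
```

Keeping the default at a temporary redirect makes it easy to fix mistakes in the mapping before browsers and crawlers cache the result; permanent=True would be the switch to flip once the rules are trusted.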

However, I am not putting more effort into that for now, as the result looks acceptable for OpenEZX; patches are very welcome, as always. :)

