To clients, it looks like a copy of the archive. If a file does not exist, or is too old, it is fetched from the master servers and saved to disk. The files can be populated from CDs or mirrored through rsync to get complete copies. Ideally, it would also load-balance across multiple mirror servers.
It is possible to set something up with Apache2, mod_proxy, and mod_disk_cache, but that does not allow load balancing between servers, and it stores files in an opaque cache rather than in a plain directory tree.
Would Perlbal be usable for this? Could it be extended? I think a mod_perl module could be written.
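For concreteness, the behaviour being asked for could be sketched roughly like this in Python. This is only an illustration of the "serve from disk, otherwise fetch from a mirror and save" logic, not a Perlbal or mod_perl implementation; the mirror URLs, the MAX_AGE value, and the function names are all made up for the example, and it does not sanitize the requested path.

```python
import itertools
import os
import time
import urllib.request

# Hypothetical upstream mirrors; replace with real ones.
MIRRORS = [
    "http://mirror1.example.org/archive",
    "http://mirror2.example.org/archive",
]
MAX_AGE = 24 * 3600  # assumed freshness window, in seconds

# Simple round-robin "load balancing" across the mirrors.
_next_mirror = itertools.cycle(MIRRORS)


def is_fresh(path, max_age=MAX_AGE):
    """True if path exists and was modified within max_age seconds."""
    if not os.path.isfile(path):
        return False
    return time.time() - os.path.getmtime(path) <= max_age


def fetch(url):
    """Download url and return its body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def get_file(root, rel_path, fetcher=fetch):
    """Return the file's bytes, refreshing the on-disk copy if needed.

    Files live under root with the same relative layout as the archive,
    so the tree can also be pre-populated from CDs or rsync.
    """
    local = os.path.join(root, rel_path)
    if is_fresh(local):
        with open(local, "rb") as f:
            return f.read()

    data = fetcher(f"{next(_next_mirror)}/{rel_path}")
    os.makedirs(os.path.dirname(local), exist_ok=True)
    tmp = local + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, local)  # atomic swap so readers never see a partial file
    return data
```

Because the files are stored under their real names, the cache directory doubles as a browsable mirror, which is the part the opaque mod_disk_cache layout does not give you.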
Apache+squid (Score:2)
Re: (Score:1)
Re: (Score:2)
No, Squid won't use a similar structure, but you can preload anyway, by just requesting the relevant URIs.
If you're not loading from a normal internet site, though, you may have to set up a tiny HTTP server for this purpose yourself, or perhaps another proxy.
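A rough sketch of what "just requesting the relevant URIs" might look like, in Python. The proxy address is an assumption (Squid listens on port 3128 by default, but the host and your configuration may differ), and the function names are invented for the example.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that sends HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def preload(uris, opener):
    """Request each URI through the proxy so the proxy can cache it.

    Returns a dict mapping each URI to its HTTP status, or to the
    exception raised while fetching it.
    """
    results = {}
    for uri in uris:
        try:
            with opener.open(uri, timeout=60) as resp:
                # Read and discard the body; the point is only that the
                # proxy sees the complete response.
                while resp.read(65536):
                    pass
                results[uri] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[uri] = exc
    return results
```

Usage would be something like preload(list_of_uris, make_proxy_opener("http://localhost:3128")). Whether Squid actually keeps each object still depends on the response headers and on the cache settings.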
Re: (Score:1)
Not really. Squid’s directory structure doesn’t look like what it has cached, but it’s pretty easy to hack some Perl to bulk-feed a directory structure plus a given base URI into the Squid cache dirs.
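The "directory structure plus base URI" half of that is straightforward; here is a small sketch in Python rather than Perl. It only derives the URI each local file would correspond to. It does not write anything into Squid's cache directories, since that depends on Squid's on-disk format, which is not covered here. The function name and the example base URI are made up.

```python
import os
from urllib.parse import quote


def uris_for_tree(root, base_uri):
    """Map every file under root to the URI it would have under base_uri."""
    base = base_uri.rstrip("/")
    uris = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the walk order deterministic
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Use forward slashes and percent-encode unsafe characters.
            uris.append(base + "/" + quote(rel.replace(os.sep, "/")))
    return uris
```

For a tree containing dists/Release and a base URI of http://archive.example.org/debian, this yields http://archive.example.org/debian/dists/Release, and so on for each file. The resulting list can then be fed to whatever does the actual loading.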