The knowledge base blog
https://www.esdm.co.uk/knowledge

Tilecache: how to stop tilecache_seed.py bailing out with HTTP 502 errors

We are using TileCache (http://tilecache.org/) to create caches of Ordnance Survey data of various flavours, for use in high demand web sites. We are tending to pre-seed the caches, then access them directly from disk, as this gives the best overall performance. However building the caches for the larger map scales is a significant task, requiring many days of processing and hundreds of gigabytes of storage.

This post addresses a particular problem – the pre-seeding routines failing intermittently with HTTP 502 errors.

Pre-seeding is carried out on the command line, issuing a command like this:

D:\Websites\UKBaseMap\scripts\tilecache\tilecache-2.11\tilecache_seed.py --bbox=0,0,600000,1300000 OSOpenOSGB 0 10

where OSOpenOSGB refers to a config section in tilecache.cfg (which in turn points at the WMS), and “0 10” means process levels 0 to 9.

The normal tilecache_seed.py looks like this:

================

#!/usr/bin/env python

# BSD Licensed, Copyright (c) 2006-2010 TileCache Contributors
"""This is intended to be run as a command line tool. See the accompanying
   README file or man page for details."""

import TileCache.Client

TileCache.Client.main()

================

But we have been finding that this fails intermittently with errors being raised, like:

urllib2.HTTPError: HTTP Error 502: Bad Gateway

We do not know the root cause, but apparently the CGI MapServer WMS is simply failing to respond sometimes, and unfortunately this causes the cache seeding process to stop. We are seeing this at varying intervals, from every 5 minutes to every couple of hours. Enough to completely disrupt the construction of a large cache.

So, in my first foray into Python goodness, I’ve enhanced tilecache_seed.py a little to recover from the failures and continue until the caching finishes without an error:

================

#!/usr/bin/env python

# BSD Licensed, Copyright (c) 2006-2010 TileCache Contributors
"""This is intended to be run as a command line tool. See the accompanying
   README file or man page for details."""

# amended CF 22/2/2012 to resume after errors urllib2.HTTPError: HTTP Error 502: Bad Gateway
import time
import TileCache.Client

while True:
    try:
        TileCache.Client.main() # do the work
    except Exception: # unlike a bare "except:", this still lets Ctrl-C stop the script
        print 'HTTPError occurred - resuming processing...'
        f = open('errorlog.txt', 'a') # open file for appending
        f.write('HTTP Error occurred at: ' + time.strftime('%x %X') + '\n')
        f.close()
        time.sleep( 1 ) # have a rest to let MapServer recover from whatever was troubling it
    else:
        print 'Processing completed!' # no error was encountered
        break # leave the loop

================

This also logs the occurrence of errors into a log file like this:

HTTP Error occurred at: 02/23/12 23:50:00

The caching now bashes on regardless of intermittent errors. However, it is best to avoid the -f flag on the command line, as this will re-start the job from the beginning and re-create all tiles, so the job may never complete if errors are occurring regularly.
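One risk with an unbounded retry loop is that a permanent fault (for example a wrong URL in tilecache.cfg) keeps it retrying forever. The following is a minimal sketch of a bounded variant, written as a generic helper rather than as a change to TileCache itself. The names run_with_retries, max_attempts and delay_seconds are illustrative, and the job passed in would be TileCache.Client.main in the script above. It is written for Python 3, whereas the seeding script itself is Python 2.

```python
import time


def run_with_retries(job, max_attempts=100, delay_seconds=1.0, sleep=time.sleep):
    """Call job() until it returns without raising, or attempts run out.

    Returns the number of attempts used and re-raises the last error if
    every attempt failed. Only Exception is caught, so Ctrl-C still works.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            job()
        except Exception as error:
            if attempt == max_attempts:
                raise
            print("Attempt %d failed (%s) - resuming..." % (attempt, error))
            sleep(delay_seconds)  # give the WMS a moment to recover
        else:
            return attempt


if __name__ == "__main__":
    # Hypothetical stand-in job: fails twice, then succeeds.
    calls = {"count": 0}

    def flaky_job():
        calls["count"] += 1
        if calls["count"] < 3:
            raise IOError("HTTP Error 502: Bad Gateway")

    print(run_with_retries(flaky_job, delay_seconds=0))
```

Capping the attempts trades a little resilience for the guarantee that a misconfigured job eventually stops and reports its error.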


Crispin Flower (crispin.flower@idoxgroup.com)
Fri, 24 Feb 2012 00:31:05 GMT
https://www.esdm.co.uk/tilecache-how-to-stop-tilecache_seedpy-bailing-out-with-http-502-errors

MapServer and GeoServer (and tilecache) comparison serving Ordnance Survey raster maps

With two WMS running off identical data on the same server, I thought it would be interesting to compare speeds and the map output. So I lined up a map request with identical parameters to both services and ran them through Fiddler a few times (to get an idea of the response times).

Output quality and performance

| | GeoServer | MapServer |
|---|---|---|
| Response time | 0.8 to 1.0 seconds with USE_JAI_IMAGEREAD=true; 0.6 to 1.3 seconds with USE_JAI_IMAGEREAD=false | 0.4 to 0.6 seconds |
| Image size | 63,574 bytes | 78,327 bytes |
| The map | [image: ESDM_UK_BaseMaps-50KRasterColour] | [image: mapserv] |
| Quality | Adequate, but quite fractured | Lovely |
| Comments | A significantly slower response than MapServer, and a worse image. I couldn’t find any difference between the different re-sampling methods, perhaps suggesting that the settings were not taking effect (I’ve seen GeoServer GUI issues like this a couple of times now, requiring diving into the XML config files). Later: yes, the settings were present in the XML config, but seem to make no difference to the output. | Faster and much nicer image. If we turn off bilinear re-sampling, then the response time improves to around 0.3 seconds and the image deteriorates slightly, becoming similar in quality to the GeoServer map. [image: mapservCA53L4ZS_thumb] |

Oh dear this is not looking good for GeoServer. But is it the whole story? Next I compared them in a more realistic scenario, as base maps in OpenLayers, trying both single tile and multi-tile modes. I configured the map to view the 1:50,000 mapping at a scale where no re-sampling could occur.

| | GeoServer | MapServer |
|---|---|---|
| Response time multi-tile (30 tiles each 256x256 pixels) | 0.2 to 1.2 seconds, but on average a shade faster | 0.3 to 0.7 seconds |
| Response time single-tile (1725 x 1173 pixels) | 4 to 6 seconds | 3 to 5 seconds |
| Image size (typical 256x256 tile) | 30,447 bytes | 22,674 bytes |
| Image size (1725 x 1173 pixels) | 646,681 bytes | 690,494 bytes |
| The map (256x256 tile) | [image: geoserv_256tile] | [image: mapserv_256tile] |
| Quality | Very good | Very good (pretty much identical) |

OK so now it is looking better for GeoServer; when there is no re-sampling it can return identical images to MapServer, though slightly larger, and performance is pretty similar and possibly quicker.

Output quality and performance when re-projecting

Now time to see how they cope with serving the maps into a different coordinate system, and in particular the global spherical Mercator used in Google and Bing Maps.

| | GeoServer | MapServer |
|---|---|---|
| Response time multi-tile (30 tiles each 256x256 pixels) | 0.2 to 1.2 seconds | 0.1 to 0.5 seconds |
| Response time single-tile (1725 x 1173 pixels) | 10 seconds | 6 seconds |
| Image size (typical 256x256 tile) | 29,169 bytes | 37,261 bytes |
| Image size (1725 x 1173 pixels) | 1,041,098 bytes | 982,050 bytes |
| The map (256x256 tile) | [image: geoserv_900913] | [image: mapserv_900913] |
| Quality | Pretty horrible | Very good |

With default settings, GeoServer produces horrible output when re-projecting the raster maps. But this does need more investigation once I find out how to make it use different re-sampling algorithms.

Performance under load

Now time to see how they perform under load using multi-mechanize. I chose to compare MapServer, GeoServer and a tilecache at returning a 256x256 tile in EPSG:900913 (i.e. re-projecting on-the-fly). Each test ran for 100 seconds, starting with 10 virtual users, increasing by 10 every 10 seconds. The tilecache image was of course cached and simply being retrieved from disk. All tests were using the same web server, with files on the same drive.
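For readers who have not used multi-mechanize, a test script is essentially a class with a run() method that performs one request. The sketch below shows roughly what such a script might look like for a single 256x256 WMS tile. It is not the script used for these tests: the host name, the tile bounding box and the helper getmap_url are hypothetical, the Transaction class shape is assumed from multi-mechanize's conventions, and the imports are for Python 3 (under Python 2 they would come from urllib and urllib2).

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint and tile extent, for illustration only.
WMS_URL = "http://mywebsite/scripts/mapserv.exe"
TILE_BBOX = (-250000.0, 6700000.0, -247554.0, 6702446.0)


def getmap_url(base_url, layers, bbox, srs="EPSG:900913", size=256):
    """Build a WMS 1.1.1 GetMap URL for one square PNG tile."""
    params = [
        ("SERVICE", "WMS"),
        ("VERSION", "1.1.1"),
        ("REQUEST", "GetMap"),
        ("LAYERS", layers),
        ("STYLES", ""),
        ("SRS", srs),
        ("BBOX", ",".join(str(value) for value in bbox)),
        ("WIDTH", str(size)),
        ("HEIGHT", str(size)),
        ("FORMAT", "image/png"),
    ]
    # Allow for base URLs that already carry a query string (e.g. map=...).
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


class Transaction(object):
    """Assumed multi-mechanize shape: run() is called once per transaction."""

    def __init__(self):
        self.url = getmap_url(WMS_URL, "OSOpenData", TILE_BBOX)

    def run(self):
        response = urlopen(self.url)
        body = response.read()
        response.close()
        assert len(body) > 0, "empty response"
```

Requesting the same tile every time keeps the comparison simple, but it also favours any server that caches responses.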

Note: the vertical scale differs in each graph to fit the observed values.

| | GeoServer | MapServer | Tilecache |
|---|---|---|---|
| Transactions | 381 | 556 | 2188 |
| Errors | 57 | 0 | 0 |
| Average response time | 6.1 | 6.5 | 2.3 |
| 95% response time | 26.8 | 17.6 | 5.1 |
| Response graph | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] |
| Transactions per second | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] |
| Comments | Completely broken after about 80 seconds – had to re-start GeoServer | No errors, but significantly disrupted performance | Slight slow-down under high load, but very reassuring performance |

GeoServer clearly could not stand up to more than about 30 virtual users when re-projecting the maps, and ultimately the service stalled completely requiring a re-start.

I tried the tests again requesting the maps in OSGB so that the service did not involve re-projection. I also reduced the number of virtual users to 50. The results were quite surprising…

| | GeoServer | MapServer | Tilecache |
|---|---|---|---|
| Transactions | 2018 | 1114 | 2197 |
| Errors | 0 | 0 | 0 |
| Average response time | 1.3 | 2.4 | 1.1 |
| 95% response time | 2.4 | 9.1 | 2.5 |
| Response graph | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] |
| Transactions per second | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] |
| Comments | Impeccable! | Much better than when re-projecting, but only half as good as GeoServer. | Only fractionally better than GeoServer |

This is a hugely impressive result for GeoServer. My guess is that it is employing server-side in-memory caching, and because all my requests were identical the responses were very fast. MapServer on the other hand has to start a CGI process for each request and presumably cannot benefit from an in-memory cache.

When re-projecting on the other hand, perhaps it does not use a cache. And because all the GeoServer operations are running as one process, it is restricted to one virtual processor on our server whereas each MapServer exe can grab one for itself. Certainly when running these tests the server was running at about 40-80% CPU with MapServer, compared to 7% with GeoServer.

I ran this test again with the number of virtual users doubled back to 100, and GeoServer came through almost unscathed, this time processing 2072 transactions though with 3 errors.

| | GeoServer | MapServer | Tilecache |
|---|---|---|---|
| Transactions | 2072 | 907 | 2234 |
| Errors | 3 | 0 | 0 |
| Average response time | 2.4 | 5.9 | 2.3 |
| 95% response time | 5.9 | 25.6 | 4.8 |
| Response graph | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] | [image: All_Transactions_response_times] |
| Transactions per second | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] | [image: All_Transactions_throughput] |
| Comments | Wow | Suffering, but no errors | The best as expected |

I suspect this test may not predict real-world behaviour with lots of users requesting different WMS maps, where GeoServer would probably lose its advantage, but it is interesting anyway.

NB I had GeoServer running with AllowMultithreading=true

Summary

If the maps need to be re-projected then use MapServer – both performance and output quality are vastly superior. In a site that is going to get any significant load it is unwise to re-project raster mapping on-the-fly anyway, and it should definitely be cached. In this case the WMS performance doesn’t matter so much but output quality is paramount, so MapServer it is.  If I find a way of making GeoServer produce nice looking output I’ll come back and revise this post!

If no re-projection is required, then simple observation has the two roughly equal under light load. The maps looked almost identical, and performance was close. GeoServer had the advantage for me in that it didn’t suffer from MapServer’s tendency to choke on some requests; in practice this meant I went over my entire OSGB cache again with a GeoServer WMS as the source, to plug the gaps left by the MapServer service.

Under heavy (albeit contrived) load, the crucial factor was whether or not there was any re-projection involved. Without re-projection GeoServer produced astonishing performance, thrashing MapServer and nearly matching a tilecache. Introduce re-projection however, and GeoServer collapses to the extent that it can crash entirely – not good when it is running all your mapping services. MapServer never broke, and despite some gaps and slow responses, it managed to struggle through all tests without a single error.

So out of this experience I think we need to be using both, depending on the purpose.

Next to test performance on vector layers, including file based GIS formats and spatial databases…


Crispin Flower (crispin.flower@idoxgroup.com)
Sat, 21 Jan 2012 17:53:00 GMT
https://www.esdm.co.uk/mapserver-and-geoserver-and-tilecache-comparison-serving-ordnance-survey-raster-maps

Some notes about setting up Tilecache for the Ordnance Survey Open Data

Background

Increasingly we are using Ordnance Survey Open Data as a background map option in web sites, and also in desktop systems where no other mapping is available. We have so far served this as a MapServer WMS. This puts a significant load on the web server, and will not scale beyond a few concurrent users. Making this scale requires converting the datasets into a cache of image tiles that can be accessed without running a GIS engine on the server. The most widely used tool for this is Tilecache from Metacarta http://tilecache.org

Usually OS mapping is used in a map in the OSGB coordinate system. However, my first two applications requiring faster mapping are a) the Angling Diary, and b) Ramblers Routes, both of which work in the spherical Mercator projection (EPSG:900913). These make it even more imperative to cache the data, as there is on-the-fly reprojection involved in using them as a WMS directly. Re-projection can also make the maps look bad, mitigated only by finding the best resampling settings (see below about MapServer resampling).

Information

http://tilecache.org/docs/README.html

which gives general instructions (but misleading about Windows), and this blog post which explains in more detail how to set up tilecache on IIS (but relates to IIS5 & Server 2000):

http://viswaug.wordpress.com/2008/02/03/setting-up-tilecache-on-iis/

Python

Tilecache requires Python. Check whether the server is 32/64-bit and get the appropriate download from here: http://www.python.org/download/

For our web7 server I am using the x64 version: python-2.7.2.amd64.msi

I accepted all defaults, and installed to here: C:\Python27\

This path needs to be added to the PATH environment variable. I have a feeling this does not take effect until a restart, but I'm not 100% sure.

Python Imaging Library

Then we also need the Python Imaging Library, from here http://www.pythonware.com/products/pil/

Again, select the correct version for the version of Python used above, though there seems to be no 64-bit version. I used file PIL-1.1.7.win32-py2.7.exe

Running this installer failed because it said Python was not installed - missing registry settings. A reboot did not fix this.

A bit of Binging leads to this page http://www.lfd.uci.edu/~gohlke/pythonlibs/ where we can find a 64-bit installer:

PIL-1.1.7.win-amd64-py2.7.exe

This installed fine.

Tilecache

Download Tilecache from here http://tilecache.org/

This gives us tilecache-2.11.tar.gz

I used 7-zip to extract from the gz to a folder that contains the .tar file. Then I extracted the files from .tar to end up with a folder: \tilecache-2.11

I don't think the "PaxHeader" folder parallel to this folder is needed, but I hung onto it anyway, and put both folders under a single Tilecache folder.

This folder has to be located somewhere CGI scripts can run; I put it below a MapServer "scripts" folder that was already operational within a web site. Depending on where you put it, it may be necessary to create a virtual directory for Tilecache.

IIS stuff (the miserable bit)

Now we need to set up IIS to run Python scripts.

Web site > Handler mappings > Add Script Map

Request Path = *.py

Executable = "C:\Python27\python.exe" %s %s

Name = Python27 (or whatever you like)

I did not change anything in Request Restrictions.

 

I'm not yet sure whether this is strictly needed, but at the root level in IIS I also added the same path under "ISAPI and CGI Restrictions" > Add >

ISAPI or CGI path = "C:\Python27\python.exe" %s %s

Description = Python27

Tick to allow the path to execute.

 

Vish's article describes another step...

Open up the command prompt and change directory to ‘C:\Inetpub\AdminScripts’. Execute the following:

adsutil set w3svc/AllowPathInfoForScriptMappings True

adsutil set w3svc/1/AllowPathInfoForScriptMappings True

However on our server there was no AdminScripts folder in C:\Inetpub. Therefore it was necessary to install the IIS 6 Scripting Tools, like this:

Server manager > Roles > Web Server > Add Role Services > Management Tools > IIS 6 Scripting Tools (which in turn requires adding others which it selects for you automatically).

Once this had been done, I was able to run the two command lines above. While in the role services area, check that CGI is enabled in IIS as well, because none of this will work without CGI; but if MapServer is working, then CGI must already be enabled.

Permissions

I initially gave the Internet Guest Account (IIS_IUSRS) modify permissions on the "Cache" folder. However I took these permissions off again, and it still worked. Vish said this was required, but it cannot be really, as it is not the web site user creating the tiles, it is the python process.

Tilecache itself

Rename tilecache.cgi to tilecache.py

Edit tilecache.py and remove the first line in it that reads '#!/usr/bin/env python'. Also, change the 'Service.Load' parameter to point at the correct path to tilecache.cfg (and be sure to use double back-slashes in the path).
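The double back-slashes matter because the path sits inside a Python string literal, where a single back-slash can start an escape sequence. A small illustration, using a hypothetical path rather than the one on our server:

```python
# Hypothetical location of the config file, for illustration only.
broken = "D:\tilecache\tilecache.cfg"      # "\t" is read as a tab character
doubled = "D:\\tilecache\\tilecache.cfg"   # each "\\" is one literal back-slash
raw = r"D:\tilecache\tilecache.cfg"        # raw string: back-slashes kept as typed

print(repr(broken))    # shows the tabs that crept into the path
print(doubled == raw)  # the doubled and raw forms are the same string
```

A raw string (the r prefix) is an alternative to doubling every back-slash; either form gives Python the path exactly as Windows expects it.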

Tilecache includes a web page with an OpenLayers map, that serves to check whether things are configured correctly, and also allows you to manually start caching tiles. This is index.html, which by default loads and caches the OpenLayers base map WMS. I copy this file to e.g. indexOS.html then edit as required.

This page requests maps from tilecache.py which in turn uses tilecache.cfg configuration. So to work with a different data source it is necessary to add the relevant configuration to both files.

Tilecache.cfg

Configure the type and location of the cache. Here we are using a local file cache on disk:

[cache]
type=Disk
base=D:\mypath\tilecache\tilecache-2.11\Cache

Configure the layer you want to cache, in this case our OS Open Data WMS. These settings were arrived at after much blood and sweat.

[OSOpenSphMerc]
type=WMS
layers=OSOpenData
url=http://mywebsite/scripts/mapserv.exe?map=D:\Websites\UKBaseMap\map\UKBaseMap.map
extension=png
extent_type=loose
srs=EPSG:900913
# this definitely required when calling in 900913
spherical_mercator=true
bbox=-20037508.34,-20037508.34,20037508.34,20037508.34
resolutions=78271.51695,39135.758475,19567.8792375,9783.93961875,4891.969809375,2445.9849046875,1222.99245234375,611.496226171875,305.7481130859375,152.87405654296876,76.43702827148438,38.21851413574219,19.109257067871095,9.554628533935547,4.777314266967774,2.388657133483887,1.1943285667419434,0.5971642833709717,0.29858214168548586

The "spherical_mercator=true" setting is supposed to remove the need for a resolutions setting, but in practice I got errors if it was not there. The settings above are basically the entire world in spherical mercator.
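The resolutions list does not need to be typed by hand: each value appears to be exactly half the previous one, starting from 78271.51695 (roughly the width of the world, 2 x 20037508.34, spread over 512 pixels). A minimal sketch that regenerates the list under that assumption; the function name is illustrative:

```python
def halving_resolutions(first=78271.51695, levels=19):
    """Resolutions in metres per pixel, each half of the one before."""
    return [first / (2 ** level) for level in range(levels)]


resolutions = halving_resolutions()

# Comma-separated, ready to paste after "resolutions=" in tilecache.cfg
# or into the OpenLayers resolutions array.
print(",".join(repr(value) for value in resolutions))
```

Generating the list once and pasting the same output into both tilecache.cfg and the OpenLayers map avoids the two drifting apart through typing errors.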

My matching OpenLayers code was:

        function init(){
            map = new OpenLayers.Map( $('map'), {
                projection: new OpenLayers.Projection("EPSG:900913"),
                units: "m",
                maxExtent: new OpenLayers.Bounds(-20037508.34, -20037508.34, 20037508.34, 20037508.34),
                resolutions: [78271.51695, 39135.758475, 19567.8792375, 9783.93961875, 4891.969809375, 2445.9849046875, 1222.99245234375, 611.496226171875, 305.7481130859375, 152.87405654296876, 76.43702827148438, 38.21851413574219, 19.109257067871095, 9.554628533935547, 4.777314266967774, 2.388657133483887, 1.1943285667419434, 0.5971642833709717, 0.29858214168548586],
                controls: [new OpenLayers.Control.Navigation(),
                           new OpenLayers.Control.PanZoomBar()]
            });
            OSOpenSphMerc = new OpenLayers.Layer.WMS( "OSOpenSphMerc",
                    "tilecache.py?", {layers: 'OSOpenSphMerc', format: 'image/png' },
                    {isBaseLayer: true}
            );
            // (init() continues)

However, when browsing the map I kept getting errors on some tiles, along these lines:

"An error occurred: Current y value 7983694.728100 is too far from tile corner y 7944558.969625"

This problem got worse the further I zoomed in, and the further north in the UK. However, it was intermittent in the sense that one band of tiles might draw (north-south or east-west bands) then the next might not - a checkerboard effect. The exact same WMS requests going directly to the OSOpenData WMS worked fine - i.e. the BBOXes in the requests were good.

Seeding the cache

Therefore I tried seeding the cache directly using a command-line like:

D:\Websites\UKBaseMap\scripts\tilecache\tilecache-2.11\tilecache_seed.py -f --bbox=-1060000,6405978,242016,8700250 OSOpenSphMerc 1 11

(this bounding box being a rather unscientific box that includes the OS mapping and is divisible by 256 in both directions - whether the latter is important or not I do not know).

This seeding worked without any tile failures. Therefore, after much searching the web and head-scratching, I have concluded that the errors when used from OpenLayers may be down to Tilecache bugs.

Note about seeding: Level 0 raised an error about a zero length image and would not complete seeding, so I had to skip this and start at 1. Level 0 equates to looking at the earth from a long way away, so no big deal.

Another note about seeding: each run tended to produce a few errors. There were a lot of "Cache miss" entries coming back in the command window, plus some more serious errors occasionally (HTTP 502 Bad Gateway - which causes the seeding operation to bail out). Therefore I tended to run each level twice (or more if it had bailed out). The first time in, I used the "-f" flag to force re-creation of all tiles (in case I had any left over from testing and setting up resolutions). The second time I omitted this flag, so it would only re-create any missing tiles. I don't really know whether this was necessary, except in the few cases where caching a layer totally aborted.
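The two-pass routine described above (a first run per level with "-f", then a second without it) can be scripted rather than typed. The sketch below only builds the command lines; the helper name seed_commands is illustrative, and it relies on the observation from the first post that a level pair such as "1 2" processes level 1 alone.

```python
def seed_commands(script, layer, bbox, first_level, last_level):
    """Return command lines that seed one level at a time, two passes each.

    Pass 1 uses -f to force re-creation of every tile; pass 2 omits it so
    that only missing tiles are built.
    """
    bbox_text = ",".join(str(value) for value in bbox)
    commands = []
    for level in range(first_level, last_level + 1):
        for force in (True, False):
            parts = [script]
            if force:
                parts.append("-f")
            parts += ["--bbox=" + bbox_text, layer, str(level), str(level + 1)]
            commands.append(" ".join(parts))
    return commands


for command in seed_commands("tilecache_seed.py", "OSOpenSphMerc",
                             (-1060000, 6405978, 242016, 8700250), 1, 3):
    print(command)
```

Running one level per command also makes it easier to see which level a failure belongs to.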

Managing the cache

I'm now part way through seeding the cache - on level 16. The size of the cache grows exponentially with each level, so disk space may become an issue. With a small part of level 16 done (this being StreetView) we have over 3 million files and 20GB of space used.
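The growth is easy to estimate before committing disk space: each time the resolution halves, roughly four times as many tiles are needed to cover the same area. A rough sketch, assuming 256-pixel tiles and a tile grid that starts at the corner of the bounding box (the real grid origin may differ, so treat the result as an approximation):

```python
import math


def tiles_in_bbox(bbox, resolution, tile_size=256):
    """Approximate number of tiles needed to cover bbox at one resolution."""
    minx, miny, maxx, maxy = bbox
    tile_span = resolution * tile_size  # ground units along one tile edge
    columns = math.ceil((maxx - minx) / tile_span)
    rows = math.ceil((maxy - miny) / tile_span)
    return columns * rows


for resolution in (4, 2, 1):
    print(resolution, tiles_in_bbox((0, 0, 700000, 1300000), resolution))
```

At resolution 1 this gives about 13.9 million tiles for the 0,0,700000,1300000 extent, in line with the file count reported for level 12 of the GB National Grid cache later in this post.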

There is a cache cleaning utility in Tilecache, which removes the tiles accessed least recently in order to reduce the cache to a specified size. However, this would only be beneficial where all map requests are going through tilecache.py, so that tiles can be recreated where needed from the datasource. Maximum benefit is gained by pre-caching the entire dataset, and accessing it as a tile service, therefore no cleaning is possible.

Caching in GB National Grid

Once I was happy that the 900913 cache was building OK, I turned attention to a GB National Grid cache, which should in theory be much simpler.

The bounding box is 0,0,700000,1300000, though out of respect for the many people on the web who have said it should be a multiple of 256, I altered this slightly (in the OpenLayers map and tilecache.cfg) to 0,0,699904,1299968. Later I found this was missing edge tiles at large scales, and I then found that the BBOX when seeding had no effect on the tiling behaviour: it simply governed the area of seeding. I therefore changed it to something like 0,0,800000,1400000 at large scales, and back to 0,0,700000,1300000 for lower ones.
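The adjusted extent 0,0,699904,1299968 is simply the original extent rounded down to the nearest multiple of 256, which can be checked with a couple of lines (the helper name is illustrative):

```python
def round_down_to_multiple(value, multiple=256):
    """Largest multiple of `multiple` that does not exceed value."""
    return (value // multiple) * multiple


print(round_down_to_multiple(700000), round_down_to_multiple(1300000))
# prints: 699904 1299968
```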

But what about the resolutions? One technique I have found is to set the OpenLayers map to

maxResolution: 'auto', numZoomLevels: <some sensible number like 14>

Then try to start caching, and you quickly get errors back associated with the dreaded pink boxes in the map (you see the errors in Fiddler), which say that the required resolution was not found, and give an array of available resolutions (I don't know the basis for the values it gives). These can then be used in the cfg and map settings.

This worked pretty well for the OS data, except that after 1:14000 it gave 1:7029, which shows StreetView rather too zoomed out and looking rubbish. Also, scales like 1:1757 are not user friendly, and although on-screen scale is not entirely meaningful, users still prefer a round scale like 1:2500 (in cases where scale is displayed). So perhaps we need to define our own array of scales, and reverse engineer an array of resolutions from them. How?

OpenLayers has a control that can show the map scale (map.addControl(new OpenLayers.Control.Scale());), but does not have one to show the resolution (AFAIK). So we have to do this ourselves, as follows. Add a handler for the move event onto the (uncached) WMS layer:

            OSOpenDirect.events.on({
                moveend: function(e) {
                    if (e.zoomChanged) {
                        showResolution();
                    }
                }
            });
        } // end of init()

which calls this function:

        function showResolution() {
            document.getElementById("res").innerHTML = map.getResolution();
        }

which requires a div like this on the page:

<div id="res">the resolution will be shown here</div>

Then simply set your map scales array to whatever you fancy, and the corresponding resolution will be shown.
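The conversion can also be done by arithmetic, without loading a map. The sketch below assumes 72 dots per inch and 39.3701 inches per metre; both constants are assumptions rather than values confirmed from the OpenLayers source, although they do reproduce the scales quoted in this post (a resolution of 2.5 comes out at about 1:7087).

```python
DOTS_PER_INCH = 72.0        # assumed screen resolution
INCHES_PER_METRE = 39.3701  # assumed conversion factor


def scale_from_resolution(resolution):
    """Scale denominator for a resolution given in metres per pixel."""
    return resolution * INCHES_PER_METRE * DOTS_PER_INCH


def resolution_from_scale(scale_denominator):
    """Resolution in metres per pixel for a scale such as 1:2500."""
    return scale_denominator / (INCHES_PER_METRE * DOTS_PER_INCH)


for scale in (250000, 50000, 25000, 10000, 2500):
    print("1:%d -> %.4f" % (scale, resolution_from_scale(scale)))
```

With this, an array of friendly scales can be turned into a resolutions array for both tilecache.cfg and the OpenLayers map in one step.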

On doing this however, we quickly find that the VectorMap and StreetView datasets only look decent at a very confined range of scales. In the end I abandoned the quest for user friendly scales, because VectorMap only looks decent when unscaled, i.e. giving a map scale of 1:7087.

I ended up with this resolutions array, which prioritizes nice-looking maps over friendly scales:

resolutions=3000,2000,1000,500,250,150,100,50,25,12.5,5,2.5,1

Of course the maps you get also depend on how the layers are set up in the WMS, i.e. what scale thresholds are set for each layer. I had to tweak ours a bit.

Aside - re-sampling in MapServer

The smaller scale OS maps do not mind being scaled if there is good resampling being done at the MapServer end - in fact this is crucial for making the maps look decent, and the performance hit doesn't matter once the data is cached. I've achieved this with this directive on each layer:

PROCESSING "RESAMPLE=BILINEAR"

along with this output format:

OUTPUTFORMAT
NAME png
DRIVER "AGG/PNG"
MIMETYPE "image/png"
IMAGEMODE RGBA
EXTENSION "png"
TRANSPARENT ON
# these settings greatly reduce the size of the PNG image
FORMATOPTION "QUANTIZE_FORCE=on"
FORMATOPTION "QUANTIZE_COLORS=256"
END

An example of how much worse the maps look without this can be seen here:

http://andrewl.net/map/ordnance-survey-rasters-mapserver-tilecache

(which is otherwise another helpful resource for anyone using tilecache and OS OpenData on a Unix platform).

How long does it take, and how much disk space is required

Well of course this depends on the precise resolutions chosen. I have no exact figures on time, because several runs bombed out and needed re-starting. Essentially, levels 0 to 5 take only seconds to build. Levels 6 to 8 take minutes (in 27700 level 9 took around 20 minutes, with level 10 taking a few hours). Level 13 in 900913 took less than 5 hours, while level 14 took something like 20 hours. Levels 16 (900913) and 12 (27700) are looking like they will take days. The 27700 cache rate was higher than the 900913, presumably because MapServer was not having to reproject the maps.

Some directory sizes (along with scale and resolution) for my 900913 and 27700 caches:

 

 
Spherical Mercator (EPSG:900913). Early layers omitted.

| Level | Scale (approx) | Resolution | Size on disk | Files |
|---|---|---|---|---|
| 05 | 1:7M | 2445.9849046875 | 1.34 MB | 300 |
| 06 | 1:3M | 1222.99245234375 | 1.33 MB | 175 |
| 07 | 1:2M | 611.496226171875 | 4.19 MB | 400 |
| 08 | 1:867K | 305.7481130859375 | 14.3 MB | 975 |
| 09 | 1:433K | 152.87405654296876 | 47.7 MB | 2,700 |
| 10 | 1:217K | 76.43702827148438 | 151 MB | 9,600 |
| 11 | 1:108K | 38.21851413574219 | 439 MB | 33,725 |
| 12 | 1:54K | 19.109257067871095 | 1.54 GB | 130,700 |
| 13 | 1:27K | 9.554628533935547 | 2.88 GB | 510,300 |
| 14 | 1:14K | 4.777314266967774 | 10.3 GB | 2,017,100 |
| 15 | 1:6771 | 2.388657133483887 | 53.9 GB | 8,027,650 |
| 16 | 1:3386 | 1.1943285667419434 | 211 GB | 32,051,575 |
| TOTAL | | | 280 GB | 42,785,884 |

GB National Grid (EPSG:27700). Sizes in bytes unless stated.

| Level | Scale (approx) | Resolution | Size | Files |
|---|---|---|---|---|
| 00 | 1:9M | 3000 | 61,136 | 2 |
| 01 | 1:6M | 2000 | 129,385 | 6 |
| 02 | 1:3M | 1000 | 264,569 | 15 |
| 03 | 1:1M | 500 | 1,220,288 | 150 |
| 04 | 1:709K | 250 | 6,144,271 | 600 |
| 05 | 1:425K | 150 | 16,132,572 | 1,125 |
| 06 | 1:283K | 100 | 43,106,957 | 2,100 |
| 07 | 1:142K | 50 | 67,469,260 | 7,475 |
| 08 | 1:71K | 25 | 115,948,919 | 28,125 |
| 09 | 1:35K | 12.5 | 494,478,612 | 106,800 |
| 10 | 1:14K | 5 | 1,958,013,298 | 649,000 |
| 11 | 1:7087 | 2.5 | 6,023,812,744 | 2,244,000 |
| 12 | 1:2835 | 1 | 23,706,682,416 (58.9 GB on disk) | 13,919,200 (21,924 folders) |
| TOTAL | | | 32,433,464,427 (75.1 GB on disk) | 16,958,598 (31,400 folders) |

 

Some time later...

I've stopped caching level 16 in 900913 as the lowest priority, and I'm attacking level 15 as well as level 12 in 27700 with multiple processes (with the min/max Y of the BBOX set to have each process caching a slice of the country). With 7 processes running, our server is working like this:

[image: Server taking everything in its stride]

Which is just fine, leaving plenty of ooomph free for running web sites etc. The pressure point is probably on physical RAM.
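The slicing of the bounding box between processes can be sketched as follows. This is an illustration rather than the script actually used: the function name is hypothetical, and it simply cuts the extent into equal bands from south to north, one per process.

```python
def slice_bbox_by_y(bbox, slices):
    """Split bbox into horizontal bands of equal height, south to north."""
    minx, miny, maxx, maxy = bbox
    band = (maxy - miny) / float(slices)
    bands = []
    for index in range(slices):
        low = miny + index * band
        # Use the exact maxy for the last band to avoid float drift.
        high = maxy if index == slices - 1 else miny + (index + 1) * band
        bands.append((minx, low, maxx, high))
    return bands


for band in slice_bbox_by_y((0, 0, 700000, 1300000), 7):
    print("--bbox=%s" % ",".join("%.0f" % value for value in band))
```

Each printed --bbox value would go on its own tilecache_seed.py command line. Equal bands do not mean equal work, since bands that are mostly sea will finish sooner than those covering dense mapping.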

The server spec is:

CPU: 2 x E5520 Xeon (quad core) + hyper-threading, meaning in effect 16 processors

Memory: 8 GB

OS: Windows Server 2008 x64 SP2

Disks: 1.2TB as 2 x SAS drives in RAID 5 I think.

Accessing the cache in OpenLayers

There are two ways of accessing the cache. The first is as a WMS with tilecache.py as the address. In this case, python will check whether the image tiles exist in the cache, and build any missing ones from the original data source. As time goes on, and more tiles are cached, it gets faster. However, in 900913 I was getting frequent tile failures, and I'm not sure this method gives any real benefit over direct calls to MapServer.

The second is as a "tilecache", where the client (OpenLayers) requests the tiles individually as images, with no checks as to whether or not they exist. This is faster, as it avoids the IIS/python overhead, but obviously requires a pre-built and complete cache.

I therefore included in my OpenLayers map a layer that pointed at the resulting cache as a tile service, to check the results:

            OSOpenSphMercCached = new OpenLayers.Layer.TileCache( "OSOpenSphMercCached",
                    ["http://mywebsite/scripts/tilecache/tilecache-2.11/Cache"], "OSOpenSphMerc", {'format': 'image/png'},
                    {isBaseLayer: true}
            );

On panning around the map, this causes tile requests like this:

http://mywebsite/scripts/tilecache/tilecache-2.11/Cache/OSOpenSphMerc/15/000/032/032/000/043/439.png
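The path in that request appears to encode the zoom level followed by the tile column and row, each zero-padded to nine digits and split into three folders. Reading it that way is an inference from the URL above, not something taken from the Tilecache source, but a sketch built on that assumption reproduces the example exactly:

```python
def tilecache_path(layer, z, x, y, extension="png"):
    """Relative path of a tile in the directory layout seen in the URL above."""
    parts = [
        layer,
        "%02d" % z,
        "%03d" % (x // 1000000),
        "%03d" % ((x // 1000) % 1000),
        "%03d" % (x % 1000),
        "%03d" % (y // 1000000),
        "%03d" % ((y // 1000) % 1000),
        "%03d" % (y % 1000),
    ]
    return "/".join(parts) + "." + extension


print(tilecache_path("OSOpenSphMerc", 15, 32032, 43439))
# prints: OSOpenSphMerc/15/000/032/032/000/043/439.png
```

Splitting the numbers into folders keeps any single directory from holding millions of files, which matters at the larger scales.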

This will only work if the map is configured with resolutions exactly matching those in the table above for either 27700 or 900913. But it is not necessary to use every resolution; for example we may have a map where the user cannot zoom out beyond level 4 (in 27700).

Demonstration pages:

http://www.esdmwms.no-ip.co.uk/scripts/tilecache/tilecache-2.11/indexOS27700.html

http://www.esdmwms.no-ip.co.uk/scripts/tilecache/tilecache-2.11/indexOS.html

(both have a direct WMS as default base layer, with the cached layer as another option in the layer switcher).

Next we should put a handler page in front of these services to restrict access to specified domains.

Caching overlays

Having cracked the OS Open Data, I thought it worth trying an overlay, so I chose the Norfolk HER archaeological monuments layers from HBSMR. These are quite slow to draw as WMS from our web4 server, with about 60,000 point, line and polygon features stored in MapInfo tables in WGS84.

I used the same resolutions etc as the OS 27700 data. Caching levels 0 to 11 took a few minutes; level 12 is taking perhaps an hour or two. The cache is about 1GB.

The one small gotcha is how to use a tilecache layer as an overlay in OpenLayers - on my first attempt it refused to budge from the base layers collection.

Working syntax was thus:

            NHECached = new OpenLayers.Layer.TileCache( "NHECached",
                    "http://www.esdmwms.no-ip.co.uk/scripts/tilecache/tilecache-2.11/Cache", "NHE",
                    {'format': 'image/png', reproject: false, isBaseLayer: false}
            );

The results are fantastic. Of course a cache is broken if the data changes, but given the speed of creating this cache, it would be perfectly realistic to have this as a scheduled job to rebuild a cache automatically every night, or whatever is appropriate depending on how the source data is managed.

Preventing pink tiles

Where a dataset does not cover the entire map extent, OpenLayers will still request tiles from a tilecache layer, and shows pink tiles wherever no image is returned. To prevent this, add this JavaScript function:

        OpenLayers.Util.onImageLoadError = function() {
            this.src = "../tilecache-2.11/blank256.png";
            this.style.display = "";
        };

And make sure there is an appropriate 256x256 image in place.
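If no suitable image is to hand, one can be generated. The following is a rough sketch using only the Python standard library to write a fully transparent 256x256 PNG; the function names are made up for illustration, and the output file name simply matches the one used in the JavaScript above.

```python
import struct
import zlib


def _chunk(tag, data):
    # A PNG chunk is: length, tag, data, then a CRC over tag + data
    body = tag + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)


def blank_png(size=256):
    """Return the bytes of a fully transparent size x size RGBA PNG."""
    # IHDR: width, height, 8 bits per channel, colour type 6 (RGBA),
    # default compression, filter and interlace settings
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 6, 0, 0, 0)
    # Each scanline is a filter byte (0) followed by size RGBA pixels, all zero
    row = b"\x00" + b"\x00" * (size * 4)
    idat = zlib.compress(row * size, 9)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


def write_blank_png(path, size=256):
    # Example: write_blank_png("blank256.png")
    with open(path, "wb") as f:
        f.write(blank_png(size))
```

The file then needs to be placed wherever the path in onImageLoadError points.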

Multiple URLs for the Cache

The bottlenecks with a tilecache prepared and accessed as shown above are a) disk access and b) HTTP throttling. The second of these is likely to be significant before the first - AFAIK IIS allows only two concurrent requests to the same domain. With any one map operation potentially requesting perhaps 20 tiles, this will be a pinch point.

Fortunately OpenLayers allows a tilecache (or in fact any grid based layer) to be accessed from multiple URLs. It then requests some tiles from each defined address.

            OSOpen27700Cached = new OpenLayers.Layer.TileCache( "OSOpen27700Cached",
                    [
                    "http://www.anglingdiary.org.uk/Data/Sites/3/userfiles/gisdata",
                    "http://www.esdmwms.no-ip.co.uk/scripts/tilecache/tilecache-2.11/Cache"
                    ],
                    "OSOpen27700", {'format': 'image/png'},
                    {isBaseLayer: true}
            );
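To illustrate the idea of spreading tiles across several addresses, here is a simplified sketch in Python. It is not OpenLayers' own code: the function pick_url and the use of a CRC32 hash are assumptions made purely for illustration. The point is that the choice depends only on the tile path, so a given tile is always requested from the same address and the browser cache stays useful.

```python
import zlib


def pick_url(tile_path, urls):
    """Deterministically pick one base URL for a tile path (illustrative only)."""
    # Hash the path and use it to index into the list of base URLs
    index = zlib.crc32(tile_path.encode("utf-8")) % len(urls)
    return urls[index]


urls = [
    "http://www.anglingdiary.org.uk/Data/Sites/3/userfiles/gisdata",
    "http://www.esdmwms.no-ip.co.uk/scripts/tilecache/tilecache-2.11/Cache",
]
path = "OSOpen27700/05/000/000/012/000/000/034.png"
print(pick_url(path, urls) + "/" + path)
```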

I don't know whether it is possible to kid IIS into allowing multiple addresses that in fact point to the same web server (IP address) and file cache. I have demonstrated that OpenLayers does its bit, but it may be that we would have to clone the cache onto two virtual servers with different IP addresses to obtain the true benefits. This would only become necessary once the cache is under heavy load.

Accessing a subset of the resolutions in OpenLayers

Sometimes we want a map that has a tilecached layer, but we do not want all of the resolutions. For example we may want the Spherical Mercator OS Open Data, but start viewing at a UK scale rather than global. Unfortunately you cannot just knock off a few resolutions in the OpenLayers layer definition and expect it to work. OpenLayers equates the first resolution with folder "0", the next with "1", etc. So knocking off a resolution has it looking for the wrong image tiles. There is a neat solution: in your web site set up symbolic links to the real cache folders using mklink, mapping a folder called "0" to the real folder "5" (or whatever is appropriate) - this means you now have a new level 0 at the appropriate scale for the OpenLayers layer. This is not a complete write-up of the solution - but we have implemented this in the Lincolnshire Heritage @ Risk web site (October 2011).

Example: in a folder for the new pseudo tilecache, run a command like this for each resolution:

mklink /D 0 D:\PathToMyCache\MyCacheName\5
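Typing one command per resolution soon gets tedious, so the commands can be generated. This is only a sketch: the function mklink_commands is hypothetical, the cache path is the placeholder from the example above, and the pad argument is an assumption to cover caches whose level folders are zero-padded (for example "05" rather than "5").

```python
def mklink_commands(cache_dir, first_level, last_level, pad=0):
    """Return one mklink command per level, mapping new folders 0, 1, 2...
    onto the real cache folders first_level..last_level."""
    commands = []
    for new_level, real_level in enumerate(range(first_level, last_level + 1)):
        link = str(new_level).zfill(pad)
        target = cache_dir.rstrip("\\") + "\\" + str(real_level).zfill(pad)
        commands.append("mklink /D %s %s" % (link, target))
    return commands


# Map new levels 0..2 onto real levels 5..7
for command in mklink_commands(r"D:\PathToMyCache\MyCacheName", 5, 7):
    print(command)
```

The printed lines can be pasted into a command prompt opened in the folder for the new pseudo tilecache, or saved as a batch file.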

The same technique allows composite caches to be constructed, for example containing one specific dataset that has been cached at one scale with others for other scales.

 


Crispin Flower
https://www.esdm.co.uk/some-notes-about-setting-up-tilecache-for-the-ordnance-survey-open-data crispin.flower@idoxgroup.com (Crispin Flower) Sun, 15 Jan 2012 18:22:00 GMT