"Too many redirects" error when downloading large countries

  • Not a bug

I've encountered a download error when trying to download GeoJSON for all admin levels, land only, with no simplification, for the following countries: Canada, France, and the United States.


https://osm-boundaries.com/Download/Submit?apiKey=102600000402550d1a81e81bd669751e&db=osm20210712&osmIds=-1428125&recursive&format=GeoJSON&srid=4326&landOnly


When accessing the above URL through Python, after a long wait I get "urllib.error.HTTPError: HTTP Error 307: The HTTP server returned a redirect error that would lead to an infinite loop. The last 30x error message was: Temporary Redirect". When opening it directly in Chrome, there's also a long wait, followed by a similar error message about "Too many redirects".


I suspect this is because these are all very large countries and it simply takes too long to prepare the download on the server side. Can this be fixed on your side, or is there a way to work around it on my end?

Pinned replies
Magnus
  • Answer
  • Not a bug

It's correct that it's because they are big/complex, which leads to a long generation time. Since HTTP isn't made for calls lasting several minutes, we have to work around this. Even more so since we have Cloudflare as a reverse proxy in between, and they don't allow responses to take more than 60 seconds.

The workaround used is that the download process goes through these steps:

1. The Submit call creates the background job. The client gets redirected to the Wait URL.

2. The Wait URL. The server sleeps in short intervals while polling for the job. After 20 seconds, if the job hasn't completed, the client is redirected back to the same URL; once it has completed, the client is redirected on to the next URL.

3. The third URL, which serves the finished file.

The proper client-side solution is to allow an unlimited number of redirects, as the curl CLI example in the download form does. You can also choose to abort your client after a timeout of your own; the background job will keep processing on our servers regardless. So if you need to download many files, you can abort after, for example, five redirects and immediately ask for the next job to be queued. Once all jobs are queued, you can request the same URLs again to download the finished files.

Most jobs finish quickly (within a few seconds). Others take longer. When using the "partially ready" databases, certain jobs can take weeks.
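The queue-everything-first workflow can be sketched as a small driver. Here `fetch` is assumed to be any callable taking `(url, max_redirects)` that raises `RuntimeError` when its redirect budget runs out; that interface, and the function below, are hypothetical illustrations rather than part of the site's API.

```python
import time

def queue_then_collect(urls, fetch, queue_hops=5, retry_delay=60):
    """Pass 1 touches every URL with a small redirect budget so each
    background job gets queued; later passes retry the unfinished
    URLs until every file has been downloaded."""
    results = dict.fromkeys(urls)  # url -> file bytes once downloaded
    for url in results:
        try:
            results[url] = fetch(url, max_redirects=queue_hops)
        except RuntimeError:
            pass  # job queued but not finished; collect it later
    while any(data is None for data in results.values()):
        time.sleep(retry_delay)
        for url, data in list(results.items()):
            if data is None:
                try:
                    results[url] = fetch(url, max_redirects=queue_hops)
                except RuntimeError:
                    pass  # still generating; try again next pass
    return results
```

Because the server keeps working on a job after the client disconnects, the retries are cheap: each one either downloads a finished file or gives up after a few 20-second hops.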


Karim Bahgat

Thanks, this makes more sense now: I need to allow more redirects, or just make sure to revisit the download links at a later point. I might also try breaking each download into one administrative level at a time.