A lot of data is missing for Germany

  • updated
  • Not a bug

Good evening!

For Germany, a lot of important data has been missing for several weeks! This is the case as of the data snapshot of 07.08.2023.

Thank you.

Sven

Pinned replies
Magnus
  • Answer
  • Not a bug

Then use another database. We serve the data as it is in OSM. If the data is broken in OSM, it's broken on our end as well. We do not spend any time trying to fix it; that would be financially impossible for us.


OSM always has broken boundary polygons somewhere, that's why there are several databases served.

See Multiple databases under https://osm-boundaries.com/Documentation .

Sven Kasparz

Then the rules for import must be worked on! The relation of the land mass is clear:

this is not a boundary for Germany, only its land_area (see http://wiki.openstreetmap.org/wiki/Relation:boundary)

Therefore also type=multipolygon!

Therefore, no further administrative boundaries may be assigned to this boundary.

All higher admin_levels for Germany always belong to the corresponding border https://www.openstreetmap.org/relation/51477 This has type=boundary, as it should be!

As I see it, this should be generally valid!

Question: why is the data as of 3.7.2023 still correct? You already had the land mass in there before!

Magnus

Relation 62781 didn't exist in planet-230703.osm. So no, the land mass polygon didn't exist then.




Sven Kasparz

I don't believe that at all!

I definitely do not believe that this relation was not included in the Planet!

Relation 62781 exists with version 1 since 1.1.2009 (!)

Relation 62781 currently has version 2426 (25.10.2023) [See History].

Germany appears in your data on 1.1.2014 as relation 51477.

Relation 62781 is not included for 2023:

- 2.1.2023

- 6.2.2023

- 13.3.2023

- 3.4.2023

- 1.5.2023

- 5.6.2023

- 3.7.2023

Shall I look further back? I'll be happy to... :(

For example 3.8.2020: Relation 62781 does not exist in your data! In the OSM data itself it does (see history!).

You have adapted your import routines for the worse and don't want to admit it... I would have many more comments!

In any case, the current state is totally chaotic for me.... Things are thrown together that have nothing to do with each other....

Primary: Land mass has absolutely nothing to do with administrative boundaries. That is the general error!

At the moment https://osm-boundaries.com/ is unusable for me!

Magnus

I can't access the history; I am only getting timeouts. But yes, it seems the polygon is old, very old. Out of the 52 databases we have (52 planet.osm imports), relation 62781 exists in six of them: the three latest, plus osm20140101, osm20120104 and osm20100317.

The reason it doesn't exist in the others is very likely that the relation was invalid at the time: self-intersecting, open-ended and whatnot. With the tool chain we use, we only import valid polygon data. This is also documented on the site. Again, we present what exists in OSM; if it's broken in OSM, it's broken or non-existent on our site.
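To make "valid polygon data" concrete, here is a minimal sketch of the kind of check that would reject an open-ended or self-intersecting ring. It is only an illustration under simplifying assumptions: the function names are hypothetical, coordinates are treated as plain 2D points, and it says nothing about how the site's actual tool chain validates geometry.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True if segments p1-p2 and p3-p4 properly cross (touching is ignored)."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_valid_ring(ring: List[Point]) -> bool:
    """Simplified validity check (hypothetical helper, not the site's code)."""
    # Open-ended: a ring must end where it starts and enclose an area.
    if len(ring) < 4 or ring[0] != ring[-1]:
        return False
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 1, n):
            # Neighbouring edges share an endpoint, so skip them.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if _segments_cross(*edges[i], *edges[j]):
                return False  # self-intersecting
    return True


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bowtie = [(0, 0), (1, 1), (1, 0), (0, 1), (0, 0)]
open_ring = [(0, 0), (1, 0), (1, 1)]
```

In this sketch `square` passes, while `bowtie` (self-intersecting) and `open_ring` (not closed) are rejected. A relation that fails such a check in one planet.osm import and passes in a later one would appear in only some of the databases.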

Still, nothing has changed on our end. In fact, there hasn't been a single code change since August 12th, and that change was just to start tracking the time it takes to calculate some polygons. We haven't introduced this issue. It's a side effect of our design combined with changes and additions in OSM. Our design has worked with almost ten years of planet.osm imports in a row, at least regarding Germany.

Other commits from March onward this year:

  • Updated min/max priorities per worker.
  • Changed color of some web-ui box.
  • Fixed crash dump in worker.

See, no changes that could cause this at all.

It's really annoying when you claim that we are lying with comments like "You have adapted your import routines for the worse and don't want to admit it". Trust me, it doesn't make it more enjoyable to help you with your issues. You didn't start off well either when you proudly linked a German page where you trash-talked our answers. We tried to help you by answering a fairly vague question. I can respect that there might be a language barrier though. But since you didn't specify a specific case (general question), you also got a very general answer. There wasn't a simple way for us to see what you weren't seeing. So I strongly believe that my first answer wasn't to your liking because the question wasn't informative enough.

On the upside I would like to thank you for having more specific replies later on. Things you point out and ask that I can actually look up, to confirm, deny or at least investigate and explain.

This site was primarily built for our own needs. We decided to make it public to give back to the OSM community. We have now provided this service for free for 3.5 years. We keep spending time and resources to add data to it every single month, even though we ourselves seldom update the data we use. In fact, we update small parts of our own data only 2-3 times per year.

Providing this service costs us money every single month. Not a lot, but probably 100-500 EUR/month. The total amount of donations we have received during these 3.5 years is around 500 EUR. As you can see, it's quite a loss for us from a financial perspective. So the wise decision would be to shut it down, to be honest. That is not the plan though.

As I mentioned, there are plans to further develop the site as well, but that will be when we have the time and resources. One of the things that most likely will be implemented is alternative trees, for example it's very likely that we will render a tree with boundary=administrative only in the future. That would solve this case, but it might also remove other polygons that you, or others, are looking for.

I am sorry to hear that the site now is useless for you. But the upside is, it was free to use when it was useful, and it doesn't cost anything to not use either.

I scrolled through the tree in one of the later DBs and I can actually see that Germany isn't alone. It's a bit weird that there are several countries with the issue when none had issues a few months back. I can see the following being affected by similar land mass polygons:

  • Antigua
  • Belgium
  • Germany
  • Guernsey
  • Jersey
  • Lithuania
  • Poland
  • Qatar
  • South Africa

It's weird that it's so many countries all of a sudden. I have a feeling someone took it upon themselves to either add such polygons, or fix currently broken ones, or both. This list is also interesting in that it shows a trend that's causing problems for the site. If not currently, then at least in the future. And obviously currently for you. So thank you for reaching out and being persistent and really making us look into this (I mean it).

We may very well find a method to counter what you complain about, but as I said several replies earlier, I don't know what we could do about it that also makes sense. We are determined not to make specific exceptions based on relation IDs, at least. We build the tree on basic rules: polygons that fit into others and have a higher admin_level. Pretty much as simple as that. And we want to keep it simple. We won't spend time reviewing relations with every new import and manually adjusting them; we do not have the resources for that.
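As a rough illustration of that rule, the sketch below assigns each area a parent that contains it and has a lower admin_level. Everything in it is an assumption made for the example: the names are hypothetical, containment is approximated with bounding boxes rather than real polygon geometry, and picking the smallest candidate as the parent is a guess, not a description of the site's actual code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # min_x, min_y, max_x, max_y


@dataclass
class Area:
    osm_id: int       # hypothetical IDs, not real OSM relations
    admin_level: int
    bbox: BBox        # stand-in for the real polygon geometry


def contains(outer: BBox, inner: BBox) -> bool:
    """True if the inner box lies completely within the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def bbox_size(box: BBox) -> float:
    return (box[2] - box[0]) * (box[3] - box[1])


def build_tree(areas: List[Area]) -> Dict[int, Optional[int]]:
    """Map each osm_id to its parent's osm_id (None for roots)."""
    parents: Dict[int, Optional[int]] = {}
    for child in areas:
        candidates = [
            a for a in areas
            if a.osm_id != child.osm_id
            and a.admin_level < child.admin_level   # child has the higher level
            and contains(a.bbox, child.bbox)        # child fits into the parent
        ]
        # Assumption: the tightest enclosing candidate becomes the parent.
        parent = min(candidates, key=lambda a: bbox_size(a.bbox), default=None)
        parents[child.osm_id] = parent.osm_id if parent else None
    return parents


areas = [
    Area(osm_id=1, admin_level=2, bbox=(0, 0, 10, 10)),
    Area(osm_id=2, admin_level=4, bbox=(1, 1, 5, 5)),
    Area(osm_id=3, admin_level=6, bbox=(2, 2, 3, 3)),
]
tree = build_tree(areas)
```

With purely geometric rules like these, any newly valid polygon that happens to enclose other areas can become their parent, whatever it represents. The rule has no per-relation exceptions by design.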

Whatever we do should fit into the site as a whole, and work for the vast majority of the OSM-world. The change that makes most sense to me right now is what's already planned, alternative trees. But I also know that won't happen this year.

The most urgent thing we need to fix this year is the upgrade to OSM's new OAuth system. The one we use is deprecated and will stop working two months from now.

All in all this isn't something that will be solved in a few days. At least if we aren't so lucky that we suddenly get a very good idea of how to solve it easily.

Sven Kasparz

my concluding words:

You want to present as much and everything as possible on borders. Good. But you are losing sight of the complexity of all the borders!

...I have already written that land mass has nothing to do with administrative borders, land mass is not a boundary=administrative!!!!

There is a lack of structure...

For example, there is a lack of primary distinction between purely administrative borders (boundary=administrative) and protected area borders...

For example, in your data relation https://www.openstreetmap.org/relation/4763316 is assigned to relation https://www.openstreetmap.org/relation/1388880. This may be correct in terms of location, but in terms of content it is total nonsense. Relation 4763316 is protect_class=4 and at most assigned to admin_level=4 (Brandenburg), but this is also only partly true. It is actually a completely separate area... That's how it runs through all the data!

Please concentrate primarily on the administrative limits, make sure that the data imports always work properly and make sure that the import intervals are shorter: for example, at least once a week!

This will help OSM a lot more!

Sven Kasparz

Oh...

...still forgotten...

If you have further questions, don't be afraid to ask them at https://community.openstreetmap.org/ ! You know that we here in Germany have a lot of knowledge about this...

corresponding answers are guaranteed at https://community.openstreetmap.org/c/communities/de/56

Magnus

The original idea was to use boundary=administrative. But since the site we develop, that uses data from OSM-Boundaries, uses data for all countries in the world, not just Germany, we also were hit by reality. The quality of the data in OSM isn't perfect. We needed data that wasn't properly tagged, and therefore had to add other polygon data as well. Again, the site is built upon our need primarily. We are just providing what we built, for free.

If we only show data that has boundary=administrative, the site might be useful for you, but it would then be useless to us. We need a lot of polygon data that isn't tagged like that. For various reasons, but most often because countries less developed than Germany don't have data that is as reliable.

But it sounds like what you want is what I have suggested several times. A different tree, which only shows boundary=administrative.
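In the simplest terms, such a tree would start from a tag filter like the one sketched here. The relation records, IDs and the `administrative_only` helper are all hypothetical, and the sketch only shows the filtering idea, not how an alternative tree would be built or served.

```python
from typing import Dict, List


def administrative_only(relations: List[Dict]) -> List[Dict]:
    """Keep only relations tagged boundary=administrative (hypothetical helper)."""
    return [
        rel for rel in relations
        if rel.get("tags", {}).get("boundary") == "administrative"
    ]


# Made-up sample records; the IDs do not refer to real OSM relations.
relations = [
    {"id": 101, "tags": {"type": "boundary", "boundary": "administrative", "admin_level": "2"}},
    {"id": 102, "tags": {"type": "multipolygon"}},  # e.g. a land mass polygon
    {"id": 103, "tags": {"boundary": "protected_area", "protect_class": "4"}},
    {"id": 104, "tags": {"type": "boundary", "boundary": "administrative", "admin_level": "4"}},
]

admin_ids = [rel["id"] for rel in administrative_only(relations)]
```

The trade-off is the one described above: the filter drops land mass and protected-area polygons, but it also drops any boundary that is simply not tagged properly in OSM.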

And again, for the last time, the data imports work properly. The imports work exactly as planned and designed. You just don't seem to understand the rules that we set up, and why they are as they are. The imports work. And the data is as we wished it to be. We just wish that the OSM data was less of a Wild West.

And about an import every week. Sorry, that most likely won't happen. I don't think you appreciate the amount of data we process here. We spend 3-8 weeks of CPU power for every import we do. If you have used the site a lot, you should have noticed, and read about, the processing time we need. This also tells us that it's practically impossible for us to do imports more often than we do. In fact, it might be that we have to do it less often, not least considering how planet.osm grows in size over time. This can of course change if someone chooses to fund the associated costs. If someone were interested in that, we would have to look into it, but it would most likely be north of 1000 EUR/month.

Sven Kasparz

Offer to share information and correlate data.

...I would still like to make you this offer... But please do it in the forum: https://community.openstreetmap.org


This makes it easier for me to write directly in German and, above all, it involves a much larger circle of interested parties! (this is extremely important here!)


I have also looked around in neighbouring countries... It's similar there. In Poland, for example, there is a very strong allocation to churches, recognisable in your data by deanery=Dekanat* Example: https://www.openstreetmap.org/relation/15910987

In your data, this is assigned in a wide variety of places!

...sometimes directly, sometimes at different levels of the administrative structure!

Another example is that you assign protected areas to administrative borders in a completely arbitrary way. I don't have to look at Germany, that's just the way it is in neighbouring countries.

I only want data that can be used and structured to the best of our knowledge. If it takes a while, I don't care.

Sven

johannes

Thanks for all the work you are doing to maintain this service!

We ran into the same issue, especially with the boundaries in Germany and Poland. In our further processing we rely quite heavily on the parents part of the data, so unfortunately we can't solve our problem by just downloading more data. Currently, the download selection for Germany, admin levels 2 to 8, gives us this URL: https://osm-boundaries.com/Download/Submit?apiKey=…&db=osm20230904&osmIds=-51477&recursive&minAdminLevel=2&maxAdminLevel=8&format=GeoJSON&srid=4326. This still recursively gives us only the boundaries that are within Germany in the tree (not the land mass relation), which means we get only 562 features instead of the 12532 features we got in the April database.
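For illustration, a request URL of that shape could be assembled and its result counted roughly as follows. The parameter names are taken from the URL above; the helper functions and the `YOUR_API_KEY` placeholder are hypothetical, and nothing here is checked against the official API documentation.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://osm-boundaries.com/Download/Submit"


def build_download_url(api_key: str, db: str, osm_id: int,
                       min_admin_level: int, max_admin_level: int,
                       recursive: bool = True,
                       fmt: str = "GeoJSON", srid: int = 4326) -> str:
    """Assemble a download URL in the same parameter order as the one quoted above."""
    query = urlencode([("apiKey", api_key), ("db", db), ("osmIds", osm_id)])
    if recursive:
        query += "&recursive"  # value-less flag, as it appears in the URL
    query += "&" + urlencode([
        ("minAdminLevel", min_admin_level),
        ("maxAdminLevel", max_admin_level),
        ("format", fmt),
        ("srid", srid),
    ])
    return f"{BASE_URL}?{query}"


def count_features(geojson_text: str) -> int:
    """Count the features in a GeoJSON FeatureCollection given as text."""
    return len(json.loads(geojson_text).get("features", []))


# YOUR_API_KEY is a placeholder; -51477 is the Germany relation from the URL above.
url = build_download_url("YOUR_API_KEY", "osm20230904", -51477, 2, 8)
sample = '{"type": "FeatureCollection", "features": [{"type": "Feature"}, {"type": "Feature"}]}'
sample_count = count_features(sample)
```

Running `count_features` on the downloads from two different databases is a quick way to spot a drop in the number of returned boundaries.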

For us it would be very helpful if there were a version that only considers boundary=administrative features in OSM, as already suggested earlier in the thread. If I understand it correctly, it would solve all our issues, as we only use these boundaries in the later processing anyway.


Is there any way we could contribute to achieve this goal? As far as I know the code is not open source (yet).

Regards

Johannes

[Screenshot: how the aforementioned API URL was created]

Magnus

No, the code is closed source, and for now it will be kept that way. Therefore there aren't any real ways to contribute to this.

It may sound like a small task, but it's a bit of a fundamental change to provide alternative tree views. One could wish that we had had the idea from the beginning; then it would be easier now. It is planned, but it hasn't had any time frame at all. Now with this issue there is at least a reason to prioritize it more. But I still wouldn't expect it to happen this year. As mentioned earlier, this site is a loss from a financial perspective, and it's therefore very hard for us to prioritize it over other tasks that we have.