Result from @IHaveSweaters - "we expose and analyze a generic inability of CNNs to transform spatial representations between two different types: coordinates in (i, j) Cartesian space and coordinates in one-hot pixel space" https://eng.uber.com/coordconv/
Thanks! Appreciate the pointer! In this case I guess the results seem mathematically obvious (“why did they expect anything different? did they not understand ‘translational invariance’?”) but possibly I missed the point…