Discussion:
[QGIS-Developer] Whatever is wrong with QGIS 3.x SQL server driver
Bo Victor Thomsen
2018-12-06 09:05:00 UTC
Permalink
Hi list -



I've done some experiments with a dataset consisting of 440000 rows and
uploaded this to two database servers: Postgres and SQLServer. Both tables
has indexes on Primary key and the spatial column.



And then connected to both tables in QGIS. The SQL server is *3 times*
slower in retrieving the dataset than Postgres in QGIS!



And it's not SQL server's fault!! I've made the same experiment using
GeoServer as a client where Postgres is only slightly faster.



To make matters worse... In QGIS you can connect to the SQL server database
using the "Layer" > "Add Layer" > "Add Vector layer" - i.e. using the OGR
driver - to connect to the SQL Server table. This makes the retrieval of
SQL Server data nearly as fast as Postgres.



I have to conclude, that the native QGIS SQLServer driver is really, really
bad. What's the matter with this driver?



I recollect, doing the same experiment in QGIS 2.x, that a simple fetch of
SQL server data was nearly as fast as retrieving data from a comparable
Postgres database server (I’m not talking about complex spatial queries,
where Postgres blows SQL Server totally out of the water, but simple
retrieval of layer data using a MBR search)



Unfortunately, I can't just switch to the OGR driver since I have some
write issues with this driver.



What gives ???
--
Med venlig hilsen



Bo Victor Thomsen
Nyall Dawson
2018-12-06 09:17:29 UTC
Permalink
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows and uploaded this to two database servers: Postgres and SQLServer. Both tables has indexes on Primary key and the spatial column.
And then connected to both tables in QGIS. The SQL server is 3 times slower in retrieving the dataset than Postgres in QGIS!
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.

If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".

Let me know if this helps at all

Nyall
Bo Victor Thomsen
2018-12-06 10:05:16 UTC
Permalink
Hi Nyall -



Thanks for taking your time answering me :-)



I've tried switching the validity check off as described. As far as I can
measure, there is no time difference with or without the validity check.
When does the validity check kick in? Writing or reading the features? Or
both?



And the validity check doesn't explain the obvious time difference between
the OGR driver and the native QGIS driver for SQL Server



However, I *will* use your explanation about SQL Server's behavior
regarding invalid geometries as an argument for my customers to switch to
Postgres instead of using SQLServer :-)
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows and
uploaded this to two database servers: Postgres and SQLServer. Both tables
has indexes on Primary key and the spatial column.
Post by Bo Victor Thomsen
And then connected to both tables in QGIS. The SQL server is 3 times
slower in retrieving the dataset than Postgres in QGIS!
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen

Bo Victor Thomsen
Nyall Dawson
2018-12-06 10:16:58 UTC
Permalink
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I can measure, there is no time difference with or without the validity check. When does the validity check kick in? Writing or reading the features? Or both?
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
And the validity check doesn't explain the obvious time difference between the OGR driver and the native QGIS driver for SQL Server
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
However, I will use your explanation about SQL Server's behavior regarding invalid geometries as an argument for my customers to switch to Postgres instead of using SQLServer :-)
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.

Nyall
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows and uploaded this to two database servers: Postgres and SQLServer. Both tables has indexes on Primary key and the spatial column.
And then connected to both tables in QGIS. The SQL server is 3 times slower in retrieving the dataset than Postgres in QGIS!
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
Bo Victor Thomsen
2018-12-06 10:34:30 UTC
Permalink
Hi Nyall -

I'm running QGIS 3.4.2 on Windows ver.10.



Have there been any other changes to the SQLServer driver besides the
validity check? (I remember vaguely something about the internal
representation of spatial objects in the driver)

I'm asking, because I've done this type of testing QGIS 2.x before where
the time difference between Postgres and SQL Server was relatively small
when doing simple MBR based searches - somewhere in the vicinity of 20%


I would happily ditch MS SQLServer forever for spatial work and mainly use
Postgres. However, my customers have a different opinion :-(
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I
can measure, there is no time difference with or without the validity
check. When does the validity check kick in? Writing or reading the
features? Or both?
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
Post by Bo Victor Thomsen
And the validity check doesn't explain the obvious time difference
between the OGR driver and the native QGIS driver for SQL Server
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
Post by Bo Victor Thomsen
However, I will use your explanation about SQL Server's behavior
regarding invalid geometries as an argument for my customers to switch to
Postgres instead of using SQLServer :-)
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 10.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows
and uploaded this to two database servers: Postgres and SQLServer. Both
tables has indexes on Primary key and the spatial column.
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
And then connected to both tables in QGIS. The SQL server is 3 times
slower in retrieving the dataset than Postgres in QGIS!
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen

Bo Victor Thomsen
Nyall Dawson
2018-12-06 10:54:19 UTC
Permalink
On Thu, 6 Dec 2018 at 20:34, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi Nyall -
I'm running QGIS 3.4.2 on Windows ver.10.
Have there been any other changes to the SQLServer driver besides the validity check? (I remember vaguely something about the internal representation of spatial objects in the driver)
No, nothing that would explain this. Just minor bug fixing and the port to Qt 5.

I wonder if you could log the queries coming from QGIS and see if you
can identify any changes from 2.18?

Nyall
Post by Bo Victor Thomsen
I'm asking, because I've done this type of testing QGIS 2.x before where the time difference between Postgres and SQL Server was relatively small when doing simple MBR based searches - somewhere in the vicinity of 20%
I would happily ditch MS SQLServer forever for spatial work and mainly use Postgres. However, my customers have a different opinion :-(
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I can measure, there is no time difference with or without the validity check. When does the validity check kick in? Writing or reading the features? Or both?
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
And the validity check doesn't explain the obvious time difference between the OGR driver and the native QGIS driver for SQL Server
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
However, I will use your explanation about SQL Server's behavior regarding invalid geometries as an argument for my customers to switch to Postgres instead of using SQLServer :-)
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows and uploaded this to two database servers: Postgres and SQLServer. Both tables has indexes on Primary key and the spatial column.
And then connected to both tables in QGIS. The SQL server is 3 times slower in retrieving the dataset than Postgres in QGIS!
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
Bo Victor Thomsen
2018-12-06 11:14:15 UTC
Permalink
Ok, I'll take a look and monitor the queries from QGIS 2 and 3 and report
back. It will take a couple of days

And thanks Nyall
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:34, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi Nyall -
I'm running QGIS 3.4.2 on Windows ver.10.
Have there been any other changes to the SQLServer driver besides the
validity check? (I remember vaguely something about the internal
representation of spatial objects in the driver)
No, nothing that would explain this. Just minor bug fixing and the port to Qt 5.
I wonder if you could log the queries coming from QGIS and see if you
can identify any changes from 2.18?
Nyall
Post by Bo Victor Thomsen
I'm asking, because I've done this type of testing QGIS 2.x before where
the time difference between Postgres and SQL Server was relatively small
when doing simple MBR based searches - somewhere in the vicinity of 20%
Post by Bo Victor Thomsen
I would happily ditch MS SQLServer forever for spatial work and mainly
use Postgres. However, my customers have a different opinion :-(
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 11.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I
can measure, there is no time difference with or without the validity
check. When does the validity check kick in? Writing or reading the
features? Or both?
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
Post by Bo Victor Thomsen
And the validity check doesn't explain the obvious time difference
between the OGR driver and the native QGIS driver for SQL Server
Post by Bo Victor Thomsen
Post by Nyall Dawson
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
Post by Bo Victor Thomsen
However, I will use your explanation about SQL Server's behavior
regarding invalid geometries as an argument for my customers to switch to
Postgres instead of using SQLServer :-)
Post by Bo Victor Thomsen
Post by Nyall Dawson
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 10.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000
rows and uploaded this to two database servers: Postgres and SQLServer.
Both tables has indexes on Primary key and the spatial column.
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
And then connected to both tables in QGIS. The SQL server is 3
times slower in retrieving the dataset than Postgres in QGIS!
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's
*no
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server
connection's
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen

Bo Victor Thomsen
Bo Victor Thomsen
2018-12-06 14:13:07 UTC
Permalink
Hi Nyall - Back again ..

The 3 different sql queries from QGIS.....

-- QGIS 3, native driver
SELECT [cell100_key],[geom_flade] FROM [dbo].[cells3] WHERE
[geom_flade].STIsValid() = 1 AND
[geom_flade].Filter([geometry]::STGeomFromText('POLYGON((823546.46544876880943775
6100000, 936453.53455123119056225 6100000, 936453.53455123119056225
6150000, 823546.46544876880943775 6150000, 823546.46544876880943775
6100000))',25832)) = 1

-- QGIS 3 OGR driver
select [cell100_key], [geom_flade] from [dbo].[cells3] where
[geom_flade].STIntersects(geometry::STGeomFromText('POLYGON((823546.465448769
6100000,936453.534551231 6100000,936453.534551231 6150000,823546.465448769
6150000,823546.465448769 6100000))',25832)) = 1

-- QGIS 2, native driver
SELECT [cell100_key],[geom_flade] FROM [dbo].[cells3] where
[geom_flade].STIsValid() = 1 AND
[geom_flade].Filter([geometry]::STGeomFromText('POLYGON((826739.13043478259351104
6100000, 933260.86956521740648896 6100000, 933260.86956521740648896
6150000, 826739.13043478259351104 6150000, 826739.13043478259351104
6100000))',25832)) = 1

There is not much difference in the SQL Server execution times for the 3
queries (around 200 - 300 ms) , so that can't explain the time difference
in QGIS

QGIS 2 with native driver and QGIS 3 with ogr driver uses appr. the same
time showing the map.
QGIS 3 with native driver uses appr. twice the time to show the results on
the map.
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:34, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi Nyall -
I'm running QGIS 3.4.2 on Windows ver.10.
Have there been any other changes to the SQLServer driver besides the
validity check? (I remember vaguely something about the internal
representation of spatial objects in the driver)
No, nothing that would explain this. Just minor bug fixing and the port to Qt 5.
I wonder if you could log the queries coming from QGIS and see if you
can identify any changes from 2.18?
Nyall
Post by Bo Victor Thomsen
I'm asking, because I've done this type of testing QGIS 2.x before where
the time difference between Postgres and SQL Server was relatively small
when doing simple MBR based searches - somewhere in the vicinity of 20%
Post by Bo Victor Thomsen
I would happily ditch MS SQLServer forever for spatial work and mainly
use Postgres. However, my customers have a different opinion :-(
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 11.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I
can measure, there is no time difference with or without the validity
check. When does the validity check kick in? Writing or reading the
features? Or both?
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
Post by Bo Victor Thomsen
And the validity check doesn't explain the obvious time difference
between the OGR driver and the native QGIS driver for SQL Server
Post by Bo Victor Thomsen
Post by Nyall Dawson
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
Post by Bo Victor Thomsen
However, I will use your explanation about SQL Server's behavior
regarding invalid geometries as an argument for my customers to switch to
Postgres instead of using SQLServer :-)
Post by Bo Victor Thomsen
Post by Nyall Dawson
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 10.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000
rows and uploaded this to two database servers: Postgres and SQLServer.
Both tables has indexes on Primary key and the spatial column.
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
And then connected to both tables in QGIS. The SQL server is 3
times slower in retrieving the dataset than Postgres in QGIS!
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's
*no
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server
connection's
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen

Bo Victor Thomsen
Nyall Dawson
2018-12-06 23:09:15 UTC
Permalink
On Fri, 7 Dec 2018 at 00:13, Bo Victor Thomsen
There is not much difference in the SQL Server execution times for the 3 queries (around 200 - 300 ms) , so that can't explain the time difference in QGIS
QGIS 2 with native driver and QGIS 3 with ogr driver uses appr. the same time showing the map.
QGIS 3 with native driver uses appr. twice the time to show the results on the map.
Ok, thanks. In that case it appears as though this regression was
probably caused by the move to Qt5 or somewhere in the QGIS 3.0
refactoring (as opposed to a deliberate change)

Next things to try:

1. Download debugview
(https://docs.microsoft.com/en-us/sysinternals/downloads/debugview),
start it, and then open QGIS. Test your layer and see if there's any
relevant console output which may explain things

2. Run some little PyQGIS scripts to help narrow down where the issue
is. I.e. run these scripts on a large layer with the different
versions and time their execution:

# fetch attributes only, no geometry
req = QgsFeatureRequest().setFlags(QgsFeatureRequest.NoGeometry)
for f in iface.activeLayer().getFeatures(req):
pass

# fetch geometry only, no attributes
req = QgsFeatureRequest().setSubsetOfAttributes([])
for f in iface.activeLayer().getFeatures(req):
pass

This will help determine whether the regression is in the geometry
parsing, attribute parsing, or somewhere else...

Nyall
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:34, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi Nyall -
I'm running QGIS 3.4.2 on Windows ver.10.
Have there been any other changes to the SQLServer driver besides the validity check? (I remember vaguely something about the internal representation of spatial objects in the driver)
No, nothing that would explain this. Just minor bug fixing and the port to Qt 5.
I wonder if you could log the queries coming from QGIS and see if you
can identify any changes from 2.18?
Nyall
Post by Bo Victor Thomsen
I'm asking, because I've done this type of testing QGIS 2.x before where the time difference between Postgres and SQL Server was relatively small when doing simple MBR based searches - somewhere in the vicinity of 20%
I would happily ditch MS SQLServer forever for spatial work and mainly use Postgres. However, my customers have a different opinion :-(
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
I've tried switching the validity check off as described. As far as I can measure, there is no time difference with or without the validity check. When does the validity check kick in? Writing or reading the features? Or both?
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
And the validity check doesn't explain the obvious time difference between the OGR driver and the native QGIS driver for SQL Server
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
However, I will use your explanation about SQL Server's behavior regarding invalid geometries as an argument for my customers to switch to Postgres instead of using SQLServer :-)
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If you
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000 rows and uploaded this to two database servers: Postgres and SQLServer. Both tables has indexes on Primary key and the spatial column.
And then connected to both tables in QGIS. The SQL server is 3 times slower in retrieving the dataset than Postgres in QGIS!
It's probably the extra validity checks which were added. SQL Server
itself is broken by design when it comes to spatial data handling and
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But there's *no
way* for QGIS to detect when this occurs! Accordingly QGIS takes the
"safer is better" approach and forces a validity check and make valid
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries (and
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your tables,
you'll get data loss. The setting is under the SQL Server connection's
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
Bo Victor Thomsen
2018-12-07 07:19:27 UTC
Permalink
Thanks Nyall -

"I'll be back !" - in about a week or so
Post by Nyall Dawson
On Fri, 7 Dec 2018 at 00:13, Bo Victor Thomsen
Post by Bo Victor Thomsen
There is not much difference in the SQL Server execution times for the 3
queries (around 200 - 300 ms) , so that can't explain the time difference
in QGIS
Post by Bo Victor Thomsen
QGIS 2 with native driver and QGIS 3 with ogr driver uses appr. the same
time showing the map.
Post by Bo Victor Thomsen
QGIS 3 with native driver uses appr. twice the time to show the results
on the map.
Ok, thanks. In that case it appears as though this regression was
probably caused by the move to Qt5 or somewhere in the QGIS 3.0
refactoring (as opposed to a deliberate change)
1. Download debugview
(https://docs.microsoft.com/en-us/sysinternals/downloads/debugview),
start it, and then open QGIS. Test your layer and see if there's any
relevant console output which may explain things
2. Run some little PyQGIS scripts to help narrow down where the issue
is. I.e. run these scripts on a large layer with the different
# fetch attributes only, no geometry
req = QgsFeatureRequest().setFlags(QgsFeatureRequest.NoGeometry)
pass
# fetch geometry only, no attributes
req = QgsFeatureRequest().setSubsetOfAttributes([])
pass
This will help determine whether the regression is in the geometry
parsing, attribute parsing, or somewhere else...
Nyall
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 11.54 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:34, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi Nyall -
I'm running QGIS 3.4.2 on Windows ver.10.
Have there been any other changes to the SQLServer driver besides the
validity check? (I remember vaguely something about the internal
representation of spatial objects in the driver)
Post by Bo Victor Thomsen
Post by Nyall Dawson
No, nothing that would explain this. Just minor bug fixing and the port
to Qt 5.
Post by Bo Victor Thomsen
Post by Nyall Dawson
I wonder if you could log the queries coming from QGIS and see if you
can identify any changes from 2.18?
Nyall
Post by Bo Victor Thomsen
I'm asking, because I've done this type of testing QGIS 2.x before
where the time difference between Postgres and SQL Server was relatively
small when doing simple MBR based searches - somewhere in the vicinity of
20%
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
I would happily ditch MS SQLServer forever for spatial work and
mainly use Postgres. However, my customers have a different opinion :-(
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 11.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 20:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
I've tried switching the validity check off as described. As far
as I can measure, there is no time difference with or without the validity
check. When does the validity check kick in? Writing or reading the
features? Or both?
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's on read. Writing always uses a make valid call for SQL Server to
try to avoid triggering the issue.
Post by Bo Victor Thomsen
And the validity check doesn't explain the obvious time difference
between the OGR driver and the native QGIS driver for SQL Server
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Well, it would if OGR wasn't doing this check by default. What
platform are you connecting from? Windows or Linux?
Post by Bo Victor Thomsen
However, I will use your explanation about SQL Server's behavior
regarding invalid geometries as an argument for my customers to switch to
Postgres instead of using SQLServer :-)
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
There's also these points: https://www.pg-versus-ms.com/ (I think I
could write as much again on the spatial side of things alone.) If
you
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
have a choice, Postgres is far superior in so many ways.
Nyall
Post by Bo Victor Thomsen
Den tor. 6. dec. 2018 kl. 10.17 skrev Nyall Dawson <
Post by Nyall Dawson
On Thu, 6 Dec 2018 at 19:05, Bo Victor Thomsen
Post by Bo Victor Thomsen
Hi list -
I've done some experiments with a dataset consisting of 440000
rows and uploaded this to two database servers: Postgres and SQLServer.
Both tables has indexes on Primary key and the spatial column.
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
And then connected to both tables in QGIS. The SQL server is 3
times slower in retrieving the dataset than Postgres in QGIS!
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
It's probably the extra validity checks which were added. SQL
Server
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
itself is broken by design when it comes to spatial data handling
and
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
if it encounters an invalid geometry it will silently abort the
request and you'll be missing features from the layer. But
there's *no
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
way* for QGIS to detect when this occurs! Accordingly QGIS takes
the
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
"safer is better" approach and forces a validity check and make
valid
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
step as part of the queries sent to SQL Server. This avoids the
potentially missing features, but comes at a large cost.
If you're 100% sure that your tables have no invalid geometries
(and
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
never will have any!), you *can* switch this check off. But be
warned... if you ever introduce invalid geometries into your
tables,
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
you'll get data loss. The setting is under the SQL Server
connection's
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
Post by Bo Victor Thomsen
Post by Nyall Dawson
properties -- "skip invalid geometry handling".
Let me know if this helps at all
Nyall
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen
Bo Victor Thomsen
--
Med venlig hilsen

Bo Victor Thomsen
Loading...