sql dwh problem: updating

I have several SQL Mobile devices (WM5) syncronizing with a SQL 2005 database through merge replication. We are using the HOST_NAME parameter to filter several tables. A few of these filters use a sub-query to filter the data because of the complexity of the filter (need to do more than join to one table). The subquery will use the HOST_NAME parameter. For example, I want to get a list of all users in the current user's company I need to use host_name to get the current user, find their company ID, then filter the user table on that company ID.

Everything works fine until a user is removed from the table. In that case, the delete statement is not replicated to the device until they reinitialize the mobile database.

Thanks for the help!!

Mike

it's unclear what the problem is without knowing what your filter clause looks like. Could you script out your publication, article and any merge/join filter clauses?|||

One of the affected pieces consists of 2 tables. A USER_LOCATION table that contains the locations a user want to see and a LOCATION table that contains all locations (see script below). I filter the USER_LOCATION table with the following filter: WHERE cast(person_id as varchar(36)) = host_name(). I then use a join to extend the filter on the location_id to the LOCATION table. On the device, I do not use the USER_LOCATION table as it is really just used to filter the proper locations for the user. I have also tried a filter on the LOCATION table directly using a subquery (so that the user_location table never makes it to the device). The same problem still shows up: If a user removes a location from their list from our website (used to configure their settings, which are replicated to the device), the next sync does not delete the record from the device. In fact, the only way to get the delete to take is to reinitialize the database on the device.

Tables:

CREATE TABLE [dbo].[USER_LOCATION](

[user_location_id] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF_USER_LOCATION_user_location_id] DEFAULT (newsequentialid()),

[user_id] [uniqueidentifier] NOT NULL,

[location_id] [uniqueidentifier] NOT NULL)

CREATE TABLE [dbo].[LOCATION](

[location_id] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF__LOCATION__locati__48CFD27E] DEFAULT (newsequentialid()),

[name] [varchar](200) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

[abbreviation] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL)

|||

Mike, it would be much easier if you scripted your publication, articles (maybe just for USER_LOCATION and LOCATION for now) and merge filter joins, can you do that please?

Just to be clear, when a row from LOCATION is deleted at the publisher, the delete isn't replicated down to the subscriber, correct? That would mean there's nonconvergence, and validation would fail. Could you verify that validation for the given subscriber fails?

|||

Actually, we just updated some of our code and upgraded to SP1 and the problem seems to be gone. Hopefully we won't see it in the future. Thanks for the help!

Mike

|||Hello,
I am facing the same problem. I have dynamic filters on most tables and the rows dont get deleted from the subscriber once they cease to satisfy the filter conditions.
HEre is a scenario: A Dynamic filter checks for suser_sname() and based on that(and a few other conditions) sends rows to the subscriber. when the data in the column used to compare the suser_sname() value changes at the publisher , the related row automatically got deleted from the subscriber. Recently this has stopped happening. Validation of the subscription also fails. Mike- What did you upgrade to SP1? We are running off SQL Server2005 SP2? thanks in advance. Rocky

Hi,

Currently we're a building a metadatadriven datawarehouse in SQL
Server 2000. We're investigating the possibility of the updating
tables with enormeous number of updates and insert and the use of
checkpoints (for simple recovery and Backup Log for full recovery).

On several website people speak about full transaction log and the
pace of growing can't keep up with the update. Therefore we want to
create a script which flushes the dirty pages to the disk. It's not
quite clear to me how it works. Questions we have is:
* How does the process of updating, insert and deleting works with SQL
Server 2000 with respect to log cache, log file, buffer cache, commit,
checkpoint, etc?
What happens when?
* As far as i can see now: i'm thinking of creating chunks of data of
1000 records with a checkpoint after the Query. SQL server has the
default of implicit transactions and so it will not need a commit.
Something like this?
* How do i create chunks of 1000 records automatically without
creating a identity field or something. Is there something like SELECT
NEXT 1000?

Greetz,

Hennie> * How does the process of updating, insert and deleting works with SQL
> Server 2000 with respect to log cache, log file, buffer cache, commit,
> checkpoint, etc?
> What happens when?

There's a discussion of this in the Books Online
<architec.chm::/8_ar_sa_8unn.htm>. In summary, all data modifications are
written to both the buffer cache and transaction log buffer cache. The log
buffer is written first. Modified buffer cache pages are written to disk by
either worker threads, the lazy writer or the checkpoint process. Worker
threads write data when they need to wait for a read i/o anyway. The lazy
writer process wakes up periodically and writes data in order to maintain a
reasonable number of free cache buffers. The checkpoint process writes all
dirty pages to disk periodically (about a minute by default) in order to
reduce startup recovery time.

Log buffers are written as they become full, when you issue a COMMIT and
during a checkpoint.

> * As far as i can see now: i'm thinking of creating chunks of data of
> 1000 records with a checkpoint after the Query. SQL server has the
> default of implicit transactions and so it will not need a commit.
> Something like this?
> * How do i create chunks of 1000 records automatically without
> creating a identity field or something. Is there something like SELECT
> NEXT 1000?

I'm not sure I understand what you are asking here. In the default
autocommit mode, each statement is a separate transaction.

It seems you are mostly concerned with transaction log space management.
The main consideration is to keep your transactions reasonably sized since
only committed data can be removed from the log during a log backup (FULL or
BULK_LOGGED recovery model) or checkpoint (SIMPLE model).

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Hennie de Nooijer" <hdenooijer@.hotmail.com> wrote in message
news:191115aa.0402090341.23275aae@.posting.google.c om...
> Hi,
> Currently we're a building a metadatadriven datawarehouse in SQL
> Server 2000. We're investigating the possibility of the updating
> tables with enormeous number of updates and insert and the use of
> checkpoints (for simple recovery and Backup Log for full recovery).
> On several website people speak about full transaction log and the
> pace of growing can't keep up with the update. Therefore we want to
> create a script which flushes the dirty pages to the disk. It's not
> quite clear to me how it works. Questions we have is:
> * How does the process of updating, insert and deleting works with SQL
> Server 2000 with respect to log cache, log file, buffer cache, commit,
> checkpoint, etc?
> What happens when?
> * As far as i can see now: i'm thinking of creating chunks of data of
> 1000 records with a checkpoint after the Query. SQL server has the
> default of implicit transactions and so it will not need a commit.
> Something like this?
> * How do i create chunks of 1000 records automatically without
> creating a identity field or something. Is there something like SELECT
> NEXT 1000?
> Greetz,
> Hennie|||Hennie de Nooijer (hdenooijer@.hotmail.com) writes:
> * How does the process of updating, insert and deleting works with SQL
> Server 2000 with respect to log cache, log file, buffer cache, commit,
> checkpoint, etc?
> What happens when?
> * As far as i can see now: i'm thinking of creating chunks of data of
> 1000 records with a checkpoint after the Query. SQL server has the
> default of implicit transactions and so it will not need a commit.
> Something like this?
> * How do i create chunks of 1000 records automatically without
> creating a identity field or something. Is there something like SELECT
> NEXT 1000?

I hope Dan's reply shed some light on how the log works.

As for breaking up in chucks there is nothing built-in like:

UPDATE tbl
SET yadayada
...
WITH COMMIT EACH 1000 ROW

For DML statements you will have to find out how to do the batching best
from your data. Note there is no reason to keep a strict batch size, but
if you have some column that divides the data nicely that can be used.
For instance, if you are handling order data, you can take one day or
month at a time.

The one case where you can tell SQL Server to commit by each n row is
when you bulk load with BCP or BULK INSERT.

I also like to point out that selecting the batch size, is a trade-off
between log size and speed. Particularly if the search condition to
identify a batch takes time to execute, too small batch sizes can be
costly. A little depending on row size, but 100000 rows at time is not
an unreasonable number. 1000 is by far to small.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||> > * How does the process of updating, insert and deleting works with SQL
> > Server 2000 with respect to log cache, log file, buffer cache, commit,
> > checkpoint, etc?
> > What happens when?
> There's a discussion of this in the Books Online
> <architec.chm::/8_ar_sa_8unn.htm>.
How does this works?

> In summary, all data modifications are
> written to both the buffer cache and transaction log buffer cache. The log
> buffer is written first. Modified buffer cache pages are written to disk by
> either worker threads, the lazy writer or the checkpoint process. Worker
> threads write data when they need to wait for a read i/o anyway. The lazy
> writer process wakes up periodically and writes data in order to maintain a
> reasonable number of free cache buffers. The checkpoint process writes all
> dirty pages to disk periodically (about a minute by default) in order to
> reduce startup recovery time.
Ok clear..

> Log buffers are written as they become full, when you issue a COMMIT and
> during a checkpoint.
It thought that a commit is not neccessary a write to the disk. A
commit is just a end mark of a transaction as far i can see in the
documentation. A checkpoint will write the transactions to the
database.

> > * As far as i can see now: i'm thinking of creating chunks of data of
> > 1000 records with a checkpoint after the Query. SQL server has the
> > default of implicit transactions and so it will not need a commit.
> > Something like this?
> > * How do i create chunks of 1000 records automatically without
> > creating a identity field or something. Is there something like SELECT
> > NEXT 1000?
> I'm not sure I understand what you are asking here. In the default
> autocommit mode, each statement is a separate transaction.
> It seems you are mostly concerned with transaction log space management.
> The main consideration is to keep your transactions reasonably sized since
> only committed data can be removed from the log during a log backup (FULL or
> BULK_LOGGED recovery model) or checkpoint (SIMPLE model).
Yup this is what i meant. Only the question is what is a reasonable
chunk? Perhaps i will create parameter for this so i can tune this.|||> I hope Dan's reply shed some light on how the log works.
Yup

> As for breaking up in chucks there is nothing built-in like:
> UPDATE tbl
> SET yadayada
> ...
> WITH COMMIT EACH 1000 ROW
> For DML statements you will have to find out how to do the batching best
> from your data. Note there is no reason to keep a strict batch size, but
> if you have some column that divides the data nicely that can be used.
> For instance, if you are handling order data, you can take one day or
> month at a time.
Yeah well i have identity column which i could use for chunking the
records. Disadvantage is that it is an implementation depending on the
functionality of a column in the table. It's not what i want but i'll
have to deal with it.

> The one case where you can tell SQL Server to commit by each n row is
> when you bulk load with BCP or BULK INSERT.
Well this is not possible. I'm reading and insert/updating from other
tables in SQL server and BCP reads only from files.

> I also like to point out that selecting the batch size, is a trade-off
> between log size and speed. Particularly if the search condition to
> identify a batch takes time to execute, too small batch sizes can be
> costly. A little depending on row size, but 100000 rows at time is not
> an unreasonable number. 1000 is by far to small.
Well 1000 was just a hunch (is this rightly written). I thinking of
creating a parameter from which we can tune the chunks..

Thx you all for your answers

Greetz
Hennie|||> > There's a discussion of this in the Books Online
> > <architec.chm::/8_ar_sa_8unn.htm>.
> How does this works?

From the Books Online, you can go directly to the referenced topic by
clicking Go-->URL and pasting in the URL 'architec.chm::/8_ar_sa_8unn.htm'.

> > Log buffers are written as they become full, when you issue a COMMIT and
> > during a checkpoint.
> It thought that a commit is not neccessary a write to the disk. A
> commit is just a end mark of a transaction as far i can see in the
> documentation. A checkpoint will write the transactions to the
> database.

A COMMIT ensures all committed data are written to the log file so that
committed data are permanently persisted to the database. Modified data may
or may not have been written to data files because dirty data pages are
written separately by worker threads, the lazy writer or checkpoint process.
This is transparent to your application because data are always retrieved
from buffer cache.

Only committed data before the oldest uncommitted transaction can be removed
from the log. This occurs during a log backup in the FULL or BULK_LOGGED
recovery model or during a checkpoint in the SIMPLE model. Rather than
trying to take control of the checkpoint frequency, all you really need to
do is ensure your transactions are reasonably short and small.

> Only the question is what is a reasonable
> chunk? Perhaps i will create parameter for this so i can tune this.

If you are running in the SIMPLE recovery model and have many small
insert/update/delete transactions, I suggest you simply let SQL Server
truncate the log automatically. However, if you modify a lot of data (e.g.
millions of rows) in a single statement or transaction, you can use the
techniques Erland mentioned to keep the log size reasonable. In my
experience, the optimal amount is somewhat less that the amount of memory
allocated to SQL Server.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Hennie de Nooijer" <hdenooijer@.hotmail.com> wrote in message
news:191115aa.0402110050.7797d139@.posting.google.c om...
> > > * How does the process of updating, insert and deleting works with SQL
> > > Server 2000 with respect to log cache, log file, buffer cache, commit,
> > > checkpoint, etc?
> > > What happens when?
> > There's a discussion of this in the Books Online
> > <architec.chm::/8_ar_sa_8unn.htm>.
> How does this works?
> > In summary, all data modifications are
> > written to both the buffer cache and transaction log buffer cache. The
log
> > buffer is written first. Modified buffer cache pages are written to
disk by
> > either worker threads, the lazy writer or the checkpoint process.
Worker
> > threads write data when they need to wait for a read i/o anyway. The
lazy
> > writer process wakes up periodically and writes data in order to
maintain a
> > reasonable number of free cache buffers. The checkpoint process writes
all
> > dirty pages to disk periodically (about a minute by default) in order to
> > reduce startup recovery time.
> Ok clear..
> > Log buffers are written as they become full, when you issue a COMMIT and
> > during a checkpoint.
> It thought that a commit is not neccessary a write to the disk. A
> commit is just a end mark of a transaction as far i can see in the
> documentation. A checkpoint will write the transactions to the
> database.
> > > * As far as i can see now: i'm thinking of creating chunks of data of
> > > 1000 records with a checkpoint after the Query. SQL server has the
> > > default of implicit transactions and so it will not need a commit.
> > > Something like this?
> > > * How do i create chunks of 1000 records automatically without
> > > creating a identity field or something. Is there something like SELECT
> > > NEXT 1000?
> > I'm not sure I understand what you are asking here. In the default
> > autocommit mode, each statement is a separate transaction.
> > It seems you are mostly concerned with transaction log space management.
> > The main consideration is to keep your transactions reasonably sized
since
> > only committed data can be removed from the log during a log backup
(FULL or
> > BULK_LOGGED recovery model) or checkpoint (SIMPLE model).
> Yup this is what i meant. Only the question is what is a reasonable
> chunk? Perhaps i will create parameter for this so i can tune this.|||[posted and mailed, please reply in news]

Hennie de Nooijer (hdenooijer@.hotmail.com) writes:
> Yeah well i have identity column which i could use for chunking the
> records. Disadvantage is that it is an implementation depending on the
> functionality of a column in the table. It's not what i want but i'll
> have to deal with it.

One thing that I should have added is that you should see if the
clustered index is good for chunking. That can speed up the time
it takes for SQL Server to locate the rows quite a bit.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Sunday, March 11, 2012

Dynamic Filter Not Updating Correctly

Wednesday, February 15, 2012

DWH problem: updating a table with every 1000 records a checkpoint

sql dwh problem

Blog Archive

About Me