Thursday, March 29, 2012

Dynamic Security Stored Procedure Repeatedly Called

I have implemented an SSAS stored procedure for dynamic security and I call this stored procedure to obtain the allowed set filter. To my supprise, the stored procedure is being called repeatedly many times (more than 10) upon establishing the user session. Why is this happening?Do you have more than 10 attributes in this dimension ? (i.e. number of calls to sproc should be the same as number of attributes in dimension).|||

Thank you. I don't quite follow. The dimension is wide (it has some 30 attributes). However, only one attribute hierarchy has an allowed set filter defined. Why does this need to be evaluated many times?

In addition, the dimension is large (some 2 mil plus members). Even with allowed set caching inside the stored procedure, it takes an enormous amount of time (some 10 min) for the filter to be applied, e.g. if the filter returns some 40,000 allowed members. Any optimization tips?

|||10 minutes to filter 40,000 members out of 2 million is way too much. I've written sprocs which were able to do comparable filtering in a matter of seconds. Of course, it depends on the logic inside sproc. Would you share more details please - what is the criteria for the filtering ?|||

Thank you for helping out.

As I mentioned, the dimension has some 2 mil members. I has a Security Filter attribute with some 50,000 members on which the allowed set is applied. The allowed set expression is StrToSet(<call to stored procedure here>). The stored procedure returns a comma-seperated list of the allowed members in the format [DimensionName].[Security Filter].&[key].

The stored procedure queries a database to get the set but this is very fast. Then, it caches the set to avoid repetative calls.

|||

> The allowed set expression is StrToSet(<call to stored procedure here>). The stored procedure returns a comma-seperated list of the allowed members in the format [DimensionName].[Security Filter].&[key].

This is not a good practice. Much better approach is to use Server Adomd.NET object model and return AdomdServer.Set from sproc instead of giant string. But since you query database table, I wonder whether you can make this table part of UDM and build joins inside UDM and avoid sproc altogether...

|||

OK, I will test both approaches (StrToSet and server-side Set) and post the results here.

-update

Thanks for the tip. Server Set resulted in much better performance. The first column shows the number of members in the allowed set. Time is in seconds.

Filer # StrToSet Set
50005235
100009058
20000171108
40000272144

end update

Would you mind eleborating more about the table idea? Do you mean a new dimension table that will slice the fact table or do you refer to a many-to-many dimension?

|||

> Would you mind eleborating more about the table idea? Do you mean a new dimension table that will slice the fact table or do you refer to a many-to-many dimension?

Since it seems that the information about which users can see which members is already stored in the table, you could build an aux measure group with User dimension and your dimension included, and then use something like

Exists(MyDimension.MyAttribute.MyAttribute.MEMBERS, StrToMember("[User].[" + Username + "]", "securitymeasuregroup")

as expression for the allowed set.

|||Thank you. But this will result in a very large fact table which will hold the allowed members for each user, corrrect? So, if I have 100 users and each user is allowed to see 100,000 members on average, the table will have 10,000,000 rows. I wonder how both approaches compare from a performance standpoint. Do you have any performance test results to share?|||Before we can compare the approaches - can you explain how the database table used by sproc looks like ? Doesn't it have the same structure (i.e. for user you need to be able to get list of allowed members), and therefore same size ?|||

To make the things simpler, let's assume that that in both cases the security policy will be materialized into the same table (dimension surrogate key, employee surrogate key). In the case of dynamic security, the stored procedure will query this table by user and apply the allowed set but the allowed set will not exceed the maximum number of members that the user can access. In the case of the security measure group, EXISTS can be potentially applied over millions of records. As the securiy table gets larger, I'd expect the benefits of EXISTS to level off. Or, will it?

|||Please note, that the call to Exists is always filtered by the current User, therefore there will never be a full scan of this measure group. I.e. if this measuregroup was implemented as pure ROLAP - it would generate exactly same SQL queries as you are likely doing from your sproc already. And, of course, in case of MOLAP it will be much more efficient. Therefore I beleive that the approach with EXISTS will always outperform approach with sproc no matter what sizes of tables will be.|||Thank you so much for you help, Mosha. I will post to this thread if I find otherwise.|||I think it will be very interesting to everybody to see results of your testing in either scenario. While I have seen approach that I recommended working well in number of installations, your data volumes (100 users each one having access to different set of 100,000 members out of 2 million) are interesting enough to see how well it will perform in the real world. My prediction is that it should perform well, but it is only a feeling without hard data behind it.|||

Got results. The materialized results are impressive! For the sake of testing, I generated all permutations of 250 employees with 52,000 securable items resulting in a fact table with some 13 mil rows. I compared this against the dynamic security approach where 52,000 securable items are converted to a Set. I tested the connect time from Excel and noted the query log events in the Profiler.

Dynamic Security Materialized Security (non-partitioned)

30-35 sec 8-12 sec

Not to mention that the dynamic security test currently excludes the application latency (records are read directly from a table). In real life, it would take additional time for stored procedure/service to prepare the security filter.

Thank you so much for the tip, Mosha! We will go probably with materialized security.

Do you think that partitioning the Security Filter measure group (if there is a way to logically group users and minimize the number of partitions) would decrease the query time even further?

sql

No comments:

Post a Comment