Thanks for reading this post.
We have an application that can import data from various sources
containined in various formats. This data is to be stored in one table!
How do we handle this scenario? The only way I could think of is to
have a generic table with columns called columnName, stringValue and
numericValue and store every field in each row of the input data as a
row in the generic table.
Is this good design or is there is a better to handle the problem
please.
Thanks...Hi,John
When you say > We have an application that can import data from various
sources
> containined in various formats? do you mean that you get the data from
> TEXT files ,Access Database ,Excel documents?
"John Smith" <postmaster@.sumanthcp.plus.com> wrote in message
news:1138098782.505092.292110@.g44g2000cwa.googlegroups.com...
> Thanks for reading this post.
> We have an application that can import data from various sources
> containined in various formats. This data is to be stored in one table!
> How do we handle this scenario? The only way I could think of is to
> have a generic table with columns called columnName, stringValue and
> numericValue and store every field in each row of the input data as a
> row in the generic table.
> Is this good design or is there is a better to handle the problem
> please.
>
> Thanks...
>|||Hi Uri,
Yes, we get data in different physical formats but the data we get can
have diferent schema/columns. We import everything into SQLSERVER and
use the imported data for reporting. If we recoginse some columns then
they will be used to produce some hash totals. I hope I have given you
the picture.
Thanks|||John Smith wrote:
> Thanks for reading this post.
> We have an application that can import data from various sources
> containined in various formats. This data is to be stored in one table!
> How do we handle this scenario? The only way I could think of is to
> have a generic table with columns called columnName, stringValue and
> numericValue and store every field in each row of the input data as a
> row in the generic table.
> Is this good design or is there is a better to handle the problem
> please.
>
> Thanks...
The external format shouldn't determine your database design. Design
the correct logical data model first. Then worry about how to get the
data into it.
Since you have given us zero information about the data you are
modelling I can't help you with the design. Anyway, newsgroups are not
the place to solve design problems. Good design requires more knowledge
about your business and requirements than we can reasonably expect to
learn through an online discussion.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||John Smith (postmaster@.sumanthcp.plus.com) writes:
> Thanks for reading this post.
> We have an application that can import data from various sources
> containined in various formats. This data is to be stored in one table!
> How do we handle this scenario? The only way I could think of is to
> have a generic table with columns called columnName, stringValue and
> numericValue and store every field in each row of the input data as a
> row in the generic table.
> Is this good design or is there is a better to handle the problem
> please.
Whether this a good design or not depends on what you will use the data
for. Without further knowledge, I would not put much faith in this
solution though. And you did mention reporting - that reporting will
not be any fun.
Probably you need to conduct further analysis of the data import, so you
can define more tables.
As for stringValue, numericValue etc, you may want to investigate the
type sql_variant, that can host any other datatype in SQL Server, save
text, ntext and image.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||The real question is - what are you doing with this data. When you have the
answer to that, you'll have the fundamentals of your logical model. Which
still has little to do with actually creating tables.
So, please tell us what this data is for (on your end) and we can guide you
to a solution. I can't imagine anyone just coming up with a solution based o
n
your post.
ML
http://milambda.blogspot.com/|||Look up various ETL tools for this kind of problem. They will do the
data scrubbing and format conversions. I would take a look at
Sunopsis.
numericValue and store every field [sic] in each row of the input data
as a row in the generic table. <<
There is no such animal as a "generic table" in a properly designed
RDBMS. Tables model one and only kind of entity and all the attributes
are clearly defined. To be is to be something in particular; to be
nothing in particular or everything in general is to be nothing at all
-- Aristotle.
The design flaw you have re-discovered is called EAV and you can
Google it.|||Thanks to all for your answers,
In fact we are trying to develop a tool which is capable of ELT and
that's why we need a central table to hold input data temporarily
before transforming it to create files/other databases.
The full picture is as follows: Our application will recieve
files/databases from external sources. The data we receive will have
some known attributes like for example - Quantity, Name, Cost etc. but
there may be more information (very likely) and we will not know what
those extra columns would be. We are planning to create two tables -
Core and Extra. Core will contain all the columns we know before hand
and will be used to run queries/reports (like checking hash values etc)
and Extra table which will be a EAV table used only during tranforming
data to other formats (formats are dynamic - users define them).
Is there an alternative elegant way of solving our problem or do we
have to live with the EAV flaw?|||John Smith wrote:
> Thanks to all for your answers,
> In fact we are trying to develop a tool which is capable of ELT and
> that's why we need a central table to hold input data temporarily
> before transforming it to create files/other databases.
> The full picture is as follows: Our application will recieve
> files/databases from external sources. The data we receive will have
> some known attributes like for example - Quantity, Name, Cost etc. but
> there may be more information (very likely) and we will not know what
> those extra columns would be. We are planning to create two tables -
> Core and Extra. Core will contain all the columns we know before hand
> and will be used to run queries/reports (like checking hash values etc)
> and Extra table which will be a EAV table used only during tranforming
> data to other formats (formats are dynamic - users define them).
> Is there an alternative elegant way of solving our problem or do we
> have to live with the EAV flaw?
I would echo Joe's suggestion that you consider off-the-shelf data
integration tools. Yours is precisely the type of application where
those packages have benefits over and above hand-coding your own
transformations. Specifically, you have changing metadata and so your
transformations won't be static. Therefore having a separate metadata
repository can work to your advantage because the tool will usually
take away some of the pain of generating the changing transformations.
As regards EAV I'd say that it won't sove your problem. There is no
substitute for developing properly normalized schema and EAV isn't
flexible enough to represent all non-relational data sources. How will
you model a XML hierarchy with EAV for example? Again, integration
tools can help because they will automate or semi-automate the process
of deriving a relational schema from the source. Also, bear in mind
that many of those tools expose their own API sothey are potential
still extensible with your own code.
Microsoft SQL Server Integration Services (SSIS)
www.microsoft.com/sql/technologies/...on/default.mspx
www.sqlis.com
Microsoft Data Transformation Services (DTS - the predecessor of SSIS)
www.sqldts.com
Other integration software
www-306.ibm.com/software/data/integration/dis/
www.abinitio.com
www.datamirror.com
www.datawatch.com
www.embarcadero.com/products/dtstudio/index.html
www.informatica.com
www.pervasive.com/solutions/
www.microsoft.com/biztalk/default.mspx
Enterprise Integration Patterns
www.enterpriseintegrationpatterns.com
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment