Tag Archives: Data Modeling

PowerDesigner do’s and don’ts

Many people consider PowerDesigner to be the de facto standard data modeling tool. Many people are right. However, that does not mean the tool is perfect. As many users can testify, the version 16 release was quite buggy early on, only stabilizing somewhat with 16.5. And this is not exceptional. The repository is still buggy, projects are a recipe for pain, and let’s not start a discussion on license prices – we’d still be here next year.

However, if you avoid some practices and adopt others, using PowerDesigner is a breeze. Here is my take on things.

Do Not:

  • Use the repository
    The repository is a major cause of bugs. It looks nice, like a Venus flytrap, and then it sucks you in and eats you for breakfast. Avoid it like the plague. You are better off spending some money on creating an extension to generate a model export to your own repository. You can buy this from I-Refact or other parties. The other functionality can be provided better, cheaper and with less frustration and fewer bugs by just using standard version control software (TFS, Git, etc.). If you must compare models, you can do that from within PowerDesigner with very little effort – without losing parts of your model on check-in/check-out.
    There is only one part of the repository that is actually semi-useful: the indication of whether your model is out of date compared to the repository version. As this functionality does not cooperate with replication, or with extensions that use replication, there is little point to it once you evolve beyond the basics. Also, it is much better to split up your models so you never end up with 10 people working on the same model, even potentially. If that is a risk, appoint a designated data modeler for such a model. The rest can get a read-only version.

  • Hide attributes on entity symbols
    Unless you use an extension to automate setting and un-setting this, and to indicate it visually, hiding attributes creates no end of trouble: the diagram shows tables and columns but leaves out certain columns that then get deployed anyway, and that takes ages to debug. If you must do this, make it an all-or-nothing proposition: either hide all standard attributes, or none.

  • Create shortcuts to other models
    While PowerDesigner does this automatically once you start creating mappings, there is no need to refer to models outside the scope of the folder. Doing so renders the models almost impossible to deploy without heaps of pop-ups asking about other models that you have not yet stored in the right location (and don’t even know where they should be located). Only consider this if you have an agreed-upon folder structure, and even then I recommend against it.

  • Create Projects
    Sure, they’re good for having a dependency graph view. But you can create those anyway. And projects are buggy, especially when interacting with the repository. Half the bugs I found in PowerDesigner went away when I stopped using projects and moved to workspaces. No more disappearing models or diagrams. No more models that are impossible to check out or check in.

  • Work for long periods without saving
    The PowerDesigner auto-save function is nonexistent. After you work with PowerDesigner for a while, you will learn to save often. It becomes a reflex, because it hurts when you lose hours of work to a crash. It’s not as bad as it was in version 16.5.0 with the repository and projects, but still.

  • Use auto-layout without a current back-up
    Your gorgeous, handcrafted model could use a minor improvement, so you used auto-layout. And then you pressed “save” automatically, because by now it’s a reflex. And when the screams died down, you realized you didn’t have a current backup. Ouch. Back up often. If you use Git: commit often.

  • Model the entire Logical Data Model as a hierarchy of subtypes
    I have seen them, with their entity types derived from the Object supertype and each other, six hierarchical layers deep. I dare you to try it with a non-trivial model and then generate a working physical model out of it. Go ahead, make my day…

  • Create a unique data domain for each attribute
    This rather misses the point of data domains. While they are rather limited in PowerDesigner (no entity types or attribute groups), they are most useful when they provide a single point to change the definitions of common attributes. Use them freely, but let the data architect decide which ones are available for use. It’s best to create a single model for this that you can use as a template for the other models you create.

But Do:

  • Add metadata to your models
    Especially metadata that describes the following items: Title, Subject Area, Author, Version, Data (Model) Owner, Modified Date, Modifications, Validation Status

  • Add domains
    Create a list of standard attribute domains, then create a template model containing them. People can either copy the model file directly and use it as a template (this creates havoc in a repository, though, because the internal model ID will be the same as that of the template model), or copy the attribute definitions into their own model. The definitions should be controlled by the data architect(s).

  • Add attribute groups
    If you create attribute groups of commonly grouped attributes in keyless entities, you can then inherit from multiple of these entities in order to combine them. Most useful when you have things like “firstname/lastname” pairs of attributes that you do not want to separate out to their own entity, for some reason. Use with caution.

  • Tie models together with separate workspaces for each project
    Workspaces are small files with zero overhead that tie different models together. They have no impact on the repository check-in/check-out, they are files that can be under source control, and they are pretty much bug-free. You can even edit them when necessary. Much better than projects.

  • Store your models in version control systems
    Seriously, I should NOT have to say this, but I keep meeting people who don’t seem to realize that MODELS ARE CODE. And by a VCS I do not mean that abortion they call the repository. I mean TFS, Git or even Subversion. Anything that works, basically.

  • Save often
    If you don’t, you’ll regret it.

  • Store backups
    Having version control is not the same as having backups, unless you commit often.

  • Create a folder structure that is the same for everyone and make it mandatory
    If you don’t, you’ll create unending pop-ups whenever someone opens a model they did not create themselves. If they check it in, it’s your turn the next time you open it from the repository.

Is data modeling the next job to be replaced by AI?


The more I read, the more I am convinced that data modeling is an activity that will be supported, perhaps even replaced, by deep learning AI like Watson real soon now, and that the true focus of data governance should be the business conceptual model. All other models derive from there.

After posting this on LinkedIn I received some interesting feedback. The main question was of course “how so?”. Well, here are a number of the things I read and experienced that led me to this conclusion.

1. Data Modeling Essentials (Simsion/Witt, 3rd ed. 2005). It’s not very strict but very readable and has a nice division between the three layers: conceptual, logical, physical. This is where I started. It explained clearly that these different layers are distinct, but not exactly *how*. Or rather: I didn’t understand it at the time.

2. Applied Mathematics for Database Professionals (Lex de Haan/Toon Koppelaars, 2007). This explains how the logical model is built and how it differs from the conceptual and physical models. It starts and ends with the math behind it all (predicate logic and set theory), yet is easy to understand for most data professionals. It made the distinction between the layers MUCH clearer to me, especially how building the logical model is a fully mathematical activity. That makes for a hard separation between the conceptual and the logical model, even though the semantic part of the logical model is still derived from the conceptual model. Because if you suddenly use different terms and verbs, it is obviously not a derivation but a different model.

3. The FCO-IM book by Bakema/Zwart/van der Lek, which explains how you can have a conceptual model that is fact-oriented and can then be transformed into a logical model. This means that when you can impose an ordering on the language of the conceptual model, you should be able to derive a logical model from it.

4. My own experience with fully verbalized data models, i.e. data models written down in sentences. Most data models have a small vocabulary: “One Client orders one or more Products. Our Company has a Relationship with a Party. A Client is a Party. A Legal Entity is a Party.” The vocabulary is in principle unlimited if you so desire, but in practice it can be boiled down so far that there is even a standard for it: the Semantics of Business Vocabulary and Rules (SBVR).

5. Very influential: ISO TR 9007, “Information processing systems – Concepts and terminology for the conceptual schema and the information base”. This defined the conceptual model and was created by a number of very well-known information modeling scientists. It influenced me heavily because it defines a conceptual model as essentially something that conveys semantic meaning, i.e. words on paper. It starts, for instance, with the Helsinki principle.

6. Discussions with Martijn Evers (and numerous other colleagues at the Dutch Central Bank and outside it) about data modeling.

7. The sorry state of modeling tools in general. If PowerDesigner is the best we have, we’re in trouble. Sparx Enterprise Architect is actually pretty good, but you can’t program it, so you have to make do with what it is. ER/Studio simply crashed when I tried to evaluate it last time round. And https://www.vertabelo.com/ looks nice and is web-based, but it only does physical modeling. None of these tools do conceptual modeling anywhere near right. Collibra has the textual part down pat, but only relates business terms to other terms. An ontology, while nice, is the start of the process, not the end. Conceptual modeling is a text-based activity that encompasses the business glossary, the visualization, the business rules, etcetera; it is not just about drawing lines and boxes on a canvas. There is currently no tool that treats conceptual modeling as a text-based activity.

8. The rapidly advancing state of the art in deep learning. I took artificial intelligence courses in the 90’s, but the field wasn’t very advanced at the time. Nowadays, it is much further along. I’ve been looking into the capabilities of IBM’s Watson over the past week. See for instance https://www.ibm.com/watson/services/natural-language-understanding/ where it says that Watson can “Analyze text to extract meta-data from content such as concepts, entities, keywords, categories, relations and semantic roles.”

9. The experience of the Watson team that a combination of a skilled human and a decent AI will generally beat either a more highly skilled AI or a more highly skilled human working alone.

10. The fact that the conceptual model is verbalized, and that we now have Wikipedia and Facebook. I think we can have “social modeling media” now. This should increase the speed of modeling immensely. Example: suppose that DNB owns the definitions in the Dutch part of a global Finance Business Glossary, that we can derive logical models from that glossary, and physical models from those logical models. That would speed up the process of building reports and applications that conform to regulations enormously. It would also nail the hide of the nasty little business model of IBM and Teradata to the wall (and not by accident).

11. Finally, I can see as well as anyone the incredibly urgent need for data modeling and the serious, increasing lack of data modeling expertise in the labor market. We can either give up on modeling, or we can make it so easy that everyone *can* do it. Therefore, the latter *will* be done. Example: the car market was once considered to be limited by the number of available professional chauffeurs. So the need for professional chauffeurs was removed; otherwise, Henry Ford wouldn’t have had a market.

I cannot say with certainty that AI will be able to fully automate the entire modeling process (especially not reverse engineering). But what I *can* say is that a large amount of what a modeler does can be automated even today. An even larger part can be done with AI, and the final part can be done together.

In the comments, one person said that AI deep learning and modeling are (mathematically speaking) fundamentally different activities, and that therefore you cannot automate modeling. I agree that they are different, but I do not think it matters for the end result. Playing Go is fundamentally different from translating text, yet both can be handled by the same type of algorithm. I foresee the same will hold for modeling, in most cases. The fact that you cannot handle all cases is irrelevant: see item 9.

What about source models without meaning, which can be modeled today by humans, another comment asked. I do not think that is a relevant issue: reverse engineering will remain difficult, but it’s not the issue I’m concerned with. Reverse engineering means turning an illegible set of fields and tables into a conceptual, textual model. My point is that the opposite process, turning text into tables and fields, is easy to automate; the other direction is inherently as difficult as turning mayonnaise back into eggs: the arrow of time, entropy, just does not fly that way.

Martijn Evers stated that “given sufficient quality in models that represent semantic concerns, and sufficient quality in deriving logical models we can nowadays already generate all kinds of implementation models. No AI is needed here. This could change for very large and complex implementation models, but by and large we can do a lot without AI in this respect.” I agree. I just think that using AI reduces the necessary quality of the model to the point where even people with moderate amounts of training can handle this competently. Which is the point of item 11.

Most commenters agreed that a human-machine hybrid approach was in the works. One even pointed out an existing (and very good!) recent article about this topic: http://www.dataversity.net/artificial-intelligence-vs-human-intelligence-hey-robo-data-modeler/. The question thus is “when” and “how”, not “if”.

Please note: this article was also posted on LinkedIn: https://www.linkedin.com/pulse/data-modeling-next-job-replaced-ai-ronald-kunenborg

Image credits: created by Alejandro Zorrilal Cruz [public domain], via Wikimedia Commons. Source: https://commons.wikimedia.org/wiki/File:Artificial.intelligence.jpg

ISO standards for Finance business data

When I define a business glossary to prepare for the high-level corporate data model, I try to incorporate as many of the relevant standards as I can. Knowing up front about a standard usually makes it much easier later on to integrate with other parties in the value chain, to report to regulatory authorities that use the same standards, and to apply Master Data Management. The more data that adheres to international standards, the less work you have in managing it.

Below, I have provided a list of ISO standards that can be used to aid in the governance of your business glossary and data models, standards that provide metadata specific to Finance and standards that provide identification schemes for key entities.

Note that there are more finance data and metadata standards than just the ISO standards. These will be listed in a different post that I will then link from here (and vice versa).

Each entry below lists the ISO standard, its area (General, Finance, Metadata, or Business), a description, and whether the standard defines an identification scheme.

ISO 639 (General)
ISO 639 defines language codes, as opposed to country codes. The standard consists of 6 parts, some more detailed than others. The preferred standard is ISO 639-3, which is the most comprehensive substandard. Usually, we restrict ourselves to a subset of supported languages.

See for more information: https://en.wikipedia.org/wiki/ISO_639

ISO 3166 (General)
ISO 3166 is a standard published by the International Organization for Standardization (ISO) that defines codes for the names of countries, dependent territories, special areas of geographical interest, and their principal subdivisions (e.g. provinces or states). The official name of the standard is "Codes for the representation of names of countries and their subdivisions". It consists of three parts:

  • ISO 3166-1 contains all codes currently in use
  • ISO 3166-2 contains all codes for subdivisions
  • ISO 3166-3 contains all codes no longer in use

The three parts contain several code formats: alpha-2, alpha-3 and alpha-4. The alpha-2 code is the recommended code for general use.

See for more information: https://www.iso.org/iso-3166-country-codes.html

ISO 4217 (General)
ISO 4217 is the standard that defines codes for currencies, as well as funds and minor currency units. The codes can be represented as a 3-letter code or as a 3-digit numerical code, which is usually the same as the numerical country code from ISO 3166-1. The minor currency unit is given as an exponent of 10: if the exponent is 3, the currency can be divided into 1000 minor units. The name of the minor unit is not part of this standard.

The current version of the standard is ISO 4217:2015.

See for more information: https://www.iso.org/iso-4217-currency-codes.html
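
To make the minor-unit exponent described above concrete, here is a minimal Python sketch. The exponent values used (euro, yen, Bahraini dinar) are well-known published values, included purely as examples; always consult the official ISO 4217 list.

```python
# Minimal sketch: convert an amount expressed in minor units to major
# units using the ISO 4217 minor-unit exponent. The exponents below are
# illustrative examples only.
from decimal import Decimal

MINOR_UNIT_EXPONENT = {
    "EUR": 2,  # 1 euro = 100 cents
    "JPY": 0,  # the yen has no minor unit
    "BHD": 3,  # the Bahraini dinar is divided into 1000 fils
}

def to_major_units(amount_minor: int, currency: str) -> Decimal:
    """Divide an amount in minor units by 10 ** exponent."""
    exponent = MINOR_UNIT_EXPONENT[currency]
    return Decimal(amount_minor) / (Decimal(10) ** exponent)

print(to_major_units(12345, "EUR"))  # 123.45
print(to_major_units(12345, "JPY"))  # 12345
print(to_major_units(12345, "BHD"))  # 12.345
```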

ISO 6166 (Finance)
The ISO 6166 standard is called "Securities and related financial instruments -- International securities identification numbering system (ISIN)". This standard describes and defines the International Securities Identification Number. The number applies to fungible and non-fungible securities and financial instruments.

ISINs consist of two alphabetic characters, which are the ISO 3166-1 alpha-2 code for the issuing country, nine alpha-numeric digits (the National Securities Identifying Number, or NSIN, which identifies the security), and one numeric check digit. The NSIN is issued by a national numbering agency (NNA) for that country. Regional substitute NNAs have been allocated the task of functioning as NNAs in those countries where NNAs have not yet been established.

ISINs are slowly being introduced worldwide. At present, many countries have adopted ISINs as a secondary measure of identifying securities, but as yet only some of those countries have moved to using ISINs as their primary means of identifying securities.

The current version of the standard is ISO 6166:2013.

See for more information: https://www.iso.org/standard/44811.html
http://www.isin.net/iso-6166/
Defines an identification scheme: yes.
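
As an illustration of the format described above, here is a minimal Python sketch of an ISIN check. The check digit is commonly described as a Luhn-style checksum over the code with letters expanded to two digits (A=10 ... Z=35); treat that description as an assumption and consult the standard for the authoritative rules.

```python
# Minimal sketch of a structural ISIN check: 2-letter country code,
# 9-character NSIN, 1 check digit. The check digit is verified with a
# Luhn-style checksum over the letter-expanded code (assumed scheme).
def isin_is_valid(isin: str) -> bool:
    isin = isin.strip().upper()
    if len(isin) != 12 or not isin[:2].isalpha() or not isin.isalnum():
        return False
    # Expand characters to digits: '0'-'9' stay, 'A' -> '10', ..., 'Z' -> '35'.
    expanded = "".join(str(int(c, 36)) for c in isin)
    # Luhn: double every second digit from the right, sum the digit sums.
    total = 0
    for i, ch in enumerate(reversed(expanded)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
        total += d // 10 + d % 10
    return total % 10 == 0

print(isin_is_valid("US0378331005"))  # widely published example ISIN -> True
```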

ISO 8601 (General)
ISO 8601 is about "Data elements and interchange formats - Information interchange - Representation of dates and times". It details how to represent dates and times in an unambiguous way when exchanging them with other systems.

The current version of the standard is ISO 8601:2014.

See for more information: https://www.iso.org/iso-8601-date-and-time-format.html
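
A quick illustration with the Python standard library, which speaks ISO 8601 natively:

```python
# Minimal sketch: producing and parsing ISO 8601 timestamps with the
# Python standard library (Python 3.7+ for fromisoformat).
from datetime import datetime, timezone

moment = datetime(2017, 6, 6, 14, 30, 0, tzinfo=timezone.utc)
text = moment.isoformat()              # '2017-06-06T14:30:00+00:00'
parsed = datetime.fromisoformat(text)  # round-trips without ambiguity
print(text, parsed == moment)          # True
```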

ISO 9362 (Finance)
This ISO standard defines the Business Identifier Code (BIC). BIC is an international standard for identification of institutions within the financial services industry. BICs are used in automated processing. They unambiguously identify a financial institution or a non-financial institution. The ISO 9362 standard specifies the elements and the structure of a BIC. A BIC consists of either eight or eleven contiguous characters. These characters comprise either the first three, or all four, of the following components: party prefix, country code, party suffix, and branch identifier. The ISO has designated SWIFT as the BIC registration authority.

The EU regulation 260/2012, also known as the IBAN only rule, requires financial institutions to add the BIC code to IBAN payments.

The rule has applied to any domestic euro payment since February 2014, to any cross-border euro payment between EU countries since February 2016, and to any cross-border euro payment from non-euro countries since October 2016.

See for more information: https://www.iso.org/obp/ui/#iso:std:iso:9362:ed-4:v1:en
https://www2.swift.com/sitesearch/#/?q=BIC
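
Here is a minimal Python sketch of a purely structural check of the layout described above (party prefix, country code, party suffix, optional branch identifier). It only checks the shape; whether a BIC is actually registered is a matter for the SWIFT registry. The example codes are commonly cited illustrations.

```python
import re

# Structural BIC check: 4-character party prefix, 2-letter country code,
# 2-character party suffix, optional 3-character branch identifier.
BIC_RE = re.compile(r"^[A-Z0-9]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?$")

def split_bic(bic: str) -> dict:
    bic = bic.strip().upper()
    if not BIC_RE.match(bic):
        raise ValueError(f"not a structurally valid BIC: {bic!r}")
    return {
        "party_prefix": bic[0:4],
        "country_code": bic[4:6],
        "party_suffix": bic[6:8],
        "branch_identifier": bic[8:11] or None,  # None for 8-character BICs
    }

print(split_bic("DEUTDEFF"))     # 8-character BIC (head office)
print(split_bic("DEUTDEFF500"))  # 11-character BIC with branch identifier
```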

ISO 10383 (Finance)
ISO 10383 is called "Codes for exchanges and market identification (MIC)". It defines the Market Identifier Code (MIC).

This International Standard specifies a universal method of identifying exchanges, trading platforms, regulated or non-regulated markets and trade reporting facilities as sources of prices and related information in order to facilitate automated processing. Each such exchange, platform etc. receives a unique code from the registrar.

See for the current list: https://www.iso20022.org/10383/iso-10383-market-identifier-codes
Defines an identification scheme: yes.

ISO 10962 (Finance)
ISO 10962 defines the structure and format for the classification of financial instruments, as approved by the International Organization for Standardization (ISO). There are many types of financial instruments used for saving, investing, trading, hedging and speculating. These instruments are generally organized in groups called "asset classifications." The most common asset classifications are generally described using terms like "Equities (Stocks)," "Debt (Bonds)," "Derivatives (Contracts)," "Currencies," and a few other generalized terms.

ISO 10962 provides a global standard for these classifications in the form of specific codes. The Classification of Financial Instruments (CFI) code is used to define and describe financial instruments as a uniform set of codes for all market participants. The code is issued by the members of ANNA, the Association of National Numbering Agencies. The group is currently working to simplify the structure so that it can be adopted more widely by non-governmental market participants.

Each position of this 6-character code holds a letter from the ISO basic Latin alphabet that reflects a specific characteristic intrinsic to the financial instrument. These characteristics are defined at the issue of the instrument and in most cases remain unchanged during its lifetime (or are determined by the market on which the instrument trades).

See for more information: https://en.wikipedia.org/wiki/ISO_10962 or visit the registrar homepage
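
As a small illustration, here is a Python sketch that merely splits a CFI code into its positions. The category names included are a tiny, illustrative subset; the full code tables are part of ISO 10962 itself, and the example code is one commonly cited for an ordinary share.

```python
# Minimal sketch: split a 6-character CFI code into category, group and
# attribute positions. Only two category letters are mapped here, purely
# as an illustration of the idea.
CFI_CATEGORIES = {
    "E": "Equities",
    "D": "Debt instruments",
}

def split_cfi(cfi: str) -> dict:
    cfi = cfi.strip().upper()
    if len(cfi) != 6 or not cfi.isalpha():
        raise ValueError(f"not a structurally valid CFI code: {cfi!r}")
    return {
        "category": CFI_CATEGORIES.get(cfi[0], f"category {cfi[0]}"),
        "group": cfi[1],
        "attributes": cfi[2:],  # four attribute positions; 'X' = not applicable/unknown
    }

print(split_cfi("ESVUFR"))  # commonly cited example code for an ordinary share
```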

ISO/IEC 11179 (Metadata)
The ISO/IEC 11179 Metadata Registry (MDR) standard is an international standard for representing metadata for an organization in a metadata registry. ISO/IEC 11179 claims that it is (also) a standard for metadata-driven exchange of data in a heterogeneous environment, based on exact definitions of data.

The ISO/IEC 11179 model is the result of two principles of semantic theory, combined with basic principles of data modeling. The first principle from semantic theory is the thesaurus-type relation between wider and narrower (more specific) concepts, e.g. the wide concept "income" has a relation to the narrower concept "net income". The second principle from semantic theory is the relation between a concept and its representation, e.g. "buy" and "purchase" are the same concept although different terms are used.

The standard consists of six parts:
ISO/IEC 11179-1:2015 Framework (referred to as ISO/IEC 11179-1)
ISO/IEC 11179-2:2005 Classification
ISO/IEC 11179-3:2013 Registry metamodel and basic attributes
ISO/IEC 11179-4:2004 Formulation of data definitions
ISO/IEC 11179-5:2015 Naming and identification principles
ISO/IEC 11179-6:2015 Registration

Part 1 explains the purpose of each part. Part 3 specifies the metamodel that defines the registry. The other parts specify various aspects of the use of the registry. An additional part, Part 7: Datasets is currently under development.

For use in the creation of data models, part 4 and especially part 5 provide common standards that could be used in data governance to govern the creation of data models.

See for more information: https://en.wikipedia.org/wiki/ISO/IEC_11179

ISO 13616 (Finance)
The International Bank Account Number (IBAN) is an internationally agreed system of identifying bank accounts across national borders to facilitate the communication and processing of cross-border transactions with a reduced risk of transcription errors.

The ISO standard was split in two parts in 2007. ISO 13616-1:2007 "specifies the elements of an international bank account number (IBAN) used to facilitate the processing of data internationally in data interchange, in financial environments as well as within and between other industries" but "does not specify internal procedures, file organization techniques, storage media, languages, etc. to be used in its implementation". ISO 13616-2:2007 describes "the Registration Authority (RA) responsible for the registry of IBAN formats that are compliant with ISO 13616-1 [and] the procedures for registering ISO 13616-compliant IBAN formats".

The official IBAN registrar under ISO 13616-2:2007 is SWIFT.

The IBAN consists of up to 34 alphanumeric characters comprising: a country code; two check digits; and a number called the Basic Bank Account Number (BBAN) that includes the domestic bank account number, branch identifier, and potential routing information. The check digits enable a sanity check of the bank account number to confirm its integrity before submitting a transaction.

The current version of the standard is ISO 13616:2007

See for more information: https://en.wikipedia.org/wiki/International_Bank_Account_Number
Defines an identification scheme: yes.
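
Here is a minimal Python sketch of the integrity check mentioned above, assuming the commonly described ISO 7064 MOD 97-10 procedure (move the first four characters to the end, expand letters to digits, and check that the result modulo 97 equals 1). The example IBAN is a well-known illustrative one.

```python
# Minimal sketch of the IBAN sanity check (assumed ISO 7064 MOD 97-10):
# rearrange, expand letters to digits (A=10 ... Z=35), check mod 97 == 1.
def iban_is_valid(iban: str) -> bool:
    iban = iban.replace(" ", "").upper()
    if not (5 <= len(iban) <= 34) or not iban.isalnum():
        return False
    rearranged = iban[4:] + iban[:4]
    expanded = "".join(str(int(c, 36)) for c in rearranged)
    return int(expanded) % 97 == 1

print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))  # example IBAN -> True
```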

ISO 15022 (Metadata / Finance)
ISO 15022 is the precursor to (and superseded by) ISO 20022.

See for more information: https://www.iso20022.org/15022/uhb

ISO 17442 (Business)
The International Organization for Standardization (ISO) 17442 standard defines a set of attributes, or legal entity reference data, that are the most essential elements of identification. The Legal Entity Identifier (LEI) code itself is neutral, with no embedded intelligence or country codes that could create unnecessary complexity for users.

Four key principles underlie the LEI:

  • It is a global standard.
  • A single, unique identifier is assigned to each legal entity.
  • It is supported by high data quality.
  • It is a public good, available free of charge to all users.
Once a legal entity has obtained an LEI, it will be published together with the related LEI reference data by the organization that has issued the LEI. This means the full data on the entire LEI population is publicly available for unrestricted use by any interested party at all times, including the set of reference data for each LEI code.

The LEI code is structured as follows:

  • Characters 1-4: Prefix used to ensure the uniqueness among codes from LEI issuers (Local Operating Units or LOUs).
  • Characters 5-18: Entity-specific part of the code generated and assigned by LOUs according to transparent, sound and robust allocation policies. As required by ISO 17442, it contains no embedded intelligence.
  • Characters 19-20: Two check digits as described in the ISO 17442 standard.
The current version of the standard is ISO 17442:2012.

See for more information: https://www.iso.org/standard/59771.html or visit the homepage of the registrar
Defines an identification scheme: yes.
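
For illustration, here is a minimal Python sketch of a structural LEI check. The check digits are commonly described as ISO 7064 MOD 97-10 over the full 20-character code with letters expanded to digits; treat that as an assumption rather than a quote from the standard.

```python
# Minimal sketch: structural LEI check (20 alphanumeric characters, the
# last two being check digits) using the assumed MOD 97-10 scheme.
def lei_is_valid(lei: str) -> bool:
    lei = lei.strip().upper()
    if len(lei) != 20 or not lei.isalnum():
        return False
    expanded = "".join(str(int(c, 36)) for c in lei)
    return int(expanded) % 97 == 1

# Usage: pass any LEI taken from the public GLEIF registry, e.g.
# lei_is_valid(some_lei_from_the_gleif_registry)
```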

ISO 18774 (Finance)
ISO 18774 defines the Financial Instrument Short Name (FISN). The standard standardizes short names and descriptions for financial instruments and was approved in September 2014.

As of July 1 2017, the FISN will be globally assigned concurrently with the ISIN (ISO 6166) and CFI (ISO 10962) at the time of issuance of a new financial instrument.

The ISO 18774 standard incorporates the issuer short name and the abbreviated characteristics for the financial instrument. It has a maximum length of 35 alphanumeric characters.

Unlike other ISO-standard financial instrument identification codes, the FISN is not meant to be machine-readable, but to provide a short format for essential information about a security for human use.

See for more information: http://www.anna-web.org/standards/fisn-iso-18774/

ISO/IEC 19773 (Metadata)
This International Standard specifies small modules of data that can be used or reused in applications. These modules have been extracted from ISO/IEC 11179-3, ISO/IEC 19763, and OASIS EBXML, and have been refined further. These modules are intended to harmonize with current and future versions of the ISO/IEC 11179 series and the ISO/IEC 19763 series.

The standard includes, amongst other things:

  • a data structure for UPU postal data
  • a data structure for ITU T E.164 phone number data
The current version of the standard is ISO/IEC 19773:2011.

See for more information: https://www.iso.org/standard/41769.html

ISO 20022 (Metadata / Finance)
ISO 20022 is an ISO standard for electronic data interchange between financial institutions. It describes a metadata repository containing descriptions of messages and business processes, and a maintenance process for the repository content. The standard covers financial information transferred between financial institutions that includes payment transactions, securities trading and settlement information, credit and debit card transactions and other financial information.

The repository contains a huge amount of financial services metadata that has been shared and standardized across the industry. The metadata is stored in UML models with a special ISO 20022 UML Profile. Underlying all of this is the ISO 20022 meta-model - a model of the models. The UML profile is the meta-model transformed into UML. The metadata is transformed into the syntax of messages used in financial networks. The first syntax supported for messages was XML Schema.

The standard contains a number of external reference code lists that are available on the website in the form of spreadsheets and documentation. The data dictionary present in ISO 15022 is no longer available as a spreadsheet, but it can be downloaded as a 96 MB XML file.

See for more information: https://www.iso20022.org/

ISO 20275 (Finance)
The ISO 20275 standard defines Entity Legal Forms (ELF) worldwide. The latest (and first) version is ISO 20275:2017. It covers the legal forms available per country (or country grouping), as long as that country has an ISO 3166-1 alpha-2 code.

The standard can be obtained from the ISO, but the code list itself is maintained by the Global Legal Entity Identifier Foundation (GLEIF) and can be obtained from the GLEIF website.

It is interesting to note that where the AnaCredit list goes slightly off the rails with the European legal form "Societas Europaea", this list solves it more elegantly by repeating the SE legal form for every country involved. Although using "EU" is actually allowed under ISO 3166-1, as it is an alpha-2 code reserved for the special use of the European Union, repeating the form per country is cleaner because you then only deal with countries. Since the AnaCredit list goes off the rails in more ways than this, you may want to use this ISO standard as your main reference data set and add a mapping to the ECB's rather ragtag list of legal forms.

Currently there are legal forms for 55 countries in the list, so jurisdictions that are missing need to be added through a feedback form.

The ELF for companies in the LEI register has been made public as of March 1st, 2018.

Certified Anchor Modeler

As of today, I am a certified Anchor Modeler. My thanks go to Lars Rönnbäck (UpToChange.com), the best teacher you could have, as well as to Juan-José van der Linden for inviting me and to Essent for hosting the course.

While the community of Anchor Modelers is still quite small, it will likely expand as the concurrent-reliance-temporal model is extremely interesting. The notion of positors and reliance combined with the positing and changing time is quite advanced. I’m looking forward to combining this with Martijn Evers’ notions about timeline choices with respect to Consistency/Accuracy/Availability.

Presentation: history of DWH modeling

Dear readers, on June 6th I held a keynote presentation in front of 300 people, summarizing the state of DWH modeling. The conference proceedings of the day are available at BI-Podium.

My own presentation is available here as well: Next Generation DWH Modeling 2013 conference keynote speech

The Anchor Modeling folks also wrote a summary: Next Generation DWH Modeling

Anchor Modeling


Anchor Modeling is a new method of modeling a domain in a database. The method splits all attributes into tables of their own. This seems complex, but it actually simplifies maintenance. Furthermore, the method is flexible, quite resilient to change over time, does not need updates, and is highly scalable.
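
To give a flavour of that decomposition, here is a minimal sketch in Python with SQLite. The table and column names are my own invention for illustration; the real naming conventions and the full set of constructs (ties, knots, historization details) are described in the article.

```python
# Minimal sketch of an anchor-style decomposition: one anchor table that
# only carries identity, plus one table per attribute. Historized
# attributes carry their own change timestamp, so changes are inserts,
# never updates. Names are illustrative, not an official convention.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Anchor (                 -- identity only
    Customer_ID INTEGER PRIMARY KEY
);
CREATE TABLE Customer_Name (                   -- static attribute
    Customer_ID   INTEGER PRIMARY KEY REFERENCES Customer_Anchor (Customer_ID),
    Customer_Name TEXT NOT NULL
);
CREATE TABLE Customer_Rating (                 -- historized attribute
    Customer_ID     INTEGER NOT NULL REFERENCES Customer_Anchor (Customer_ID),
    Customer_Rating INTEGER NOT NULL,
    ChangedAt       TEXT NOT NULL,             -- ISO 8601 timestamp
    PRIMARY KEY (Customer_ID, ChangedAt)
);
""")
# A new rating is simply a new row with a later ChangedAt value.
```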

These are good properties for a data warehouse model. In the article I explain how Anchor Modeling works and why you should at least take a look at it.
The article appeared in November 2009 in Database Magazine, a Dutch magazine for database professionals. However, the magazine is now defunct and has been superseded by Business Information Magazine.

Download the PDF