
PowerDesigner do’s and don’ts

Many people consider PowerDesigner to be the de facto standard data modelling tool. Many people are right. However, that does not mean the tool is perfect. As many users can testify, the version 16 release was quite buggy in the beginning, only stabilizing somewhat with 16.5. And this is not exceptional. The repository is still buggy, projects are a recipe for pain, and let’s not start a discussion on license prices – we’d still be here next year.

However, if you avoid some practices and adopt others, using PowerDesigner is a breeze. Here is my take on things.

Do Not:

  • Use the repository
    The repository is a major cause of bugs. It looks nice, like a Venus flytrap, and then it sucks you in and eats you for breakfast. Avoid it like the plague. You are better off spending some money on creating an extension to generate a model export to your own repository. You can buy this from I-Refact or other parties. The other functionality can be done better, cheaper and with less frustration and bugs by just using standard version control software (TFS, Git, etc.). If you must compare models, you can do that from within PowerDesigner with very little effort – without losing parts of your model on check-in/check-out.
    There is only one part of the repository that is actually semi-useful: the indication whether your model is out of date compared to the repository version. Since this functionality does not cooperate with replication or with extensions that rely on it, there is little point to it once you move beyond the basics. Also, it is much better to split up your models so that you never end up with ten people working on the same model, even potentially. If that is a risk, appoint a designated data modeller for such a model. The rest can get a read-only version.

  • Hide attributes on entities in your diagrams
    Unless you use an extension to automate setting and un-setting this and to indicate it visually, it can create no end of trouble when the model shows tables and columns but leaves out certain columns that then get deployed anyway. It takes ages to debug that. If you must do this, make sure it’s an all-or-nothing proposition: either hide all standard attributes, or none.

  • Create shortcuts to other models
    While PowerDesigner does this automatically once you start creating mappings, there is no need to refer to models outside the scope of the folder, as this will render the models almost impossible to deploy without heaps of pop-ups asking about other models that you have not yet stored in the right location (and don’t even know where they should be located). Only consider this if you have an agreed-upon folder structure and even then I recommend you don’t do this.

  • Create Projects
    Sure, they’re good for having a dependency graph view. But you can create those anyway. And projects are buggy, especially when interacting with the repository. Half the bugs I found in PowerDesigner went away when I stopped using projects and moved to workspaces. No more disappearing models or graphics. No more models that are impossible to check out or check in.

  • Work for long periods without saving
    PowerDesigner has no auto-save function. After you work with PowerDesigner for a while, you will learn to save often. It becomes a reflex, because it hurts when you lose hours of work to a crash. It’s not as bad as it was back in version 16.5.0, with the repository and projects, but still.

  • Use auto-layout without a current back-up
    Your gorgeous, handcrafted model could use a minor improvement, so you used auto-layout. And then you pressed “save” automatically, because by now it’s a reflex. And when the screams died down, you realized you didn’t have a current backup. Ouch. Back up often. If you use Git: commit often.

  • Model the entire Logical Data Model as a hierarchy of subtypes
    I have seen them, with their entity types derived from the Object supertype and each other, six hierarchical layers deep. I dare you to try it with a non-trivial model and then generate a working physical model out of it. Go ahead, make my day…

  • Create a unique data domain for each attribute
    This sort of misses the point of data domains. While they are rather limited in PowerDesigner (no entity types or attribute groups), they are most useful when they provide a single point to change the definitions of common attributes. Use them freely, but let the data architect decide which ones are available for use. It’s best to create a single model for this that you can use as a template for the other models you create.

But Do:

  • Add metadata to your models
    Especially metadata that describes the following items: Title, Subject Area, Author, Version, Data (Model) Owner, Modified Date, Modifications, Validation Status

  • Add domains
    Create a list of standard attribute domains, then create a template model containing them. People can either copy the model file directly and use it as a template (though this creates havoc in a repository, because the internal model ID will be the same as that of the template model), or copy the attribute definitions into their own model. The definitions should be controlled by the data architect(s).

  • Add attribute groups
    If you create attribute groups as keyless entities containing commonly grouped attributes, you can then inherit from several of these entities to combine them. This is most useful when you have things like “firstname/lastname” pairs of attributes that you do not want to separate out into their own entity, for some reason. Use with caution.

  • Tie models together with separate workspaces for each project
    Workspaces are small files with zero overhead that tie different models together. They have no impact on the repository check-in/check-out, they are files that can be under source control, and they are pretty much bug-free. You can even edit them when necessary. Much better than projects.

  • Store your models in version control systems
    Seriously, I should NOT have to say this, but I keep meeting people who don’t seem to realize that MODELS ARE CODE. And by a VCS I do not mean that abortion they call the repository. I mean TFS, Git or even Subversion. Anything that works, basically.

  • Save often
    If you don’t, you’ll regret it.

  • Store backups
    Having version control is not the same as having backups, unless you commit often.

  • Create a folder structure that is the same for everyone and make it mandatory
    If you don’t, you’ll create unending pop-ups whenever someone opens a model they did not create themselves. If they check it in, it’s your turn the next time you open it from the repository.

How to learn JavaScript

I’ve been working with JavaScript for some time now – with varying degrees of success – and I thought it would be nice to list a few resources that I found both quite helpful and accessible.

Highly recommended, but not used by me because I only found out about it after the fact:

Once you know a bit more about JavaScript (or ECMAScript, as it is properly called) you probably want to use it in something interesting. I’ve built a few things with the JavaScript graphics library D3 that give immediate results in just a few lines of code, which is a great motivator.
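
To give an idea of how little code that takes: below is a minimal sketch (assuming the D3 library is already loaded on the page) that turns an array of numbers into a crude bar chart built from plain div elements. The data values are made up for the example.

    // Minimal D3 sketch: render an array of numbers as horizontal bars
    // made of <div> elements. Assumes D3 is loaded on the page.
    var values = [12, 34, 7, 21, 16];   // made-up example data

    d3.select('body')
      .selectAll('div')
      .data(values)                      // bind one <div> per value
      .enter()
      .append('div')
      .style('background', 'steelblue')
      .style('color', 'white')
      .style('margin', '2px')
      .style('width', function (d) { return (d * 10) + 'px'; })  // bar length follows the value
      .text(function (d) { return d; });

A dozen lines, and you immediately have something on screen to tinker with – which is exactly what keeps you motivated while learning.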

If you have any suggestions for improvements or additions, feel free to let me know in the comments!

Encryption is not a silver bullet

Recently, well-known security researcher Troy Hunt, responsible for the website Have I been pwned?, described how someone lost 324,000 records with full credit card details, including security codes, by posting them on a public server. There were two parties suspected of the data breach, but neither could find any breach at first. So both parties stated categorically that there was no breach: all data was 100% encrypted and completely secure on their servers, so the problem had to lie elsewhere. And they were right, all the data was encrypted.

Now, encrypted data should be safe. And to be honest, encryption is more and more the mainstay of securing your data. Firewalls can be breached, servers and companies infiltrated, but if the data is encrypted it should remain secure even if you publish it on the internet. This is somewhat correct – barring adversaries like national intelligence services, who are very likely to be able to decrypt most schemes at the moment. It’s well known that the Dutch National Intelligence and Security Service (AIVD) is investing heavily in quantum computing research, for instance, which means that the NSA probably has one working right now. But apart from those entities, it’s still quite hard to crack decently encrypted data.

That is why the new SQL Server release, SQL Server 2016, makes it possible to keep the data encrypted all the time. Only the client can decrypt the data, with their own keys. Barring vulnerabilities in the implementation this is a huge step forward: it is impossible for the database administrators to access data they aren’t allowed to see, and the loss of a key only affects the data stored for that client. Both are very important steps forward in enabling clients to trust databases in the cloud. That is one reason why Microsoft is pressing forward on this: according to their own predictions, they will become entirely dependent on Azure in less than a decade. This means that trust in Azure will be a make-or-break issue for the company, and their focus on improvements in security reflects this knowledge.

And let me be clear: this is a huge leap forward. In the old situation you could encrypt some data with server-side keys, but when you made a backup it was decrypted, and in several other scenarios things simply didn’t work if your data was encrypted. Now it works across the entire database, you can set it up quite easily, and you can even choose how columns are encrypted: deterministically, which gives the same result every time you encrypt the same value and thus enables searching and joining, or randomly, which gives a different result every time. The latter gives more protection against attackers who encrypt “likely values” and check whether they match, a classic attack against password files (see: rainbow tables / dictionary attacks).
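
To make that difference concrete: the sketch below is emphatically not the SQL Server implementation, just a small Node.js illustration using the built-in crypto module, with a made-up key and AES-256-CBC. Deterministic encryption reuses the same initialization vector, so equal values produce equal ciphertexts (searchable, but guessable); randomized encryption uses a fresh one every time.

    // Conceptual illustration only - NOT how Always Encrypted is implemented.
    // Shows why deterministic encryption allows equality search and randomized does not.
    const crypto = require('crypto');

    const key     = crypto.randomBytes(32);   // hypothetical 256-bit client-side key
    const fixedIv = Buffer.alloc(16, 7);      // reusing one IV makes the result deterministic

    function encrypt(plaintext, iv) {
      const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
      return Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]).toString('hex');
    }

    // (In a real scheme you would store the IV alongside each ciphertext.)
    const deterministic = v => encrypt(v, fixedIv);                // same value -> same ciphertext
    const randomized    = v => encrypt(v, crypto.randomBytes(16)); // same value -> different ciphertext

    console.log(deterministic('4111-1111') === deterministic('4111-1111')); // true: you can search and join
    console.log(randomized('4111-1111')    === randomized('4111-1111'));    // false: nothing to match on

The deterministic variant is exactly what makes the “encrypt a likely value and compare” attack possible, so reserve it for columns you genuinely need to search or join on.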

In the picture you can see how it works: the keys are stored on the client.
[Figure: Always Encrypted in SQL Server 2016]

This means we can now store credit card information and other sensitive information in the cloud without having to rely solely on the goodwill of the Azure database administrator.

There is unfortunately also a downside. The fact that data is now safer does not mean it is safe in all circumstances. The way “always encrypted” works has consequences for your implementation that could blow your encryption scheme right out of the water if misused. The temptation to store sensitive but potentially very interesting data because “hey, it’s encrypted” and thus safe can overcome common sense and even regulations, but we should still firmly resist that temptation.

Because the case I quoted at the start of this article showed everyone that even if data is encrypted, it is not always safe: that data was encrypted too, and it still leaked. The reason was that the encryption keys were known to the organisation involved and were used to decrypt the data for analysis. The decrypted text file was then stored on a publicly accessible server. Encryption cannot mitigate that scenario: if the keys are part of the web application and the owner of the application can also access the data, anyone who can get to the keys can decrypt the information. After that, the security of the data once again depends on what that person does with it – such as putting it on a public server.

This is the reason that if you want to process credit card information, for instance, you need to be PCI compliant. This is a set of regulations drafted by the financial industry that tells you what data you may store and how. Very sensitive details such as the security code should NEVER be stored. The rules don’t give security requirements for storing the security code: storing it at all is a violation, no matter what you do. The Regpack case shows that this is still true. What you store will eventually leak, even with encryption. Once quantum computers become widely available, most of today’s encryption schemes will be broken, and that nicely encrypted data on the internet that wasn’t a problem… is suddenly readable text.

So while “always encrypted” is a step forward, you still need to be very careful about what you store, and the processing still needs to happen on a secure platform: processing encrypted data on an insecure platform means your data is just as insecure, as it can be intercepted in memory once it is decrypted there. While solutions are in the works (Philips, IBM and others are working on homomorphic encryption schemes), these are currently not an option.

Recommendations

My recommendations on this subject are as follows.

  • Do not store any data you are not allowed to store.
    If you do this anyway and lose the data, you will get fined or even shut down when this comes to light.

  • Do not store any sensitive data you do not have to store.
    Everything you store is a security risk; if you don’t store anything, there are no risks. Being smart about what data to store is a big part of any security strategy.

  • If you do store sensitive data, let the owner of the data hold the key to that data if at all possible.
    After all, a file where every line is encrypted with a different key you don’t have is a file that will be pretty hard to decrypt, and it certainly can’t be decrypted by accident by one of your employees. A sketch of what that can look like follows after this list.

  • If you cannot do even that, and your application does the encrypting, make sure the decryption key is locked in hardware, like a smart card, that is NOT reachable on any computer without physical presence.
    Violating this simple rule was what destroyed the Dutch certificate authority DigiNotar.
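
As an illustration of the third recommendation, here is a hypothetical Node.js sketch (the field names and the way keys are exchanged are made up for the example) in which every record is encrypted with the data owner’s public key. The party storing the data never holds anything that can decrypt it, so nobody on that side can leak plaintext by accident.

    // Hypothetical sketch: encrypt each record with the data owner's PUBLIC key.
    // Only the owner, who keeps the private key, can ever read it back.
    const crypto = require('crypto');

    // In reality the owner generates this pair and only hands over the public key.
    const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

    // The storing party only ever sees and stores this ciphertext.
    function storeRecord(record) {
      return crypto.publicEncrypt(publicKey, Buffer.from(JSON.stringify(record))).toString('base64');
    }

    // Only possible with the owner's private key, which never leaves the owner.
    function ownerReads(ciphertext) {
      const plain = crypto.privateDecrypt(privateKey, Buffer.from(ciphertext, 'base64'));
      return JSON.parse(plain.toString('utf8'));
    }

    const stored = storeRecord({ name: 'J. Doe', card: '4111-1111-1111-1111' });
    console.log(ownerReads(stored));   // only the data owner can do this

Even if that stored ciphertext ends up on a public server, it is useless without the owner’s private key – which is precisely the property missing in the incident described above.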

Some companies prioritize time-to-market and lower cost over data security. But eventually, those companies will be destroyed over that practice. The current digital environment is just too hostile to survive such practices for very long.