Unplanned database design: A Howler
I clearly remember the days of my project work when I was about to graduate. We were a team of three students working on our final project which used a database to store data. In the initial phase of the project, everything was as smooth as butter and we were making our progress faster than expected. The real problem started once we started taking inputs from multiple resources and started storing them in the database. Some of the issues that we faced are below,
- Incorrect reports: One of the objectives of our project was the ability to generate reports from the data that was collected over time so that the data can be used for some data analysis. Though we had one of the most stable databases (Oracle 11c) to our rescue, yet the reports generated was faulty in terms of the total revenue generated, a number of units sold, etc. On further investigation, we could see that there was no check on duplicate values getting entered in the tables and hence the blunder.
- Long Search Timings: Any application despite of the size and usability, is expected to have the innate ability of searching a data in the database within a short span of time. Since our database was not properly designed and there were multiple tables interlinked with each other and a lot of duplicity of data, the search functionality was getting slower as the data volume increased which posed a serious threat on the scalability of the project. The search timing had to come down to enhance the user experience.
- Lack of proper data flow: Data in any application is analogous to blood flowing in the blood vessels of a human being. To have the clarity of data flow within an application is extremely important to troubleshoot network issues or writing effective database queries and also to make the picture clearer for someone who joins the team. In our application, no one had a clear picture of how data flows in the application and when we tried understanding the same we were lost because of a lack of proper database design.
- Limited usage: Since our application lacked a proper structure, making it scalable for increasing data size and complex business hierarchy was a great challenge because changing the same would then mean changing the underlying database structure every time an upgrade was to be done to ensure the application is optimized which was not possible. Hence, due to a lack in a proper database structure and relations defined between the data in our application was static and non-scalable with a very limited number of usages and functionalities.
If you are a professional who is reading this, you might have got the scientific term which could have solved all the issues stated above. If you are a fresher or someone who is about to graduate, you might have some idea on how do we solve this issue and if you are a layman you might be thinking why not spend some time in blueprinting the database and then create them from scratch with a better design and better understanding of the same. All of your queries are answered by a simple term known as Data Modelling.
In this article, I will be taking you through a brief of data modeling followed by 10 data modelling tools which makes the life of a database architect much easier.
The Remedy: Data Modeling
Data modeling deals in preparing a data model of the data involved in an information system which optimizes the database design and also helps in understanding the dataflow within the information system. A data model on the other hand is an abstract model of the data which relates to specifics such as how the data is captured, how the data flows within the system, how is data entered in individual tables and what checks and constraints apply to the data before storing them in the databases?
Having seen the definition of the terms Data Modelling and Data models, let us now have a look as to why data modelling is needed? The simple answer to this would be to avoid howlers similar to what we did when we were developing our final year project just before graduation. Extending the same in a more formal way, below are some of the reasons why data modelling is needed,
- To get an accurate understanding and idea of all the data objects that are used in an information system.
- A data model helps in designing the database in a much efficient and optimized way.
- A data model gives the idea of the tables that has to be there within a data base, the primary keys and the foreign keys detail and various other constraints and checks that needs to be in place for the same.
- A data model gives a thorough representation of the blueprint of the database which can then be used to develop the actual database.
- A good data model ensures that no duplicate values are entered in the tables and that the critical data is always available. In short, it avoids redundancy of data and also ensures there are no blank values in the database tables.
- A data model can be used as a reference while trying to scale up the application for a wider usage in a more complicated and sophisticated business scenario.
Now that we have seen why data modeling is critical to our applications, lets now turn our heads towards some of the challenges that still linger around even though a data model is in place and also towards challenges that surface up once the data modeling exercise is done,
- Data Modeling is a time-consuming process because it has to be in sync with the business use case and thus needs domain experience.
- While modeling the data in the data modeling tool, data modelers often are concerned with the data objects and thus there are chances that the actual physical data which is stored in these data objects might get overlooked while modeling.
- Even a minute change in the data model will need a change to the entire database and hence this can compromise system availability and performance leading to a significant downtime of the application which can, in turn, impact real-world transactions and user experience.
In general, data modeling is of utmost importance before developing a data base for an application but it comes with its own set of challenges which has to be kept in mind while coming up with a data model for an application. Often, data modeling can be extremely tedious even for experts in the field, nevertheless, there are ways in which this tedious job can be somewhat simplified and one of the those is to use data modeling tools available in the market. In the next segment, we will take a very close look into the data modeling tools available to us.
Making designing simpler: Data Modeling Tools
Anything that makes the tedious job of building data models easier for the professionals can be regarded as a data modelling tool. There are a plenty of data modelling tools available for the same; many of which run on almost all the operating systems while some of them might be restricted to some operating systems such as Linux, Unix, etc. Some of these tools are available online and can be used directly without the need of any installation and thus reduces the cost of maintenance. Many of these tools come equipped with a lot of additional features such as creating data objects from the relationship diagrams, data import-export facilities for cross platform usage, enhanced documentation to aid with the documentation of the data model, connectivity to multiple databases and also with the ability to be integrated with Hadoop and other big data platforms.
This brings me to the heart of the article, the top 10 data modelling tools that are currently being used in the software industry. Please note however that the list is purely based on information available from various sources on the internet and the inputs from various professionals in the industry. There might be other data modelling tools which better suits your needs and might not be mentioned here. Also note that this is not a ranking based list of data modelling tools.
- Erwin Data Modeler: The Erwin data modeler (popularly known as Erwin DM) has been in the Data Modelling industry for about 30 years and is one of the most trusted data modelers. This perhaps makes the Erwin data modeler as one of the most widely used and trusted Data Modelling tools used by several organizations. Currently, it is being used by globe’s top healthcare, financial and IT companies including many of the fortune 500. Some of the unique features as per the website of the product are as follows,
- Flexible and Customizable: The Erwin data modeler comes equipped with customizable user interface making the job of the data modeling fairly straightforward and simple. It also has inbuilt functionality of automating tiresome jobs such as database design, setting standards, testing and deployment. It also provides additional features such as model visualization and automatic generation of data models which makes data modeling less of a headache.
- Ability to integrate with other Databases: Both On-Cloud and On-Premise applications talk to each other in highly technical file formats such as XML, cxml, CSV, JSON, etc. Erwin DM allows to integrate with all these applications using simple graphical models which internally translates the models into data that can be modeled using the modeler. It also provides built-in interfaces for exchange of data between other modeling platforms, data management platforms and other data exchange formats.
- Inbuilt Comparison Tools: Erwin DM comes equipped with lots of inbuilt comparison tools that facilitates bidirectional data syncs between Erwin DM and legacy modelers, databases and existing models. Erwin DM also hosts interfaces to generate custom scripts where selective updates can be done to data models.
- Forward and Reverse Engineering: Erwin DM provides features such as forward and reverse engineering of the data models, integration with NOSQL platforms and other modeling platforms. This ensures efficiency, effectiveness and consistency in enterprise database management.
- Generation and maintenance of metadata: Erwin DM is one of the best modeling tools for enabling data governance and intelligence. The metadata thus generated as a result of the data modeling can be directly pushed to Erwin DM hosted data catalogue and business glossary.
- SAP Powerdesigner: SAP Powerdesigner is an award-winning enterprise data modeling tool. It comes armed with capabilities such as capturing, analyzing and presentation of business data. It follows the standard industry best practices to give comprehensive coverage on the metadata storage and get an understanding of the input data. SAP Powerdesigner uses link and sync mechanisms and metadata management which can be used to capture, analyze and generate insights from the data which can be stored in a single repository that can be accessed organization-wide.
SAP power designer can be used in the below scenarios,
- When in need of sharing your understanding of the data and the data landscape with the entire organization.
- When in need of a standard interface between various business processes and relics by following the industry standards.
- When in need of a tracking mechanism to track various processes and data sharing between various applications and services across the business landscape.
- When in need of finding critical paths or processes or bottlenecks in a business landscape.
The features which stand out in case of SAP Powerdesigner are as follows,
- It enables you to find out dependencies by an interactive user interface which is simple to use and not highly technical.
- It provides the drag and drop functionality across platforms and other tools and services.
- Enables the creation of comprehensive reports and data dictionaries by creating mappings and definitions of the data and metadata.
- It provides highly secure access to the metadata and hence access control is automatically taken care of.
- It supports web reporting.
- It provides seamless integration with various services and applications in the business landscape using the above mentioned link and sync technology.
- Since it stores the metadata in a single repository with controlled access, it enhances the capability of the organization to respond to change.
- MYSQL Workbench: The MYSQL Workbench is a unified data modelling tool for database architects, developers and database admins. This is supported by Windows, Linux and MAC. MYSQL workbench provides tools for configuration, administration, backup and deployment. The following are the key features of the MYSQL Workbench which makes it fit for various data modeling exercises,
- Visual database design: MYSQL Workbench provides tools for model-driven database designing scheme while catering to the changing business needs. It helps the architects prepare designs, communicate the same to different stakeholders and get feedback on the same within very short time frames. There are different validations in place for getting the models validated as per the industry norms and standards to ensure there are no errors or mistakes while generating models, ER diagrams or block diagrams.
- Bidirectional engineering: MYSQL Workbench provides for functionalities that create physical databases from the data model with simple clicks. SQL code with proper queries and syntax is generated automatically which makes the process relatively simple and efficient and thus reduces human prone errors while converting the model into a physical database. It also enables the architects to take an existing database and reverse engineer the same into a data model to solve for runtime errors or debugging the code or for getting an understanding of the database design. It also has the capability to import existing scripts and run them to get the physical databases created.
- Managing changes: The change management process is a complex, risky and time taking process which at times involves the change to the entire physical database and thus encompasses a change in the data model as well. MYSQL workbench provides for functionalities where the database designers can take an existing database and compare it with the new database and see the changes explicitly which saves on time and effort.
- Database documentation: Database documentation is a dreary job and is also time-consuming. MYSQL workbench provides a functionality called the DBDoc which enables point and click documentation of the database. The documentation can be of HTML format or text format whichever is convenient.
- Enterprise Architect: Enterprise architect is the ideal tool amongst most data modelling tools currently available for enterprise-wide modelling schemes. It comes with a lot of functionalities and strategies for analyzing, visualizing, testing and maintaining all the data in any enterprise landscape. It uses a unique technology called perspective based modelling which simplifies the data models further. It is available in four variants depending upon the need and the business landscape one is dealing with. The variants are called as,
- Professional: Suited for entry-level data modelling.
- Corporate: Suited for team-based data modeling.
- Unified: Suited for advanced data modeling and simulations.
- Ultimate: This is the full-fledged solution for the Enterprise Architect data modelling tool.
The key features of the Enterprise Architect data modelling solution are as follows,
- Manageability: Enterprise Architect helps organizations with modelling and managing the complex information that is available in the business landscape. Some of the ways in which this tool achieves significant manageability of data are as follows,
- Diagram based modeling.
- The model need not be at an organization level; it also provides for having models which are domain-specific.
- Version management to track changes to the model and the databases.
- Access control management of the data models.
- Flexibility: The Enterprise Architect pulls in data from various domains and segments of the enterprise and has the ability to develop a single, unified version of the model with the help of the following,
- Using Open standards such as UML, BPMN and SysML
- Recognizing and prioritizing critical tasks to enhance model development over the time period.
- Enhanced test and debug functionalities to have a check in the data model.
- Ability to simulate the data model and confirm on the process design and the business landscape.
- Modeling: Enterprise Architect has the ability to capture business requirements and design and deploy the data models with impact analysis to changes in the model and database structure. The modelling features of this data modelling tool can be used to,
- Define and design a requirement model.
- Track the implementation of the model elements.
- Generate reports
- Perform impact analysis of a change on the data model and the physical database.
- Performance: Enterprise Architect is known for its high performance compared to other data modelling tools available to us. It has a high performance and a high speed import interface to load extremely large files into the system within seconds. It also comes with features for collaborating across cloud based servers.
- IBM Infosphere Data Architect: The IBM Infosphere Data Architect is a data modelling tool from IBM which is built on the Eclipse Integrated Development Environment. It enables the database architects to create both physical and logical designs of the database. IBM infosphere is known for its ability to discover patterns within the data, model the data, find relations and also standardize the interfaces between various applications, servers and existing databases. Some of the key features which make IBM Infosphere data architect unique are as follows,
- Collaboration: IBM infosphere Data Architect is known for its cross role, cross-department and cross-organization collaboration making it one of the most popular enterprise data modelling tool in large scale organizations.
- Visualization: The IBM Infosphere Data Architect provides functionalities to visualize the data in your organization and thus making the process more efficient, less time consuming and sturdy. It also enables the developers to model existing databases in the application.
- Simplification: The IBM Infosphere Data Architect is used to determine the existing data and also plan for new data that might enter into the process or organization. It aims at simplifying the understanding of the dimensional data in the following ways,
- Sharing the data model with the data warehousing teams so that they can create warehouses which are in sync with the data model.
- Sharing the data model with the business intelligence teams so that they can come up with strategies.
- IBM Infosphere Data Architect is the only data modelling tool which can work with the data models that were created by the IBM Netezza data warehouse models and other tools. It also has several functionalities to transform existing data models in the organization into IBM Netezza data models.
- Oracle SQL Developer Data Modeler: Oracle SQL Developer or simply Data Modeler is a data modelling tool which also supports physical database design. It comes equipped with a lot of functionalities that aim at capturing data, exploring data, managing data and getting insights from the data. The major features that make Oracle SQL developer Data Modeler one of its kind are the following,
- Design Import and Export: Like many of the data modelling tools that are available for use, Oracle SQL developer data modeler also provides the functionalities for saving a design, exporting a design from the modeler and also import designs in the modeler which was designed in a different data modelling tool. The Data modeler has capabilities where the design can be exported out both in formats recognized by oracle and also in the format that is not supported by it. In this way, the modeler makes sure that the exported design is available for usage in other data modeling tools.
- Database Designing: Oracle SQL developer data modeler works on the concept of Open Database Design which consists of a single logical model, an optional relational model and further an optional physical model. The database designs that are built with the data modeler can be saved in form of a XML file with the .dmd extension. This also ensures that the saved database design can be exported out of the modeler and can be injected into a different data modeler if need be.
- Build Models: The Oracle SQL developer data modeler is used to build the logical, physical and relational models of the databases. The logical model also called the ER model or the Entity Relation model is at the centre of the data modelling tool on which various models are designed.
- Reduce human errors: This is done by generating Data Definition Language from the data modelling tools. The data modeler brings together a data definition language file editor which can be used to generate data definition language for new and existing databases which thus reduces the human error which might get into the picture if this is done using a simple text editor software.
- Create Reports: One of the important aspects of data modelling is to create reports of the same which can be circulated among various stake holders so that any corrections or deviations are highlighted. The data modeler has the capacity of generating reports in the HTML format by assigning them a unique name in the report repository.
- Microsoft Visio: Microsoft Visio is another one of the many data modelling tools available to a data modeler. Microsoft Visio is popular for creating computer diagrams, data visualization and modelling in a very easy to use and intuitive graphical interface. Microsoft Visio as is the case with many other data modelling tools comes heavily armed with functionality that enables us to import data models from various other Microsoft software such as excel, access and word. Microsoft Visio comes loaded with the below functionalities that make it stand out,
- Data Visualization: Microsoft Visio has a data visualizer for creating diagrams by taking the data from excel sheets. As mentioned above, Visio comes with an import functionality that enables us to import data from excel which can then be processed to generate diagrams and flowcharts for the data model.
- Bidirectional Engineering: Microsoft Visio enables us to create a data model which later can be changed to a fully functional database and also enables us to work backwards in getting the data model from an already existing and functional database. Visio has the ability to connect seamlessly to various other databases such as MYSQL, SQL Server, oracle, etc. and generate data models in what is called reverse engineering of the existing databases.
- Standardized templates: Microsoft Visio comes loaded with a number of industry standard templates that can be used to generate data models which adheres to the recent trends in the data modeling industry.
- Cross-Platform availability: Microsoft Visio is available in an online version too which makes it more convenient to use without compromising the memory consumption in your local computers or personal laptops. With the recent releases that are planned, it will be made available for iPad in an app called the Visio Viewer App.
- Navicat Data Modeler: Navicat Data Modeler is one amongst the many data modelling tools available which deals in the development of the conceptual, logical and physical data models. It comes loaded with functionalities for various operations such as bidirectional engineering, design import and export, generate data models depending on the business landscape and also print the models on paper. Navicat Data modeler supports a wide range of database systems some of which are MySQL, Oracle, SQL Server. PostgreSQL and many more. Some of the features of the Navicat Data Modeler are as follows,
- Create Database Objects: Navicat Data Modeler provides features to create complex database objects without having to write a single line of code or SQL query. It supports three kinds of standard notations such as Crow’s Foot, IDEF1x and UML for enforcing standards in the data models and database designs.
- Highly flexible model types: Navicat Data Modeler provides utilities to switch from one model type to other while doing the data modeling for the databases. One can easily switch from conceptual model to logical model and vice versa which saves time and effort. It also enables ardent change management procedure which makes handling the change management lifecycle easy to handle.
- Bidirectional Engineering: Navicat Data Modeler supports bidirectional engineering of data from data to data model and vice versa. It also enables us to view existing databases in the form of diagrams and blocks using reverse engineering and makes the data modeling simpler and convenient. It also supports direct connectivity between many databases.
- Comparing Databases: Navicat Data Modeler supports comparison of the existing database with the new database and generates a synchronization file that can be imported to take care of the change management of the existing databases or data models.
- Automated Code generation: Navicat Data Modeler not only enables us to create ER diagrams and database designs but also allows us to export SQL code which can then be used in the creation of a physical database.
- ER/Studio: Enterprise Data Modelling, Architecture and Governance or simply ER/Studio is one of the popular enterprise data modelling tools which has left its mark in the industry with the following use cases,
- The ability of the tool to build an enterprise data model.
- Maintaining, generating and representing the data structures related to the business in terms of various data attributes such as data type, data scope, data visibility, data relationships, etc.
- Discovering and documenting various attributes and data across landscapes and applications.
- Enables to keep track on the ETL process of the data handling in enterprise scenario.
- Ability to perform impact analysis of changes to the existing models, physical database design and data structures.
- It also ensures that the various data models, data structures and databases are consistent with one another across platforms, applications and teams.
Besides all the above mentioned use cases, the ER/Studio also provides robust data modeling, maintaining enterprise data dictionary and business glossaries. Some of the key features which makes ER/Studio stand out from the rest of the data modelling tools are as follows,
- Documentation: ER/Studio comes loaded with several functionalities for data modeling and metadata management and documenting the same. The documentation also contains the sources of the data, the attributes and the relationship with other data structures in the data landscape. It also has the capability to model databases from other database management software.
- Standardization: ER/Studio data modelling tool comes with a control feature which can be used to build naming conventions and naming standards for various data models, entities and relationships. This is enforced while the models are generated and are automatically applied to the models to maintain consistency between various departments, applications and databases.
- Collaboration: With the ER/Studio Data modelling tool one can build business and data glossaries for data models. Other teams within the business landscape can view and update the glossaries to keep them updated with the latest business landscape. The data models thus created can be saved into repositories that can be shared among various teams to enhance collaboration. One can also set various permissions in regards to the visibility of the objects and metadata.
- Data Governance: ER/Studio has the provision for setting up of data governance on the data model. Security and privacy in terms of access control can be set up for sensitive data as per the regulatory norms of the organization and data protection in general as per industry standards.
- Integration: ER/Studio provides seamless integration with complex enterprise software such as SAP, Salesforce and CRM. Data model files are prepared by analyzing the data from complex software and that model can be pushed to the ER/Studio for further processing.
- Toad Data Modeler: The toad data modeler is a database design tool amongst the humongous number of other data modelling tools. The features which make it a distinguished tool is the ease with which things can be carried out visually without having explicit expertise in coding or SQL. Toad Data Modeler is primarily used to design, maintain and document databases. Some of the highlights of the Toad Data Modeler are as follows,
- Data Visualization: Toad Data Modeler is used to visualize existing databases through a technique called reverse engineering of the databases and also develop models for new databases. It also has the ability to generate a report in HTML, PDF and RTF formats.
- Data Modelling: The Toad Data Modeler is used to model structures and relations between various attributes. It also takes care of any change that has to be done in the existing databases and data structures and helps us build complex ER relations between various data in the business landscape.
- Flexibility: Toad Data Modeler offers seamless integration with a wide number of other databases such as SAP, MYSQL, SQL Server, PostgreSQL, DB2, etc. It also enables to create data structures by taking the data models from one database and getting converted into another database.
In addition to all the highlights mentioned above, some of the features which makes Toad Data Modeler one of the finest Data Modelling Tools are as below,
- Reporting: Generating reports with Toad Data Modeler is very simple and it also creates hyperlinked reports in HTML and PDF.
- Access Control: Toad Data Modeler provides access control to the repositories to ensure restricted access to sensitive data and models.
- Customization: Model customization comes handy with the Toad Data Modeler where the existing data model can be improved upon using custom scripts and macros.
- Migration: The Toad Data Modeler provides functionalities to migrate models and data structures across different tools and landscapes.
Unplanned database design: No More a Howler
As we come to the end of this article, let me ask you a question. Do you still remember the blunder that I and my teammates did when we were working on the final year project? Do you think this could have been the case if I had known about all the data modelling tools that are available to our disposal? If this question was asked to me, I would say probably yes and the reason for my answer would be no matter how good we model the database and how good we anticipate the changes, at the end of the day it is the application development itself that matters the most. Data modelling can be used to ensure that the backend database is sturdy and rigid with a minimum amount of flaws in the data that is entered in the tables.
As a quick recap, we looked into what is data modelling and why it is needed before starting to work with data and start developing an application for an enterprise. We then turned our heads to the 10 data modelling tools which are available to us for doing the same. We looked into the following data modelling tools,
- erwin Data Modeler
- MYSQL Workbench
- Enterprise Architect
- IBM Infosphere Data Architect
- Oracle SQL developer Data Modeler
- Microsoft Visio
- Navicat Data Modeler
- Toad Data Modeler