This is going to be an interesting topic to dive into. During my many years working in the IT industry, I've come across a lot of different databases and seen many takes on naming conventions ranking from good to total chaos. Obviously I have my likes and dislikes and they will affect how I choose naming conventions, but I will try to combine personal and common suggestions as much as possible below.
Create a naming convention and stick to it!
Yes, it really does not matter if the naming convention is less than "perfect" according to you, anyone else in the team or some globally accepted definition - what matters most is that there IS a naming convention defined, written down, communicated and enforced. Everyone that is developing against the database should know how different objects in the database are named and if possible - why.
I've seen many instances of strange naming conventions, but as long as they are consistent in the database, everything will be OK. This is the main goal of the naming convention in my eyes - to create consistency. With a structured and consistent database, development will be faster and onboarding new database developers to the team will go a lot faster.
Do not go rouge and "fix" a bad naming convention
This is specifically for all short term employees (consultants or similar) out there. When encountering a naming convention that is considered bad, some developers will try to change it to be "better" in the new code that they add. But if only the new objects follow these new made up (and often not communicated) conventions - it will confuse the rest of the team and create a database that is structurally broken. It does not matter if the new objects were created, designed properly and works perfectly - the database structure is now fractured and headed for disaster.
A better option is to ask for the naming convention before creating anything. Is there is none (happens surprisingly often), then offer your help and expertise to create one. Not only will this help to identify possible inconsistencies, but it will also help current and future developers to make more consistent additions to the database. At the very least - study the existing database and mimic the style as much as possible. It might sound obvious, but developers that inject inconsistencies are very common. Especially now when many quick-n-dirty developers use a collection of AI tools to generate code and then just copy-paste it without knowing exactly what it does or caring about how it looks. Please don't be that guy...
A "bad" naming convention is better than none
I'm writing this again, but it is important. Swallow your pride and accept the current naming convention if there is one regardless of what you think about it.
If the entire team is in complete agreement that the current naming convention is horrible, than consider a full restructuring. Do not think that a gradual adaptation of the new naming convention will end well though - things will most likely be chaotic during such an endeavour. Also consider if there is actually any value for your company or customers to perform such a massive restructuring. If there are no measurable positive effects (other than the developers "feeling good" about it) then I would suggest to just reconsider and accept the "bad" existing convention regardless of what you think about it.
With all that said, let's go ahead and create a basic naming convention so that we don't have to make massive, time consuming and costly changes to the database in the future. Let's try to make it right from the beginning.
General considerations
Let's start off with some more general considerations and global conventions.
Documentation
Make sure that the naming convention is stored somewhere accessible to all members of the team. It might sound obvious, but sometimes this type of documentation is hidden which obviously does not help future developers. Make sure that developers knows that the documentation exist and where to find it.
Meaningful Names
Obfuscated, abstract, shortened or in any way cryptic naming conventions will probably only create confusion. Even if the current developers all know what these cryptic names mean, new or temporary developer teams might struggle with them.
While it might sound good at the time, consider if there are any benefits of cryptic naming. If you cannot come up with a good answer, use meaningful and obvious names instead.
Shortened or abbreviated names should be avoided, but if you have to use them, make sure that the same abbreviation is used everywhere and is a part of the naming convention document.
Specify Language
This is often overlooked, especially if the database is small and only intended to be used at a local office somewhere. But what if that local function suddenly gets noticed and it is decided that all offices across the globe should use it? It's not as unusual as you might think, and suddenly this database is destined for some major restructuring.
Personal recommendation is to use English. It is the one language that is generally supported and most people that work in a global environment knows how to read, write and speak.
Case Sensitivity
MS SQL (and to some degree MySQL installed on Windows) is case-insensitive, making queries work regardless of using the same casing as the database objects. I would however recommend to use case-sensitive criteria for database object names as it makes the queries a lot more readable.
SQL Keywords
When writing database code I tend to capitalize all database specific keywords such as SELECT, INSERT, UPDATE and all of the rest. I think it looks more clear and helps those that reads the code in a basic text editor instead of some kind of editor with syntax highlighting. It might also help to review text-based log files.
My preferred naming convention
These are my preferred ways of naming different objects within a database. I'm sure that at least one of them will be up for debate, and that's OK. Every team of developers have their way of doing things and every database have their unique quirks that will shape the way you create your naming convention. So whatever you find below is not gospel, it's just recommendations from an old developer. Feel free to adopt any way you see fit.
I will try to create a convention that fits both MySQL and MS SQL that are the two databases I've worked with the most, so some things might seem strange on one platform, but make sense on the other. Hopefully it will make sense on both.
Database name
I prefer to use lowercase for the database name. The name should be easily identifiable in a single word. I would also accept snake_case, but I usually try to avoid that. These statements have many options that should be investigated before actually creating the database, but that is beyond the scope of this article.
CREATE DATABASE databasename;
Tables
Should you use plural (Users) or singular (User)? I understand the case for singular, but the way I look at it is "In this table I store a collection of Users", thus putting me on team plural. There are many arguments for each side of this debate, but my main point here is - choose one and stick to it.
I also prefer to use PascalCase (the first letter of each compound word is capitalized) for tables. This is one of those things that I simply think looks better and I don't have a more extensive reason for it. The tables in some default system databases for both MS SQL and MySQL are mixed so in an attempt to bring order to chaos - I choose PascalCase.
CREATE TABLE Persons (
PersonId INT,
LastName VARCHAR(64),
FirstName VARCHAR(64)
);
One thing to remember here though - if you are using MySQL on windows (a local dev/test for example) the above command will create a table with only lowercase letters. The windows implementation is case-insensitive so everything will work just fine, but if you later use the exact same code on a database hosted on a Linux server, this will change. As long as the rest of the code (queries, functions, procedures etc.) uses PascalCase, everything will work just fine, but keep in mind that code that does work on your local Windows dev database MIGHT NOT work on Linux. An easy solution to avoid this confusion would be to create the MySQL database on the same operating system that will later host the global dev/test/staging/production servers. If you really need a local MySQL database on your windows computer, consider installing MySQL on WSL (Windows Subsystem for Linux).
Many to many tables
These types of tables simply link 2 other tables in a many-to-many relationship, usually without adding any extra information. They just contain of 2 foreign key columns pointing to the tables they link together. An example would be the tags I use for my articles on this site. An article can contain many tags and the tags can be shared by many articles. I want to make sure that these types of tables are clearly identifiable so I use the name of the two tables that it links together, separated by an underscore.
CREATE TABLE Articles_Tags ...
Table columns
As you can see in the CREATE TABLE statement above, I choose to use PascalCase for columns (and compound words in general) such as PersonId. I have worked with a number of databases that capitalize ID as well (usually as an extra indication that this column is the primary key), but that goes against PascalCase so I'm not going to do that.
Another thing about the PersonId column is that I use the full PersonId name instead of simply "Id". The main reason for this is that I think it helps making things more clear when joining tables. You could work around it by using the table name (Persons.Id) in the join operation, but I still prefer writing the entire PersonId for added clarity. However, this does not apply to the rest of the columns (such as prefixing each column name with the table name like "PersonFirstName", only the Id column so that it does not make joining more confusing.
Some databases allows the use of names containing spaces (Person Id) or even some semi-restricted words such as name, but I would suggest to avoid using them. Some database tools will interpret "name" and "description" (just to mention a few) as built in types or functions and display them as highlighted in a different color. While that might not be a major issue and will work, it still adds a little bit to overall confusion.
Key constraints
At some point (almost immediately) we want to join other tables to make sense of our data. I am a big believer of Third Normal Form (3NF) meaning that I often create many tables to keep that structure. Joining tables are done with primary and foreign keys and to avoid duplicate data, we use unique constraints. I usually prefix these with PK_, FK_, and UQ_ to easily separate them.
PK_ -- Primary Key
FK_ -- Foreign Key
UQ_ -- Unique
To add to the CREATE TABLE code earlier, here is an example of how to add these keys.
CREATE TABLE Persons (
PersonId INT NOT NULL, -- Must contain a primary key (duplicate values not allowed)
CustomerNumber INT, -- Does not allow multiple non-null and non-unique values
LastName VARCHAR(64),
FirstName VARCHAR(64),
Created DATETIME,
CONSTRAINT PK_Person PRIMARY KEY (PersonId),
CONSTRAINT UQ_Person UNIQUE (CustomerNumber)
);
CREATE TABLE PersonLogs (
PersonID INT NOT NULL,
Created DATETIME,
LogEntry VARCHAR(64),
CONSTRAINT FK_PersonLogs_Persons FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
);
There are other ways to add constraints to tables, but the above code works in both MySQL and MS SQL so I prefer that.
Indexes
To retrieve data more quickly, you will probably need to add an index at some point. Note that updating a row that contains an index will take more time than a table without one. So it depends on how the data will be used if you want to use it or not. Managing indexes is a whole topic on its own, but I'm not going to dive deeper into that here.
IX_ -- Index
Example:
CREATE INDEX IX_LastName ON Persons (LastName);
CREATE INDEX IX_FullName ON Persons (LastName, FirstName);
Triggers
I would avoid using trigger in general, mainly due to how they might cause confusion when "things happen" without developers knowing exactly why. They are not immediately obvious to new developers and can easily be missed. Trigger are very useful in some situations, but I would avoid them or at least keep them as simple as possible. When triggers fail, it will cause the operation that was called (such as INSERT) to fail as a result and debugging such an error might turn out to be more confusing than it have to. But this is a article about naming, so here it is:
TR_ -- Trigger
Examples:
-- MS SQL
CREATE TRIGGER TR_Person ON Persons FOR INSERT
AS
UPDATE
Persons
SET
Created = GETDATE()
WHERE
PersonID IN (SELECT DISTINCT PersonID FROM Inserted)
GO
-- MySQL
DELIMITER $$
CREATE TRIGGER TR_Person BEFORE INSERT ON Persons
FOR EACH ROW
BEGIN
SET NEW.Created = NOW();
END$$
DELIMITER ;
Keep in mind that these are just examples. The above problem would be better solved by creating the table with a default GETDATE() or NOW() value for the Created column instead of using triggers.
Table name prefixes
The final bit to consider when it comes to naming tables is if it is suitable to use prefixes to group tables. Generally speaking I would say that this is only applicable for data warehouse situations or very large databases with clearly divided logical modules. In that case it might be helpful to add a prefix in order for developers to quickly filter what part of the data they are working with. There is nothing wrong about doing this, but if there are a lot of overlap between the different logical structures, it might cause problems instead. So I would suggest the following; for larger and clearly logically separated databases - consider it, but for medium and smaller databases - don't bother.
Views
It's not uncommon to name views using the main table name with a purpose added to it. Views are used in a similar way as tables, so it makes sense to name them using the same criteria. However - I want to separate them from tables by using a prefix.
VW_ -- View
Example:
CREATE VIEW VW_PersonsWithCustomerNumber AS
SELECT
PersonId, CustomerNumber, LastName, FirstName
FROM
Persons
WHERE
CustomerNumber IS NOT NULL;
Functions
I like to use functions in my databases. It creates a way to save code that is commonly used in the database in one location. Using a prefix makes it obvious how to use it in other parts of the database, such as a stored procedure.
FN_ -- Function
Example:
-- MS SQL
CREATE FUNCTION FN_AddNumbers (@number1 INT, @number2 INT)
RETURNS INT AS
BEGIN
RETURN @number1 + @number2;
END;
-- MySQL
DELIMITER $$
CREATE FUNCTION FN_AddNumbers (number1 INT, number2 INT)
RETURNS INT
DETERMINISTIC
BEGIN
RETURN number1 + number2;
END$$
DELIMITER ;
Stored Procedures
This is where things get serious. I use stored procedures extensively in my databases and I would like to explain that in detail, but I think that should a topic for another article in the future. Anyway, I use a SP_ prefix and use PascalCase for the name. A common use would be to create a SP that upsert (insert or updates) an object. Therefore I usually name the SP's by its table name, followed by a verb describing what the procedure does.
I will agree that putting the verb first in the name makes it more readable (UpsertPerson) but my reason for turning it around has to do with grouping procedures in order to find them easier. No matter what fancy database tools are being used, at some point your entire database will be presented as a list of objects or files in a folder - both ordered by name. For developers not completely familiar with the database, finding what procedures are available to modify a specific object will be a lot faster.
SP_ -- Stored Procedures
Example:
-- MS SQL
CREATE PROCEDURE SP_PersonUpsert
@PersonId INT,
@CustomerNumber INT,
@LastName VARCHAR(64),
@FirstName VARCHAR(64)
AS
IF(EXISTS(SELECT 1 FROM Persons WHERE PersonId=@PersonId))
BEGIN
UPDATE Persons SET CustomerNumber=@CustomerNumber, LastName=@LastName, FirstName=@FirstName WHERE PersonId=@PersonId;
END
ELSE
BEGIN
INSERT INTO Persons (PersonId, CustomerNumber, LastName, FirstName) VALUES (@PersonId, @CustomerNumber, @LastName, @FirstName);
END;
-- MySQL
DELIMITER $$
CREATE PROCEDURE SP_PersonUpsert (
_PersonId INT, -- Note that I use _ to separate params from column names
_CustomerNumber INT,
_LastName VARCHAR(64),
_FirstName VARCHAR(64)
)
BEGIN
IF EXISTS(SELECT 1 FROM Persons WHERE PersonId=_PersonId) THEN
UPDATE Persons SET CustomerNumber=_CustomerNumber, LastName=_LastName, FirstName=_FirstName WHERE PersonId=_PersonId;
ELSE
INSERT INTO Persons (PersonId, CustomerNumber, LastName, FirstName) VALUES (_PersonId, _CustomerNumber, _LastName, _FirstName);
END IF;
END$$
DELIMITER ;
Summary
Well that was a lot more text than I initially expected. I know that I tried to validate my choices above, but perhaps I tried a bit too much. Anyway - these are my suggestions for a naming convention, but whatever you choose - most important is that you make a choice, write it down, communicate it with everyone involved and then stick with it.
-- Database name
lower case (snake case if needed)
-- Tables, columns and compound words in general
PascalCase
-- Prefixes for various objects
PK_ -- Prefix for Primary Keys
FK_ -- Prefix for Foreign Keys
UQ_ -- Prefix for Unique constraints
IX_ -- Prefix for Indexes
TR_ -- Prefix for Triggers (use in moderation)
VW_ -- Prefix for Views
FN_ -- Prefix for Functions
SP_ -- Prefic for Stored Procedures