I've seen this topic pop up many times and it's an issue that still catches me off guard. For me - the problem appeared again in a recent project that is worked on (with some added confusion) so I will do my best to describe the problem and give my opinion about the matter.
The project that sparked the inspiration for this post was almost like a perfect storm of confusion. DATETIME and TIMESPAN was used seamingly at random and often changed between tables even though the columns were named the same (I.e. "created"). Some columns did not even contain the time-part. To make matters even worse, the server and database timezones were set to different values in the TEST, STAGING and PRODUCTION environments. My guess is that this mix of datatypes was the result of years of developers contributing their own preferred datatype over time. Needless to say - this really created a challenge for current developers.
General information about DATETIME, TIMESTAMP and DATE in MySQL can be found in the documentation here: https://dev.mysql.com/doc/refman/8.0/en/datetime.html.
TLDR
- Do not mix - choose one
- Use UTC and define a standard for the entire system
- TIMESTAMP range: "1970-01-01 00:00:01" -> "2038-01-09 03:14:07"
- Consider using BIGINT instead
Main differences
DATETIME
The DATETIME datatype is stored and displayed in the "YYYY-MM-DD HH:MM:SS" format.
The supported range is "1000-01-01 00:00:00" -> "9999-12-31 23:59:59".
To store a specific value, DATETIME should be used - the value is not converted when stored and/or retrieved.
TIMESTAMP
The TIMESTAMP datatype has varying properties (precision) depending on MySQL version and SQL mode.
The supported range is "1970-01-01 00:00:01" UTC -> "2038-01-09 03:14:07".
TIMESTAMPs are generally used to track changes to records and are often updated when a record is changed. In MySQL 5 and above, TIMESTAMP values are converted from the current timezone to UTC for storage and then converted back from UTC to current timezone for retreival.
The timezone confusion
The database I was referring to in the intro of this post also made things confusion by having multiple different timezone settings. The timezone settings was also updated manually for daylight saving time twice each year (not always on time). This is where the ugly side of DATETIME vs. TIMESTAMP really shows.
To get the current timezone:
SHOW VARIABLES LIKE '%time_zone%';
+------------------+-----------------------+
| Variable_name | Value |
+------------------+-----------------------+
| system_time_zone | W. Europe Summer Time |
| time_zone | SYSTEM |
+------------------+-----------------------+
This is the result of my local MySQL database. The timezone is set to SYSTEM wich will use the system_time_zone. This will automatically adjust for daylight saving time (depending on how the OS is configured). The database I worked with however, set the system_time_zone to a static '(UTC+01:00) Central European Time(Europe/Amsterdam)' were the time_zone was altered to an offset of '+01:00' or '+00:00' to adjust for daylight saving time.
To demonstrate the problem I created a simple table:
CREATE TABLE datetest (mydatetime DATETIME, mytimestamp TIMESTAMP);
And added some data:
INSERT INTO datetest VALUES (NOW(), NOW());
SELECT * FROM datetest;
+---------------------+---------------------+
| mydatetime | mytimestamp |
+---------------------+---------------------+
| 2023-06-25 09:45:52 | 2023-06-25 09:45:52 |
+---------------------+---------------------+
I then changed the timezone (could use named time zones here if they are installed) by setting an offset:
SET time_zone="+05:00";
SELECT @@GLOBAL.time_zone, @@SESSION.time_zone;
+--------------------+---------------------+
| @@GLOBAL.time_zone | @@SESSION.time_zone |
+--------------------+---------------------+
| SYSTEM | +05:00 |
+--------------------+---------------------+
And then selected the data again:
SELECT * FROM datetest;
+---------------------+---------------------+
| mydatetime | mytimestamp |
+---------------------+---------------------+
| 2023-06-25 09:45:52 | 2023-06-25 12:45:52 |
+---------------------+---------------------+
Note that the data has not actually been changed as the TIMESTAMP data is stored in UTC, but the data returned is now 5 hours off. This could lead to confusion such as a user that is created at a certain point in time, made their first purchase 5 hours before existing. Also when data is migrated between multiple databases this could lead to a loss of all records created in the 5 hour span. There are more examples of how this could be dangerous, but I think you get the point I'm trying to make here.
What I would do
In the case of the database I'm referring to, we had to jump through a few hoops to get it to work. We did not have full control over the database so we could not change any of the settings or the schema. We basically had to build a custom conversion system and that is not something I would recommend anyone doing, but in this particular case we had no other option.
The biggest sin in that project was the seamingly random use of the 2 datatype. A better way to handle it would have been to simply choose one or the other datatypes and refactor the existing database. If it is a new and smaller database, my personal recommendation would be to use DATETIME, but obviously that depends on what the database is being used for and the range and precision of data being stored. In this scenarion I'm choosing DATETIME over TIMESTAMP mainly beacuse of the wider range of data stored and the fact that the DATETIME datatype returns the actual data stored without any conversion.
Regardless of what datatype is being used, there should be consensus of how a date and time is handled throughout the entire system (not just the database). If the system is supposed to be used in more than one timezone (or could be used in more timezones in the future) I would absolutely suggest to store everything in UTC and configure the OS and database to reflect that. Local timezone conversions would then happen further down the line in user-facing operations when needed. The database would then always return the data in a specific format in a set timezone.
Alternatives could be to use DATE or BIGINT instead. The DATE datatype is better if the data to store does not contain time (birthday and similar), but technically you could end up with the timezone issue here as well. BIGINT could be used to store the number of milliseconds from a specific point in time (Epoch) and then convert that to a readable datetime when needed. With BIGINT you don't have to worry about datatype precision between different versions of the database, but it is a more "custom" option.