Postgres extract() function
Extract date and time components from timestamps and intervals
The Postgres extract()
function retrieves specific components (such as year, month, or day) from date/time values where the source is of the type timestamp
, date
, time
or interval
.
This function is particularly useful for data analysis, reporting, and manipulating date and time data. For example, it can be used to group data by year, filter records for specific months, or calculate age based on birth dates.
Function signature
The extract()
function has the following form:
field
: A string literal specifying the component to extract. Valid values includecentury
,day
,decade
,dow
,doy
,epoch
,hour
,isodow
,isoyear
,microseconds
,millennium
,milliseconds
,minute
,month
,quarter
,second
,timezone
,timezone_hour
,timezone_minute
,week
, andyear
.source
: The date, time, timestamp, or interval value from which to extract the component.
The function returns a numeric value representing the extracted component.
Example usage
Let's consider a table called events
that tracks various events with their timestamps. We can use extract()
to analyze different aspects of these events.
This query extracts the year and month from each event's timestamp.
You can use the extracted components for further analysis, filtering, or grouping. For example, we can count the number of events by quarter:
This query groups events by year and quarter, providing a count of events for each period.
Advanced examples
extract()
with different fields
Use You can use extract()
with various fields to analyze different components of timestamps:
This query demonstrates how extract()
works with different fields, ranging from century
to microseconds
.
extract()
with interval data
Use When working with the INTERVAL
type, the extract()
function allows you to pull out specific parts of the interval, such as the number of years, months, days, hours, minutes, seconds, and so on.
This query extracts the specified parts from the interval. Note that the extract
function extracts only the value for the specified part in the interval. For example, EXTRACT(DAYS FROM INTERVAL '2 years 3 months 15 days')
returns 15
for days, not the total number of days in the interval.
Additionally, it should be noted that for non-normalized intervals, the extracted values may not be as expected.
A normalized interval automatically converts large units into their equivalent higher units. For example, an interval of 14 months
is normalized to 1 year 2 months
because 12 months make a year.
A non-normalized interval keeps the units as specified, without converting to higher units. This is useful when you want to keep intervals in the same unit (like months or minutes) for easier manipulation or calculation.
When extracting values from non-normalized intervals, Postgres returns the remainder after converting to the next higher unit. This can lead to results that might seem counter-intuitive if you expect direct conversion without accounting for normalization.
For example, consider this query and its output:
Interval '32 months':
- A year is composed of 12 months.
- 32 months can be broken down into 2 years and 8 months (since 32 Ă· 12 = 2 years with a remainder of 8 months).
- When you
EXTRACT(MONTH FROM INTERVAL '32 months')
, it returns 8 because that’s the remaining months after accounting for the full years.
Interval '80 minutes':
- An hour is composed of 60 minutes.
- 80 minutes can be broken down into 1 hour and 20 minutes (since 80 Ă· 60 = 1 hour with a remainder of 20 minutes).
- When you
EXTRACT(MINUTE FROM INTERVAL '80 minutes')
, it returns 20 because that’s the remaining minutes after accounting for the full hour.
extract()
for time-based analysis
Use Let's use extract()
to analyze user registration patterns for a hypothetical social media application:
This query uses extract()
to analyze user registration patterns by day of week and hour of day.
Additional considerations
Performance considerations
For large datasets, consider creating indexes on frequently extracted components to improve query performance:
This creates an index on the year and month components of the event timestamp, which can speed up queries that filter or group by these components.