At Smaato, we use several different database management systems to create, retrieve, update and manage data. These include MySQL, Vertica, Aerospike and, the one we will explore today, PostgreSQL.
As a PostgreSQL fan and long-time user (since 8.4), I have come across many “gotchas” in my work with the system when it comes to managing user permissions.
In order to help my fellow PostgreSQL users not get caught in these same pitfalls, this blog post highlights two of these common “gotchas” and outlines how to avoid them.
How We Use PostgreSQL at Smaato
The Smaato Demand Platform (SDX) is a powerful tool for mobile programmatic buyers to conduct sophisticated, self-service targeting. The SDX backend is a Python-based application (Django), which uses PostgreSQL (9.5 on RDS) for storage. Our talented team of developers uses Django migrations, as well as custom database migration scripts for pulling necessary data into PostgreSQL.
Since we do use multiple data sources at Smaato, it can become challenging to not confuse one source with another. Despite using PostgreSQL since 8.4, I have come across several “gotchas” — common pitfalls that can trick developers into making mistakes.
Gotcha #1: Making Tables Visible to Tertiary Users
When using PostgreSQL on Amazon Relational Database Service (RDS), the default superuser is called rds_superuser
. For other ACL cases, the Systems Engineering team, which manages the infrastructure for Smaato, had created two more superusers (let’s call them admin1
and admin2
), application-specific users (e.g. owner1
) and tertiary users (e.g. tertiary1
).
While enforcing ACL on our database, our Systems Engineering team had made sure that each user could only see the stuff they were supposed to see — which was a good thing. However, there was an underlying problem that caused recently created tables to not be visible to tertiary users:
- We were using a different owner for running migrations (
owner1
) and reading data (tertiary1
)
- Access privileges were only granted for initial schema to tertiary users on initial schema
- PostgreSQL is not MySQL (who said “thank goodness”?)
In summary, this means that if you execute the following command in PostgreSQL, the privilege will only apply to existing objects (above tables – OODBMS FTW!) in the given database:
-- connect to database
\c appdb1;
-- grant privileges
GRANT SELECT ON ALL TABLES IN SCHEMA public TO tertiary1;
This means that if a new table is created in appdb1
, the user “tertiary1
” would not be able to access it — and that is exactly what we had done.
Solution:
Apply default privileges for schema with tertiary users:
-- connect to database
\c appdb1;
-- Grant privileges for all since we created new tables which are not visible and they won’t be affected by default privileges statement below.
GRANT SELECT ON ALL TABLES IN SCHEMA public TO GROUP tertiary1;
-- Add default privileges, i.e. a privilege that would be applied to all new objects (tables) created henceforth
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO tertiary1;
To learn more about default privileges on PostgreSQL, read the PostgreSQL documentation.
Gotcha #2: Assigning Ownership in Vertical Tech Teams
The most important feature of PostgreSQL default privileges is that a default privilege will only apply to new objects created by the role(s) that created the privilege.
In other words, let’s say owner1
created a default privilege that allows tertiary1
to SELECT
from all tables:
- If
owner1
or any other user from the same role creates a new table, tertiary1
can execute SELECT
on them
- If
admin1
creates a new table, tertiary1
cannot execute SELECT
on them
Solution:
Our team had executed the default privileges statement as admin1
because that is the account that the Systems Engineering team uses for accessing the database. In order to have default privileges apply to objects created by other roles, we executed ALTER DEFAULT PRIVILEGES
as owner1
.
Tips for Avoiding PostgreSQL Gotchas
- Read the documentation in its entirety. Trying out new features after merely skimming the manual wastes time and effort.
- Even if it’s a small application, take your user management/ACL hierarchy seriously. Mapping it out thoroughly before implementation can save you hours of debugging.
- Communication might be the most overused word in agile principles, but putting interactions over processes is vital for vertical teams.
On a side note: the PostgreSQL documentation is one of the most well-maintained documentations of any open source project that I have come across, and it was the reason that I was able to fix these gotchas. No StackOverflow. No Googling. PostgreSQL truly is “the world’s most advanced open source database.” A big thank you to the whole community.