Introduction
Recently, I’ve been running a lot of one-off queries for other teams who need access to data from production PostgreSQL databases hosted in Amazon RDS. I would do this either from my local environment, or from a jump box in the AWS environment. For quick queries, I would use the psql PostgreSQL interactive terminal.
Installation
In my local environment running on Ubuntu, I would install the PostgreSQL client using the following command:
sudo apt install postgresql-client-14
On AWS jump boxes running Amazon Linux 2, I would install the PostgreSQL client using the command:
sudo yum install postgresql
If this command does not find the PostgreSQL client, you may need to install software from the amazon-linux-extras library using this command:
sudo amazon-linux-extras install postgresql14
You can confirm the version of the PostgreSQL client by running psql --version
.
Connecting to the Database
To connect to the database, use the following string:
psql -h hostname -p 5432 -d databasename -U adminusername -W -x
To break this down:
- -h hostname is the host address of the database. If this is local, you can use 127.0.0.1 or localhost. If it is hosted in Amazon RDS, you would use the database endpoint, such as database.address.region.rds.amazonaws.com. You can retrieve this from the RDS section in AWS, under “Connectivity & Security”.
- -p port is the port to connect to, which the database is running on. The default port for PostgreSQL is 5432.
- -d databasename is the name of the database to connect to.
- –U username is the username to connect as, instead of the default.
- -W forces psql to prompt for a password before connecting to the database. Once you run the string, you will be prompted to enter the password to connect.
- -x turns on expanded table formatting mode. This is the equivalent to running \x in the console. When running queries against the database, this will make the results more readable.
Setting up to run queries
Once logged in to the database, you may need to run the SET ROLE command, to set the current user identifier to one that has permissions to run the SQL commands. The command would be:
set role user;
Once this is done, you should see SET as a response, to confirm the role name is set. To confirm the user you are logged in as, and the current user whose role you are using, you can run the following command:
SELECT SESSION_USER, CURRENT_USER;
Running queries against tables
Now that you’re set up with the proper connection and permissions to the database, you can start running queries. To see which tables are present, you can run the \dt
command. If you’re not familiar with the database, you may want to limit the number of rows returned so you don’t scan the entire table. For example:
SELECT * FROM TABLENAME LIMIT 10;
Output to file instead of screen
When running queries which return many rows or a lot of output, it’s easier to have the output returned to a file, rather than dumped on screen. This can be set up in the initial connection using the -o filename
argument, but can also be done by using \o filename
while in the psql prompt.
Quitting the psql terminal
To quit the PSQL terminal, type \q
to exit. You should be returned to your normal shell.
Bonus: Automating Queries with Ansible
If you need to run the same query against multiple databases, this can be automated using the postgresql_query module. This is especially helpful if you have to connect to a jump box to connect to the database.
I would recommend installing the following software on each host that will be connecting to the database, to ensure it has all of the prerequisites to run properly. I ran into some issues with psycopg2 bugs, and ended up using the binary version using PIP to solve this.
---
- name: Install required modules
become: yes
yum:
use_backend: yum
name:
- gcc
- python3
- python-pip
- postgresql
- python-devel
- postgresql-devel
- python-psycopg2
state: present
- name: Install required PIP modules
pip:
executable: /usr/bin/pip3
name:
- psycopg2-binary
The same values used in manual queries can be automated in the following YAML block:
---
- name: Query database
postgresql_query:
login_host: databasehostname
db: databasename
port: 5432
login_user: databaseusername
login_password: databasepassword
session_role: roletoset
query: sqlquery
register: dboutput
- debug:
var: dboutput.query_result
I would strongly recommend not putting your passwords in the Ansible files. One way around this is to create a passwords.yml file formatted as follows:
---
db_password : 'password'
And in the Ansible yaml, use login_password: "{{ db_password }}"
to obtain the password. Then just make sure you add the passwords.yml to your .gitignore file so it doesn’t get committed to your repository.
Alternatively, you can store and retrieve the passwords in a secure location or secrets engine like Hashicorp Vault.
Resources
- PSQL documentation – switches.
- PostgreSQL SET ROLE documentation.
- Ansible PostgreSQL module documentation.
- Hashicorp Vault – PostgreSQL Database Secrets Engine.