User Guide

Overview

RapidRows is a low-code configurable API server. It aims to be the easiest way to bring up an API server that mainly has to read and manipulate data living in PostgreSQL databases.

  • Designed for PostgreSQL: RapidRows is designed for PostgreSQL and an ever-growing number of PostgreSQL-compatible servers. It has a built-in connection pooler, and features like forwarding notifications from PostgreSQL channels into WebSockets and Server Sent Events.
  • Lets the database people write the queries: Unlike tools that inspect the database schema and provide an automatic REST or GraphQL API, with RapidRows the SQL queries are at the backend, deployed on a server. This lets the database people write, test and deploy complicated SQL queries, especially for OLAP use cases, and allows the frontend developers deal with the UI and a much simpler REST API.
  • Scheduled jobs: Running periodic jobs to maintain a PostgreSQL database is just the way of life. RapidRows has a built-in CRON-like scheduler that can execute SQL statements or JavaScript code. Easily create partitions, refresh materialized views and more.
  • Must-haves included: Configurable CORS, response compression, parameter validation, server-side caching of query results, query timeouts, transaction options, connection pooling, and extra debug logging per endpoint. It embeds the QuickJS JavaScript engine so you can write add extra validation and tweak query results.
  • Easy to adopt: RapidRows is a zero-dependency single-binary CLI tool that is easily deployed. It is open-source and licensed under Apache 2.0. There are no PostgreSQL extensions to be installed. Parts of the configuration are similiar to OpenAPI 3.0.
  • Easy to use: The configuration is provided by a single JSON or YAML file, and running it is as easy as invoking rapidrows path/to/config.json on the command line.

RapidRows is brought to you by RapidLoop, the makers of pgDash. RapidRows mostly grew out of our own internal tooling needs.

RapidRows Pro

We’re also building RapidRows Pro, an enhanced, commercial version of RapidRows that includes a GUI to administer the server, edit configuration files, monitor metrics and more. Follow us on Twitter @therapidloop to keep up to date.

How It Works

A configurable HTTP server

The biggest part of RapidRows is a configurable HTTP server. The routes of this server are configured very similar to OpenAPI (Swagger), but the RapidRows configuration also specifies how to respond to this route.

Specifying the response

In RapidRows terminology, a route is an endpoint. The response of an endpoint can be:

  • running a query on a PostgreSQL server and returning the result in JSON format
  • run the query but return the result in CSV format
  • run the query but only return the number of rows affected (for queries that do not return any data like UPDATE)
  • run the specified JavaScript code. This code can then examine the input, decide what queries to run and what output to return.
  • return static data (text or json)

To be able to run JavaScript code, RapidRows uses Fabrice Bellard’s QuickJS engine. QuickJS supports ES2020 and is blazingly fast. The endpoint script has access to a $sys object, using which it can connect to databases, run queries and return results.

Parameters

Each endpoint can have any number of parameters. This is similar to, but a loose subset of, the OpenAPI parameter specification.

RapidRows parameters can appear as part of the URI path, as a query parameter, or as a JSON or POST form body.

A parameter can be an integer, number (a number is an integer or a float), boolean, string or array. Arrays in turn can have only number, integer, boolean on string elements. Nested arrays and objects are not supported.

Validation rules can be specified per parameter. These include minimum and maximum (for numbers and integers), maximum length and regex pattern match (for strings), minimum and maximum number of elements (for arrays); as well as a enumerated list of possible values (for numbers, integers and strings).

More details can be found in the sections below, in the reference and in the examples pages, but here is a quick example:

version: '1'
endpoints:
- uri: /movies/by-genre-and-year
  implType: query-json
  script: |
    select F.title, C.name, F.release_year
      from film F
      join film_category FC on F.film_id = FC.film_id
      join category C on FC.category_id = C.category_id
     where C.name = any($1::text[])
       and F.release_year = $2    
  datasource: pagila
  params:
  - name: genres
    in: query
    type: array
    elemType: string
    minItems: 1
    required: true
  - name: year
    in: query
    type: integer
    minimum: 1952
    maximum: 2022
datasources:
- name: pagila
  dbname: pagila

(The dataset is from Pagila.)

This returns the films released in a given year, belonging to one of the specified genres.

Sample output:

$ curl -i 'http://localhost:8080/movies/by-genre-and-year?genres=Sci-Fi&genres=Comedy&year=2006'
HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 26 Sep 2022 05:11:35 GMT
Transfer-Encoding: chunked

{
  "rows": [
    [
      "AIRPLANE SIERRA",
      "Comedy",
      2006
    ],
    [
      "ANNIE IDENTITY",
      "Sci-Fi",
      2006
    ],
... snip ...
    [
      "ZORRO ARK",
      "Comedy",
      2006
    ]
  ]
}

JavaScript API

The JavaScript environment has all the global objects specified by ES2020, as well as one added by RapidRows, called $sys. It is used like this:

// acquire a connection to the datasource named 'pagila', from the pool
let conn = $sys.acquire('pagila')

// do a query. The input parameters are available in the object $sys.params
let result = conn.query(
	'select title, rating from film where release_year=$1 order by title asc',
	$sys.params.year
)
// result is an object like this:
// { rows: [ ["ACADEMY DINOSAUR", "PG"], ["ACE GOLDFINGER", "G"] ] }

// execute sql
result = conn.exec(
	'update rental set return_date = now() where rental_id = $1',
	$sys.params.rental_id
)
// result is an object like this:
// { rowsAffected: 1 }

// return a JSON result (will be sent back as application/json)
$sys.result = { filmCount: result.rows.length }

// result a text result (will be sent back as text/plain)
$sys.result = 'No films found.'

// error during acquire/query/exec are 'throw'n

The $sys object is available to JavaScript code that executes as an endpoint script, or as a scheduled job script. You can see some examples here and here.

Caching

Each endpoint can be configured to cache the results, at server-side, for a specified number of seconds. If this is set, RapidRows will cache in-memory the final response that was produced on the first invocation of the endpoint. For further invocations, if the cache timeout has not expired, it will reuse the same response.

Note that the cache entry’s key includes the URI and the actual values of the arguments.

See an example.

Transactions

Queries can be run in the context of a transaction, by specifying the transaction parameters in the endpoint configuration. In PostgreSQL, a transaction can be read-only or read-write, they can have different ISO levels (read-committed, repeatable-read or serializable) and they can be deferrable (only for serializable transactions).

By specifying the transaction type you want for an endpoint, the SQL query for the endpoint will be run within such a transaction.

See an example.

Timeout

A timeout can be specified for the query that implements the endpoint.

See an example.

Datasources and connection pooling

In RapidRows, a datasource refers to a PostgreSQL database. A datasource can optionally be configured with a connection pool. Queries are always specified to run against a datasource, referred to by it’s name.

All the usual libpq connection parameter keywords can be set for specifying a datasource. Here is an example that configures two datasources:

datasources:
- name: pagila-dev
  dbname: pagila
  host: dev.proj.example.com
- name: pagila-prod
  dbname: pagila
  host: pgbouncer.prod.example.com
  params:
    application_name: rapidrows
    statement_timeout: 60

Connection pooling can be configured so that RapidRows will establish a minimum number of connections to the database at startup. Optionally it can be configured not to exceed a maximum limit, and also to close connections that have been idle for a period or have been active for a period. Connections can also be established lazily (on-demand).

See examples.

Notifications over WebSocket and SSE

RapidRows can forward notifications sent to PostgreSQL channels to WebSockets or Server Sent Events. In RapidRows terminology, such websocket or sse endpoints are called streams.

It’s easy to configure a stream, and it looks like this:

version: '1'
streams:
- uri: '/new_payments'
  type: 'websocket'
  datasource: 'pagila'
  channel: 'payment_received'
datasources:
- name: 'pagila'
  dbname: 'pagila'

Any number of clients may connect to a single websocket/sse stream. Each notification sent using NOTIFY or pg_notify() will be sent to all applicable connected clients.

Scheduled jobs

RapidRows includes a CRON-like daemon to run scheduled jobs. Unlike other solutions, it does not require database extensions, nor does it store data in your database.

A job itself is either a set of SQL statements, or a JavaScript code.

The schedule can be specified using the standard CRON syntax, or by specifying the repetition interval as @every X where X is the number of hours, minutes and seconds, put together like 10h3m5s.

Similar to endpoints, scheduled job with SQL statements can also have transaction options and a timeout.

See here for a job that creates a partition for a table for the upcoming month using JavaScript code. There are also other examples.

CORS

Cross Origin Resource Sharing can be configured at the root level, and applies to all endpoints and streams.

You can set the origins, methods and headers that are allowed to access any URI.

You can also configure the value returned as the response headers:

  • Access-Control-Expose-Headers
  • Access-Control-Max-Age
  • Access-Control-Allow-Credentials

See an example of how CORS configuration looks like.

If the cors setting is not specified, no CORS-related headers will be present in the responses. This can potentially mean browsers may not be able to call some or all URIs defined in the configuration. If the cors setting is specified as an empty object, CORS will be setup to allow all access to all URIs. This is probably too open and insecure a configuration.

Debugging

Some objects can be flagged as debug: true, which causes RapidRows to emit more logs only for that object. Here is an example. Objects for which this flag can be set are:

  • Endpoint
  • Stream
  • CORS
  • Scheduled Job

Build and Install

RapidRows pre-built binary releases are available on GitHub. If there isn’t one for your platform yet, you can download the source and build it yourself using:

go install github.com/rapidloop/rapidrows/cmd/rapidrows@latest

You’ll need a Go compiler v1.18 or above, and gcc/clang.

See this page for more details on development of RapidRows.

Running RapidRows

RapidRows is a command-line tool.

$ rapidrows --help
Usage: rapidrows [options] config-file
RapidRows is a single-binary configurable API server.

Options:
  -c, --check            only check if the config file is valid
  -l, --logtype string   print logs in 'text' (default) or 'json' format (default "text")
      --no-color         do not colorize log output
  -v, --version          show version and exit
  -y, --yaml             config-file is in YAML format

(c) RapidLoop, Inc. 2022 * https://rapidrows.io

To start the server, simply invoke the executable with the path to the config file:

$ rapidrows --yaml hello.yaml
2022-09-26 07:49:44.887 INF API server started successfully listen=127.0.0.1:8080

If the config file is in YAML format (as opposed to JSON), include the -y or --yaml option.

After a successfull start, rapidrows will wait for SIGINT (if running attached to the console you can ^C), and upon receiving that will proceed to shutdown gracefully.

$ rapidrows --yaml hello.yaml
2022-09-26 07:49:44.887 INF API server started successfully listen=127.0.0.1:8080
^C2022-09-26 07:55:14.089 INF stop request received, shutting down timeout=60000
2022-09-26 07:55:14.089 INF API server stopped

Log lines are written to stdout. If stdout is a tty, they will be colorized, unless colorization is explicitly turned off using the --no-color option. To make rapidrows emit JSON log lines, use the --logtype=json (or -l json) option.

Before starting, it is recommended to check the configuration file using the -c (or --check) option:

$ rapidrows --check --yaml hello-bad.yaml
warning: endpoint "/hello": timeout -3 is <=0, will be ignored
error: endpoint "/hello2": invalid implementation type "sttic-text"
error: endpoint "/hello2": invalid script: empty

hello-bad.yaml: 2 error(s), 1 warning(s)

Running RapidRows as a Service

To run RapidRows as a systemd service, create a file under /etc/systemd/system, with a name like rapidrows-svc1.service and contents like:

[Unit]
Description=RapidRows service 1
After=network.target

[Service]
ExecStart=/usr/local/sbin/rapidrows path/to/config.json
WorkingDirectory=/
StandardOutput=append:/path/to/log/file
Restart=on-failure
RestartSec=5s
User=www-data

[Install]
WantedBy=multi-user.target

Set the paths to the rapidrows executable, the config file and the log file as appropriate.

You can also change the user that the service runs as – rapidrows itself does not have any privilege or permission or group requirements.

© RapidLoop 2022. All rights reserved. This page was last updated 03 Oct 22 10:24 UTC.